
The test passed locally.
It failed in CI.
It passed on retry.
Then it failed again when the release was tagged.
Everyone blamed “timing”.
It wasn’t timing.
I’ve seen this story play out in teams migrating from Selenium to Playwright. The assumption is simple:
“Playwright auto-waits. Flakiness is solved.”
It isn’t.
Auto-waiting removes a class of problems. Modern flakiness lives elsewhere.
Let’s break down where it really comes from — and how to fix it properly.
Playwright Is Not Magic
Playwright does a lot for you:
- Waits for elements to be actionable
- Waits for navigation
- Retries assertions
- Handles detached DOM internally
But none of this guarantees business-state synchronization.
This test looks correct:
await page.click('button#submit');
await expect(page.locator('.success')).toBeVisible();
It can still flake.
Why?
Because:
- The UI shows a success state optimistically.
- The backend responds slightly slower in CI.
- A React re-render detaches the node mid-assertion.
- A background API call fails silently.
Auto-waiting handles DOM readiness.
It does not handle system correctness.
That distinction is everything.
The 5 Real Causes of Flakiness in Modern Playwright Suites
1️⃣ UI State Race Conditions
Modern frontends (React, Vue, Next.js) render before async state stabilizes.
Example scenario:
- Button click triggers API call.
- UI immediately shows “Processing…”.
- Backend returns 500.
- UI rolls back.
Your test may assert too early.
Fix:
Wait for a business signal — not just visibility.
await Promise.all([
  page.waitForResponse(resp =>
    resp.url().includes('/api/order') && resp.status() === 200
  ),
  page.click('#submit')
]);
Now you’re synchronizing with system behavior, not UI illusion.
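One caveat with filtering on the status inside the predicate: if the backend returns a 500, the predicate never matches and the test only fails when the wait times out. A minimal sketch of an alternative, assuming a hypothetical helper named clickAndAwaitApi: match on the URL alone, then check the status explicitly so the failure message names the real cause. The PageLike and ResponseLike interfaces are stand-ins so the sketch runs on its own; the assumption is that Playwright's Page and Response satisfy them.

```typescript
// Minimal structural types so the sketch is self-contained.
// Assumption: Playwright's Page and Response satisfy these shapes.
interface ResponseLike {
  url(): string;
  status(): number;
}

interface PageLike {
  waitForResponse(predicate: (resp: ResponseLike) => boolean): Promise<ResponseLike>;
  click(selector: string): Promise<void>;
}

// Hypothetical helper: click, wait for the matching API response,
// and fail right away with a clear message if the status is unexpected.
async function clickAndAwaitApi(
  page: PageLike,
  selector: string,
  urlFragment: string,
  expectedStatus = 200,
): Promise<ResponseLike> {
  // Start waiting before clicking so the response cannot be missed.
  const [response] = await Promise.all([
    page.waitForResponse(resp => resp.url().includes(urlFragment)),
    page.click(selector),
  ]);
  if (response.status() !== expectedStatus) {
    throw new Error(
      `${urlFragment} returned ${response.status()}, expected ${expectedStatus}`,
    );
  }
  return response;
}
```

In a test this would be called as `await clickAndAwaitApi(page, '#submit', '/api/order')`, followed by the usual UI assertion.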
2️⃣ Network Mock Drift
Teams mock APIs heavily in Playwright.
At first, it feels powerful:
await page.route('**/api/user', route =>
  route.fulfill({ status: 200, body: JSON.stringify(mockUser) })
);
But months later:
- The real API contract changes.
- The frontend logic evolves.
- Your mocks stay frozen in time.
Tests pass. Production breaks.
That’s not flakiness — that’s false confidence.
Fix:
- Validate mocks against real schemas.
- Run hybrid suites (mocked + real backend).
- Periodically replay production traffic against tests.
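The first fix can start very small. Here is a minimal sketch of a drift check, assuming a hypothetical field-type schema and a helper named findMockDrift. In a real suite the schema would come from the API's actual contract rather than being written by hand, and a proper schema validator would replace this hand-rolled comparison.

```typescript
// Hypothetical field-type schema; in practice this would be derived from
// the API's real contract (for example an OpenAPI document).
type FieldType = 'string' | 'number' | 'boolean';
type Schema = Record<string, FieldType>;

// Returns human-readable mismatches; an empty list means the mock conforms.
function findMockDrift(mock: Record<string, unknown>, schema: Schema): string[] {
  const problems: string[] = [];
  // Fields the contract requires: must exist and have the right type.
  for (const [field, expected] of Object.entries(schema)) {
    if (!(field in mock)) {
      problems.push(`missing field "${field}"`);
    } else if (typeof mock[field] !== expected) {
      problems.push(`field "${field}" is ${typeof mock[field]}, expected ${expected}`);
    }
  }
  // Fields the mock still carries but the contract no longer knows.
  for (const field of Object.keys(mock)) {
    if (!(field in schema)) {
      problems.push(`unexpected field "${field}"`);
    }
  }
  return problems;
}

// Assumed contract and mock, both invented for illustration.
const userSchema: Schema = { id: 'number', email: 'string', active: 'boolean' };
const mockUser = { id: 1, name: 'Ada', email: 'ada@example.test' };

const drift = findMockDrift(mockUser, userSchema);
// drift → ['missing field "active"', 'unexpected field "name"']
```

Running a check like this before the suite starts turns a frozen mock into a loud failure instead of a quiet false pass.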
3️⃣ Parallel Test Data Collisions
Playwright encourages parallelism.
In CI:
Workers: 6
Suddenly:
- Two tests modify the same user.
- One deletes a record another expects.
- Tests pass locally (single worker).
- Fail in CI.
Classic.
Fix:
- Generate unique test data per worker.
- Use isolated tenants.
- Avoid shared state.
- Clean up aggressively.
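Unique data per worker can be as simple as baking the worker index into every identifier. A minimal sketch, assuming a hypothetical helper named uniqueTestEmail: in a Playwright test the worker index would come from testInfo.workerIndex, while the run ID (a CI build number here) is an assumption of this example.

```typescript
// Hypothetical helper: builds an address that cannot collide across
// parallel workers or across repeated runs.
function uniqueTestEmail(workerIndex: number, runId: string, label: string): string {
  // Keep only characters that are safe in the local part of an address.
  const safeLabel = label
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-|-$/g, '');
  return `${safeLabel}-w${workerIndex}-${runId}@example.test`;
}

// workerIndex would be testInfo.workerIndex inside a real test;
// 'build42' stands in for a CI build number or timestamp.
const emailA = uniqueTestEmail(0, 'build42', 'Checkout: happy path');
const emailB = uniqueTestEmail(1, 'build42', 'Checkout: happy path');
// emailA → 'checkout-happy-path-w0-build42@example.test'
// emailB → 'checkout-happy-path-w1-build42@example.test'
```

The same test running on two workers now touches two different users, so neither can delete or modify what the other expects.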
Parallel execution exposes architectural weaknesses. That’s a feature — not a bug.
4️⃣ Detached Elements (Still Happens)
Yes, Playwright handles stale elements better than Selenium.
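But this can still surprise you:
const button = page.locator('#submit');
await someAsyncOperation();
await button.click();
A locator re-resolves its selector when the action runs, so it does not go stale the way an ElementHandle from page.$() can. But if the component re-renders during the gap, the click may land on a button whose state is no longer what the test assumed.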
Fix:
- Avoid long gaps between locator creation and action.
- Use locators directly at action time.
- Assert state before interacting.
Playwright reduces stale element pain. It doesn’t eliminate reactive UI complexity.
5️⃣ Retries Masking Real Problems
Playwright retries assertions automatically.
You can also configure test retries.
That’s useful.
But overused retries create a silent decay:
- The suite “mostly passes.”
- CI occasionally flickers.
- Engineers ignore red builds.
- Trust erodes.
Retries are a diagnostic tool.
They are not a stability strategy.
If a test needs 3 retries to pass, it is telling you something about your system.
Listen to it.
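One way to listen is to make passed-on-retry results visible instead of letting them blend into green. Playwright's own reports already label such tests as flaky. The sketch below shows the underlying idea on a hypothetical result shape (TestRun and findFlakyTests are invented for illustration, not part of Playwright).

```typescript
// Hypothetical result shape: one entry per test, with the outcome
// of each attempt in the order it ran.
type Attempt = 'passed' | 'failed';

interface TestRun {
  title: string;
  attempts: Attempt[];
}

// A test counts as flaky when it eventually passed but needed
// more than one attempt to get there.
function findFlakyTests(runs: TestRun[]): string[] {
  return runs
    .filter(run =>
      run.attempts.length > 1 &&
      run.attempts[run.attempts.length - 1] === 'passed')
    .map(run => run.title);
}

const runs: TestRun[] = [
  { title: 'login works', attempts: ['passed'] },
  { title: 'order submits', attempts: ['failed', 'failed', 'passed'] },
  { title: 'export report', attempts: ['failed', 'failed', 'failed'] },
];

const flaky = findFlakyTests(runs);
// flaky → ['order submits']
```

A list like this, tracked per build, shows whether retries are diagnosing problems or hiding them.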
The Bigger Truth: Flaky Tests Often Reveal System Design Problems
In most cases, persistent Playwright flakiness is caused by:
- Weak state management
- Inconsistent API contracts
- Shared mutable test data
- Poor observability
- Lack of deterministic backend behavior
Testing doesn’t create these issues.
It exposes them.
When you fix flakiness properly, you often improve:
- API consistency
- Idempotency
- Error handling
- Logging
- Frontend state control
That’s why mature SDETs don’t “fix flaky tests.”
They stabilize systems.
Advanced Stabilization Patterns That Actually Work
Here’s what consistently works in real-world CI pipelines:
✅ Wait for Business Signals, Not DOM State
Use waitForResponse, waitForRequest, or domain-specific markers.
✅ Assert Transitions, Not End States
Instead of:
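await expect(success).toBeVisible();
Do:
await expect(loader).toBeVisible();
await expect(loader).toBeHidden();
await expect(success).toBeVisible();
You validate the journey, not just the destination. One caveat: if the loader can appear and disappear faster than the assertion polls, the first check becomes its own source of flakiness, so only assert transitions the UI holds long enough to observe.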
✅ Control Test Data Deterministically
- Unique identifiers per worker
- Clean teardown
- API-driven setup instead of UI setup
✅ Use Tracing Properly
Playwright’s trace viewer is underused.
Enable it:
--trace on
Analyze:
- Network timing
- DOM snapshots
- Actionability logs
Most flakiness leaves fingerprints there.
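Recording a trace on every run gets heavy in CI. A common alternative is to record only when a test is retried, which captures exactly the runs that flaked. A config sketch, assuming a retry count of 1 (pick your own) and using a plain object so the fragment has no imports:

```typescript
// playwright.config.ts (sketch). A plain object is exported here;
// wrapping it in defineConfig from '@playwright/test' adds type checking.
const config = {
  // Assumed value: one retry is enough to trigger trace capture.
  retries: 1,
  use: {
    // Record a trace only on the first retry of a failing test.
    trace: 'on-first-retry',
  },
};

export default config;
```

The recorded trace can then be opened with `npx playwright show-trace` followed by the path to the trace file.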
✅ Separate UI Instability from Backend Instability
If the API is unstable:
- That’s not a UI test problem.
- That’s a system reliability problem.
Tests should surface it — not hide it.
Final Thought
If your Playwright suite is flaky, it’s probably not because you forgot await.
It’s because:
- Your UI lies before backend confirmation.
- Your mocks drifted from reality.
- Your data layer isn’t isolated.
- Your system isn’t deterministic.
Playwright is a powerful framework.
But stability isn’t a framework feature.
It’s an engineering discipline.
And once you start fixing flakiness at that level, you stop being “the automation person.”
You become the engineer who makes the system reliable.
Writer: Sourojit Das
— Bhuwan Chettri
Editor, CodeToDeploy