Dynamic Playwright Sharding in GitHub Actions


Playwright's sharding feature splits your test suite across multiple runners for parallel execution. Most tutorials show you something like --shard=1/4, hardcoding four shards and calling it a day.

That approach works until it doesn't. Add 50 tests and suddenly each shard is responsible for a larger chunk of work. Remove tests and you're paying for sparsely utilized runners. These hardcoded values can easily drift as your test suite evolves.

There's a better approach: calculate the shard count dynamically from your actual test suite.


The Problem with Static Sharding

Consider a typical Playwright workflow with hardcoded shards:

strategy:
  matrix:
    shardIndex: [1, 2, 3, 4]
    shardTotal: [4]
steps:
  - run: pnpm exec playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}

To be fair, this approach is perfectly fine for well-defined test suites that rarely change. For more active projects, though, there are some notable downsides:

Manual maintenance - Every time your test count changes significantly, someone needs to remember to update the shard count. They won't. If they do, it will likely be long after tests have started taking noticeably longer, once the team has already suffered through the slow feedback loop.

One size fits none - Four shards might be perfect for your current test count, but if you start testing against multiple browsers or add more tests, the balance is thrown off.


Dynamic Sharding

Here's an abbreviated workflow demonstrating the dynamic shard calculation:

name: E2E Tests

on:
  pull_request:
    branches: [main]

jobs:
  generate-shards-matrix:
    name: Generate Shards Matrix
    runs-on: ubuntu-latest
    env:
      TESTS_PER_SHARD: 25
      MIN_SHARDS: 1
      MAX_SHARDS: 12
    outputs:
      matrix: ${{ steps.set-matrix.outputs.matrix }}
    steps:
      - uses: actions/checkout@v6
        with:
          sparse-checkout: |
            e2e
            playwright.config.ts

      - name: Generate matrix
        id: set-matrix
        run: |
          TEST_COUNT=$(grep -r "test(" e2e --include="*.spec.ts" --include="*.test.ts" | wc -l)
          PROJECT_COUNT=$(grep -cE "name:\s*['\"]" playwright.config.ts)
          TOTAL_TESTS=$((TEST_COUNT * PROJECT_COUNT))

          SHARD_COUNT=$(( (TOTAL_TESTS + TESTS_PER_SHARD - 1) / TESTS_PER_SHARD ))
          SHARD_COUNT=$(( SHARD_COUNT < MIN_SHARDS ? MIN_SHARDS : SHARD_COUNT ))
          SHARD_COUNT=$(( SHARD_COUNT > MAX_SHARDS ? MAX_SHARDS : SHARD_COUNT ))

          MATRIX=$(jq -cn --argjson count "$SHARD_COUNT" \
            '[range(1; $count + 1) | { "shardIndex": ., "shardTotal": $count }]')

          echo "Tests: $TEST_COUNT x $PROJECT_COUNT projects = $TOTAL_TESTS total"
          echo "Shards: $SHARD_COUNT"
          echo "matrix=$MATRIX" >> "$GITHUB_OUTPUT"

  test:
    name: Test (${{ matrix.shardIndex }}/${{ matrix.shardTotal }})
    runs-on: ubuntu-latest
    needs: [generate-shards-matrix]
    strategy:
      fail-fast: false
      matrix:
        include: ${{ fromJSON(needs.generate-shards-matrix.outputs.matrix) }}
    steps:
      - uses: actions/checkout@v6
      # ... setup steps ...
      - run: pnpm exec playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
      - if: always()
        uses: actions/upload-artifact@v6
        with:
          name: blob-report-${{ matrix.shardIndex }}
          path: blob-report/
          retention-days: 1

Counting Tests

The first step is counting how many tests exist. You could use playwright test --list, but that requires installing dependencies first and can easily waste a lot of time. A simple grep is faster:

TEST_COUNT=$(grep -r "test(" e2e --include="*.spec.ts" --include="*.test.ts" | wc -l)

This searches for test( patterns in your test files. It's not perfect: it might count commented-out tests or produce other false positives. But for shard calculation, approximate is fine. You don't need exact precision to decide between 7 and 8 shards.
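To see the approximation in action, here's a small sketch using a made-up spec file (the path and test names are hypothetical):

```shell
# Create a sample spec file with three test definitions,
# one of which is commented out.
mkdir -p /tmp/e2e-demo
cat > /tmp/e2e-demo/login.spec.ts <<'EOF'
test('logs in with valid credentials', async () => {});
test('rejects invalid credentials', async () => {});
// test('remembers the user', async () => {});
EOF

# The grep counts all three matching lines -- commented test included.
TEST_COUNT=$(grep -r "test(" /tmp/e2e-demo --include="*.spec.ts" | wc -l)
echo "$TEST_COUNT"   # prints 3
```

The commented-out test inflates the count by one, which nudges the shard math slightly but never by enough to matter.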

Accounting for Projects

Playwright projects multiply your test count. If you have 39 test definitions running across chromium and mobile-chrome, that's 78 total tests:

PROJECT_COUNT=$(grep -cE "name:\s*['\"]" playwright.config.ts)
TOTAL_TESTS=$((TEST_COUNT * PROJECT_COUNT))
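As a sketch of what that grep matches, consider a trimmed-down, hypothetical config with two projects (note the pattern would also match any other name: keys that happen to use quotes, such as a named reporter, so keep the config tidy):

```shell
# A minimal, hypothetical playwright.config.ts with two projects.
cat > /tmp/playwright.config.ts <<'EOF'
export default defineConfig({
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'mobile-chrome', use: { ...devices['Pixel 7'] } },
  ],
});
EOF

# Count the lines declaring a project name, then multiply by the
# 39 test definitions from the example above.
PROJECT_COUNT=$(grep -cE "name:\s*['\"]" /tmp/playwright.config.ts)
TOTAL_TESTS=$((39 * PROJECT_COUNT))
echo "$PROJECT_COUNT projects, $TOTAL_TESTS total tests"
```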

Calculating Optimal Shards

With the total test count, we can calculate how many shards to use:

SHARD_COUNT=$(( (TOTAL_TESTS + TESTS_PER_SHARD - 1) / TESTS_PER_SHARD ))

Then, clamp the result to reasonable bounds:

SHARD_COUNT=$(( SHARD_COUNT < MIN_SHARDS ? MIN_SHARDS : SHARD_COUNT ))
SHARD_COUNT=$(( SHARD_COUNT > MAX_SHARDS ? MAX_SHARDS : SHARD_COUNT ))

MAX_SHARDS can be adjusted based on your needs, but having an upper limit prevents unexpected cost spikes if someone decides it's an excellent idea to open a PR adding hundreds of tests.
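Worked through with the values from the workflow above (25 tests per shard, bounds of 1 and 12), 78 total tests resolve to 4 shards:

```shell
TESTS_PER_SHARD=25
MIN_SHARDS=1
MAX_SHARDS=12
TOTAL_TESTS=78

# Ceiling division via integer math: (78 + 24) / 25 = 102 / 25 = 4.
SHARD_COUNT=$(( (TOTAL_TESTS + TESTS_PER_SHARD - 1) / TESTS_PER_SHARD ))
# Clamp to [MIN_SHARDS, MAX_SHARDS]; 4 is in range, so it passes through.
SHARD_COUNT=$(( SHARD_COUNT < MIN_SHARDS ? MIN_SHARDS : SHARD_COUNT ))
SHARD_COUNT=$(( SHARD_COUNT > MAX_SHARDS ? MAX_SHARDS : SHARD_COUNT ))
echo "$SHARD_COUNT"   # prints 4
```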

Generating the Matrix

GitHub Actions matrices require JSON. Use jq to generate the array:

MATRIX=$(jq -cn --argjson count "$SHARD_COUNT" \
  '[range(1; $count + 1) | { "shardIndex": ., "shardTotal": $count }]')

For 8 shards, this produces:

[
  {"shardIndex": 1, "shardTotal": 8},
  {"shardIndex": 2, "shardTotal": 8},
  {"shardIndex": 3, "shardTotal": 8},
  {"shardIndex": 4, "shardTotal": 8},
  {"shardIndex": 5, "shardTotal": 8},
  {"shardIndex": 6, "shardTotal": 8},
  {"shardIndex": 7, "shardTotal": 8},
  {"shardIndex": 8, "shardTotal": 8}
]

The test job can now consume this:

matrix:
  include: ${{ fromJSON(needs.generate-shards-matrix.outputs.matrix) }}

Additional Optimizations

Separate from dynamic sharding, several other optimizations can be layered on top to further speed up your workflow runs.

Browser Caching

Cache Playwright browsers to skip re-downloading on every run:

- name: Cache Playwright browsers
  id: playwright-cache
  uses: actions/cache@v5
  with:
    path: ~/.cache/ms-playwright
    key: ${{ runner.os }}-playwright-${{ hashFiles('pnpm-lock.yaml') }}

- if: steps.playwright-cache.outputs.cache-hit != 'true'
  run: pnpm exec playwright install chromium

Build Artifact Sharing

If your tests require a build step, run it once and share the artifact:

build:
  runs-on: ubuntu-latest
  steps:
    - run: pnpm build
    - run: tar -czf build-output.tar.gz .next .contentlayer public
    - uses: actions/upload-artifact@v6
      with:
        name: build-output
        path: build-output.tar.gz

test:
  needs: [build]
  steps:
    - uses: actions/download-artifact@v6
      with:
        name: build-output
    - run: tar -xzf build-output.tar.gz

Lightweight Report Merging

Merge blob reports without installing your project's dependencies:

merge-reports:
  needs: [test]
  if: ${{ !cancelled() && needs.test.result != 'skipped' }}
  steps:
    - uses: actions/download-artifact@v6
      with:
        path: all-blob-reports
        pattern: blob-report-*
        merge-multiple: true
    - run: npx --yes @playwright/test merge-reports --reporter html ./all-blob-reports

Conclusion

Dynamic sharding eliminates the maintenance burden of hardcoded parallel counts. The pattern also extends beyond Playwright: any parallelizable workload, whether test suites, build steps, or data processing, benefits from dynamic work distribution. The workflow structure remains the same: a generation job outputs JSON, and downstream jobs consume it.
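As a sketch of the same pattern applied elsewhere, a generation step could chunk an arbitrary file list into a matrix. The file names and chunk size here are placeholders:

```shell
# Hypothetical example: split a file list into chunks of 3 and emit
# one matrix entry per chunk, mirroring the shard-matrix pattern.
MATRIX=$(printf '%s\n' a.csv b.csv c.csv d.csv e.csv f.csv g.csv \
  | jq -cRn '[inputs] as $files
    | [range(0; ($files | length); 3) as $i | $files[$i : ($i + 3)]]
    | to_entries
    | map({ chunkIndex: (.key + 1), files: .value })')
echo "$MATRIX"
```

A downstream job would consume this exactly like the shard matrix, with `fromJSON` feeding `matrix.include` and each runner receiving its own files list.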

Computers can do math. Let them.

NOTE

This post is inspired by an article I found by Lewis Nelson while researching this topic. I follow a similar concept, but with some differences in implementation and additional performance optimizations.