Playwright Orchestration

Balancing Playwright tests between CI machines for even load distribution.

Orchestration for Playwright can help to decrease the duration of Playwright tests in CI pipelines. We've seen up to 40% decrease in CI runs duration, compared to the traditional Playwright Sharding.

Playwright Orchestration vs Sharding

There are various approaches to running playwright tests in parallel on CI. Let's try to understand the differences.

Playwright Sharding

Sharding is natively supported by Playwright, for example, if you have want to run the tests in parallel on two machines, you would run the following command on each:

  • machine #1: playwright test --shard 1/2

  • machine #2: playwright test --shard 2/2

Playwright will use the internal heuristic to split the tests between the available shards, so that each machine runs a different set of tests. The traditional Playwright sharding method splits spec files between available shards (machines) based on filenames. It often leads to unbalanced workloads because spec files execution time can vary significantly.

Take, for example, a testing suite consisting of 4 spec files with the following durations:

  • spec01: 10 minutes

  • spec02: 10 minutes

  • spec03: 3 minutes

  • spec04: 2 minutes

With 2 machines, traditional sharding might distribute the files as follows, leading to an inefficient total execution time of 20 minutes due to one shard being heavily loaded while the other finishes quickly:

  • Shard 1: spec01, spec02 (20 minutes total)

  • Shard 2: spec03, spec04 (5 minutes total)

This imbalance means that some shards may end up processing only long-running spec files, while others quickly finish with shorter tests, leading to idle resources.

Playwright Orchestration

An alternative approach to sharding is orchestration - i.e. using an external service to act as an "orchestrator" and instruct each machine what tests to run. The "orchestrator" can make decisions on how to split the tests between the available machines based on various criteria - for example,

  • balancing tests in the most optimal way - e.g. reducing duration, prioritizing certain tests

  • dynamically changing assigned tests based on the available machines

  • rerouting tests from crashed CI containers to healthy containers (see spot instances use case below)

The most popular problem is reducing CI duration and utilizing the CI resources effectively, so let's focus on it.

An optimal assignment would consider the duration of spec files and can significantly improve the efficiency of test execution. By balancing the workload across CI machines based on their expected durations, you minimize overall execution time and make better use of resources.

For the earlier example:

  • Shard 1: spec01 (10 minutes) and spec03 (3 minutes), totaling 13 minutes.

  • Shard 2: spec02 (10 minutes) and spec04 (2 minutes), also totaling 12 minutes.

The difference in execution time 13 vs 20 minutes, which is a 35% improvement. Read more about Load Balancing.

How to enable Playwright Orchestration?

To enable Playwrigth Orchestration, you need to have an account with Currents:

After creating an account, install the most recent version of @currents/playwright.

The package has a CLI tool pwc-p, which is a small wrapper that implements Playwright Orchestration:

  • it scans the testing suite

  • it establishes an orchestration session with Currents servers

  • it runs Playwright, executing spec files in the optimal order

  • the results are recorded to Currents for troubleshooting and analysis

For example:

npm i @currents/playwright
npx pwc-p --key <record-key> --project-id <project-id> --ci-build-id <ci-build-id>

If you are not already familiar, read more about CI Build ID .

A successfully created orchestration will print an output similar to this

$ npx pwc-p --key **redacted** --project-id **redacted** --ci-build-id `date +%s`  -c ./or8n/playwright.config.ts
🚀 Starting orchestration session...
📦 Currents reporter: 1.1.2 recording CI build 1712134904 for project JJzd65, orchestration id 260264cfa16950ab4dc98d5c54333136
🎭 Playwright: 1.42.1 5 tests in 1 project [chromium]

🌐 Executing orchestrated task: [chromium] spec-or8n-e.spec.ts
🌐 Run URL: https://app.currents.dev/run/9b93659915fe653f
# ...start executing the tests in an optimal order.

pwc-p accepts additional playwright arguments and flags (see @currents/playwright) for example:

# Add additional playwright arguments and flags:
pwc-p --key <record-key> --project-id <id> --ci-build-id <build-id> -- --workers 2 --timeout 10000

Dynamically switching test between machines

An additional benefit of an external service like Currents for balancing tests is the ability to automatically reassign tests tests from one machine to another, depencing on certain conditions.

For example - many cloud providers have an option to use Spot Instances for running workloads. Using Spot Instances can cost up to 90% lower, compared to the traditionally allocated resources.

Read more CI Tests on Spot Instances.

Limitations and Nuances

  • Orchestration is only effective for suites with relatively large number of spec files (not tests)

  • Orchestration works on a file level - i.e. it balances spec files (rather than tests) between the available machines

  • Orchestration runs one spec file at a time on each machine - it is recommended to enable fullyParallel mode to fully utilize the available CPUs

Beware of the following limitations

  • Playwright Project dependencies are not currently supported - i.e. if you have projects that depend one on another, orchestration will not consider the dependencies. As workaround you can run the projects in the desired order explicitly by defining separate CI steps with --project <name> specification.

  • Global Setup and Teardown. An orchestrated execution will run a playwright command for each individual file of your testing suite. Beware, that the global setup or teardown routines will run for each spec file, accordingly.

  • Rerunning a failed CI execution requires generating a new CI Build ID also a rerun will include all the tests - not only the failed ones.

Last updated