Test Explorer
Test Cases Health and Performance Dashboard - Flakiness, Failure Rate, Duration.
The Test Explorer displays performance and health metrics for individual tests. The metrics are:
Failure Rate
Failure Volume
Flakiness Rate
Flakiness Volume
Duration
Duration Volume
Use it to identify underperforming tests and to explore trends and changes in test behavior.
Currents calculates the metrics by aggregating the test results. You can fine-tune the aggregations by applying various filters, for example:
What are the 30-day flakiest tests from the main branch?
What are the 14-day most failing tests tagged onboarding?
What are the longest tests for the mobile viewport?
The Test Explorer metrics help you to evaluate the health and performance of your testing suite.
Duration is the average execution time of fully completed tests; executions that were cancelled or skipped are excluded.
Duration Volume measures how much total time a test contributes to the overall runtime of the test suite. It’s not just about how long a test takes per run, but also how often it runs.
Duration Volume = Avg. Duration × Executions
The raw number isn’t important on its own — it helps prioritize which tests are the biggest time sinks across all runs.
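As a concrete illustration, here is a minimal TypeScript sketch of the calculation, assuming a hypothetical per-execution record shape (not a Currents API type):

```typescript
// Hypothetical shape of one fully completed test execution
// (illustrative only, not a Currents data type).
interface ExecutionRecord {
  durationMs: number;
}

// Average duration of fully completed executions.
function avgDuration(executions: ExecutionRecord[]): number {
  const total = executions.reduce((sum, e) => sum + e.durationMs, 0);
  return executions.length > 0 ? total / executions.length : 0;
}

// Duration Volume = Avg. Duration × Executions.
function durationVolume(executions: ExecutionRecord[]): number {
  return avgDuration(executions) * executions.length;
}
```

Note that average × count reduces to the total time the test consumed across all runs, which is exactly why Duration Volume surfaces the biggest time sinks.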
Failure Rate measures the percentage of times a test fails when it is executed and provides insight into its reliability and stability. A higher failure rate may indicate issues within the test itself or the system under test.
Failure Volume measures how much a test contributes to the total number of failures in your test suite — combining how often it runs with how likely it is to fail. It’s calculated as:
Failure Volume = Failure Rate × Executions
This metric helps you spot which tests are the most significant contributors to failure noise, even if their failure rate isn’t high.
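For intuition, here is a short sketch that ranks tests by Failure Volume; the per-test aggregate shape below is an assumption made for the example, not a Currents API type:

```typescript
// Assumed per-test aggregate (illustrative only).
interface TestStats {
  title: string;
  executions: number;
  failureRate: number; // fraction of executions that failed, 0..1
}

// Failure Volume = Failure Rate × Executions.
const failureVolume = (t: TestStats): number => t.failureRate * t.executions;

// Rank tests by their contribution to overall failure noise.
function topFailureContributors(tests: TestStats[], limit = 10): TestStats[] {
  return [...tests]
    .sort((a, b) => failureVolume(b) - failureVolume(a))
    .slice(0, limit);
}
```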
Flakiness Rate measures the percentage of times a test produces inconsistent pass/fail results. Analyzing it lets you focus on improving the reliability and stability of flaky tests, reducing false positives and negatives, and enhancing the overall trustworthiness of the test suite.
Flakiness Volume quantifies how much a test’s flakiness impacts the overall stability of your test suite. It combines how often a test runs with how flaky it is, giving a sense of how much the test is likely to cause inconsistencies or unreliable results. A test that runs frequently with a low flakiness rate could cause more issues overall than a test that rarely runs but is highly flaky.
Flakiness Volume = Flakiness Rate × Executions
Samples shows how many recordings were included in calculating the metrics, i.e. those that matched the selected period and filters.
Metrics like Duration Volume, Flakiness Volume and Failure Volume measure the impact of the associated test on the overall suite performance. The scores are calculated by multiplying the corresponding metric by the number of samples. The absolute number has no meaning on its own; it is only useful for comparing tests.
For example, consider two tests:
Test A runs rarely, reported 10 samples, with a 15% flakiness rate.
Test B runs often, reported 40 samples, with a 5% flakiness rate.
Test A Flakiness Volume is 10 x 0.15 = 1.5
Test B Flakiness Volume is 40 x 0.05 = 2
Test B has a higher Flakiness Volume because it affects the overall test suite flakiness more, although its rate is lower.
In short, a test that’s a little flaky but runs a lot can be a bigger problem than a very flaky test that rarely runs. The actual number doesn’t matter on its own — it’s just useful to compare tests and see which ones are dragging down reliability the most.
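The same comparison, expressed as a runnable TypeScript sketch (the numbers match the example above; the helper name is illustrative):

```typescript
// Flakiness Volume = Flakiness Rate × Executions.
const flakinessVolume = (samples: number, flakinessRate: number): number =>
  samples * flakinessRate;

const testA = flakinessVolume(10, 0.15); // 1.5
const testB = flakinessVolume(40, 0.05); // 2.0

// Test B affects the suite more, despite its lower flakiness rate.
console.log({ testA, testB });
```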
Only test recordings that match the active filters are included in the metric calculations.
Date Range - include items recorded within the specified period.
Tag - include items with the matching Playwright Tags.
Author - include items with the matching Git Author (see Commit Information).
Branch - include items with the matching Git Branch (see Commit Information).
Group - include items recorded for a particular group (e.g. Firefox or Chromium).
Search by spec name - narrow down the results by test spec name.
Search by test title - narrow down the results by test title.
Click on a column header to sort the tests by the corresponding column, and then click again to change the sorting order.
Include Failed Executions - include or exclude failed executions when calculating the average duration.
Include Skipped Executions - include or exclude skipped executions from the metric calculations; see the sketch below.
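For intuition, here is a minimal sketch of how these two toggles change the average duration; the record shape and status values are assumptions, not the actual Currents data model:

```typescript
// Assumed execution record; the status names are illustrative.
type Status = 'passed' | 'failed' | 'skipped';

interface Execution {
  status: Status;
  durationMs: number;
}

// Average duration, honoring the two toggles described above.
function avgDurationWithToggles(
  executions: Execution[],
  includeFailed: boolean,
  includeSkipped: boolean
): number {
  const included = executions.filter(
    (e) =>
      (e.status !== 'failed' || includeFailed) &&
      (e.status !== 'skipped' || includeSkipped)
  );
  if (included.length === 0) return 0;
  return included.reduce((sum, e) => sum + e.durationMs, 0) / included.length;
}
```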
Here are a few examples of what information you can get from the Test Explorer:
The flakiest tests from recent months for a specific branch.
The top failing tests in the suite.
The failure rate change for specific branches over the past months.
The slowest tests and how their duration changed over time.
Clicking on a test title reveals a dedicated view of the specific test's performance, including a detailed History of Executions, Performance Metrics and Top Errors analysis.
Use the Metrics tab to analyze test health trends.
See the Reference section for more details on the Performance Charts and History of the tests.
Schedule Automated Reports with the top items from the Test Explorer view to arrive in your inbox automatically for proactive monitoring of test suite health.
View Reference for detailed information on test History and Performance Charts.
Click on the Export icon to download the data in JSON format.
Click on the Settings icon to customize the view.