Overview

The GPU bots run a different set of tests than the majority of the Chromium test machines. The GPU bots specifically focus on tests which exercise the graphics processor, and whose results are likely to vary between graphics card vendors.

Most of the tests on the GPU bots are run via the Telemetry framework. Telemetry was originally conceived as a performance testing framework, but has proven valuable for correctness testing as well. Telemetry directs the browser to perform various operations, like page navigation and test execution, from external scripts written in Python. The GPU bots launch the full Chromium browser via Telemetry for the majority of the tests. Using the full browser to execute tests, rather than smaller test harnesses, has yielded several advantages: testing what is shipped, improved reliability, and improved performance.

A subset of the tests, called “pixel tests”, grab screen snapshots of the web page in order to validate Chromium's rendering architecture end-to-end. Where necessary, GPU-specific results are maintained for these tests. Some of these tests verify just a few pixels, using handwritten code, in order to use the same validation for all brands of GPUs.

The GPU bots use the Chrome infrastructure team's recipe framework, and specifically the chromium and chromium_trybot recipes, to describe what tests to execute. Compared to the legacy master-side buildbot scripts, recipes make it easy to add new steps to the bots, change the bots' configuration, and run the tests locally in the same way that they are run on the bots. Additionally, the chromium and chromium_trybot recipes make it possible to send try jobs which add new steps to the bots. This single capability is a huge step forward from the previous configuration, where new steps were added blindly and could cause failures on the tryservers. For more details about the configuration of the bots, see the GPU bot details.

When looking at a bot's build results, scan down through the steps looking for the text "GPU"; that identifies the tests run on the GPU bots. For each test, the "trigger" step can be ignored; the step further down with the same name contains the results.

It's usually not necessary to explicitly send try jobs just for verifying GPU tests. If you want to, you must invoke "git cl try" separately for each builder you want to target, for example:

git cl try -b linux-rel
git cl try -b mac-rel
git cl try -b win7-rel

Alternatively, the Gerrit UI can be used to send a patch set to these try servers.

Three optional tryservers which run additional tests are also available. As of this writing, they run longer tests that can't be run against all Chromium CLs due to lack of hardware capacity. They are included automatically for code changes to certain sub-directories.

Tryservers for the ANGLE project are also present on the tryserver.chromium.angle waterfall. These are invoked from the Gerrit user interface. They are configured similarly to the tryservers for regular Chromium patches, and run the same tests that are run on the chromium.gpu.fyi waterfall, in the same way (e.g., against ToT ANGLE).

If you find it necessary to try patches against sub-repositories other than Chromium (src/) and ANGLE (src/third_party/angle/), please file a bug with component Internals>GPU>Testing.

Running the GPU Tests Locally

All of the GPU tests running on the bots can be run locally from a Chromium build. Many of the tests are simple executables that can be invoked directly (see the example after this list):

*   angle_unittests
*   gl_tests
*   gl_unittests
*   tab_capture_end2end_tests
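
These are standard gtest binaries run from the build output directory. A sketch, assuming a Release build; the --use-gpu-in-tests flag is an assumption based on how the bots run these suites, and the filter is illustrative:

out/Release/gl_tests --use-gpu-in-tests --gtest_filter=GLTest.*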

Some run only on the chromium.gpu.fyi waterfall, either because there isn't enough machine capacity at the moment, or because they're closed-source tests which aren't allowed to run on the regular Chromium waterfalls:

*   angle_deqp_gles2_tests
*   angle_deqp_gles3_tests
*   angle_end2end_tests
*   audio_unittests

The remaining GPU tests are run via Telemetry. In order to run them, just build the chrome target and then invoke src/content/test/gpu/run_gpu_integration_test.py with the appropriate argument. The tests this script can invoke are in src/content/test/gpu/gpu_tests/. For example:
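
(The suite names below mirror the files in src/content/test/gpu/gpu_tests/ and are illustrative; see that directory for the current set:)

run_gpu_integration_test.py pixel --browser=release
run_gpu_integration_test.py context_lost --browser=release
run_gpu_integration_test.py webgl_conformance --browser=release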

If you're testing on Android and have built and deployed ChromePublic.apk to the device, use --browser=android-chromium to invoke it.

Note: If you are on Linux and see this test harness exit immediately with **Non zero exit code**, it's probably because incompatible Python packages are installed. In this case, uninstall the python-egenix-mxdatetime and python-logilab-common packages; see Issue 716241. This should no longer happen, since the GPU tests were switched to use the infra team's vpython harness.

Figuring out the exact command line that was used to invoke the test on the bots can be a little tricky. The bots all run their tests via Swarming and isolates, meaning that the invocation of a step like [trigger] webgl_conformance_tests on NVIDIA GPU... will look like:

You can figure out the additional command line arguments that were passed to each test on the bots by examining the trigger step and searching for the argument separator ( -- ). For a recent invocation of webgl_conformance_tests, this looked like:

The Maps test requires you to authenticate to cloud storage in order to access the Web Page Replay archive containing the test. See Cloud Storage Credentials for documentation on setting this up.

Running the pixel tests locally

The pixel tests run in a few different modes:

*   The waterfall bots generate reference images into cloud storage, and pass the --upload-refimg-to-cloud-storage command line argument.
*   The trybots use the reference images that were generated by the waterfall bots. They pass the --download-refimg-from-cloud-storage command line argument, as well as other needed ones like --refimg-cloud-storage-bucket and --os-type.
*   When run locally, the first time the pixel tests are run, generated reference images are placed into src/content/test/data/gpu/gpu_reference/. On second and subsequent runs, if tests fail, failure images are placed into src/content/test/data/gpu/generated.

It's possible to make your local pixel tests download the reference images from cloud storage, if your workstation has the same OS and GPU type as one of the bots on the waterfall, and you pass the --download-refimg-from-cloud-storage, --refimg-cloud-storage-bucket, --os-type and --build-revision command line arguments.
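
For example, a local run that reuses the waterfall bots' reference images might look like the following; the bucket, OS type, and revision values here are illustrative and must match a bot on the waterfall:

run_gpu_integration_test.py pixel --browser=release --download-refimg-from-cloud-storage --refimg-cloud-storage-bucket=chromium-gpu-archive/reference-images --os-type=win --build-revision=123456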

Example command line for running the pixel tests locally on a desktop platform, where the Chrome build is in out/Release:

run_gpu_integration_test.py pixel --browser=release

Running against a connected Android device where ChromePublic.apk has already been deployed:

run_gpu_integration_test.py pixel --browser=android-chromium

You can run a subset of the pixel tests via the --test-filter argument, which takes a regex:
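
For example, to run only the pixel tests whose names match a prefix (the pattern below is illustrative; the real names are defined in the pixel test suite):

run_gpu_integration_test.py pixel --browser=release --test-filter=Pixel_Canvas.*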

The task ID can be found in the stdio for the “trigger” step for the test. For example, look at a recent build from the Mac Release (Intel) bot, and look at the gl_unittests step. You will see something like:

There is a difference between the isolate's hash and Swarming's task ID. Make sure you use the task ID and not the isolate's hash.
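
With the task ID in hand, the task can in principle be re-run via the Swarming client. A sketch, assuming the standard Chromium Swarming server and the client checked out under src/tools/swarming_client/:

src/tools/swarming_client/swarming.py reproduce -S https://chromium-swarm.appspot.com [task ID]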

As of this writing, there seems to be a bug when attempting to re-run the Telemetry based GPU tests in this way. For the time being, this can be worked around by instead downloading the contents of the isolate. To do so, look more deeply into the trigger step's log:

As of this writing, the isolate hash appears twice in the command line. To download the isolate's contents into directory foo (note: this is described in the "Help" section associated with the page for the isolate's task, though it's unclear whether that page is accessible only to Google employees or to all members of the chromium.org organization):
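
(A sketch, assuming the public isolate server and the Swarming client under src/tools/swarming_client/; verify the flags against isolateserver.py --help in your checkout:)

src/tools/swarming_client/isolateserver.py download -I https://isolateserver.appspot.com -s [isolate hash] --target foo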

isolateserver.py will tell you the approximate command line to use. You should concatenate the test arguments (everything after the -- separator in the trigger step) with isolateserver.py's recommendation. The ISOLATED_OUTDIR variable can be safely replaced with /tmp.

Note that isolateserver.py downloads a large number of files (everything needed to run the test) and may take a while. There is a way to use run_isolated.py to achieve the same result, but as of this writing, there were problems doing so, so this procedure is not documented at this time.

Before attempting to download an isolate, you must ensure you have permission to access the isolate server. Full instructions can be found here. For most cases, you can simply run:
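
(The exact command is an assumption based on the Swarming client's auth.py; the authentication instructions linked above are authoritative:)

src/tools/swarming_client/auth.py login --service=https://isolateserver.appspot.com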

The above link requires that you log in with your @google.com credentials. It's not known at the present time whether this works with @chromium.org accounts. Email kbr@ if you try this and find it doesn't work.

Running Locally Built Binaries on the GPU Bots

See the Swarming documentation for instructions on how to upload your binaries to the isolate server and trigger execution on Swarming.

Moving Test Binaries from Machine to Machine

To create a zip archive of your personal Chromium build plus all of the Telemetry-based GPU tests' dependencies, which you can then move to another machine for testing:
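
(A sketch using mb's zip subcommand, assuming a Release build in out/Release and the telemetry_gpu_integration_test target; adjust both for your checkout:)

python tools/mb/mb.py zip out/Release telemetry_gpu_integration_test telemetry_gpu_integration_test.zip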

Then copy telemetry_gpu_integration_test.zip to another machine. Unzip it, and cd into the resulting directory. Invoke content/test/gpu/run_gpu_integration_test.py as above.

This workflow has been tested successfully on Windows with a statically-linked Release build of Chrome.

Note: on one macOS machine, this command failed because of a broken strip-json-comments symlink in src/third_party/catapult/common/node_runner/node_runner/node_modules/.bin. Deleting that symlink allowed it to proceed.

Note also: on the same macOS machine, with a component build, this command failed to zip up a working Chromium binary. The browser failed to start with the following error:

In a pinch, this command can still be used to bundle up everything; then delete the "out" directory from the resulting zip archive and move the Chromium binaries over to the target machine separately. The command line arguments --browser=exact --browser-executable=[path] can then be used to launch that specific browser.
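
For example, assuming the binaries were copied to /path/to/out/Release on the target machine (paths illustrative):

run_gpu_integration_test.py pixel --browser=exact --browser-executable=/path/to/out/Release/chrome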

Adding New Tests to the GPU Bots

The goal of the GPU bots is to avoid regressions in Chrome's rendering stack. To that end, let's add as many tests as possible that will help catch regressions in the product. If you see a crazy bug in Chrome's rendering which would be easy to catch with a pixel test running in Chrome and hard to catch in any of the other test harnesses, please invest the time to add a test!

There are a couple of different ways to add new tests to the bots:

*   Adding a new test to one of the existing harnesses.
*   Adding an entire new test step to the bots.

Adding a new test to one of the existing test harnesses

Adding new tests to the GTest-based harnesses is straightforward and essentially requires no explanation.

As of this writing it isn't as easy as desired to add a new test to one of the Telemetry based harnesses. See Issue 352807. Let's collectively work to address that issue. It would be great to reduce the number of steps on the GPU bots, or at least to avoid significantly increasing the number of steps on the bots. The WebGL conformance tests should probably remain a separate step, but some of the smaller Telemetry based tests (context_lost_tests, memory_test, etc.) should probably be combined into a single step.

If you are adding a new test to one of the existing tests (e.g., pixel_test), all you need to do is make sure that your new test runs correctly via isolates. See the documentation from the GPU bot details on adding new isolated tests for the gn args and authentication needed to upload isolates to the isolate server. Most likely the new test will be Telemetry based, and included in the telemetry_gpu_test_run isolate. You can then invoke it via:
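
(A sketch, assuming mb's run subcommand can build and launch the isolate locally; the target name comes from the paragraph above, and arguments after -- are passed to the test:)

python tools/mb/mb.py run out/Release telemetry_gpu_test_run -- [test args]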

The JSON files under src/testing/buildbot/ are autogenerated by generate_buildbot_json.py, which is documented in testing/buildbot/README.md. They are parsed by the chromium and chromium_trybot recipes, and describe two basic types of tests:

*   GTests: those which use the Googletest and Chromium's base/test/launcher/ frameworks.
*   Isolated scripts: tests whose initial entry point is a Python script which follows a simple convention of command line argument parsing.

The majority of the GPU tests, however, are a third variant:

*   Telemetry based tests: isolated script tests which are built on the Telemetry framework and which launch the entire browser.

A prerequisite of adding a new test to the bots is that the test run via isolates. Once that is done, modify test_suites.pyl to add the test to the appropriate set of bots. Be careful when adding large new test steps to all of the bots: the GPU bots are a limited resource and do not currently have the capacity to absorb large new test suites. It is safer to get new tests running on the chromium.gpu.fyi waterfall first and expand from there to the chromium.gpu waterfall. Note that tests on the chromium.gpu waterfall also run against every Chromium CL, by virtue of the linux-rel, mac-rel, win7-rel and android-marshmallow-arm64-rel tryservers mirroring the bots on that waterfall, so be careful!

Tryjobs which add new test steps to the chromium.gpu.json file will run those new steps during the tryjob, which helps ensure that the new test won't break once it starts running on the waterfall.

Tryjobs which modify chromium.gpu.fyi.json can be sent to the win_optional_gpu_tests_rel, mac_optional_gpu_tests_rel and linux_optional_gpu_tests_rel tryservers to help ensure that they won't break the FYI bots.

Debugging Pixel Test Failures on the GPU Bots

If pixel tests fail on the bots, the stdout will contain text like:

See http://chromium-browser-gpu-tests.commondatastorage.googleapis.com/view_test_results.html?[HASH]

This link contains all of the failing tests' generated and reference images, and is useful for figuring out exactly what went wrong. Issue 898649 tracks improving this user interface, so that the failures can be surfaced directly in the build logs rather than having to dig through stdout.

Updating and Adding New Pixel Tests to the GPU Bots

Adding new pixel tests which require reference images is a slightly more complex process than adding other kinds of tests which can validate their own correctness. There are a few reasons for this.

The reference images must be generated by the main waterfall. The try servers are not allowed to produce new reference images, only consume them. The reason for this is that a patch sent to the try servers might cause an incorrect reference image to be generated. For this reason, the main waterfall bots upload reference images to cloud storage, and the try servers download them and verify their results against them.

The try servers will fail if they run a pixel test requiring a reference image that doesn't exist in cloud storage. This is deliberate, but needs more thought; see Issue 349262.

If a reference image based pixel test's result is going to change because of a change in a third party repository (e.g. in ANGLE), updating the reference images is a slightly tricky process. Here's how to do it:

*   Mark the affected tests as failing, without platform condition, in the pixel tests' test expectations.
*   Commit the change to the third party repository, etc. which will change the tests' results. (Note that without the failure expectation, this commit would turn some bots red; e.g., an ANGLE change will turn the chromium.gpu.fyi bots red.)
*   Wait for the third party repository to roll into Chromium.
*   Commit a change incrementing the revision number associated with the test in the test pages.
*   Commit a second change removing the failure expectation, once all of the bots on the main waterfall have generated new reference images. This change should go through the commit queue cleanly.

When adding a brand new pixel test that uses a reference image, the steps are similar, but simpler:

*   In the same commit which introduces the new test, mark the pixel test as failing without platform condition in the pixel test's test expectations.
*   Wait for the reference images to be produced by all of the GPU bots on the waterfalls (see [chromium-gpu-archive/reference-images]).
*   Commit a change un-marking the test as failing.

When making a Chromium-side change (including Blink, which now lives in the Chromium repository) which changes the pixel tests' results:

*   In your CL, both mark the pixel test as failing without platform condition in the pixel test's test expectations and increment the test's version number in the test pages.
*   After your CL lands, land another CL removing the failure expectations. If this second CL goes through the commit queue cleanly, you know the reference images were generated properly.

In general, when adding a new pixel test, it's better to spot check a few pixels in the rendered image rather than using a reference image per platform. The GPU rasterization test is a good example of a recently added test which performs such spot checks.

Stamping out Flakiness

It's critically important to aggressively investigate and eliminate the root cause of any flakiness seen on the GPU bots. The bots have been known to run reliably for days at a time, and any flaky failures that are tolerated on the bots translate directly into instability of the browser experienced by customers. Critical bugs in subsystems like WebGL, affecting high-profile products like Google Maps, have escaped notice in the past because the bots were unreliable. After much re-work, the GPU bots are now among the most reliable automated test machines in the Chromium project. Let's keep them that way.

Flakiness affecting the GPU tests can come in from highly unexpected sources. Here are some examples:

*   Intermittent pixel_test failures on Linux where the captured pixels were black, caused by the Display Power Management System (DPMS) kicking in. In response, the X server's built-in screen saver was disabled on the GPU bots.
*   A change to Blink's memory purging primitive which caused intermittent timeouts of WebGL conformance tests on all platforms (Issue 840988).

If you notice flaky test failures either on the GPU waterfalls or try servers, please file bugs right away with the component Internals>GPU>Testing and include links to the failing builds and copies of the logs, since the logs expire after a few days. GPU pixel wranglers should give the highest priority to eliminating flakiness on the tree.