Test UI performance

User interface (UI) performance testing ensures that your app not only meets its functional
requirements, but also that user interactions with it are smooth, running at a
consistent 60 frames per second (why 60fps?) without any dropped or delayed frames, which
we call jank. This document explains the tools available to measure UI performance and lays
out an approach to integrate UI performance measurements into your testing practices.

Measure UI performance

To improve performance, you first need to be able to measure the performance of
your system, and then diagnose and identify problems that may arise from various parts of your
pipeline.

dumpsys is an Android tool that runs on the device and dumps information about the status
of system services. Passing the gfxinfo command to dumpsys prints performance information
about the frames of animation that occurred during the recording phase.

> adb shell dumpsys gfxinfo <PACKAGE_NAME>

This command can produce multiple different variants of frame timing data.
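
If you want to capture this output from a script rather than a terminal, one option is to
invoke adb through Python's subprocess module. The following is a minimal sketch, not a
prescribed method: the helper names gfxinfo_command and run_gfxinfo are illustrative, and it
assumes adb is on your PATH with a single device or emulator attached.

```python
import subprocess


def gfxinfo_command(package, *extra_args):
    """Build the adb argument list for `dumpsys gfxinfo` on one package."""
    return ["adb", "shell", "dumpsys", "gfxinfo", package, *extra_args]


def run_gfxinfo(package, *extra_args):
    """Run the command and return its standard output as text.

    Assumes adb is on PATH and exactly one device or emulator is attached.
    Raises subprocess.CalledProcessError if adb exits with an error.
    """
    result = subprocess.run(
        gfxinfo_command(package, *extra_args),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, run_gfxinfo("com.example.app") would return the same text as the command above
for a placeholder package named com.example.app.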

Aggregate frame stats

Starting with Android 6.0 (API level 23), the command prints an aggregated analysis of frame
data, collected across the entire lifetime of the process. For example:

These statistics summarize the rendering performance of the app, as well
as its stability across many frames.

Precise frame timing info

Android 6.0 also adds a new gfxinfo command, framestats, which provides
extremely detailed frame timing information from recent frames, so that you can track down and
debug problems more accurately.

> adb shell dumpsys gfxinfo <PACKAGE_NAME> framestats

This command prints frame timing information, with nanosecond timestamps, from the last 120
frames produced by the app. Below is example raw output from adb shell dumpsys gfxinfo
<PACKAGE_NAME> framestats:

Each line of this output represents a frame produced by the app. Each line has a fixed number of
columns describing time spent in each stage of the frame-producing pipeline. The next section
describes this format in detail, including what each column represents.

Framestats data format

Because the block of data is output in CSV format, it is straightforward to paste it into your
spreadsheet tool of choice, or to collect and parse it with a script. The following entries
explain the format of the output data columns. All timestamps are in nanoseconds.

FLAGS

Rows with a ‘0’ for the FLAGS column can have their total frame time computed by
subtracting the INTENDED_VSYNC column from the FRAME_COMPLETED column.

If this value is non-zero, the row should be ignored, because the frame has been determined
to be an outlier from normal performance, where layout and draw are expected to take longer
than 16ms. Here are a few reasons this could occur:

The window layout changed (such as the first frame of the application or after a
rotation)

It is also possible that the frame was skipped, in which case some of the values will have
garbage timestamps. A frame can be skipped if, for example, it is out-running 60fps or if
nothing on-screen ended up being dirty. This is not necessarily a sign of a problem in
the app.
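
The FLAGS rule above can be applied in a script. The following is a sketch under stated
assumptions: the CSV text has already been extracted from the dumpsys output, it begins with a
header row naming the columns, and names are matched loosely so that either FLAGS or Flags is
accepted. The function name and the sample values are made up for illustration, and the sample
carries only a subset of the columns.

```python
import csv
import io


def _normalize(name):
    """Map 'INTENDED_VSYNC' and 'IntendedVsync' to the same key."""
    return name.replace("_", "").strip().upper()


def total_frame_times_ms(csv_text):
    """Return total frame time in ms for every row whose FLAGS value is 0.

    Assumes csv_text starts with a header row (an assumption about the
    extracted data). Rows with a non-zero FLAGS value are skipped.
    """
    reader = csv.reader(io.StringIO(csv_text.strip()))
    header = [_normalize(col) for col in next(reader)]
    times = []
    for raw in reader:
        if not raw:
            continue  # ignore blank lines
        row = dict(zip(header, raw))
        if int(row["FLAGS"]) != 0:
            continue  # outlier or skipped frame
        elapsed_ns = int(row["FRAMECOMPLETED"]) - int(row["INTENDEDVSYNC"])
        times.append(elapsed_ns / 1_000_000)  # nanoseconds to milliseconds
    return times


# Made-up sample data, not real device output.
SAMPLE_CSV = """\
FLAGS,INTENDED_VSYNC,VSYNC,FRAME_COMPLETED
0,1000000000,1000000000,1012000000
1,1016000000,1016000000,1090000000
0,1032000000,1032000000,1050000000
"""
```

With this sample, total_frame_times_ms(SAMPLE_CSV) keeps the first and third rows and drops
the second, whose FLAGS value is non-zero.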

INTENDED_VSYNC

The intended start point for the frame. If this value is different from VSYNC, there
was work occurring on the UI thread that prevented it from responding to the vsync signal
in a timely fashion.

VSYNC

The time value that was used in all the vsync listeners and drawing for the frame
(Choreographer frame callbacks, animations, View.getDrawingTime(), and so on).

To understand more about VSYNC and how it influences your application, check out the
Understanding VSYNC video.

OLDEST_INPUT_EVENT

The timestamp of the oldest input event in the input queue, or Long.MAX_VALUE if
there were no input events for the frame.

This value is primarily intended for platform work and has limited usefulness to app
developers.

NEWEST_INPUT_EVENT

The timestamp of the newest input event in the input queue, or 0 if there were no
input events for the frame.

This value is primarily intended for platform work and has limited usefulness to app
developers.

However, it’s possible to get a rough idea of how much latency the app is adding by
looking at (FRAME_COMPLETED - NEWEST_INPUT_EVENT).

HANDLE_INPUT_START

The timestamp at which input events were dispatched to the application.

By looking at the time between this and ANIMATION_START it is possible to measure how
long the application spent handling input events.

If this number is high (>2ms), the app is spending an unusually long time processing
input events, such as in View.onTouchEvent(), which may indicate that this work needs to be
optimized or offloaded to a different thread. Note that there are some scenarios, such as
click events that launch new activities, where it is expected and acceptable that this
number is large.

ANIMATION_START

The timestamp at which animations registered with Choreographer were run.

By looking at the time between this and PERFORM_TRAVERSALS_START it is possible to
determine how long it took to evaluate all the animators (ObjectAnimator,
ViewPropertyAnimator, and Transitions being the common ones) that are running.

If this number is high (>2ms), check whether your app uses any custom animators, and
which fields its ObjectAnimators are animating, and ensure they are appropriate
for an animation.

PERFORM_TRAVERSALS_START

The time between this value and DRAW_START is how long the layout and measure phases
took to complete. (Note that during a scroll or animation, you would hope this is close
to zero.)

SYNC_QUEUED

This marks the point at which a message to start the sync phase was sent to the
RenderThread. If the time between this and SYNC_START is substantial (>0.1ms or so), it
means that the RenderThread was busy working on a different frame. Internally this is used
to differentiate between the frame doing too much work and exceeding the 16ms budget, and
the frame being stalled because the previous frame exceeded the 16ms budget.

SYNC_START

The time at which the sync phase of the drawing started.

If the time between this and ISSUE_DRAW_COMMANDS_START is substantial (>0.4ms or
so), it typically indicates a lot of new Bitmaps were drawn which must be uploaded to the
GPU.

ISSUE_DRAW_COMMANDS_START

The time at which the hardware renderer started issuing drawing commands to the GPU.

The time between this and FRAME_COMPLETED gives a rough idea of how much GPU work the
app is producing. Problems like too much overdraw or inefficient rendering effects show
up here.

SWAP_BUFFERS

The time at which eglSwapBuffers was called. This is relatively uninteresting outside of
platform work.

FRAME_COMPLETED

All done! The total time spent working on this frame can be computed by doing
FRAME_COMPLETED - INTENDED_VSYNC.
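
The per-stage intervals described above can be computed from one parsed row. The sketch
below assumes a row is available as a dict mapping column names to integer nanosecond
timestamps, using the column names from this section; PERFORM_TRAVERSALS_START, DRAW_START,
and SYNC_QUEUED are assumed to be present under those names. The thresholds are the rough
values quoted in the text, and the labels and function names are illustrative.

```python
NS_PER_MS = 1_000_000

# (label, start column, end column, rough threshold in ms or None)
STAGES = [
    ("input handling", "HANDLE_INPUT_START", "ANIMATION_START", 2.0),
    ("animation", "ANIMATION_START", "PERFORM_TRAVERSALS_START", 2.0),
    ("layout and measure", "PERFORM_TRAVERSALS_START", "DRAW_START", None),
    ("sync queue wait", "SYNC_QUEUED", "SYNC_START", 0.1),
    ("sync", "SYNC_START", "ISSUE_DRAW_COMMANDS_START", 0.4),
    ("draw commands and GPU", "ISSUE_DRAW_COMMANDS_START", "FRAME_COMPLETED", None),
]


def stage_durations_ms(row):
    """Return the duration of each stage in milliseconds for one frame."""
    return {
        label: (row[end] - row[start]) / NS_PER_MS
        for label, start, end, _ in STAGES
    }


def slow_stages(row):
    """List the stages whose duration exceeds the rough threshold, if any."""
    durations = stage_durations_ms(row)
    return [
        label
        for label, _, _, limit in STAGES
        if limit is not None and durations[label] > limit
    ]
```

Stages without a threshold in the text are reported by stage_durations_ms but never flagged
by slow_stages.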

You can use this data in different ways. One simple but useful visualization is a
histogram showing the distribution of frame times (FRAME_COMPLETED - INTENDED_VSYNC) in
different latency buckets (see the figure below). This graph tells us at a glance that most
frames were very good, well below the 16ms deadline (depicted in red), but a few frames
were significantly over the deadline. We can look at changes in this histogram over time
to see wholesale shifts or new outliers being created. You can also graph input latency,
time spent in layout, or other similar metrics based on the many timestamps in the data.
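
A histogram like the one described above can also be produced without a spreadsheet. The
sketch below buckets frame times, already computed in milliseconds, into fixed-width bins
and formats a text bar chart. The 4ms bucket width and the function names are arbitrary
choices for illustration; bucket_ms is assumed to be a whole number of milliseconds.

```python
from collections import Counter


def frame_time_histogram(frame_times_ms, bucket_ms=4):
    """Count frames per bucket; each key is the bucket's lower bound in ms."""
    counts = Counter(int(t // bucket_ms) * bucket_ms for t in frame_times_ms)
    return dict(sorted(counts.items()))


def format_histogram(histogram, bucket_ms=4, deadline_ms=16):
    """Render one text line per bucket, marking buckets at or past the deadline."""
    lines = []
    for lower, count in histogram.items():
        marker = " (at or over deadline)" if lower >= deadline_ms else ""
        lines.append(f"{lower:>3}-{lower + bucket_ms:<3} ms  {'#' * count}{marker}")
    return "\n".join(lines)
```

Comparing the output of format_histogram between two builds is one rough way to spot the
shifts and new outliers mentioned above.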

Simple frame timing dump

If Profile GPU rendering is set to "In adb shell dumpsys gfxinfo"
in Developer Options, the adb shell dumpsys gfxinfo command prints timing
information for the most recent 120 frames, broken into a few different categories as
tab-separated values. This data can be useful for indicating, at a high level, which parts
of the drawing pipeline may be slow.

As with framestats above, it is straightforward to paste this data into your spreadsheet
tool of choice, or to collect and parse it with a script. The following graph shows a
breakdown of where many frames produced by the app were spending their time.

The result of running gfxinfo, copying the output, pasting it into a spreadsheet
application, and graphing the data as stacked bars.

Each vertical bar represents one frame of animation; its height represents the number of
milliseconds it took to compute that frame of animation. Each colored segment of the bar
represents a different stage of the rendering pipeline, so that you can see what parts of
your application may be creating a bottleneck. For more information on understanding the
rendering pipeline, and how to optimize for it, see the
Invalidations, Layouts, and Performance video.
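
The same stacked-bar view can be approximated in a script. The sketch below assumes the
timing block has already been cut out of the gfxinfo output and consists of a header line
naming the stages followed by one line of tab-separated millisecond values per frame. The
function names are illustrative, and the stage names come from whatever header the data
carries rather than from a fixed list.

```python
def parse_stage_table(tsv_text):
    """Parse a header line plus rows of tab-separated millisecond values."""
    lines = [ln for ln in tsv_text.splitlines() if ln.strip()]
    header = lines[0].strip().split("\t")
    frames = []
    for ln in lines[1:]:
        values = [float(v) for v in ln.strip().split("\t")]
        frames.append(dict(zip(header, values)))
    return frames


def frame_totals_ms(frames):
    """Height of each stacked bar: the sum of all stages for that frame."""
    return [sum(frame.values()) for frame in frames]


def slowest_stage(frames):
    """Name of the stage with the largest total time across all frames."""
    totals = {}
    for frame in frames:
        for stage, ms in frame.items():
            totals[stage] = totals.get(stage, 0.0) + ms
    return max(totals, key=totals.get)
```

The stage with the largest total is only a starting point for investigation, since a single
slow frame can dominate the sum.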

Controlling the window of stat collection

Both the framestats and simple frame timings gather data over a very short window: about
two seconds' worth of rendering (120 frames at 60fps). To precisely control this window,
for example to constrain the data to a particular animation, you can reset all counters
and the aggregate statistics gathered.

> adb shell dumpsys gfxinfo <PACKAGE_NAME> reset

This can also be used in conjunction with the dumping commands themselves to collect and
reset at a regular cadence, capturing less-than-two-second windows of frames
continuously.
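
One way to collect and reset at a regular cadence is a small loop around the two commands.
The following is a sketch, assuming adb is on your PATH with a single device attached; the
function names, the 1.5 second default interval, and the injectable run and sleep parameters
(included so the loop can be exercised without a device) are illustrative choices.

```python
import subprocess
import time


def _adb_gfxinfo(package, *args):
    """Run `adb shell dumpsys gfxinfo <package> [args]` and return stdout."""
    cmd = ["adb", "shell", "dumpsys", "gfxinfo", package, *args]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def collect_windows(package, windows, interval_s=1.5, run=_adb_gfxinfo, sleep=time.sleep):
    """Collect `windows` consecutive framestats dumps, resetting after each."""
    run(package, "reset")  # start from a clean slate
    dumps = []
    for _ in range(windows):
        sleep(interval_s)  # let the app render for less than two seconds
        dumps.append(run(package, "framestats"))
        run(package, "reset")
    return dumps
```

Keeping interval_s under two seconds follows the window size described above, so that each
dump covers frames that have not yet been overwritten.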

Diagnosing performance regressions

Identification of regressions is a good first step to tracking down problems, and
maintaining high application health. However, dumpsys just identifies the existence and
relative severity of problems. You still need to diagnose the particular cause of the
performance problems, and find appropriate ways to fix them. For that, it’s highly
recommended to use the systrace tool.

Additional resources

For more information on how Android’s rendering pipeline works, common problems that you
can find there, and how to fix them, some of the following resources may be useful to
you:

Automate UI performance tests

One approach to UI performance testing is to have a human tester perform a set of
user operations on the target app, and either visually look for jank or spend a very
large amount of time using a tool-driven approach to find it. But this manual approach is
fraught with peril: human ability to perceive frame rate changes varies tremendously,
and the process is time consuming, tedious, and error prone.

A more efficient approach is to log and analyze key performance metrics from automated UI
tests. Android 6.0 includes new logging capabilities which make it
easy to determine the amount and severity of jank in your application’s animations, and
that can be used to build a rigorous process to determine your current performance and
track future performance objectives.
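
As one illustration of analyzing logged metrics in an automated test, the sketch below
computes the fraction of frames that missed a frame budget and fails when that fraction
exceeds a limit. The 16ms budget follows the deadline used earlier in this document; the 5%
limit, the function names, and the choice of this particular metric are assumptions for
illustration, not a prescribed process.

```python
def janky_fraction(frame_times_ms, budget_ms=16.0):
    """Fraction of frames whose total time exceeded the budget."""
    if not frame_times_ms:
        return 0.0
    janky = sum(1 for t in frame_times_ms if t > budget_ms)
    return janky / len(frame_times_ms)


def assert_smooth(frame_times_ms, max_janky_fraction=0.05, budget_ms=16.0):
    """Raise AssertionError when too many frames miss the budget."""
    fraction = janky_fraction(frame_times_ms, budget_ms)
    if fraction > max_janky_fraction:
        raise AssertionError(
            f"{fraction:.1%} of frames exceeded {budget_ms}ms "
            f"(limit {max_janky_fraction:.1%})"
        )
```

A UI test could call assert_smooth on the frame times gathered during a scripted
interaction, so that a regression shows up as a failing test rather than a visual
impression.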