Achieve proper frame pacing

The Android Frame Pacing library, also known as Swappy, is part of the Android Game SDK. It helps OpenGL and Vulkan games achieve smooth rendering and
correct frame pacing on Android. This document defines frame pacing, describes
situations where frame pacing is needed, and shows how the library addresses
these situations. If you want to jump directly to implementing frame pacing in
your game, see Next step.

Background

Frame pacing is the synchronization of a game’s logic and rendering loop with
an OS’s display subsystem and the underlying display hardware. The Android
display subsystem was designed to avoid visual artifacts (known as tearing)
that can occur when the display hardware switches to a new frame part-way
through an update. To avoid these artifacts, the display subsystem does the
following:

Buffers past frames internally

Detects late frame submissions

Repeats the display of past frames when late frames are detected

A game informs
SurfaceFlinger,
the compositor within the display subsystem, that it has submitted all the draw
calls needed for a frame (by calling eglSwapBuffers or vkQueuePresentKHR).
SurfaceFlinger signals availability of a frame to the display hardware using a
latch. The display hardware then shows the given frame. The display hardware
ticks at a constant rate, for example 60 Hz, and if there is no new frame when
the hardware needs one, the hardware displays the previous frame again.

Inconsistent frame times often occur when a game render loop renders at a
different rate than the native display hardware. If a game running at 30 FPS
attempts to render on a device that natively supports 60 FPS, the game render
loop doesn't realize that a repeated frame remains on the screen for an extra 16
milliseconds. This disconnect usually creates substantial inconsistency in frame
times, such as: 49 milliseconds, 16 milliseconds, 33 milliseconds. Overly
complex scenes further compound this problem, as they cause missed frames to
occur.

Non-optimal solutions

The following solutions for frame pacing have been employed by games in the past
and typically result in inconsistent frame times and increased input latency.

Submit frames as quickly as the rendering API allows

This approach ties a game to variable SurfaceFlinger activity and introduces an
extra frame of latency. The display pipeline contains a queue of frames,
typically of size 2, which fills up if the game is trying to present frames too
quickly. With no more room in the queue, the game loop (or at least the
rendering thread) is blocked by an OpenGL or Vulkan call. The game is then
forced to wait for the display hardware to show a frame, and this back-pressure
synchronizes the two components. This situation is known as buffer-stuffing or
queue-stuffing. The renderer process
doesn't realize what's going on, so framerate inconsistency gets worse. If the
game samples input before the frame, input latency gets worse.

Use Android Choreographer by itself

Games also use Android Choreographer for synchronization. This component,
available in Java from API 16 and in C++ from API 24, delivers regular ticks at
the same frequency as the display subsystem. There are still subtleties as to
when this tick is delivered relative to the actual hardware VSYNC, and these
offsets vary by device. Buffer-stuffing may still occur for long frames.

Advantages of the Frame Pacing library

The Frame Pacing library uses Android Choreographer for synchronization and
deals with the variability in the tick delivery for you. It uses presentation
timestamps to make sure frames are presented at the proper time and sync fences
to avoid buffer stuffing. The library uses the
NDK Choreographer if it is available and
falls back to the Java Choreographer if
it is not.

The library handles multiple refresh rates if they are supported by the device,
which gives a game more flexibility in presenting a frame. For example, for a
device that supports a 60 Hz refresh rate as well as 90 Hz, a game that cannot
produce 60 frames per second can drop to 45 FPS instead of 30 FPS to remain
smooth. The library detects the expected game frame rate and auto-adjusts frame
presentation times accordingly.

How it works

The following sections show how the Frame Pacing library deals with long and
short game frames in order to achieve correct frame pacing.

Correct frame pacing at 30 Hz

When rendering at 30 Hz on a 60 Hz device, the ideal situation on Android is
shown in figure 1. SurfaceFlinger latches new graphic buffers, if present (NB in
the diagram indicates "no buffer" present and the previous one is repeated).

Figure 1. Ideal frame pacing at 30 Hz on a 60 Hz device

Short game frames lead to stuttering

On most modern devices, game engines rely on the platform choreographer
delivering ticks to drive the submission of frames. However, there is still the
possibility for poor frame pacing due to short frames, as seen in figure 2.
Short frames followed by long frames are perceived by the player as stuttering.

Figure 2. Short game frame C causes frame B to present only one frame,
followed by multiple C frames

Long frames lead to stuttering and latency

When the display workload takes longer than the application workload, extra
frames are added to a queue. This leads, once again, to stuttering and may also
lead to an extra frame of latency due to buffer-stuffing (see figure 4). The
library both removes the stuttering and the extra frame of latency.

Figure 4. Long frame B gives incorrect pacing for 2 frames—A and B

The library solves this by using sync fences
(EGL_KHR_fence_sync
and
VkFence)
to inject waits into the application that allow the display pipeline to catch
up, rather than allowing back pressure to build up. Frame A still presents an
extra frame, but frame B now presents correctly, as seen in figure 5.

Figure 5. Frames C and D wait to present

Supported operating modes

You can configure the Frame Pacing library to operate in one of the three
following modes:

Auto mode off + Pipeline

Auto mode on + Pipeline

Auto mode on + Auto pipeline mode (Pipeline/Non-pipeline)

Recommended mode

You can experiment with auto-mode and pipeline modes, but you start by turning
them off and including the following after initializing Swappy:

Pipeline mode

To coordinate engine workloads, the library typically uses a pipelining model
which separates the CPU and GPU workloads across VSYNC boundaries.

Figure 6. Pipeline mode

Non-pipeline mode

In general, this approach results in lower, more predictable input-screen
latency. In cases where a game has a very low frame time, both the CPU and GPU
workloads may fit into a single swap interval. In this case, a non-pipelined
approach would actually deliver lower input-screen latency.

Figure 7. Non-pipeline mode

Auto mode

Most games don’t know how to choose the swap interval, which is the duration for
which each frame is presented (for example, 33.3 ms for 30 Hz). On some devices,
a game can render at 60 FPS while on another it may need to drop to a lower
value. Auto mode measures CPU and GPU times in order to do the following:

Automatically select swap intervals: Games which deliver 30 Hz in some
scenes and 60 Hz in others can allow the library to adjust this interval
dynamically.