GPU Analysis/Performance

Section by Anand Shimpi

Understanding the A6's GPU architecture is a walk in the park compared to what we had to do to get a high-level understanding of Swift. The die photos give us a clear indication of the number of GPU cores and the width of the memory interface, while the performance and timing of release fill in the rest of the blanks. Apple has not stopped pushing GPU performance on its smartphones: the A6 doubles GPU compute horsepower. Rather than double the GPU core count, Apple adds a third PowerVR SGX 543 core and runs the three at a higher frequency than in the A5. The result is roughly the same graphics horsepower as the four-core PowerVR SGX 543MP4 in Apple's A5X, but with a smaller die footprint.

As a recap, Imagination Technologies' PowerVR SGX 543 GPU core features four USSE2 pipes. Each pipe has a 4-way vector ALU that can crank out 4 multiply-adds per clock, which works out to 16 MADs per clock per core, or 32 FLOPs per clock. Imagination lets the customer stick multiple 543 cores together, which scales compute performance linearly.
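The per-core arithmetic above can be sketched in a few lines of Python. This is only an illustration of the peak-throughput math, not a measurement: the function name is made up for the example, and the 200 MHz clock is the normalization clock used in the comparison table rather than the A6's actual GPU frequency.

```python
def sgx543_peak(cores, clock_mhz):
    """Theoretical peak for an SGX 543 multi-core configuration.

    Assumes the figures quoted in the text: 4 USSE2 pipes per core,
    4 multiply-adds per pipe per clock, and 2 FLOPs per multiply-add.
    """
    pipes_per_core = 4
    mads_per_pipe = 4
    flops_per_mad = 2  # one multiply plus one add

    mads_per_clock = cores * pipes_per_core * mads_per_pipe
    gflops = mads_per_clock * flops_per_mad * clock_mhz / 1000.0
    return mads_per_clock, gflops


if __name__ == "__main__":
    # 200 MHz is a normalization clock, not a claim about shipping frequencies.
    for cores in (2, 3, 4):
        mads, gflops = sgx543_peak(cores, 200.0)
        print(f"SGX 543MP{cores}: {mads} MADs/clock, {gflops:.1f} GFLOPS at 200 MHz")
```

Because the scaling is linear in both core count and clock, a three-core part needs roughly a 33% higher clock to match a four-core part's peak.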

SoC die size, however, dictates memory interface width, and it's clear that the A6 is significantly smaller in that department than the A5X. This is where we see the only tradeoff in GPU performance: the A6 maintains a 64-bit LPDDR2 interface compared to the 128-bit LPDDR2 interface in the A5X. The tradeoff makes sense given that the A5X has to drive 4.3x the number of pixels that the A6 has to drive in the iPhone 5. At high resolutions, GPU performance quickly becomes memory bandwidth bound. Fortunately for iPhone 5 users, the A6's 64-bit LPDDR2 interface is a good match for the comparatively low 1136 x 640 display resolution. The end result is 3D performance that looks a lot like the new iPad, but in a phone:
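To make the pixel-count argument concrete, here is a rough back-of-the-envelope sketch. The 2048 x 1536 resolution for the new iPad is not stated in the text above and is filled in here as an assumption, as is the simplification that both memory interfaces run at the same data rate, so treat the per-pixel figure as indicative only.

```python
# Display resolutions (width, height) in pixels.
IPAD3_RES = (2048, 1536)    # assumption: third-generation iPad panel
IPHONE5_RES = (1136, 640)   # from the text

# Memory interface widths in bits, from the text.
A5X_BUS_BITS = 128
A6_BUS_BITS = 64


def pixel_count(res):
    """Total pixels for a (width, height) resolution."""
    return res[0] * res[1]


# How many more pixels the A5X drives than the A6.
pixel_ratio = pixel_count(IPAD3_RES) / pixel_count(IPHONE5_RES)

# How much wider the A5X memory interface is.
bus_ratio = A5X_BUS_BITS / A6_BUS_BITS

# Relative bus width per displayed pixel, A6 versus A5X,
# assuming (hypothetically) equal memory data rates.
per_pixel_advantage = pixel_ratio / bus_ratio

if __name__ == "__main__":
    print(f"Pixel ratio (iPad / iPhone 5): {pixel_ratio:.2f}x")
    print(f"Bus width ratio (A5X / A6):    {bus_ratio:.1f}x")
    print(f"A6 bus width per pixel vs A5X: {per_pixel_advantage:.2f}x")
```

Under those assumptions, halving the interface width while cutting the pixel count by more than 4x still leaves the A6 with more bus width per displayed pixel than the A5X.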

Mobile SoC GPU Comparison

| | Adreno 225 | PowerVR SGX 540 | PowerVR SGX 543MP2 | PowerVR SGX 543MP3 | PowerVR SGX 543MP4 | Mali-400 MP4 | Tegra 3 |
|---|---|---|---|---|---|---|---|
| SIMD Name | - | USSE | USSE2 | USSE2 | USSE2 | Core | Core |
| # of SIMDs | 8 | 4 | 8 | 12 | 16 | 4 + 1 | 12 |
| MADs per SIMD | 4 | 2 | 4 | 4 | 4 | 4 / 2 | 1 |
| Total MADs | 32 | 8 | 32 | 48 | 64 | 18 | 12 |
| GFLOPS @ 200MHz | 12.8 GFLOPS | 3.2 GFLOPS | 12.8 GFLOPS | 19.2 GFLOPS | 25.6 GFLOPS | 7.2 GFLOPS | 4.8 GFLOPS |

We ran through the full GLBenchmark 2.5 suite to get a good idea of GPU performance. The results below are largely unchanged from our iPhone 5 Performance Preview, with the addition of the Motorola RAZR i and RAZR M. I also re-ran the iPad results on iOS 6, although I didn't see major changes there.

We'll start out with the raw theoretical numbers beginning with fill rate:

The iPhone 5 nips at the heels of the 3rd generation iPad here, at 1.65 GTexels/s. That is more than double the fill rate of the iPhone 4S, and even the Galaxy S 3 can't come close.

Triangle throughput is similarly strong:

Take resolution into account and the iPhone 5 is actually faster than the new iPad, but normalize for resolution using GLBenchmark's offscreen mode and the A5X and A6 look identical:

The fragment lit texture test does very well on the iPhone 5. Once again, when you take into account the much lower resolution of the 5's display, performance is significantly better than on the iPad:

The next set of results comes from the gameplay simulation tests, which attempt to give you an idea of what game performance based on Kishonti's engine would look like. These tests tend to be compute monsters, so they'll make a great stress test for the iPhone 5's new GPU:

Egypt HD was the great equalizer when we first met it, but the iPhone 5 does very well here. The biggest surprise, however, is just how well the Qualcomm Snapdragon S4 Pro with its Adreno 320 GPU does by comparison. LG's Optimus G, a device Brian flew to Seoul, South Korea to benchmark, is hot on the heels of the new iPhone.

When we run everything at 1080p, the iPhone 5 looks a lot like the new iPad and delivers about 2x the performance of the Galaxy S 3. Here, LG's Optimus G actually outperforms the iPhone 5! It looks like Qualcomm's Adreno 320 is quite competent in a phone. Note just how bad Intel's Atom Z2460 is: the PowerVR SGX 540 is simply unacceptable for a modern high-end SoC. I hope Intel's slow warming up to integrating fast GPUs on die doesn't plague its mobile SoC lineup for much longer.

The Egypt Classic tests are much lighter workloads and are likely a good indication of the type of performance you can expect from many games available on the App Store today. At its native resolution, the iPhone 5 has no problems hitting the 60 fps vsync limit.

Remove vsync, render at 1080p and you see what the GPUs can really do. Here the iPhone 5 pulls ahead of the Adreno 320 based LG Optimus G and even slightly ahead of the new iPad.

Once again, looking at GLBenchmark's on-screen and offscreen Egypt tests we can get a good idea of how the iPhone 5 measures up to Apple's claims of 2x the GPU performance of the iPhone 4S:

Removing the clearly vsync limited result from the on-screen Egypt Classic test, the iPhone 5 performs about 2.26x the speed of the 4S. If we include that result in the average you're still looking at a 1.95x average. As we've seen in the past, these gains don't typically translate into dramatically higher frame rates in games, but games with better visual quality instead.
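The two averages quoted above also tell us roughly how little the vsync-limited test contributes. The sketch below assumes, hypothetically, that the comparison covers four Egypt results (HD and Classic, each on-screen and offscreen) and that both figures are plain arithmetic means; neither detail is spelled out in the text, so the output is an estimate rather than a measured number.

```python
def implied_excluded_ratio(mean_all, mean_without, n_tests):
    """Back out the one excluded result from two arithmetic means.

    mean_all      -- mean speedup across all n_tests results
    mean_without  -- mean speedup with one result removed
    n_tests       -- total number of results (an assumption here)
    """
    total_all = mean_all * n_tests
    total_without = mean_without * (n_tests - 1)
    return total_all - total_without


if __name__ == "__main__":
    # 1.95x and 2.26x come from the text; n_tests=4 is an assumption.
    capped = implied_excluded_ratio(mean_all=1.95, mean_without=2.26, n_tests=4)
    print(f"Implied speedup in the vsync-limited test: {capped:.2f}x")
```

Under those assumptions the excluded result works out to about 1.02x, which is what you would expect if both phones were pinned at the display's refresh rate in that test.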

I agree that rooting the phone and installing a new kernel for benchmarking is silly, but at least having up-to-date figures for a phone known to have received significant performance increases since its release would be a nice idea. This chap's numbers certainly make the phone look very different in terms of attractiveness.

Something seems very wrong with the RAZR M. It uses the same S4 processor as the One X, has a smaller 4.3" screen, a lower qHD resolution, and a bigger battery, and yet it still significantly underperforms the S4-based One X in SunSpider performance, in battery life, and in other areas as well. That shouldn't happen, and it seems like the issue is some very sloppy software that Motorola put on top of the RAZR M hardware.