As always, our good friends over at Kishonti managed to have the first GPU performance results for the new 4th generation iPad. Although the new iPad retains its 2048 x 1536 "retina" display, Apple claims a 2x improvement in GPU performance through the A6X SoC. The previous generation chip, the A5X, had two ARM Cortex A9 cores running at 1GHz paired with four PowerVR SGX 543 cores running at 250MHz. The entire SoC integrated 4 x 32-bit LPDDR2 memory controllers, giving the A5X the widest memory interface on a shipping mobile SoC in the market at the time of launch.

The A6X retains the 128-bit wide memory interface of the A5X (and it keeps the memory controller interface adjacent to the GPU cores and not the CPU cores as is the case in the A5/A6). It also integrates two of Apple's new Swift cores running at up to 1.4GHz (a slight increase from the 1.3GHz cores in the iPhone 5's A6). The big news today is what happens on the GPU side. A quick look at the GLBenchmark results for the new iPad 4 tells us all we need to know. The A6X moves to a newer GPU core: the PowerVR SGX 554.

Mobile SoC GPU Comparison

| | PowerVR SGX 543 | PowerVR SGX 543MP2 | PowerVR SGX 543MP3 | PowerVR SGX 543MP4 | PowerVR SGX 554 | PowerVR SGX 554MP2 | PowerVR SGX 554MP4 |
|---|---|---|---|---|---|---|---|
| Used In | - | iPad 2 | iPhone 5 | iPad 3 | - | - | iPad 4 |
| SIMD Name | USSE2 | USSE2 | USSE2 | USSE2 | USSE2 | USSE2 | USSE2 |
| # of SIMDs | 4 | 8 | 12 | 16 | 8 | 16 | 32 |
| MADs per SIMD | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| Total MADs | 16 | 32 | 48 | 64 | 32 | 64 | 128 |
| GFLOPS @ 300MHz | 9.6 GFLOPS | 19.2 GFLOPS | 28.8 GFLOPS | 38.4 GFLOPS | 19.2 GFLOPS | 38.4 GFLOPS | 76.8 GFLOPS |

As always, Imagination doesn't provide a ton of public information about the 554 but based on what I've seen internally it looks like the main difference between it and the 543 is a doubling of the ALU count per core (8 Vec4 ALUs per core vs. 4 Vec4). Chipworks' analysis of the GPU cores helps support this: "Each GPU core is sub-divided into 9 sub-cores (2 sets of 4 identical sub-cores plus a central core)."

I believe what we're looking at is the 8 Vec4 SIMDs (each one capable of executing 8+1 FLOPS). The 9th "core" is just the rest of the GPU including tiler front end and render backends. Based on the die shot and Apple's performance claims it looks like there are four PowerVR SGX554 cores on-die, resulting in peak theoretical performance greater than 77 GFLOPS.
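The GFLOPS row in the comparison table follows from simple arithmetic, which can be sketched in a few lines of Python. This is only an illustration: the function and variable names are made up for this example, it assumes each MAD counts as two floating point operations, and it uses the table's 300MHz reference clock rather than any confirmed shipping clock.

```python
# Rough peak-throughput arithmetic for the SGX 543/554 configurations in the table.
# Assumptions (not vendor data): 1 MAD = 2 FLOPs, and the table's 300MHz reference clock.

FLOPS_PER_MAD = 2
MADS_PER_SIMD = 4  # one Vec4 USSE2 SIMD
SIMDS_PER_CORE = {"SGX 543": 4, "SGX 554": 8}


def peak_gflops(gpu, cores, clock_mhz=300):
    """Return theoretical peak GFLOPS for `cores` GPU cores at `clock_mhz`."""
    total_mads = cores * SIMDS_PER_CORE[gpu] * MADS_PER_SIMD
    return total_mads * FLOPS_PER_MAD * clock_mhz / 1000


if __name__ == "__main__":
    configs = [("SGX 543", 1), ("SGX 543", 2), ("SGX 543", 3), ("SGX 543", 4),
               ("SGX 554", 1), ("SGX 554", 2), ("SGX 554", 4)]
    for gpu, cores in configs:
        print(f"{gpu} x{cores}: {peak_gflops(gpu, cores):.1f} GFLOPS @ 300MHz")
```

Under these assumptions a four-core 554 configuration works out to 76.8 GFLOPS at 300MHz, matching the last column of the table. Passing a different `clock_mhz` shows how the peak figure scales with frequency.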

There's no increase in TMU or ROP count per core; the main change between the 554 and 543 is the addition of more ALUs. There are some more low level tweaks, which help explain the different core layout from previous designs, but nothing major.

With that out of the way, let's get to the early performance results. We'll start with low level fill rate and triangle throughput numbers:

Fill rate goes up by around 15% compared to the 3rd generation iPad, which isn't enough to indicate a huge increase in the number of texture units on the 554MP4 vs. the 543MP4. What we may be seeing here instead are the benefits of higher clocked GPU cores rather than more texture units. If that is indeed the case, it would indicate that the 554MP4 changes the texture to ALU ratio from what it was in the PowerVR SGX 543 (Update: this is confirmed). The data here points to a GPU clock at least 15% higher than the ~250MHz in the 3rd generation iPad.
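As a back-of-the-envelope check on that reasoning, the sketch below assumes that fill rate scales linearly with GPU clock when the texture unit count is unchanged, and that each MAD counts as two FLOPs. The helper names are hypothetical and the ~250MHz baseline is the estimate quoted above, so treat the outputs as rough lower bounds rather than measured clocks.

```python
# Back-of-the-envelope: what a ~15% fill rate gain implies if texture units are unchanged.
# Assumptions: fill rate scales linearly with clock; 1 MAD = 2 FLOPs; ~250MHz A5X GPU clock.


def implied_clock_mhz(base_clock_mhz, fill_rate_gain_pct):
    """Minimum clock needed to explain the fill rate gain by frequency alone."""
    return base_clock_mhz * (100 + fill_rate_gain_pct) / 100


def peak_gflops(total_mads, clock_mhz):
    """Theoretical peak GFLOPS, counting each MAD as 2 FLOPs."""
    return total_mads * 2 * clock_mhz / 1000


if __name__ == "__main__":
    a6x_clock = implied_clock_mhz(250, 15)   # lower bound for the 554MP4 clock
    a5x_peak = peak_gflops(64, 250)          # 543MP4: 64 MADs in total
    a6x_peak = peak_gflops(128, a6x_clock)   # 554MP4: 128 MADs in total
    print(f"Implied A6X GPU clock: at least {a6x_clock:.1f}MHz")
    print(f"Peak ALU ratio, A6X vs. A5X: {a6x_peak / a5x_peak:.2f}x")
```

With these inputs the theoretical ALU throughput more than doubles while fill rate rises by only about 15%, which is the shift in texture to ALU ratio described above.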

Triangle throughput goes up by a hefty 65%, a huge gain over the previous generation iPad.

The fragment lit triangle test starts showing us close to a doubling of performance at the iPad's native resolution.

Throw in a more ALU heavy workload and we really start to see the advantage of the new GPU: almost double the performance in Egypt HD at 2048 x 1536. We also get performance that's well above 30 fps here on the iPad at native resolution for the first time.

Normalize to the same resolution and we see that the new PowerVR graphics setup is 57% faster than even ARM's Mali-T604 in the Nexus 10. Once again we're seeing just about 2x the performance of the previous generation iPad.

Vsync bound gaming performance obviously won't improve, but the offscreen classic test gives us an idea of how well the new SoC can handle lighter workloads:

For less compute bound workloads the new iPad still boasts a 53% performance boost over the previous generation.

Ultimately it looks like the A6X is the SoC that the iPad needed to really deliver good gaming performance at its native resolution. I would not be surprised to see more game developers default to 2048 x 1536 on the new iPad rather than picking a lower resolution and enabling anti-aliasing. The bar has been set for this generation and we've seen what ARM's latest GPU can do. Now the question is whether or not NVIDIA will finally be able to challenge Imagination Technologies when it releases Wayne/Tegra 4 next year.

113 Comments

I wonder when Kishonti will release their GLBenchmark 3.0 with tests for OpenGL ES 3.0. They were supposed to release it by the end of the year. Adreno 320 already has it, and Mali T604 should get it soon. Will they just wait until Apple release their PowerVR 6 series iPad in spring or whenever, to do it?

As far as I know, the OpenGL ES 3.0 conformance tests aren't even finalized yet, so there are no official OpenGL ES 3.0 drivers. Adreno 320 and Mali-T604 may be the first OpenGL ES 3.0 capable GPUs on the market, but that capability will likely go unused until next year.

Mali T604 seems to be slightly faster than A5X, and now A6X overtook it by 60%. If Apple didn't need to upgrade their GPU in iPad because of the poor performance with the retina display, we would've waited a while until next spring to see an increase in speed over Mali T604.

I wonder when Mali T624 or Mali T658 are supposed to appear. I'm not entirely sure they will come a year from now. At least one of them (probably T624) should appear in Galaxy S4 along with a big.LITTLE set-up.

It's really sad that the T604 doesn't blow everything out of the water. I was eagerly anticipating its release from last year, and now it comes out with a noticeable improvement over the 400 but nothing close to the iPad's GPU.

I don't think so. Tegra 2 and Tegra 3 have been disappointing in terms of graphics performance. According to NVIDIA, Tegra 4 is going to be about 3 times faster than its predecessor; even if these claims are true it would still be slower than the A6X.

When did they say it will be 3x faster? Do you mean those old 2011 charts? That's outdated now. The Tegra 4 that will arrive next year is more like Tegra 5 or Tegra 4.5, together with a new GPU architecture (somehow based on Kepler, according to rumors).

One of the rumors said it will have 64 GPU cores, compared to the 12 in Tegra 3 right now. Assuming the cores are no more powerful than Tegra 3 (they probably are) then it should be 5.3x faster than Tegra 3, which is the least Nvidia needs to even be competitive in 2013.

However, another rumor also says there will be a 32 core version as well, and if the cores are no more powerful than the ones in Tegra 3, then I hope that's just a chip version for $200 tablets or something, because otherwise it would be pretty disappointing if it doesn't launch straight with the 64 core one this spring.

Wayne - About 10 times faster than Tegra 2. Q1 2013.
Logan - About 50 times faster than Tegra 2. To be released in 2013 (no specific Q).
Stark - About 75 times faster than Tegra 2. To be released in 2014.

Now I'm not sure of the speed difference between Tegra 2 and Tegra 3, but right now, at the distance Tegra 3 trails the A6, I think there's very little chance Wayne will beat it if the 10x Tegra 2 spec is accurate. Logan, probably, but with 2 launches in a year, I'd expect that in Q4 or thereabouts. However, PowerVR is gearing up to launch their 6xx series Rogue GPU in Q1 2013, I believe. That gives Apple time to put a Rogue in their next iPad, which I would guess will be launched in Q4 2013. I personally don't see anything coming to really challenge Imagination earlier than 2014.