These days ARM and its customers are in the midst of a major evolution in GPU design. Back in May the company announced their new Bifrost GPU architecture, the foundation for their future GPUs. With Bifrost, ARM is taking a leap that we’ve seen many other GPU vendors make over the years, replacing an instruction level parallelism (ILP)-centric GPU design with a modern, scalar, thread level parallelism (TLP)-centric design that’s a better fit for modern workloads.

The first of these new Bifrost GPUs was introduced at the same time, and that was Mali-G71. However as our regular readers likely know, ARM doesn’t stop with just a single GPU design; rather they have multiple designs for their partners to use, running the gamut from high performance cores to area efficient cores. Mali-G71 was the former, and now this week ARM is introducing the latter with the release of the Mali-G51 design.

If Mali-G71 was the successor to the Mali-T880, then Mali-G51 is the successor to the Mali-T820 & T830. That is to say, it’s a mainstream part that has been optimized for performance within a given area – when SoC space and/or cost is at a premium – as opposed to G71’s greater total throughput. Broadly speaking, mainstream parts like Mali-G51 end up in equally mainstream SoCs like the Exynos 7870 (Galaxy A-series), as opposed to flagship-level SoCs like the Exynos 8890 (Galaxy S7). And along those lines, somewhat surprisingly, ARM is rather keen on talking about the VR market in conjunction with G51, even though it’s not their high-performance GPU design. Even G51, they’re confident, can offer good VR performance for the kinds of admittedly simpler workloads they have in mind.

Meanwhile at a technical level, rather than just being a cut-down version of Mali-G71, Mali-G51 is an interesting GPU design in its own right. ARM has opted to go with a continuous development cycle for the Mali-G series, which means that each GPU is in essence branched off of the ongoing Mali design process when a new design is needed. That means besides market-specific optimizations, successive GPUs can contain features not found in earlier GPUs under the same brand, and that’s definitely the case for G51.

So what sets G51 apart from G71? From the area efficiency perspective, the big change here is that ARM has reworked the shader cores to offer what they call a “dual pixel” design, as opposed to G71’s “single pixel” design. In brief, per clock a G71 shader core could process 24 FLOPS (12 FMAs) over its three execution engines, while its texture and blending units could process 1 texel and 1 pixel respectively. G51, by contrast, has adjusted the throughput ratio to more heavily favor pixel/texel throughput; a G51 shader core has the same 24 FLOPS throughput, but couples that with 2 texels and 2 pixels per clock. ARM did something similar in previous Mali Midgard generations – varying the number of ALUs – and the reason to do so is fairly straightforward: advanced graphical effects are traditionally more shader-heavy than pixel-heavy. The end result is that for simpler workloads such as application UIs, the need for shader throughput tends to scale down more rapidly in the mobile space.

ARM Mali G Series

| | Mali-G71 | Mali-G51 |
|---|---|---|
| Role | High Performance | Area Efficient |
| Core Configurations | 4-32 | N/A |
| ALU Lanes Per Core (Default) | 12 | 12 |
| Texture Units Per Core | 1 | 2 |
| Pixel Units Per Core | 1 | 2 |
| FLOPS:Pixel Ratio | 24:1 | 12:1 |
| APIs | OpenGL ES 3.2, OpenCL 2.0, Vulkan | OpenGL ES 3.2, OpenCL 2.0, Vulkan |
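As a rough illustration, the FLOPS:pixel ratios above follow directly from the per-core figures: each FMA counts as two floating-point operations, so 12 FMA lanes give 24 FLOPS per clock, which is then divided by the pixels written per clock. The Python sketch below works through that arithmetic. The `ShaderCore` class and its field names are hypothetical and exist only for this example; the only inputs taken from the article are the lane counts and the texel/pixel rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShaderCore:
    """Per-clock throughput of one shader core (illustrative model only)."""
    name: str
    fma_lanes: int         # FMA lanes per core, from the article
    texels_per_clock: int  # texture unit throughput, from the article
    pixels_per_clock: int  # blending unit throughput, from the article

    @property
    def flops_per_clock(self) -> int:
        # One fused multiply-add counts as two floating-point operations.
        return self.fma_lanes * 2

    @property
    def flops_per_pixel(self) -> float:
        # Shader work available for each pixel written per clock.
        return self.flops_per_clock / self.pixels_per_clock


G71 = ShaderCore("Mali-G71", fma_lanes=12, texels_per_clock=1, pixels_per_clock=1)
G51 = ShaderCore("Mali-G51", fma_lanes=12, texels_per_clock=2, pixels_per_clock=2)

for core in (G71, G51):
    print(f"{core.name}: {core.flops_per_clock} FLOPS/clock, "
          f"FLOPS:pixel = {core.flops_per_pixel:.0f}:1")
```

Both cores have the same ALU throughput in this model; only the denominator changes, which is why G51 lands at half of G71's FLOPS:pixel ratio.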

And while the dual pixel core is the biggest change for G51, it’s not the only change. By being based on a newer iteration of Bifrost, it includes a few notable, low-level tweaks to improve performance. Transcendental performance has been significantly improved; it turns out those operations are still used more often than ARM expected, so G51 bakes in better support to maintain higher performance. There are also some outright new instructions on G51, and ARM’s framebuffer compression technology has been improved as well. Version 1.2 of AFBC implements some optimizations for better memory traffic shaping and burst lengths, as well as an improvement for constant color blocks.

Overall, ARM is touting that G51 offers significant improvements to performance, density, and energy efficiency relative to the Mali-T830. On equal processes, G51 offers a mix of 30% smaller area than T830, 60% better performance per mm², and 60% higher performance per watt. I’m told area efficiency was the primary goal in the design, making the latter a pleasant surprise of sorts.
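To get a feel for how these figures relate, here is a small back-of-the-envelope sketch in Python. It rests on an assumption the article does not confirm: that all three figures apply at once to the same G51 configuration compared against the same T830 configuration. ARM describes them only as "a mix", so treat the result as illustrative arithmetic rather than a claim about real silicon. The function name `implied_scaling` is made up for this example.

```python
def implied_scaling(area_ratio: float,
                    perf_per_area_gain: float,
                    perf_per_watt_gain: float) -> tuple[float, float]:
    """Return (performance, power) relative to the baseline GPU.

    ASSUMPTION: all three vendor figures describe the same pair of
    configurations, which the source article does not state.
    """
    # performance = area * (performance / area)
    perf = area_ratio * perf_per_area_gain
    # power = performance / (performance / watt)
    power = perf / perf_per_watt_gain
    return perf, power


# 30% smaller area, 60% better perf/mm^2, 60% better perf/W (from the article).
perf, power = implied_scaling(0.70, 1.60, 1.60)
print(f"relative performance: {perf:.2f}x, relative power: {power:.2f}x")
```

Under that assumption, the smaller core would come out slightly faster than the baseline while drawing less power, which is consistent with the article's framing of the efficiency gain as a bonus on top of the area goal.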

Finally, like ARM’s other GPU IP announcements, this week’s announcement is about making the technology available to the company’s partners for implementation, rather than being a consumer-oriented announcement. ARM’s partners are already looking at early versions of the G51 design, and based on typical product development cycles, G51 should be showing up in devices in 2018.

Mali-V61

Meanwhile on a quick note, alongside the Mali-G51 GPU, ARM is also announcing the Mali-V61 video processor. This is the product formerly known as Egil, which ARM unveiled back in June while it was still under development. Now, along with G51, V61 is being released to ARM’s partners as well.

V61/Egil has not significantly changed since we last saw it. ARM’s fully modernized video encode and decode block follows a who’s who list of codecs and features, supporting 10-bit HEVC encode/decode and 10-bit VP9 encode/decode. Relative to the Mali-V550 before it, ARM’s latest video processor supports a wider range of codecs, and now, having a full-feature HEVC encoder implementation, offers much better HEVC compression as well.

Ultimately ARM is looking to sell Mali-V61 alongside Mali-G51 and their DP650 display processor as a complete graphics solution to partners, which they call the Mali Multimedia Suite (though it can be used standalone as well). And like Mali-G51, expect to see Mali-V61 start showing up in devices around a year from now.

The Kirin 960 SoC in the Mate 9 is built on TSMC's 16nm process. A few recent rumors point to Samsung using an MP16 version of the Mali-G71 built on its own 10nm process for its next high-end Exynos SoC. Also, Samsung will be producing the Qualcomm Snapdragon 830, and that should be packing an Adreno 540. We will know for sure come February, as these will be the SoCs in the upcoming Galaxy S8 line.

The G71 looks to be about 85% of an Adreno. The biggest surprise to me was that the new ARM memory controller (and probably also due to the new efficiencies with cache management) is just crazy fast. Twice as fast as the Pixel in basically every test.

Very curious to see how Bifrost performs in gaming and compute given the major changes it brings. Any chance you guys already have the Mate 9 review sample and we see a review at launch in a few days?