How ARM's Cortex-A7 Beats the A15

You get more raw performance at about the same power consumption and in the same die area with four Cortex-A7s.

ARM's Cortex-A series of processors has now divided into three tiers: low, medium, and high performance. The high tier is optimized for performance and the low tier for stripped-down power efficiency at lower absolute performance levels, all in support of big.LITTLE and heterogeneous multicore processing.

At the 32-bit level, the three tiers are the Cortex-A7, the A12, and the A15. At the 64-bit level, the high-performance tier is represented by the Cortex-A57 and the high-efficiency tier by the Cortex-A53.

Does that mean the market should expect a series of mid-range parts in the Cortex-A5X series going forward? Yes, although the ARM executives I recently met with were keen to keep their marketing powder dry. Does it mean that ARM is going to start implementing a big-medium-little processor core strategy? In the short term probably not, but I would like to leave that discussion for another day.

But what ARM engineers did show me at a recent analysts' conference is that relative performance of the Cortex-A7 and A15 differs by a factor of two or three depending on workload and implementation. Now, when you add that to the fact that the Cortex-A7 occupies about one quarter or one fifth of the die area and consumes one quarter to one fifth of the power, things become interesting (see chart below).

Three tiers of processor cores with performance going up and to the right over time. (Source: ARM)

To reiterate: With a Cortex-A15, at five times the area and five times the power consumption, you can get two or three times the performance of the Cortex-A7. So why wouldn't you replace each Cortex-A15 core with four Cortex-A7s? You would get more raw performance at about the same power consumption and in about the same die area.
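The arithmetic behind that question can be sketched in a few lines of Python. This is only an illustration using the ratios quoted above, normalized so that one Cortex-A7 equals 1.0. The function names are hypothetical, and the assumption that four cores deliver four times the throughput (perfect multi-threaded scaling) is mine, not ARM's.

```python
# Ratios quoted in the article, normalized to one Cortex-A7 = 1.0.
A7 = {"perf": 1.0, "area": 1.0, "power": 1.0}


def a15(perf_ratio, cost_ratio=5.0):
    """One Cortex-A15 relative to one A7 (2x-3x perf, ~5x area and power)."""
    return {"perf": perf_ratio, "area": cost_ratio, "power": cost_ratio}


def cluster(core, n):
    """n identical cores, ASSUMING ideal linear scaling of throughput."""
    return {key: value * n for key, value in core.items()}


def compare(perf_ratio, n_a7=4):
    """Compare n_a7 Cortex-A7 cores against a single Cortex-A15."""
    big = a15(perf_ratio)
    little = cluster(A7, n_a7)
    return {
        "throughput_gain": little["perf"] / big["perf"],
        "area_ratio": little["area"] / big["area"],
        "power_ratio": little["power"] / big["power"],
        # A single thread can only ever use one A7.
        "single_thread_ratio": A7["perf"] / big["perf"],
    }


if __name__ == "__main__":
    for ratio in (2.0, 3.0):
        print(f"A15 = {ratio}x A7:", compare(ratio))
```

Under these assumptions the four A7s win on aggregate throughput in both cases while using about four fifths of the area and power, but any one thread runs at a half to a third of the A15's speed.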

The answer, of course, is single-thread performance, but it still makes you think about the implications for multicore SoC architectures.

When I put my observation to Nandan Nayampally, vice president of product marketing for application processors at ARM, he said: "Yes four A7s is more performance than one A15 for multi-threaded applications, but in mobile, single-threaded peak performance is also important." Point made.

But Nayampally conceded that the stripped-down "little" cores do have a vital role to play in future SoCs. He admitted that some SoCs may depend substantially on little cores. "So you will see things like networking SoCs with a couple of A15s or A57s and a lot of A7s or A53s. This then moves to generalize further to A7/A12/A15 and other resources, and an operating system governor will make allocations."

Peter, your point on multiple small cores versus a single larger one has power management implications as well. It gives the option of shutting down cores to save power, as opposed to (or maybe in addition to) clock rate management. Do the new CPUs support both of these power management strategies? Did ARM discuss any improvements that they might have coming in terms of power management?

Yes, you are right. Optimizing the number of cores against the performance required is a major area of research. Turning cores on and off dynamically, as and when required, is the present-day solution when using more cores. Let's see if ARM delivers a major power improvement in the meantime.

The analysis of power consumption is tricky and should take into account voltage-frequency scaling.

I believe the assertion "With a Cortex-A15, at five times the area and five times the power consumption, you can get two or three times the performance of the Cortex-A7" is a statement about what happens when the A15 and A7 are run at the same frequency and voltage. If we look at running the two processors at the same performance point, then the frequency of the A15 can be one-half to one-third that of the A7; that means that the A15 power will be 2.5x to 1.7x the A7 power at the same voltage. But the A15 design is being run slow - so the voltage can be dropped. Taking the figures shown on the ST-Ericsson blog* for 28FDSOI as guidance, dropping from 1.0V to 0.7V drops frequency by 2x. If we drop the voltage, then the power consumption drops by a further factor of 0.7^2, or roughly 50% - so for applications where the A15 is 2x the performance of an A7 at the same frequency, the A15's power consumption is about 25% higher than the A7's; for those where the A15 is 3x the performance of the A7, the A15's power consumption is at least 15% better than the A7's.
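As a rough check of those numbers, here is a small Python sketch. It assumes dynamic power scales with frequency times voltage squared, and it reuses the figures from the comment: 5x power at equal frequency and voltage, and a drop from 1.0V to 0.7V. The function name and parameter names are hypothetical, and leakage is ignored.

```python
def a15_power_at_a7_performance(perf_ratio, power_ratio=5.0,
                                v_nominal=1.0, v_low=0.7):
    """A15 power relative to an A7 when both deliver the same performance.

    ASSUMES dynamic power is proportional to f * V^2 and ignores leakage.
    perf_ratio  : A15 performance / A7 performance at equal frequency (2-3)
    power_ratio : A15 power / A7 power at equal frequency and voltage (5)
    """
    freq_scale = 1.0 / perf_ratio              # A15 can run this much slower
    volt_scale = (v_low / v_nominal) ** 2      # 0.7^2, roughly 0.5
    return power_ratio * freq_scale * volt_scale


if __name__ == "__main__":
    for ratio in (2.0, 3.0):
        rel = a15_power_at_a7_performance(ratio)
        print(f"A15 at {ratio}x perf: {rel:.2f}x the A7's power")
```

With these inputs the 2x case comes out a little over 1.2x the A7's power and the 3x case a little over 0.8x. In the 3x case the A15 only needs a third of the frequency, so the voltage could presumably be dropped below 0.7V, which is why the saving is a lower bound.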

For sure I've ignored leakage, but it's likely leakage is only really an issue at high-performance points which cannot be reached by an A7 anyway.

Finally, it is also the case that, to first order, from a power perspective, you would be better off using two half-frequency processors to do a job than one full-frequency processor. It's just that there is a little problem with software.
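That first-order claim can be illustrated with the same f * V^2 model. The sketch below assumes, as in the 28FDSOI figures quoted earlier, that halving the frequency allows the voltage to drop from 1.0V to 0.7V, and that the job splits perfectly across two cores (which is exactly the software problem). The function name is hypothetical.

```python
def relative_power(n_cores, freq_scale, volt_scale):
    """Total dynamic power relative to one core at full frequency and voltage.

    ASSUMES power per core is proportional to f * V^2; leakage is ignored.
    """
    return n_cores * freq_scale * volt_scale ** 2


# One core at full frequency and nominal voltage.
one_fast = relative_power(1, 1.0, 1.0)

# Two cores at half frequency, each at 0.7x the nominal voltage.
two_slow = relative_power(2, 0.5, 0.7)

if __name__ == "__main__":
    print(f"one full-frequency core : {one_fast:.2f}")
    print(f"two half-frequency cores: {two_slow:.2f}")
```

Both configurations deliver the same aggregate clock cycles per second, but under these assumptions the two slow cores draw roughly half the power.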