Regular

(Pictures from AnandTech.) I'd like to know whether the little cores are Apple-designed or Cortex-A series. (I assume the big cores are an evolution of Twister—not that they aren't also interesting.) The 40% faster performance of the big cores aligns with some clock-for-clock improvements and the rumor of a 2.4-2.45 GHz clock speed.

Legend

So my bet is that, because they likely went with 16FF+ again this generation, they needed a physical implementation with faster, higher-leakage transistors to achieve the higher clocks. That adversely affects power at low frequency, which opens up the need for low-leakage, low-power cores for low-perf and idle scenarios.

Because they mention the controller, i.e. a hardware governor, it should mean that the kernel/OS only sees 2 cores and that switching between pairs of big and little cores is transparent.
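
To put rough shape on the leakage argument, here is a minimal sketch using the textbook split of power into a dynamic term (C·V²·f) and a static term (V·I_leak). Every number below is an illustrative assumption, not a measurement of any Apple or ARM core, and the names `power_watts`, `big` and `little` are hypothetical.

```python
def power_watts(c_eff_nf, volts, freq_ghz, leak_amps):
    """Return (dynamic, static) power in watts.

    dynamic = C_eff * V^2 * f   (switching power)
    static  = V * I_leak        (leakage, paid even when nothing switches)
    """
    dynamic = (c_eff_nf * 1e-9) * volts ** 2 * (freq_ghz * 1e9)
    static = volts * leak_amps
    return dynamic, static


# Assumed operating point: both cores idling along at 0.4 GHz and 0.7 V.
# The "big" core is given fast, leaky transistors; the "little" core is
# given a smaller switched capacitance and low-leakage transistors.
big = power_watts(c_eff_nf=1.0, volts=0.7, freq_ghz=0.4, leak_amps=0.20)
little = power_watts(c_eff_nf=0.3, volts=0.7, freq_ghz=0.4, leak_amps=0.02)

for name, (dyn, stat) in (("big", big), ("little", little)):
    share = stat / (dyn + stat)
    print(f"{name}: total {dyn + stat:.3f} W, leakage share {share:.0%}")
```

With these made-up inputs, leakage is a large fraction of the big core's draw at the low operating point, which is the scenario where a separate low-leakage core would pay off.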

Veteran

It's interesting that not very long ago many people were saying that big.LITTLE was a failed concept because no one other than ARM was designing CPU cores for phones this way. Since then Samsung and Qualcomm both started doing it with their custom cores, Apple now appears to be doing it, and Intel has given up on getting into phones altogether. I think we can safely say that it wasn't a dumb idea.

LegendAlpha

To confirm in case I'm reading too much into the marketing blurb, is the Apple-designed performance controller a piece of actual hardware?

This is potentially an evolution beyond ARM's existing concept, which has scheduling handled at the software level.

In theory, a hardware control block could access items like DVFS activity counters and instruction retirement counts to get a very rapidly updated picture of power consumed vs progress made, as well as reference values for long-latency events or the percentage of a time slice a thread seems to be hitting resources hard.
Context like latency hints or whether the FPU is needed might feed into it.
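
As a purely hypothetical sketch of that idea (Apple has published nothing about how the controller works), a governor fed by per-window counters might look something like the following. The counter names, thresholds and the `pick_cluster` function are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """One sampling window of hypothetical hardware counters."""
    stall_cycles: int                # cycles spent waiting on long-latency events
    total_cycles: int                # length of the sampling window in cycles
    latency_sensitive: bool = False  # hint from software, if one is available


def pick_cluster(sample, current="little", up=0.75, down=0.30):
    """Choose "big" or "little" for the next window.

    Uses the fraction of the window the thread was actually making
    progress. The gap between `up` and `down` is a hysteresis band
    that keeps the thread where it is, to avoid ping-ponging.
    """
    if sample.total_cycles <= 0:
        return current
    busy = 1.0 - sample.stall_cycles / sample.total_cycles
    if sample.latency_sensitive or busy >= up:
        return "big"
    if busy <= down:
        return "little"
    return current


# A thread that stalls 90% of the time gains little from the big core.
print(pick_cluster(CounterSample(stall_cycles=900, total_cycles=1000), current="big"))
```

A real hardware block could sample far more often than an OS scheduler tick, which is the point being made above. This sketch only shows the decision logic, not the sampling rate.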

ModeratorVeteranAlphaSubscriber

For the record, I don't recall calling big.LITTLE "dumb". Just not convinced it wasn't an effort by ARM to sell more cores! Kidding aside, it's still not clear to me that big.LITTLE is the optimal solution (ignoring marketing benefits).

I will say though that Qualcomm doesn't count. @Rys can correct me, but I believe their "big.LITTLE" cores are all exactly the same. I don't interpret that as a ringing endorsement of big.LITTLE. And let's be honest, Intel leaving had nothing to do with their lack of a big.LITTLE design. But you're right, Apple did switch! Hopefully for technical reasons!

Maybe?
There are lower-cost ways of accomplishing that goal than what Apple has done.
A software driver or explicit .exe detection has been an option that has been utilized in this space to get those benchmark wins. It would seem as if Apple wants something beyond that, and that the effort provides them some avenue for value-add or product differentiation.
Given the complexity, I think the decision would have been made earlier, maybe before Apple knew what the latest configurations from Qualcomm and Samsung would be.

Qualcomm's Kryo core differentiation is a modest one. Samsung's approach is a bigger custom core paired with standard little ones, which Apple may or may not have done.
We might need to wait for Samsung's next custom core effort. If the narrative is that Mongoose is their equivalent of Swift, a more distinctive core could result and it's not clear if Samsung would opt for the same little cores.

Has either added dedicated hardware for transfer or as a governor?
Depending on what elements that controller plugs into and what it's responsible for, it's not Samsung's or Qualcomm's playbook that Apple was cribbing notes from.

There is more than a little uncomfortable truth in this.
For whatever reason, Apple seems to have wanted to increase clocks substantially. Switching to InFO is probably part of that, but switching to higher-speed, higher-leakage circuit solutions for the main cores is probably another contributor. And if so, power draw at lower power states will get worse, all other things being equal, which will strengthen the case for big.LITTLE.
I would guess software only ever sees two cores, but it will be interesting to get a little more flesh on the bones here. I'm both impressed and surprised by their effort though, given that this SoC will be superseded by a 10nm product within a year.

LegendAlpha

How possible is it that Apple is just using a pair of A35 or A53 as LITTLE cores but they simply won't ever mention it?

Vendor openness is spotty at best, and Apple sees those sorts of details as a point of product differentiation.
I'd like to see more details.
In theory, a custom little core would be more amenable to any custom hooks Apple might want, possibly for the hardware controller to use, or whatever internal tweaks Apple may have and won't talk about.

They have a good architectural team. If there were to be a benefit to doing so Apple would be positioned well to achieve it.

Another set of benefits I can think of is that a simpler core could be used as a pathfinder for some more experimental changes without risking problems with the big cores, and optimizing the two cores with the knowledge that neither needs the same sort of dynamic range that big cores from Intel have.
One possible difference is that there's still demand for using standard little cores as primary cores in larger-count cheap SoCs, so there is some provision in the design for that scenario. Apple could in theory skip that and optimize further.

Legend

Impossible. The cores are said to be in the same L2 cache hierarchy as the big cores, which makes it a micro-architectural impossibility for them to be anything from ARM.

Plus they said the small cores use 1/5th the power of the big cores. If the big cores are still in the same power envelope as the A9's, that makes the small cores vastly exceed the power envelopes of ARM's small cores, and I suspect that they're also much better performing.

VeteranRegular

What if the small cores are just the dual core from the S2 in the Apple Watch? Do we know if Watch OS 3 is 64-bit? Or would Apple possibly run 64-bit cores in a compatibility mode with Watch OS?

Veteran

Hopefully AnandTech or someone else will do another dive on the S2 to get some idea of what the CPU characteristics are. They did one on the S1 and found that it had timings in accordance with Cortex-A7's, 32KB L1 dcache/256KB L2 cache, and a 520MHz clock speed. So it probably was a pretty standard Cortex-A7 implementation. There have been some rumors that the S2 is using Cortex-A32, which would be pretty suitable as ARM claims it's more efficient than Cortex-A7. These CPUs represent a substantially different perf vs. efficiency design point than any of the custom CPUs Apple has done, even their first "Swift" microarchitecture. So it'd seem to be in their best interests to not have to divert resources away from their main uarch design to support a very low power core, but who knows how all of the trade-offs play out.

It is possible to configure some ARM cores to not include L2 cache at all, something that is almost certainly the case with A32. You could technically fit such a configuration with some other external L2. But it'd be a pretty inefficient design.
