Apple A11 SoC

According to Geekbench, the iPhone 8 gets about 2800 single-thread and 3800 multi-thread in Low Power Mode. It seems that the A11 runs one Monsoon and one Mistral core in Low Power Mode. Would anyone like to run a VFP benchmark in Low Power Mode?

Newcomer

The A11 makes many significant performance improvements to the GPU. It has up to 2x math performance for tasks such as computer vision, image processing, and machine learning. But that is not the only area of performance improvement. Let us review the improved performance and capabilities of the A11 GPU. We doubled the F16 math and texture filtering rates per clock cycle compared to the A10 GPU. For further details about these features, please check the Metal 2 documentation.
Please note: on the A11, using F16 data types in your shaders whenever possible makes a much larger performance difference.


Source: link (transcript section)
If the A10 GPU's 6-cluster FP16 rate matches that of a 6-cluster PowerVR 7XT(+) (768 FP16 ops/clock and 12 texels/clock), then the A11 GPU would have 1536 FP16 ops/clock and 24 texels/clock, the same as the A10X GPU but at a lower clock, since the peak texture fill rates in GFXBench's texturing offscreen test differ (17002 vs. 21261 MTexel/s).
There is no information on the FP32:FP16 ratio. My guess is that the A11 GPU clock is around 800 MHz.

Could you run the GFXBench low-level and high-level offscreen tests with Low Power Mode on? Since the iPhone 6s, Apple SoCs have throttled a lot on graphics: the iPhone 7 Plus, for example, is only stable at 67% of its initial score in the Manhattan 3.1 long-term benchmark. With Low Power Mode, the iPhone 7 Plus produces a stable result from the beginning.

Legend

Well, just from looking at the numbers, it seems to have about 5 minutes of additional battery life... which would be impressive, considering the (much) faster SoC and, allegedly, a smaller-capacity battery to begin with.

Newcomer

I finally uploaded my iOS Spectre attack proof-of-concept to GitHub, after all the hype died down.
It was actually done a couple of weeks ago, but I was too lazy to prepare and submit it :-/
https://github.com/vvid/ios-spectre-poc

There is also disabled code to check for Meltdown, but it doesn't work (at least with Spectre-V1-like speculation); see lines 835/849.

Newcomer

For GPU GFLOPS, I think the GFXBench Metal 3.0.3 ALU test can be an alternative way to estimate the A11 GPU's FP32 GFLOPS, because my iPhone 7 Plus ALU test score is near its DasherX score (around 300). But you need the Charles Proxy trick to download that old version of GFXBench Metal.

Moderator, Legend, Veteran

Any test suite which has a single "ALU" test doesn't seem very reliable to me. It's frankly complete nonsense to evaluate shader core performance with a *single* microbenchmark as if that was some kind of end-all-be-all of ALU performance. Just the fact someone is doing that indicates to me they probably don't understand how to test ALU performance in a useful way, and therefore I'd be very reluctant to believe any results.

The original GFXBench 3.0 ALU1 test, for example, tests basically nothing but trigonometric performance... GPUs with slower sin/cos (slow because those instructions are bloody useless in most real workloads) score significantly worse on it. The GFXBench 3.1 ALU2 test is slightly better, but it is still just one biased data point among others (e.g. branch and divergence efficiency matter quite a bit more than in typical workloads).

I honestly don't remember anything from any analysis of the A11; I've completely erased it from my brain for lack of interest... which tells me it's probably *not* 2x slower than the A10 for FP32 FMAs, because I'd hopefully have remembered that.

A few random thoughts on what might be going wrong: maybe they're using a very low resolution with lots of blended layers, and there aren't enough tiles/pixels to fill all the shader cores? Or maybe it's just one *massive* shader, and the A11 either has lots of instruction cache misses, or it's a tiny loop with many iterations that the A11 compiler doesn't unroll properly while the A10's did?

There are literally a billion things that could go wrong with any microbenchmark, which is why, if a result seems anomalous, you really want to iterate and modify the test to understand what's going on - or at least have enough knowledge of the trade-offs to intuitively create a test that is unlikely to hit that kind of problem in the first place, and not a lot of people have that knowledge, unfortunately. Kishonti, for example, didn't do that very well - they often relied on the HW vendors telling them everything they did wrong in the beta versions until it kinda sorta worked in the final release...

Veteran

The A10 seems to have some real advantages over the A11 in performance, as shown in numerous speed tests, such as the intensive video rendering/export match-up featured at 10:35 in the following video:

I believe iMovie on iOS makes good use of the GPU to accelerate this render-out process, so I'm not surprised to see Apple's first attempt at a GPU distinct from PowerVR DNA (as much as it is, considering they still follow a solid TBDR path) fail to match the refinement and effectiveness of the PowerVR recipe overall.

Apple should’ve just bought the company at whatever high price was being asked. Having all the top designers they could get, along with the refined IP already designed, is invaluable.
