Following our launch article I promised an update on the performance scores of the Exynos 9810 variant of the Galaxy S9. I was able to have some time with one of the demo devices at the launch event and thoroughly benchmark it with a few of our common tests.

As a refresher, earlier in the year Samsung LSI dropped a bombshell by claiming an astounding 2x single-thread performance improvement for the new Exynos 9810. While this initially caused a lot of controversy and discussion about the validity of the claim, we exclusively covered the high-level micro-architectural features of the new Exynos M3 core earlier this year, and by then it was clear that the performance claims were not just marketing. The new Samsung CPU core is the first “very wide” CPU microarchitecture to power Android SoCs and the first to finally follow in Apple’s footsteps in the direction of maximising single-thread performance. As a result it stands to be a very interesting - and ideally very powerful - SoC for the Android market.

Determining Clock Speeds

Firstly, one of the biggest questions for me was confirming the final clock that Samsung would use on the Galaxy S9. We detected the clock as 2704 MHz, which is roughly 200 MHz less than the 2.9 GHz that Samsung's LSI division advertises for the chipset. What makes the story more compelling is that the 2.7 GHz clock is only achievable when a single core in the cluster is active, meaning Samsung employs scalable maximum frequencies depending on the number of active cores in the big cluster. With two active cores the frequency drops to 2314 MHz, while with three or four active cores the cores clock down to only 1794 MHz.
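The frequency caps measured above can be written down as a small lookup table. This is only an illustrative sketch: the names `OBSERVED_MAX_MHZ` and `max_big_core_mhz` are hypothetical, and the table holds just the values observed on the demo unit, not Samsung's actual DVFS tables.

```python
# Maximum big-cluster frequency (MHz) observed on the demo unit,
# keyed by the number of active M3 cores.
OBSERVED_MAX_MHZ = {1: 2704, 2: 2314, 3: 1794, 4: 1794}


def max_big_core_mhz(active_cores: int) -> int:
    """Return the observed frequency cap for a number of active big cores."""
    if active_cores not in OBSERVED_MAX_MHZ:
        raise ValueError("the big cluster has between 1 and 4 cores")
    return OBSERVED_MAX_MHZ[active_cores]


if __name__ == "__main__":
    for n in range(1, 5):
        print(f"{n} active core(s): {max_big_core_mhz(n)} MHz")
```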

We can also confirm that the Mali G72MP18 GPU is running at a very conservative 572MHz. This is not what we had expected - the previous generation Exynos 8895 had a larger MP20 configuration, running at a similar 546MHz. The resulting performance gains for the GPU thus seem to be even lower than we had expected, as I was betting on a ~650-700 MHz clock for the graphics.

Memory Latency

I was also able to confirm the cache configurations of the CPUs with help of our latency test. The L1D cache of the M3 cores is 64KB, up from the 32KB on the previous generation. The M3 cores also come with 512KB of private L2 caches, and a shared 4MB L3 cache.

The little A55 cores came as a surprise as they look to be in a separate cluster, rather than in a single DynamIQ cluster with the big cores. This creates something similar to a big.LITTLE design, but each part of the 4+4 is its own DynamIQ cluster. It looks like Samsung has decided not to employ the optional L2 caches for the Cortex A55s, and instead the cluster relies solely on a shared 512KB L3 cache in the DSU. The latency scores to DRAM are outlandishly good and the best we’ve ever seen among current Android SoCs, so Samsung has definitely introduced a new generation of interconnect or memory controllers.
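The article does not describe how its latency test works. A common way to build such a test is pointer chasing: every load depends on the result of the previous one, so the time per step approximates access latency for a given working-set size. The sketch below shows that idea only; the function names are hypothetical, and in Python the interpreter overhead dominates the timing, so a real tool would be written in native code.

```python
import random
import time


def build_chase_chain(n: int, seed: int = 0) -> list[int]:
    """Build a random single-cycle permutation (Sattolo's algorithm).

    chain[i] holds the index to visit after i. Following the chain from
    any start visits all n slots exactly once before returning.
    """
    rng = random.Random(seed)
    chain = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # j < i, which guarantees a single cycle
        chain[i], chain[j] = chain[j], chain[i]
    return chain


def chase(chain: list[int], steps: int) -> tuple[int, float]:
    """Follow the chain for `steps` dependent loads.

    Returns the final index and the average nanoseconds per step.
    """
    idx = 0
    start = time.perf_counter_ns()
    for _ in range(steps):
        idx = chain[idx]
    elapsed = time.perf_counter_ns() - start
    return idx, elapsed / steps


if __name__ == "__main__":
    # Sweep a few working-set sizes (in elements, not bytes).
    for size in (1 << 10, 1 << 14, 1 << 18):
        _, ns = chase(build_chase_chain(size), steps=200_000)
        print(f"{size:>8} elements: {ns:.1f} ns per step")
```

Sweeping the working-set size and looking for steps in the latency curve is how cache capacities such as the 64KB L1D or the 4MB L3 would show up in a native implementation.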

Parsing the Benchmark Results: Geekbench Looks Good

In our testing we were able to confirm the already-leaked Geekbench 4 scores, in which the Exynos 9810 achieves excellent performance gains, vastly outpacing the Snapdragon 845 and coming into the territory of the Apple A10 and A11. Meanwhile, versus the last-generation Exynos 8895, the floating point performance increase handily exceeds Samsung’s projected gain of 2x, as we see a 114% improvement even at the lowered 2.7GHz frequency.

When looking at the performance per clock it is clear how the Exynos M3 distinguishes itself as a much wider microarchitecture compared to any other existing CPU which powers Android SoCs.
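To make the performance-per-clock point concrete, the arithmetic can be sketched as follows. The 114% floating point improvement and the 2704 MHz clock come from the measurements above. The 2314 MHz peak clock for the Exynos 8895's big cores is an assumption that this article does not state, and the helper name is hypothetical.

```python
def per_clock_speedup(improvement_pct: float, new_mhz: float, old_mhz: float) -> float:
    """Convert a percentage improvement into a clock-normalised speedup."""
    speedup = 1 + improvement_pct / 100  # 114% improvement -> 2.14x
    clock_ratio = new_mhz / old_mhz      # how much faster the new part clocks
    return speedup / clock_ratio


if __name__ == "__main__":
    # old_mhz=2314 is an assumed figure for the Exynos 8895 big cores.
    gain = per_clock_speedup(114, new_mhz=2704, old_mhz=2314)
    print(f"Per-clock floating point speedup: {gain:.2f}x")
```

Under that assumption, most of the gain comes from the wider core rather than from the higher clock.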

Parsing the Benchmark Results: PCMark and Web Tests

Finally I stumbled upon some very questionable performance figures when testing system performance. I’m not going to go into the details for every benchmark as they are generally all painting the same picture:

What seems clear is that there is something very, very wrong with the Exynos 9810 S9+ that I tested. It was barely able to distinguish itself from last year’s Exynos 8895, let alone the Snapdragon 845 in the Qualcomm Reference Device which we previewed earlier this month. I looked through the system and monitored frequencies, and the big cores were indeed reaching the maximum 2.7GHz core frequency. The only explanation I have right now is that the DVFS configuration, as well as the scheduler, are currently so conservatively tuned that there is barely any activity on the big cores.

I dug a bit more through the system and found out Samsung uses some new scheduler called “eHMP”. I’m not sure if this is something based on EAS but the system did use schedutil as a frequency governor.
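For readers who want to do similar spot checks, the frequency governor and current clocks are exposed through the standard Linux cpufreq sysfs nodes. The sketch below is not the tooling used for this article. It assumes the nodes are readable from the shell on the device, and the helper name and the eight-core `cpu0`..`cpu7` numbering are assumptions.

```python
from pathlib import Path

CPUFREQ_ROOT = Path("/sys/devices/system/cpu")


def read_cpufreq(cpu: int, root: Path = CPUFREQ_ROOT) -> dict:
    """Read the governor and current/max frequency (kHz) for one CPU."""
    base = root / f"cpu{cpu}" / "cpufreq"
    return {
        "governor": (base / "scaling_governor").read_text().strip(),
        "cur_khz": int((base / "scaling_cur_freq").read_text()),
        "max_khz": int((base / "cpuinfo_max_freq").read_text()),
    }


if __name__ == "__main__":
    for cpu in range(8):  # assumption: eight cores, numbered cpu0..cpu7
        try:
            print(cpu, read_cpufreq(cpu))
        except OSError as exc:
            print(f"cpu{cpu}: cpufreq nodes not readable ({exc})")
```

Polling `cur_khz` while a benchmark runs shows whether the big cores actually spend time at their top frequency or merely touch it briefly.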

One of the Samsung spokesmen confirmed to me that the demo units were running special firmware for MWC and that they might not be optimized. I’m having a hard time believing they would so drastically limit the performance of the device for the show demo units, and even less that they would mess around with the scheduler settings. I did get confirmation that Samsung is planning to “tune down” the Exynos variant to match the Snapdragon performance. However, the current scores which I got on these devices make absolutely no sense, so I do hope this is just a mistake that will be resolved in shipping firmware and that we will see the full potential of the SoC.

Parsing the Benchmark Results: Graphics

On the GPU side, the lower core count of the new Mali G72MP18 is a surprise, as the minor clock bump is negated by the fact that the new SoC has two fewer GPU cores than the 8895. If the performance per clock per core of the G71 and G72 were the same, this would actually mean a downgrade in raw GPU power from the Exynos 8895, so any increase should come solely from the architectural changes of the new G72 GPU, power efficiency improvements, and possibly SoC memory subsystem improvements.
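The "downgrade in raw GPU power" argument is simple arithmetic on the core counts and clocks quoted earlier. The sketch below uses cores × MHz as a crude proxy for throughput, which is an assumption that holds only if per-core, per-clock performance is identical between the G71 and G72. The function name is hypothetical.

```python
def raw_throughput(cores: int, mhz: float) -> float:
    """Crude throughput proxy: number of shader cores times clock in MHz."""
    return cores * mhz


exynos_9810 = raw_throughput(18, 572)  # Mali G72MP18 at 572 MHz
exynos_8895 = raw_throughput(20, 546)  # Mali G71MP20 at 546 MHz
ratio = exynos_9810 / exynos_8895

if __name__ == "__main__":
    print(f"9810 vs 8895 raw throughput: {ratio:.3f}x ({(ratio - 1) * 100:+.1f}%)")
```

By this proxy the new configuration comes out roughly 6% behind, so any measured gain has to come from the architecture and the memory subsystem.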

In T-Rex the increase is 18%, which suggests it might be one of the benchmarks from which Samsung sourced its 20% improvement claim. Here the Exynos is closer to the performance of the Snapdragon 845.

Measuring Power

I wasn’t able to properly measure power on the event demo devices, as they had different interface settings than my tool had been programmed for, so I was only able to make some inaccurate estimates based on coarse current readouts from the system.

For CPU workloads, our usual CPU power virus used up 3.1W at 1-core 2.7 GHz loads. A 2-core load at 2.3 GHz seemed to float around 3.1-3.5W, and a 4-core load at 1.8 GHz maintained this power consumption.

I will need more time over the following days, and hopefully some SPEC figures, to paint a more accurate picture. For now the results could swing either way and be either positive or negative for the M3 cores. It’s clear that the higher frequencies carry a very large power penalty, and Samsung should want to operate more in the low-to-mid frequencies, hence the current frequency scheme.
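A rough way to see the power penalty is to divide the coarse power estimates above by the number of active cores. This is a sketch under loud assumptions: the readings were inaccurate to begin with, the 4-core figure is taken to span the same 3.1-3.5W range as the 2-core one, and dividing evenly ignores shared cache and interconnect power. The names are hypothetical.

```python
# (active cores, clock in GHz, low estimate in W, high estimate in W)
ESTIMATES = [
    (1, 2.7, 3.1, 3.1),
    (2, 2.3, 3.1, 3.5),
    (4, 1.8, 3.1, 3.5),  # assumed to match the 2-core range
]


def per_core_watts(total_watts: float, cores: int) -> float:
    """Split a cluster power reading evenly across the active cores."""
    return total_watts / cores


if __name__ == "__main__":
    for cores, ghz, low, high in ESTIMATES:
        lo_w = per_core_watts(low, cores)
        hi_w = per_core_watts(high, cores)
        print(f"{cores} core(s) at {ghz} GHz: {lo_w:.2f}-{hi_w:.2f} W per core")
```

Even with these caveats, a core at 2.7 GHz draws several times what a core at 1.8 GHz does for a 50% higher clock.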

On the GPU side, power in Manhattan fluctuated between 4.5 and 5.2W, which is an improvement over the Exynos 8895. But again, this is still at a disadvantage compared to the Snapdragon 845.

Quick Thoughts

Overall, today’s quick benchmarking session opened up more questions than it managed to answer. Hopefully with more time we will be able to investigate the workings of the new SoC and, fingers crossed, today’s results are not representative of the shipping product, as that would be an utterly massive disappointment.

123 Comments

I was really hoping they would go with 2 M3's and 4 M2's. At 10nm LPP, wouldn't it be relatively efficient if last year's cores were heavily tuned down? I mean, the small cluster is usually always close to max clocks... I don't know, the chip would have ballooned causing it not to be economically feasible.

Reminds me of another SoC containing 810 in the name. Also in that SoC there was 'almost no activity on big cores'. Here at least throttling on big cores is going to be ~50% leading to maybe acceptable system performance. On SD810 throttling was up to 75%.

That's ignoring the transient load issues with Apple's A9 that regularly and repeatedly flatline the whole device after 6-9 months of normal use. A11 also struggles to maintain performance under a sustained load, although it seems churlish to complain when its overall performance still remains relatively high.

The iphone 6, 6s and 7 have been found to permanently throttle the CPU within the reasonable 2 year operation of the device. Even a slight battery wear could cause the devices in question to shut down so Apple kneecapped them. The CPUs are good initially, but the numbers are not sustainable.

Even the relatively new A10 is throttled already, as the folks from GB have shown. According to a letter to a US committee of some sort, the iphone 8 and the x have hardware in them that fixes this. Unfortunately, even after battery swaps the previous iphones will arrive at the same permanent throttle in an unreasonably short length of time.

Evidence that it is reasonable use over two years? I have plenty of anecdotal use over two years with no throttling, even going on 3.5 years. There is no fixed battery life, it all depends on use/abuse and charging patterns.

It was common enough for a given sample that Apple issued the kneecapping fix. Before the patch they issued a limited recall as shutdown issues started to multiply. They realized that the issue is wider than they thought and issued the patch instead of issuing a total recall.

In Geekbench HQ's example an iphone 7 was already throttled after one year. In many parts of Europe this constitutes a concealed defect. It is not reasonable to lose power after one year. It is reasonable to lose battery life. Furthermore, no other device does this.

Think of the legal ramifications. Apple has gamed strict warranty and insurance terms. This permanent loss of performance instead of battery life is also a hidden flaw illegal in many parts of the world.