Choosing a Testbed CPU

Although I was glad I could put some of these old GPUs to use (somewhat justifying them occupying space for years in my parts closet), there was the question of what CPU to pair them with. Go too insane on the CPU and I may unfairly tilt performance in favor of these cards. What I decided to do was to simulate the performance of the Core i5-3317U in Microsoft's Surface Pro. That part is a dual-core Ivy Bridge with Hyper Threading enabled (4 threads). Its max turbo is 2.6GHz for a single core, 2.4GHz for two cores. I grabbed a desktop Core i3 2100, disabled turbo, and forced its default clock speed to 2.4GHz. In many cases these mobile CPUs spend a lot of time at or near their max turbo until things get a little too toasty in the chassis. To verify that I had picked correctly I ran the 3DMark Physics test to see how close I came to the performance of the Surface Pro. As the Physics test is multithreaded and should be completely CPU bound, it shouldn't matter what GPU I paired with my testbed - they should all perform the same as the Surface Pro:

Great success! With the exception of the 8500 GT, which for some reason is a bit of an overachiever here (7% faster than Surface Pro), the rest of the NVIDIA cards all score within 3% of the performance of the Surface Pro - despite being run on an open-air desktop testbed.

With these results we also get a quick look at how AMD's Bobcat cores compare against the ARM competitors it may eventually do battle with. With only two Bobcat cores running at 1.6GHz in the E-350, AMD actually does really well here. The E-350's performance is 18% better than the dual-core Cortex A15 based Nexus 10, but it's still not quite good enough to top some of the quad-core competitors here. We could be seeing differences in drivers and/or thermal management with some of these devices since they are far more thermally constrained than the E-350. Bobcat won't surface as a competitor to anything you see here, but its faster derivative (Jaguar) will. If AMD can get Temash's power under control, it could have a very compelling tablet platform on its hands. The sad part in all of this is the fact that AMD seems to have the right CPU (and possibly GPU) architectures to be quite competitive in the ultra mobile space today. If AMD had the capital and relationships with smartphone/tablet vendors, it could be a force to be reckoned with in the ultra mobile space. As we've seen from watching Intel struggle however, it takes more than just good architecture to break into the new mobile world. You need a good baseband strategy and you need the ability to get key design wins.

Enough about what could be, let's look at how these mobile devices stack up to some of the best GPUs from 2004 - 2007.

We'll start with 3DMark. Here we're looking at performance at 720p, which immediately stops some of the cards with 256-bit memory interfaces from flexing their muscles. Never fear, we will have GL/DXBenchmark's 1080p offscreen mode for that in a moment.

Graphics Test 1

Ice Storm Graphics test 1 stresses the hardware’s ability to process lots of vertices while keeping the pixel load relatively light. Hardware on this level may have dedicated capacity for separate vertex and pixel processing. Stressing both capacities individually reveals the hardware’s limitations in both aspects.

In an average frame, 530,000 vertices are processed leading to 180,000 triangles rasterized either to the shadow map or to the screen. At the same time, 4.7 million pixels are processed per frame.

Right off the bat you should notice something wonky. All of NVIDIA's G70 and earlier architectures do very poorly here. This test is very heavy on the vertex shaders, but the 7900 GTX and friends should do a lot better than they are. These workloads however were designed for a very different set of architectures. Looking at the unified 8500 GT, we get some perspective. The fastest mobile platforms here (Adreno 320) deliver a little over half the vertex processing performance of the GeForce 8500 GT. The Radeon HD 6310 featured in AMD's E-350 is remarkably competitve as well.

The praise goes both ways of course. The fact that these mobile GPUs can do as well as they are right now is very impressive.

Graphics Test 2

Graphics test 2 stresses the hardware’s ability to process lots of pixels. It tests the ability to read textures, do per pixel computations and write to render targets.

On average, 12.6 million pixels are processed per frame. The additional pixel processing compared to Graphics test 1 comes from including particles and post processing effects such as bloom, streaks and motion blur.

In each frame, an average 75,000 vertices are processed. This number is considerably lower than in Graphics test 1 because shadows are not drawn and the processed geometry has a lower number of polygons.

The data starts making a lot more sense when we look at the pixel shader bound graphics test 2. In this benchmark, Adreno 320 appears to deliver better performance than the GeForce 6600 and once again roughly half the performance of the GeForce 8500 GT. Compared to the 7800 GT (or perhaps 6800 Ultra), we're looking at a bit under 33% of the performance of those cards. The Radeon HD 6310 in AMD's E-350 appears to deliver performance competitive with the Adreno 320.

The overall graphics score is a bit misleading given how poorly the G7x and NV4x architectures did on the first graphics test. We can conclude that the E-350 has roughly the same graphics performance as Qualcomm's Snapdragon 600, while the 8500 GT appears to have roughly 2x that. The overall Ice Storm scores pretty much repeat what we've already seen:

Again, the new 3DMark appears to unfairly penalize the older non-unified NVIDIA GPU architectures. Keep in mind that the last NVIDIA driver drop for DX9 hardware (G7x and NV4x) is about a month older than the latest driver available for the 8500 GT.

It's also worth pointing out that Ice Storm also makes Intel's HD 4000 look very good, when in reality we've seen varying degrees of competitiveness with discrete GPUs depending on the workload. If 3DMark's Ice Storm test could map to real world gaming performance, it would mean that devices like the Nexus 4 or HTC One would be able to run BioShock 2-like titles at 10x7 in the 20 fps range. As impressive as that would be, this is ultimately the downside of relying on these types of benchmarks to make comparisons - they fundamentally tell us how well these platforms would run the benchmark itself, not other games unfortunately.

At a high level, it looks like we're somewhat narrowing down the level of performance that today's high end ultra mobile GPUs deliver when put in discrete GPU terms. Let's see what GL/DXBenchmark 2.7 tell us.

+1Anand sticking to the subject and diving into details and at the same time giving perspective is great work!I dont know if i buy the convergence thinking on the technical side, because from here it look like people is just buying more smatphones and far less desktop. The convergence is there a little bit, but i will see the battle on the field before it gets really intereting. Atom is not yet ready for phones and bobcat is not ready for tablets. When they get there, where are arm then?

If Atom is ready for tablets, then Bobcat is more than ready. The Z-60 may only have one design win (in the Vizio Tablet PC), but it should deliver comparable (if not somewhat superior) CPU performance with much, much better GPU performance.Reply

Uh, no... The Hondo Z-60 is basically just an update to the Desna, which itself was derived from the AMD Fusion/Brazos Ontario C-50.

While it is true that Bobcats are superior to ATOM processors for equivalent clock speeds. The problem is AMD has to deal with higher power consumption and that generates more heat, which in turn forces them to lower the max clock speed... especially, if they want to offer anywhere near competitive run times.

So the Bobcat cores for the Z-60 are only running at 1GHz, while Clover Trail ATOM is running at 1.8GHz (Clover Trail+ even goes up to 2GHz for the Z2580, that that version is only for Android devices). The differences in processor efficiency is overcome by just a few hundred MHz difference in clock speed.

Meaning you actually get more CPU performance from Clover Trail than you would a Hondo... However, where AMD holds dominance over Intel is in graphical performance and while Clover Trail does provide about 3x better performance than previous GMA 3150 (back in the netbook days of Pine Trail ATOM) it is still about 3x less powerful as the Hondo graphical performance.

Only other problems is Hondo only slightly improves power consumption compared to the previous Desna, down to about 4.79W max TDP though that is at least nearly half of the original C-50 9W...

However, keep in mind Clover Trail is a 1.7W part and all models are fan-less but Hondo models will continue to require fans.

While AMD also doesn't offer anything like Intel's advance S0i Power Management that allows for ARM like extreme low mw idling states and allowing for features like always connected standby.

So the main reason to get a Hondo tablet is because it'll likely offer better Linux support, which is presently virtually non-existent for Intel's present 32nm SoC ATOMs, and the better graphical performance if you want to play some low end but still pretty good games.

It's the upcoming 28nm Temash that you should keep a eye out for, being AMD's first SoC that can go fan-less for the dual core version and while the quad core version will need a fan, it will offer a Turbo docking feature that lets it go into a much higher 14W max TDP power mode that will provide near Ultrabook level performance... Though the dock will require an additional battery and additional fans to support the feature.

Intel won't be able to counter Temash until their 22nm Bay Trail update comes out, though that'll be just months later as Bay Trail is due to start shipping around September of this year and may be in products in time for the holiday shopping season.Reply

It is in several phones already, where it delivers a strong performance from a CPU and power consumption perspective. It's weak point is the GPU from Imagination. In two years time, ARM will be a distant memory in high-end tablets and possibly high-end smartphones too, even more so if we get advances in battery technology.Reply

Well Atom is in several phones that do not sell in any meaningfull matter. Sure there will be x86 in high-end tablets, and jaguar will make sure that happens this year, but will those tablets matter? There is ARM servers also. Do they matter?Right now there is sold tons of cheap 40nm A9 products, the consumers is just about to get into 28nm quadcore A7 at 2mm2 for the cpu part. And they are ready for cheap, slim phones, with google play, and acceptable graphics performance for templerun 2.Reply

Also note that despite Anand making the odd "Given that most of the ARM based CPU competitors tend to be a bit slower than Atom" claim, the Atom 2760 in the Vivo Tab Smart scores consistently the worst on both the CPU and GPU tests. Even Surface RT with low clocked A9's beats it. That means Atom is not even tablet-ready...Reply

The Atom scores worse in 3DMark's physics test, yes. But any other CPU benchmark I've seen has always favored Clover Trail over any A9-based ARM SoC. A15 can give the Atom a run for its money, though.Reply

Well I haven't seen Atom beat an A9 at similar frequencies except perhaps SunSpider (a browser test, not a CPU test). On native CPU benchmarks like Geekbench Atom is well behind A9 even when you compare 2 cores/4 threads with 2 cores/2 threads.Reply