A Look At Intel’s OpenCL Performance

It’s not what you might expect…

Intel recently introduced OpenCL 1.2 compatibility with a series of updates to their OpenCL SDK and graphics drivers. Due to the small field of publicly available OpenCL benchmarks, we’ve never done any formal OpenCL testing here at S|A. But now seems as good a time as any, so we’ll be putting two old foes back into the ring for one last fight. Intel’s Core i7-3770k and AMD’s A10-5800k have faced off against each other numerous times in the last six months, and if you were to base your expectations for this OpenCL battle on each contender’s GPU performance, then you’re in for quite a surprise.

OpenCL is all about using GPUs to accelerate applications. To that end we’re going to run almost all of our benchmarks on the GPUs of each of these chips. In the case of Intel’s i7-3770k that means the HD 4000 will be its OpenCL ambassador, and in the case of AMD’s A10-5800k it means that its GPU, the HD 7660D, or Devastator, will be on the front line. Intel’s HD 4000 has 16 EUs and AMD’s HD 7660D has 384 VLIW4 cores arranged into six groups. AMD does have an advantage when it comes to raw GPU power, but Intel has been working on improving their OpenCL implementation, and the HD 4000 can use the i7-3770k’s eight-megabyte L3 cache to improve its performance.

Let’s take a look at the benchmarks we’ll be running today. First up we have CLBenchmark 1.1, which tests a variety of different OpenCL workloads. For our purposes, though, we’ll be looking at the results from three specific tests: Fluid Simulation, Raytracing, and Optical Flow.

Next up we have ratGPU, which is an OpenCL based raytracing renderer. This benchmark measures the time in seconds it takes to complete a series of renderings, so lower scores are better.
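Because a time-based score runs the opposite way from a throughput score, it can help to turn two render times into a single "how many times faster" ratio. The snippet below is a minimal sketch of that arithmetic; the function name and the timings are placeholders of our own, not ratGPU output or results from this article.

```python
def relative_speed(time_a_s: float, time_b_s: float) -> float:
    """Return how many times faster chip A is than chip B.

    Both arguments are render times in seconds (lower is better),
    so the ratio is B's time divided by A's time.
    """
    if time_a_s <= 0 or time_b_s <= 0:
        raise ValueError("render times must be positive")
    return time_b_s / time_a_s


# Placeholder timings for illustration only, not measured results.
print(relative_speed(50.0, 100.0))  # 2.0: A finished in half the time
```

A value above 1.0 means the first chip was faster, and a value below 1.0 means it was slower.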

Following that we have FlopsCL, which is a simple application designed to measure the peak throughput of an OpenCL device in GFLOP/s when doing single precision or double precision floating point calculations. This is probably the most synthetic of all the benchmarks that we’ve chosen for this article.
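For readers unfamiliar with the unit, a GFLOP/s figure is simply the number of floating point operations completed, divided by the elapsed time, divided by one billion. The sketch below shows that conversion under our own assumptions: the function name and the operation count are hypothetical, and we are not claiming this is how FlopsCL counts operations internally.

```python
def gflops(operations: int, elapsed_s: float) -> float:
    """Convert an operation count and a wall-clock time into GFLOP/s."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return operations / elapsed_s / 1e9


# Hypothetical run: 2 trillion floating point operations in 4 seconds.
print(gflops(2_000_000_000_000, 4.0))  # 500.0
```

Keep in mind that a number like this depends on what the tool counts as one operation, which is part of why we call this benchmark synthetic.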

Folding@Home is a distributed computing project that has mostly fallen by the wayside in recent years. But now it seems to be making a bit of a comeback with an OpenCL based implementation. The benchmark that we’ll be using measures performance in terms of nanoseconds simulated per day. One version of this benchmark computes the model with water molecules represented explicitly and the other with water molecules represented implicitly.
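The nanoseconds-per-day score is a projection: it takes the amount of model time covered during a short run and scales it up to a full 24 hours of wall-clock time. Here is a minimal sketch of that scaling. The function name and the sample figures are our own placeholders, not values produced by the Folding@Home benchmark.

```python
SECONDS_PER_DAY = 86_400


def ns_per_day(model_ns: float, wall_clock_s: float) -> float:
    """Scale the model time covered in a run up to a 24-hour day."""
    if wall_clock_s <= 0:
        raise ValueError("wall-clock time must be positive")
    return model_ns * SECONDS_PER_DAY / wall_clock_s


# Hypothetical run: 0.5 ns of model time covered in one hour.
print(ns_per_day(0.5, 3600.0))  # 12.0
```

Unlike the ratGPU timings, this is a higher-is-better score.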

Our final benchmark is LuxMark 2.0. Benchmarking the ‘Salsa’ scene has become a popular way to test the OpenCL based rendering performance of many GPUs. So that’s exactly what we’ll be doing today.

We were a little dubious of our findings at first. So we re-ran all of our benchmarks on our AMD based test system with a fresh install of Windows and the latest drivers. The performance results did not differ between our two Windows installs. There are a few things that we can draw from these results though. Intel’s chip seems to have a pretty significant advantage in raytracing and rendering scenarios. AMD’s APU on the other hand tends to perform best when it’s doing simulations or putting its raw GPU power to the test in synthetic benchmarks.

This is not the outcome that I had expected when I began running these benchmarks. It seems that despite AMD’s greater GPU performance, these two chips offer very similar OpenCL performance. Though it’s clear that AMD leads Intel significantly in some areas, it trails in just as many. Performance-wise, Intel appears to have a rather mature OpenCL implementation, which is surprising considering that Intel has only just added OpenCL 1.2 support whereas AMD has been offering that feature for quite some time.

In any case it’s clear that AMD’s and Intel’s latest chips are in a tight race, competing for the best OpenCL performance and trading the benchmarking crown back and forth at every turn. Intel has the lead for now and seems to have built a fortress out of raytracing performance, but AMD will surely counter. This battle will no doubt get more interesting as AMD and Intel bring their next-generation APUs to market in the near future. S|A

Thomas Ryan is a freelance technology writer and photographer from Seattle, living in Austin. You can also find his work on SemiAccurate and PCWorld. He has a BA in Geography from the University of Washington with a minor in Urban Design and Planning and specializes in geospatial data science. If you have a hardware performance question or an interesting data set Thomas has you covered.

