Intel, The Way It's Meant To Be Played?

** Updated on 5th August 2008 **

Intel's Larrabee Detailed

If you've been keeping abreast of developments from Intel, such as the Intel Roadmap Article we penned earlier this year, you would already have a good idea of Intel's Larrabee project. Unlike the established visual computing leaders AMD and NVIDIA, Intel will approach this space with an architecture it excels at: x86 processing cores. Intel actually did some design experimentation and arrived at a theoretical 10-core, throughput-optimized processor with the same area and power consumption as a dual-core CPU.

Intel's Design Experimentation for Larrabee

| Features                                    | Intel Conroe     | Theoretical Larrabee |
|---------------------------------------------|------------------|----------------------|
| No. of CPU cores                            | 2 (out-of-order) | 10 (in-order)        |
| Instruction per Issue                       | 4 per clock      | 2 per clock          |
| Vector Processing Unit (VPU) lanes per core | 4-wide SSE       | 16-wide              |
| L2 cache size                               | 4MB              | 4MB                  |
| Single-stream throughput                    | 4 per clock      | 2 per clock          |
| Total Vector operations throughput          | 8 per clock      | 160 per clock        |

Because the Larrabee x86 core uses a simpler design derived from the original dual-issue, in-order Pentium processor rather than the modern four-issue, out-of-order Core architecture, single-stream throughput takes a dive. The net result, though, is that the simpler theoretical 10-core design can process 20 times the vector operations per clock of a modern dual-core processor (160 versus 8). This is the idea behind Larrabee: it will contain many of these Intel x86 cores (the exact number is undisclosed), each with a much wider 16-lane SIMD vector processing unit, support for 64-bit extensions and sophisticated pre-fetching. These requirements will introduce a new vector instruction set as well.
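To make the arithmetic behind the table explicit, here is a minimal sketch that recomputes the vector throughput figures from the core counts and VPU widths Intel quoted. The `CoreDesign` class and its field names are our own invention for illustration; the sketch counts one operation per vector lane per clock, which is the simplification the table itself appears to use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreDesign:
    """Hypothetical container for the figures in Intel's comparison table."""
    name: str
    cores: int
    vpu_lanes: int    # vector lanes per core
    issue_width: int  # instructions issued per clock, per core

    def vector_ops_per_clock(self) -> int:
        # Assumes every lane on every core retires one operation per clock.
        return self.cores * self.vpu_lanes


conroe = CoreDesign("Intel Conroe", cores=2, vpu_lanes=4, issue_width=4)
larrabee = CoreDesign("Theoretical Larrabee", cores=10, vpu_lanes=16, issue_width=2)


def vector_speedup(new: CoreDesign, old: CoreDesign) -> float:
    """Ratio of peak vector operations per clock between two designs."""
    return new.vector_ops_per_clock() / old.vector_ops_per_clock()


if __name__ == "__main__":
    for design in (conroe, larrabee):
        print(design.name, design.vector_ops_per_clock(), "vector ops per clock")
    print("Speedup:", vector_speedup(larrabee, conroe))
```

With the table's numbers, the two designs come out at 8 and 160 vector operations per clock respectively, a ratio of 20, while single-stream issue width is halved from 4 to 2. These are peak figures only and say nothing about clock speed or real workloads.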

Take note that unlike traditional GPUs, there is no fixed-function rasterization logic between the vertex and pixel shaders, nor is there fixed-function frame buffer blending in the back end. These functions are all programmable and don't follow the fixed pipeline format. According to data gathered by Intel, there is no single typical workload for games; the mix of work actually varies quite widely. As such, all the processing blocks on Larrabee are fully programmable and can be used to whatever extent the task at hand requires.
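To give a feel for what "rasterization and blending in software" means, here is a minimal sketch of a triangle rasterizer with a swappable blend function. It is not Intel's renderer: the edge-function coverage test is a common textbook technique we have swapped in, and every name here (`edge`, `rasterize`, `alpha_blend`, `draw`) is hypothetical.

```python
def edge(ax, ay, bx, by, px, py):
    """Edge function: the sign tells which side of the line A->B point P is on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def rasterize(tri, width, height):
    """Return the (x, y) pixels whose centres lie inside or on the triangle."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    covered = []
    for y in range(height):
        for x in range(width):
            px, py = x + 0.5, y + 0.5  # sample at the pixel centre
            w0 = edge(x1, y1, x2, y2, px, py)
            w1 = edge(x2, y2, x0, y0, px, py)
            w2 = edge(x0, y0, x1, y1, px, py)
            # Accept either winding order: all three signs must agree.
            if (w0 >= 0 and w1 >= 0 and w2 >= 0) or (w0 <= 0 and w1 <= 0 and w2 <= 0):
                covered.append((x, y))
    return covered


def alpha_blend(src, dst, alpha):
    """Classic 'source over' blend, written as ordinary code instead of fixed hardware."""
    return tuple(alpha * s + (1 - alpha) * d for s, d in zip(src, dst))


def draw(framebuffer, tri, color, alpha, blend=alpha_blend):
    """Rasterize one triangle and blend it into the frame buffer in place."""
    height, width = len(framebuffer), len(framebuffer[0])
    for x, y in rasterize(tri, width, height):
        framebuffer[y][x] = blend(color, framebuffer[y][x], alpha)


if __name__ == "__main__":
    fb = [[(0.0, 0.0, 0.0)] * 4 for _ in range(4)]
    draw(fb, [(0, 0), (4, 0), (0, 4)], (1.0, 0.0, 0.0), 0.5)
    print(fb[0][0], fb[3][3])
```

Because `blend` is just a function argument, a different blend mode is a one-line change rather than a hardware feature, which is the flexibility the programmable approach is after. A real renderer would process many pixels per step rather than one at a time.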

This brings us to the last topic for Larrabee: the API used to interface with the hardware. As mentioned previously, Larrabee will be able to handle DirectX and OpenGL calls, but not natively in hardware; rather, a software translator/renderer will sit between DirectX and OpenGL instructions and Larrabee's x86-based hardware. Part of the goal is to ensure this translation happens fast enough to be seamless at run time. This is where Intel has to get its software stack as ready as it can, since game developers are already accustomed to DirectX/OpenGL programming. Additionally, Intel's shoddy track record with IGP drivers isn't exactly a beacon of light, so it's imperative that Intel gets its act together if Larrabee is to be successful.

However, thanks to the x86 processing cores in Larrabee, it actually supports C/C++ just like your desktop processors. In that sense, it's a lot easier to program for the Larrabee hardware since programming in this language is nothing new. The only requirement is to be aware of the newer core extensions and how to use them to get the best out of Larrabee. Although C/C++ is a common programming language, it isn't the usual way to program graphics hardware in the game development world. If developers spend some time on this 'new' code path, they may be able to harness more out of the Larrabee architecture, either through special implementations or through much faster processing, since the code is native to the hardware.
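To illustrate what being "aware of the newer core extensions" could involve, here is a rough model of a loop restructured around a 16-lane vector unit. Intel has not detailed the new vector instructions here, so the functions below are purely hypothetical stand-ins; in particular, the per-lane mask used to handle a partial final chunk is our own assumption, and plain Python lists stand in for vector registers.

```python
LANES = 16  # VPU width quoted for Larrabee; everything else here is assumed


def vector_madd(a, b, c, mask):
    """Model one 16-lane multiply-add: masked-off lanes pass c through unchanged."""
    return [x * y + z if m else z for x, y, z, m in zip(a, b, c, mask)]


def saxpy_in_chunks(scale, xs, ys):
    """Compute scale * x + y over two equal-length lists, one 16-wide chunk at a time."""
    out = []
    for start in range(0, len(xs), LANES):
        x = list(xs[start:start + LANES])
        y = list(ys[start:start + LANES])
        active = len(x)
        pad = LANES - active
        # Pad the last chunk to full width and mask off the padding lanes.
        x += [0.0] * pad
        y += [0.0] * pad
        mask = [True] * active + [False] * pad
        result = vector_madd([scale] * LANES, x, y, mask)
        out.extend(result[:active])
    return out


def vector_issues(n):
    """Number of 16-wide operations needed to cover n elements."""
    return -(-n // LANES)  # ceiling division


if __name__ == "__main__":
    print(saxpy_in_chunks(2.0, [1.0] * 20, [3.0] * 20))
    print(vector_issues(20), "vector operations instead of 20 scalar ones")
```

The point of the sketch is the shape of the code rather than its speed: work is grouped into 16-element batches, so 20 elements take two vector operations instead of 20 scalar ones, and the programmer (or compiler) has to deal with the leftover lanes.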

There's certainly interesting potential in Larrabee and in how it combines traditional processing tasks with stream processing tasks using an x86-based core at its heart, but at this point in time, Larrabee has a long way to go. No expected performance results of any sort have been shared so far, and on the existing launch timeline, Larrabee is expected to see the light of day during 2010. That sounds like a long time away, but time flies faster than you think. Later this month, Intel will host its IDF Fall conference, and we expect to see some nice updates on Larrabee, among other developments on its next-generation server/desktop architecture, Nehalem. For now, we'll leave you with this rough performance scaling chart that Intel has shared on how it expects Larrabee to perform (although it isn't informative enough without any points of reference, it does give you an idea of how multiple Larrabee-based product SKUs will come about):