IDF Fall 2007: Terascale Computing Updates and more

Terascale still moving forward

With our time at IDF quickly drawing to a close, we are in the final stretch of content from the show. While I didn’t get to the morning keynotes on mobility, we have the general basis of their contents here as well as some other interesting topics like terascale processors and Intel’s upcoming Extreme Memory.

Terascale Computing Update

Last year at IDF Intel first started to demonstrate and discuss their terascale computing research project that featured a processor that was a dramatic shift from anything Intel had done in the past. The CPU was built out of 80 separate very small and simple processing cores all connected with some basic communications and local data storage.

The project created a lot of excitement with claims of teraflop computing and higher, all with a very minimal power footprint. At PC Perspective we covered the project in great detail with a pair of articles spanning the previous 18 months or so. The terascale processor is still alive and well at Intel and was on display at the IDF technology showcase.

The engineers on hand were showing the terascale hardware running an example application that performed various typical high-performance computing algorithms and then reported the speed and power consumption of the chip during that time. From the image above you can see that Intel has pushed the chip up to a two teraflop performer, with each of the 80 cores running at 6.26 GHz and the chip consuming 150.47 watts during the computing process. They did show other algorithms being run besides the partial differential equations, and they could also lower the slider on the right-hand side to cap the maximum performance at a lower level, thus decreasing the clock speed of the cores and lowering the total power usage.
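As a rough sanity check on those figures, the quoted throughput, core count, and clock speed together imply how many floating-point operations each core completes per clock cycle. The short Python sketch below works that out. It assumes "two teraflop" means exactly 2 x 10^12 operations per second, and the function name is made up for illustration; nothing here comes from Intel's demo software.

```python
# Figures quoted in the article for the two-teraflop demo.
CORES = 80
CLOCK_HZ = 6.26e9      # 6.26 GHz per core
TOTAL_FLOPS = 2.0e12   # assumption: "two teraflop" taken as exactly 2e12 FLOPS


def flops_per_cycle_per_core(total_flops, cores, clock_hz):
    """Implied floating-point operations per clock cycle on each core."""
    return total_flops / (cores * clock_hz)


per_core = flops_per_cycle_per_core(TOTAL_FLOPS, CORES, CLOCK_HZ)
print(f"Implied FLOPs per cycle per core: {per_core:.2f}")
```

The result lands just under four, which suggests each simple core is retiring roughly four single-precision operations every cycle at this clock speed.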

This poster at the exhibit gives us a lot of information on the CPU itself:

Built on the 65nm process technology

Total die size of 275 mm^2, with each of the 80 tiles measuring 3 mm^2

100 million transistors (much lower than the 731 million on the upcoming Nehalem cores)

Uses a 1248-pin package
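A bit of arithmetic on the poster's numbers helps put them in perspective. The Python sketch below is my own back-of-the-envelope math, not anything from Intel: it works out how much of the die the 80 tiles occupy and an average transistor count per tile. The per-tile figure assumes every transistor sits inside a tile, which overstates it, since some of the budget goes to I/O and other logic outside the tiles.

```python
# Numbers taken from the poster described in the article.
DIE_AREA_MM2 = 275.0
TILE_AREA_MM2 = 3.0
TILE_COUNT = 80
TRANSISTORS = 100e6
NEHALEM_TRANSISTORS = 731e6  # comparison figure quoted in the article


def tile_area_share(tile_area, tile_count, die_area):
    """Fraction of the die covered by the compute tiles."""
    return (tile_area * tile_count) / die_area


def avg_transistors_per_tile(transistors, tile_count):
    """Upper bound: assumes all transistors live inside the tiles."""
    return transistors / tile_count


share = tile_area_share(TILE_AREA_MM2, TILE_COUNT, DIE_AREA_MM2)
per_tile = avg_transistors_per_tile(TRANSISTORS, TILE_COUNT)
ratio = NEHALEM_TRANSISTORS / TRANSISTORS

print(f"Tiles cover {share:.1%} of the die")
print(f"At most {per_tile / 1e6:.2f} million transistors per tile")
print(f"Nehalem figure is {ratio:.2f}x the terascale transistor count")
```

In other words, the tiles account for 240 of the 275 mm^2, and each one is tiny by the standards of a modern CPU core.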

The graphs below the die detail shots show the relationships that Intel’s designers have created between power usage and peak performance numbers. Most interesting is the power that is required to run the chip – at only 0.95v the terascale processor generates just 62 watts of heat and is capable of 1 teraflop of pure horsepower. To get near the 2 teraflop level the CPU is pushing 1.4v, still a reasonable voltage by today’s standards.
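To put the two operating points side by side, the sketch below computes performance per watt from numbers quoted in this article: 1 teraflop at 62 watts, and the two teraflop demo run at 150.47 watts. Treating the teraflop figures as exact round numbers is my assumption, and the function name is purely illustrative.

```python
def gflops_per_watt(total_flops, watts):
    """Efficiency in billions of floating-point operations per second per watt."""
    return total_flops / 1e9 / watts


# (label, FLOPS, watts) as quoted in the article; FLOPS values are
# assumed to be exact round numbers for the sake of the estimate.
operating_points = [
    ("1 teraflop at 0.95v", 1.0e12, 62.0),
    ("2 teraflop demo run", 2.0e12, 150.47),
]

for label, flops, watts in operating_points:
    print(f"{label}: {gflops_per_watt(flops, watts):.2f} GFLOPS per watt")
```

By this estimate the lower-voltage point is the more efficient of the two, at roughly 16 GFLOPS per watt versus roughly 13, which fits the usual pattern of efficiency dropping as voltage and clock speed climb.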

This document goes into even more detail, including some specifics on the individual tiles. Each core is a single-precision processor and uses an on-die mesh interconnect for communication with the nodes directly surrounding it. Internally that mesh can provide 1.62 terabits/s of communications bandwidth. Each tile can also be clocked independently, with scalable power consumption and a heavy dose of dynamic sleeping functions.

The terascale processor was built on a custom PCB motherboard and was using standard liquid cooling. The memory system being used on this demo system is stacked memory – the processor is physically attached to the memory system. That is one of the only ways that Intel has been able to feed a processor with this much power enough data to fully utilize the cores. Unfortunately, the cost structure of stacked memory on a CPU is one of the main inhibitors of terascale computing for now.

The team had a wafer of 65nm terascale processors on hand for me to get my grubby hands on. I only wish they had turned away for just a moment...

Getting an up-close shot was difficult, but in doing so I was actually able to catch a glimpse of each individual tile on each processor.

Again, I left the terascale engineers feeling very impressed with the work they were doing – and fairly confident that this kind of technology is going to end up in many more computers than we realize, and maybe much sooner than we expect.