GPGPU and cloud computing have been hot topics for the last several years. Intel has shown off several designs like Larrabee and the Single-chip Cloud Computer in the past. However, it is Knights Corner that will be the firm's first commercial product to use the Many Integrated Core (MIC) architecture. The co-processor will be offered as a PCIe add-in board.

The MIC concept is simple: use an architecture specifically designed to process highly parallel workloads, but ensure compatibility with existing x86 programming models and tools.

This would let MIC co-processors run existing applications without porting the code to a new programming environment, theoretically allowing existing x86-based applications to get maximum performance from the CPU and co-processor simultaneously. It would also save the time, cost and resources otherwise needed to rewrite them in alternative proprietary languages.

AMD and NVIDIA have been trying to do the same with their latest architectures by enabling support for languages like C++, but Intel wants to challenge them in this potentially lucrative market.

Knights Corner will be manufactured using Intel’s latest 3-D Tri-Gate P1270 22nm transistor process and will feature more than 50 cores. Intel demonstrated first silicon of Knights Corner at the SC11 conference yesterday. The co-processor wowed the crowd by delivering more than 1 TeraFLOPS of double precision floating point performance.

The firm also touted its "commitment to delivering the most efficient and programming-friendly platform for highly parallel applications", and showed off the benefits of the MIC architecture in weather modeling, tomography, protein folding, and advanced materials simulation at its booth.

There is no timeframe on when Knights Corner will enter production or be available to customers.

Comments

515 Gigaflops is what the Tesla is able to do in double precision. The Intel drop-in card can do 1000+ Gigaflops in double precision. This makes the Intel card twice as fast. Impressive to say the least.

Actually, the Tesla C2050 can do 515 DP GFLOPS peak. According to this NVIDIA presentation (http://www.nvidia.com/content/GTC-2010/pdfs/2057_G...), it looks like they get around 360 GFLOPS on DGEMM. Hence the single Knights Corner chip is delivering around 3x the performance of the Tesla C2050. It'll be quite interesting to see what production performance looks like... not to mention power consumption.

True, but Intel is doing this on a single chip vs. the SLI configuration, which still requires two or three GPUs. I'm sure Intel is also working on a way to make multiple of these chips work in parallel like SLI/X-Fire.

To be fair, he's comparing 3 cards that actually exist and can be bought in the shops to one specially selected piece of silicon that only exists in Intel's super secret research lair.

By the time that Intel's processor reaches the market, it'll be interesting to see what one NVIDIA card could do, or however many you could buy and put in a rig for the same money as the rig that you'll need to get those figures out of a Knights processor.

Still, at least it's good to see that the Larrabee time and money might not be going to waste after all.

I'm not saying Intel will dominate, or that this card will even see the light of day in time to be relevant, but to say something like "we can already do this, but with 3 cards!" is silly. It's similar to someone dropping a petaflop chip and someone else saying "we've done that with supercomputers for years!"

I can see NVIDIA outpacing them in raw performance, especially since there's no release window, if it is ever released at all.

They're touting x86 compatibility, though. With GPGPU you're porting/building new CUDA/OpenCL software. Theoretically this chip would work with current software, which is very enticing for developers and for those of us using slow Adobe software.

As I stated above, it's quite important to differentiate between theoretical and actual performance. The Tesla M2090 has a peak theoretical throughput of 665 GFLOPS. From what I've found in NVIDIA's own presentations for their C2050, that theoretical throughput on the M2090 will likely go down to around 470 GFLOPS on the common DGEMM. The 1 TFLOPS demonstration on Knights Corner was on DGEMM; it's not a theoretical max.
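
For anyone who wants to check the scaling, here is a rough sketch of the arithmetic in Python. It uses only the figures quoted in this thread; the names are illustrative, and it assumes the M2090 reaches the same fraction of its peak on DGEMM as the C2050 does, which is an extrapolation and not a measurement.

```python
# Figures quoted in the comments, in double-precision GFLOPS.
C2050_PEAK = 515.0
C2050_DGEMM = 360.0            # approximate, read off the NVIDIA presentation
M2090_PEAK = 665.0
KNIGHTS_CORNER_DGEMM = 1000.0  # "more than 1 TFLOPS", treated as a lower bound


def dgemm_efficiency(measured: float, peak: float) -> float:
    """Fraction of theoretical peak actually achieved on DGEMM."""
    return measured / peak


def estimate_dgemm(peak: float, efficiency: float) -> float:
    """Scale a theoretical peak by an assumed DGEMM efficiency."""
    return peak * efficiency


# Assumption: the M2090 hits the same fraction of peak as the C2050.
efficiency = dgemm_efficiency(C2050_DGEMM, C2050_PEAK)
m2090_dgemm = estimate_dgemm(M2090_PEAK, efficiency)

ratio_vs_c2050 = KNIGHTS_CORNER_DGEMM / C2050_DGEMM
ratio_vs_m2090 = KNIGHTS_CORNER_DGEMM / m2090_dgemm

print(f"C2050 DGEMM efficiency:   {efficiency:.2f}")
print(f"Estimated M2090 DGEMM:    {m2090_dgemm:.0f} GFLOPS")
print(f"Knights Corner vs C2050:  {ratio_vs_c2050:.1f}x")
print(f"Knights Corner vs M2090:  {ratio_vs_m2090:.1f}x")
```

Under that assumption the C2050 runs at roughly 70% of peak, the M2090 estimate comes out near 465 GFLOPS (in line with the "around 470" above), and the demonstrated Knights Corner figure is about 2.8x the C2050 and about 2.2x the estimated M2090 on DGEMM.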