We just got off the phone with Nick Knupffer of Intel, who confirmed something that has long been speculated upon: the fate of Larrabee. As of today, the first Larrabee chip’s retail release has been canceled. This means that Intel will not be releasing a Larrabee video card or a Larrabee HPC/GPGPU compute part.

The Larrabee project itself has not been canceled, however, and Intel is still hard at work developing their first entirely in-house discrete GPU. The first Larrabee chip (which, for lack of an official name, we're going to be calling Larrabee Prime) will instead be used for the R&D of future Larrabee chips, in the form of development kits for internal and external use.

The big question of course is “why?” Officially, the reason why Larrabee Prime was scrubbed was that both the hardware and the software were behind schedule. Intel has left the finer details up to speculation in true Intel fashion, but it has been widely rumored in the last few months that Larrabee Prime has not been performing as well as Intel had been expecting it to, which is consistent with the chip being behind schedule.

Bear in mind that Larrabee Prime’s launch was originally scheduled to be in the 2009-2010 timeframe, so Intel has already missed the first year of their launch window. Even with TSMC’s 40nm problems, Intel would have been launching after NVIDIA’s Fermi and AMD’s Cypress, if not after Cypress’ 2010 successor too. If the chip was underperforming, then the time element would only make things worse for Intel, as they would be setting up Larrabee Prime against successively more powerful products from NVIDIA and AMD.

The software side leaves us a bit more curious, as Intel normally has a strong track record here. Their x86 compiler technology is second to none, and as Larrabee Prime is x86 based, this would have left them in a good starting position for software development. What we're left wondering is whether the software setback was for overall HPC/GPGPU use, or if it was for graphics. Certainly the harder part of Larrabee Prime's software development would be the need to write graphics drivers from scratch that were capable of harnessing the chip as a video card, taking into consideration the need to support older APIs such as DX9 that make implicit assumptions about the layout of the hardware. Could it be that Intel couldn't get Larrabee Prime working as a video card? That's a big question that's going to hang over Intel right up to the day that they finally launch a Larrabee video card.

Ultimately, when we took our first look at Larrabee Prime's architecture, there were three things that we believed could go wrong: manufacturing/yield problems, performance problems, and driver problems. Based on what Intel has said, we can't write off any of those scenarios. Larrabee Prime is certainly suffering from something that can be classified as driver problems, and it may very well be suffering from both manufacturing and performance problems too.

To Intel's credit, even if Larrabee Prime will never see the light of day as a retail product, it has been turning in some impressive numbers at trade shows. At SC09 last month, Intel demonstrated Larrabee Prime running the SGEMM HPC benchmark at 1 TeraFLOP, a notable accomplishment as the actual performance of any GPU is usually a fraction of its theoretical performance. 1TF is close to the theoretical performance of NVIDIA's GT200 and AMD's RV770 chips, so Larrabee was no slouch. But then again its competition would not be GT200 and RV770, but Fermi and Cypress.

This brings us to the future of Larrabee. Larrabee Prime may be canceled, but the Larrabee project is not. As Intel puts it, Larrabee is a "complex multi-year project" and development will be continuing. Intel still wants a piece of the HPC/GPGPU pie (lest NVIDIA and AMD get it all to themselves), and given the collision between those markets they still want into the video card space. For Intel, their plans have just been delayed.

The Larrabee architecture lives on

For the immediate future, as we mentioned earlier, Larrabee Prime will still be used by Intel for R&D purposes as a software development platform. This is a very good use of the hardware (however troubled it may be), as it allows Intel to bootstrap the software side of Larrabee: developers can get started programming for real hardware while Intel works on the next iteration of the chip. Much like how NVIDIA and AMD sample their video cards to game developers months ahead of launch, we expect that Larrabee Prime development kits will be limited to Intel's closest software partners. If that's the case, don't expect to see much if anything leak about Larrabee Prime once chips start leaving Intel's hands, or to see extensive software development initially; widespread Larrabee software development still won't start until Intel ships the next iteration of Larrabee.

We should know more about the Larrabee situation next year, as Intel is already planning an announcement at some point in 2010. Our best guess is that Intel will announce the next Larrabee chip at that time, with a product release in 2011 or 2012. Much of this will depend on what the hardware problem was and what process node Intel wants to use. If Intel just needs the ability to pack more cores onto a Larrabee chip, then 2011 is a reasonable target; if there's a more fundamental issue, then 2012 is more likely. This lines up with the process nodes for those years: a 2011 launch would hit the second year of Intel's 32nm process, while a 2012 launch would make Larrabee one of the first products on the 22nm process.

For that matter, since the Larrabee project was not killed, it's a safe assumption that any future Larrabee chips will be based on the same architectural design. The vibe from Intel is that the problem is Larrabee Prime, not the Larrabee architecture itself. The idea of a many-core x86 GPU is still alive and well.

On-Chip GMA-based GPUs: Still On Schedule For 2010

Finally, there's the matter of Intel's competition. For AMD and NVIDIA, this is just about the best possible announcement they could hope for. On the video card front it means they won't be facing any new competitors through 2010 and most of 2011. That doesn't mean that Intel isn't going to be a challenge for them – Intel is still launching Clarkdale and Arrandale with on-chip GPUs next year – but at least they won't be facing Intel at the high end as well. For NVIDIA in particular, this means that Fermi has a clear shot at the HPC/GPGPU space without competition from Intel, which is exactly the kind of break NVIDIA needed since Fermi is running late.

I'm still more partial to the software-problem theory. The article may state: "The software side leaves us a bit more curious, as Intel normally has a strong track record here", but that's only true of the purely x86 ecosystem.

Given Intel's GPU track record, however, that statement is not true at all. Their integrated graphics have been underachieving for almost their entire existence – lots of promises made ("feature x in the next driver revision", ...) but little delivered.

Moreover, there's no mention of the new extra-wide vector instruction set from Larrabee in the new 48-core chip, nor would that really be especially useful in its intended "cloud-computing" target market.

I even wonder if Larrabee will have a successor, since it sounds like the initial product is a pretty major disappointment. It's easier for Intel to say the product is not totally dead; an announcement like this just keeps investors looking to the future. After all, Intel has a very long way to go in a segment of the market where they have totally sucked for a long while. Could Larrabee become another Itanic, where Intel dreams of taking over a segment with their own architecture, only to have it become a niche product?

If you mean Intel needs to build their own GPGPU, I think you're wrong (my opinion). There are already two very successful companies making better GPGPUs than Intel – NVIDIA, with ATI/AMD a distant second. But even AMD/ATI is doing better than Intel; otherwise Intel wouldn't have canceled Larrabee (my opinion). And in building their own GPGPU they've overlooked something very critical. The way Intel and AMD are handling the direction of CPUs, they're just adding more and more cores, perhaps wasting valuable transistor real estate that could be put to better use, like more cache memory.

They're shifting attention away from where they should be focusing: improving the overall system design of the computer. By improving I don't mean adding more cores to the CPU, but rethinking the CPU/GPGPU relationship and redesigning the whole computer system. They need, as they call it, another "paradigm shift".

And integrating a GPU into the CPU isn't really a good idea. You get unnecessary design complexity and end up with a lower-performing CPU/graphics design... in an engineering textbook this would be a no-no, unless you're going for a low-transistor/low-power design, which won't hold a candle to a high-end discrete CPU/GPU design.

How will Intel's CPUs compete with a TFLOPS of compute power?

Easy... they need to rethink the CPU and GPU. TFLOPS... tera (trillion) floating-point operations per second... where does it come from? Look underneath the hood and you find stream processors... and what do all stream processors have in common? They do floating-point math...

128-bit floating-point math... There used to be a time when the CPU had a discrete math coprocessor called the FPU (the 80-bit floating-point unit), but rather than create multi-core FPUs (because the concepts, technology, and design methodology needed to do so weren't available at the time), they integrated the FPU into the CPU (irony).

The reason for the reemergence of floating-point processing is that it is HEAVILY utilized in 3D graphics, and its use is universal in computing, especially simulations.

The only thing they don't do is integer operations, and this is the part the CPU design engineers need to look into... creating a discrete part with a massive array of SSE ("Stream"ing SIMD Extensions) units, with some limited integer operations... maybe.

The CPU might be stripped down of unnecessary components and instead integrate with the chipset to become a super fast systems I/O manager... managing I/O, thread manager/scheduler, system watchdog, etc., etc. They've already moved in that direction by incorporating the memory controller hub into the CPU.

There are some details I have left out since it's beyond the scope of this forum/venue.

But it is my belief that traditional multi-core CPUs are on a fast track to becoming obsolete.

GPUs are very limited in their flexibility, and so would Larrabee have been. Few programs align well with the "massive numbers of simple processing units" design, so general-purpose CPUs aren't going anywhere anytime soon.

Intel is like any other company: expand or die. I have no doubts that there will be a Larrabee product at some point in the future. Intel can't afford to ignore the massively parallel computing/HPC market.