Below is a rundown of what was announced, along with some analysis. Hit the links above for more details.

Coherent HyperTransport and the Big Picture

As I explained at the very beginning of this year in "Intel has a major '06-'07 weak spot, and AMD is aiming right at it," AMD plans to fight Conroe's onslaught in the server space by leveraging coherent HyperTransport (HT). AMD's strategy, which was already old news when it was re-announced with fanfare under a new codename (Torrenza) at yesterday's event, is to license coherent HT as widely as possible to third-party hardware makers. (For more on the differences between HT and coherent HT, see the post linked above.)

The idea here is to develop an ecosystem of HT-compatible coprocessors of different types that can be gluelessly dropped into an open socket in an Opteron system. Ideally, you'd be able to pick from an array of application-specific coprocessors that would enable you to tailor your system for a particular type of workload—e.g. a Java coprocessor for web serving, or a vector coprocessor for simulations. Such a tailored mix of Opteron processors and application-specific coprocessors would give properly customized Opteron servers the edge over more general-purpose Woodcrest-based servers on an application-by-application basis. So even though the Opteron would be weaker in single-threaded performance than Woodcrest, AMD would still (in theory) be able to claim a price/performance advantage by gluelessly and cheaply ganging together Opterons with different types of coprocessors.

While this coherent HT-based strategy may yet work in the server space, it doesn't do much for AMD in the consumer space, where systems are single-socket and coherent HT therefore offers no advantage. The only way that AMD could bring coherent HT into play in the consumer space would be to introduce a system with more than one socket.

AMD's 4x4 takes aim at Conroe

OK, I wasn't being 100 percent fair above when I said that, except for the codename, AMD's HT-based strategy was already old news when they re-announced it yesterday. The other new piece of information that came out of yesterday's event concerns AMD's plans to leverage coherent HT against Intel not just in the server space, as I've previously reported, but in the consumer space as well. The vehicle for making coherent HT a factor in the consumer market is a new, multi-socket, gaming/enthusiast-oriented system design called "4x4."

AMD's newly announced 4x4 platform is directly intended to counter Conroe in the performance segments of the high-end consumer market. The platform is basically two CPU sockets (plus unregistered DDR2 DRAM) linked with coherent HyperTransport (HT), such that you can drop in one or two dual-core Athlon 64 FX CPUs, with two of them making a quad-core system. AMD's presentation suggested pairing a quad-core setup with a quad-GPU SLI graphics solution for ultimate gaming satisfaction.

AMD also has high hopes that there will eventually be things like physics coprocessors and GPUs available for use in that second socket, but more on that below.

So is the gaming world ready for quad-core? In a word, no. My prediction is that for most games in the next two years, you'll be better off with a dual-core machine where each individual core has superior single-threaded performance than with a quad-core machine where the single-threaded performance is lacking. I'll even go ahead and predict that a production dual-core Conroe system will still beat a production quad-core AMD system when the two are benchmarked against each other on popular games. (Note the words "production" and "popular." This means systems and software that are actually available to ordinary gamers in the retail channel, and not some combination of unreleased/beta/demo hardware and/or software.)

I will concede that in my prediction above, I may not be giving enough credit to the influence of the console market. With all of the new consoles pushing multicore and multithreading in a big way, it's possible that we'll see a non-trivial subset of games that really shine on a quad-core Athlon FX system. Still, I think the forthcoming K8L has a bigger chance of bringing AMD back to parity with Conroe than any amount of quad-core overkill.
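
To put rough numbers on the trade-off in that prediction, here's a back-of-the-envelope sketch using Amdahl's law. Everything in it is an assumption for illustration: the 1.25x per-core speed advantage for the dual-core chip, the parallel fractions, and the function names are mine, not measurements of Conroe or any AMD part.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup over a single core when only parallel_fraction of the work scales."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def relative_throughput(per_core_speed, parallel_fraction, cores):
    """Throughput relative to one baseline core (per_core_speed = 1.0)."""
    return per_core_speed * amdahl_speedup(parallel_fraction, cores)


def break_even_parallel_fraction(fast_speed, fast_cores, slow_speed, slow_cores):
    """Parallel fraction at which the many-core, slower-core machine catches up.

    Found by setting the two relative_throughput expressions equal and
    solving for the parallel fraction.
    """
    numerator = fast_speed - slow_speed
    denominator = (fast_speed * (1.0 - 1.0 / slow_cores)
                   - slow_speed * (1.0 - 1.0 / fast_cores))
    return numerator / denominator


# Hypothetical inputs: the dual-core chip is assumed 25% faster per core.
DUAL_SPEED, DUAL_CORES = 1.25, 2
QUAD_SPEED, QUAD_CORES = 1.00, 4

for p in (0.3, 0.6, 0.9):  # made-up fractions of a game's work that run in parallel
    dual = relative_throughput(DUAL_SPEED, p, DUAL_CORES)
    quad = relative_throughput(QUAD_SPEED, p, QUAD_CORES)
    print(f"parallel={p:.1f}  dual-core={dual:.2f}  quad-core={quad:.2f}")

print("break-even parallel fraction:",
      round(break_even_parallel_fraction(DUAL_SPEED, DUAL_CORES,
                                         QUAD_SPEED, QUAD_CORES), 3))
```

With these made-up inputs, the quad-core machine only pulls ahead once a bit more than half of a game's work runs in parallel. Whether PC games get there in the next two years is exactly the open question.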

Coherent HT licensing and coprocessors

One of the more exciting, pie-in-the-sky aspects of yesterday's meeting was AMD's talk of licensing coherent HT to GPU, physics, and media processor makers for eventual use in 4x4 systems. The idea here is that you'd drop, say, a GPU into that second CPU socket, in place of another Athlon FX. I was more optimistic about this coprocessor strategy in the server space, where the prices are higher and the volumes are lower, but I'm not so sold on its viability in the consumer market.

From a technical perspective, a multisocket coherent HT system that includes a dual-core processor and a tightly coupled GPU or physics coprocessor is a fantastic idea. If ATI or NVIDIA were to take AMD up on the licensing offer, such a system could make for a high-performance, expandable, and relatively low-cost God Box. Think about it: you don't have the extra production cost associated with a graphics daughtercard, you can cheaply expand the amount of DDR2 that's attached to the GPU socket, you get the benefits of a shared pool of system and graphics RAM, you have a high-bandwidth link directly between the CPU and GPU with no intervening bridge chip, and so on. It has all the makings of a killer gaming rig that, while expensive, might still be cheaper than a comparable Conroe system.

If you expand this fantasy further to include a theoretical four-socket system, the possible combinations get even more fun, e.g. two dual-core Athlons + two GPUs; or, a dual-core Athlon + two GPUs + a physics coprocessor. That's a lot of hardware to hook up to a big pool of DDR2, and a game developer could go nuts with it.

While the scenario described above majorly turns my geek knobs, I'm not so sure the market will approve. The money for mass-market media and physics chips is wherever the volume is, and right now the volume is in PCIe. AMD has to convince potential coprocessor makers who're already safely making money selling PCIe daughtercards that if they build an HT-based chip, the market will come. Getting enough market traction to sell chipmakers on such specialized hardware is going to be a tough, uphill battle in the face of Conroe + PCIe.

Will K8L save the day?

AMD's best weapon against Conroe in the consumer space will probably be their forthcoming major core revision, the 65nm K8L. K8L is a pretty major evolutionary step in the K8 lineage, sort of like the move from Yonah to Conroe. David Kanter of RWT posted some K8L details a while back in his site's forum, and he now has fresh info from the analyst meeting available. Here are some highlights of the new design, which is due out in the first half of '07.

Native quad-core

On-die L3 cache @ 2MB initially

Each core will have a private L2

All SSE units will have 128-bit datapaths (like Conroe), which will make for single-cycle SSE execution

FPUs will also have 128-bit datapaths

Faster HyperTransport links

Instruction fetch is doubled from 16 bytes/cycle to 32 bytes/cycle
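
The datapath and fetch items in the list above boil down to simple arithmetic, sketched here. It assumes the current K8 pushes a 128-bit SSE operation through a 64-bit datapath in two halves, and it uses a hypothetical average instruction length of 5 bytes. Neither figure comes from AMD's presentation, and a "pass" in this toy model is not the same thing as real instruction latency.

```python
import math


def passes_per_op(op_bits, datapath_bits):
    """Trips through the datapath needed to process one operation of op_bits."""
    return math.ceil(op_bits / datapath_bits)


def whole_instructions_per_fetch(fetch_bytes, avg_instr_bytes):
    """Whole instructions that fit in one fetch window of fetch_bytes."""
    return fetch_bytes // avg_instr_bytes


SSE_OP_BITS = 128
AVG_INSTR_BYTES = 5  # hypothetical average x86 instruction length

# Assumed 64-bit datapath today vs. the announced 128-bit datapath.
print("passes per 128-bit SSE op, 64-bit datapath: ",
      passes_per_op(SSE_OP_BITS, 64))
print("passes per 128-bit SSE op, 128-bit datapath:",
      passes_per_op(SSE_OP_BITS, 128))

# Fetch window sizes from the list above: 16 bytes/cycle vs. 32 bytes/cycle.
print("instructions per 16-byte fetch:",
      whole_instructions_per_fetch(16, AVG_INSTR_BYTES))
print("instructions per 32-byte fetch:",
      whole_instructions_per_fetch(32, AVG_INSTR_BYTES))
```

The two changes go together in this picture: a core that can finish a full-width SSE operation in one pass needs a wider fetch window to keep it fed, since SSE instructions tend to be on the long side.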

I really have no idea whether K8L will have enough mojo to take on Conroe in single-threaded performance. The K8L vs. Conroe contest of 2007 will hinge on how fast the two chips are running, possible DDR3 support, and all kinds of other factors that have little to do with microarchitecture.