AMD launched its much anticipated Bulldozer architecture for the consumer market last month, but many were disappointed at the performance numbers. Now the company has officially launched new processors using the same architecture for the server and workstation markets, but things have changed significantly.

The key difference is in the software used to process instructions. The consumer side is reliant upon Windows 7 and earlier operating systems, which are unaware of the shared nature of the Bulldozer architecture. Resource sharing is inefficient at best, and the full possibilities of higher Turbo Core frequencies are missed.

AMD has worked to ensure optimization and/or support on many commonly used server operating systems. Linux 2.6.37, Windows Server 2008 R2 SP1, Xen 4.1, Ubuntu 11.04, and VMware vSphere 5.0 already have OS/hypervisor support for Bulldozer, while others such as Red Hat Enterprise Linux 6.2 and Windows 8 Server are currently in development.

AMD is specifically targeting the High Performance Computing (HPC) segment, with over 500,000 Bulldozer cores already shipped to this market since September. The AVX, FMA4, and XOP instructions require software to be recompiled in order to take advantage of their performance enhancements. AMD mentioned Java 7 as one piece of software that is being updated to use them.
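Before recompiling for the new instructions, it helps to confirm that the target machine actually advertises them. The following is a minimal sketch, assuming a Linux host where /proc/cpuinfo lists the lowercase flag names avx, fma4, and xop; the function name simd_flags is hypothetical and is not something AMD supplies.

```python
def simd_flags(cpuinfo_text, wanted=("avx", "fma4", "xop")):
    """Report which of the wanted CPU flags appear in /proc/cpuinfo text.

    Assumption: the text follows the Linux layout, with a line such as
    "flags : fpu sse2 avx ...". Only the first "flags" line is inspected.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            present = set(value.split())
            return {flag: flag in present for flag in wanted}
    # No flags line found (non-Linux host or empty input).
    return {flag: False for flag in wanted}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        text = ""  # file not available on this platform
    for flag, supported in simd_flags(text).items():
        print(f"{flag}: {'yes' if supported else 'no'}")
```

A flag being present only means the hardware and kernel expose the instructions; the application still has to be built with a compiler that emits them.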

The Opteron 6200 series was formerly codenamed Interlagos. It is scalable to 4 sockets supporting 16 Bulldozer cores each. The fastest model is the 6282 SE at 2.6GHz, with a maximum Turbo Core frequency of 3.3GHz and a TDP of 140W. The Opteron 4200 series was formerly codenamed Valencia. It is the most similar to the FX series (Zambezi) launched in October, but it will support up to 2 sockets with 8 cores each.

Both series support DDR3-1600 memory natively, but there will be official support for DDR3-1866 through specific OEMs. Opteron 6200 CPUs have quad memory channels, while the Opteron 4200 chips have dual channels. 1.35V low voltage memory and 1.25V ultra-low voltage memory are also supported, as are Load Reduced DIMMs (LRDIMMs).

The L1 cache is arranged as 16KB data per core and 64KB instruction per module, while the L2 cache is 1MB per core. Opteron 6200s have a shared 16MB of L3 cache per socket, while Opteron 4200s only have a shared 8MB per socket.
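To put the per-core and per-module figures in per-socket terms, the arithmetic can be sketched as below. This is only an illustration: the function name is hypothetical, and it assumes two cores per Bulldozer module, which the figures above imply but do not state outright.

```python
CORES_PER_MODULE = 2  # assumption: each Bulldozer module pairs two cores


def cache_per_socket_kb(cores, l3_mb):
    """Total cache per socket in KB, using the per-core/per-module sizes quoted above."""
    modules = cores // CORES_PER_MODULE
    return {
        "l1_data_kb": cores * 16,      # 16KB of L1 data cache per core
        "l1_instr_kb": modules * 64,   # 64KB of L1 instruction cache per module
        "l2_kb": cores * 1024,         # 1MB of L2 per core
        "l3_kb": l3_mb * 1024,         # shared L3 per socket
    }


if __name__ == "__main__":
    for name, cores, l3_mb in (("Opteron 6200", 16, 16), ("Opteron 4200", 8, 8)):
        print(name, cache_per_socket_kb(cores, l3_mb))
```

Under these assumptions a 16-core Opteron 6200 socket carries 16MB of L2 alongside its 16MB of L3, and an 8-core Opteron 4200 socket carries 8MB of each.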

In order to speed time to market and lower validation costs, AMD has designed its new Opterons to function on its previous platforms using the G34 and C32 sockets. The company believes that its lower total platform costs over Intel’s Xeon platforms impart a significant advantage. For example, the AMD Opteron 6276 will ship at the same price as the Xeon E5640, but will outperform it by 89%.

Cloud computing requires high throughput, scalability, density, and power efficiency. AMD thinks that it can gain significant market share by claiming the lowest x86 watts/core in the industry at 5.3W for Interlagos and 4.375W for Valencia. The new C6 power state reduces power consumption at idle by up to 46% over the previous generation by enabling core power gating. When a core is halted, its context is exported to system memory and voltage is removed from the core.

Intel will be launching new server and workstation products based on the Sandy Bridge architecture next year, but AMD also has plans for the future with its Piledriver architecture. Sepang will use the C2012 socket and replace the Opteron 4200 series, while Terramar will use the G2012 socket and replace the Opteron 6200 series. Both new platforms will support PCIe 3.0.
