Hot Chips 2012

Ah, the end of August. School is about to start. American college football is about to get underway. Hot Chips is now in full swing. I guess the end of August caters to all sorts of people. For the people who are most interested in Hot Chips, the amount of information on next generation CPU architectures is something to really look forward to. AMD is taking this opportunity to give us a few tantalizing bits of information about their next generation Steamroller core which will be introduced with the codenamed “Kaveri” APU due out in 2013.

AMD is seemingly on the brink of releasing the latest architectural update with Vishera. This is a Piledriver+ based CPU that will find its way into AM3+ sockets. On the server side it is expected that the Abu Dhabi processors will also be released in a late September timeframe. Trinity was the first example of a Piledriver based product, and it showed markedly improved thermals as compared to previous Bulldozer based products, and featured a nice little bump in IPC in both single and multi-threaded applications. Vishera and Abu Dhabi look to be Piledriver+, which essentially means that there are a few more tweaks in the design that *should* allow it to go faster per clock than Trinity. There have been a few performance leaks so far, but nothing that has been concrete (or has shown final production-ready silicon).

Until Vishera and its ilk are released, AMD is teasing us with some Steamroller information. The presentation is being given at Hot Chips today (August 28). It is a very general overview of improvements, with very few details about how AMD is achieving increased performance with this next gen architecture. So with that, I will dive into what information we have.

Less Risk, Faster Product Development and Introduction

There have been quite a few articles lately about the upcoming Bulldozer refresh from AMD, but a lot of the information that they have posted is not new. I have put together a few things that seem to have escaped a lot of these articles, and shine a light on what I consider the most important aspects of these upcoming releases. The positive thing that most of these articles have achieved is increasing interest in AMD’s upcoming products, and what they might do for that company and the industry in general.

The original FX-8150 hopefully will only be a slightly embarrassing memory for AMD come Q3/Q4 of this year.

The current Bulldozer architecture that powers the AMD FX series of processors is not exactly an optimal solution. It works, and seems to do fine, but it does not surpass the performance of the previous generation Phenom II X6 series of chips in any meaningful way. Let us not mention how it compares to Intel’s Sandy Bridge and Ivy Bridge products. It is not that the design is inherently flawed or bad, but rather that it was a unique avenue of thought that was not completely optimized. The train of thought is that AMD seems to have given up on the high single threaded performance that Intel has excelled at for some time. Instead they are going for good single threaded performance, and outstanding multi-threaded performance. To achieve this they had to rethink how to make the processor as wide as possible, keep the die size and TDP down to reasonable levels, and still achieve a decent amount of performance in single threaded applications.

Bulldozer was meant to address this idea, and its success is debatable. The processor works, it shows up as an eight logical core processor, and it seems to scale well with multi-threading. The problem, as stated before, is that it does not perform like a next generation part. In fact, it is often compared to Intel’s Prescott, which was a larger chip on a smaller process than the previous Northwood processor, but did not outperform the earlier part in any meaningful way (except in heat production). The difference between Intel and AMD in this respect is that Bulldozer is an entirely new architecture, whereas Prescott continued the Northwood lineage. AMD has radically changed the way it designs processors. Taking some lessons from the graphics arm of the company and their successful Radeon brand, AMD is applying that train of thought to processors.

Get Out the Microscope

AMD announced their Q1 2012 earnings last week, which turned out better than the previous numbers suggested. The bad news is that they posted a net loss of $590 million. That does sound pretty bad considering that their gross revenue was $1.59 billion, but there is more to the story than meets the eye. Of course, there are thoughts of “those spendthrift executives are burying AMD again”, but this is not the case. The loss lies squarely with the GLOBALFOUNDRIES equity and wafer agreements, which have been totally retooled.

To get a good idea of where AMD stands in Q1, and for the rest of this year, we need to see how all these numbers actually get sorted out. Gross revenue is down 6% from the quarter before, which is expected due to seasonal pressures. This is right in line with Intel’s seasonal downturn, and in some ways AMD was affected slightly less than their larger competitor. Revenue is down around 2% from the same quarter last year, and part of that can be attributed to the hard drive shortage, which also affected the previous quarter.
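For readers who want to put those percentages into dollar terms, here is a rough back-of-the-envelope sketch. It uses only the figures quoted above ($1.59 billion revenue, $590 million net loss, declines of 6% and roughly 2%). Because the percentages are rounded, the implied earlier revenues are approximations and not AMD's reported numbers. The function and variable names are purely illustrative.

```python
# Figures quoted in the article (USD billions, declines as fractions).
q1_2012_revenue = 1.59
net_loss = 0.59
qoq_decline = 0.06  # "down 6% from the quarter before"
yoy_decline = 0.02  # "down around 2%" year over year (rounded)


def implied_prior_revenue(current, decline):
    """Back out the earlier period's revenue from a fractional decline."""
    return current / (1.0 - decline)


# Approximate revenue for the comparison periods, implied by the percentages.
prior_quarter = implied_prior_revenue(q1_2012_revenue, qoq_decline)
year_ago_quarter = implied_prior_revenue(q1_2012_revenue, yoy_decline)

# Size of the net loss relative to the quarter's revenue.
loss_share = net_loss / q1_2012_revenue

print(f"Implied previous quarter revenue: ~${prior_quarter:.2f}B")
print(f"Implied year-ago quarter revenue: ~${year_ago_quarter:.2f}B")
print(f"Net loss as a share of revenue:   {loss_share:.1%}")
```

The last figure shows why the headline loss looks so alarming on its face: it is more than a third of the quarter's revenue.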

The biggest news of the quarter was that AMD is no longer constrained by 32 nm availability. GLOBALFOUNDRIES was able to produce as many 32 nm parts for AMD as needed with yields continuously improving over the past two quarters. AMD seems very comfortable about where they are at in terms of yields and availability for both Bulldozer and Llano based product lines. AMD has in fact been ramping production of the upcoming Trinity based processor and has been shipping finished products to customers since mid Q1. They have also started shipping Brazos 2.0 parts to customers, and both Trinity and Brazos will be launched in mid Q2 of this year.

The CPU/APU World According to AMD

The mobile area has been one of tremendous growth for AMD and Q1 saw 100% of all mobile shipments be APU products (both Llano and Brazos 1.0). AMD is very bullish about Trinity. They say that it offers around 50% more performance at the same TDP as the earlier Llano based processors. This 50% is a combination of both CPU and GPU performance, so do not expect massive jumps in CPU performance alone from current Llano based products at those TDPs. The big jump does appear to be in graphics, and AMD is certainly more than willing to hang their hat on that portion. With the latest Ivy Bridge IGPs still not able to match last year’s Llano, AMD feels that Trinity will truly leave Intel behind in terms of overall graphics performance. Trinity features a totally redesigned graphics portion which combines the VLIW4 architecture of the HD 6900 series with aspects of the new 7000 series of products.

More MHz for the Masses

AMD has had a rough time of it lately when it comes to CPUs. Early last year when we saw the performance of the low power Bobcat architecture, we thought 2011 would be a breakout year for AMD. Bulldozer was on the horizon and it promised performance a step above what Intel could offer. This harkened back to the heady days of the original Athlon and Athlon 64 where AMD held a performance advantage over all of Intel’s parts. On the graphics side AMD had just released the 6000 series of chips, all of which came close in performance to NVIDIA’s Fermi architecture, but had a decided advantage in terms of die size and power consumption. Then the doubts started to roll in around the April timeframe. Whispers hinted that Bulldozer was delayed, and that it was also failing to meet performance expectations.

The introduction of the first Llano products did not help things. The “improved” CPU performance was less than expected, even though the GPU portion was class leading. The manufacturing issues we saw with Llano did not bode well for AMD or the upcoming Bulldozer products. GLOBALFOUNDRIES was simply not able to achieve good yields on these new 32 nm products. Then of course the hammer struck. Bulldozer was released, well behind schedule, and with performance that barely rose above that of the previous Phenom II series of chips. The top end FX-8150 was competitive with the previous Phenom II X6 1100T, but it paled in comparison to the Intel i7 2600 which was right around the same price range.

AMD Gives a Glimpse of the Near Future

AMD has released an updated roadmap for the next two years, and the information contained within is quite revealing of where AMD is going and how they are shifting their lineup to be less dependent on a single manufacturer. The Financial Analyst Day has brought a few surprises about where AMD is headed, and how they will get there. Rory Read and Mark Papermaster have brought a new level of energy to the company that seemingly has been either absent or muted. Sometimes a new set of eyes on a problem, or in this case the attitudes and culture of a company, can bring about significant changes for the positive. What we have seen so far from Rory and company is a new energy and direction for AMD. While AMD is still sticking to their roots, they are looking to further expand upon their expertise in some areas, all the while being flexible enough to license products from other companies in areas that are far enough from AMD's core competence that it pays to license rather than force engineers to re-invent the wheel.

This first slide is a snapshot of the current and upcoming APU lineup. Southern Islands is the codename for the recently released HD 7000 series of desktop parts. This will cover products from the 7700 level on up to the top end 7990. Of great interest are the Brazos 2.0 and Hondo chips. AMD had cancelled the "Krishna" series of chips, which would have featured up to four Bobcat cores on 28 nm. Details are still pending, but it seems Brazos 2.0 will still be a 40 nm part, though much more refined so that it can be clocked higher and still pull less power. Hondo looks to be the basic Brazos core, but tuned for Ultra Low Power (lower clocks, possibly disabled units, etc.), which would presumably scale to 5 watts and possibly lower.