83 Comments

I just want to say how refreshing it is to read an article written by Anand. He is a master of the English language; he perfectly communicates and explains every technical detail and I come away with a better understanding of whatever he's talking about.

Remember AMD's old president and CEO Jerry Sanders with comments
like "We will see what we see" and "More bang for your buck" I
cannot wait to see duel socket motherboards with two four core
Barcelona's working their magic. reminds me of Carol shelby
when he brought the Cobra out for road test. exciting is not
the word, jaw droping performance? Don't take Richard's Statements
lightly Reply

Does anyone have any idea how compatible the "Barcelona" CPU will be with current motherboards? When it comes out, does it need a new n-phase voltage regulator, for example?

the reason I'm asking is, I want to upgrade and with the current state of affairs was going to go for a C2D CPU. But with these Barcelona CPU's due out I may stick with AMD - get a AM2 motherboard and cheap AM2 CPU and upgrade to the Barcelona CPU at a later date. But I have to be sure that whatever motherboard I buy now will be 100% Barcelona compatible.

Can anyone inform us about what the situation is in this regard? Reply

I asked above and non-AnandTech folks like you and I said it would...but no one from AnandTech themselves jumped right in to give an affirmative.

I asked for links from AMD's own website confirming that Agena and Kuma would work in current AM2 motherboards, and no one posted back.

Right now the AM2+ CPU's will work in current AM2 boards rumor is just that, a rumor...when AMD themselves confirm it, or a site such as AnandTech confirms it with AMD and reports on it, then I'll believe it.

Until then, it's <i>probable</i> that AM2+ will work in current AM2 motherboards...if you're willing to take the risk I say go for it, else, wait until we have an official answer one way or the other.

"Intel regained the undisputed performance crown it hadn't seen ever since the debut of AMD's Athlon 64."
Intel in fact lost the "undisputed performance king" title during the early lifetime of the K7 architecture. The Pentium !!! was faster at some tasks and slower at others (games) than the K7. Before that, the Pentium II was better than the K6-2 (the K6-3 had better IPC than Pentium3, but was slower in MHz) Reply

Wow! A lot of dicussion in here.
And, by the way, very interesting article.

I'm a software engineer from Brazil and I'm planning to change my PC this year.
I've bem using AMD processors since the K6.
Today I've a XP Mobile 2500+(@2.2ghz), 1gb ram, 200gb and an AGP 6600GT
My PC is not very slow, but I'm thinking in going dual core to speed things up(office applications, web development and some games).
I can run some of the newest games, but not in high graphics.
I expect that my PC can run C&C 3 (Already run the demo in 1024 medium, but have some craches although it's not running it slow)

So, today I'm thinking in 3 options:
1) Stay with this computer and wait until AMD launchs it's new architecture (I pretend to go with an average price Kuma)

2) Go with Intel Core 2 Duo (e6300 or e6400). They're not expensive and for games I can easily make an overclock and gain more performance.

3) Buy a good AM2 board and a cheap Atlhon X2 (3600) and wait new AMD processors and then change only the processor.

Here in Brazil the taxes are to high, so I'm planning in buying a PC with these specs:

Athlon 64 AM2's arnt exactly slow so if you're an AMD fan just get one..like a 3800+ or 3600+ and overclock it. It will be at least 4x faster than what you have now and accept K8L Agena core later. It will be cheaper than C2D by about $50 USD and You'll also pay cheap for a GeForce 6100 Motherboard which is only $50 USD. Overall expect the the AM2 system to be about $100 USD cheaper.

Keep in mind that C2D is 20% faster clock for clock in most apps so it's not exactly a quantum leap here getting a C2D.. Gap gets a lot larger when overclocking since C2D's overclcok higher like 3.2Ghz is common on air vs. only 2.8Ghz for AM2, so, at the end of the day a C2D setup is able to be about 40% faster over most benchmarks. That is getting significant and why enthusiasts are buying C2D's. Reply

Usually we criticize writing style based on a whole experience... obviously Anand is one of the best technical review writers on the Internet; if you bother to read his articles more fully perhaps you'd realize that. The colloquial writing sometimes brings it to a more personal level that a reader can better relate to and understand -- it works especially well in this case, where it's a future design, we really don't know how it's going to perform. That he can guess and say "pretty significantly" tells me he understands the uncertainty of the situation, and the language communicates that fact perfectly well. It would be more confusing if he said it would impact performance "significantly" as you want him to, as that would imply that he was more certain than he might actually have been.

Masters are allowed to bend the rules, and Anand is one, so lay off. Reply

Any rough guess as to how Barcelona will compete with Core2 in gaming? Many articles have shown how Core2 gets you a slight FPS boost in games that aren't graphics card limited. I'm curious how Barcelona will fit in with the overall picture of DX10 cards like G80 and R600. Reply

Games have quite a lot of LOAD instructions, like most programs, as well as plenty of branches (esp. in the AI routines). Most likely the boost that Core 2 gets is due in a large part to the better instruction reordering and branch prediction, although the cache and prefetchers probably help as well. Given AMD was better than NetBurst due to memory latency, through in better OOE (Out of Order Execution) logic and keep the improved latency and they should do pretty well.

Naturally, everything at this point is purely speculation, but in the next few months we should start to get a better idea of what's in store and how it will perform. One problem that still remains is that even if AMD can be competitive clock-for-clock, Intel looks primed to be able to go up to at least 3.6 GHz dual core and 3.46 GHz quad core if necessary. AMD has traditionally not reached clock speeds nearly as high as Intel, possibly due in part to having more metal layers (speculation again - process tech and other features naturally play a role), so if they release 2.9GHz Barcelona at $1000 you can pretty much guarantee Intel will launch 3.2 and/or 3.46 GHz Kentsfield (and/or FSB1333 3.33 GHz).

On the bright side, at least things should stay interesting in the CPU world. :D Reply

Yesterday, Intel announced that they were converting a fourth fab to 45nm. A great deal of confidence in that process. And a few days ago they announced desktop shipments of Penryn-based CPUs pulled forward into 2007. Looks as if AMDs 'window of opportunity' is likely to be very small. IBM has not yet announced a successful implementation of a RAM on their 45nm process. Intel had their RAM design on 45nm up and running late 2005. Reply

Then it should be no problem for AMD to confirm through AnandTech that this is the case.

Surely if Barcelona is this close to shipping (only a few months away), AMD must know if Agena and/or Kuma will work in current AM2 motherboards, especially their own 690 series their just about to release.

All I'm asking for is a definite either way, it shouldn't be that hard for AMD to do at this point.

Can you post the link that originates at AMD's own website then that says specifically that AM2+ CPU's are guaranteed to work - understandably maybe not supporting every new feature - in current AM2 boards?

Not a news post from DailyTech, The Inquirer, Toms, whatever...one that's on AMD's site itself.

And No, AMD could make AM2+ completely incompatible with current AM2 boards and they probably wouldn't see much drop if at all from the large OEM's. The large OEM's would just ensure that when the AM2+ CPU's came in, AM2+ motherboards would likewise come in.

Believe me, I want to see the link...because I'm desperately awaiting 690G or MCP68, whichever comes first (which is probably MCP68 at the pace AMD is moving on 690G).

AMD doesn't do knee-jerk reactions like Intel because AMD has superior products. AMD continues to take market share from Intel in every segment and Barcelona will continue that trend. Barcelona looks to be every bit as superior to Intel's hacked/patched/glued together chips as Opteron was when introduced. Intel's chips depend on huge cache size for their performance and that crutch won't work after the intro of Barcelona.

For those without a clue, AMD didn't start design of Barcelona last week or last year. It's been in the development pipeline for many years and thr performance will demonstrate exactly why AMD's long term platform stability is the right choice for most enterprise buyers. Intel is gonna feel the pain again. Reply

AMD, like Intel, start numerious projects. Just not all of them get to this finish line. Actually a lot of them don't even reach the end of the planning phase before being scratched.

As for Intel and their large caches...well I'd say it's amazing how half their die (if not more) is used for cache and still had enough space for all the core logic that's kicking the crap out of the K8 now.

Looks like some good improvements coming down the pipe. The cache size issue makes me nervous, though - 512kb per core is starting to look a little antiquated, and there's no information about the bandwidth to the L3 cache (which, presumably, is slower than L2). Reply

In the past, AMD did not need the large cache sizes that Intel did for their processors. This was very obvious in regards to the Netburst architecture. However, while Core2 is much better than Netburst there are still disadvantages for Intel.

I'll explain a little background as far as I understand it. In the K7 and Netburst days, Intel had to have the cache to make up for their long pipeline. Branch mispredictions are going to happen and the penalty on the long pipeline of the Netburst processors hurt their IPC badly. The shorter pipeline on the K7 did not have the same performance penalty due to the shorter pipeline. With K8, the on die memory controller also negated the need for large L2 caches due to the reduced latency when accessing main memory. This has been one of the major performance aspects for the K8 architecture.

The Core2 architecture obviously does not have the on die memory controller so the need for larger caches is still present and Intel sees improvement due to the larger caches. Barcelona still has the on die memory controller and the previous efficiency is still there and still negates the need for large caches. This is just the difference between architectures. While having a larger cache on the K8 did improve performance some in some usage scenarios, it wasn't on the same scale as the improvements Intel received with a larger cache.

AMD can't compete with Intel in regards to cache size. However, other architecture differences make up for the lack of large amounts of cache. Barcelona having a smaller cache does not seem to be a big problem. If it was a big problem, AMD probably would have gone with a larger cache to get the extra performance. Bigger does not always mean better or at least enough better to warrant the extra.

Smaller cache will mean fewer transistors which should mean better yields, lower power consumption and cheaper to produce. Reply

I agree with Anands closing article that AMD now needs it's own "snowball effect" for the next couple of years. 4-5 years with a sitting target against a giant like Intel prooved to be costly in terms of competivness.

We all saw it coming when Intel developed the first Pentium M. It looks like AMD got the message as well and started the Barcelona project. Maybe AMD learned their lesson. Reply

So bascially all intel 's C2D improvement are made into Barcelona. And apart from Virtualization improvement there are nothing new from AMD that Intel doesn't have?

On performance note Barcelona doesn't seem to offer better clock scaling. I.e even if it is 30% faster then its current K8 it will only have slight advantage against C2D clock per clock. Not to mention it is up against Penryn. Although Penryn is nothing much then a few minor tweaks and more cache. It does allow intel to scale higher in clock speed.

And given AMD slow roll out rate, and AMD limited production capacity Barcelona never seem like much of a threat.

The article does not mention anything about FP improvement. Are AMD keeping them secret for now or is that all we are going to see? Reply

The FP improvement is the SSE improvement, and according to the theory it's more powerful than what core2 duo is offering.

There are improvements mentioned that are not in core2 (+ other way around, like instruction fusing), and improvements that are inspired on the same principle but implemented differently. The architectures themselves differ widely (see earlier article that compares K8 with Core2 - reservation station etc.) so different implementations of principally the same optimizations on a different architecture will have vastly different effects. Even after these improvements, the capabilities (how much can you decode, etc) of each read nothing alike. And if it were all the same, AMD has the platform advantage, so it would still end up faster by virtue of nothing else but that. Some guesstimates made by varying sites would put Barcelona ahead in FP code and at the same level or slightly behind in INT code. But those are just guesstimates.

What I'm trying to say here is that barcelona is still very different from core2, and that we just don't know yet in which direction the pendulum will swing ;) Reply

No....precisely in theory is where Barcelona lacks. Core 2 Duo could in theory do 6 64bit or 3 128bit SSE instructions per cycle. Barcelona can do 4 64bit or 2 128bit. AMD provided this information aswell. Reply

Hmmm, in the earlier article, there was explicit emphasis on the fact that 2 of the 3 units are symmetric in core2, but I'm not too sure what it means. It does imply however that those 3 units of core2 can only be used fully in certain combinations, and are not 3 independent units. On 128-bit performance, what was said is this: "so the Core architecture has essentially at least 2 times the processing power here [compared to K8]". Not 3 times, but "at least" 2 times, so again the 3 times will probably only be in certain situations.

The next paragraph said this:
"With 64-bit FP, Core can do 4 Double Precision FP calculations per cycle, while the *Athlon64* can do 3."
So K8 was not at such a big disadvantage when it came to 64-bit SSE, if Barcelona doubles everything SSE, it should come ahead in this area.

So to me it looks like for 128-bit, core2 will be faster in some situations, on par in others, and for 64-bit, Barcelona would be ahead.

If this is wrong, I do not know where some of the articles I read over time came from, implying Barcelona would be better overall in SSE. Reply

And 64 or 128bit doesnt matter. I dont know how you think that way.
Barcelona got 2 SSE ports. They are able to do 2 128bit or 4 64bit. Most 128bit actually contains 2 64bit or 4 32bit.
Core 2 got 3 SSE ports. They are able to do 3 128bit or 6 64bit. Reply

One apparently overlooked detail of Barcelona's architecture is its instruction fetch ability: Barcelona is able to send 32 bytes (128 bits) to its decoders per cycle, where Core can send only 16 bytes to be decoded, increasing the likelihood of 'split fetch' cases in the latter. This means that, even if Core does have more raw FP power in terms of its execution units, Barcelona can expect greater utilisation of its FPUs/SSE, and the impact of this will be even more pronounced when running 64 bit code, due to the increased size of 64 bit instruction blocks. If Barcelona does, as expected, outperform Core in IPC in 32 bit mode, the performance gap may well increase in 64 bit mode. Reply

Did you miss page 3? The SSE128 stuff largely deals with FP and cache improvements. Standard FP is still used, but most programs are optimizing for SSE2/3 as that can run circles around x87 FP performance. Reply

Is there no information on the bandwidth between the new caches? Or are they left the same? I'm only asking because last I read, Intel had a huge advantage in that department, with double or so the bandwidth between the caches. Isn't that important in FP-code, especially if you have to feed 4 cores (so the bw at the level 3 cache..) Reply

Page 3: the cache bandwidth as I understand it should be doubled (128-bit vs. 64-bit), and several other areas have wider data paths as well. I think Intel has a 256-bit cache bus, so they still have more cache bandwidth, but as a whole it's difficult to say which will end up faster right now. The integrated memory controller has a lot of influence on a lot of areas, after all. Reply

K7 to K8 transition did the doubling of the 64bit interface to the 128bit one.. Core indeed has a 256bit interface (as far as I remember, even the P3 had a 256bit interface to L2). So according to page 3 the interface would be doubled again this time around?

I'm only asking because I remember this quote from Johan De Gelas' article a while back.
"The Core architecture's L1 cache delivers about twice as much bandwidth (Measured by ScienceMark), while it's L2-cache is about 2.5 times faster than the Athlon 64/Opteron one."
And that must have *some* impact on performance. I think the bandwidth of the L3 cache will also be key, but haven't seen any official information about it. Reply

K8 had a 64-bit read and a 64-bit write path to its L2 cache, giving a total of 128 bits. Barcelona has a 128-bit read and 128-bit write path to its L2, giving a total of 256 bits - the same as Core.
One thing that surprised me on the subject of cache was the associativity of the L1, which I had expected to see increased to 4-way. This would have allowed AMD to extend its lead in L1 hitrate and regain the ground lost in this area since the introduction of Core. Maybe we'll see an improvement to L1 associativity in future iterations of Barcelona. Reply

Argh this article is such a cock tease. I read most of it but now I want some prelim benchies or some kind of numbers. Guess we'll have to wait till Mid-2007?

I can't stand the anticipation, my girlfriend pulls this same shit every now and then, she'll get me going then quit and laugh....I always tell her I'll pull the same thing on her and see how she likes it but I can never gather up enough will power :) Reply

Hello Anand, great article as always. I suppose your much at home nowadays building your house etc. But when are we going to read more of your blogs or the relaunch of anandtech? I think the plan was to have many of the staff to have their own blogs?

I agree, I would like to see more Anand blog entries. The blog currently doesn't seem to be working -- I can't pull up any of the older entries. I would like to go back and read through some of the old Macdates. Reply

I may be wrong but I think that new CPU or GPU technologies are planned years ahead so for me it look more like they came down to the "same" conclusion on how to improve their CPU. Only Intel did it 1 year before AMD. Reply

There are fundamentally only so many ways to improve processor performance, and Intel used most of them with Core 2. That AMD is using similar patterns (more buffers, better branch prediction, wider execution, etc.) isn't at all surprising. Just because the same basic principles are used, however, doesn't mean that at the transistor level there aren't significant differences and challenges to overcome. Reply

Another great article that displays all the reasons why I read AT - lengthy, technical reviews written by educated authors that are interesting to read and to top it off, with no typing errors! I'm sure you guys use voice software to write these mammoths.

I was waiting for details on Barcelona for so long and this is finally it. I have no doubt that AMD will be up to par with Intel again, but the question is, will this significantly SURPASS Core 2 offerings at the time? I hope so but it's not a definite thing yet.

The best thing is, I'm a ways into my computer engineering degree now so I can actually understand a lot of these very techincal articles! Reply

None... AMD will loose no marketshare. They are in bloody price war... Intel hasn't really regained any lost territory. But Intel have the advantage of performance is trying to find a breakthrough in AMD market share to retake back the lost territory. AMD is still selling everything they make but at huge looses caused by the price war.
Reply

...you have people like me who won't buy anything from Intel. If we didn't have AMD to make Intel competitive we would never have the range of choices we have today. We'd all be running monster Itanics with massive electricity bills.

Intel has the resources to effectively put AMD out of business over time if it so chooses, and today I suspect they are focused on something close to that.

If AMD were to be purchased by a larger corporation, like IBM, it would leave Intel free to beat AMD down with all of their resources. Of course, at that point AMD would have the resources of IBM behind it and could potentially fight back better. Reply