Introduction

Historically, mobile CPUs were designed as derivatives of their desktop counterparts. You'd usually cut down on the cache, lower the clock speed and voltage, and maybe tweak the package a bit, and you'd have your mobile CPU. For years, this process of trimming the fat off of desktop (and sometimes server) CPUs to make mobile versions was the industry norm - but then Timna came along.

Timna was supposed to be Intel's highly integrated CPU for sub-$600 PCs, which were unheard of at the time. Timna featured an on-die memory controller (for RDRAM, however), an integrated North Bridge, and an integrated graphics core. The Timna design was heavily power-optimized and cost-optimized. In fact, many of the advancements developed by the Timna team were later put to use in other Intel CPUs simply because they were better and cheaper ways of doing things (e.g. some of the CPU packaging enhancements used in the Pentium 4 were originally developed for Timna). What set Timna apart from Intel's other processors was that it was designed in Israel by a team completely separate from those who handled the desktop Pentium 4 designs. Intel wanted a fresh approach for Timna, and that's exactly what they got. Unfortunately, by the time the chip was completed, the market looked bleak for sub-$600 computers; the chip was scrapped, and the team was reassigned to a new project a month later.

The new project was yet another "out-of-the-box" project called Banias. The idea behind Banias was to design a mobile processor from the ground up; instead of taking a higher end CPU and doing your best to cut down its power usage, you started with a low power consumption target and then built the best CPU that you could from there. With a chip on their shoulder (no pun intended) and a bone to pick with Intel management, the former Timna team did the best that they could on this new chip - and the results were impressive.

Banias, later called the Pentium M, proved to not only be an extremely powerful mobile CPU, but was also one of Intel's most on-time projects - missing the team's target deadline by less than 5 days. For a multi-year project, being off by 5 days is nothing short of impressive - and so was the CPU's architecture.
While many will call the Pentium M a Pentium III/Pentium 4 hybrid, it is far from it. Intel knew that the Pentium 4 wasn't a low-power architecture. The Pentium 4's trace cache, double-pumped ALUs, extremely long pipeline and resulting high-frequency operation were horrendous for low-power mobile systems. So, as a basis for a mobile chip, the Pentium 4 was out of the question. Instead, Intel borrowed the execution core of the Pentium III - far from the most powerful execution core, but a good starting point for the Pentium M. Remember that the Pentium III's execution core was partly at fault for AMD's early successes with the Athlon, so performance-wise, Intel would have their work cut out for them.

Taking the Pentium III's execution units, Intel went to town on the Pentium M architecture. They implemented an extremely low power, but very large L2 cache - initially 1MB, later growing to 2MB in the 90nm Pentium M. The large L2 cache plays a very important role in the Pentium M architecture, as it highlights a very bold design decision - to keep the Pentium M pipeline filled at all costs. In order to reach higher frequencies, Intel had to lengthen the pipeline of the Pentium M from that of the Pentium III. The problem with a lengthened pipeline is that any bubbles in the pipe (wasted cycles) are wasted power, and the more of them you have, the more power you're wasting. So Intel outfitted the Pentium M with a very large, very low latency L2 cache to keep that pipeline full. Think of it like having a really big supermarket right next to your home, instead of a smaller one nearby or a large one 10 miles away - there are obvious tradeoffs, but if your goal is to remain efficient, the choice is clear.
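The cache tradeoff above is easy to see with a bit of back-of-envelope arithmetic. This sketch uses purely illustrative hit rates and latencies (not the Pentium M's actual figures) to show why a larger cache at the same latency slashes the average stall time per memory access:

```python
# Illustrative sketch: how a larger, still-low-latency L2 cache cuts the
# average stall cycles per memory access. All numbers are assumptions,
# not measured Pentium M characteristics.

def avg_access_cycles(l2_hit_rate, l2_latency, memory_latency):
    """Expected cycles for an access that missed L1, given the L2 hit rate."""
    return l2_hit_rate * l2_latency + (1.0 - l2_hit_rate) * memory_latency

# Smaller cache: lower hit rate means more trips out to slow main memory.
small_cache = avg_access_cycles(l2_hit_rate=0.85, l2_latency=10, memory_latency=200)
# Larger cache at the same latency: most accesses stay on-die.
large_cache = avg_access_cycles(l2_hit_rate=0.97, l2_latency=10, memory_latency=200)

print(f"small L2: {small_cache:.1f} cycles, large L2: {large_cache:.1f} cycles")
```

With these assumed numbers, bumping the hit rate from 85% to 97% drops the average cost from 38.5 to 15.7 cycles per access - fewer stall cycles, fewer pipeline bubbles, less wasted power.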

A large and low latency L2 cache isn't enough, however. Intel also equipped the Pentium M with a fairly sophisticated (at the time) branch prediction unit. With each mispredicted branch, you end up with a large number of wasted clock cycles and that translates into wasted power - so beef up the branch predictor and make sure that you hardly ever mispredict anything in the name of power.
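The power cost of misprediction follows directly from a simple product: how often you branch, how often you guess wrong, and how many pipeline stages get flushed each time. The figures below are illustrative assumptions, not Pentium M internals:

```python
# Rough illustration (assumed numbers): the average cycles thrown away per
# instruction scale with branch frequency, mispredict rate, and the flush
# penalty - so a better predictor directly saves power.

def wasted_cycles_per_instruction(branch_fraction, mispredict_rate, flush_penalty):
    """Average cycles discarded per instruction due to branch flushes."""
    return branch_fraction * mispredict_rate * flush_penalty

# Assume ~20% of instructions are branches and a 12-cycle flush penalty.
weak_predictor = wasted_cycles_per_instruction(0.20, 0.10, 12)    # 10% wrong
strong_predictor = wasted_cycles_per_instruction(0.20, 0.04, 12)  # 4% wrong

print(f"weak: {weak_predictor:.3f}, strong: {strong_predictor:.3f} wasted cycles/instr")
```

Under these assumptions, cutting the mispredict rate from 10% to 4% cuts the wasted work from 0.24 to 0.096 cycles per instruction - and on a lengthened pipeline, every one of those cycles burns power for nothing.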

The next thing to tackle was chip layout. Normally, CPUs are designed to exploit the fastest possible circuits within the microprocessor, but in the eyes of the power-conscious, any circuit that ran faster than it needed to was wasting power. So, the Pentium M became the first Intel CPU designed with a clock speed wall in mind; Intel would have to rely on their manufacturing to ramp up clock speed from one generation to the next. This is why it took the move from 130nm down to 90nm for the Pentium M to hit 2.0GHz, even though it launched at 1.6GHz.
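The reasoning here rests on the standard dynamic-power rule of thumb for CMOS logic, P = C·V²·f. The sketch below uses made-up capacitance and voltage figures (not Intel's) just to show the shape of the tradeoff - running circuits faster than the target clock usually also demands higher voltage, and voltage hurts quadratically:

```python
# Standard CMOS dynamic-power rule of thumb, with hypothetical numbers.
# Pushing frequency (and the voltage needed to sustain it) costs power
# out of proportion to the speed gained.

def dynamic_power(capacitance_f, voltage_v, frequency_hz):
    """Switching power P = C * V^2 * f (activity factor folded into C)."""
    return capacitance_f * voltage_v**2 * frequency_hz

# The same hypothetical core at two operating points:
fast_point = dynamic_power(1e-9, 1.4, 2.0e9)   # circuits pushed for headroom
tuned_point = dynamic_power(1e-9, 1.1, 1.6e9)  # sized to the target clock

print(f"fast: {fast_point:.2f} W, tuned: {tuned_point:.2f} W")
```

With these assumed values, a 25% frequency bump paired with a higher voltage roughly doubles the switching power - which is exactly why slowing over-fast circuits down to the target clock is free power savings.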

There were other advancements made to the core to improve performance as well; things like micro-op fusion and a dedicated stack manager are also at play. We've talked in detail about all of the features that went into the first Pentium M and its later 90nm revision (Dothan), but the end result is a CPU that is highly competitive with the Athlon 64 and the Pentium 4 in notebooks.

Take the first Pentium Ms for example; at 1.6GHz, the first Pentium Ms were faster than 2.66GHz Pentium 4s in notebooks in business and content creation applications. More recently, the first 2.0GHz Pentium Ms based on the Dothan core managed to outperform the Pentium 4 3.2GHz and the Athlon 64 3000+. Pretty impressive for a notebook platform, but what happens when you make the move to the desktop world?

On the desktop, the Pentium 4 runs at higher clock speeds, as does the Athlon 64. Both the Pentium 4 and Athlon 64 have dual channel DDR platforms on the desktop, unlike the majority of notebooks out there. Does the Pentium M have what it takes to be as competitive on the desktop as it is in the mobile sector? Now that the first desktop Pentium M motherboards are shipping, this review is here to find out.

Something else to remember about the Banias/Dothan line of chips... Aggressive power reduction was the #1 goal of the design process. In a 'normal' chip design, not all pipeline stages are the same length; the clock speed it runs at is set by the slowest part of the CPU. Since power usage is directly related to the frequency of the switching gates, the Intel engineers actually deliberately slowed down some parts of the chip to match the target release speeds (or get close to them) to reduce power consumption. This is, perhaps, the main reason why the frequencies don't scale as well as some would want them to.

Here's another thought... When the Opterons initially launched with ECC DDR266, there were similar comments like "give it unbuffered DDR400 or higher and stay out of its way" :) Well, now that we have that, it did improve performance a bit, but not hugely. It shouldn't help Dothan significantly more either.

I like how AMD got beaten by the P-M :) Not because I'm an Intel fan, just because this will make things more interesting now.

Don't flame me for this comment :p It's my opinion.

Funny how you picked the game benchmarks, btw; it's almost as if you wanted to show the P-M lagging behind the A64... From what I've seen, it beats the A64 in HL2 and CS:S, and that's a game you don't usually skip :) So why now?

It also looks suspicious how, in lots of tests where the P-M performs well clock-for-clock with the A64 or beats it, there is almost no difference between the 3800+ and 4000+ results... as if L2 isn't all that important, yet L2 is exactly how everyone explains the P-M's success.

Maybe we'll see some 2MB L2 A64 "emergency edition" once Dothan gets a decent desktop chipset, just like what Intel did to (try to) save the P4 from the A64 :)
Actually, I'd be happy if Dothan motivates AMD to develop faster L2 cache or something.

Knowing Intel, I don't expect they'd even try to match AMD's prices with the P-M... and there's a lot of room for AMD to decrease prices, as they're selling with quite a margin now. So for sure, the P-M won't be cost-effective compared to the A64, at least not if you don't care about ultra-low power consumption.

It also doesn't look likely that Dothan could scale beyond 2.6GHz on current 90nm tech. By the time it gets there, AMD should've launched the 2.8GHz FX and most likely 3GHz too. So I have no doubt AMD will keep the lead for quite a while... Maybe the race to 65nm will be the next turning point, as it seems to be going smoothly for Intel (at least for the P-M).

Anyway, even if AMD is better in absolute performance, price point and (arguably) clock-for-clock, you've gotta hand it to the P-M - it packs quite a punch. Fun times are coming :)

The first FX-51 was released around the late third quarter of 2003, so in a little over a year the FX series has only increased 400MHz. Can you automatically assume that the FX has poor scalability in terms of CPU speed? No. You know why? Because the EE is underperforming and can't touch the FX. AMD has no need to push large-scale speed increases out of the FX line, which would do nothing but increase cost with each new stepping used to boost performance.

The same goes for the Dothan at 2.26GHz by the end of 2005. What other CPU offers the same level of performance vs. battery life? So why push for performance, except to push sales?

You simply can't determine the scalability of a CPU based on its roadmap, especially when it's the performance leader in its market segment and has no viable competitor now or in the near future.

58: "How long has the A64 been stuck on 2.4GHz?"
----------------------------------

They're not. The 2.6GHz FX-55 has been out for months. More importantly, AMD doesn't have to release new chips with the way they dominate the benchmarks now. Could they? Hell yes. They've got a nice buffer going: new FXs hit 3.0GHz on stock air, and cheap 90nm chips are now hitting 2.7GHz on default Vcore and air. And by air, I mean AMD's cheap all-aluminum HS with an itty-bitty 15mm x 70mm fan, not Prescott's copper-core screamers.