Final Words

Expecting a sequel to be a reincarnation of the original is just setting yourself up for disappointment. A good sequel will be able to stand on its own, independent of whatever may have come before it. Nehalem is Intel's Dark Knight: it lacks the reinvention that made Conroe so incredible, but it continues what was started in 2006.

The Core i7's general purpose performance is solid; you're looking at a 5 - 10% increase in general application performance at the same clock speeds as Penryn. Where Nehalem really succeeds, however, is in anything involving video encoding or 3D rendering, where the performance gains are easily in the 20 - 40% range. Part of the performance boost here is due to Hyper Threading, but the on-die memory controller and architectural tweaks are just as responsible for driving Intel's performance through the roof.

The iTunes results do point to a downside for Nehalem: there are going to be some situations where Intel's new architecture doesn't offer a performance advantage over its predecessor. If you're not doing a lot of 3D rendering or video encoding work and you already have a Core 2 Quad, the upgrade to Nehalem won't be worth it. If you're still stuck on a Pentium 4 or something similarly slow by today's standards, a jump to Nehalem would be warranted.

Gaming performance is actually better than expected for Nehalem. There were enough cases where the new architecture pulled ahead despite its very small L2 cache that I wouldn't mind recommending it for gamers. In most GPU limited situations, however, you won't see any performance improvement over Penryn, at least with today's GPUs.

While posting some very impressive performance gains, Nehalem is nearly as much about efficiency. Hyper Threading alone delivers a 0 - 30% increase in performance at a 0 - 15% increase in power consumption; the problem is that Nehalem's efficiency is only as good as its performance and in those areas where Nehalem can't outperform Penryn, its power efficiency suffers.
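To put rough numbers on that trade-off, here is a minimal sketch of relative performance per watt using only the ends of the ranges quoted above. The function name is made up for illustration, and pairing a 0% performance gain with the full 15% power increase is an assumed worst case, not a measured result.

```python
def relative_efficiency(perf_gain_pct, power_gain_pct):
    """Performance per watt relative to the baseline (1.0 = unchanged)."""
    return (1 + perf_gain_pct / 100.0) / (1 + power_gain_pct / 100.0)

# Best case quoted above: 30% more performance for 15% more power.
best_case = relative_efficiency(30, 15)

# Assumed worst case (hypothetical pairing): no speedup, full 15% power cost.
worst_case = relative_efficiency(0, 15)

print(f"best case:  {best_case:.2f}x performance per watt")
print(f"worst case: {worst_case:.2f}x performance per watt")
```

Anything above 1.0 means Hyper Threading improved efficiency. Under these assumptions the best case lands around 1.13x and the worst case around 0.87x, which is why efficiency follows performance so closely.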

I can't help but wonder if what we saw with the QX9770 is indicative of a larger Nehalem advantage: Penryn's power consumption may truly increase dramatically as clock speed goes up, while Nehalem is able to reel it back in. If that is indeed the case, then Nehalem is even more important for the future of the Core microarchitecture than I originally thought. You could consider it the reverse-Prescott in that case, if its design choices are meant to keep power consumption under control as clock speed ramps up.

It seems odd debating the usefulness of a processor that can easily offer a 20 - 40% increase in performance, but the issue is that the advantages are very specific in their nature. While Conroe reset the entire board, Nehalem is very targeted in where it improves performance the most. That is one benefit of the tick-tock model, however: if Intel was too aggressive (or conservative?) with this design, then it only needs to last two years before it's replaced with something else. I am guessing that once Intel moves to 32nm, L2 cache sizes will increase once more and perhaps bring greater performance to all applications.

Quite possibly the biggest threat to Nehalem is that, even at the low end, $284 is a good amount for a microprocessor these days. You can now purchase AMD's entire product line for less than $180 and the cost of entry to a Q9550 is going to be lower, at least at the start, than a Core i7 product. There's no denying that the Core i7 is the fastest thing to close out 2008, but you may find that it's not the most efficient use of money. The first X58 motherboards aren't going to be cheap and you're stuck using more expensive DDR3 memory. If you're running applications where Nehalem shines (e.g. video encoding, 3D rendering) then the ticket price is likely worth it. If you're not, then the ~10% general performance improvement won't make financial sense.

It also remains to be seen what will happen to the Nehalem market once Intel introduces the LGA-1156 version next year for lower price points. By introducing a $284 part this early Intel appears to be courting the Q6600/Q9450/Q9550 buyers to the LGA-1366 platform, which would mean that the two-channel Nehalems are strictly value parts and perhaps there won't be much fragmentation in the market as a result.

Intel has two thirds of the perfect trifecta here. Nehalem brings the ability to work on more threads at a time, redefining video encoding and 3D rendering performance, and its SSDs shook the storage world. That just leaves Larrabee...

Would you guys consider rebenchmarking?
From the x264 changelog since the Nehalem specific optimizations:
"Overall speed improvement with Nehalem vs Penryn at the same clock speed is around 40%."

Good review and better than Tom's overall. However, Tom's stumbled on something that changed my mind about gaming with Nehalem. While Anand's testing shows minimal performance gains (and came to the not good for games conclusion), Tom's approached it with 1-4 GPUs in SLI or Crossfire. All I can say is the performance gains with Nvidia cards in SLI were stunning. Maybe the platform favors SLI or Nvidia had a driver advantage in licensing SLI to Intel. Either way Nehalem and SLI smoked ATI and the current 3.2 extreme quad across the board.

Something I think you guys missed in your article/conclusion is the fact that we're now able to pair a great CPU with a pretty damn good North/South Bridge AND SLI.

I found that the 680/780/790 featureset is plainly lacking and that the Intel ICH9R/10R seems to always perform better and has more features. If any doubt, look at Matrix RAID vs nVidia's RAID. Night and day difference, especially with RAID5.

The problem with the X38/X48 was you got a great board but were effectively locked into ATI for high end Gaming.

Now we have the best of both worlds. You get ICH10R, a very well performing CPU (even the 920 beats most of the Intel Quad Core lineup) AND you can run 1/2/3 nVidia GPUs on the machine. In my opinion, this is a winning combination.

The only downside I see is board designs seem to suck more and more.

With socket 1366 being so massive and 6 DIMM slots on the Enthusiast/Gamer boards, we're seeing not only 6 expansion slots (down from the standard of 7) but in most boards I have seen pics of, the top slot is an x1 so they can wedge it next to the X58 IOH, which means you're left with only 5 slots for other cards. Using 3 dual slot cards is out of the question without a massive 10 slot case (of which there are only like 3-5 on the market) and even if you can wedge 2 or 3 dual slot cards into the machine, you have almost zero expansion card slots should you ever need them.

Then we get to all the cooling crap surrounding the CPU. ALL these designs rely on a top down traditional cooler and if you decide to use a highly effective tower cooling solution, all the little heatsink fins on the Northbridge and power regulators around the CPU get very little or no airflow. Now you're in there adding puny little 40/60mm fans that produce more noise than airflow, not to mention that the DIMMs are hardly ever cooled in today's board designs.
Call me a cooling purist if you will, but I much prefer traditional front to back airflow and all this side intake top exhaust stuff just makes me cringe. I personally run a Tyan Thunder K8WE with 2 Hyper6+ coolers and the procs and RAM are all cooled front to back. Intake and exhaust are 120mm and I have a bit of an air channel in which that airflow never goes near the expansion card slots below, which by the way have a 92mm fan up front pushing air in across the drives and another 92mm fan clipped onto the expansion slots in the back pulling it back out.

I don't know how to resolve these issues, but I think someone surely needs to because IMHO it's getting out of control.

"Looking at POV-Ray we see a 30% increase in performance for a 12% increase in total system power consumption, that more than exceeds Intel's 2:1 rule for performance improvement vs. increase in power consumption."

You can't use "total system power", but must make the best estimate of CPU power draw. Why? Because imagine if you had a system with 6 sticks of RAM, 4 HDDs, etc. You would have ever increasing power figures that would make the ratio of increased power consumption (a/b) smaller and smaller!

If you take your figures and subtract (a guesstimate of) 100W for non CPU power draw, then you DON'T get the Intel 2:1 ratio at all!
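
To make that arithmetic concrete, here is a minimal sketch. The 30% and 12% figures come from the quoted POV-Ray result and the 100W is the guesstimate above, but the 200W and 224W system readings are hypothetical values picked only so that total power rises by 12%. They are not measurements from the review, and `perf_to_power_ratio` is a made-up helper name.

```python
def perf_to_power_ratio(perf_gain_pct, before_w, after_w, non_cpu_w=0.0):
    """Performance gain (%) divided by power increase (%).

    non_cpu_w is subtracted from both readings to estimate CPU-only power.
    """
    before = before_w - non_cpu_w
    after = after_w - non_cpu_w
    power_gain_pct = (after - before) / before * 100.0
    return perf_gain_pct / power_gain_pct

# Hypothetical wall readings: 200W -> 224W is a 12% rise in total system power.
BEFORE_W = 200.0
AFTER_W = 224.0

whole_system = perf_to_power_ratio(30.0, BEFORE_W, AFTER_W)
cpu_only = perf_to_power_ratio(30.0, BEFORE_W, AFTER_W, non_cpu_w=100.0)

print(f"total system power: {whole_system:.2f}:1")
print(f"CPU-only estimate:  {cpu_only:.2f}:1")
```

With these assumed readings the ratio is 2.5:1 on total system power but only 1.25:1 once the 100W is removed. The exact result depends entirely on the real baseline wattage, so treat this as an illustration of the effect rather than a correction of the review's numbers.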