Introduction to the Nehalem Architecture

We Begin Once More

This is an article we have all been anticipating for years now, as it introduces the most dramatic shift in Intel processing technology since the introduction of the front-side bus. Ironically, it is this shift that will finally remove the FSB from Intel products for good. The Nehalem core architecture has been the focus of most of Intel's Developer Forums for the last 24 months, and the culmination of that technology, marketing and product work arrives today.

Intel's Core i7 processors will bring a dramatic set of changes to the enthusiast and PC community in general including a new processor, new CPU socket, new memory architecture, new chipset, new motherboards and new overclocking methods. All of that and more will be addressed in our review today so be prepared for a LOT of valuable information.

The Nehalem Architecture - Years of data summed up

We have done more than our share of technical documentation of the architecture and design, enough so that I feel that duplicating all of it here would be somewhat of a disservice to our frequent readers. I will highlight the most important architectural shifts in the Nehalem design here, but I still encourage you to read over my much more in-depth look at the processor design published in August: Inside the Nehalem: Intel's New Core i7 Microarchitecture.

Here you can see a die shot of the new Nehalem processor - in this iteration a four core design with two separate QPI links and a large L3 cache in relation to the rest of the chip. The primary goal of Nehalem was to take the big performance advantages that the Core 2 CPUs have and modularize them. Now with the Nehalem design, which will be branded as the Intel Core i7, Intel can easily create a range of processors from 1 core to 8 cores depending on the application and market demands. Eight core CPUs will be found in servers, while you'll find dual core machines in the mobile market several months after the initial desktop introduction. The number of QPI (QuickPath Interconnect) links can also vary in order to improve CPU-to-CPU communication.

At a high level the Nehalem core adds some key features to the processor designs we currently have with Penryn. SSE instructions get a bump to revision 4.2, the branch prediction and pre-fetch algorithms are improved, and simultaneous multi-threading (SMT) makes a return after a brief hiatus, having last appeared in the NetBurst architecture.

HyperThreading Returns

I mentioned before that Intel is using Nehalem to mark the return of HyperThreading to its bag of weapons in the CPU battle; the process is nearly identical to that of the older NetBurst processors and allows two threads to run on a single CPU core. But SMT (simultaneous multi-threading), or HyperThreading, is also a key to keeping the 4-wide execution engine fed with work and tasks to complete. With the larger caches and much higher memory bandwidth that the chip provides, this is a very important addition.
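To get a feel for why a second thread helps keep a 4-wide engine busy, here is a toy model in Python. It is only a sketch under loud assumptions: the per-cycle distribution of ready instructions (READY_DIST) is invented for illustration, the two threads are treated as independent, and nothing here reflects how Intel's scheduler actually works or any measured Nehalem data.

```python
from itertools import product

ISSUE_WIDTH = 4  # the article describes Nehalem's execution engine as 4-wide

# ASSUMPTION: made-up probabilities that a single thread has N instructions
# ready to issue in a given cycle. Not measured data.
READY_DIST = {0: 0.3, 1: 0.3, 2: 0.2, 3: 0.1, 4: 0.1}


def expected_utilization(num_threads: int) -> float:
    """Expected fraction of issue slots filled per cycle.

    Enumerates every combination of per-thread ready counts, assuming the
    threads are independent, and caps the issued total at ISSUE_WIDTH.
    """
    total = 0.0
    for combo in product(READY_DIST.items(), repeat=num_threads):
        ready = sum(count for count, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        total += prob * min(ready, ISSUE_WIDTH)
    return total / ISSUE_WIDTH


one_thread = expected_utilization(1)
two_threads = expected_utilization(2)
print(f"one thread:  {one_thread:.1%} of issue slots used")
print(f"two threads: {two_threads:.1%} of issue slots used")
```

In this toy model the second thread fills slots the first one leaves empty, which is the basic idea behind SMT. Real gains depend heavily on the workload and on contention for shared resources, which the model ignores.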

Intel claims that HyperThreading is an extremely power efficient way to increase performance - it takes up very little die area on Nehalem yet has the potential for great performance gains in certain applications. This is obviously much more efficient than adding another core to the die, but just as obviously has some drawbacks compared to that method.

Here you can see Intel's estimations of how much HyperThreading can help performance in specific applications. Surprisingly, one of the best performers is the 3DMark Vantage CPU test that simulates AI and physics on the processor, while POV-Ray 3.7 still sees a huge 30% boost in performance for this relatively small cost addition in logic.
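As a quick bit of arithmetic, a 30% boost in performance does not mean a job finishes 30% sooner. The short Python sketch below works it out for a hypothetical 100-second render; the function name and the 100-second figure are my own illustration, not numbers from Intel's slide.

```python
def runtime_with_boost(base_seconds: float, boost: float) -> float:
    """Return the new runtime when throughput rises by `boost` (0.30 = 30%)."""
    if boost <= -1.0:
        raise ValueError("boost must be greater than -100%")
    return base_seconds / (1.0 + boost)


# ASSUMPTION: a hypothetical 100-second render, with the 30% boost quoted
# for POV-Ray 3.7.
new_time = runtime_with_boost(100.0, 0.30)
print(f"{new_time:.1f} s")  # prints "76.9 s"
```

So a 30% throughput gain trims the render time by roughly 23%, which is still a sizable return for a small amount of added logic.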

Welcome to the Uncore, we got fun and games...

A new term Intel is bringing to the world with this modular design is the "uncore" - basically all of the sections of the processor that are separate from the cores and their self-contained cache. Features like the integrated memory controller, QPI links and shared L3 cache fall into the "uncore" category. All of these components are completely modular; Intel can add cores, QPI links, integrated graphics (coming later in 2009) and even another IMC if it desired.
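One way to picture this modularity is as a set of knobs on a single design. The Python sketch below is purely illustrative: the ChipConfig class and its field names are hypothetical, the desktop entry uses the four-core, two-QPI-link die described earlier, and the server and mobile variants change only the core count because the QPI and memory controller counts for those parts are not given here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ChipConfig:
    """Hypothetical description of one Nehalem-style building-block mix."""
    cores: int
    qpi_links: int
    memory_controllers: int = 1
    integrated_graphics: bool = False

    def __post_init__(self) -> None:
        # The article describes a range of 1 to 8 cores.
        if not 1 <= self.cores <= 8:
            raise ValueError("cores must be between 1 and 8")
        if self.qpi_links < 0 or self.memory_controllers < 1:
            raise ValueError("invalid uncore configuration")


# Four cores and two QPI links, as on the die shot discussed above.
desktop = ChipConfig(cores=4, qpi_links=2)

# ASSUMPTION: only the core count is changed for these variants; the other
# fields are carried over as placeholders, not real product specifications.
server = replace(desktop, cores=8)
mobile = replace(desktop, cores=2)

for name, chip in [("desktop", desktop), ("server", server), ("mobile", mobile)]:
    print(name, chip)
```

The point of the sketch is simply that each product becomes a different combination of the same blocks rather than a new design.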