First Look at Nehalem Microarchitecture. Page 9

In the next few days Intel is going to stage another revolution in the processor market by launching Core i7, its new processors based on the Nehalem microarchitecture. This microarchitecture should become the next significant step after the extremely successful Core microarchitecture. In today’s article we are going to discuss the details of the Nehalem microarchitecture, which will help us better understand what to expect from Intel.

Power Management and Turbo-Mode

Many of the changes Intel engineers introduced in Nehalem processors stem from optimizing this microarchitecture for a native multi-core design. It was therefore also necessary to revise the processor power management system. Multi-core processors on the Core microarchitecture are quite power-inefficient in the sense that a single power management algorithm serves the whole chip and doesn’t take individual cores into account. As a result, in contemporary quad-core CPUs one heavily loaded core frequently prevents the other cores from entering power-saving mode even though they are hardly doing any work.

That is why the Nehalem microarchitecture has one more important unit called the PCU (Power Control Unit). It is essentially another programmable microcontroller built into the CPU whose job is to manage power consumption intelligently. Not surprisingly, the PCU is fairly complex: it consists of about 1 million transistors.

The PCU’s main task is to adjust the frequency and voltage of individual cores, and it has everything it needs for that. It receives temperature, voltage and current sensor readings for all cores. The PCU analyzes this data and switches qualifying cores into power-saving mode by adjusting their frequency and voltage. In particular, the PCU may disable inactive cores and put them into a deep sleep state where their power consumption is close to zero.

To make it all happen, Intel engineers and technologists created a special semiconductor material that makes it possible to disconnect each core from the power bus independently. The main advantage of this technology is that power management of individual cores is performed inside the CPU and doesn’t require any changes to the processor voltage regulator circuitry on mainboards.

As for the processor units shared by all cores, such as the memory controller and the QPI interface, they go into power-saving mode when all processor cores are asleep.
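The per-core behavior described above can be illustrated with a toy model. This is only a sketch: the `Core` class, the `apply_power_gating` function and the simple "idle means gated" rule are hypothetical simplifications, not Intel's actual PCU firmware logic, which also weighs temperature, voltage and current readings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Core:
    """Hypothetical stand-in for one processor core."""
    busy: bool
    power_gated: bool = False


def apply_power_gating(cores: List[Core]) -> bool:
    """Gate every idle core independently of the others.

    Returns True when the shared units (memory controller, QPI)
    may also enter power-saving mode, i.e. when all cores sleep.
    """
    for core in cores:
        # Each core is decided on its own: a busy neighbor
        # no longer keeps an idle core powered.
        core.power_gated = not core.busy
    return all(core.power_gated for core in cores)


cores = [Core(busy=True), Core(busy=False), Core(busy=False), Core(busy=False)]
shared_units_sleep = apply_power_gating(cores)
print([c.power_gated for c in cores], shared_units_sleep)
```

The point of the sketch is the contrast with the Core microarchitecture: the decision is made per core, and only the shared units depend on the state of all cores together.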

An intelligent controller that can manage processor cores independently also allowed Intel to implement another interesting feature called Turbo Boost Technology. It introduces a Turbo-mode in which individual cores can run at frequencies above the nominal, i.e. be overclocked. The main principle behind Turbo Boost Technology is that overall processor power consumption and heat dissipation drop when some cores enter power-saving mode, which makes it possible to raise the frequencies of the other cores without the risk of exceeding the TDP limit.

In fact, Intel has already introduced something similar in mobile dual-core Penryn processors, but the technology has been developed much further in Nehalem. If there is no risk of exceeding typical power consumption and heat dissipation, the PCU may raise the clock frequency of certain cores one step (133MHz) above the nominal. This may happen when the workload is not parallelized and some cores are idle.

Moreover, if all the conditions described above are met, the frequency of one of the cores may be increased two steps (266MHz) above the nominal.
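The step arithmetic is easy to check with a few lines of Python. This sketch assumes each step equals one multiplier increment on a 133.33MHz base clock (the article rounds this to 133MHz) and uses a 24x nominal multiplier, which matches a 3.2GHz processor; the function name is made up for illustration.

```python
# Assumption: base clock of 400/3 MHz, i.e. the "133MHz" step from the text.
BCLK_MHZ = 400 / 3


def core_frequency_mhz(nominal_multiplier: int, turbo_steps: int = 0) -> float:
    """Core frequency after adding turbo steps to the nominal multiplier."""
    if turbo_steps not in (0, 1, 2):
        raise ValueError("this sketch only models 0, 1 or 2 turbo steps")
    return (nominal_multiplier + turbo_steps) * BCLK_MHZ


for steps in (0, 1, 2):
    mhz = core_frequency_mhz(24, steps)
    print(f"+{steps} step(s): {mhz:.0f} MHz")
```

For a 24x part this gives roughly 3200, 3333 and 3467MHz, so two steps add about 266MHz to the nominal frequency.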

I have to stress that Turbo-mode doesn’t necessarily require one or more cores to be in power-saving mode. That is simply one of the possible scenarios. Since the PCU has all the information on the current status of the processor cores, Turbo-mode can also be enabled when all cores are active but the workload is relatively light.
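A rough way to picture this decision is as a headroom check rather than a check for sleeping cores. The function below is a hypothetical simplification: the parameter names, the single power estimate and the wattage figures in the usage lines are illustrative assumptions, and the real PCU algorithm is not described in this article.

```python
def turbo_steps(active_cores: int, estimated_power_w: float,
                tdp_w: float, temperature_ok: bool = True) -> int:
    """Return how many 133MHz steps a toy PCU would grant (0, 1 or 2)."""
    # No headroom left: stay at the nominal frequency.
    if not temperature_ok or estimated_power_w >= tdp_w:
        return 0
    # A single active core may go two steps above the nominal.
    if active_cores == 1:
        return 2
    # Otherwise one step, even if every core is active under a light load.
    return 1


# Illustrative values only, not measurements.
print(turbo_steps(active_cores=1, estimated_power_w=60, tdp_w=130))
print(turbo_steps(active_cores=4, estimated_power_w=90, tdp_w=130))
print(turbo_steps(active_cores=4, estimated_power_w=130, tdp_w=130))
```

Note that the number of sleeping cores never appears as a hard condition: in this model four active cores under a light load still receive one step, which is the scenario the paragraph above describes.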

Turbo Boost Technology is completely transparent to the operating system, which is a great advantage. It is implemented entirely in hardware and doesn’t require any software applications or utilities to be running.

To see what this means in practice, we observed the status of our quad-core Nehalem processor with a 3.2GHz nominal frequency while it was running 1 to 8 computational threads created by the Prime95 utility.


As you can see in the animated illustration above, Intel Enhanced SpeedStep technology kicks in when there is no workload: the processor frequency drops to 1.6GHz. Launching one thread activates one core, so the CPU can increase its multiplier from 24x to 26x, overclocking itself to 3.46GHz. Two threads increase the processor load enough that the PCU only dares to raise the clock speed to 3.33GHz. The frequency stays there for up to 5 simultaneous threads. Only the sixth thread raises CPU utilization to 75% and brings the frequency back down to the nominal 3.2GHz. In other words, Turbo Boost Technology is not an ephemeral concept: its effect is real.
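The observations from this test can be summarized in a short lookup. The 24x and 26x multipliers come from the text above; the 25x and 12x values are inferred from the 3.33GHz and 1.6GHz figures assuming a 133.33MHz base clock, and the function name is hypothetical. Because the sketch rounds rather than truncates, the 26x result prints slightly higher than the 3.46GHz quoted above.

```python
BCLK_MHZ = 400 / 3  # assumed base clock, about 133.33MHz


def observed_multiplier(threads: int) -> int:
    """Multiplier seen in our Prime95 test for a given thread count."""
    if not 0 <= threads <= 8:
        raise ValueError("the test covered 0 to 8 threads only")
    if threads == 0:
        return 12   # idle: Enhanced SpeedStep drops to 1.6GHz (inferred 12x)
    if threads == 1:
        return 26   # one active core: two steps above the nominal
    if threads <= 5:
        return 25   # 2 to 5 threads: one step above the nominal (inferred 25x)
    return 24       # 6 or more threads: nominal 3.2GHz


for threads in range(9):
    mult = observed_multiplier(threads)
    print(f"{threads} thread(s): {mult}x, {mult * BCLK_MHZ / 1000:.2f} GHz")
```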