Modern processors constantly change their operating frequency (and voltage) depending on workload. On Intel processors, this is typically handled by the operating system, which requests a particular level of performance, known as the Performance State or P-state, from the processor. The processor then adjusts its frequency and voltage to accommodate, in a DVFS (dynamic voltage and frequency scaling) sort of way, but only at the P-states fixed at the time of production. While running the system at maximum all the time would be best for performance, the high voltage makes this the least efficient way to run a processor and wasteful in terms of energy, which for mobile devices means shorter battery life or thermal throttling. With the P-state model, the operating system can request lower P-states in order to save power, but if a task requires more performance, and the power/thermal budgets are sufficient, the P-state can be raised to accommodate. This 'technology' on Intel processors has historically been called 'SpeedStep'.

With Skylake, Intel's newest 6th generation Core processors, this changes. The processor has been designed so that, with the right commands, the OS can hand control of the frequency and voltage back to the processor. Intel is calling this technology 'Speed Shift'. We’ve discussed Speed Shift before in Ian’s Skylake architecture analysis, but despite the in-depth talk from Intel, Speed Shift was noticeably absent at the launch of the processors. This is due to one of the requirements for Speed Shift: it needs operating system support to hand control of processor performance over to the CPU, and Intel had to work with Microsoft to get this functionality enabled in Windows 10. As of right now, anyone with a Skylake processor is not yet getting the benefit of the technology. A patch will be rolled out in November for Windows 10 which will enable this functionality, but it is worth noting that it will take a while for it to roll out to new Windows 10 purchases.

Compared to SpeedStep / P-state transitions, Intel's new Speed Shift changes the game by having the operating system relinquish some or all control of the P-states and hand that control to the processor. This has a couple of noticeable benefits. First, it is much faster for the processor to control the ramp up and down in frequency than for the OS to do so. Second, the processor has much finer control over its states, allowing it to choose the optimal performance level for a given task and therefore use less energy as a result. Specific jumps in frequency are reduced to around 1 ms with Speed Shift's CPU control, from 20-30 ms under OS control, and going from an efficient power state to maximum performance can be done in around 35 ms, compared to around 100 ms with the legacy implementation. As seen in the images below, neither technology can jump from low to high instantly, because to maintain data coherency through frequency/voltage changes there is an element of gradient as data is realigned.

The ability to quickly ramp up performance increases the overall responsiveness of the system, rather than lingering at lower frequencies while waiting for the OS to pass commands through a translation layer. Speed Shift cannot increase absolute maximum performance, but on short workloads that require a brief burst of performance, it can make a big difference in how quickly the task gets done. Ultimately, much of what we do falls into this category, such as web browsing or office work. As an example, web browsing is all about getting the page loaded quickly, and then getting the processor back down to idle.

For this short piece, Intel was able to provide us with the Windows 10 patch for Speed Shift ahead of time, so that we could test and see what kind of gains it can achieve. This gives us a somewhat unique situation, since we can isolate this one variable on a new processor and measure its impact on various workloads.

To test Speed Shift, I’ve chosen several tasks whose workloads could show some gain from it. Tests which run the processor at its maximum frequency for long periods of time are not going to show any significant gain, since you are not limited by the responsiveness of the processor in those cases. The first test is PCMark 8, a benchmark that attempts to represent real-life tasks and whose workload is not constant. In addition, I’ve run the system through several JavaScript tests, which are the best case scenario for something like Speed Shift, since the processor has to quickly complete a task in order to allow you to enjoy a website.

The processor in question is an Intel Core i7-6600U, with a base frequency of 2.6 GHz and a turbo frequency of 3.4 GHz. Despite the base frequency being rated on the box at 2.6 GHz, the processor can go all the way down to 400 MHz when idle, so being able to ramp up quickly could make a big impact even on the U-series Skylake processors. My guess is that it will be even more beneficial to the Y-series Core m3/m5/m7 parts, since they have a larger dynamic range and typically more thermal constraints.
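To get a feel for why ramp time matters more on short bursts than on long runs, here is a minimal back-of-envelope sketch in Python. It is a toy model, not Intel's algorithm: it assumes the clock rises linearly from 400 MHz to 3.4 GHz over the ramp period (the linear shape is my assumption), and it uses the roughly 35 ms and 100 ms idle-to-maximum figures quoted above. The function name and the workload sizes are hypothetical.

```python
import math


def completion_time_ms(work_mcycles, ramp_ms, f_idle_mhz=400.0, f_max_mhz=3400.0):
    """Time (ms) to retire `work_mcycles` million cycles from idle.

    Toy model (assumption): frequency climbs linearly from f_idle to f_max
    over `ramp_ms`, then stays at f_max. Real hardware steps between states.
    """
    # 1 MHz = 1 million cycles per second = 0.001 million cycles per ms.
    idle_rate = f_idle_mhz / 1000.0
    max_rate = f_max_mhz / 1000.0
    if ramp_ms <= 0 or max_rate == idle_rate:
        return work_mcycles / max_rate

    # Work retired during the full ramp: area under the linear frequency curve.
    ramp_work = 0.5 * (idle_rate + max_rate) * ramp_ms
    if work_mcycles >= ramp_work:
        return ramp_ms + (work_mcycles - ramp_work) / max_rate

    # Task finishes mid-ramp: solve a*t^2 + b*t - work = 0 for t > 0.
    a = (max_rate - idle_rate) / (2.0 * ramp_ms)
    b = idle_rate
    return (-b + math.sqrt(b * b + 4.0 * a * work_mcycles)) / (2.0 * a)


# Hypothetical workloads: a short burst and a long sustained job.
for work in (100.0, 100_000.0):
    fast = completion_time_ms(work, ramp_ms=35.0)   # Speed Shift-like ramp
    slow = completion_time_ms(work, ramp_ms=100.0)  # legacy-like ramp
    saved = 100.0 * (slow - fast) / slow
    print(f"{work:>9.0f} Mcycles: {fast:9.1f} ms vs {slow:9.1f} ms ({saved:.1f}% less time)")
```

Under these assumptions, the 100 million cycle burst finishes in roughly 45 ms instead of roughly 69 ms, while the 100 billion cycle job differs by only about 0.1%. That is the pattern to look for in the results: noticeable gains on bursty work, and next to nothing on sustained loads.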

PCMark 8

Both the Home and Work tests show a very small gain with Speed Shift enabled. The length of these benchmarks, which are between 30 and 50 minutes, would likely mask any gains on short workloads. I think this illustrates that Speed Shift is just one more tool, and not a holy grail for performance. The gain on Home is just under 3%, and the difference on the Work test is negligible.

JavaScript Tests

JavaScript is one of the use cases where short burst workloads are the name of the game, and here Speed Shift has a much bigger impact. All tests were done with the Microsoft Edge browser.

The time to complete the Kraken 1.1 test is the least affected, with just a 2.6% performance gain, but Octane's scores show over a 4% increase. The big win here though is WebXPRT. WebXPRT includes subtests, and in particular the Photo Enhancement subtest can see up to a 50% improvement in performance. This bumps the scores up significantly, with WebXPRT 2015 showing an almost 20% score increase and WebXPRT 2013 a 26% gain. These leaps in performance are certainly the kind that would be noticeable to the end user manipulating photographs in something like Picasa or watching web-page based graph adjustments such as live stock feeds.

Power Consumption

The other side of the coin is power consumption. A processor that can quickly ramp up to its maximum frequency could consume more power, due to the greater penalty of increasing the voltage. But if it can complete the task quickly and get back to idle, there is a chance to be more efficient when work is done in tens of milliseconds rather than hundreds, as the frequency ramps up and down again before the old P-state method has decided to do anything. The principle of 'work fast, finish now' was the backbone of Intel's 'Race To Sleep' strategy during the ultrabook era, which focused on the impulse of response-related performance. However, the drive for battery life means that efficiency has tended to matter more, especially as devices and batteries get smaller.
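The trade-off can be sketched with simple energy bookkeeping: energy is power multiplied by time, summed over the active burst and the idle remainder of a fixed window. The Python below is a rough illustration only. Every power and duration figure in it is a made-up placeholder, not a measurement of any Skylake part, and the function name is hypothetical.

```python
def window_energy_mj(active_power_mw, active_ms, idle_power_mw, window_ms):
    """Energy (mJ) used over a fixed window: one active burst, then idle.

    mW * ms gives microjoules, so divide by 1000 to report millijoules.
    """
    if active_ms > window_ms:
        raise ValueError("the burst must fit inside the window")
    idle_ms = window_ms - active_ms
    return (active_power_mw * active_ms + idle_power_mw * idle_ms) / 1000.0


# All numbers below are hypothetical placeholders, chosen only to show that
# the comparison can go either way.
IDLE_MW = 500.0
WINDOW_MS = 1000.0

race = window_energy_mj(8000.0, 50.0, IDLE_MW, WINDOW_MS)     # high power, short burst
slow_a = window_energy_mj(3000.0, 200.0, IDLE_MW, WINDOW_MS)  # lower power, 4x longer
slow_b = window_energy_mj(2000.0, 150.0, IDLE_MW, WINDOW_MS)  # much lower power, 3x longer

print(f"race to idle: {race:.0f} mJ")
print(f"slow case A:  {slow_a:.0f} mJ")
print(f"slow case B:  {slow_b:.0f} mJ")
```

With these placeholder values, racing to idle beats slow case A but loses to slow case B. Whether the fast ramp saves energy depends on how much extra power the higher voltage costs relative to the time saved, which is why it has to be measured rather than assumed.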

Due to the way modern processors work, we don’t have the tools to directly measure the SoC power. Intel has told us that Speed Shift does not impact battery life very much, one way or the other, so to verify this, I've run our light battery life test with the option disabled and enabled.

This task is likely one of the best case scenarios for Speed Shift. It consists of launching four web pages per minute, with plenty of idle time in between. Although Speed Shift seems to have a slight edge, it is very small and would fall within the margin of error on this test. Some tasks may see a slight improvement in efficiency, and others may see a slight regression, but Speed Shift is less of a power savings tool than other pieces of Skylake. Looking at it another way, if, for example, the XPS 13 with Skylake was to get 15 hours of battery life, Speed Shift would only change the result by about 7 minutes. Responsiveness increases, but net power use remains about the same.
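To put the hypothetical XPS 13 figure in perspective, the arithmetic is short enough to check directly. The Python below is just that calculation; the function name is my own.

```python
def relative_change_percent(baseline_hours, delta_minutes):
    """Express a change of `delta_minutes` as a percentage of `baseline_hours`."""
    return 100.0 * delta_minutes / (baseline_hours * 60.0)


# 7 minutes on a 15 hour (900 minute) battery life result.
change = relative_change_percent(baseline_hours=15.0, delta_minutes=7.0)
print(f"{change:.2f}% of total battery life")
```

Seven minutes out of 900 is under 1%, which is why a difference of this size sits inside the run-to-run variation of a battery test.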

Final Words

With Skylake, there was not the large leap in clock-for-clock performance that we have become accustomed to with new Intel microarchitectures, but when you look at the overall package, there was a decent net gain in performance combined with new technologies. For example, being able to maintain higher Turbo frequencies on multiple cores has increased the stock-to-stock performance more than the smaller IPC gains.

Speed Shift is just one small part of the overall performance gain, and one that we have not been able to look at until now. It does lead to some pretty big gains in task completion, if the workloads are bursty and short enough for it to make a difference. It can’t increase the absolute performance of the processor, but it can get it to maximum performance in a much shorter amount of time, as well as get it back down to idle quicker. Intel is billing it as improved responsiveness, and it’s pretty clear that they have achieved that.

The one missing link is operating system support. We’ve been told that the patch to enable this is coming to Windows 10 in November. While this short piece looks at what Speed Shift can bring to the table in terms of performance, if you'd like to read more about how it is implemented, please check out the Skylake architecture analysis which goes into more detail.

Update: Daniel Rubino at Windows Central has tested the latest Windows 10 Insider build 10586 and it appears to enable Speed Shift on his Surface Pro 4, which is in line with the November timeline we were provided.

54 Comments

"...due to the high voltage, this is the least efficient way to run a processor and wasteful in terms of energy used, which for mobile devices means a shorter battery life or thermal throttling."

Admittedly, it's a bit unrelated to the article, but it's somewhat annoying that mobile devices are built with cooling that's insufficient to sustain full load/maximum heat generating potential scenarios of modern processors without the CPU having to reduce its speed in order to avoid overheating. That's stupid. Sure the CPU is capable of responding to a situation where cooling isn't good enough, but I dislike the idea that some mobile systems are built so that thermal throttling happens pretty regularly under load. Sorry, that's just bad design.

No, it's optimizing for bursty loads and lightness; they could put a heavier heatsink in and either make the laptop significantly bigger or have a significantly smaller battery instead. Not that it would matter much since a sustained max CPU load would nuke your battery in an extremely short period of time.

If you go back and look at mobile clock speeds before and after large turbo modes were added, you'll see that its main effect was to allow bursty loads to run at higher speeds while the sustained speeds more or less stayed where they were. The net result was that, for the non-CPU bound workloads that most users have, a laptop could go from being much slower than a desktop to almost as fast as one.

Agree with DanNeely. That's why inherently you can't use laptops the same as desktops. They will throttle more. Explain that to the stupid IT Department in my company that offers exactly 2 models of laptops you can choose from. Both are Ultrabooks with 15 W CPUs. And I'm supposed to work with that crap while I would at least need something like an i7-6700K.

It depends. It is only "bad" design if you run a sustained load. But it is not bad design if you're doing quick ramp ups and downs such as many people do in mobile devices. Sacrificing the weight and size of the device to achieve the theoretical maximum and sustain it for hours is foolish on a mobile device. Yes you CAN have this, but at what cost? In mobility, weight and size matter as much as performance; to imply that sustained performance is the ONLY metric that matters is narrow minded to the application of mobility. The UX of a mobile device that is able to deliver performance in bursts instead of long periods of time will be superior to one that is bulkier and needs to dissipate more heat to achieve the same. Don't be fooled, design matters a lot and in the race for thinner, lighter devices, consumers will not really care about giving up a bit of performance to stay within thermal safety.

I think you missed his point. It's fine if the processor throttles because the TDP of the chip isn't high enough. What isn't fine is manufacturers combining 45W CPUs with cooling that can really only handle 25W.

Sustained performance does matter for some people, but the number of laptops you can buy with crappy cooling is large.

There are a lot of apologists for crappy laptop design. I agree that lackluster cooling is one of the biggest issues with laptops. If you have a gaming laptop you will game plugged in most of the time so battery life isn't as much of an issue. But when your whole system can only run at 80% speed due to being thermally throttled, then that is a real kick in the nuts.

I wish review sites would spend more time, or actually any time, measuring performance under sustained loads, because with gaming laptops that is the thing that matters most. I really couldn't care less what numbers are spit out on a 2 minute benchmark run when the laptop doesn't even have time to fully heat up.

This is why "Gaming Laptop" is an oxymoron and why I would never buy one. Besides, what is the point of a laptop that has to be plugged in all the time? If you want performance, don't buy a laptop! If you want mobility, buy a Laptop, but don't expect performance!

You guys are confusing categories and falling for marketing tricks; regular laptops are in no way designed to be used under sustained load circumstances. "Gaming laptops" that don't just use the marketing terms such as "...play games on the go powered by X." actually have sufficient cooling (although the margin for throttling is low which needs to be improved).

Most laptops are designed to be marketed to normal users and casual gaming means something different to the general consumer.

Also, more importantly to your point, some people need laptops because they are on the move but would like an actual gaming laptop since when they will be using it for gaming they will be plugged in. It isn't as simple as "get a desktop" unless you can figure out an easy and economical way to haul and use a desktop everywhere you go.

Most laptops I have used, whether gaming/engineering laptops or cheapo ones, can sustain the base clock speeds at 100% load on all cores. Some do have crappy cooling but most are OK. Be sure to test it out when you get a new laptop so you can return it. About gaming/engineering laptops, they have better battery life than you may think, if they have nVidia Optimus. The dedicated GPU can be switched off, which gives quite decent battery life.