March 21, 2009

Going parallel…

The so-called “many core shift” is happening. It’s not a thing of the future, it’s not “just around the corner”; it has already begun. And it will change our lives as developers.

Last week we held some customer events with talks about PDC, among other topics, and how it will affect the near-term future. Among other things I tried to describe the many core shift as well as its consequences. Curiously, the audience was largely aware of the facts, yet the consequences had so far escaped the vast majority. So let me try to repeat the gist and see whether this is a common symptom…

What’s the many core shift, anyway?

Moore’s Law states that the number of transistors on an integrated circuit doubles every two years. Until not long ago, accompanied by more complex designs and higher clock speeds, that meant faster CPUs. This was sometimes called the “free lunch for developers”, because if one happened to write a slow application (not that anyone ever would 😉), all he had to do was wait two years and it ran twice as fast.

However, this evolution has reached its physical limits (clock speed, power consumption, etc.). Yet the number of transistors still doubles…
So instead of building faster and more complex CPUs, the manufacturers started placing more CPUs, read cores, on a chip. It started in 2006 with Intel’s dual cores; today you won’t find a single core desktop machine anymore. High end consumer machines come with quad cores, and servers with 16 cores (delivered as 4 quad cores). Have a look at the extrapolation:

The “lower” line shows cores on a CPU, starting in 2006 with 2 cores and doubling every two years, while the steeper one assumes 4 CPU sockets. And just in case the conclusion escaped you: Five years from now we will have between 32 and 128 cores. And remember, we are talking “consumer grade stuff”, that is the box under your desk, not something special! Impressive?
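The extrapolation can be reproduced with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, and the doubling period of two years is the assumption taken from Moore’s Law above, applied to cores instead of transistors.

```python
# Sketch of the extrapolation: 2 cores per CPU in 2006, doubling every two years.
# Function and constant names are hypothetical; the numbers come from the text.
START_YEAR = 2006
START_CORES = 2
DOUBLING_PERIOD_YEARS = 2
SOCKETS = 4  # the "steeper" line assumes a 4-socket machine


def cores_per_cpu(year: int) -> int:
    """Cores on a single CPU under the doubling assumption."""
    if year < START_YEAR:
        raise ValueError("the extrapolation starts in 2006")
    doublings = (year - START_YEAR) // DOUBLING_PERIOD_YEARS
    return START_CORES * 2 ** doublings


def cores_per_machine(year: int, sockets: int = SOCKETS) -> int:
    """Cores in a machine with the given number of CPU sockets."""
    return cores_per_cpu(year) * sockets


for year in range(2006, 2015, 2):
    print(year, cores_per_cpu(year), cores_per_machine(year))
```

For 2014, five years after this post, the loop yields 32 cores per CPU and 128 cores for four sockets, which is where the “between 32 and 128” comes from.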

So that’s the many core shift. But what does it mean?

Well, it probably means that today’s software runs a bit faster. Not much, mind you, certainly not the 32 times faster a 64 core machine is supposed to be compared to my dual core. Why is that? Well, have a look at the following task manager of a 64 core machine:

(That’s a fake of course, but have a look at Mark’s blog for a real one.)

Now look at your own desktop and count the open applications. Outlook, Word, perhaps PowerPoint, Internet Explorer, Acrobat Reader? OK, say half a dozen applications, add 10 more for the OS stuff actually doing something. That’s 16 applications, using the upper row of cores, perhaps even to 100% and yearning for more, while the other 3 rows just sit there and twiddle their thumbs. The sad truth is: Most of today’s applications simply are not capable of employing these cores appropriately. Consequence: In order to leverage these cores we have to change the way we write our software!
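The thumb-twiddling can be put into numbers. The following is a deliberately simplistic sketch (hypothetical function names, and the assumption that every application is single-threaded and keeps exactly one core busy):

```python
# Simplistic model: each single-threaded application can keep at most one core busy.
# Names are hypothetical; the figures (16 applications, 64 cores) come from the text.
def busy_cores(single_threaded_apps: int, cores: int) -> int:
    """Number of cores that have any work at all."""
    return min(single_threaded_apps, cores)


def utilization(single_threaded_apps: int, cores: int) -> float:
    """Fraction of the machine that is actually doing something."""
    return busy_cores(single_threaded_apps, cores) / cores


print(utilization(16, 64))  # 64 core machine: one row of four is busy
print(utilization(16, 2))   # dual core: both cores are fully loaded
```

In this model the 64 core machine never gets beyond 25% load, no matter how much each of the 16 applications yearns for more.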

Two questions come to mind: Do we actually need that kind of processing power? And if so, how do we open it up?

Seriously, does the average user need 64 cores?

Well, yes he does. If for no other reason than this: he did need the increase in processing power over the last years, so why should that change?

So, what does he need it for? Gamers are always at the forefront of processing power demand. We have the generally increasing demand in UI technology: 3D, animations, visual effects are becoming mainstream with Windows, WPF and Silverlight. The trend toward digital photography has had its effect on the demand for graphics software. On my dual core, DXO needs about 1 hour per 150 pictures, so there’s certainly room for improvement (I brought 2500 pics from my last vacation in Tanzania. Do the math 😉). Background encryption, compression, virus scanning, etc. also add up.

Even if you are an “ordinary business user”: Word just sits and waits for your input most of the time? Well, open a non-trivial 100 page document and see how long Word takes for pagination, spell checking, or updating the TOC. Change the formatting and watch again. So while Word mostly does nothing exciting, there are “burst times” when it could really use those cores.

And I did not even mention Visual Studio and the compiler yet…

How do I, Joe Developer, put 64 cores to good use? And how do I make sure the app doesn’t degrade on an old dual core beyond reasonable limits?

Here we are right at the center of the problem: Multithreading is not exactly something new, we’ve had that for more than 20 years on PCs now. So why do I even have to ask that question? It’s because we didn’t actually use multithreading within our applications if we didn’t have to. Because it’s laborious, error prone, awkward. You have to deal with thread synchronization, race conditions, deadlocks, error management, communication between threads. You can’t debug it, tracing doesn’t help very much either. In short: It’s a pain in the …, well.
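To give a feeling for the kind of ceremony involved, here is a minimal sketch of the simplest possible case, a counter shared between threads. It is written in Python purely for illustration (the function name is made up); the same pattern applies to any language with threads and locks.

```python
import threading


def count_with_lock(num_threads: int, increments_per_thread: int) -> int:
    """Increment one shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal counter
        for _ in range(increments_per_thread):
            # "counter += 1" is a read-modify-write sequence. Without the lock,
            # two threads could read the same old value and one update would be
            # lost: a classic race condition.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # wait until every worker has finished
    return counter


print(count_with_lock(4, 10_000))  # 40000
```

And that is just a counter. Forget the lock in one place and the result is wrong only sometimes, which is exactly what makes these bugs so hard to track down.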

So let’s face it: Most developers have avoided multithreading altogether (perhaps the lucky ones). And those who did do multithreading probably did it just for optimizations in very distinct areas.

But what we need to leverage those cores is quite the opposite: We need multithreading to become mainstream, kind of ubiquitous. For that it needs to be easier to employ parallelism. Complexity has to be pushed out of our code into the platform. Much like nobody thinks about virtual memory any longer (while some of us are old enough to remember the days of physical addressing).

In other words: In order to deal with the parallelization demands, we need new patterns, libraries, and tools.

Microsoft is going to give us a first delivery on that with the upcoming Visual Studio 2010 and .NET wave: optimized runtimes (e.g. the thread pool), better tools (e.g. the debugger) and, not least, new libraries introducing new patterns (e.g. the Task Parallel Library). There’s more in the unmanaged world (e.g. the Parallel Patterns Library), more on the server side (CCR), more on the language front (F#), more in the research area (transactional memory).
Microsoft even devoted a whole “developer center” to parallel computing (look there for more details). And quite rightly so, because there is no single solution to parallelization: it comes in different flavors (e.g. data parallelism vs. task based parallelism), and we can expect further developments in this area in the future.
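To illustrate the two flavors, here is a small sketch. Note the assumptions: it uses Python’s standard `concurrent.futures` module, not the Task Parallel Library, and the helper functions are made up for the example. It only shows the shape of the patterns, with the thread handling pushed into the library.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def square(n: int) -> int:
    return n * n


def word_count(text: str) -> int:
    return len(text.split())


def run_data_parallel(numbers):
    """Data parallelism: the same operation applied to many items."""
    # Size the pool by the machine: 2 workers on a dual core, 64 on a 64 core box.
    workers = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(square, numbers))  # results keep the input order


def run_task_parallel(numbers, text):
    """Task based parallelism: different, independent jobs running side by side."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        total_future = pool.submit(sum, numbers)
        words_future = pool.submit(word_count, text)
        return total_future.result(), words_future.result()


print(run_data_parallel(range(8)))
print(run_task_parallel([1, 2, 3, 4], "going parallel is here to stay"))
```

One caveat for this particular sketch: in CPython, threads do not speed up CPU-bound work, so for real number crunching one would likely swap in `ProcessPoolExecutor`. The point here is that the calling code contains no threads, no locks and no hard-coded core count.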

It’s also noteworthy that the OSes, namely Windows Server 2008 R2 and Windows 7, which share the same kernel, can manage up to 256 logical processors. Compared to what they supported before, this is quite a jump.

Conclusion

So, parallelization is here to stay and we are going to have to deal with it. If anything, the trend is going to accelerate. It’s reasonable to assume that eventually processor manufacturers will trade single core performance for number of cores, i.e. put more but less capable cores on a chip, in order to reduce power consumption (green IT and mobility being two other major trends).

Looking even further, the many core shift may reach a break-even point where standard desktop systems cease to profit from additional cores (how parallel can you become after all?), and the problems of memory access may limit the number of cores. Asymmetric multi core designs may evolve, e.g. having cores optimized for certain tasks…

Good article, AJ. Desktop applications today definitely don’t seem to need that much processing power, but I am sure that with the comfort of having that many cores, the entire application design pattern would change.

That’s the core sentence for me. In my opinion the mainstream programming languages are currently not capable of handling parallelism in the way that is needed to make full use of the many cores of current and future systems. I don’t know whether the problem can be handled by language extensions and new parallel programming patterns alone, or whether a completely new language has to come along, one which handles parallelism at its core and becomes the new mainstream language. What’s your opinion on that? Extensions, libraries etc. can only be an interim solution in my opinion, and things have to evolve beyond that…

To bring up another aspect, much more parallelization could be done automatically by the system (at the compiler and/or operating system level). In a language built on immutable types and functional programming, as in F#, automatic parallelism would integrate nicely because the code has no side effects. In imperative languages like C#, we’ve got a problem. Code has side effects and thus compilers can’t detect whether anything could be parallelized. However, if the language were richer, this would be possible. If C# had the capability to express code behavior (Does a method have side effects? Is data immutable? …), the compiler could rely on that and perhaps parallelize portions of code. I’m interested to see if this is made possible in a future C#. Code Contracts are evolving in this direction and the developer team is looking to include parallelism use cases. I hope that they’ll do their homework well.

The following video shows applications that already take advantage of multi-cores. They were running on a dual-core, with computer vision running on one core, while 3D physics and logic were running on the other. The 3D graphics run mostly on the GPU.

The applications were developed in C#/.NET 3.5, on top of unmanaged engines like OpenCV, Ogre3D and ODE.

It’s worth noting that this is not just about desktops and workstations. Apple’s next iPhone has a processor that is rumored to scale from 2-16 cores. Their next dev SDKs abstract the parallelism for you so you don’t have to worry about it.

I like the way .NET PLINQ (built on the Task Parallel Library) is solving it: it simply lets you explicitly join the many core party with one additional extension method call.

You say “Our profession certainly remains interesting”. That’s the point.
Developers should love this multicore shift, because it means a lot of work to do. 🙂
New systems, new designs, new software. Do you remember the 32-bit revolution, the shift from DOS to Windows, the shift to the web? Now it’s time to think in parallel. Work, work and more work. That’s good news.
However, managers must be prepared to understand this work… That’s the big problem…