Wednesday, January 18, 2012

I wrote this post about a year ago, left it in my drafts folder and forgot about it. I'm not sure why. It's quite good. The timestamp in my drafts folder says: 2/20/11.

Mozilla, the makers of the Firefox web browser, want to release four versions per year. I find it funny that most of the discussion about this in the Ars Technica comments is about what the version numbers should be, as if this were the most important aspect of the decision.

Software development revolves around releases. A release is a software version that is available for purchase or download (i.e., released to the public). Software manufacturers will then add new features to this old software and release a new version later. A software version number tells you which release of the software you are using.

By convention, the first release is called 1.0. The next major release is called 2.0. If there have been only bug fixes, the version would be something like 1.0.1. If they're fixing bugs and introducing a few minor features, the release might be 1.1. If they're introducing lots of minor features, they might skip a few numbers and go straight to 1.5. The result of all this is that you can use the version number to make an educated guess as to how much has changed in the new version.
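To make the "educated guess" concrete, here is a minimal sketch in Python. The function names (`parse_version`, `guess_change`) and the rule that a jump of more than one minor number means "lots of minor features" are my own assumptions for illustration; real projects are rarely this consistent.

```python
def parse_version(text):
    """Turn '1.0.1' into (1, 0, 1), padding missing parts with zeros."""
    parts = [int(p) for p in text.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def guess_change(old, new):
    """Guess how much changed between two releases from the numbers alone."""
    old_major, old_minor, old_patch = parse_version(old)
    new_major, new_minor, new_patch = parse_version(new)
    if new_major != old_major:
        return "major release"
    if new_minor != old_minor:
        # Assumption: skipping numbers (1.0 -> 1.5) signals a bigger batch.
        jump = new_minor - old_minor
        return "lots of minor features" if jump > 1 else "a few minor features"
    if new_patch != old_patch:
        return "bug fixes only"
    return "same release"


print(guess_change("1.0", "1.0.1"))  # bug fixes only
print(guess_change("1.0", "1.1"))    # a few minor features
print(guess_change("1.0", "1.5"))    # lots of minor features
print(guess_change("1.5", "2.0"))    # major release
```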

All this version numbering is based on the assumption that you actually have major releases and minor releases. Some software is developed continuously and released on a fixed schedule. Every four months or so there might be a new release which contains whatever is finished at the time. Some releases might have only bug fixes, but others may have major new features. If you try to fit this into a typical version number scheme, it becomes difficult to decide which releases get a new major version number and which require only a minor version number change.

Google's Chrome web browser is a good example of software development on a schedule. They have effectively abandoned major and minor version numbers. Google increments the major version number by one for (almost) every release, irrespective of what new stuff it contains. We used a similar approach when we were developing Myster. Releasing on a schedule is a natural fit for the development team but can confuse those who expect a more traditional numbering scheme. It's confusing because the version number seems to lie; there aren't always major new features despite the major version number change!

Mozilla has said they will be putting Firefox on a scheduled release system with four releases per year. Up until now Firefox has been using the more traditional release schedule and version numbering scheme. The new release schedule basically mandates that the major version number is incremented by one. Simply increasing the major version number by one with each release is so much easier than having quarterly arguments as to what the version number should be based on what happens to be in that particular release.

In these days of automatic updates, version numbers are practically irrelevant. What matters is whether you're up to date or not. I would like Mozilla to downplay Firefox's version number if they are releasing on a schedule. If they focus the conversation on the capabilities of the software and less on what its version number is, most other concerns will fall into the background as people adapt to the new reality.

Well, the good news is that my main point about critical sections hurting performance when you scale up to multiple processors is still valid... but all my graphs are wrong. *heavy sigh*. Since posting those graphs, I've been trying to find a moment to write a follow-up posting. What really drove home the importance of this was noticing that just about every concurrency-related session I attended at JavaOne this year mentioned Amdahl's law in relation to critical sections. Doh! It looks like I am in good company in misapplying the law.

Amdahl's law applies to systems that can be split up into multiple tasks. Each one of these tasks is dependent on the completion of the previous task. Some of these tasks are 100% parallelizable; others are not.

If we took this system and ran it on a machine with 3 processors, the amount of time it would take would be the amount of time for part 1 plus the amount of time for part 2 divided by the three processors:

In this example, on a single processor system, part 2 takes three times longer to compute than part 1. So with three processors, part 2 runs 3 times faster. Part 1 stays the same speed because it can't be parallelized. The net result is that we spend half of our time in part 1 and half of our time in part 2. It also means that overall, with 3 CPUs, this task will take half the time it would if we ran it on one CPU.
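The arithmetic for this example can be checked with a few lines of Python. This is only a sketch: the function names are mine, and the time units (1 for part 1, 3 for part 2) are arbitrary values chosen to match the 1:3 ratio described above.

```python
def run_time(serial_time, parallel_time, processors):
    """Time for a serial part followed by a perfectly parallel part."""
    return serial_time + parallel_time / processors


def speedup(serial_time, parallel_time, processors):
    """How many times faster the job is than on a single processor."""
    one_cpu = run_time(serial_time, parallel_time, 1)
    return one_cpu / run_time(serial_time, parallel_time, processors)


# Part 1 takes 1 unit, part 2 takes 3 units on a single processor.
print(run_time(1, 3, 1))  # 4.0
print(run_time(1, 3, 3))  # 2.0
print(speedup(1, 3, 3))   # 2.0
```

With three processors, part 1 and part 2 each take 1 unit, so the job takes 2 units instead of 4.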

A real-world example of this sort of situation would be painting a house. The first step would be going to get the paint from the store. It doesn't matter how many people you have; it always takes the same amount of time to get the paint. The second step would be painting the house. Painting the house is something that can be split up amongst many people and so is parallelizable. The key thing here is that you still need to wait until you have the paint before painting can begin, and driving out to get the paint doesn't benefit from multiple people and so is not parallelizable.

Critical sections, on the other hand, are parts of a program that can only have one thread executing them at a time. A real-world example would be having your friends help you move. Having multiple people can speed up moving house because every person can carry a different box out to the moving van. The thing is, two or more people can't fit through the front door at the same time. The door acts like a critical section, only allowing one thread (or person) to use it at a time.
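Here is a rough sketch of the front door as a critical section, using Python's standard `threading.Lock`. The names (`door`, `carry_boxes`, the counters) are made up for the illustration, and the box counts are arbitrary.

```python
import threading

door = threading.Lock()   # the front door: one person at a time
in_doorway = 0            # how many movers are in the doorway right now
max_in_doorway = 0        # the most we ever saw in the doorway at once
boxes_in_van = 0


def carry_boxes(count):
    """One mover carrying `count` boxes out to the van."""
    global in_doorway, max_in_doorway, boxes_in_van
    for _ in range(count):
        # Picking up a box can happen in parallel with the other movers.
        with door:  # critical section: only one thread gets past this line
            in_doorway += 1
            max_in_doorway = max(max_in_doorway, in_doorway)
            boxes_in_van += 1
            in_doorway -= 1


movers = [threading.Thread(target=carry_boxes, args=(100,)) for _ in range(3)]
for mover in movers:
    mover.start()
for mover in movers:
    mover.join()

print(boxes_in_van)    # 300
print(max_in_doorway)  # 1
```

Because every update to the counters happens while holding `door`, no box is lost and there is never more than one mover in the doorway.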

This task contains a tiny piece that can't be done by multiple processes at the same time. In this example that same task will be executed 3 times on three different processors:

In the diagram, the tasks on CPU 2 and CPU 3 started a bit late for some unimportant reason. However, when the process running on CPU 2 gets to the critical section, it doesn't need to wait because CPU 1 is already finished with the critical section. Unfortunately, the process running on CPU 3 does have to wait for CPU 2 to finish using the critical section. That's bad luck for the process running on CPU 3.
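The scenario can be replayed with a small simulation. The diagram isn't reproduced here, so the arrival times and the critical section length below are invented numbers chosen only to match the description: CPU 2 arrives after CPU 1 has left, and CPU 3 arrives while CPU 2 is still inside.

```python
def simulate(arrivals, critical_length):
    """Return how long each process waits to enter the critical section.

    arrivals: list of (name, time the process reaches the critical section).
    Processes are let in one at a time, in order of arrival.
    """
    free_at = 0.0  # time at which the critical section next becomes free
    waits = {}
    for name, arrival in sorted(arrivals, key=lambda item: item[1]):
        enter = max(arrival, free_at)
        waits[name] = enter - arrival
        free_at = enter + critical_length
    return waits


# Hypothetical timings: the critical section takes 1 time unit.
arrivals = [("CPU 1", 4.0), ("CPU 2", 6.0), ("CPU 3", 6.5)]
print(simulate(arrivals, 1.0))
# {'CPU 1': 0.0, 'CPU 2': 0.0, 'CPU 3': 0.5}
```

Shift CPU 3's arrival to 7.0 and nobody waits at all, which is why the cost of a critical section depends on timing rather than being fixed.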

Critical sections are different from the situation described by Amdahl's law because the system might run completely in parallel; two processes having to wait for each other is an unlucky occurrence. This means that, for a single run, we have no idea how much time the program will take because it depends on how often two or more threads hit the same critical section at the same time.

Below is a graph of the factor speedup vs number of cores. This is for critical sections:

Notice how the impact of the critical section is small up until the point when it suddenly starts to drastically limit the amount of speedup. Conceptually speaking, this is the point where there's almost always a thread in the critical section... Or, as the paper puts it, the execution time "is determined by the serialized execution of all the contending critical sections".
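One way to see where that knee comes from is a back-of-the-envelope bound. This is my own simplification, not the model from the paper: assume each core runs one copy of the same task and a fixed fraction of each task is spent in one shared critical section. The run can't finish faster than a single task, and it can't finish faster than all the critical sections executed back to back, so the speedup is at most the smaller of the core count and one over the critical fraction.

```python
def best_case_speedup(cores, critical_fraction):
    """Upper bound on speedup when `cores` copies of a task share one lock.

    Assumes perfect luck everywhere except that the critical sections
    can never overlap. Real contention can only make things worse.
    """
    return min(cores, 1 / critical_fraction)


# Hypothetical workload: 5% of each task is inside the critical section.
for cores in (1, 2, 4, 8, 16, 32, 64):
    print(cores, best_case_speedup(cores, 0.05))
```

Up to 16 cores the bound is just the core count; past 20 cores it is stuck at about 20 no matter how many cores are added.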

For comparison, this is the graph for Amdahl's law:

In contrast to critical sections, the system starts to show inefficiencies much sooner but degrades much more gradually.
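For anyone who wants to reproduce the Amdahl's law numbers, here is a short sketch using the standard formula, speedup = 1 / (serial fraction + parallel fraction / cores). Efficiency here means speedup divided by the number of cores. The 95% parallel fraction and the list of core counts are example values I picked, not necessarily the ones behind the graphs.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup predicted by Amdahl's law for a given parallel fraction."""
    serial_fraction = 1 - parallel_fraction
    return 1 / (serial_fraction + parallel_fraction / cores)


def efficiency(parallel_fraction, cores):
    """Speedup per core: 1.0 means every core is fully useful."""
    return amdahl_speedup(parallel_fraction, cores) / cores


# Example: a program that is 95% parallelizable.
for cores in (1, 2, 4, 8, 16, 32, 64, 100):
    s = amdahl_speedup(0.95, cores)
    e = efficiency(0.95, cores)
    print(f"{cores:3d} cores: speedup {s:5.2f}, efficiency {e:6.1%}")
```

The efficiency starts dropping as soon as a second core is added, but the speedup keeps creeping up toward its limit of 20 rather than hitting a wall.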

Here's the same data as above but in terms of % efficiency as the number of cores increase. First critical sections:

Now for Amdahl's law:

Here's what the graphs look like for up to 100 cores. First critical sections:

Now for Amdahl's law:

Here's a direct comparison between Amdahl's law and critical sections at 95% parallelizable up to 100%. The red line represents critical sections.