Cameron and Tracey Hughes

Dr. Dobb's Bloggers

Ignore Forced Partitioning, Become a Ghost Hunter

October 24, 2010

It's one thing to develop apps that leverage multicores. We can select compatible toolsets, class libraries, function libraries, and paradigms, and, if we're careful, end up with a nice clean architecture. We can also leverage formal languages and models to ensure that our apps are correct, reliable, and robust, containing no race conditions and no opportunities for deadly embrace.
But, as most of us recall, multithreading libraries and multiprocessing techniques were available before multicore computers became so pervasive. One consequence of multithreading capabilities arriving on the scene first is that many software providers and library providers used multithreading techniques to improve the performance of their products. So the providers of database functionality used multithreading as they saw fit. The providers of communications libraries used multithreading too, but perhaps a little differently, and with different libraries, than the database provider.

The GUI libraries have their own take on how multithreading and multiprocessing should be implemented, and so they put their paradigms into place. So we may have applications out there that use a multithreading scheme that was developed, tested, and deployed on single-core computers. We may also have integrated one or more of these libraries or components into our organization's applications. But what happens when someone gets the bright idea to optimize our legacy applications to take advantage of the new multicore world? Many developers are learning that multithreaded applications that worked fine somehow mysteriously break when introduced into an environment where there really are multiple processors at the hardware level. Not to point any fingers here; of course the developers tested the software on the single processors they had. How could they know that physically having 32 or 64 processors, as opposed to simulating 32 or 64 processors, might make a difference with respect to timing, mutexes, semaphores, broadcasts, and data races? So we give them an ecumenical pass.

But now we're stuck, so to speak, with some of these components because they are critical parts of existing (expensively developed) systems, and they have their own paradigms for multithreading and multiprocessing that, for the most part, can't be changed. These components have to be thoroughly inventoried and understood if we are going to engage in the dirty business of retrofitting existing or legacy applications from a single-core world to a multicore world. They will have a dramatic effect on your initial partitioning.
Partitioning helps to identify which components, algorithms, functions, or procedures can or cannot be run in parallel. But in this case, partitioning can also identify which multithreading or parallel processing paradigms can be mixed, and if so how, and if not why. For example, Tracey and I do virtually all of our GUI work using the Qt library. In the Unix/Linux world, Qt rides on top of X Windows. When partitioning a Qt application into threads, one had better keep the event queue and anything that directly accesses it in the same thread (at least in the versions we're using) or there will be trouble.

So we cannot simply divide up the GUI component of the application however we see fit; we are constrained by some design choices that the Qt library designers made for us (not that this is necessarily a bad thing). In the partitioning process, the Qt library and how it already uses threads, mutexes, and semaphores must be considered when we are adding new multithreading or partitioning to our application.

If you recall from our previous forays into retrofitting: if you're looking to get speedup of legacy applications and systems by taking advantage of the new multicore computers, we first recommend that you optimize your application or system for single-core computers. Make sure that you're using the most appropriate approaches to file access and video access, and the most appropriate algorithms and data structures. Once that stuff is in place, you will next have to do a painstaking inventory of the libraries and components your system already uses that were built to take advantage of multithreading or multiprocessing in a single-processor environment. You have to identify these components and partition them in such a way that you can build new parallel processing components that can peacefully coexist with potentially hostile older multithreading approaches.

All we can say, my friends, is that if you don't identify, understand, and then account for these older models of multithreading, you will join the ranks of the ghost hunters. You will retrofit your legacy application and manage to get it working. But soon after your next hardware or software upgrade, after adding one more user, or after the database adds or deletes a few thousand records, you'll begin to experience spooky lockups, dropouts, slowdowns, etc. that simply won't make sense but do show up on your monitoring tools. We're only talking from the personal pain that we currently feel.
We were forced to move on with a design phase (contrary to our recommendations), and we did not give a thorough accounting of the ancient multithreading practices that were already spread out through the various libraries and components of the legacy system, practices that were originally designed and tested in a single-processor world.

So during our partitioning phase, we believed the new design was sufficiently partitioned, because we were told, convinced, and cajoled into believing that the existing libraries and foundational components were solid (fool me once, shame on you). In some cases you will have the natural and logical partitioning of the application or system drive the new retrofit design, but in other cases there will be some forced partitioning going on, especially if you are working with components that have their own threading libraries or older multithreading paradigms (developed and tested in a single-processor world). Identify those components and come up with wrappers for them, or a nice clean protocol to talk to them; either way, they must be dealt with at the design level. We just want to spare you the pain. (To be continued)

