Thursday, April 10, 2008

The massive parallel processing capability that multi-core architectures bring presents plenty of opportunities and challenges. The recent $20m investment by Microsoft and Intel in parallel computing research has created quite a buzz around this topic. A major challenge ahead is how well we can utilize multi-core architectures without rewriting applications, since most applications are not designed to exploit multiple cores to their fullest extent. Current multi-threaded applications leverage parallel processing to some extent, but their threads do not scale beyond a few cores. These applications won't run any slower on more cores, but they will run slower relative to new applications that can leverage the parallel architecture. The best way to seize the multi-core opportunity is to have a single-threaded application seamlessly utilize the potential of a multi-core architecture. This does require significant work to rewrite the algorithms and middleware, but this is essentially a long tail and has a significant business value proposition.

The very concept of a single-threaded application relying on concurrency at the algorithm level is going to challenge many developers, since it is a fundamentally different way of looking at the problem. Algorithm design approaches have been changing to make algorithms explicitly aware of the available cores, so that an algorithm can dispatch data to any available core without worrying about multi-threading issues such as deadlocks and locking, and let threads communicate with each other using asynchronous messages without sharing any common state. An algorithm always works better if it knows more about the data. If this were not the case, the brute-force algorithm would be the best one to solve any problem. The more you know about the data, the more you can fine-tune the algorithm, which helps you discover insights that tighten the algorithm further. Increasingly efficient processing capabilities could help deal with a large set of data early on, without investing too much time upfront in designing an algorithm to discard certain data. Adding abundant main memory to this mix has profound implications: algorithms that were originally designed to access data from a disk are no longer efficient now that the data they need is always available in main addressable memory, with different data structures and indexes. Cryptography algorithms have been designed to make sure that an attack cannot be completed in a reasonable amount of time even given plentiful resources. We should look at these algorithm design principles to do the opposite: how can we apply the reverse semantics in other domains to actually make use of massive parallel processing capabilities?
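To make the message-passing idea above a bit more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not a definitive implementation: the names `worker` and `parallel_sum` and the chunk size are hypothetical, and summing numbers stands in for whatever work the algorithm really does. Each worker receives its own copy of a chunk as a message and replies with a partial result, so the calling code never takes a lock on shared data. Note that in CPython, threads illustrate the structure rather than true multi-core speedup; the same pattern carries over to processes or to a language such as Erlang.

```python
import queue
import threading


def worker(inbox, outbox):
    """Process chunks received as messages; keep no state shared with other workers."""
    while True:
        msg = inbox.get()
        if msg is None:  # sentinel message: no more work for this worker
            break
        chunk_id, chunk = msg
        # Reply asynchronously with a partial result.
        outbox.put((chunk_id, sum(chunk)))


def parallel_sum(data, n_workers=4, chunk_size=1000):
    """Sum a list by dispatching chunks to whichever worker is free (hypothetical helper)."""
    inbox = queue.Queue()   # messages from the dispatcher to the workers
    outbox = queue.Queue()  # messages from the workers back to the dispatcher
    workers = [
        threading.Thread(target=worker, args=(inbox, outbox))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()

    # Dispatch: each message carries its own slice (a copy) of the data.
    n_chunks = 0
    for start in range(0, len(data), chunk_size):
        inbox.put((n_chunks, data[start:start + chunk_size]))
        n_chunks += 1

    # One sentinel per worker so that every worker shuts down.
    for _ in workers:
        inbox.put(None)

    # Collect partial results in whatever order they arrive.
    total = 0
    for _ in range(n_chunks):
        _, partial = outbox.get()
        total += partial

    for w in workers:
        w.join()
    return total
```

The application code that calls `parallel_sum(list(range(10000)))` stays single-threaded in style; the concurrency lives inside the algorithm, which is the shift in perspective described above.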

I foresee an innovation opportunity in functional languages such as Erlang and Haskell, or in new runtimes for current dynamic and object-oriented languages, to tap into this potential. Companies such as RapidMind, and Google's acquisition of PeakStream, also indicate growing interest in this area. First cheap storage, then massive parallel processing capabilities and main-memory computing, are going to change the computing dynamics and redefine many things in the coming years. We are ripe for change. The whole is greater than the sum of its parts. Yes, indeed.