Threads à gogo (*) is a native module for Node.js that provides an asynchronous, evented and/or continuation-passing-style API for moving blocking or long-running CPU-bound tasks out of Node's event loop and into JavaScript threads that run in parallel in the background, automatically using all the available CPU cores; all from within a single Node process.

git clone http://github.com/xk/node-threads-a-gogo.git
cd node-threads-a-gogo
node-gyp rebuild
# It also builds with node-waf, but node-waf is outdated, so please use node-gyp nowadays.

To include the module in your project:

var threads_a_gogo = require('threads_a_gogo');

You need a node with a v8 >= 3.2.4 to run this module. Any node >= 0.5.1 comes with a v8 >= 3.2.4.

The module runs fine, though, in any node >= 0.2.0 as long as you build it with a v8 >= 3.2.4. To do that, simply replace /node/deps/v8 with a newer version of v8 and recompile node. To get any version of node go to http://nodejs.org/dist/, and for v8 go to http://github.com/v8/v8, click on "branch", select the proper tag (>= 3.2.4), and download the .zip.

Both the event loop and its listeners and callbacks run sequentially in a single thread of execution, Node's main thread. If any of them ever blocks, nothing else happens for the duration of the block: no more events are handled, and no callbacks, listeners, timeouts, or nextTick()ed functions get the chance to run and do their job, because the blocked event loop can't call them; the program turns sluggish at best, or appears frozen and dead at worst.
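As a standalone sketch of that effect (not part of the module): a synchronous busy-loop keeps even a 0 ms timer from firing until it is done:

```javascript
// Standalone sketch: a synchronous busy-loop delays a 0 ms timer.
var start = Date.now();
var timerDelay = -1;

setTimeout(function () {
  timerDelay = Date.now() - start;
  console.log("timer fired after " + timerDelay + " ms instead of ~0 ms");
}, 0);

// Block the main thread for ~200 ms: no callback can run meanwhile.
var busyUntil = Date.now() + 200;
while (Date.now() < busyUntil) {}
```

The timer's callback only runs once the busy-loop returns control to the event loop, so it fires roughly 200 ms late.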

A.- Here's a program that makes Node's event loop spin freely and as fast as possible: it simply prints a dot to the console in each turn:

cat examples/quickIntro_loop.js

(function spinForever () {
  process.stdout.write(".");
  process.nextTick(spinForever);
})();

B.- Here's another program that adds to the one above a fibonacci(35) call in each turn, a CPU-bound task that takes quite a while to complete and that blocks the event loop making it spin slowly and clumsily. The point is simply to show that you can't put a job like that in the event loop because Node will stop performing properly when its event loop can't spin fast and freely due to a callback/listener/nextTick()ed function that's blocking.

cat examples/quickIntro_blocking.js

function fibo (n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

(function fiboLoop () {
  process.stdout.write(fibo(35).toString());
  process.nextTick(fiboLoop);
})();

(function spinForever () {
  process.stdout.write(".");
  process.nextTick(spinForever);
})();
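To get a sense of how long each of those calls holds the event loop hostage, you can time one (a hypothetical measurement; fibo(30) is used here instead of fibo(35) to keep the run short):

```javascript
// Hypothetical timing of one CPU-bound call, using the same fibo() as above.
function fibo (n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

var t0 = Date.now();
var result = fibo(30);            // smaller n than the example, still clearly blocking
var elapsed = Date.now() - t0;

console.log("fibo(30) = " + result + " in " + elapsed + " ms");
// For those `elapsed` ms, not a single callback, listener, or timeout could run.
```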

C.- The program below uses threads_a_gogo to run the fibonacci(35) calls in a background thread, so Node's event loop isn't blocked at all and can spin freely again at full speed:

cat examples/quickIntro_oneThread.js

function fibo (n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

function cb (err, data) {
  process.stdout.write(data);
  this.eval('fibo(35)', cb);
}

var thread = require('threads_a_gogo').create();
thread.eval(fibo).eval('fibo(35)', cb);

(function spinForever () {
  process.stdout.write(".");
  process.nextTick(spinForever);
})();

D.- This example is almost identical to the one above, only that it creates 5 threads instead of one, each running a fibonacci(35) in parallel and in parallel too with Node's event loop that keeps spinning happily at full speed in its own thread:

cat examples/quickIntro_fiveThreads.js

function fibo (n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

function cb (err, data) {
  process.stdout.write(" [" + this.id + "] " + data);
  this.eval('fibo(35)', cb);
}

var threads_a_gogo = require('threads_a_gogo');

threads_a_gogo.create().eval(fibo).eval('fibo(35)', cb);
threads_a_gogo.create().eval(fibo).eval('fibo(35)', cb);
threads_a_gogo.create().eval(fibo).eval('fibo(35)', cb);
threads_a_gogo.create().eval(fibo).eval('fibo(35)', cb);
threads_a_gogo.create().eval(fibo).eval('fibo(35)', cb);

(function spinForever () {
  process.stdout.write(".");
  process.nextTick(spinForever);
})();

E.- The next one asks threads_a_gogo to create a pool of 10 background threads, instead of creating them manually one by one:
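The code for this example didn't survive into this excerpt; the sketch below shows how it plausibly looks, assuming the module's createPool() helper (where .all evals in every thread of the pool and .any evals in whichever thread is idle). It needs the compiled native module to run:

```javascript
// Sketch (assumed createPool() API): a pool of 10 threads, each preloaded
// with fibo(), pulling fibo(35) jobs as they become idle.
function fibo (n) {
  return n > 1 ? fibo(n - 1) + fibo(n - 2) : 1;
}

function cb (err, data) {
  process.stdout.write(" [" + this.id + "] " + data);
  this.eval('fibo(35)', cb);             // queue another job on this thread
}

var threadPool = require('threads_a_gogo').createPool(10);
threadPool.all.eval(fibo);               // define fibo() in every thread
threadPool.any.eval('fibo(35)', cb);     // run it in any idle thread

(function spinForever () {
  process.stdout.write(".");
  process.nextTick(spinForever);
})();
```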

Node's event loop can spin as fast and as smoothly as a turbo, and, roughly speaking, the faster it spins, the more power it delivers. That's why @ryah took great care to ensure that no (possibly slow) I/O operation could ever block it: a pool of background threads (thanks to Marc Lehmann's libeio library) handles any blocking I/O calls in the background, in parallel.

In Node it's verboten to write a server like this:

http.createServer(function (req, res) {
  res.end(fs.readFileSync(path));
}).listen(port);

Because synchronous I/O calls block the turbo, and without proper boost, Node.js begins to stutter and behave clumsily. To avoid that, there's the asynchronous version of .readFile(), in continuation passing style, that takes a callback:

fs.readFile(path, function cb (err, data) { /* ... */ });

It's cool, we love it (*), and there are hundreds of ad hoc built-in functions like this in Node to help us deal with almost any variety of possibly slow, blocking I/O.

Threads (kernel threads) are very interesting creatures. They provide:

1.- Parallelism: All the threads run in parallel. On a single-core processor, the CPU switches rapidly back and forth among the threads, providing the illusion that they are running in parallel, albeit on a slower CPU than the real one. With 10 compute-bound threads in a process, the threads would appear to be running in parallel, each one on a CPU with 1/10th the speed of the real CPU. On a multi-core processor, threads truly run in parallel, and get time-sliced when the number of threads exceeds the number of cores. So with 12 compute-bound threads on a quad-core processor, each thread will appear to run at 1/3rd of the nominal core speed.

2.- Fairness: No thread is more important than another: cores and CPU slices are fairly distributed among threads by the OS scheduler.

3.- Threads fully exploit all the available CPU resources in your system. On a loaded system running many tasks in many threads, the more cores there are, the faster the threads will complete. Automatically.

4.- The threads of a process share exactly the same address space, that of the process they belong to. Every thread can access every memory address within the process' address space. This is a very appropriate setup when the threads are actually part of the same job and are actively and closely cooperating with each other. Passing a reference to a chunk of data via a pointer is many orders of magnitude faster than transferring a copy of the data via IPC.