Parallel.js

Easy multi-core processing with JavaScript

Parallel.js is a tiny library for multi-core processing in JavaScript. It was created to take full advantage of the ever-maturing Web Workers API. JavaScript is fast, no doubt, but it lacks the parallel computing capabilities of its peer languages due to its single-threaded computing model. In a world where the number of cores on a CPU is increasing faster than the speed of the cores themselves, isn't it a shame that we can't take advantage of this raw parallelism?

Parallel.js solves that problem by giving you high level access to multi-core processing using web workers. It runs on node and in your browser.

Usage

Parallel(data, opts)

This is the constructor. Use it to create new parallel jobs. The constructor takes the data you want to operate on. This data will be held in memory until your job finishes, and can be accessed via the .data attribute of your job.

The object returned by the Parallel constructor is meant to be chained, so you can produce a chain of operations on the provided data.

Arguments

data

This is the data you wish to operate on. It will often be an array, but the only restriction is that the values must be serializable as JSON.

options (optional): Some options for your job

evalPath (optional): The path to the file eval.js. This is required when running in node, and when using require with files in browser environments (to work around cross-domain restrictions for web workers in IE 10). Defaults to the same location as parallel.js in node environments, and to null in the browser.

maxWorkers (optional): The maximum number of permitted worker threads. Defaults to 4, or to the number of CPUs on your machine if you're running node.

synchronous (optional): Whether to fall back to synchronous processing using setTimeout when web workers are not available. Defaults to true.

Examples

spawn

This function spawns a new task on a worker thread. Pass it the function you want to run. Your function receives one argument, the current data, and the value it returns replaces the current data.

Arguments

fn

A function to execute on a worker thread. Receives the wrapped data as an argument. The value returned will be assigned to the wrapped data.

This example reverses the letters in the string `'forwards'`. First, we construct a new Parallel job, passing in the argument `'forwards'`. We then spawn a job, passing in an anonymous function. This function receives whatever the currently stored data is, and returns what we want it to become. Finally, we call `then` to log the result when we're finished.

map

map applies the supplied function to every element of the wrapped data. Parallel will spawn one worker per array element, up to the maxWorkers limit. The values returned are stored for further processing.

map takes one required argument.

Arguments

fn

A function to apply. Receives the wrapped data as an argument. The value returned will be assigned to the wrapped data.

We start by creating a new Parallel job, this time passing in a sequence of numbers. We then define the Fibonacci function. Make sure your function is named, so it can be serialized properly. This only matters if the function refers to itself by name, which ours does, since it's recursive. Alternatively, we could share the function with the workers using require.

We then call map, which automagically spawns one worker per item in our list, unless we've specified a maximum number of workers. When the job completes, we log the first 7 Fibonacci numbers.

reduce

reduce applies an operation to every member of the wrapped data and returns the scalar value produced by that operation. Use it to combine the results of a map operation, for example by summing numbers. It takes a reducing function, which receives a single argument: a two-element array containing the stored value and the current element.

reduce takes one required argument.

Arguments

fn

A function to apply. Receives a two-element array containing the stored value and the current element. The value returned becomes the stored value for the next iteration; when iteration completes, it is assigned to the current data.

We start by creating a new Parallel job, again, passing in a sequence of numbers. We then define the add function, which will be used to reduce our values. Then we define the factorial function, and use the require method to share it with all workers.

We then construct a job pipeline: a map operation computes the series value at each index, and the resulting data is passed to our reduce operation, which sums the values in the list. The sum of this series converges to e^10 as its length approaches infinity.

require

require is used to share code with your workers. It can import libraries and functions into your worker threads.

require takes any number of arguments, either functions or strings. If the argument is a function it will be converted into a string and included in your worker.

Important: If you pass functions into require they must be named functions. Anonymous functions will not work. If you wish to pass anonymous functions, you may do so by declaring them with an object literal of the form { fn: myAnonFn, name: 'myAnonFn' }.

Important: browser security restrictions prevent loading files over the file protocol, so you will need to run an http server in order to load local files.

Personally, I like the npm package http-server. It can be installed and run pretty easily:

$ npm install http-server -g
$ cd myproject
$ http-server .

Passing environment to functions

You can pass data to the worker threads that will be global within each worker. This data will be available in every called function under the global.env namespace. The namespace can be configured by passing the envNamespace option to the Parallel constructor, and the data itself should be provided as the env option to the Parallel constructor.