"If the only tool you have is a hammer, you tend to see every problem as a nail."
Abraham Maslow

Sunday, September 6, 2009

Top gear(man)

Gearman is an open source project providing a flexible and universal framework for writing distributed applications. It differs from similar projects in its ease of use and in the number of programming language bindings it provides: C, C++, Java, Perl, PHP and Python. Gearman also has a simple command line client that allows you to start jobs from any language you want - all you need to do is provide the client with input data and then fetch the client's output. The Gearman API is very simple and consistent, and it makes writing distributed applications really easy, quick and fun.

The Gearman architecture is equally simple: it consists of job servers that accept task requests from clients, forward them to workers, and send the results back to the clients. Each worker can be connected to many job servers, and a client can choose which job server to use - this way there is no single point of failure that could bring down the whole cluster. Job servers have their own queues, and in case of a worker failure they can reassign tasks to other workers. According to High Scalability, Gearman has been successfully used by LiveJournal, Yahoo!, and Digg (which claims to run 300,000 jobs a day through Gearman without any issues).
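To make the reassignment idea concrete, here is a toy in-memory model of a job server that hands a queued job to the next worker when one fails. It is only a sketch: ToyJobServer, broken_worker and healthy_worker are hypothetical names, and nothing here uses the real Gearman protocol or API.

```python
from collections import deque


class ToyJobServer:
    """In-memory stand-in for a job server (hypothetical, not Gearman's API)."""

    def __init__(self, workers):
        # Each worker is a plain callable: payload -> result.
        self.workers = list(workers)
        self.queue = deque()

    def submit(self, payload):
        """Queue a job, as a client would."""
        self.queue.append(payload)

    def run(self):
        """Process every queued job, retrying on the next worker after a failure."""
        results = []
        while self.queue:
            payload = self.queue.popleft()
            for worker in self.workers:
                try:
                    results.append(worker(payload))
                    break
                except RuntimeError:
                    # This worker failed: reassign the same job to the next one.
                    continue
            else:
                raise RuntimeError("no worker could complete job %r" % (payload,))
        return results


def broken_worker(payload):
    # Simulates a worker that always crashes.
    raise RuntimeError("worker crashed")


def healthy_worker(payload):
    return payload * 2


server = ToyJobServer([broken_worker, healthy_worker])
for n in (1, 2, 3):
    server.submit(n)
results = server.run()
print(results)
```

A real job server does this over the network and tracks worker connections, but the queue-and-retry shape is the same.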

I decided to try out Gearman at home, and I must say that it was a really pleasant experience. I wrote a simple C++ worker and an even simpler Python client. The worker recursively computes the Fibonacci number for a given n.
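As a rough illustration of that recursion, here is a minimal Python sketch. It is not the actual worker code (which is C++), and it assumes the usual convention fib(0) = 0, fib(1) = 1.

```python
def fib(n):
    """Return the n-th Fibonacci number by naive recursion (fib(0) = 0, fib(1) = 1)."""
    if n < 2:
        return n
    # Two recursive calls per step: exponential time, which makes it a handy CPU-bound job.
    return fib(n - 1) + fib(n - 2)


print(fib(10))
```

The naive version is deliberately slow, which is exactly what makes it useful for loading a cluster.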

You can download the source code of both the worker and the client here. After you compile and install Gearman with the traditional:

./configure
make
sudo make install
sudo ldconfig

Install the Python extension with:

easy_install gearman

And compile the C++ example with:

make

Then you can run a job server as a daemon:

gearmand -d -L 127.0.0.1

or in debug mode:

gearmand -vv -L 127.0.0.1

Next, run a couple of Gearman workers:

./GearmanWorker -h 127.0.0.1

And the Python client:

python GearmanClient.py --host 127.0.0.1 -n 45

For a single machine, it makes sense to run at most as many workers as there are CPUs (or CPU cores) available. For a network cluster, you can run more job servers, workers and clients accordingly.
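The same rule of thumb can be sketched on one machine with Python's standard library, using a local process pool as a stand-in for Gearman workers. The names worker_count and parallel_fib_sum are hypothetical, and the assumption that the job is "sum fib(1) through fib(n)" is mine, not taken from the client's source.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def fib(n):
    """Naive recursive Fibonacci, used here as a CPU-bound job."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


def worker_count(cpus=None):
    """Cap the number of workers at the number of CPUs, with a floor of one."""
    if cpus is None:
        cpus = os.cpu_count() or 1
    return max(1, cpus)


def parallel_fib_sum(n, workers=None):
    """Sum fib(1)..fib(n), spreading the jobs over a pool of worker processes."""
    with ProcessPoolExecutor(max_workers=workers or worker_count()) as pool:
        return sum(pool.map(fib, range(1, n + 1)))
```

Call parallel_fib_sum from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely on every platform.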

I've run some tests with the client and worker above, using my home laptop and an Intel Atom based nettop connected over a local network. With a single laptop worker, computing the sum of 45 Fibonacci numbers took 66.955 seconds; with two laptop workers it took 35.702 seconds, and adding a remote worker reduced the total time to 25.593 seconds. Adding more workers didn't reduce the computation time, and even slowed the cluster down slightly - which is quite understandable, as the number of workers exceeded the number of free CPUs (the Intel Atom in fact has only one physical core, although applications see it as a dual-core CPU).
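The speedup behind those timings is easy to work out. The short script below uses only the three measurements quoted above; the helper names speedup and efficiency are mine. Keep in mind that the third worker ran on the slower Atom machine, so the per-worker efficiency for that row is not a like-for-like comparison.

```python
# Measured wall-clock times from the tests above, keyed by number of workers.
timings = {1: 66.955, 2: 35.702, 3: 25.593}
baseline = timings[1]


def speedup(single_worker_seconds, seconds):
    """How many times faster a run is than the single-worker run."""
    return single_worker_seconds / seconds


def efficiency(single_worker_seconds, seconds, workers):
    """Speedup divided by worker count (1.0 would be perfect scaling)."""
    return speedup(single_worker_seconds, seconds) / workers


for workers, seconds in sorted(timings.items()):
    s = speedup(baseline, seconds)
    e = efficiency(baseline, seconds, workers)
    print("%d worker(s): %.3f s, speedup %.2fx, efficiency %.0f%%"
          % (workers, seconds, s, e * 100))
```

Two laptop workers come close to halving the run time, while the third, remote worker contributes less than a full worker's share.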