Shifting Applications Into Gearman

Gearman enables a new level of software abstraction. With this lightweight infrastructure you can outsource work to better-suited computers, run tasks in parallel, and combine code written in different programming languages.

Computing is built on layers of abstraction. A CPU is billions of transistors, but its intricacies are veiled by assembly language. Assembly language, in turn, manipulates memory, registers, and the stack, but that tedium is disguised by the compiler. And so on, and so on. Programming interfaces hide implementations; objects encapsulate data structures; data structures model the problem… and, well, you get the idea.

Today, even the computer itself is abstracted. The fundamental tenet and real promise of cloud computing is independence from the machinery.

For example, instead of referring a number crunching task to servers groucho, chico, and harpo, your application can delegate work to the ensemble marx_brothers.

Of course, it takes a lot of shenanigans to realize the marx_brothers—it’s still a collection of machines (physical or virtual) and dedicated processes, workload must be balanced, and the additional zeppo might boot from time to time to help out—but the hassle is abstracted from your application.

Better yet, at least for a Web application, the cloud can absorb work that might otherwise bog down machines devoted to data persistence or user interaction.

For instance, consider the chain of events sparked at the close of an auction on eBay: all losing bidders receive notification of the final terms; the buyer and seller receive confirmation of the sale; and the buyer receives an invoice. None of this work need occur “live”; it can occur in the equivalent of a computing back-office.
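The back-office idea can be sketched in a few lines of Python. This is only an in-process illustration built on the standard library's queue and threading modules, not Gearman itself; the names close_auction and back_office, and the message strings, are hypothetical.

```python
import queue
import threading

# Hypothetical in-process stand-in for a "back office"; this is not Gearman.
jobs = queue.Queue()
sent = []  # records each notification the back office has processed


def back_office():
    """Process queued jobs in order until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            break
        kind, recipient = job
        sent.append(f"{kind} -> {recipient}")


def close_auction(seller, buyer, losing_bidders):
    """Queue the follow-up work and return at once; nothing is sent 'live'."""
    for bidder in losing_bidders:
        jobs.put(("final terms", bidder))
    jobs.put(("sale confirmation", buyer))
    jobs.put(("sale confirmation", seller))
    jobs.put(("invoice", buyer))


worker = threading.Thread(target=back_office)
worker.start()
close_auction("seller", "buyer", ["bidder1", "bidder2"])
jobs.put(None)  # tell the back office to stop once the queue is drained
worker.join()
print(len(sent), "notifications processed")
```

The point of the sketch is that close_auction returns as soon as the jobs are queued; the code handling the user's request never waits for the notifications to be sent.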

And that’s the premise of Gearman, a lightweight infrastructure to outsource work to additional machinery: your very own cloud. Gearman is a matchmaker (picture a geeky Chuck Woolery) or, perhaps better, a recruiter: it connects workers with employers. Gearman can shift tasks to better-suited computers, run tasks in parallel, distribute copious work, and combine code written in different programming languages.

Gearman was written by Danga Interactive, the progenitors of memcached. Like memcached, Gearman is simple to use and reliable, and it continues to evolve to meet real-world challenges. Specifically, contributors hope to add persistence this week to keep the work queue intact in case of failure. Ongoing work also aims to replicate the queue among actors for resilience and failover. (This new work is slated to debut at the upcoming OSCON 2009.)

Here, let’s deploy Gearman on a single machine to demonstrate its capabilities. Extending Gearman from one machine to many is a snap, as you’ll see.

Installing Gearman

A Gearman configuration requires three components: a client, a worker, and the Gearman daemon.

The client requests work. More specifically, the client requests a task by name, such as resample or render, and provides the raw materials (an image, a scene, URLs, and so on).

The worker does the heavy lifting. Each worker can perform at least one kind of task, such as render.

The daemon brokers the transactions between the other two parties. It registers workers, waits for clients, and matches the two so the work gets done.
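The division of labor among the three components can be illustrated with a toy model. The Python sketch below is a simplified, in-process stand-in and does not use Gearman or its protocol; the Broker class, its register and submit methods, and the task names reverse and upper are all hypothetical names chosen for illustration.

```python
from collections import defaultdict


class Broker:
    """Toy stand-in for the daemon: maps task names to registered workers."""

    def __init__(self):
        self._workers = defaultdict(list)

    def register(self, task_name, func):
        """A worker announces that it can perform task_name."""
        self._workers[task_name].append(func)

    def submit(self, task_name, payload):
        """A client requests a task by name and supplies the raw materials."""
        workers = self._workers.get(task_name)
        if not workers:
            raise LookupError(f"no worker registered for {task_name!r}")
        # Rotate the list so successive requests go to different workers.
        func = workers.pop(0)
        workers.append(func)
        return func(payload)


# Two hypothetical workers, each able to perform one kind of task.
def reverse_worker(payload):
    return payload[::-1]


def upper_worker(payload):
    return payload.upper()


broker = Broker()
broker.register("reverse", reverse_worker)
broker.register("upper", upper_worker)

# The client knows only the task name, never which worker handles it.
print(broker.submit("reverse", "gearman"))
print(broker.submit("upper", "gearman"))
```

In a real deployment the three roles run as separate processes, possibly on separate machines, and talk over the network; the sketch only shows how naming a task decouples the client from the worker.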

Although not shown here, you can easily connect a command-line worker with a Ruby client and vice versa. You can also write a Python worker to satisfy Perl clients. This freedom to mix languages is especially appealing, because you can choose the language, library, and classes best suited to each task: say, pairing Perl’s rich suite of email CPAN modules with a Rails Web application.

Abstraction, abstraction, abstraction.

The sky is the limit, and the pending new features mentioned at the outset shift Gearman into overdrive. Happy tinkering.
