For about $100,000 you can cobble together a cluster of Intel-powered PCs that generates roughly the same computing power as a brand-name super-server — for one-tenth the price. Moreover, the Linux operating system has proven to be stable, reliable, and scalable.

Stir into this mix of attractive price and performance the recent trickle of improved commercial Linux management tools, and the stage is set for widespread implementation of Linux clusters by the life science community.

But as even Linux cluster advocates admit, this frugal approach to high performance computing (HPC) has its drawbacks. For starters, there is no single, authoritative commercial support network for the open-source Linux operating system. Instead, companies must develop and maintain Linux expertise on staff. Linux also lacks several HPC functions, such as parallel file processing, performance analysis tools, and job scheduling. And debate continues about the robustness of Linux security.

Nevertheless, these drawbacks aren't slowing the adoption of Linux clusters. According to Silico Research Ltd., 85 percent of large pharmaceutical companies and 65 percent of life science organizations use clustered and distributed computing platforms, the majority of them implemented with Linux. Genomics pioneer Incyte Genomics Inc. and drug discovery firm Tularik Inc. are just two examples of life science companies that have made big bets on Linux, and each cites dramatic IT savings.

Market-watcher International Data Corp. forecasts the cluster market will grow 35 percent annually to $4.27 billion by 2005 and that Linux will become the dominant choice in the cluster environment, growing from $226 million in 2001 to $1.4 billion by 2005.

The Story of 'Beowulf'
Linux cluster computing got its start in 1994, when NASA researchers created the first Linux cluster and nicknamed it "Beowulf" after the warrior hero in the epic poem. The Beowulf project was an exercise in IT scavenging worthy of any resource-constrained lab.

NASA engineers scrounged up 16 Intel 486-generation personal computers that had been discarded. They connected them with channel-bonded 10Mbps Ethernet and used Linux as a distributed operating system. This cluster of previously dumped PCs functioned as a parallel computer engine. Since that time, the term Beowulf has come to describe the class of clusters that use similar architecture.

The NASA engineers were searching for a cheaper way to solve computational problems while mapping the eco-regions of the country. They dreamed of a machine that could achieve 1 gigaflop — 1 billion floating-point operations per second. At the time, commercial high-performance computers at that performance level carried a price tag of $1 million — far too steep for the research group's budget. The newly created Linux cluster, which delivered 70 million floating-point operations per second, cost about $40,000, or one-tenth that of a comparable commercial machine in 1994.
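The figures above can be put on a common footing by working out dollars per million floating-point operations per second (Mflops). The short Python sketch below does only that arithmetic, using the numbers quoted in this article. The variable and function names are illustrative, and the comparison assumes price scales linearly with performance, which real systems rarely do.

```python
# Figures quoted in the article; all names here are illustrative.
TARGET_FLOPS = 1e9             # the 1-gigaflop goal
BEOWULF_FLOPS = 70e6           # what the first cluster delivered
BEOWULF_COST = 40_000          # dollars
COMMERCIAL_COST = 1_000_000    # dollars, commercial 1-gigaflop system


def dollars_per_mflop(cost, flops):
    """Return cost in dollars per million floating-point ops per second."""
    return cost / (flops / 1e6)


beowulf_rate = dollars_per_mflop(BEOWULF_COST, BEOWULF_FLOPS)
commercial_rate = dollars_per_mflop(COMMERCIAL_COST, TARGET_FLOPS)
share_of_goal = BEOWULF_FLOPS / TARGET_FLOPS

print(f"Beowulf:    ${beowulf_rate:,.0f} per Mflops")
print(f"Commercial: ${commercial_rate:,.0f} per Mflops")
print(f"Beowulf reached {share_of_goal:.0%} of the 1-gigaflop goal")
```

On these numbers the first cluster fell well short of the gigaflop target, but it delivered each unit of performance for a little over half the commercial rate.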

Early Linux cluster users quickly discovered that clustering boosted processing speed, increased transaction speeds, and improved reliability. But this cost-efficient speed bonanza came with a concern that continues to shadow the Linux debate. Because Linux is open source software, with no single entity controlling its growth or patrolling its security needs, some observers worry that it is less secure. Others say such security concerns are overblown.

"I don't think that Linux is any more secure or any less secure than Windows or Unix," says Bill Claybrook, research director of the Aberdeen Group, an IT market analysis firm. "We hear a lot about Windows' lack of security, and I think there are problems with it, but Windows is a big piece of code and there are a lot of people pounding on it. Linux is smaller and far fewer people are using it. We'll see how good Linux security is when it is more heavily used."

Advocates further argue that most Linux clusters are isolated deep within corporate firewalls, not visible on the public Internet, and therefore less vulnerable to hacking.

Big Bang, Few Bucks
The network-of-nodes approach (each PC is a node) is a great fit for life science enterprises, in which drug discovery and human genome research are generating enormous amounts of data and emphasizing the need for cost-efficient computational approaches.

Clearly, though, it's the bigger bang for the buck that's driving the popularity of Linux clustering. Aberdeen reports that Linux clustering generally delivers a 5-to-1 performance-to-price advantage over HPC solutions from traditional suppliers. Claybrook says savings can be greater depending on the specific use. "I have heard of situations where the ratio goes to 40x, but that is very uncommon," he says.

Tularik, a San Francisco-based company focused on developing small molecule drugs that regulate gene expression, chose Linux because of its cost.

Linux Makes the List

To get a feeling for just how effectively Linux is infiltrating the clustering industry, visit the Top500 Web site — clusters.top500.org — which tracks data on HPC clusters. Longtime iron box leaders top the list.

"You can buy a high-end computer such as an SGI or a Sun, but they're very expensive," says Bruce Ling, director of bioinformatics for Tularik. "For a fraction of the money you would spend on such a server, you can buy a lot of CPUs using Linux."

Tularik has several Linux clusters, including a 150-processor Evolocity cluster from Linux NetworX Inc. Each of the cluster's 75 nodes has dual 1GHz Pentium III processors, for 150 processors in all, along with 300GB of memory and an Intel 10/100 Ethernet connection. It's being used to data-mine genomic information for drug development. "The Linux cluster is well-proven and can do its job," says Ling, adding that scalability and cost are also compelling factors.

Though Tularik's IT department collaborates with bioinformatics researchers on technical issues, such as predicting potential processing requirements, two bioinformatics staffers actually manage the cluster environment.

"That's the downside," Ling says. "You really need to understand the guts of it, as there's no real consumer support for it. For Linux, you need a specialized understanding of the OS."

The upside is that Linux is often a familiar environment for bioinformatics researchers and laboratory scientists in general. Most know open-source software from their college computing courses and from its widespread use in cost-conscious academic labs. Ling studied biochemistry as well as computer science before receiving his doctorate in molecular biology. His bioinformatics team features three scientists and five programmers.

Scalability and Performance
Incyte, a Palo Alto, Calif.-based genomics information company, says it cut computing costs by 95 percent when it moved to Linux clusters three years ago.

Although the savings wouldn't be as dramatic today because of the continual price drops in proprietary architecture machines, Stu Jackson, Incyte's director of bioinformatics, says the attractive price — along with performance benchmarks and scalability aspects — spurred Linux cluster development at the 11-year-old company.

"A Sun E-10000 costs over a million dollars. A Linux 128-CPU cluster is going to run you $100,000, and you don't have the maintenance and license costs you would with the other option," Jackson says. "If you couple that with some really flexible, effective job distribution software, you can kick the tar out of the bigger machine for a lot less money."

At Incyte, the Linux clusters process data from human genome research and feed the results into the company's database products containing information on gene structure, sequence, and function. This data is used by pharmaceutical and biotech companies for drug development and scientific discovery.

Roughly half of the Incyte data center's 4,500 processors — which include units from Intel, Compaq, Sun Microsystems, and SGI — are used in Linux clusters that handle the "heavy lifting" computations, says Jackson.

"There are lots of things you want to do in a data center that aren't suitable for Linux clustering, so you're always going to need some of those large machines around for apps that simply need that kind of hardware to run," Jackson says.

Indeed, Linux clusters aren't a cure for all HPC needs, says Aberdeen's Claybrook. "Applications that require low latency and very high bandwidth are difficult to do with Linux-based computer clusters," he says. "But about 80 percent of the HPC applications can be done."

The Management Migraine
Incyte, like Tularik, has found that developing and maintaining Linux cluster management tools is a challenge.

A big drawback to Incyte's proprietary management application was its inability to easily move applications from one computing resource to another. Jackson says that made it nearly impossible to collect unused processing power and reallocate it to handle additional applications or one-time, ad hoc projects. If a computational job ran behind schedule and required more processing power, the CPU had to be manually reconfigured — a time-consuming effort.
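The reallocation problem Jackson describes is, at bottom, load balancing: sending each job to whichever machine currently has the least work queued. The Python sketch below shows one simple greedy way to do that. It is not Incyte's tool or any commercial product, and the job names, node names, and CPU-hour estimates are invented for illustration.

```python
import heapq


def distribute_jobs(job_costs, node_names):
    """Greedily assign each job to the node with the least queued work.

    job_costs maps a job name to its estimated CPU-hours (hypothetical).
    Returns a dict mapping each node name to its list of assigned jobs.
    """
    # Min-heap of (queued_cpu_hours, node_name): least-loaded node on top.
    heap = [(0.0, name) for name in node_names]
    heapq.heapify(heap)
    assignment = {name: [] for name in node_names}

    # Placing the largest jobs first tends to even out the final loads.
    ordered = sorted(job_costs.items(), key=lambda item: item[1], reverse=True)
    for job, cost in ordered:
        load, name = heapq.heappop(heap)
        assignment[name].append(job)
        heapq.heappush(heap, (load + cost, name))
    return assignment


# Hypothetical workload: four jobs spread over two nodes.
jobs = {"blast_a": 8, "blast_b": 5, "assembly": 4, "annotate": 3}
plan = distribute_jobs(jobs, ["node01", "node02"])
print(plan)
```

A production scheduler also has to handle priorities, job failures, and nodes joining or leaving the pool, which is much of why building one in-house takes so much effort.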

In many early Linux cluster implementations, system administrators wrote their own scripts to handle such menial tasks as adding a user, configuring an application, or cross-mounting a new network file system partition. These added administration costs cut into the initial savings provided by Linux clustering.

"There are always difficulties with tools," Jackson says. "Just about anyone who does anything real is going to eventually find one or two things you can't make internally, so you have to purchase system applications to support internal customers."

Jackson considered building another internal tool, but concluded it would be more efficient to find an outside solution. "Any time you buy a commercial product, you get stuff you don't want," he says. "On the other hand, you get something that's supportable and has a bigger user community, so it's got ongoing development."

Jackson says the investment has paid off. The 1,000-CPU cluster, which runs LSF, Platform Computing's commercial job distribution software, handles job distribution more efficiently and has produced a 50 percent increase in computer productivity.

Altogether, Incyte has spent well over $2 million on the Linux infrastructure. The programming project for its internally developed job distribution system consumed about two employee-years; the project to test and obtain LSF took only about 10 months.

New Wave of Tools
Tularik also had the in-house Linux expertise to build its management tools, but was eager to avoid the effort if possible. The company chose Linux NetworX's ICE Box, which provides serial switching, remote power control, and system monitoring capabilities.

ICE Box is helping Tularik focus on finding new genes rather than on server operation and maintenance. "The appliance provides vital features ... so I never need to go down to the server room," says Gene Cutler, a Tularik bioinformatics scientist.

Officials at both Tularik and Incyte say that buying off-the-shelf tools eliminated the need to build proprietary ones, and that the time saved has shortened their products' time to market.

According to Aberdeen, only a few Linux cluster tools were available three years ago. Today, Linux cluster suppliers are developing both open-source and proprietary cluster management products. Some commercial suppliers are building from scratch; others, such as Red Hat Inc., are using various pieces of open-source software to shorten development cycles.

Other vendors are taking proprietary Unix cluster technology and modifying it to run on Linux — among them are SteelEye Technology Inc., Hewlett-Packard Co., and Veritas Software Corp. Platform Computing recently announced its Platform Clusterware for Linux, the first hardware-independent support solution for cluster management.

There's no denying that today's Linux clusters are rivaling the throughput capabilities of legacy mainframes and becoming a central player in HPC evolution.