The Evolving Grid Computing System at J&J

By Mike May

March 14, 2006 | Johnson & Johnson is planning to double the size of its grid computing network and the number of applications that run on it by the end of this year, company executives say. Robert Cohen, a fellow at the Economic Strategy Institute in Washington, D.C., and a director for industry applications at The Global Grid Forum, says, “Grid computing is fairly pervasive in pharma, particularly for the R&D side. Johnson & Johnson has been a leader for some time.”

Grid computing harnesses the power of multiple computers by searching the network for idle CPUs, and then either “scavenging” their unused compute cycles or scheduling the work on dedicated machines. In this way, a complex job gets scheduled more efficiently across distributed clusters or is broken into smaller pieces that run on many distributed computers, thereby running more quickly.
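The two scheduling modes described above can be illustrated with a minimal Python sketch. It is not how J&J's grid or any particular grid product works; the `Node` fields, the 10 percent idle threshold, and the round-robin assignment are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    dedicated: bool   # True for machines reserved for grid work
    cpu_load: float   # current utilization, from 0.0 to 1.0


# Assumed cutoff: a non-dedicated machine counts as idle below 10% load.
IDLE_THRESHOLD = 0.10


def eligible_nodes(nodes, idle_threshold=IDLE_THRESHOLD):
    """Keep dedicated machines plus any other machine that is currently idle."""
    return [n for n in nodes if n.dedicated or n.cpu_load < idle_threshold]


def assign_round_robin(tasks, nodes):
    """Spread the pieces of a job evenly over the eligible machines."""
    if not nodes:
        raise ValueError("no eligible nodes available")
    assignment = {n.name: [] for n in nodes}
    for i, task in enumerate(tasks):
        assignment[nodes[i % len(nodes)].name].append(task)
    return assignment


if __name__ == "__main__":
    pool = [
        Node("server-1", dedicated=True, cpu_load=0.90),
        Node("desk-1", dedicated=False, cpu_load=0.05),   # idle desktop: scavenged
        Node("desk-2", dedicated=False, cpu_load=0.80),   # busy desktop: skipped
    ]
    print(assign_round_robin(["a", "b", "c"], eligible_nodes(pool)))
```

A real scheduler would also weigh the load on dedicated machines, node speed, and what happens when a desktop's owner returns. The sketch leaves all of that out.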

David Neilson, director of the discovery information sector of Johnson & Johnson Pharmaceutical Research and Development (J&JPRD), says that his colleagues started talking about a grid in 2002. After some pilot work in 2003, the J&J grid went live in December 2004. As of last month, that grid consisted of 500 to 600 machines, ranging from servers and workstations to desktop PCs, for a total of 1,200 to 1,300 CPUs. J&J’s ARDA (Advanced Research with Distributed Applications) grid consists of about 40 percent Linux- and 60 percent Windows-based nodes, according to Jeff Mathers, director of strategy & delivery in J&JPRD’s Technology Office.

J&J’s grid computing system runs on its internal corporate network, with the central grid server residing at a J&JPRD facility in Raritan, N.J. A job goes to the main server, where it gets broken into pieces, and is then distributed to CPUs on the grid. The CPUs return their computations to the grid server, which puts them together and sends the final result back to the user.
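The flow described above (split the job, distribute the pieces, collect and merge the results) can be sketched in a few lines of Python. This is an illustration only, not the United Devices software or J&J's code: the function names are hypothetical, a local thread pool stands in for remote CPUs, and squaring numbers stands in for the real scientific computation.

```python
from concurrent.futures import ThreadPoolExecutor


def split_job(items, n_pieces):
    """Server side: break a job into roughly equal pieces."""
    size = max(1, -(-len(items) // n_pieces))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_piece(piece):
    """Node side: placeholder computation (squares each value)."""
    return [x * x for x in piece]


def run_on_grid(items, n_workers=4):
    """Distribute the pieces, then merge partial results in original order."""
    pieces = split_job(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # pool.map returns results in the same order as the input pieces
        partials = list(pool.map(process_piece, pieces))
    return [result for part in partials for result in part]


if __name__ == "__main__":
    print(run_on_grid(list(range(10)), n_workers=3))
```

The merge step here depends on results coming back in submission order. A grid whose nodes finish at unpredictable times would more likely tag each piece with an index and reassemble by that index.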

Although it started as two separate grids — one in Europe and the other in the United States — Neilson’s team combined the two last summer to make one large J&J grid network that continues to expand. “Recently, we have started to add servers under the control of [J&JPRD] companies to the grid as virtual clusters or as scavenging targets. That will add another 300 to 400 servers, or 700 to 800 CPUs,” says Mathers.

As for selecting the software, Neilson says, “We looked at three different providers and selected United Devices [Grid MP platform].” Applications must be grid-capable. Mathers explains: “Some software is grid-compatible right off the shelf. Some we have to grid-enable, and that can take anywhere from a couple of days to weeks.” The J&J grid also runs various in-house applications.

Running in Parallel

Many drug companies run a relatively small number of applications on grids. Steve Wallage, director of research at The 451 Group, which analyzes IT innovation, says, “Pharmaceutical companies rarely get much beyond very specific code that handles one or two drug discovery algorithms.” That is especially true of smaller companies. But Cohen adds, “Quite a few pharmas are running five to 10 applications at the present time.”

By contrast, J&J’s grid can run more than a dozen applications. “We’ve already added five or six new versions of code this year, and we are going through a major update of the core software right now,” says Mathers. By the end of 2006, Neilson says they expect to have 26 applications running on their grid.

The major challenge to bringing grid computing to Big Pharma lies beyond hardware and software. “The technical issues are not as important as the cultural ones,” Wallage says. “IT and computer experts can be very wary of scavenging, mostly from a psychological perspective. The PC is being taken out of their control.”

At J&J, Mathers says, “The tricky part is getting the scientist to know what to do with the software.” But powerful software helps. For example, a J&J application can take a few hundred thousand models of compounds (even ones that have never been made) and see if they attach to a disease-related target. Other applications determine the best dosing regimens to test in a clinical trial. In addition, J&J’s Advanced Biology and Chemistry Discovery Platform will eventually contain all of the company’s research data and make it available to company scientists around the world. Neilson says, “We have instances of people running calculations they would probably not even contemplate without the grid, calculations involving several million molecules.”

In the end, J&J’s grid should make drug discovery faster and more economical. It should also enhance patient safety and fine-tune therapeutics.