Kings of Genes and Data

By Mark D. Uehling

Sept 15, 2005 | Daniel Gudbjartsson, head of statistical development at deCODE Genetics, is sitting quietly in the company's Reykjavik, Iceland, cafeteria. From his burly physique and shy demeanor, he could be a young dairy farmer or a cod fisherman.

In fact, Gudbjartsson and his team of statistician-programmers have created genetic drug-discovery programs that are the envy of the industry. Programs like GeneHunter, Nemo, and Allegro are cited in the world's leading journals. Several programs are available to the scientific community. This is not just a quirk of Icelandic generosity. Giving away code also helps peer reviewers check the company's data prior to scientific publication.

The genealogical, genetic, and laboratory data available to deCODE has presented it with unique challenges. Those challenges, in turn, have driven the development of unique tools for managing haplotypes and family trees alike. The Allegro software, for example, estimates the distance between DNA microsatellite markers. "The aim was to do it not worse than the humans," Gudbjartsson explains. "Humans are very good but they make stupid errors - typos. Today, the calling is on a higher level. We can do it much cheaper because we don't need to do editing manually. There is nobody else doing it."

Is deCODE doing things no other biotech company is doing? Certainly. The company has rapid access to samples from patients who have already been characterized genetically and phenotypically. Its scientists have proprietary data and software to work with those samples. That one-two punch allows deCODE to move discoveries to the clinic more quickly, despite having just 430 employees (a third in Chicago and Seattle). Three promising compounds - two for cardiovascular disease, one for asthma - have progressed to clinical trials in record time. Indeed, the speed at which things move at deCODE supports the company's claim that it is not only reconnecting the bifurcated worlds of drug discovery and clinical research - it is also internally cross-pollinating ideas between those two realms.

Icelandic IT

The deCODE databases and applications are continually honed to a samurai edge. That has always been the vision of the company's founder. The mercurial Kari Stefansson may be the most-published physician-CEO in biopharma, with 170 citations to his name. It would be difficult to mistake the severe, imposing Stefansson for a dairy farmer or a cod fisherman (see "The Icelandic Man Cometh," January 2003 Bio-IT World, page 22).

"The IT department became the biggest part of the company," says Stefansson, recalling the company's early days, when 150 people were writing code to handle the mathematically intensive aspects of genealogy and genetics. "There was nothing out there, no understanding of how we should do this. We understood this early. The magnitude of the data, the importance of handling it correctly."

On a trip to the company's server room, the man in charge, Hannes Sigurdson, notes the company is storing a mind-boggling 40 terabytes of data in some 25 million files. Just backing it all up is a major headache. "Without the information management system we have, we wouldn't be where we are," stresses Stefansson. "We have an enormous amount of information that needs to be stored, that needs to be mined. Without our informatics group, we wouldn't be here. We wouldn't exist."

As evidenced by a wall piled high with reprints from Nature, Science, the New England Journal of Medicine, and other journals, the company has made rapid progress towards a virtuous cycle of mapping and publishing disease genes and moving forwards to formulating treatments. "The rapidity and sense of purpose with which the deCODE genetics group translated their initial linkage results into a clinical trial is striking and impressive," wrote Christopher O'Donnell of Massachusetts General Hospital a few months ago in JAMA, where the company published Phase II results on its drug candidate DG-031 in heart disease.

DG-031: Heart Drug?

In that trial, which looked at the effect of DG-031 on the leukotriene pathway that triggers inflammation and heart disease, deCODE started with 900 patients judged genetically and clinically eligible, with previous histories of heart attacks. Of those, 640 Icelanders consented to have their genetic material used in the trial.

At the end of the trial, lo and behold, the drug was found to reduce the levels of two biomarkers (leukotriene B4 and myeloperoxidase) associated with heart attack risk. The trial patients' blood samples were drawn, bar-coded, and analyzed in less than two hours - another glimpse of Icelandic efficiency. The company is interested in yet another biomarker, C-reactive protein, and additional trials are planned.

The JAMA paper alone does not prove that deCODE's drug prevents heart attacks, of course, as some clinicians are arguing over the significance of the biomarkers. And as O'Donnell points out, it's unclear whether DG-031 will work in perhaps half of all patients lacking the at-risk genetic variations. But to complete such work so quickly is a laudable achievement in an industry with molasses in its veins. deCODE licensed DG-031 from Bayer in November 2003 and published its results just 18 months later. Perhaps only GlaxoSmithKline, under Allen Roses' direction, would be in a position to match that brisk timeline.

The company has other irons in the fire. deCODE is partnering with an unidentified firm to develop an asthma drug, now in Phase II. In deCODE's DG-041 program, however, the company is going it alone, and at a pace equivalent to the DG-031 program. In just three years, deCODE has gone from mapping the disease gene to evaluating DG-041 potency in rats and mice and launching Phase I clinical trials.

deCODE calls DG-041 a "novel, first-in-class, orally-administered small molecule for the treatment of peripheral arterial disease (PAD)." That condition, according to the company, affects 10 percent of the population in the industrialized world. The putative drug target originated in the company's gene hunting databases, built on top of genealogical records that are built on parish documents from 1,100 years ago, about the time of the island's discovery by Vikings.

Stefansson is reluctant to discuss his big three projects in much detail. There is the reasonable fear of running afoul of both investment regulators and journal editors. But with contracted work for the NIH (infectious disease), Roche (stroke), and Merck (cancer and other diseases), he has accumulated $70 million for finding his own drugs. That is an admittedly small bankroll for such an extravagantly expensive quest. But the company takes on other work such as genotyping, at which it is perhaps an order of magnitude faster than the Marshfield Clinic or Johns Hopkins.

Despite deCODE's ongoing losses, large number of therapeutic areas under investigation, and heavy costs (70 of its employees have Ph.D.s), Stefansson is confident. The company already has its own contract research organization. He muses aloud about the need to partner with some much larger company's sales force should DG-031 be proven in larger, later trials. For him, the key to such early successes in the clinic is not just the cooperative attitude of Icelandic patients, but their blood and DNA, all bar-coded and locked away in his basement.

In the bowels of deCODE's headquarters, there is a retinal scanner mounted on the wall. This unit admits Pall Gestsson, head of robotics. It turns out the scanner can detect living retinas. "You can't rip my head off and get in," Gestsson says cheerfully.

A mechanical engineer, he came from the soft drink industry. But in an indication of deCODE's uncanny ability to recruit scientific problem-solvers, Gestsson spent nine months adapting two massive Japanese automotive robots to organize a massive cold storage room now filled with tray upon tray of 500,000 blood and DNA samples. Half the population of Iceland is represented here.

The namesake of one of the robots, Kari Stefansson, acknowledges that everyone in the industry hopes to translate a specific molecular indicator of disease - a biomarker - into hard evidence of a drug's efficacy. "We, however, are in a privileged position," Stefansson contends. "We have obtained our targets with genetics. We have been blessed with biomarkers that others have associated with myocardial infarction (MI). We have myriad biomarkers that we can use to recruit into the studies, use fewer patients, and subsequently use to analyze the data."

Talking about integrating a variety of streams of data is common. Every small, medium, and large drug company in the world claims to be able to adroitly weave together genetic data, genomic data, lab data, and clinical trial data. But usually the elegance and efficiency of such projects are difficult to evaluate.

At deCODE, Hakon Gudbjartsson (no relation to Daniel) makes matters a bit easier. Chief of IT, he marshals a small team of power users who are difficult to pigeonhole as scientists or programmers. They're both, working to study a variety of streams of data. The programmers are in the same building with deCODE's wet lab scientists, walking the same stairways.

The IT and scientific challenges facing the company are steep, as is quickly evident when Gudbjartsson rattles off a few of the details of the company's contract work for Merck.

"What we have been doing with Merck is quite fascinating," Gudbjartsson says. "We are merging expression and genotyping data. The expression profiles are derived from a collection of people. The same goes for the statistical analysis of genetic markers." The question deCODE is trying to answer for the American company: Is there a genetic locus that has a statistical correlation to the overexpression of a particular gene?

"IT is just to make that possible," Gudbjartsson continues. "It's a scientific integration of these two methodologies to understand the genes and the pathways in a better way. We are making the IT infrastructure in a way that it is possible to support this type of work." Among other tools, Gudbjartsson and his team have worked with IBM to create their own genome browser. "We tend to want to look at the whole genome quickly," he says. "That viewer is visualizing the whole genome and drilling into that."

As Gudbjartsson notes, it's child's play to study patients who already have preidentified phenotypic attributes - a particular disease, perhaps. But what if shared phenotypic characteristics are opaque, hypothesized? deCODE is using clustering algorithms to group patients in its genealogical database who share traits.
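The Merck question described above, whether a genetic locus shows a statistical correlation with the expression of a particular gene, can be pictured with a very small calculation. The sketch below is an illustration only, not deCODE's software: it assumes each person's genotype at one marker is coded as the number of copies of a variant allele (0, 1, or 2) and correlates that count with a measured expression level. The function name and the data are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical data: allele counts at one marker for six people,
# and the expression level of one gene measured in the same people.
genotypes = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.1, 3.1, 2.9]

r = pearson_r(genotypes, expression)
print(f"correlation between allele count and expression: r = {r:.3f}")
```

A real analysis would repeat this across many markers and many genes, correct for the number of tests, and account for the fact that relatives are not independent observations.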

Programmer-Scientists

Plainly, the company's statistician-programmers are dealing with data from living patients more routinely. "The organization of the clinical information is very important in our system," says Gudbjartsson. "We receive updates on the clinical information in our system. Just keeping track of new and old patient lists is a mundane but very critical task."

Gudbjartsson says the company is now investigating clinical trials software from large, well-known vendors and smaller names. This time, it will not build its own. The Icelandic company is looking for both electronic data capture and clinical trial management systems. Such systems will be seamlessly tied to drug discovery databases and applications while preserving patient privacy under strict European guidelines and HIPAA.

For a taste of what the IT at deCODE can do, nothing beats a short visit with Thorgeir Thorgeirsson, director of genetics. Sitting at his computer screen, dreaming up a hypothetical research project, he quickly calls up, oh, 17,000 ordinary Icelanders in the company's database. In a flash, the company's software has drawn a genealogical chart of a subset of the 17,000. "We can find out the relationship between people without knowing who they are," Thorgeirsson notes. Chances are, Iceland being Iceland, the people in question already know that.

But deCODE can use the information in more interesting ways, connecting the shared genetic inheritance of total strangers in ways that may show how a disease does or does not manifest itself. The same computer will allow Thorgeirsson to easily order up banked, isolated DNA on the people he's selected. That alone might take someone at a major European university a year.

Launching into another hypothetical project, one with 3,000 patients suffering from anxiety disorders, Thorgeirsson quickly finds 337 with panic attacks. A few more clicks, and it's clear which of those 337 have already donated blood to the company's scientific effort. "We make decisions about which participants' DNA to isolate depending on whether they are related," he notes in a matter-of-fact way.
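Working out how two coded participants are related comes down to walking up a pedigree. The sketch below is a toy version under stated assumptions, not deCODE's system: the pedigree is a plain dictionary from an anonymized ID to that person's known parents, and every ID and function name is hypothetical.

```python
def ancestors(person, parents):
    """Map each ancestor of `person` to the fewest generations separating them."""
    depth = {}
    frontier = [(person, 0)]
    while frontier:
        current, generation = frontier.pop()
        for parent in parents.get(current, ()):
            # Keep only the shortest path to each ancestor.
            if parent not in depth or generation + 1 < depth[parent]:
                depth[parent] = generation + 1
                frontier.append((parent, generation + 1))
    return depth

def closest_common_ancestor(a, b, parents):
    """Return (ancestor, generations_from_a, generations_from_b), or None."""
    up_a = ancestors(a, parents)
    up_b = ancestors(b, parents)
    shared = set(up_a) & set(up_b)
    if not shared:
        return None
    # Fewest total generations wins; ties are broken by ID for determinism.
    best = min(shared, key=lambda p: (up_a[p] + up_b[p], p))
    return best, up_a[best], up_b[best]

# Hypothetical pedigree keyed by anonymized IDs: child -> (father, mother).
pedigree = {
    "p1": ("f1", "m1"),
    "p2": ("f1", "m1"),
    "p3": ("f2", "m2"),
    "f1": ("g1", "g2"),
    "f2": ("g1", "g2"),
}

print(closest_common_ancestor("p1", "p2", pedigree))  # siblings share a parent
print(closest_common_ancestor("p1", "p3", pedigree))  # cousins share a grandparent
```

Because the lookup works purely on coded IDs, it never needs to know who the people are, which is the property Thorgeirsson highlights.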

Linking Data, Blood

In the United States, of course, biobanking firms such as Genomics Collaborative (acquired in 2004 by SeraCare Life Sciences) and Ardais also collect blood, DNA, and tissue. But they know next to nothing about the family history or genealogical connections that link their donors. Instead, commercial samples are collected according to the vagaries of surgery schedules and academic research leftovers (see "Blood, Sweat, and Tissue," March 2004 Bio-IT World, page 57). Pharmaceutical customers can order 200 colon cancer specimens for genotyping but receive only basic demographic information about the donors.

At deCODE, the process works differently. Its system can design a linkage analysis on the fly. At every turn, under company policy and Icelandic law, the company's scientists and programmers can engage in encrypted communications with ordinary community physicians able to call up an Icelandic bus driver and request that she come in to donate another sample.
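In general terms, a linkage analysis looks for chromosomal regions that affected relatives share more often than chance predicts. A heavily simplified way to picture it is the classic affected sibling pair "mean test," sketched below. This is a swapped-in textbook technique, not a description of deCODE's programs, which work on extended pedigrees. It assumes that for each affected sibling pair we already know how many alleles (0, 1, or 2) they share identical by descent at one marker; under no linkage that count has mean 1 and variance 1/2. The function name and data are hypothetical.

```python
from math import sqrt

def sib_pair_mean_test(ibd_counts):
    """Z-score for excess allele sharing among affected sibling pairs.

    Each entry is the number of alleles (0, 1, or 2) a pair shares
    identical by descent at one marker. With no linkage, each count has
    mean 1 and variance 1/2, so the total over n pairs has mean n and
    variance n/2.
    """
    n = len(ibd_counts)
    if n == 0:
        raise ValueError("need at least one sibling pair")
    if any(count not in (0, 1, 2) for count in ibd_counts):
        raise ValueError("each sharing count must be 0, 1, or 2")
    total = sum(ibd_counts)
    return (total - n) / sqrt(n / 2)

# Hypothetical sharing counts for eight affected sibling pairs.
shared = [2, 2, 1, 2, 1, 2, 2, 1]
z = sib_pair_mean_test(shared)
print(f"z = {z:.2f}")
```

A large positive score suggests the marker sits near a gene that contributes to the disease. In practice the sharing counts are themselves estimated from genotype data, and the test is repeated along every chromosome.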

Segments of Iceland's population may not be as enamored with deCODE as when the company began operations. But there's no doubt that a cooperative national population of patients allows deCODE to quickly pursue small trials that would not be possible in Boston or San Francisco. In Iceland, it appears people trust the robust nature of the IT systems that are in place to protect individual identities. As a result, they step forward to help the scientific endeavor.

"This is terribly useful," Thorgeirsson says of the company's linkage analysis program. "You will find a spot on a chromosome that tends to be shared more often than you would expect by chance."

Drag-and-Drop Patients

Demonstrating another software tool, Disease Miner, to find association studies, Thorgeirsson notes the interface is drag and drop. Scientists at deCODE can drill down to specific patients and, for a group of patients, examine the degree of interrelatedness between two genes and determine whether it is statistically meaningful or random.

In the end, the company's IT prowess will play a larger-than-average role in its success. The financial reality is that deCODE will not be able to afford the huge trials preferred by the largest pharmaceutical companies. It must conduct smaller but equally revealing clinical trials to survive.

"We can put together a trial of 200 people with an at-risk version of a gene," deCODE director of corporate communications Edward Farmer explains. "The patients will have the biological perturbation that is targeted by the drug." The question for the future is whether the genetic stratification of the company's clinical trial recruitment will be sufficiently advantageous to allow the company to continue to pursue its goals on its budget.

For his part, Stefansson is ebullient but mildly irritated he must pay the bills with contract projects for much larger companies. "Our goal is to bring drugs to market," he says. "It is not to work for others." He's already pondering conversations with the FDA to address the pharmacogenomic aspects of his portfolio. "We will have to find a way to work with the FDA to capture not only the genetic risk but the environmental risk in the labeling," he says, referring to the myriad lifestyle factors about the patients in his trials. If there's one thing Stefansson and deCODE are comfortable with, it's risk.