Complete Genomics Targets ‘The First $1000 Genome’

Sequencing service is on track for Spring 2009.

By Kevin Davies

Nov. 12, 2008 | Complete Genomics emerged from stealth mode in early October brandishing an audacious service model for wholesale next-generation sequencing, with its first human genome already assembled and the CEO’s pledge to reach the magical “$1000 genome” price point as early as spring 2009.

“Our mission is to be the global leader in complete human genome sequencing,” chairman, president, and CEO Clifford Reid told Bio-IT World. “We are setting out to completely change the economics of genome sequencing so that we can do diagnostic quality human genome sequencing at a medically affordable price. Essentially, [we’ll] transition this genome sequencing world from a scientific and academic endeavor into a pharmaceutical and medical endeavor.”

Based in Mountain View, Calif., Complete Genomics has raised $46 million in three rounds of financing since its incorporation in 2006. The company will not sell individual instruments but will instead offer a service aimed initially at big pharma and major genome institutes. It is building what Reid calls “the world’s largest complete human genome sequencing center so we can sequence thousands of complete human genomes, so that researchers can conduct clinical trial-sized studies.”

If all goes according to plan, that 32,000-square-foot, $75 million facility will deliver 1,000 human genomes in 2009 and an eye-popping 20,000 genomes in 2010. But that’s just the beginning. The firm plans to build ten genome centers in the U.S. and abroad over the next five years in partnership with various organizations and foreign governments.

The $4000 Genome

Although the data are unpublished (advisory board member Leroy Hood admitted he hadn’t seen the genome assembly results at the time of launch), Reid says his team produced its first human genome sequence last July. “It’s getting a bit long in the tooth already,” he jokes. “The total materials cost was about $4000.” The genome coverage was 22-fold from a total of 67 gigabases (Gb) of mapped reads. “The speed of the instrument is about ten times as fast as ABI and Illumina,” Reid claims. “This [project] ran four instruments for one run of a week. This is a 28-instrument-day experiment. By the launch of our product in Q2 [of 2009], it will be a 4-instrument-day experiment.”
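The coverage and instrument-day figures Reid quotes can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope check, not the company's own calculation; it assumes a haploid human genome of roughly 3.1 Gb, a figure that does not appear in the article.

```python
# Back-of-the-envelope check of the quoted sequencing figures.
# Assumption (not from the article): haploid human genome of ~3.1 gigabases.
GENOME_SIZE_GB = 3.1

mapped_reads_gb = 67   # total mapped reads quoted by Reid, in gigabases
instruments = 4        # instruments used for the first genome
run_days = 7           # "one run of a week"

# Fold coverage = total mapped bases / genome size
coverage = mapped_reads_gb / GENOME_SIZE_GB

# Instrument-days = number of instruments x days per run
instrument_days = instruments * run_days

print(f"Approximate coverage: {coverage:.1f}-fold")
print(f"Instrument-days: {instrument_days}")
```

Under that assumption the result rounds to the 22-fold coverage quoted, and four instruments running for a week give the 28 instrument-days Reid describes.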

When the product is launched next spring, “we fully anticipate the materials cost of that genome will be just under $1000,” Reid says. “We’re going to price the genomes at $5000 each, which covers of course not only material but also instruments and labor and overhead.” Reid admits it’s an incomplete measure of cost, but it has become the industry’s standard accounting method. By those criteria, “It will be the first $1000 genome,” says Reid.

The Complete Genomics technology is based on co-founder Radoje Drmanac’s work in sequencing-by-hybridization using a ligation strategy and gridded arrays of up to one billion DNA “nanoballs.” Reid’s expertise in computer science and Drmanac’s in biochemistry proved to be a perfect complement. “The convergence of biotechnology and computing has really enabled a whole new generation of DNA sequencing that’s going to change the world,” he says.

First Service

Complete Genomics is not the first company to explore a service model for genome sequencing (see “Genome Corp. Born Again,” Bio-IT World, Jan. 2008), but it is the first to do so using a next-gen sequencing technology.

Reid cites two main reasons for choosing a services model. First, he sees the key market as large pharma conducting clinical trials. “The pharmaceutical market has declared very clearly they don’t want to buy instruments,” says Reid. “They want to buy services, so that they get the data that enables them to do the discovery and development work, rather than have to own and operate a large-scale genome sequencing center.”

A second consideration is that the new sequencing technologies “generate a breathtaking amount of data,” says Reid. “Simply selling 10 or 20 instruments to a company doesn’t solve the problem. You then have to be able to manage huge volumes of data. We are putting in a Google-style data center to manage the data.” (See sidebar: “Petabytes of Data”)

Petabytes of Data

The Complete Genomics platform hinges on exquisite precision in manufacturing and arraying “nanoballs” of DNA. But it will be just as critical to manage gargantuan quantities of data. The task of building the data center falls to vice president of software Bruce Martin, a former executive with Sun and Openwave.

“I’ve built a team that is a little microcosm of what you see in the rest of the company,” says Martin, including bioinformaticians who worked with Craig Venter on genome assembly and the HapMap project, as well as experts in data mining, indexing databases, and high-throughput computing.

The imaging steps involve measuring hundreds of millions of spots. “We are currently generating close to a gigabit a second off the imager, and that’s going to go up by a substantial amount in the next year,” says Martin. “I have not only an extremely interesting computational challenge here, but there’s just a bandwidth problem… You can’t store images at that rate onto disk drives without spending a king’s ransom in storage.”

Martin says his group has had “a very successful run” with a clustered storage system from Isilon, which he likes for its “very high performance” and ability to scale to multi-petabyte file systems. “You can manage it with a very small footprint of staff. The Broad recently deployed them as well. I couldn’t say who got there first. We both basically have selected them for similar reasons.”

Due to space, power, and cooling considerations, Martin is exploring options with several high-density blade vendors. “We want to pack as many cores and as much memory into as small a footprint as we can for economic reasons,” he says.

Martin says he’s made “a significant investment in an aligner” for rapid genome alignments that can scale to thousands of processors. “I went out and found some very significant expertise in Silicon Valley in terms of high-speed, large-scale search and indexing. We have many of the leading companies in the world in that area.”

If the ramp-up for 2009 sounds daunting (1,000 genomes in a center housing 5 petabytes of data), the specs for sequencing 20,000 genomes in 2010 are positively frightening. “We’ll probably be in the 60,000-processor and 30-petabyte range in that time frame,” says Martin.
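To put the sidebar's data volumes in perspective, the rough sketch below converts Martin's figures into per-genome and per-day terms. It is illustrative only: it assumes decimal units (1 PB = 10^15 bytes), assumes the whole data center capacity is divided evenly across the year's genomes, and treats "close to a gigabit a second" as exactly 1 gigabit per second. None of these assumptions come from the article.

```python
# Rough conversion of the quoted data-center figures.
# Assumptions (not from the article): decimal units, capacity split evenly
# across genomes, and an imager rate of exactly 1 gigabit per second.
PB = 10**15  # bytes per petabyte (decimal)
TB = 10**12  # bytes per terabyte (decimal)


def storage_per_genome_tb(capacity_pb, genomes):
    """Terabytes of data-center capacity available per genome."""
    return capacity_pb * PB / genomes / TB


tb_per_genome_2009 = storage_per_genome_tb(5, 1_000)     # 5 PB, 1,000 genomes
tb_per_genome_2010 = storage_per_genome_tb(30, 20_000)   # 30 PB, 20,000 genomes

# Raw image data from one imager running continuously for a day
imager_bits_per_second = 10**9
seconds_per_day = 86_400
imager_tb_per_day = imager_bits_per_second / 8 * seconds_per_day / TB

print(f"2009: {tb_per_genome_2009:.1f} TB per genome")
print(f"2010: {tb_per_genome_2010:.1f} TB per genome")
print(f"Imager output: {imager_tb_per_day:.1f} TB per day")
```

On these assumptions, the capacity planned per genome falls from 2009 to 2010 while a single imager produces on the order of ten terabytes of raw images a day, which is consistent with Martin's remark that storing images at that rate would cost a king's ransom.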

Reid plans to build a further ten centers—for about $50 million apiece—in the U.S. and abroad over the next five years, in partnership with other companies, research organizations, and countries. Those ten genome centers will produce about one million genomes over the next five years. “A nice way to think about 1 million genomes is 1,000 people with each of 1,000 diseases. By the time we’ve done that, we will understand the genetic basis of all the important human diseases,” he says.
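Reid's million-genome target can be sanity-checked against the per-center output quoted earlier. The sketch below is a simplification: it assumes all ten centers run at a constant rate for the full five years, whereas the article says they would be built over that period, so the per-center figure is a lower bound on what each would need to produce.

```python
# Sanity check of the "one million genomes" target.
# Assumption (not from the article): all ten centers operate at a constant
# rate for the full five years.
people_per_disease = 1_000
diseases = 1_000
total_genomes = people_per_disease * diseases

centers = 10
years = 5
cost_per_center_musd = 50  # "about $50 million apiece"

genomes_per_center_per_year = total_genomes / (centers * years)
total_build_cost_musd = centers * cost_per_center_musd

print(f"Total genomes: {total_genomes:,}")
print(f"Genomes per center per year: {genomes_per_center_per_year:,.0f}")
print(f"Total build cost: ${total_build_cost_musd} million")
```

The per-center rate that falls out of this matches the 20,000 genomes the first facility is meant to deliver in 2010.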

Scale Up

The near-term goal for 2009 is to focus on pilot sequencing projects for the commercial and academic communities to validate the technology and establish workflows that will form the operational blueprint for expansion. Ten percent of the firm’s sequencing capacity in the next two years will be devoted to a collaboration with advisory board member Hood. The Institute for Systems Biology president is a partner with the Government of Luxembourg on a $200-million biobank and personalized medicine project.

As for targeting pharmaceutical companies, Reid predicts two key groups of early adopters—companies pursuing cancer and mental illness. Both groups of diseases have a strong genetic component. Says Reid: “To date, the industry has not been able to find the rare variants that are causes of diseases and drug response. That’s a new capability we’re bringing.”

Another enticing constituency for Complete Genomics is the personal genomics or consumer genomics market. Reid agrees: “Knome and 23andMe and Navigenics and all those guys will essentially buy genome services from us and add a lot of value [and] transfer it on to the consumer population.”