The DNA Data Problem

Has life science reached a tipping point in how it handles mountains of genomic information?

By Bob Grant | December 5, 2011

WIKIMEDIA COMMONS, SILKY M

Gene sequencing technology has advanced in leaps and bounds in the past couple of decades. But as genomicists and others involved in research projects that generate reams of DNA, RNA, or proteomic data know well, storing and analyzing all that information is rapidly becoming an intractable problem.

A recent article in The New York Times highlights the difficulty, citing many leading researchers airing their frustrations with discrepancies in the pace of innovation between sequencing and data handling technologies. "Data handling is now the bottleneck," David Haussler, director of the center for biomolecular science and engineering at the University of California, Santa Cruz, told the Times. "It costs more to analyze a genome than to sequence a genome."

Indeed, though the price of sequencing an entire human genome is expected to decrease to the long-anticipated $1,000 mark in the next couple of years, that cost is dwarfed by the mounting expenses of storing and analyzing genomic data.

And the data deluge (which The Scientist covered in its October issue) may also cause the shuttering of federal repositories designed to store the information. The amount of data stored in one such database has more than tripled in the past year alone, according to the Times article, bulging at the seams with 300 trillion nitrogenous bases occupying almost 700 trillion bytes of computer memory.

"We have these giant piles of data and no way to connect them," Steven Wiley, a biologist at the Pacific Northwest National Laboratory, told the Times. "I'm sitting in front of a pile of data that we’ve been trying to analyze for the last year and a half."

Pavlov demonstrated effecting placebo phenomena in multi-celled organisms by manipulation of their drives-reactions. Now placebo and imagination phenomena are demonstrated also in Earth's smallest, base organisms, in the genes and genomes of multi-celled organisms, in our primal 1st stratum and 2nd stratum base organisms.

A very good reason to smile.

Now an interesting chain is exposed to our view, the Genes-Virtual Reality Chain, a most intriguing cultural evolution chain extending from the genesis of our genes to nowadays, throughout life, a virtual reality existence, and by virtual reality phenomena, exploitations and manipulations.

This problem was highlighted in a recent conference here in South Africa, where two researchers using high-throughput sequencing to study virus diversity both said independently that they did not know what to do with the "waste" data. Presently they just dumped it, but they thought this was tantamount to being criminal, given that others could well find use for it at some later date. And we have never had any kind of national data storage....
