Scientists have stored audio and text on fragments of DNA and then retrieved them with near-perfect fidelity—a technique that one day may provide a new way to handle the overwhelming data of the digital age.

The scientists encoded in DNA an audio clip of Martin Luther King Jr.’s "I Have a Dream" speech, a photograph, a copy of Crick and Watson’s famous "double helix" scientific paper from 1953 and Shakespeare’s 154 sonnets. They were then able to retrieve them with 99.99% accuracy. The experiment was reported Wednesday in the journal Nature.

"All we’re doing is adapting what nature has hit upon—a very good way of storing information," said Nick Goldman, a computational biologist at the European Bioinformatics Institute in Hinxton, England, and lead author of the Nature paper.

DNA—the molecule that contains the genetic instructions for all living things—is one of the most effective storehouses of data. It is stable, durable and dense. A cup of DNA theoretically could store about 100 million hours of high-definition video and last for tens of thousands of years.

While DNA-based storage remains a long way from being commercially viable—high cost is one major hurdle—the scientific barriers are starting to fall. Last August, researchers at Harvard University reported in the journal Science the encoding of an entire 54,000-word book on strands of DNA.

"The experiments are very similar," said George Church, a molecular geneticist at Harvard and senior researcher for the project reported in Science. "Because these are truly independent efforts we’ve shown there’s a real field here rather than just one group."

Both experiments encoded similar amounts of information and had roughly similar accuracy rates, according to Dr. Church.

The European Bioinformatics Institute is part of the European Molecular Biology Laboratory, Europe’s flagship life sciences lab. EMBL is funded by public research money from 20 European member states.

Companies, governments and universities face an enormous challenge storing the ever-growing flood of digital information. Magnetic tapes can degrade within a decade, while hard disks are expensive and need a constant supply of electricity. Some computer experts have looked for answers in biology.

In recent years, they have found ways to encode trademarks in cells and poetry in bacteria, as well as store snippets of music in the genetic code of micro-organisms. DNA, though, offers a key advantage over the other nature-inspired methods: since DNA isn’t a living thing, it can sit passively in a test tube where it is less subject to biological changes.

Dr. Goldman and his colleagues first downloaded onto a computer a 26-second clip of Dr. King’s "I Have a Dream" speech, the sonnets and the other items to be stored. The data was in normal computer code: a long string of ones and zeros. A software program devised by Dr. Goldman’s team converted those ones and zeros into the letters A, C, G and T, the four chemical bases that make up DNA.
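The conversion step can be illustrated with a minimal sketch. The article does not describe the mapping the team's software actually used, so the scheme below (two bits per base, with 00, 01, 10 and 11 assigned to A, C, G and T) is purely an assumption for illustration, and the function names are hypothetical.

```python
# Assumed mapping for illustration only: 00 -> A, 01 -> C, 10 -> G, 11 -> T.
# The team's real encoding is not described in the article and may differ.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Turn each byte into four letters, two bits per letter, high bits first."""
    letters = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            letters.append(BASES[(byte >> shift) & 0b11])
    return "".join(letters)


def dna_to_bytes(dna: str) -> bytes:
    """Reverse bytes_to_dna: every four letters become one byte again."""
    if len(dna) % 4 != 0:
        raise ValueError("DNA string length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)
```

Under this assumed mapping, every byte of the original file becomes exactly four letters, and decoding simply runs the same table in reverse.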

The single, long DNA-based string was chopped up into about 150,000 fragments, each 120 letters long. Each fragment contained about 100 letters encoding the data. The remaining 20 letters were a sort of index: instructions for later restoring the fragments to the right order.
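The fragment layout described above can be sketched as follows. The 100-letter payload and 20-letter index come from the article; how the index was actually written is not stated, so this sketch assumes it is simply the fragment number expressed in base 4 with A, C, G and T as the digits 0 to 3. The function names and the padding of the final fragment with A are also assumptions.

```python
from typing import List

BASES = "ACGT"


def index_to_dna(n: int, width: int = 20) -> str:
    """Write n as a fixed-width base-4 number, using A, C, G, T as digits 0-3."""
    if n < 0 or n >= 4 ** width:
        raise ValueError("index does not fit in the index field")
    digits = []
    for _ in range(width):
        digits.append(BASES[n % 4])
        n //= 4
    return "".join(reversed(digits))


def split_into_fragments(dna: str, payload_len: int = 100,
                         index_len: int = 20) -> List[str]:
    """Cut dna into payload_len-letter pieces and append an index to each."""
    fragments = []
    for number, start in enumerate(range(0, len(dna), payload_len)):
        # Assumption: the last piece is padded with 'A' to a full payload;
        # the caller must remember the original length to trim it later.
        payload = dna[start:start + payload_len].ljust(payload_len, "A")
        fragments.append(payload + index_to_dna(number, index_len))
    return fragments
```

At roughly 150,000 fragments of about 100 data letters each, the payload in the experiment comes to about 15 million letters.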

The information was sent to Agilent Technologies Inc. of Santa Clara, Calif., where a laboratory machine used the data and appropriate chemicals to manufacture physical strings of DNA. Those fragments were shipped to Dr. Goldman’s lab in England.

"I thought the vial was empty when it arrived," said Dr. Goldman. But the DNA was there—it lay like a speck of dust at the bottom of the vial, almost impossible to see.

After some lab work, the DNA was dispatched to an EMBL lab in Heidelberg, Germany. There, a DNA-sequencing machine fired lasers at the fragments and read their genetic code, yielding a computer file in the form of As, Cs, Gs, and Ts.

Back in Hinxton, a computer program reassembled the fragments in the right order, and then converted them back into ones and zeros. When run on a laptop, those ones and zeros were interpreted as the original audio clip, sonnets and other items. When the clip of Dr. King’s speech was played back, it sounded just like the original version, said Dr. Goldman.
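The reassembly step might look something like the sketch below. It is not the team's program: it assumes each 120-letter fragment ends with a 20-letter index that is the fragment number written in base 4 (A, C, G, T as digits 0 to 3), and it ignores the sequencing errors and duplicate reads that a real decoder would have to handle. All names are hypothetical.

```python
BASES = "ACGT"


def dna_to_index(index_letters: str) -> int:
    """Read a base-4 number whose digits 0-3 are written as A, C, G, T."""
    n = 0
    for base in index_letters:
        n = n * 4 + BASES.index(base)
    return n


def reassemble(fragments, original_len, payload_len=100):
    """Sort fragments by their trailing index and join the payloads."""
    ordered = sorted(fragments, key=lambda f: dna_to_index(f[payload_len:]))
    joined = "".join(f[:payload_len] for f in ordered)
    # Drop any padding that was added to fill out the final fragment.
    return joined[:original_len]
```

The resulting string of letters would then be converted back into ones and zeros by reversing whatever letter mapping was used for encoding.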

There are plenty of challenges before DNA storage could become a useful technique. Writing DNA is still extremely expensive. And for the method to be commercially successful, it would have to be automated and turned into a reliable, industrialized process.

"In 10 years it’s probably going to be about 100 times cheaper," said Dr. Goldman. "At that time it probably becomes economically viable."