DNA data storage: One gram of DNA can store 215 petabytes of data

Microsoft claimed a record last year by storing 200MB of data on synthetic DNA strands (DNA data storage), but researchers have now gone much further with a method capable of storing 215 petabytes (215 million GB) of data on a single gram of DNA.

Scientists from Columbia University and the New York Genome Center (NYGC) have invented a new coding system, dubbed DNA Fountain, which is capable of packing 215 petabytes (215 million GB) of data onto a single gram of DNA.

According to Science Daily, that is about 100 times more than previous researchers have stored on DNA, and it was achieved by adapting an algorithm designed for streaming video on a smartphone.

Benefits of DNA Data Storage:

High data storage density.

Withstands extreme environmental conditions.

Large storage capacity.

Secure, as it is invisible to the human eye.

Efficient power usage.

Can store data for long periods when protected from water and oxygen.

The same four chemical building blocks behind all life on Earth could someday be used to replace traditional computer storage.

DNA holds promise for data storage because its density is superior to that of tape, disk, and optical media. It can also store information for thousands of years if it is kept in the right conditions. While information in digital devices is written as 1s and 0s, researchers have devised various algorithms for encoding data to conform with DNA’s four base nucleotides: adenine (A), guanine (G), cytosine (C), and thymine (T). Using this approach, Microsoft claimed a record last year by storing 200MB of data, including a music video, on synthetic DNA strands.
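As a rough illustration of how 1s and 0s can be made to conform with the four bases, the sketch below maps each pair of bits to one nucleotide. The particular mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions chosen for this example; it is not the scheme used by Microsoft or by DNA Fountain, and it ignores the biochemical constraints that real encoders have to respect.

```python
# Hypothetical 2-bit mapping, chosen only for illustration.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of nucleotides, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Invert encode(): read two bits per base and regroup them into bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)          # four bases per byte
    print(decode(strand))  # round-trips back to the original bytes
```

Each byte becomes four bases under this mapping, so a strand is always four times as long as the input in bytes.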


DNA Fountain was created by Yaniv Erlich, a computer science professor at Columbia Engineering and a core member of NYGC, and Dina Zielinski, an associate scientist at NYGC.

The 2MB compressed file they wrote to DNA included the graphical operating system KolibriOS, an old French film, a $50 Amazon gift card, a computer virus, and a Pioneer plaque. It also included the 1948 paper A Mathematical Theory of Communication by Bell Labs information theorist Claude Shannon, in a nod to his pioneering work on encoding, noise, and decoding in information transmission.

The researchers approached the challenge through Shannon’s theory, which puts the maximum information capacity of DNA storage, in an ideal world, at two bits per nucleotide. However, as with communications, DNA storage capacity is limited by various noise factors.

“DNA storage is basically a communication channel,” write Erlich and Zielinski. “We transmit information over the channel by synthesizing DNA oligos. We receive information by sequencing the oligos and decoding the sequencing data. The channel is noisy due to various experimental factors, including DNA synthesis imperfections, PCR dropout, stutter noise, degradation of DNA molecules over time, and sequencing errors.”
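The two-bit ceiling follows from the size of the alphabet: with four equally likely symbols, each one carries log2(4) = 2 bits. The short sketch below works through that arithmetic and shows how the number of nucleotides needed grows once the usable rate drops below the ideal. The function names and the 1.5 bits-per-nucleotide figure are illustrative assumptions, not values reported by the researchers.

```python
import math


def ideal_bits_per_nucleotide(alphabet_size: int = 4) -> float:
    """Upper bound on information per symbol for a noiseless alphabet."""
    return math.log2(alphabet_size)


def min_nucleotides(num_bytes: int, bits_per_nt: float) -> int:
    """Fewest nucleotides needed to hold num_bytes at a given rate."""
    return math.ceil(num_bytes * 8 / bits_per_nt)


if __name__ == "__main__":
    ideal = ideal_bits_per_nucleotide()          # 2.0 for A, C, G, T
    print(ideal)
    # A 2MB file (taken here as 2,000,000 bytes) at the ideal rate:
    print(min_nucleotides(2_000_000, ideal))
    # The same file at a hypothetical, lower practical rate:
    print(min_nucleotides(2_000_000, 1.5))
```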


They say DNA Fountain, so named because it uses fountain codes, which are also used for video streaming to mobile devices, “approaches the Shannon capacity while providing robustness against data corruption”.
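To give a feel for why fountain codes tolerate lost data, here is a minimal toy sketch of the general idea: the file is split into chunks, and each "droplet" is the XOR of a pseudo-random subset of chunks identified by a seed. A receiver that collects enough droplets, in any order and with any of them missing, can usually peel the chunks back out. This is not the authors' implementation. The function names, the uniform choice of subset size, and the chunk contents are all assumptions for illustration, and practical fountain codes use a carefully tuned distribution instead.

```python
import random


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def pick_indices(seed: int, num_chunks: int) -> set:
    """Derive the subset of chunk indices from the seed (toy: uniform size)."""
    rng = random.Random(seed)
    degree = rng.randint(1, num_chunks)
    return set(rng.sample(range(num_chunks), degree))


def make_droplet(chunks, seed):
    """Return (seed, payload), where payload is the XOR of the chosen chunks."""
    payload = bytes(len(chunks[0]))
    for i in pick_indices(seed, len(chunks)):
        payload = xor_bytes(payload, chunks[i])
    return seed, payload


def decode(droplets, num_chunks):
    """Peeling decoder: returns the chunk list, or None if it gets stuck."""
    pending = [(pick_indices(seed, num_chunks), payload)
               for seed, payload in droplets]
    recovered = {}
    progress = True
    while progress and len(recovered) < num_chunks:
        progress = False
        for k, (indices, payload) in enumerate(pending):
            # Strip out every chunk we already know from this droplet.
            known = indices & set(recovered)
            for i in known:
                payload = xor_bytes(payload, recovered[i])
            indices = indices - known
            pending[k] = (indices, payload)
            # A droplet with one unknown chunk left reveals that chunk.
            if len(indices) == 1:
                recovered[next(iter(indices))] = payload
                progress = True
    if len(recovered) < num_chunks:
        return None
    return [recovered[i] for i in range(num_chunks)]


if __name__ == "__main__":
    chunks = [b"DNA ", b"Foun", b"tain", b"!!!!"]  # equal-length chunks
    droplets = [make_droplet(chunks, seed) for seed in range(300)]
    survivors = droplets[::2]  # pretend half of the droplets were lost
    print(decode(survivors, len(chunks)) == chunks)
```

The point of the design is that no individual droplet is essential: the sender can keep producing new ones from fresh seeds, and the receiver only needs enough of them, which is what makes the approach attractive for a channel where some oligos drop out.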

According to Science Daily, they used DNA Fountain to generate 72,000 DNA strands, or oligos, which were sent to Twist Bioscience, the DNA synthesis firm that supplied Microsoft’s synthetic DNA.