Scientists Store an OS, a Movie and a Computer Virus on DNA

Did you know that 1 gram of DNA can store 1,000,000,000 terabytes of data for 1,000+ years?

Just last year, Microsoft purchased 10 million strands of synthetic DNA from a San Francisco DNA synthesis startup called Twist Bioscience and collaborated with researchers from the University of Washington to focus on using DNA as a data storage medium.

The researchers believe that DNA is the perfect storage medium, as it is ultra-compact and can last hundreds of thousands of years if kept cool and dry, and suggest this is the "highest-density data-storage device ever created."

Now, in the latest experiments, a pair of researchers from Columbia University and the New York Genome Center (NYGC) have come up with a new technique to store massive amounts of data on DNA, and the results are marvelous. The duo successfully stored 214 petabytes of data per gram of DNA, encoding a total of six files:

A full computer operating system

An 1895 French movie "Arrival of a Train at La Ciotat"

A $50 Amazon gift card

A computer virus

A Pioneer plaque

A 1948 study by information theorist Claude Shannon

The new research, which comes courtesy of Yaniv Erlich and Dina Zielinski, has been published in the journal Science.

Movie Stored and Retrieved from DNA Molecules

A copy of the 1895 French film "Arrival of a Train at La Ciotat" was encoded into synthetic DNA molecules and later retrieved using a new coding strategy developed by Yaniv Erlich and Dina Zielinski at Columbia University and the New York Genome Center.

Calling their process a "DNA Fountain," the researchers first compressed all the data into a single master archive and split it into short strings of binary digits, made up of ones and zeros.

Next, the duo used an "erasure-correcting algorithm called fountain codes" to randomly package the strings into droplets. Each droplet contains a barcode in its sequence that helped the researchers reassemble the file.

The researchers then "mapped the ones and zeros in each droplet to the four nucleotide bases in DNA: A, G, C and T," and ended up with a digital list of 72,000 DNA strands that contained the encoded data.

This list was then sent in a text file to Twist Bioscience, the same DNA synthesis startup from which Microsoft purchased 10 million strands of synthetic DNA last year, which turned the digital information into biological DNA.

The digital universe is large and growing: by 2020 it is expected to contain nearly as many digital bits as there are stars in the universe, reaching 44 zettabytes, or 44 trillion gigabytes.

So, DNA data storage could help big organizations store an enormous amount of information in a way that can still be read in a hundred years.

However, cost is still an issue. The researchers spent around $7,000 to synthesize the 2MB of data and another $2,000 to read that data.
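To make the DNA Fountain steps described above more concrete (split the data into short binary strings, package them into barcoded droplets, map the bits to bases), here is a minimal Python sketch. It is not the researchers' implementation: the function names, the 4-byte segment size, the 2-byte seed used as a barcode, the toy rule for choosing how many segments a droplet combines, and the bit-to-base mapping (00 to A, 01 to C, 10 to G, 11 to T) are all assumptions made for illustration.

```python
import random
from typing import Optional

SEGMENT_BYTES = 4   # illustrative segment size (assumption)
SEED_BYTES = 2      # barcode: the seed that identifies a droplet (assumption)
BASES = "ACGT"      # assumed mapping: 00->A, 01->C, 10->G, 11->T


def split_segments(data: bytes) -> list:
    """Pad the data with zero bytes and cut it into fixed-size segments."""
    padded = data + b"\x00" * (-len(data) % SEGMENT_BYTES)
    return [padded[i:i + SEGMENT_BYTES]
            for i in range(0, len(padded), SEGMENT_BYTES)]


def pick_indices(seed: int, n_segments: int) -> list:
    """Derive from the seed which segments a droplet combines."""
    rng = random.Random(seed)
    degree = rng.randint(1, min(3, n_segments))  # toy degree distribution
    return rng.sample(range(n_segments), degree)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def make_droplet(seed: int, segments: list) -> bytes:
    """Return the barcode (seed) followed by the XOR of the chosen segments."""
    payload = bytes(SEGMENT_BYTES)
    for i in pick_indices(seed, len(segments)):
        payload = xor_bytes(payload, segments[i])
    return seed.to_bytes(SEED_BYTES, "big") + payload


def to_dna(blob: bytes) -> str:
    """Map every two bits to one nucleotide."""
    return "".join(BASES[(byte >> shift) & 0b11]
                   for byte in blob for shift in (6, 4, 2, 0))


def from_dna(strand: str) -> bytes:
    """Inverse of to_dna: four nucleotides back into one byte."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def decode(strands: list, n_segments: int) -> Optional[list]:
    """Peeling decoder: resolve droplets that have exactly one unknown segment."""
    pending = []
    for strand in strands:
        blob = from_dna(strand)
        seed = int.from_bytes(blob[:SEED_BYTES], "big")
        pending.append((pick_indices(seed, n_segments), blob[SEED_BYTES:]))
    known = {}
    progress = True
    while progress and len(known) < n_segments:
        progress = False
        for indices, payload in pending:
            unknown = [i for i in indices if i not in known]
            if len(unknown) == 1:
                for i in indices:
                    if i in known:
                        payload = xor_bytes(payload, known[i])
                known[unknown[0]] = payload
                progress = True
    if len(known) < n_segments:
        return None  # not enough droplets to recover every segment
    return [known[i] for i in range(n_segments)]


if __name__ == "__main__":
    segments = split_segments(b"Hello, DNA storage!")
    strands = [to_dna(make_droplet(seed, segments)) for seed in range(300)]
    recovered = decode(strands, len(segments))
    print(strands[0], b"".join(recovered).rstrip(b"\x00"))
```

Because each droplet carries its own seed, the decoder can rebuild which segments went into it without any extra index. Producing more droplets than there are segments is what lets the file survive the loss of some strands. This sketch leaves out the compression step and uses far more droplets than a tuned fountain code would need.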