Data Storage Systems in the Living World

by
Brian Thomas, M.S. *

The world has witnessed an explosion of digital technology in the past few decades. With these advances comes the question of how to preserve the digital information that is constantly being generated.

Dr. Francine Berman, digital data expert and director of the San Diego Supercomputer Center at the University of California, San Diego, has published a guide for how digital data should be stored.1 She proposes that data storage systems (“data cyberinfrastructures”) should ideally be open, organized, stable, predictable, cost-effective, manageable, accessible, and secure.2

The similarities between the features she lists and the data storage strategies already in place in living things are striking. For instance, number one on Berman’s top ten guidelines for data management is: “Make a plan. Create an explicit strategy for stewardship and preservation for your data, from its inception to the end of its lifetime; explicitly consider what that lifetime may be.”2 After sin entered the world and introduced death and decay, it seems that God planned for biological data to be able to last for several thousand years, but not millions. Even with DNA repair mechanisms, biological information erodes until it collapses after 500 generations or fewer.3

Berman’s third guideline states, “Associate metadata with your data. Metadata is needed to be able to find and use your data immediately and for years to come. Identify relevant standards for data/metadata content and format, following them to ensure the data can be used by others.”2 Indeed, each genome contains both data (including genes) and metadata (including epigenetic factors); the latter is required to properly access the former.4 Exactly where cellular metadata is located is currently being investigated.

Next, Berman advises, “Make multiple copies of valuable data. Store some of them off-site and in different systems.” This principle can also be found in living organisms. In plants and animals that undergo sexual reproduction, the body cells of each individual are diploid: each carries two complete sets of gene-containing chromosomes. Other biological data duplicates also exist, including multiple copies of often-used “housekeeping” genes, and, within tissues, whole cells serve as backups.

Number seven states, “Determine the level of ‘trust’ required when choosing how to archive data. Are the resources of the U.S. National Archives and Records Administration necessary or will Google do?” Likewise, different kinds of biological information are stored at different levels of protection within genomes. Some genes are in sections of DNA that are wound too tightly for DNA-manipulating enzymes to easily access, but other genes (those to which the cell needs ready and immediate access) are in less-guarded DNA regions.

The ninth guideline warns, “Pay attention to security. Be aware of what you must do to maintain the integrity of your data.” The Creator thought of this also. He provided an array of DNA damage repair enzymes. “Without a repair system in place, it is likely that excessive mutations would have dramatically shortened our lifespans, and would have caused our extinction long ago.”5

If cyberinfrastructures require intelligence and effort to produce, as Dr. Berman’s guide implies by virtue of its very existence, then why would biological information infrastructures have required anything less? Surely the One who laid the foundations for safe and stable biological data is the One who made the world—just as He told us in His Word. The organization and complexity of data storage in living organisms point to an ultimate Data Expert.