Data Preservation vs. Data Archive

Eighty percent of an organization’s data is inactive, and 15% of it really matters. Archiving is the management of the 80%. Data preservation is the management of the 15%. While organizations often need help identifying the 80% of data that is not changing, they typically know exactly what data falls into the preservation category. However, they often use the same tool to manage both types of data.

What Makes Data Preservation Different?

The key point of differentiation between archived data and data the organization needs to preserve is the reason for recall. Preserved data is often recalled to address a specific legal or regulatory request, or it is a set of data that needs to be analyzed to help the organization make a better business decision or create a new product.

In both cases, the inability to deliver preserved data can be very costly for the organization. If the request is legal or compliance in nature, failure to deliver that data could lead to significant fines. If the data is needed for analysis, it could cost the organization time and money caused by making decisions without all the relevant data.

Preserved data needs to be handled differently than archived data. After identification, it needs to be moved into a long-term storage repository. During the transfer, it needs to be encrypted and set read-only to prevent modification. Also, there needs to be a log of anyone accessing that data. Finally, the preservation system needs to provide continuous verification of the data. Most archive systems only identify data and move it to an archive; they don’t provide encryption, change prevention, or access logging. Data preservation software should provide all of these capabilities.
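To make those requirements concrete, here is a minimal Python sketch of three of them: change prevention, access logging, and verification. It is an illustration under stated assumptions, not how any particular preservation product works. The function names (preserve, verify, log_access) and the file names manifest.json and access.log are hypothetical. Encryption is deliberately left out because Python's standard library has no symmetric cipher; a real system would encrypt the data before or during the transfer.

```python
import hashlib
import json
import os
import shutil
import stat
import time
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_access(repo: Path, user: str, action: str, name: str) -> None:
    """Append one JSON line per access to a hypothetical audit log."""
    entry = {"time": time.time(), "user": user, "action": action, "file": name}
    with open(repo / "access.log", "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")


def preserve(source: Path, repo: Path, user: str) -> str:
    """Copy a file into the repository, record its checksum, set it read-only."""
    repo.mkdir(parents=True, exist_ok=True)
    target = repo / source.name
    shutil.copy2(source, target)
    checksum = sha256_of(target)

    # Record the checksum so later verification has a reference value.
    manifest_path = repo / "manifest.json"
    manifest = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    manifest[source.name] = checksum
    manifest_path.write_text(json.dumps(manifest, indent=2))

    # Remove all write permission bits (assumption: POSIX-style permissions).
    os.chmod(target, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    log_access(repo, user, "preserve", source.name)
    return checksum


def verify(repo: Path, name: str, user: str) -> bool:
    """Recompute the checksum and compare it with the recorded value."""
    manifest = json.loads((repo / "manifest.json").read_text())
    log_access(repo, user, "verify", name)
    return sha256_of(repo / name) == manifest[name]
```

Running verify on a schedule is one simple way to approximate continuous verification: any file whose current checksum no longer matches the manifest has been altered or corrupted since it was preserved.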

Where Should You Preserve Data?

The next question is, “Where should the organization store its preserved data?” The default answer for most organizations is tape media, which is moved offsite to a physical facility like Iron Mountain or a similar service. While secure, these services suffer when it comes to meeting the organization’s recall needs. For example, if an organization needs data, it has to know which tape, or tapes, it is on, then communicate that to the service, then wait for the service to find, pull and ship the tape or tapes back to the organization. Once that’s done, the tapes have to be mounted and the organization has to wait for them to be searched to find the required data.

The cloud is a viable option. It provides similar long-term storage and easier verification, and it transmits only the data required to meet the recall request. Security is a concern, but that concern goes away if the data preservation software can add encryption and file access auditing.

Twelve years ago George Crump founded Storage Switzerland with one simple goal: to educate IT professionals about all aspects of data center storage. He is the primary contributor to Storage Switzerland and is a heavily sought-after public speaker. With over 25 years of experience designing storage solutions for data centers across the US, he has seen the birth of such technologies as RAID, NAS and SAN, Virtualization, Cloud and Enterprise Flash. Prior to founding Storage Switzerland he was CTO at one of the nation's largest storage integrators, where he was in charge of technology testing, integration and product selection.