We need to add intelligence to data to help it protect itself.

There's a lot of cool stuff going on with backup these days, with a lot of interesting technical innovation making backup easier and more flexible than ever. Yet despite these new developments in data protection, I can't help but wonder if the efforts are somehow missing the mark.

Key data protection methods like deduplication and snapshot management continue to develop, while other techs -- like continuous data protection (CDP) -- are getting dusted off, spiffed up and finding new niches. The idea of backup-less backups is starting to take hold, but most companies are still looking for ways to streamline or simply enable more traditional approaches to data protection, like daily incremental backups plus weekly fulls.

We need to figure out ways to squeeze a ton of backup into a 10- to 12-hour window or even eschew windows altogether in favor of CDP methods that do backup in dribs and drabs instead of a torrent of data at day's end.

Most data management revolves around the age of the data, if it happens at all. New stuff is backed up, older stuff gets archived and so on. It's a process based on a single dimension of the data and it's purely reactive. Granted, it's worked for years and still works at plenty of companies, but a changing IT environment is applying new pressures to old tools. It used to be that a storage manager's job was to give data a place to live and ensure it was protected by tucking away copies in case the originals got lost or damaged. But with smartphones, tablets and all manner of bring-your-own-device (BYOD) things invading the enterprise, it's not enough to house and protect data -- you have to make sure no one lets it wander away.

So while the tools are barely adequate to maintain the status quo, they're definitely going to fall short in the future; maybe they're already starting to break down at your company.

I think that's because our typical approach to data management is quickly becoming obsolete. Most of what IT has done and continues to do assumes it has control over the data, and that the amount of data is controllable. Neither case is true anymore.

Dealing with data using software or hardware tools is essentially a process that's external to the data. The intelligence in the process doesn't come from the data, but from the tools used to move, copy and otherwise deal with data. And today, tools aren't all that smart. So the data itself needs to be smarter; it needs to know what it's supposed to do, how long it should do it, and where it can and can't go.

It's time for metadata on steroids -- mega-metadata or whatever you want to call it -- that will add enough intelligence to data so that we humans using other relatively crude tools don't have to worry about it.

This mega-metadata would be self-describing and able to trigger autonomic actions, such as delete, move/don't move and copy. The goal would be to create data that is smart enough to take care of itself after a little initial guidance, and would know what would keep it safe and what would get it into trouble.

We're already part of the way there, with technologies such as the expanded metadata enabled by object storage and the application of Active Directory or LDAP to data management. On the BYOD front, being able to remotely wipe a phone or tablet is an effective data security response, although it requires too much human intervention.

If all the necessary disposition information were packed in with the data, the data itself would know what to do. Of course, it would need cooperative operating systems, file systems and applications.

For example, a file might be tagged to remain "live" until a certain date. It would also be able to let a backup or replication application know how many copies to make, be able to tell an iPhone user that it couldn't be copied or sent to a sync-and-share cloud, and it would let the archiver know when it was time to retire or simply shuffle off this mortal coil.
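As a rough illustration of the tagging idea, here is a minimal Python sketch of what such a disposition record might look like. The `DispositionPolicy` class, its field names and the `sync-and-share-cloud` target label are all hypothetical; no such standard exists, and a real version would depend on cooperating operating systems, file systems and applications.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet


@dataclass(frozen=True)
class DispositionPolicy:
    """Hypothetical 'mega-metadata' record that travels with a file."""
    live_until: date        # the file stays "live" through this date
    copies_required: int    # copies a backup or replication app should keep
    blocked_targets: FrozenSet[str] = frozenset()  # places the file may not go

    def is_live(self, today: date) -> bool:
        """True while the file has not passed its live-until date."""
        return today <= self.live_until

    def may_copy_to(self, target: str) -> bool:
        """True unless the target is explicitly blocked for this file."""
        return target not in self.blocked_targets

    def next_action(self, today: date) -> str:
        """What an archiver should do with the file today."""
        return "keep" if self.is_live(today) else "retire"


# Example values are made up for illustration.
policy = DispositionPolicy(
    live_until=date(2030, 12, 31),
    copies_required=3,
    blocked_targets=frozenset({"sync-and-share-cloud"}),
)

print(policy.may_copy_to("sync-and-share-cloud"))
print(policy.next_action(date(2031, 1, 1)))
```

The point of the sketch is that the backup application, the mobile device and the archiver would each ask the same record a different question, rather than each carrying its own separate policy.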

My mega-metadata fantasy continues with an "ultimate dashboard" that displays daily alerts describing what files are scheduled for self-destruction, which ones will be moved near-line or off-line, and which files have been moved to different user devices. It would be a kind of "Where's Waldo?" for data, but you wouldn't have to search to find all the instances of Waldo.
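The daily-alert part of that dashboard could be sketched along these lines in Python. This assumes the scheduled events are already available as simple records; the `FileEvent` type, the event kind labels and the file paths are invented for illustration and don't correspond to any existing product.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, NamedTuple


class FileEvent(NamedTuple):
    """Hypothetical record of something scheduled to happen to a file."""
    path: str
    kind: str   # e.g. "self-destruct", "move-nearline", "move-offline", "device-copy"
    due: date


def daily_alerts(events: List[FileEvent], today: date) -> Dict[str, List[str]]:
    """Group the paths of all events due today by event kind."""
    alerts: Dict[str, List[str]] = defaultdict(list)
    for event in events:
        if event.due == today:
            alerts[event.kind].append(event.path)
    return dict(alerts)


# Made-up sample data.
events = [
    FileEvent("/finance/q1.xlsx", "self-destruct", date(2030, 1, 15)),
    FileEvent("/hr/handbook.pdf", "move-nearline", date(2030, 1, 15)),
    FileEvent("/sales/deck.pptx", "device-copy", date(2030, 1, 16)),
]

print(daily_alerts(events, date(2030, 1, 15)))
```

The hard part isn't the grouping, of course; it's getting every device, file system and application to report these events in the first place.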

A lot of those things happen now, but you have to pull the levers and push the buttons on a myriad of tools. And just keeping track of everything is a full-time job.

There are probably a bunch of products out there that I'm just not aware of that do some of the things I've described. But for this vision to become reality, applications have to get smarter, and OSes and file systems have to become more aware.

If data is the most important thing -- and I can't imagine anyone would dispute that -- then we have to focus more on the data itself if we're ever going to protect and secure it in an information-crazy, ever-mobile world.

About the author: Rich Castagna is editorial director of TechTarget's Storage Media Group.

2 comments


I like the idea of the mega-metadata. It reminds me (somewhat) of the self-verifying data that we often use as an oracle in testing, where we embed the correct answer in the test data itself. The difficulty comes in with how we encode the instructions that allow the data to “know” what to do.

Hello again, mcorum--you're exactly right. Bundling up all the disposition metadata with the data is only part of the picture--a pretty small part if the apps that deal with the data aren't aware of the metadata and don't know how to interpret and act on it.