Keeping data valuable

By Luther Martin, November 6, 2008

According to the Association for Information and Image Management, only 20 percent of corporate data is structured data, but structured data consumes 75 percent of corporate IT resources. Examples of structured data are the information in ERP systems, CRM systems, and finance systems. In each of these cases, the data is well understood. It’s fairly easy to know both where it is and its exact format. This makes protecting it easy, whether you’re using encryption or some alternative. Maybe saying that it’s easy is an oversimplification.

It can still be tricky to integrate encryption with legacy systems that handle structured data because the size and format of data typically change when you encrypt it. The new technology of format-preserving encryption goes a long way towards making legacy computing environments simpler to deal with, but even that will only protect 20 percent of your data. The remaining 80 percent is much harder to protect.
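To make the idea of format preservation concrete, here is a minimal sketch in Python of a toy cipher that maps a string of decimal digits to another string of decimal digits of the same length. It uses a small Feistel network with HMAC-SHA256 as the round function. The function names, the round count, and the restriction to even-length digit strings are all assumptions made for illustration; this is not a standardized or vetted format-preserving encryption mode and should not be used to protect real data.

```python
import hashlib
import hmac

ROUNDS = 8  # assumed round count for this toy; not a security recommendation


def _check(digits: str) -> None:
    # Toy restriction: non-empty, even-length, ASCII decimal digits only.
    if not digits or len(digits) % 2 or any(c not in "0123456789" for c in digits):
        raise ValueError("expects a non-empty, even-length string of decimal digits")


def _round_value(key: bytes, round_no: int, half: str, modulus: int) -> int:
    # Pseudorandom round function: HMAC over the round number and one half.
    msg = bytes([round_no]) + half.encode("ascii")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def toy_fpe_encrypt(key: bytes, digits: str) -> str:
    """Encrypt a digit string into a digit string of the same length."""
    _check(digits)
    half_len = len(digits) // 2
    modulus = 10 ** half_len
    left, right = digits[:half_len], digits[half_len:]
    for round_no in range(ROUNDS):
        mixed = (int(left) + _round_value(key, round_no, right, modulus)) % modulus
        left, right = right, str(mixed).zfill(half_len)
    return left + right


def toy_fpe_decrypt(key: bytes, digits: str) -> str:
    """Invert toy_fpe_encrypt by running the rounds backwards."""
    _check(digits)
    half_len = len(digits) // 2
    modulus = 10 ** half_len
    left, right = digits[:half_len], digits[half_len:]
    for round_no in reversed(range(ROUNDS)):
        prev_right = left
        prev_left = (int(right) - _round_value(key, round_no, prev_right, modulus)) % modulus
        left, right = str(prev_left).zfill(half_len), prev_right
    return left + right
```

A 16-digit value encrypted this way is still a 16-digit value, so it fits the same column or field in a legacy system. A conventional block cipher would instead produce binary output that usually needs a wider or differently typed field.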

The remaining 80 percent of corporate data is unstructured. Examples of unstructured data are the information in e-mail, documents, and spreadsheets. Even voicemails count as unstructured data. With unstructured data, you often don’t know exactly where it is or what its exact format is.

Suppose that you encrypt all of your unstructured data. Maybe you can do this with the DLP technology offered by vendors like EMC. Once you’ve encrypted your data, however, it may become much less valuable than it once was because you’re probably unable to search it.

A significant part of the value of Google, after all, is due to the fact that they let you search lots of the world’s data. If you couldn’t do this, the world’s data would be much less useful and much less valuable. Similarly, if you can’t search your corporate data then it’s less valuable than it could otherwise be. If you believe that most of the value of modern businesses is determined by the value of their information, this might make you think twice about trying to encrypt unstructured data. On the other hand, using identity-based encryption (IBE) may provide a good way to encrypt data, yet still keep it searchable.

One feature of IBE is that all keys are calculated as needed. Because you can recalculate any private key when it’s needed, there’s no need to store private keys in a secure key archive to do useful things like key recovery. This also lets you do clever things like content filtering of encrypted e-mail by delegating the permission to get IBE private keys to a filtering appliance.
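The "calculated as needed" property can be sketched in a few lines of Python. Real IBE relies on pairing-based mathematics, and the public key is computed from the identity and public parameters; the HMAC-based derivation below is only a stand-in that shows why no key archive is needed. The function name `derive_private_key` and the example identity are hypothetical.

```python
import hashlib
import hmac


def derive_private_key(master_secret: bytes, identity: str) -> bytes:
    """Recalculate the key for an identity from the master secret.

    Stand-in for an IBE key server's extraction step (an assumption for
    illustration, not real IBE): the only stored secret is master_secret.
    """
    return hmac.new(master_secret, identity.encode("utf-8"), hashlib.sha256).digest()


# Key recovery is just a recalculation: the same inputs give the same key.
master = b"key-server master secret (example only)"
first = derive_private_key(master, "alice@example.com")
recovered = derive_private_key(master, "alice@example.com")
```

Because the key is a deterministic function of the master secret and the identity, `first` and `recovered` are identical, and nothing per-user has to be kept in a database.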

It also can let you easily search encrypted data in much the same way. Delegate the permission to get IBE private keys to a search appliance and it can decrypt encrypted unstructured data, search it, and return the results of the search. You can even restrict the result of such a search to users that are authorized to decrypt the encrypted data that the search finds. By doing this, you can protect your unstructured data without greatly reducing its value, staying compliant with the data security and privacy laws that complicate business these days while keeping the value of one of your most valuable assets.
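The delegated-search flow described above might look roughly like the following Python sketch. Everything here is an assumption for illustration: the `KeyServer` and `SearchAppliance` classes, the HMAC-based key derivation standing in for IBE, and the toy XOR stream cipher standing in for real encryption. In real IBE the sender would encrypt using only public parameters and the recipient’s identity, not the master secret.

```python
import hashlib
import hmac
from dataclasses import dataclass


def derive_key(master_secret: bytes, identity: str) -> bytes:
    # Stand-in for IBE private key extraction (assumption, not real IBE).
    return hmac.new(master_secret, identity.encode("utf-8"), hashlib.sha256).digest()


def toy_xor_cipher(key: bytes, data: bytes) -> bytes:
    # Toy cipher: XOR with a SHA-256 counter keystream. The same call
    # encrypts and decrypts. Not secure; for illustration only.
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


@dataclass
class EncryptedDoc:
    doc_id: str
    recipient: str  # identity the document was encrypted to
    ciphertext: bytes


def encrypt_for(master_secret: bytes, recipient: str, doc_id: str, text: str) -> EncryptedDoc:
    key = derive_key(master_secret, recipient)
    return EncryptedDoc(doc_id, recipient, toy_xor_cipher(key, text.encode("utf-8")))


class KeyServer:
    def __init__(self, master_secret: bytes, delegates: set):
        self._master_secret = master_secret
        self._delegates = set(delegates)  # appliances allowed to fetch any key

    def get_private_key(self, requester: str, identity: str) -> bytes:
        if requester != identity and requester not in self._delegates:
            raise PermissionError(f"{requester} may not obtain the key for {identity}")
        return derive_key(self._master_secret, identity)


class SearchAppliance:
    NAME = "search-appliance"

    def __init__(self, key_server: KeyServer, docs: list):
        self._key_server = key_server
        self._docs = docs

    def search(self, user: str, term: str) -> list:
        hits = []
        for doc in self._docs:
            # Delegated permission lets the appliance decrypt each document.
            key = self._key_server.get_private_key(self.NAME, doc.recipient)
            text = toy_xor_cipher(key, doc.ciphertext).decode("utf-8")
            # Only report matches the requesting user could decrypt anyway.
            if term.lower() in text.lower() and user == doc.recipient:
                hits.append(doc.doc_id)
        return hits
```

In this sketch the appliance sees plaintext only in memory while it searches, and a user learns about a match only if the document was encrypted to that user’s identity. A real deployment would need a richer authorization policy than the simple identity comparison used here.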

Unstructured data isn’t commonly encrypted today, but when it is, I wouldn’t be surprised to find that encrypting it is a good application for IBE.