What's in my file share?

It's a common enterprise question, but often a difficult one to answer. This blog aims to explore the different ways that enterprises can leverage their file shares just by having insight into them and control over what lies within. While information governance and data management initiatives have become more and more popular, files have often been left out of those projects. And as a result, companies simply aren't sure what files they have, or where their files are.

Through products like ZL’s File Analysis and Management, organizations can bring governance to their file share environments through tools like metadata analysis, content analysis and advanced visualization capabilities. And as our blog posts show, these technologies can be leveraged by enterprises to fix any number of file-management-related problems.

The Future is Dark...Data

80% of all data is completely unused

A couple of days ago -- or at least a couple of days from when I started writing this -- CRN did an article on Ed Harbour, the illustrious VP for IBM’s Watson Group, where they talked cognitive computing and this conspicuous little thing called dark data.

Now, usually when the word “dark” turns up next to anything digital, some not-so-nice things pop into people’s heads, mainly due to horror stories about the “dark” or “deep” web, as the unindexed part of the Internet has become known. But for all of its potentially nasty associations, “dark data” is surprisingly benign -- well, at least at first glance.

Gartner.com defines dark data as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing)” and blah, blah, blah...it’s basically unused or unprocessed and usually unstructured data that sits in storage. And while that might sound not so bad, it actually gets a little nefarious when you really start looking into it.

The thing is, an estimated 80% of stored data is dark data -- shoutout to our good friend, Vilfredo Pareto -- and that starts getting really bad really fast when you consider how much new data is created each day and how many tera- or petabytes of data exist in, say, your average Fortune 500 company. Or just your average business, even.

And even if you aren’t the CEO of a startup or multi-billion dollar corporation and are more like a casual reader who stumbled across this article, this should still be bad news to you. Why? Well, because your personal information exists somewhere in this black sea of unused bits and bytes and if you’re going to become the victim of fraud or identity theft or some cybercrime in the future, there’s a good chance it’s going to be due to the misplacement of this so-called dark data.

So, let’s have a look at the numbers. The New York Times says that 90% of the energy used by data centers is wasted, pure and simple, and IBM says that 60% of data loses its value almost immediately. And amongst organizations in Europe, the Middle East, and Africa, it’s estimated that the cost of all this is nearly a trillion dollars -- and that’s not even considering the costs that might come from something like a lawsuit because of what I just said a paragraph ago...

…not good, I know.

So why does dark data even exist? Why even have it in the first place? Well, a lot of it comes from people’s nature. We have a tendency to collect things. The way that you’ve kept that treadmill in your basement for the last 5 years because, “hey, I might use it some day,” is, unfortunately, the same way that a lot of organizations approach data storage.

The other reason is that not all dark data is inherently bad. In fact, it’s believed that dark data will be all the rage in the near future, especially with the development of cognitive computing a la IBM’s Watson. Of course, technologies also already exist to isolate and get rid of redundant and useless dark data.
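To make "isolating redundant data" a little more concrete, here is a minimal sketch of one common approach: hashing file contents and grouping files whose hashes match. This is only an illustration, not how any particular product (ZL's included) does it, and the names `file_digest` and `find_duplicates` are made up for this example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root by content hash; return groups of 2+ files.

    Hypothetical helper for illustration only.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                by_digest[file_digest(path)].append(path)
            except OSError:
                continue  # unreadable file: skip it rather than abort the scan
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_duplicates` on a share's root folder returns lists of paths with byte-for-byte identical contents. Whether any of those copies can actually be deleted is still a policy decision, not something a script should decide on its own.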

The point, really, is that the future is dark. And whether that’s a good or a bad thing remains to be seen.

Sony Pictures, JP Morgan, Target, Anthem, Ashley Madison… the list of debilitating cybersecurity breaches that have flooded the news over the last two years goes on and on. Yet with the business focus on data breaches coming from outside the firewall, organizations have quite successfully managed to bolt the front door, all while leaving the backdoor and windows wide open!

In short, there is a lot of focus on the external threats to business data, but not nearly enough focus on internal risks.

The Internal Threat

At a SINET conference two years ago, I heard a former CIA Director note that while 98% of the security market is focused on protecting the firewall – that is, locking the front door – historically, 90% of the most harmful data breaches have been internal. Imagine that: only 2% of the market is focused on 90% of the problem! His prime example, of course? Who else but Mr. Edward Snowden. Few data breaches can compare to the havoc he single-handedly wreaked by just walking out with some of the most sensitive information the NSA was holding.

For 17 years, ZL has had a laser-focused mission of helping organizations protect, secure, and govern their most prized asset: unstructured information. While much of the focus has been on managing this data for the purposes of eDiscovery, compliance, records management, and archiving, ZL has also realized that the majority of the risk organizations face is due to the troves of data that were left unsecured, much of which shouldn’t have been retained in the first place. In particular, we saw one of our major enterprise customers go through one of the most publicized data breaches in history: a breach that could have been prevented if the organization had taken better routine care of its files and secured them properly. Hence, the advent of ZL’s powerful file analysis platform.

The “Snowden risk” remains the number one challenge for the enterprise, where unstructured data such as file shares continues to erupt in volume – completely unmanaged and ungoverned. For years, we have collectively highlighted the “eDiscovery review cost” and “storage savings” arguments in an attempt to clean out our mountains of unmanaged data, but at the heart of this issue lie governance and security. Expenses for eDiscovery and storage do indeed pale in comparison to an internal data breach of the kind that has seen companies shave off billions of dollars in value within weeks. Ironically, many of these internal breaches are NOT intentional, but that fact carries little weight given the implications of employees walking out with highly sensitive information assets – unintentional or not.

Managing the Unknown

Unstructured information – such as files, email, social media, instant messages – holds truly unique value as “human created” content, distinct from the structured data generated by machines. Nothing matters more to an organization than its human capital; the petabytes of unstructured data in file shares are a true manifestation of just that. While we have taken extensive measures to protect our CRM systems and other repositories of “sensitive information,” our most prized intellectual property is sitting in file shares unbeknownst to us. And it’s often unsecured for anyone to access and walk out with!

The ZL Unified Archive platform has helped hundreds of organizations get a handle on managing information they are aware of, but as big data – largely unstructured content – continues to explode, the key challenge has become “we don’t know what we don’t know.” The result? Petabytes of “dark data” that hold as much risk as they do value, neither of which is adequately addressed.

After all, how can we put policies in place without visibility into the very data we want to govern?

Solving the Dark Data Problem

ZL strives to empower clients with the information they need to make the most effective business and governance decisions. Through File Analysis and Management, ZL has managed to shine a very bright light on the organization’s “dark data” to allow for:

Securing of critical information assets

Ensuring data is secured and accessible on a “need-to-know” basis

Remediating access for users who hold permissions to sensitive data that they shouldn’t have

Identifying critical data for the business

Clearly identifying records that need to be secured and retained long-term

Data governance is the linchpin of securing information assets and ensuring the organization leverages its data in the most optimal manner possible. With file analysis, ZL can offer a complete end-to-end solution for data governance, regardless of the type of data or its location. Most importantly, this brings hidden “dark data” to light, in support of all areas of governance: records management, eDiscovery, regulatory compliance and true data security.

At the end of the day, we cannot govern our data without complete visibility into it. After all, how can we slam shut the backdoor and windows without knowing where they’re located? With file analysis, we can accomplish just that.

Farid has extensive experience working with Fortune 500 organizations in establishing and implementing governance strategies spanning eDiscovery, Records Management, Regulatory Compliance, Data Analytics, and broader Enterprise Data Management initiatives. In particular, he has focused on working with regulated industries such as financial services, insurance, healthcare and pharmaceuticals to implement robust, long-term technologies to complement their overall governance strategy. Farid is a graduate of The Wharton School at the University of Pennsylvania.

File Share Lost and Found

Making sense of the dark data mess

We have all experienced the frustration of losing something in a public place. At first, you retrace your steps hoping to find that jacket you misplaced. As you backtrack, fighting the crowd with your eyes peeled for any signs of a blue jacket, you realize that someone must have found it. They must have turned it in to the lost and found.

While that may give you a glimmer of hope in that scenario, should it? Even if your jacket is “found,” you still need someone to retrieve it from a timeless stash of forgotten items.

Sometimes the quest to find a specific document or file can feel like a game of 20 Questions. “Which blue jacket was that?” “Does it have your name on it?” “When did you lose it?” That’s exactly how I treat my folders in the share drive. “What was that contract ID number?” “Did I save it to this folder or that one?” “When did I save it last?”

Now extrapolate that unwieldy methodology out to your entire organization. Is everybody treating their documents and storage that way? They are? Uh oh. That’s thousands and thousands of people running their own mismatched, uncatalogued lost and founds. As individuals, we’re probably capable of managing several hundred (even a few thousand) files in such a hodge-podge fashion. But at a company level? Good luck.

For years now, we’ve been hearing customers and prospective customers alike ask for a way to find, track, and analyze “dark data.” Gaining control and organization of such content is key to improving productivity and access to data. Finally, that capability is here. ZL File Analysis and Management is designed to convert your lost and found from a disorganized cardboard bin into a meticulously governed digital library of content. No more guessing about what people may be looking for or accessing. ZL Enterprise Analytics™ gives you the insight and key statistics you need to maintain more up-to-date ACL permissions, or a more efficient and defensible retention schedule… just to name a few examples.
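For a rough sense of what the simplest kind of file metadata statistic looks like, here is a small sketch that lists files nobody has modified in a given number of days, using nothing but file system timestamps. It is a toy illustration under my own assumptions, not a description of ZL's software; the function name `stale_files` and the idea of using last-modified time as the staleness signal are choices made for this example.

```python
import os
import time


def stale_files(root, max_age_days, now=None):
    """Return (path, size_bytes, age_days) for files under root that were
    last modified more than max_age_days ago, oldest first.

    Hypothetical helper for illustration only.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable: skip it
            if info.st_mtime < cutoff:
                age_days = int((now - info.st_mtime) // 86400)
                stale.append((path, info.st_size, age_days))
    return sorted(stale, key=lambda row: row[2], reverse=True)
```

Something like `stale_files("/path/to/share", 365 * 7)` would give a first list of candidates to check against a seven-year retention rule. A real retention decision would also depend on record type, legal holds, and who owns the content, none of which a timestamp can tell you.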

It’s time to shine a light on your dark data. Let’s make sure it’s actually found immediately when it’s needed, and not just (lost and) “found.”

I work to help organizations in both the public and private sectors tackle issues of information management in the most secure and economical ways possible. My goal is to see every ZL customer get the most out of our solution, and to use my experiences to help keep ZL’s software on the cutting edge. I draw from my background in politics and economics to maintain a fresh outlook on the industry, and exercise my analytical muscles using my healthy obsession with basketball statistics.