To Succeed with Big Data, Start Small

While it isn’t hard to argue the value of analyzing big data, it is intimidating to figure out what to do first. There are many unknowns when working with data that your organization has never used before — the streams of unstructured information from the web, for example. Which elements of the data hold value? What are the most important metrics the data can generate? What quality issues exist? As a result of these unknowns, the costs and time required to achieve success can be hard to estimate.

As an organization gains experience with specific types of data, certain issues will fade, but there will always be another new data source with the same unknowns waiting in the wings. The key to success is to start small. It’s a lower-risk way to see what big data can do for your firm and to test your firm’s readiness to use it.

The Traditional Way

In most organizations, big data projects get their start when an executive becomes convinced that the company is missing out on opportunities in data. Perhaps it’s the CMO looking to glean new insight into customer behavior from web data. That conviction leads to an exhaustive and time-consuming process in which the CMO’s team works with the CIO’s team to specify and scope the precise insights to be pursued and the analytics needed to get them.

Next, the organization launches a major IT project. The CIO’s team designs and implements complex processes to capture all the raw web data needed and transform it into usable (structured) information that can then be analyzed.

Once analytic professionals start using the data, they’ll find problems with the approach. This triggers another iteration of the IT project. Repeat a few times and everyone will be pulling their hair out and questioning why they ever decided to try to analyze the web data in the first place. This is a scenario I have seen play out many times in many organizations.

A Better Approach

The process I just described doesn’t work for big data initiatives because it’s designed for cases where all the facts are known, all the risks are identified, and all steps are clear — exactly what you won’t find with a big data initiative. After all, you’re applying a new data source to new problems in a new way.

Again, my best advice is to start small. First, define a few relatively simple analytics that won’t take much time or data to run. For example, an online retailer might start by identifying what products each customer viewed so that the company can send a follow-up offer if they don’t purchase. A few intuitive examples like this allow the organization to see what the data can do. More importantly, this approach yields results that are easy to test to see what type of lift the analytics provide.
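As a rough illustration of how simple such a first analytic can be, here is a minimal Python sketch of the retailer example. It assumes the web data has already been reduced to (customer, product) pairs for views and for purchases; the function name and the data layout are hypothetical, not taken from any particular system.

```python
from collections import defaultdict


def viewed_not_purchased(views, purchases):
    """Map each customer to the products they viewed but did not buy.

    Both arguments are iterables of (customer_id, product_id) pairs,
    an assumed layout chosen only for this illustration.
    """
    bought = defaultdict(set)
    for customer, product in purchases:
        bought[customer].add(product)

    candidates = defaultdict(set)
    for customer, product in views:
        if product not in bought[customer]:
            candidates[customer].add(product)

    # Sort the products so the output is stable and easy to inspect.
    return {customer: sorted(products) for customer, products in candidates.items()}
```

Each customer in the result is a candidate for a follow-up offer on the listed products.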

Next, instead of setting up formal processes to capture, process, and analyze all of the data all of the time, capture some of the data in a one-off fashion: perhaps a month’s worth for one division and a certain subset of products. If you capture only the data you need to perform the test, you’ll find the initial data volume easier to manage and you won’t muddy the water with a bunch of other data, a problem that plagues many big data initiatives.
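A one-off capture of this kind can be as modest as a single filter over an export of the raw events. The sketch below assumes each event is a dictionary with "day", "division", and "product" keys; those field names are invented for illustration, and real web data would need some parsing before it looks like this.

```python
from datetime import date


def one_off_sample(events, start, end, division, products):
    """Keep only the events needed for the test.

    The result covers one division, a chosen set of products, and the
    days from start (inclusive) to end (exclusive).
    """
    return [
        event
        for event in events
        if start <= event["day"] < end
        and event["division"] == division
        and event["product"] in products
    ]


# Hypothetical usage: one month of tent views for a single division.
# sample = one_off_sample(events, date(2024, 3, 1), date(2024, 4, 1), "outdoor", {"tent"})
```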

At this point, it is time to turn analytic professionals loose on the data. Remember: they’re used to dealing with raw data in an unfriendly format. They can zero in on what they need and ignore the rest. They can create test and control groups to whom they can send the follow-up offers, and then they can help analyze the results. During this process, they’ll also learn an awful lot about the data and how to make use of it. This kind of targeted prototyping is invaluable when it comes to identifying trouble and firming up a broader effort.
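To make the test-and-control step concrete, here is a minimal sketch of one way an analyst might split customers at random and then measure lift. The function names, the 50/50 split, the fixed seed, and the definition of lift as the relative improvement in conversion rate are all assumptions made for this example; an actual analysis would also check whether the difference is statistically significant.

```python
import random


def split_test_control(customers, test_share=0.5, seed=42):
    """Randomly assign customers to a test group and a control group."""
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    shuffled = list(customers)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * test_share)
    return shuffled[:cut], shuffled[cut:]


def conversion_lift(test_conversions, test_size, control_conversions, control_size):
    """Relative lift of the test group's conversion rate over the control group's."""
    test_rate = test_conversions / test_size
    control_rate = control_conversions / control_size
    return (test_rate - control_rate) / control_rate
```

Only the test group receives the follow-up offer. If 60 of 1,000 test customers buy and 50 of 1,000 control customers buy, the lift works out to about 0.2, or 20 percent.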

Successful prototypes also make it far easier to get the support required for the larger effort. Best of all, the full effort will now be less risky because the data is better understood and the value is already partially proven. It’s also worthwhile to learn that the initial analytics aren’t as valuable as hoped, because that tells you to focus effort elsewhere before you’ve wasted many months and a lot of money.

Pursuing big data with small, targeted steps can actually be the fastest, least expensive, and most effective way to go. It enables an organization to prove there’s value in major investment before making it and to understand better how to make a big data program pay off for the long term.