Navigating the "big data" challenge

Twenty years ago, business leaders wanted more data to inform their decisions. Today they have so much data available—through indiscriminate collection and easy access to millions of data sets—that decision making has actually become more difficult. Significant insights don’t spring from raw data. Someone has to know which questions to ask of the data and where to find the answers.

Many companies face this problem, as they look to these vast stores of data to learn more about their markets, their customers and their opportunities. In addition to being creative, curious and business savvy, effective decision makers are learning to become more comfortable working with data and more capable of drawing insights from analysis.

For executives who have long wished for better answers, “big data” can look like an easy win. They might be tempted to think that an expensive big data solution, on its own, can sort through data and deliver the new insights they are looking for.

As with any technological innovation, the question of how to use it is ultimately a business question. No big data product or service can substitute for the rigorous and demanding process of figuring out what questions to ask.

We find that when companies do the work of determining the right questions to ask and where to find the answers, the tools they need are typically already in-house, either in business intelligence software or in existing database tools. To help executives decide whether they need to invest in new talent, tools and capabilities, we have developed a framework that describes four characteristics of big data (see Figure 1). If executives are not facing at least three of these four issues, it’s unlikely they’re confronting a real big data problem, which means their organization’s existing capabilities and tools may be sufficient for now.
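The three-of-four rule can be written down as a simple check. The sketch below is only an illustration: the class name `Assessment`, the function `is_big_data_problem` and the yes-or-no flags are hypothetical, and in practice each flag is a judgment call rather than a measured value. The four fields correspond to the framework's characteristics: volume, velocity, data types and analytical complexity.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    """Hypothetical yes/no answers for the four framework characteristics."""
    volume: bool                 # must very large volumes be analyzed at once?
    velocity: bool               # is real-time analysis required?
    data_types: bool             # is the data unstructured or hard to split?
    analytical_complexity: bool  # are many operations needed to reach insight?


def is_big_data_problem(a: Assessment, threshold: int = 3) -> bool:
    """Return True when at least `threshold` of the four issues apply."""
    issues = [a.volume, a.velocity, a.data_types, a.analytical_complexity]
    return sum(issues) >= threshold


# Illustrative case: large volumes, but no other issue applies.
retailer = Assessment(volume=True, velocity=False,
                      data_types=False, analytical_complexity=False)
print(is_big_data_problem(retailer))  # existing tools may be sufficient
```

An organization that answers yes to only one or two of the questions would, by this rule, look first to its existing capabilities and tools.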

Are you facing a big data problem?

Volume: Does the problem require that large volumes (currently, from tens of terabytes up through petabytes) be analyzed simultaneously? Some applications demand it. Consider Amazon’s recommendation engine, which aggregates and analyzes hundreds of terabytes of shopping cart and click-through data from millions of users to determine which products are related. Combining this information with a user’s online behavior generates real-time product recommendations personalized to each Amazon customer.

However, online retailers can often deliver fairly accurate recommendations by analyzing just a statistically relevant sample of a large data set. To reduce the size of the data set, most e-commerce sites get their recommendations by noting products that were purchased together. While this approach lacks the personalization available to Amazon, most retailers consider it good enough because it provides many of the same up-sell and cross-sell opportunities at a fraction of the computational cost.
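Noting which products were purchased together can be sketched in a few lines. The example below is a minimal illustration, not any retailer's actual method: the order data and the function names `co_purchase_counts` and `recommend` are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(orders):
    """Count how often each pair of products appears in the same order."""
    counts = Counter()
    for order in orders:
        # Sort so each pair is always stored in the same (a, b) order.
        for a, b in combinations(sorted(set(order)), 2):
            counts[(a, b)] += 1
    return counts


def recommend(product, counts, top_n=3):
    """Return the products most often bought together with `product`."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == product:
            related[b] += n
        elif b == product:
            related[a] += n
    # Rank by count, then by name so ties are resolved predictably.
    ranked = sorted(related.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical order history: each list is one shopping cart.
orders = [
    ["tent", "sleeping bag", "lantern"],
    ["tent", "sleeping bag"],
    ["tent", "sleeping bag"],
    ["lantern", "batteries"],
    ["tent", "lantern"],
]
counts = co_purchase_counts(orders)
print(recommend("tent", counts))
```

The computation involves only counting and sorting, so it can run on a sample of orders with ordinary database or business intelligence tools. What it gives up is personalization: every shopper looking at the same product sees the same suggestions.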

Velocity: Does the problem require analysis in real time? Wall Street traders need to analyze and execute trades in fractions of a second. They pay millions to gain millisecond advantages by locating their servers as close as possible to the stock exchange, and their firms are developing proprietary big data solutions. For them, big data tools that reduce the processing times from milliseconds to microseconds—and enable throughputs from hundreds of thousands to millions of transactions—offer enough return to justify the expenditure.

Few organizations need to make decisions that quickly. In brick-and-mortar retail, for example, only a handful of companies monitor their supply chain as closely as Wal-Mart does, with its highly publicized RFID (radio frequency identification) tags that track products through its distribution network. But even those organizations still make decisions based on time increments of a few minutes or hours. Spending big money to generate instantaneous insights would be a waste today, because those insights are put to use only in a more traditional time cycle. That may change in the future, but for now a far better investment for most retailers would be to improve their analytical capabilities to make more effective use of existing data and tools, to help them avoid running out of inventory or to group deliveries into fewer shipments.

Data types: Are the data easy to split into meaningful units that can be sorted, evaluated and compared? Traditional analytical software doesn’t cope well with unstructured data, including multilingual audio, video, images and text. National security organizations face this challenge every day, as they process petabytes of communication data, security video footage, calls and other raw data. To meet this challenge, governments spend hundreds of millions of dollars to develop and operate big data computer clusters and analytics.

Analytical complexity: Think of the complexity of a business problem as the number of operations required to transform a set of data into actionable insights. One example is LexisNexis’s program to determine whether individuals are related, using a variety of data types, including court records, birth certificates and online records. Financial institutions depend on LexisNexis’s system to help prevent identity theft and fraud.

By comparison, determining a mobile phone subscriber’s monthly bill is simple, even though the telecom is dealing with massive amounts of call data describing call duration, location, interconnections and roaming. It is straightforward for the telecom to identify every call (due to the unique mobile number) and aggregate total usage and fees at the end of the month. The analytic process doesn’t have to cope with fuzziness, just simple yes-or-no rules.
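To illustrate how little analytical complexity billing involves, the sketch below groups call records by mobile number and sums the charges. The rates, the record layout and the function name `monthly_bills` are hypothetical and chosen only for the example; amounts are kept in whole cents to avoid rounding problems.

```python
from collections import defaultdict

# Hypothetical tariffs, in cents per minute.
RATE_CENTS_PER_MINUTE = 10
ROAMING_SURCHARGE_CENTS = 25


def monthly_bills(call_records):
    """Sum charges per mobile number from (number, minutes, roaming) records."""
    totals = defaultdict(int)
    for number, minutes, roaming in call_records:
        # A simple yes-or-no rule: roaming calls pay a surcharge.
        rate = RATE_CENTS_PER_MINUTE
        if roaming:
            rate += ROAMING_SURCHARGE_CENTS
        totals[number] += minutes * rate
    return dict(totals)


# Hypothetical call data for one month.
calls = [
    ("555-0100", 12, False),
    ("555-0100", 3, True),
    ("555-0199", 40, False),
]
bills = monthly_bills(calls)
for number, cents in sorted(bills.items()):
    print(f"{number}: ${cents / 100:.2f}")
```

However many call records there are, the logic stays the same: one exact key, one yes-or-no rule and one sum. That is why large data volume alone does not make a problem analytically complex.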

Where to invest now

Companies should focus investments on hiring and developing data scientists who can ask the right questions and use current data systems to answer them, rather than purchasing new solutions that may be more than they need.

Improve your data capabilities. Decision makers and their teams will need to develop the capability to ask questions that push the business forward. It won’t be easy: Figuring out which questions to ask requires creativity, a deep understanding of the available data and a thorough knowledge of the business. Managers should place a premium on recognizing opportunities in data and on thinking about what could be possible if there were no constraints to getting answers to the questions they might ask. Metrics that show how well teams are using data to meet their goals will become increasingly important.

Don’t mistake reports for insights. The risk of being overloaded by information becomes greater with every terabyte stored. With storage costs continually declining, organizations may be tempted to save everything indiscriminately. But that practice can actually increase the challenge of locating the data that really matters. Managers and teams will need periodically to ask themselves if the data they are storing and reporting is yielding meaningful insights that affect the success of the business. They may even want to stop issuing reports to see who notices and complains—which is a proven and effective way to show that data handling was geared toward busywork and reports, overlooking opportunities to create valuable business insights.

Restructure your organization to focus on insights from data. No organization can hope to gain insights that move its business forward simply by handing its data problem over to the IT department and purchasing a big data solution. The goals of the business will continue to guide the way that companies use data, just as they guide the way companies use all their resources. Consider pairing traditional executives with quantitative people who understand and are comfortable working with data. Pairing their complementary skills can help guide teams to decisions that exploit the data opportunity in service of the business’s progress.

Whether you face a true big data challenge or just have lots of data, the key success factor is building strong capabilities that move the business. Talented decision makers, solid analytical skills and good technology are essential. But the ultimate source of competitive advantage lies in the art of abstracting the potential value within data and turning it into meaningful insights, which should be the goal of any transformation aimed at overcoming your data challenge.

Getting better insights from data—Bain’s approach

Audit your current data systems. Are you capturing and storing the types of information that will help you gain the insights you need?

Benchmark insights and analysis. How do your insights compare with those of competitors and with businesses in other industries that have identified value in their data?

Identify and prioritize the opportunities for improving data utilization. How well do you use your data? Do you see opportunities for developing better analytics or asking better questions? Will you need more or different data? Will you need more or faster processing?

Identify the resources necessary to realize those opportunities. What new tools, people, analyses, systems and service providers will you need to address the opportunities?

Rasmus Wegener is a partner with Bain & Company in Atlanta. Velu Sinha is a partner in Bain’s Palo Alto office. Both are members of Bain’s Technology practice.