Fractal Analytics Blog

Unstructured Data Extends Beyond Social

By Sejal Sura, November 9, 2015

Even as early as 2003, before Facebook and Twitter were founded and before the release of the first iPhone, unstructured data comprised 85% of all data (Computerweekly.com). However, it was the explosion of social data that prompted companies to invest in mining unstructured data. Companies that limit their mining of unstructured data to social media fail to obtain a holistic view of the customer, which results in fragmented customer experiences.

Many other unstructured data sources provide more history and context, and they should be leveraged to address business needs.

Examples of Unstructured Data Sources:

Call Center data

Email messages

PowerPoint slides/Word documents

Sensor data

Telephony data

Web log data

Web page content

Video/audio/image files

A newer unstructured data source that is growing exponentially is open data. This data is freely available for anyone to use and republish. Most of the open data available today comes from the government and science sectors. The purpose of open data is to provide transparency, generate new knowledge from combined data sources and fuel innovation.

In this article I will present three use cases that illustrate how leveraging a variety of unstructured data sources across different industries has empowered organizations to save lives, reduce fraud and increase revenue. Then, I will address the major challenges organizations encounter in utilizing unstructured data. Lastly, I will provide best practices to overcome these challenges.

Power of Harnessing Unstructured Data

Companies can benefit from mining unstructured data by identifying which business problems they want to address. Below are three different industry examples.

Healthcare Industry

The healthcare ecosystem is complex, consists of many players and continues to evolve with the adoption of the Affordable Care Act and the increase in wearable devices. The result is an assortment of collected data. For hospitals, the benefit of standardizing and integrating all of this data is threefold:

Improve the accuracy of patient diagnosis

Provide patient treatment in a timely manner

Increase positive patient outcomes

These business objectives ultimately save lives while reducing healthcare costs.

According to a study conducted by University of North Carolina Health Care and Seton Healthcare, “left ventricle ejection fraction values, which measure volume of blood pumped by the heart to help evaluate heart failure patients are found [only] two percent of the time in structured data. However, these are found nearly 75% of the time in unstructured data.” If you are a heart failure patient, that 73-percentage-point difference significantly increases your odds of spending another holiday with family.
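To make the idea concrete, here is a minimal sketch of how an ejection fraction value buried in a free-text clinical note could be pulled into a structured field. It is an illustration only, not the method used in the study: the pattern, the function name `extract_lvef` and the sample notes are all assumptions invented for this example, and real clinical language is far more varied than a single regular expression can handle.

```python
import re

# Hypothetical pattern (an assumption for this sketch): matches phrasings such
# as "LVEF 35%", "EF: 40 %" or "ejection fraction of 55%".
LVEF_PATTERN = re.compile(
    r"\b(?:LVEF|EF|ejection fraction)\b[^0-9%]{0,20}?(\d{1,2})\s*%",
    re.IGNORECASE,
)


def extract_lvef(note: str) -> list:
    """Return every ejection-fraction percentage mentioned in a note."""
    return [int(value) for value in LVEF_PATTERN.findall(note)]


# Invented sample notes, not real patient data.
notes = [
    "Echo today. LVEF 35%, moderate mitral regurgitation.",
    "Patient reports fatigue. Ejection fraction of 55% on last study.",
    "Follow-up visit, no echo performed.",
]

# One structured record per note; an empty list means no value was found.
structured = [
    {"note_id": i, "lvef_percent": extract_lvef(note)}
    for i, note in enumerate(notes)
]
```

Once the value sits in a field such as `lvef_percent`, it can be queried and joined with the rest of the patient record like any other structured data.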

Even though not every organization is in the business of saving lives, every organization can still utilize unstructured data in meaningful ways.

Federal Government

The US government, like many other governments, depends on taxes to provide many services to its people. With the increasing deficit, it is even more critical for the US government to identify methods to collect taxes effectively. Unfortunately, big data is not used to locate secret accounts in the Cayman Islands to reduce the federal deficit! Instead, the Internal Revenue Service (IRS) is leveraging big data to focus on three areas (Federal News Radio):

Stop fraud and improper payments

Reduce the tax gap between the number of people who are paying taxes and the number of people who should be paying taxes

Ensure core compliance with tax rules and laws

The IRS implemented a strategy to integrate big data, analytics and operations to solve business problems, which resulted in the recovery of $2 billion over the last three years. Based on its predictive algorithms, it was also able to identify 23% more fraud cases and recommend prosecution in 84% of them. Even though the IRS cannot control federal spending, it can assist in more accurate tax collection to fund federal projects.

Travel Industry

Empty seats and unoccupied hotel rooms are lost revenue for airlines and hotels. How can travel companies compete more effectively in this space? Red Roof Inn identified a market need: passengers stranded at airports by weather-related flight cancellations. The company knew these passengers were most likely to use mobile phones or tablets to find a hotel nearby. It was also fortunate to have hotels located near major airports. Therefore, the strategy it deployed was a real-time mobile campaign based on open data: weather information, flight cancellations and customer location. The company “applied an algorithm that considered the varying travel conditions, time of day and volume of cancellations to determine the most opportune time” that a mobile ad would appeal to stranded passengers, and personalized these ads to each customer. This mobile ad campaign resulted in a 10% increase in annual revenue (Direct Marketing News).

Challenges in Utilizing Unstructured Data

A lack of tools that provide efficient text parsing, analytics, taxonomy and metadata management to classify unstructured data on the fly

Inconsistency from activating unstructured data in one channel while using only structured data in another, which creates fragmented customer experiences. This happens, for example, when companies limit their use of unstructured data to social media.

Five Best Practices to Utilize Unstructured Data

Define the business problem(s). It is critical for businesses to identify what issues they intend to address to determine which sources of unstructured data are most relevant to integrate.

Assess technical capabilities. Storing and utilizing unstructured data requires the appropriate technical infrastructure. The choice of data storage and retrieval often depends on scalability, volume, variety and the desire to address business needs in real time.

Provide structure to unstructured data. Define a process to classify this data. Identify a technique, such as text mining or Natural Language Processing (NLP), to find patterns, extract meaning and structure the data. Common approaches for structuring text involve manual tagging with metadata or part-of-speech tagging.

Apply predictive analytics to the integrated data set to identify opportunities and capitalize on insights in real time. Develop models that capture relationships between variables from past occurrences and exploit them to predict an unknown outcome. Performing this calculation in real time enables businesses to retain customers, detect fraud, cross-sell, manage risk and support medical decision making at the point of care, based on the anticipated future response.

Develop a deployment strategy based on the velocity, volume and variety of data involved. This data will continue to grow at a faster clip, and the variety of data will also morph over time, so it’s critical to develop a process that enables your company to adapt to these changes seamlessly.
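The third and fourth practices can be sketched together in a few lines. The example below is a toy illustration under stated assumptions, not a production approach: the keyword taxonomy, the function names and the customer messages are all invented for this sketch, and the "model" is simply the historical churn rate per tag rather than a real predictive algorithm. It shows the shape of the workflow: first give free text a structure (tags), then use past outcomes attached to that structure to score a new message.

```python
from collections import defaultdict

# Hypothetical taxonomy (an assumption): tag -> keywords that signal it.
TAXONOMY = {
    "billing": {"invoice", "charge", "refund"},
    "outage": {"down", "outage", "offline"},
    "cancel": {"cancel", "terminate", "switch"},
}


def tag_text(text):
    """Practice 3: turn free text into a structured set of tags."""
    words = {word.strip(".,!?").lower() for word in text.split()}
    return {tag for tag, keywords in TAXONOMY.items() if words & keywords}


def fit_churn_rates(history):
    """Practice 4: per tag, the share of past messages followed by churn."""
    counts = defaultdict(lambda: [0, 0])  # tag -> [churned, total]
    for text, churned in history:
        for tag in tag_text(text):
            counts[tag][0] += int(churned)
            counts[tag][1] += 1
    return {tag: churned / total for tag, (churned, total) in counts.items()}


def predict_churn(text, rates, default=0.0):
    """Score a new message with the highest churn rate among its tags."""
    return max(
        (rates.get(tag, default) for tag in tag_text(text)),
        default=default,
    )


# Invented history: (customer message, did the customer later churn?)
history = [
    ("Please cancel my plan, the service is always down.", True),
    ("I want a refund for this charge.", False),
    ("Thinking of a switch to another provider.", True),
    ("Service was offline for an hour yesterday.", False),
]

rates = fit_churn_rates(history)
score = predict_churn("My connection is down again, I may cancel.", rates)
```

In practice the tagging step would rely on a proper text mining or NLP technique and the scoring step on a validated model, but the division of labor stays the same: structure first, prediction second.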

With increasing competition in every industry, one point of differentiation is how well your organization leverages unstructured data in a timely fashion to meet the changing needs of the marketplace. Understanding the correlation between your product/service sales and otherwise undetected factors such as weather, consumer sentiment and web search trends will enable you to take advantage of these untapped opportunities with precise actions that drive improved performance. The competitive advantage for market leaders will be the ability to predict product trends, forecast demand, optimize pricing and promotion, and anticipate and respond to market shifts. Equip your organization to be the market leader by creating and operationalizing a strategy that utilizes unstructured data to power decision making.
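As a small illustration of what "understanding the correlation" can mean in practice, the sketch below computes a Pearson correlation coefficient between daily temperature and units sold. The figures are made up for the example and the function name `pearson` is my own; a real analysis would use far more data and would check whether the relationship holds up beyond simple correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(spread_x * spread_y)


# Invented figures: daily high temperature (Celsius) and units sold that day.
temperature_c = [18, 21, 24, 27, 30]
units_sold = [110, 125, 150, 160, 190]

r = pearson(temperature_c, units_sold)
```

A value of `r` close to 1 suggests that sales rise with temperature in this sample, and a value close to -1 suggests the opposite. A value near 0 suggests no linear relationship.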

Sejal Sura is a Director of the Integrated Marketing Effectiveness practice at Fractal Analytics with expertise in Interactive Marketing, Strategy and Analytics. She has a successful track record of developing methodologies, frameworks and processes and building consensus across stakeholders to solve business challenges for a variety of Fortune 500 companies. She holds an MBA in Marketing from the Kellogg Graduate School of Management at Northwestern University and a BBA in Finance from the University of Iowa.