Implementing Unstructured Data Analytics

by Lyndsay Wise, President, WiseAnalytics

Monday, February 28, 2011

The implementation of business intelligence can be quite expansive: it involves examining operations and bringing structured data into a centralized data store that serves as the base for an analytics framework. With advances in technology and the ability to apply a wider range of analytics to broader data sets, the question to ask is whether there is additional value in casting a wider net around an organization’s analytics platform. Because decision making is growing more complex and data sets are becoming larger and broader, unstructured data is becoming an important consideration for many organizations.

Decision-making approaches often differ among organizations, depending on their familiarity with business intelligence and analytics. For organizations with mature BI infrastructures, incorporating unstructured data may be a natural next step, while companies looking at BI for the first time may feel overwhelmed by analyzing information that falls outside traditional data sources. However, both types of companies can benefit from adding unstructured content to their analytics. The type of data examined depends upon the organization’s industry, goals, and current BI infrastructure.

This article explores unstructured data and the benefits of including it within a BI platform. It also outlines the issues enterprises should consider before integrating unstructured data into an analytics environment.

Why Add Unstructured Information to Analytics

With so many diverse applications of BI and analytics, organizations may not see the value of including unstructured data as part of their business intelligence framework. Once a company is applying analytics successfully, the question of how to take BI to the next level arises. The reality is that retaining a competitive edge requires a plan that includes constant evaluation and improvement. Whether this means deepening existing analytics or adding new data sources to expand use or adoption, being able to answer business questions is essential to proactively addressing and solving business pains.

The addition of unstructured data helps address these issues. Adding information sources such as documents, emails, content management systems, and notes can lead to deeper insights because of the broader range of data sets and greater depth of potential knowledge. Aside from abstract BI benefits such as broader visibility and better decision making, the combination of structured and unstructured data enables a more in-depth view of a business. For instance, traditional BI provides insights into financial performance, marketing analytics, and sales trends, while unstructured data analysis allows for examination of customer sentiment and the identification of potential fraud.

Considerations of Unstructured Data Analysis

All IT projects require careful consideration before altering the status quo. In addition to a business driver, expanding any BI environment carries several technical requirements. The reality is that data integration and the processes involved in adding data sources to a current BI infrastructure are rarely as simple as they seem. This means that IT needs to map the current infrastructure and understand how its components fit together. Some of these considerations include:

Data sources - Each data source is unique in the sense that it may have specific integration requirements. Widely used data sources offered by companies such as SAP, Salesforce, or Microsoft are examples of sources for which many vendors develop APIs to make integration easier. Although not always the case, the use of APIs helps with the integration of multiple data sources and makes it easier to get a broader picture of decision making within the organization. Adding unstructured information creates more complications, as it is not always easy to create structured definitions of data stored within non-traditional data sources. Organizations should identify what processes are required to integrate unstructured data within a structured BI environment while maintaining value and data integrity.

Structuring unstructured content - Part of the difficulty with unstructured data lies in giving structure to the information being collected. Although organizations want to expand their understanding of their customers or identify risk before it occurs, databases were designed to manage structured data sources. The addition of unstructured data creates logistical difficulties that make it nearly impossible to integrate unstructured information within a BI framework without changing or expanding the current environment. Few solutions provide general analytics that take into account the integration of structured and unstructured data. However, best-of-breed text analytics and industry-specific solutions do exist, so companies requiring niche applications are well poised to incorporate unstructured information. In contrast, more traditional applications of business intelligence do not take unstructured data into account when looking at integration with data sources.

Expansion of analytics - Most unstructured data use will occur as an extension of a traditional BI initiative. With general expertise in BI, IT departments can ease into expanding the types of data used for analytics. For instance, expanding marketing analytics to cover social network analysis and to identify general product reviews and recommendations means supplementing traditional marketing analytics with Twitter or Facebook threads and reviews. This expansion raises substantial technical considerations that are largely based on integration. Because of these considerations, it becomes important for the organization to identify the specific business pain being addressed. Otherwise, adding unstructured data to a traditional BI infrastructure may cause more harm than good, because the analytics can lose their context as they expand into broader realms of data collection.

Context - All BI and analytics projects require proper context to get value out of information. With unstructured data this is even more important, because the data collected requires transformation to become valuable within a database. As a result, business rules and definitions should be identified up front to make sure that valid outputs can be achieved.

Obtaining value requires context and a solid understanding of how various pieces of information integrate with one another to add to the decision making and performance management processes.
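To make the considerations above more concrete, the following is a minimal Python sketch of how free-text customer notes might be given structure through explicit business rules and then combined with structured sales figures. It is an illustration only, not a recommended product or method: the term lists, the SentimentRow record, the function names, and the sample notes and sales values are all hypothetical, and a real deployment would more likely rely on a dedicated text analytics tool with rules defined by the business units.

```python
import re
from dataclasses import dataclass

# Hypothetical business rules: terms a business unit has agreed signal
# positive or negative sentiment. These lists are assumptions for this sketch.
POSITIVE_TERMS = {"great", "happy", "recommend", "love"}
NEGATIVE_TERMS = {"refund", "broken", "cancel", "late"}


@dataclass
class SentimentRow:
    """Structured record derived from one free-text note."""
    customer_id: str
    positive_hits: int
    negative_hits: int
    label: str


def structure_note(customer_id: str, text: str) -> SentimentRow:
    """Turn one unstructured note into a row a BI table could store."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(1 for w in words if w in POSITIVE_TERMS)
    neg = sum(1 for w in words if w in NEGATIVE_TERMS)
    if pos > neg:
        label = "positive"
    elif neg > pos:
        label = "negative"
    else:
        label = "neutral"
    return SentimentRow(customer_id, pos, neg, label)


def join_with_sales(rows, sales_by_customer):
    """Attach each derived row to the structured sales figure for that customer."""
    return [
        {
            "customer_id": r.customer_id,
            "label": r.label,
            "sales": sales_by_customer.get(r.customer_id),
        }
        for r in rows
    ]


# Made-up inputs, for illustration only.
notes = [
    ("C001", "Love the product, would recommend it."),
    ("C002", "Order arrived late and broken. I want a refund."),
]
sales = {"C001": 1200.0, "C002": 300.0}

rows = [structure_note(cid, text) for cid, text in notes]
combined = join_with_sales(rows, sales)
```

The point of the sketch is that the term lists act as the business rules and definitions: change them and the same notes produce different structured output, which is why they need to be agreed upon before the derived data enters the BI environment.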

The Road to Increasing Visibility and Better Analytics

Clearly, the road to unstructured data analysis is complex. Businesses must consider both business and technical requirements, and they need to identify whether their current infrastructure can support an expanded approach to analytics. In addition, establishing business context and understanding how disparate data interrelates require the support of both business units and IT. Therefore, before an organization adopts unstructured data, it should conduct a broad analysis of the associated costs and benefits to see whether the benefits outweigh the costs and risks of such an endeavor.

About the Author

Lyndsay Wise is an industry analyst for business intelligence. For over seven years, she has assisted clients in business systems analysis, software selection, and implementation of enterprise applications. Lyndsay is the channel expert for BI for the Mid-Market at B-eye-Network and conducts research on leading technologies, products, and vendors in business intelligence, marketing performance management, master data management, and unstructured data. She can be reached at lwise@wiseanalytics.com. Please also visit Lyndsay's blog at myblog.wiseanalytics.com.