Big Data Analytics: Descriptive Vs. Predictive Vs. Prescriptive

What distinguishes these three key types of analytics? A data scientist explains the differences.

Most raw data, particularly big data, offers little value in its unprocessed state. But by applying the right set of tools, we can pull powerful insights from this stockpile of bits.

In any big data setup, the first step is to capture lots of digital information, "which there's no shortage of," said Dr. Michael Wu, chief scientist of San Francisco-based Lithium Technologies, which develops social customer experience management software for businesses.

With data in hand, you can begin doing analytics. But where do you begin? And which type of analytics is most appropriate for your big data environment?

In a phone interview with InformationWeek, Wu explained how descriptive, predictive, and prescriptive analytics differ, and how they provide value to organizations.

"Once you have enough data, you start to see patterns," he said. "You can build a model of how these data work. Once you build a model, you can predict."

In a March 2013 blog series on this topic, Wu called descriptive analytics "the simplest class of analytics," one that allows you to condense big data into smaller, more useful nuggets of information.

"Remember, most raw data, especially big data, are not suitable for human consumption, but the information we derived from the data is," Wu wrote.

The purpose of descriptive analytics is to summarize what happened. Wu estimated that more than 80% of business analytics -- most notably social analytics -- are descriptive.

"For example, number of posts, mentions, fans, followers, page views, kudos, +1s, check-ins, pins, etc. There are literally thousands of these metrics -- it's pointless to list them -- but they are all just simple event counters," he wrote in another post on the topic.
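To make the "simple event counters" idea concrete, here is a minimal sketch in Python. The event log, its field names, and the `summarize` helper are hypothetical and made up for illustration; they do not come from Wu or from any Lithium product.

```python
from collections import Counter

# Hypothetical raw event log; users and event types are invented for illustration.
events = [
    {"user": "ann", "type": "post"},
    {"user": "bob", "type": "post"},
    {"user": "ann", "type": "mention"},
    {"user": "cy", "type": "page_view"},
    {"user": "ann", "type": "page_view"},
    {"user": "bob", "type": "page_view"},
]


def summarize(event_log):
    """Condense raw events into per-type counts (a descriptive summary)."""
    return Counter(event["type"] for event in event_log)


summary = summarize(events)
for event_type, count in sorted(summary.items()):
    print(f"{event_type}: {count}")
```

The summary only says what happened. It makes no claim about what will happen next, which is what separates it from the predictive class discussed below.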

Predictive analytics is the next step up in data reduction. It utilizes a variety of statistical, modeling, data mining, and machine learning techniques to study recent and historical data, thereby allowing analysts to make predictions about the future.

"The purpose of predictive analytics is NOT to tell you what will happen in the future," Wu blogged. "It cannot do that. In fact, no analytics can do that. Predictive analytics can only forecast what might happen in the future, because all predictive analytics are probabilistic in nature."

In the most general cases of predictive analytics, "you basically take data that you have to predict data you don't have," Wu told InformationWeek.

Sentiment analysis, for instance, is a common type of predictive analytics:

"The input to the model is plain text," Wu said. "And the output of that model is a sentiment score, whether it's positive, negative, or something between +1 or -1."

In this case, the model computes the score, but it's not necessarily predicting the future. Rather, it's "predicting data that we don't have, which is the sentiment label, whether it's a positive or negative sentiment," said Wu.
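The text-in, score-out shape Wu describes can be sketched with a toy word-list scorer. This is an assumption-laden illustration, not the model Wu or Lithium uses: the word lists and the `sentiment_score` function are hypothetical, and production systems typically rely on trained statistical models rather than fixed lexicons.

```python
# Hypothetical, tiny word lists; real sentiment models are trained on labeled text.
POSITIVE = {"great", "love", "good", "excellent"}
NEGATIVE = {"bad", "hate", "terrible", "poor"}


def sentiment_score(text):
    """Return a score in [-1.0, +1.0] estimating the sentiment of plain text.

    The score is (positive hits - negative hits) / (total hits),
    or 0.0 when no known sentiment word appears.
    """
    words = [word.strip(".,!?").lower() for word in text.split()]
    positive_hits = sum(word in POSITIVE for word in words)
    negative_hits = sum(word in NEGATIVE for word in words)
    total_hits = positive_hits + negative_hits
    if total_hits == 0:
        return 0.0
    return (positive_hits - negative_hits) / total_hits


for sample in ["I love this great product!", "Terrible, bad service.", "Good but bad."]:
    print(sample, sentiment_score(sample))
```

The function estimates a label the data set does not contain, which is the sense of "predicting data that we don't have" in the passage above.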

The emerging technology of prescriptive analytics goes beyond descriptive and predictive models by recommending one or more courses of action -- and showing the likely outcome of each decision.

"Prescriptive analytics is a type of predictive analytics," Wu said. "It's basically when we need to prescribe an action, so the business decision-maker can take this information and act."

He added that predictive analytics doesn't predict one possible future, but rather "multiple futures" based on the decision-maker's actions.

In addition, prescriptive analytics requires a predictive model with two additional components: actionable data and a feedback system that tracks the outcome produced by the action taken.

"Since a prescriptive model is able to predict the possible consequences based on different choice of action, it can also recommend the best course of action for any pre-specified outcome," Wu wrote.
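The pieces named above (a predictive model, actions to choose from, and a feedback system) can be sketched as follows. Everything here is hypothetical: the `PrescriptiveModel` class, the action names, and the simple success-rate estimate are assumptions for illustration, not a description of how any real prescriptive system works.

```python
class PrescriptiveModel:
    """Toy prescriptive loop: predict each action's outcome, recommend, learn."""

    def __init__(self, actions):
        # Per action, track [successes, trials]. Starting at 1 success in
        # 2 trials is an assumed neutral prior (estimated rate of 0.5).
        self.stats = {action: [1, 2] for action in actions}

    def predict(self, action):
        """Estimate the probability that the action yields the desired outcome."""
        successes, trials = self.stats[action]
        return successes / trials

    def recommend(self):
        """Return the action with the highest predicted success rate."""
        return max(self.stats, key=self.predict)

    def feedback(self, action, success):
        """Record the observed outcome of an action that was actually taken."""
        self.stats[action][0] += int(success)
        self.stats[action][1] += 1


# Hypothetical retention actions for a customer at risk of leaving.
model = PrescriptiveModel(["discount", "phone_call"])
for outcome in (True, True, True):
    model.feedback("phone_call", outcome)
model.feedback("discount", False)

best_action = model.recommend()
print(best_action, round(model.predict(best_action), 2))
```

Each action gets its own prediction, which mirrors the "multiple futures" idea, and the `feedback` method stands in for the tracking system that updates those predictions as outcomes arrive.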

For a deeper dive into descriptive, predictive, and prescriptive analytics, check out Wu's blog posts.

Jeff Bertolucci is a technology journalist in Los Angeles who writes mostly for Kiplinger's Personal Finance, the Saturday Evening Post, and InformationWeek.

The distinctions between descriptive, predictive and prescriptive are important, particularly since so many vendors are now throwing the term "analytics" around for things like reporting that really qualify as business intelligence. On predictive, a sentiment-analysis score isn't strictly descriptive as you're calculating a value that is your best guess at an individual's sentiment. But you're not really predicting something is going to happen, as in more classic examples of prediction such as customer churn or risk. Will X customer leave or will Y customer pay off a loan? Now that's a prediction. Prescriptive analytics go one step further, telling us what type of offer to send to which churn candidate to promote retention or what loans to approve to maintain an acceptable level of risk.

Another way of interpreting descriptive data is as "content without context," i.e., facts that describe a particular phenomenon, without meaning or relevance.

To have meaning or relevance, context must be included.

For example, knowing that 30% of our employees use Twitter at least 3 times a day is descriptive data. It doesn't carry much meaning or relevance, though.

However, if we also know that those 30% of employees have higher Klout scores than the others, then that becomes predictive information: the data has meaning within a context (in this case, Klout score). It says employees who tweet more are likely to have higher Klout scores.

Context with multiple dimensions such as time, location, and device type (PC vs. tablet vs. smartphone) helps convert descriptive data into predictive or even prescriptive analysis.

Big data analytics is going to be mainstream, with increased adoption in every industry, and will form a virtuous cycle as more people want access to even bigger data.

However, the requirements for big data analysis are often not well understood by the developers and business owners, which leads to an undesirable product.

To avoid wasting precious time, money, and manpower on these issues, organizations need to develop the expertise and processes to create small-scale prototypes quickly and test them to demonstrate that they are correct and match business goals.

Following up on this, I came across and registered for a webinar on Deploy Big Data solutions Rapidly in Cloud through Harbinger's ABC model (Agile-Big Data-Cloud). It looks promising: http://j.mp/19xJ6ew

Inductive analytics is a better denomination than predictive, for the seemingly obvious reason that algorithms induce values from known data. That is what statistics and data mining algorithms do.

Embedded analytics is a better denomination than prescriptive. A specific analysis can be either descriptive or inductive, and the relevant fact here is that either of those two kinds can be embedded into an application and be used (together with "business rules" specific to the app) to take action on something.

In summary, these are orthogonal qualities, and one can visualize a 2 by 2 matrix in which the rows are (descriptive, inductive) and the columns are (non-embedded, embedded).

Sentiment analysis is descriptive, not predictive (as long as it describes the current, general sentiment related to a topic; predicting future sentiment based on past sentiment and potentially other factors would be predictive).

Analyses are exact or probabilistic. Predictive analysis is generally probabilistic because we rarely know the future for sure.

Most descriptive analyses are exact (number of likes, number of clicks, Klout score, S&P index, etc.) because they are defined by a single, deterministic method or a single authority, which does not allow contradicting results.

Sentiment analysis is a probabilistic descriptive analysis because it attempts to measure something and fails to measure it exactly. Sentiment analysis generally attempts to measure the ratio of positive vs. negative sentiment related to a topic in a given corpus. It is based on natural language processing, which is typically not perfectly accurate.

The author tried to explain three concepts, and he did an excellent job. Sentiment analysis is descriptive if the data analyst provides only a summary, but it is a starting point for predictive analysis: asking why sentiment is positive and which key behaviors drive it leads to a predictive analysis of the behavior associated with that sentiment. Based on the historical timeline, a user's sentiment can also be predicted by applying time series models.
