What is Data Science?

Data science is the process of uncovering patterns and insights hidden in huge volumes of messy data using techniques such as machine learning, data mining, predictive analytics, deep learning, and cognitive computing. Unlike traditional business intelligence and related approaches, data science isn’t confined to structured data, doesn’t require data to be organized into neat rows and tables, and isn’t limited to small data sets. Rather, data science techniques can be applied at scale to massive volumes of semi-structured and unstructured data such as text, machine data, sensor data, and social media data. This less restrictive approach to analysis allows organizations to find answers to questions they didn’t even know to ask, which can lead to breakthrough insights that drive competitive advantage.

Why Data Science Matters

Unlock the value of data

Modern approaches to data management, such as Hadoop and cloud-based storage, make it more affordable than ever to store vast amounts of data. But storing data doesn’t provide value in and of itself. Applying data science unlocks the value of data by uncovering actionable insights.

Be predictive and proactive

By predicting the likelihood of events before they occur, data science allows companies to be proactive and take action to optimize outcomes rather than reacting to events after the fact.

Continuous learning

Data science isn’t a one-off event. As data science-driven insights are put into action, the results of those actions are fed back into the system of predictive models and algorithms. The result is a self-learning system that is continuously improving.

Data science applies to all industries

Data science has applications across virtually all industries. Farmers use data science to determine the best times to plant crops. Retailers use it to personalize offers to customers. Industrial companies use data science to prevent equipment malfunctions. From financial services and insurance to healthcare and energy, every industry is being transformed by data science.
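
The feedback loop described under "Continuous learning" can be illustrated with a minimal sketch. It assumes a hypothetical scenario in which a model estimates the probability that a customer accepts an offer, and each observed outcome is fed back to adjust that estimate. The class name `OnlineRateEstimator`, the learning rate, and the outcome data are illustrative assumptions, not a description of any particular product or platform.

```python
class OnlineRateEstimator:
    """Running estimate of an event probability, updated one outcome at a time."""

    def __init__(self, initial_estimate=0.5, learning_rate=0.1):
        self.estimate = initial_estimate
        self.learning_rate = learning_rate

    def predict(self):
        """Return the current estimated probability of the event."""
        return self.estimate

    def update(self, outcome):
        """Nudge the estimate toward an observed outcome (1 = accepted, 0 = not)."""
        error = outcome - self.estimate
        self.estimate += self.learning_rate * error
        return self.estimate


def run_feedback_loop(estimator, outcomes):
    """Predict, act, observe the result, and feed it back into the estimator."""
    history = []
    for outcome in outcomes:
        history.append(estimator.predict())  # prediction used to take action
        estimator.update(outcome)            # result of the action fed back
    return history


# Made-up outcomes for illustration only: 1 = offer accepted, 0 = declined.
outcomes = [1, 1, 0, 1, 1, 1, 0, 1]
estimator = OnlineRateEstimator()
history = run_feedback_loop(estimator, outcomes)
print("predictions over time:", [round(p, 3) for p in history])
print("current estimate:", round(estimator.predict(), 3))
```

Real systems typically retrain or update far richer models, but the shape of the loop is the same: every action produces a result, and that result becomes new training signal.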

“Most organizations that are achieving results from their prescriptive analytics are gaining insight through data science activities. More mature organizations are exploring machine learning and evaluating real-time and near-real-time deployments.”

Considering Data Science? What to Keep in Mind

Data science is required to extract value from massive volumes of data and can lead to game-changing insights. Yet developing a data science practice isn’t trivial. If your enterprise decides to move forward, be sure you know the answers to these questions in advance.

Are you prepared to pay for data science talent?

Data scientists, who possess a blend of statistical, analytical and math skills, are a rare breed and are in high demand. Enterprises starting data science practices must be prepared to pay a premium for top data science talent.

How will you organize your data science teams?

In some enterprises, data scientists are part of a centralized, shared service that supports the entire organization. In others, data scientists are embedded in business units. Both approaches have their pros and cons. Consider which approach is a better fit for your organizational structure and culture.

Can you scale your data science workloads?

All things being equal, more data equals better data science results. The good news is there is plenty of data, both inside and outside your enterprise, to analyze. But running data science algorithms and models on massive data volumes is challenging. You need the technology and skills to operate data science at scale.

What is your strategy for operationalizing data science?

Insights derived from data science have no value unless they are operationalized. Leaving them unused is also a surefire way to demoralize data scientists, who want their work to have impact. Make sure you have a plan for putting predictive models and other data science output to work solving real business challenges.

How will you govern the use of data?

Just because something is possible thanks to data science doesn’t mean you should do it. Enterprises embarking on data science must also develop ground rules for how data is used based on both ethical (non-binding) and legal (binding) standards.

Flexible. Platforms that support data science are adept at quickly and easily adding new data from various types of source systems.

Rigid. Making changes to existing BI systems, such as adding new data sources or asking new questions, is complex and time consuming.

Scalable. Platforms that support data science must be highly scalable from both a data storage and a compute perspective. Data science algorithms and models are most effective when run across all data, not samples.

Non-scalable. Traditional data warehouse appliances that support business intelligence are often unable to scale to meet the demanding storage and processing needs of Big Data.

Fraud detection

Customer segmentation

Develop fine-grained customer segmentation based on behavioral, transactional, social and other data analysis.

Customer churn

Identify patterns that indicate a customer is likely to leave a product or service, and take steps to retain that customer.

Predictive maintenance

Predict the likelihood of part failure in cars, industrial equipment and other machines so preventative action can be taken.

Sentiment analysis

Analyze text-based data like email content and social media updates to glean user and customer sentiment.

Cybersecurity

Identify potentially malicious attacks and other online threats to IT and other networks and take preventative action.

Recommendation engine

Suggest targeted products, services and action items to users based on analysis of past buying behavior and other data.

Demand forecasting

Forecast demand for products and parts in advance to maintain optimal inventory levels.
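
As one concrete illustration of the demand forecasting use case, the sketch below applies simple exponential smoothing, a common baseline technique that is swapped in here for illustration and is not a method named in this document. The function names, the smoothing factor `alpha`, the safety stock, and the weekly demand figures are all hypothetical.

```python
def exponential_smoothing_forecast(demand, alpha=0.5):
    """Return a one-step-ahead forecast using simple exponential smoothing.

    Each new observation pulls the smoothed level toward itself by a
    fraction alpha; the final level is the forecast for the next period.
    """
    if not demand:
        raise ValueError("demand history must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")
    level = demand[0]
    for observed in demand[1:]:
        level = alpha * observed + (1 - alpha) * level
    return level


def reorder_quantity(forecast, on_hand, safety_stock=0):
    """Units to order so that stock covers the forecast plus a safety buffer."""
    return max(0, forecast + safety_stock - on_hand)


# Made-up weekly unit sales, for illustration only.
weekly_demand = [100, 120, 110, 130]
forecast = exponential_smoothing_forecast(weekly_demand, alpha=0.5)
print("forecast for next week:", forecast)  # 120.0
print("units to order:", reorder_quantity(forecast, on_hand=90, safety_stock=10))  # 40.0
```

A higher `alpha` reacts faster to recent changes in demand, while a lower value smooths out noise. Production forecasts usually also account for trend, seasonality, and promotions.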

The Eightfold Path of Data Science – Four Phases & Four Differentiating Factors

Phase 1: Problem Formulation

Make sure you formulate a problem that is relevant to the goals and pain points of the stakeholders.

Phase 2: Data Step

Build the right feature set, making full use of the volume, variety, and velocity of all available data.

Phase 3: Modeling Step

This is where you move from answering what, where and when to answering why and what if?

Phase 4: Application

Create a framework for integrating the model with decision-making processes and taking action.

Four Differentiating Factors

Each of the four differentiating factors applies to all four phases of the data science lifecycle described above.

Iterative Approach

Perform each phase in an agile manner, team up with domain experts and SMEs, and iterate as required.

Creative

Take the opportunity to innovate at every phase.

Building a Narrative

Create a fact-based narrative that clearly communicates insights to stakeholders.

Technology Selection

Select the right platform and the right set of tools for solving the problem at hand.

Data Science at Pivotal

Pivotal’s team of data scientists works with clients across industries to solve their most pressing business challenges and take advantage of timely market opportunities.

During Pivotal data science engagements, teams typically:

▪ Assess existing analytic capabilities and data sources
▪ Determine an executable use case with high business value
▪ Iteratively develop, evolve, and refine analytic models
▪ Operationalize by embedding predictive insights into business logic and smart applications

Pivotal data scientists also help clients learn and develop their own Agile data science skills over the course of engagements so they can continue tackling new use cases.

Pivotal Moments

With the help of Pivotal data scientists, Synchrony Financial developed a “next best offer” feature for its mobile app that predicts likely purchases and delivers targeted offers to customers.

Comcast worked with Pivotal data scientists to develop algorithms that enable it to identify and stop suspicious activity on its network, such as unapproved file sharing.

As part of a data science engagement with Pivotal, Fiat Chrysler is developing customer sentiment analysis capabilities to enable the car maker to reduce customer churn and increase customer loyalty.