Decision Modeling and CRISP-DM for Modern Data Science Projects

Posted By: Meri Gruber | Posted On: 22nd February 2017

Many data science projects use the popular and well-established CRISP-DM methodology. However, CRISP-DM has limitations, especially regarding business understanding and deployment. The decision modeling process and the graphical decision requirements diagram address these challenges.

“CRISP-DM remains the most popular methodology for analytics, data mining, and data science projects, with 43% share in latest KDnuggets Poll, but a replacement for unmaintained CRISP-DM is long overdue.

The 6 high-level phases of CRISP-DM are still a good description for the analytics process, but the details and specifics need to be updated. CRISP-DM does not seem to be maintained and adapted to the challenges of Big Data and modern data science.”

Modern data science teams recognize that enlisting business partners in a shared collaboration is essential for delivering business value. They also recognize that too many analytics models are not deployed, or don’t deliver the expected business value. Data science teams are adopting decision modeling to address these gaps.

A Shared Business Understanding
CRISP-DM and other methods stress the importance of business understanding but lack a repeatable, understandable format. Decision modeling fills this gap. Decision modeling is a successful technique that develops a richer, more complete business understanding earlier. Decision modeling using the Decision Model and Notation (DMN) standard results in a clear business target, as well as an understanding of how the results will be used and deployed, and by whom.

An example decision requirements model

Data science teams report tremendous benefits in quickly establishing a shared understanding with their business partners using decision modeling. Business partners have a much clearer vision of what the predictive model could do, and how it can be improved in an iterative process as more information is made available. Decision modeling also clarifies for the data science team which model would best serve the business problem, and that model is often different from the one initially considered.
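
To make the idea concrete, a decision requirements model can be thought of as a small dependency graph: a decision at the top, and the analytic models and input data it requires underneath. The sketch below is only an illustration of that structure, not DMN tooling. The node kinds are a simplified subset of DMN concepts, and the retention-offer scenario, the owner field, and all node names are hypothetical assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One element of a (simplified) decision requirements model."""
    name: str
    kind: str  # "decision", "input_data", or "analytic_model" (simplified labels)
    requires: list[Node] = field(default_factory=list)
    owner: str = ""  # hypothetical field: who makes or acts on the decision


def required_inputs(node: Node) -> set[str]:
    """Collect the names of all input data a node depends on, directly or indirectly."""
    found: set[str] = set()
    for dep in node.requires:
        if dep.kind == "input_data":
            found.add(dep.name)
        found |= required_inputs(dep)
    return found


# Hypothetical example: a retention offer decision that uses a churn prediction.
history = Node("Customer history", "input_data")
usage = Node("Product usage", "input_data")
offers = Node("Offer catalog", "input_data")
churn_model = Node("Churn risk score", "analytic_model", requires=[history, usage])
decision = Node(
    "Select retention offer",
    "decision",
    requires=[churn_model, offers],
    owner="Call center agent",
)

if __name__ == "__main__":
    # Shows which data the business decision ultimately depends on, and who uses it.
    print(decision.owner, sorted(required_inputs(decision)))
```

Even in this toy form, the graph records the business target (the decision), who uses the result (the owner), and what the predictive model feeds into, which are the points a shared business understanding needs to settle.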

Plan for Deployment
Many analytical results aren’t deployed because the deployment context was not fully understood and articulated. This results in analytic models that can be technically correct but don’t solve the business problem, or can’t be deployed for various business process, system or organizational reasons. The deployment gap is one of the reasons many executives are asking, “How do I get value from my analytics investments?” Data science teams are also asking, “How do I demonstrate the value of our analytic results?”

Capturing the decision context is fundamental to decision modeling and also serves to define and plan for deployment. Decision requirements models ensure that the requirements for deployment and usage are clear before analytics are developed. They also show how analytics will add value and deliver business impact, allowing business cases to be developed and projects to be compared.