Commercial churn modelling

Churn modelling is easy; commercial churn modelling is hard. Let us compare the two to explain what we mean by the latter.

Consider any subscription or subscription-like business such as telephones, broadband, television, credit cards, insurance, retail banking, mail order or any other business with substantial repeat purchases.
If you run any business like this then you will be overrun with companies offering you churn modelling and promising that they have the secret sauce that will make the model a little better than any other model and save you millions or billions.

(We consultants like to sell churn models because the business case for a one percentage point improvement in churn reduction is usually an astronomical sum that dwarfs anything else you may be doing.)

But here are the dirty little secrets of the churn modelling business that you should know:

1. Traditional churn modelling is easy.

2. Having a few extra decimals on the prediction of a traditional churn model will only make a meaningful difference to your business if you are in land-grab mode in a price-led market.

On the first point, you can create a churn model at over 75% effectiveness (which for my technical readers I define as the area under the sensitivity–specificity curve, usually called the ROC AUC) as a purely mathematical exercise without knowing anything about your business data. The KDD Cup Challenge 2009 shows this for a data set from the telecommunications company Orange where all the data attributes have been hidden. There are at the time of writing over 200 submissions with a churn score over 75%, and one of the original winners used only a couple of laptops to achieve this.
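
For technical readers, here is a minimal sketch of how that effectiveness number can be computed. It uses the fact that the area under the curve equals the probability that a randomly chosen churner is scored higher than a randomly chosen non-churner. The function name `roc_auc` and the toy labels and scores are made up for illustration; they are not taken from the KDD Cup data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney statistic).

    labels: 1 for a customer who churned, 0 for one who stayed.
    scores: model output; higher means more likely to churn.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one churner and one non-churner")
    # Count (churner, non-churner) pairs that the model ranks correctly.
    # Ties count as half a correct pair.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: four churners and four stayers with made-up scores.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.3, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")  # 14 of the 16 pairs are ranked correctly: 0.875
```

The pairwise loop is quadratic in the number of customers, which is fine for an illustration; on real data you would use a sort-based calculation or a library routine.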

We use scalable clusters of computers rather than laptops for our analysis, and we can show similar results for several other industries.

Traditional churn modelling is really easy.

And if you apply your understanding of the business, customers, and markets to the problem (as you would outside an artificial competition like the KDD Cup mentioned), then you will of course get even better results, with 80-85% or more not uncommon in true subscription businesses like telecommunications.

Again: Traditional churn modelling is easy.

It is also mostly useless from a commercial perspective.

What are you going to do if you think someone is about to leave in the next month/quarter/year, other than to offer him a discount?

If all you know is that he will leave, all you can really do is to pay him to stay. Margin erosion, commoditisation of your products, and much gnashing of teeth are certain to follow if you adopt this as your churn strategy.
If you are trying to buy market share against less well-funded competitors, this may be a rational approach. But in most circumstances it is something you want to minimise and manage carefully (few industries can avoid it completely).
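
To make the margin erosion concrete, here is a back-of-the-envelope sketch of what happens when every customer flagged by a traditional churn model gets a discount. The function and all the numbers are hypothetical and chosen only to illustrate the mechanism: the discount is paid to everyone flagged, including the customers who would have stayed anyway.

```python
def blanket_discount_net(flagged, precision, save_rate, annual_margin, discount):
    """Net annual margin impact of discounting every flagged customer.

    flagged:       number of customers the model flags as likely churners
    precision:     fraction of flagged customers who really would have left
    save_rate:     fraction of those real churners the discount retains
    annual_margin: margin per customer per year before any discount
    discount:      annual discount given to every flagged customer
    """
    true_churners = flagged * precision
    would_have_stayed = flagged - true_churners
    saved = true_churners * save_rate
    gained = saved * (annual_margin - discount)  # margin kept from saved customers
    given_away = would_have_stayed * discount    # paid to customers who were staying anyway
    return gained - given_away


# Made-up scenario: 10,000 flagged customers, 30% would really leave,
# the discount saves 40% of those, margin is 200 a year, discount is 60 a year.
net = blanket_discount_net(10_000, 0.30, 0.40, 200, 60)
print(f"Net annual margin impact: {net:,.0f}")
```

With these assumed inputs the margin given away to the 7,000 customers who were never leaving outweighs the margin recovered from the 1,200 saved customers, so the net impact is negative. Different inputs can of course give a positive result, which is why the approach can be rational in a land-grab.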

It is true that in industries like mobile telecommunications the marginal cost of adding one more subscriber to the network is near zero and Economics 101 predicts that this will eventually be the price of the product. However, as business people we normally want to fight this slide every single step of the way to maintain margins, recoup large investments (in mobile networks, say), and fund future developments.

Commercial churn modelling

We call our approach commercial churn modelling, which may not be a great name (suggestions for better names in the comments, please) but at least it tries to bring the exercise back to where it needs to be: actionable, and actionable in a way that delivers commercial results.

If a customer is about to leave and there is nothing at all I can do about it, nothing I could offer or do to make him stay, then I don’t really need to know about it. Knowing may be useful for budgeting and forecasting, but that is not how we make the big money or the big impact in society. (Accountants who might disagree can comment below.)

We don’t really want to know that a customer may leave; we want to know what would make him stay. The business challenge is then to deliver that incentive (or a substitute) in a profitable way.

Consider a mobile telecommunications company. We could hypothesise that the four main reasons for a customer leaving are price, handset, coverage, and network (data) speed. Knowing that a customer is likely to leave, and the reason why, is actionable. If we are genuinely more expensive than the competition, I could offer a discount or explain the extra benefits of staying with us. If we do not have the latest handsets, I could offer an even better one that we do have at a competitive price. At one company we worked for, I even went out and bought an unlocked iPhone (which we didn’t offer) and shipped it personally to one very loyal and profitable customer. Coverage is a little harder to do something about, but maybe I could offer a femtocell that uses the customer’s own broadband to provide coverage, as Vodafone is currently doing. For the slow data network, I was able to offer nationwide WiFi hotspot access at the same company where we sent the iPhone.

The point is that knowing the likelihood to churn and the (most probable) reason for leaving is actionable by the business in a way that knowing only the first component will never be. Only one of the reasons above is eroding my core pricing, and even there I can dig further into why there is a price gap and maybe do something about that instead of an unthinking margin reduction. (For example, the customer’s friends may be on another network, and the cheaper on-net calls may be the reason for the price differential. That is certainly something I can proactively prevent through social network analysis and reactively try to change through refer-a-friend or similar programmes.)
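
As a minimal sketch of what "likelihood plus reason" looks like in practice, the snippet below pairs a churn probability with a most probable reason and looks up a response. The four reasons are the ones hypothesised in the mobile example above; the `ChurnPrediction` structure, the `PLAYBOOK` wording, and the 0.5 threshold are assumptions made for illustration, not a description of any particular system.

```python
from dataclasses import dataclass

# Hypothetical playbook: one response per reason from the mobile example.
PLAYBOOK = {
    "price": "investigate the price gap; explain benefits or make a targeted offer",
    "handset": "offer a competitive handset upgrade",
    "coverage": "offer a femtocell that uses the customer's broadband",
    "data_speed": "offer WiFi hotspot access",
}


@dataclass
class ChurnPrediction:
    customer_id: str
    churn_probability: float    # what a traditional churn model gives you
    reason_probabilities: dict  # assumed second model: P(reason | customer leaves)


def recommend(prediction, threshold=0.5):
    """Return (reason, action) for an at-risk customer, or None if below the threshold."""
    if prediction.churn_probability < threshold:
        return None
    probs = prediction.reason_probabilities
    reason = max(probs, key=probs.get)  # most probable reason for leaving
    return reason, PLAYBOOK[reason]


# Made-up customer for illustration.
customer = ChurnPrediction(
    "c-001",
    0.8,
    {"price": 0.2, "handset": 0.5, "coverage": 0.2, "data_speed": 0.1},
)
print(recommend(customer))
```

A traditional model stops at `churn_probability`; the extra `reason_probabilities` field is what turns the prediction into something the retention team can act on.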

This is what we do here at CYBAEA and what we mean by commercial churn modelling: predicting not just that a given customer is about to leave, but what you can do about it right now. We additionally develop analytics that predict changes in the customer’s behaviour after accepting an offer, and therefore the change in revenues and profitability, which is what you need to make a rational commercial decision about what to do with each customer at each moment in time.
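
The decision logic described above can be sketched as a simple expected-value comparison across the available offers. This is only an illustration under stated assumptions: the offer names, the probabilities, the margins, and the simplification that the offer cost is paid up front whether or not the customer stays are all hypothetical.

```python
def expected_profit(p_stay, margin_if_stay, offer_cost):
    """Expected profit for one customer and one offer over the planning horizon.

    p_stay:         predicted probability that the customer stays given this offer
    margin_if_stay: predicted margin if they stay, after any change in behaviour
    offer_cost:     up-front cost of the offer, assumed paid whether or not they stay
    """
    return p_stay * margin_if_stay - offer_cost


def best_offer(offers):
    """Return the name of the offer with the highest expected profit.

    offers maps an offer name to a tuple (p_stay, margin_if_stay, offer_cost).
    """
    return max(offers, key=lambda name: expected_profit(*offers[name]))


# Made-up predictions for one customer.
offers = {
    "do_nothing": (0.40, 300, 0),       # no cost, but the customer will probably leave
    "price_discount": (0.70, 220, 0),   # the discount shows up as lower margin
    "handset_upgrade": (0.75, 320, 60), # costs money up front, keeps pricing intact
}
for name, args in offers.items():
    print(f"{name}: {expected_profit(*args):.0f}")
print("best:", best_offer(offers))
```

In this invented example the handset upgrade beats the price discount even though it costs money up front, because it keeps the customer at full margin. The point is the comparison itself: each option is judged on predicted profitability, not on predicted churn alone.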

The reasons will vary between industries, as will the technical details about how you model them (independent or conditional models, for example), but the approach is applicable everywhere you have a subscription or subscription-like business, and the results have the potential to transform the profitability of the business.

So just say no to the traditional churn modelling and make sure your provider really understands your commercial environment. Ask hard questions. How will I use this model in the business? What will I actually and specifically do with the output?
