Are you good at scoring?

Tips for reducing the time-to-model and improving the result

By Anders Langgaard, Business Advisor, SAS

Credit scoring is the foundation for evaluating clients who apply for a loan (or other types of exposure for the bank). Building a credit scoring model is an exercise in statistics typically involving big (or even huge) amounts of historical data. In many banks it is not unusual for it to take up to 12 months from the moment you decide to build a new model until the model is applied in the production environment.

And all the while the model is losing precision; the world continues to change and the significance of the model parameters changes with it. And lower model precision, in turn, leads to erroneous ratings of particular clients, which can cause the bank to take on risky or non-profitable loans.

A (business) case of shorter time-to-model

SAS ran a business case with a small American bank that had total assets of US $1 billion. The bank had average credit losses of $25 million per year. Using credit scoring tools that enabled higher performance in the scoring process – not least in the model development phase – meant that the bank was able to reduce the modelling cycle from 4 months to 2 months. The effect was that the bank was able to reduce losses by 5 percent – just by being able to use better score cards sooner.
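The arithmetic behind that result is simple enough to check. The figures below come from the case above; the calculation itself is just a back-of-the-envelope illustration:

```python
# Back-of-the-envelope savings from the business case above.
# Figures taken from the article; nothing else is implied.
annual_losses = 25_000_000   # average credit losses per year (USD)
loss_reduction = 0.05        # 5% reduction from deploying better scorecards sooner

annual_savings = annual_losses * loss_reduction
print(f"Estimated annual savings: ${annual_savings:,.0f}")  # $1,250,000
```

A $1.25 million annual effect from halving the modelling cycle is what makes the business case worth examining.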

But how can you shorten the model lifecycle? First, here are the steps of the lifecycle:

Formulate a hypothesis.

Prepare the data to build the model on.

Explore the data to ensure that the quality is sufficient and that it contains the needed information.

Transform the data (some of it will likely need to be transformed into useful variables).

Build the model using statistical tools.

Validate the model (ensure that the model still performs on a different set of data than the one it was built on).

Deploy the model in production – perform the scoring of the clients.

Evaluate and monitor performance of the model.
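Steps 2 through 7 can be sketched in a few lines of code. The example below is a minimal illustration using scikit-learn and synthetic data; the specific algorithm (logistic regression, a common scorecard backbone) and all thresholds are assumptions for illustration, not a prescription:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Prepare: a synthetic stand-in for historical client data (1 = default).
X, y = make_classification(n_samples=5000, n_features=8, random_state=0)

# Explore: basic quality checks before modelling.
assert not np.isnan(X).any(), "missing values need treatment first"

# Transform: hold out validation data, then scale raw variables.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)
scaler = StandardScaler().fit(X_train)

# Build: fit the model with a statistical tool.
model = LogisticRegression().fit(scaler.transform(X_train), y_train)

# Validate: measure performance on data the model was not built on.
auc = roc_auc_score(
    y_test, model.predict_proba(scaler.transform(X_test))[:, 1])
print(f"Hold-out AUC: {auc:.3f}")

# Deploy/score: probability of default for a new applicant.
new_client = scaler.transform(X_test[:1])
print(f"Score: {model.predict_proba(new_client)[0, 1]:.3f}")
```

In a real bank each of these steps runs against far larger data sets and stricter governance, which is exactly where the time thieves discussed below come in.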

Be aware that there are many potential time thieves in the model lifecycle. For instance, in many financial institutions the analyst (the model developer) must ask the IT department for the needed data. The IT department then gathers the data from a number of sources and delivers it to the analyst – this can take 2-4 weeks. If the analyst finds out that something is missing in the data, the process starts over. A good starting point for achieving a leaner process is to enable the analyst to directly build her own data set.

You should also give the analyst tools that make it easy and quick to test hypotheses. Examples include data visualization techniques and in-memory processing, both running on Hadoop. In combination, these give the analyst an easy view into the data and reduce the time it takes to provide answers, because the analysis can be done inside Hadoop with high-performance analytics – lightning fast.

Automate and integrate

The final time thief that I will mention this time around is the set of steps required from the moment you have a validated model until it runs in production. In many financial institutions there is no link between the development environment and the production environment where scoring needs to be done. This means that the model is often (manually!) carried from the analyst back to IT, where it is recoded in another programming language so that it can finally be implemented in production. This, of course, gives rise to a number of operational risks – including the risk of model coding errors. It also makes model validation and performance measurement difficult to carry out.
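One way to close that gap is to export the fitted model as data rather than code, so production scores clients from the same artefact the analyst validated. The sketch below assumes a logistic scorecard; the field names and coefficient values are illustrative, not from any real model:

```python
import json
import math

# Parameters as they might be exported from the development environment.
# These values are illustrative assumptions.
scorecard = {
    "intercept": -2.0,
    "coefficients": {"income": -0.3, "utilization": 1.1},
}

def score(client: dict, card: dict) -> float:
    """Probability of default from a logistic scorecard."""
    z = card["intercept"] + sum(
        card["coefficients"][k] * client[k] for k in card["coefficients"])
    return 1.0 / (1.0 + math.exp(-z))

# The same JSON artefact is deployed to production unchanged,
# so no manual recoding step can introduce errors.
deployed = json.loads(json.dumps(scorecard))
print(f"{score({'income': 1.2, 'utilization': 0.8}, deployed):.3f}")
```

Because development and production read the identical artefact, validation and performance measurement can compare like with like.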

It’s easy to see that streamlining the model development process will reduce losses and increase earnings. And it will also reduce the operational risk of the institution. The combination of these effects will help validate the business case for investing in the improvements.