Why internal credit scoring?

Personal credit scores are normally computed from information available in
credit reports collected by external credit bureaus and ratings agencies.
Credit scores may indicate a person's financial history and current situation.
However, a bureau score alone does not tell you exactly what separates a "good"
score from a "bad" score. More specifically, it does not tell you
the level of risk for the lending you may be considering.
The internal credit scoring methods described on this page address this problem.
Note that internal credit scoring techniques can be applied to
commercial credit as well.

Credit Risk Analysis and Modeling

This page describes the following credit risk analysis methods:

Credit risk factors (hotspot) profiling or loans default analysis.

Credit risk predictive modeling or loans default predictive modeling.

Credit risk modeling or finance risk modeling.

Internal credit risk scoring.

Hotspot Profiling of Risky Credit Segments

Credit risk profiling (finance risk profiling) is very important.
The Pareto principle suggests that 80%~90% of the credit defaults
may come from 10%~20% of the lending segments. Profiling the segments
can reveal useful information for credit risk management.
Credit providers often collect a vast amount of information on credit users.
Information on credit users (or borrowers) often consists of dozens or
even hundreds of variables, involving both categorical and numerical data
with noisy information.
Hotspot profiling identifies the factors or variables that best summarize
these segments.

Combinatorial factor analysis and combinatorial explosion!

Analyzing such a vast amount of information is an extremely difficult task!
In conventional methods, factor analysis is performed on a few (up to several)
variables at a time using statistical software.
As the total number of variables increases,
the number of combinations to be examined in this way grows combinatorially.
When a large number of variables is involved, the number of combinations is
too large to be examined manually. Thorough, systematic and accurate analysis is
all but impossible! A conventional workaround is to examine only the
combinations that are likely to
have influence. However, relying on hunches can leave out important factors without anyone noticing.
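
To give a feel for this growth, the following minimal Python sketch counts how many groups of up to three variables would have to be examined. The function name and the cut-off of three variables per group are illustrative assumptions, and the count ignores the further multiplication caused by each variable having several values.

```python
from math import comb


def count_combinations(n_variables: int, max_size: int) -> int:
    """Count the variable groups of size 1..max_size drawn from n_variables."""
    return sum(comb(n_variables, k) for k in range(1, max_size + 1))


# Groups of one, two or three variables for data sets of increasing width.
for n in (10, 50, 100):
    print(n, count_combinations(n, 3))
# prints:
# 10 175
# 50 20875
# 100 166750
```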

Fortunately, this problem can be overcome with
Hotspot Profiling Analysis Software Tools.
Hotspot profiling analysis drills down into the data systematically and detects
important relationships, co-factors,
interactions, dependencies and associations amongst many variables and values
accurately using Artificial Intelligence techniques,
and generates profiles of the most interesting segments.
Hotspot analysis can identify profiles of high (and low) risk loans accurately
through thorough systematic analysis of all available data.
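
The idea of profiling segments by default rate can be illustrated with a small brute-force sketch in Python. This is not the algorithm used by the hotspot profiling software, which is not described on this page. The function name, the `min_count` filter and the sample loan records are all hypothetical and exist only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def segment_default_rates(records, variables, target="default",
                          max_size=2, min_count=2):
    """Return (segment, count, default_rate) tuples, highest default rate first.

    A segment is a tuple of (variable, value) pairs. Segments with fewer than
    min_count records are dropped because their rates are unreliable.
    """
    results = []
    for size in range(1, max_size + 1):
        for combo in combinations(variables, size):
            stats = defaultdict(lambda: [0, 0])  # segment -> [count, defaults]
            for row in records:
                key = tuple((v, row[v]) for v in combo)
                stats[key][0] += 1
                stats[key][1] += row[target]
            for key, (count, defaults) in stats.items():
                if count >= min_count:
                    results.append((key, count, defaults / count))
    return sorted(results, key=lambda r: (-r[2], -r[1]))


# Made-up records: 1 means the loan defaulted, 0 means it did not.
loans = [
    {"region": "north", "employment": "casual", "default": 1},
    {"region": "north", "employment": "casual", "default": 1},
    {"region": "north", "employment": "permanent", "default": 0},
    {"region": "south", "employment": "casual", "default": 0},
    {"region": "south", "employment": "permanent", "default": 0},
    {"region": "south", "employment": "permanent", "default": 0},
]

top = segment_default_rates(loans, ["region", "employment"])
for segment, count, rate in top[:3]:
    print(segment, count, round(rate, 2))
```

In this toy data the riskiest segment is the combination of region "north" and employment "casual", which neither variable reveals as strongly on its own.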

Credit Risk Predictive Modeling

If the past is any guide to future events, predictive modeling is
an excellent technique for credit risk management.
Predictive models are developed from historical records of credit loans,
containing financial, demographic, psychographic, geographic information,
etc. From this past credit information, predictive models can learn
patterns of different credit default/delinquency ratios, and can be used to predict
risk levels of future credit loans. It is important to note that
this statistical process requires a substantially large number of
historical records (or customer loans) containing useful
information. Useful information is anything that can be
a factor that differentially affects credit default/delinquency ratios.

Neural Network is a very
powerful modeling tool.
It generally offers the most accurate and versatile
models. It is very easy to develop neural network predictive models with CMSR.
Network visualization tools guide users through configuration, training, testing,
and, more importantly, direct application to databases.

Cramer Decision Tree
produces the most compact and thus most general decision trees.
Decision trees can be used for predicting the segmentation-based
statistical probability of credit loan defaults.

Regression produces
mathematical functions for predicting default risk levels.
It can be very limiting when used as
a general-purpose credit risk predictive modeling method.
However, when it is used with the above methods, it can be a very
useful method.

RME (Rule-based Model Evaluation) is a powerful model integration tool.
It can be used to combine
a number of predictive models into a single model,
producing combined predictions such as maximum, minimum, average, etc. In addition,
it can be used to classify combined predictions into classes such as "Very high risk",
"High risk", "Medium risk", "Low risk", etc.

Does Predictive Modeling Work?

Effectiveness of predictive modeling depends on the quality of historical data.
If historical data contains information
that can predict customer tendencies and behaviors, predictive modeling can be very effective.
Otherwise, reliable predictive models will be difficult to obtain.
How can you know whether your customer data contain predictive information?
You need to perform variable relevancy analysis, build models and test them!
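
This page does not specify how variable relevancy analysis is carried out. As one possible stand-in, the Python sketch below computes a Pearson chi-square statistic between a categorical variable and a 0/1 default flag: a value near zero suggests the variable carries little predictive information, while larger values suggest more. The function name and the sample records are hypothetical.

```python
from collections import Counter


def chi_square(records, variable, target="default"):
    """Pearson chi-square statistic between a categorical variable and a target."""
    n = len(records)
    cell = Counter((row[variable], row[target]) for row in records)
    var_totals = Counter(row[variable] for row in records)
    tgt_totals = Counter(row[target] for row in records)
    stat = 0.0
    for v, v_n in var_totals.items():
        for t, t_n in tgt_totals.items():
            expected = v_n * t_n / n  # count expected if variable and target were independent
            stat += (cell[(v, t)] - expected) ** 2 / expected
    return stat


# Made-up data: "housing" separates defaults perfectly, "gender" not at all.
sample = [
    {"housing": "rent", "gender": "f", "default": 1},
    {"housing": "rent", "gender": "m", "default": 1},
    {"housing": "own", "gender": "f", "default": 0},
    {"housing": "own", "gender": "m", "default": 0},
]

for name in ("housing", "gender"):
    print(name, chi_square(sample, name))
```

With real data, variables would be ranked by such a statistic and the weakest ones left out before models are built and tested.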

Free Test Trial Program

If your organization has data that can be used to develop predictive models, please
write to us by filling in the form CMSR Data Miner Download Application.
We will provide software and email technical support free for up to a year.
You will also receive "Predictive Modeling Guide to Credit and Insurance Risk Scoring" ebooks.
If you are unsure of how predictive models can be used, please try
MyDataSay Android App.

Credit Risk Scoring

A credit risk score
is a risk rating of credit loans. It measures
the level of risk of default/delinquency. The level of default/delinquency risk can be best
predicted with predictive modeling. Credit scores can be expressed in terms of
default/delinquency probability and/or relative numerical ratings.
The following subsections outline credit risk scoring methods:

Why Neural Network?

A commonly used method in risk prediction is regression.
Regression works well if the information structure is functional and simple.
However, it does not perform well on complex information with many categorical variables.
Another commonly used method is the decision tree.
Decision trees are not suitable if dependent variables are heavily
skewed. Credit loan data have this skew. This makes neural networks
the choice for credit risk modeling.
The following figure shows a neural network model:

Neural network weight-links are computed in such a way that, given input
values, the network produces certain output value(s) at the output layer node(s).
This process is called network training. It is performed using past data.
A neural network is a heuristic predictive system.

Bias nodes are similar to the constant (intercept) term in regression. They have
a fixed value of 1 and tend to improve the network's learning capability.
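
How weight-links and bias nodes turn inputs into an output can be sketched in a few lines of Python. This is only an illustration: the single hidden layer, the sigmoid activation and all weight values are assumptions, not the configuration used by CMSR.

```python
import math


def sigmoid(x):
    """Squash any real number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, output_weights):
    """Forward pass through one hidden layer.

    Each row of weights holds one node's incoming weight-links. The last
    weight in every row belongs to the bias node, whose value is always 1.
    """
    with_bias = list(inputs) + [1.0]
    hidden = [sigmoid(sum(w * x for w, x in zip(row, with_bias)))
              for row in hidden_weights]
    hidden_with_bias = hidden + [1.0]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden_with_bias)))
            for row in output_weights]


# Two inputs, two hidden nodes, one output node; weights are arbitrary.
hidden_w = [[0.8, -0.4, 0.1],
            [-0.3, 0.9, -0.2]]
output_w = [[1.2, -0.7, 0.05]]
print(forward([0.2, 0.7], hidden_w, output_w))
```

Training adjusts the numbers in `hidden_w` and `output_w` until the outputs match the outcomes recorded in past data as closely as possible.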

In the above chart, weight-links with positive values are colored red and weight-links
with negative values are colored blue. Colors are scaled according to the ratio of each
absolute value to the largest absolute value. An absolute value of zero is colored white.
The largest absolute value is colored pure red or pure blue. The rest are scaled
accordingly.
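
The color scaling described above can be sketched as a small Python function. The linear fade from white to the pure color is an assumption, since the exact scaling used in the chart is not stated, and the function name is hypothetical.

```python
def weight_color(weight, max_abs):
    """Map a weight to an (R, G, B) tuple.

    Zero maps to white, +max_abs to pure red and -max_abs to pure blue.
    Values in between fade linearly (an assumed scaling).
    """
    ratio = 0.0 if max_abs == 0 else min(abs(weight) / max_abs, 1.0)
    fade = round(255 * (1.0 - ratio))
    if weight >= 0:
        return (255, fade, fade)  # shades of red
    return (fade, fade, 255)      # shades of blue


weights = [2.0, -1.0, 0.0, -4.0]
largest = max(abs(w) for w in weights)
for w in weights:
    print(w, weight_color(w, largest))
```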

Note that neural networks are not good at predicting from unseen information, that is,
inputs unlike anything in the training data. They can make very wild predictions.
Thus good training data is very important.

In the following sections, credit risk modeling steps are described.

Step 1: Develop Neural Network Models

Predictive models infer predictions from a set of variables called independent variables.
To develop models, the first step is to analyze which variables contain predictive
information through relevancy analysis. Once relevant variables are identified,
(neural network) models can be configured and trained using historical data.
Neural network training is a repetitive process which may take a long time. A fast computer
may be needed. Fully trained models should be tested using historical data
before they are used. Single models can have biases and weaknesses. To overcome this,
multiple models can be developed and combined as described in the next
section.
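
One common way to test a model on historical data is to hold back part of the records during training and measure prediction error on that part only. The Python sketch below assumes this hold-out approach, which this page does not prescribe. The function names, the 30% test fraction and the constant `predict` function are hypothetical placeholders.

```python
import random


def holdout_split(records, test_fraction=0.3, seed=42):
    """Shuffle a copy of the records and return (training, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def mean_absolute_error(predict, test_records, target="default"):
    """Average absolute gap between predicted scores and actual 0/1 outcomes."""
    return sum(abs(predict(r) - r[target]) for r in test_records) / len(test_records)


# Made-up history: every fourth loan defaulted.
history = [{"loan_id": i, "default": 1 if i % 4 == 0 else 0} for i in range(20)]
training, test = holdout_split(history)

# Placeholder model that always predicts the overall default rate.
rate = sum(r["default"] for r in training) / len(training)
print(mean_absolute_error(lambda r: rate, test))
```

A trained neural network would take the place of the placeholder model, and its error on the held-back records indicates how it may behave on new loans.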

Step 2: Combine Neural Network Models

Once models are fully trained and tested, they can be integrated to produce combined outputs
such as largest (=maximum), smallest (=minimum), average, average without largest and smallest values,
etc. This can be done easily using RME (Rule-based Model Evaluation, available in CMSR Data Miner).
The following histogram shows largest (=maximum) scores and the risk distribution in the historical data.
The horizontal axis, "RSCORE1", represents the combined largest (=maximum) values.
The vertical axis shows the risk proportion.
It clearly shows that higher scores have a higher proportion of risk in the historical data.
So the models are effective and useful.
Note that the neural network models are trained to predict values between 0 and 1. Actual outputs can
be slightly above 1 or slightly below 0, as seen in the histogram.
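
The combined outputs listed above are simple to compute once each model has produced a score for a loan. The following Python sketch shows the arithmetic only. It is not RME syntax, and the function name and sample scores are hypothetical.

```python
def combine_scores(scores):
    """Combine the scores that several models gave to the same loan."""
    ordered = sorted(scores)
    # Drop the largest and smallest score when there are enough models.
    trimmed = ordered[1:-1] if len(ordered) > 2 else ordered
    return {
        "maximum": ordered[-1],
        "minimum": ordered[0],
        "average": sum(ordered) / len(ordered),
        "trimmed_average": sum(trimmed) / len(trimmed),
    }


# Scores from four hypothetical neural network models for one loan.
print(combine_scores([0.125, 0.75, 0.25, 0.5]))
```

Taking the maximum is the cautious choice: a loan is treated as risky as soon as any one model flags it.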

Step 3: Risk Scores to Risk Classification

Risk scores produced by neural network and RME models can be confusing to
users. It is better if they are translated into more easily understood
terms such as "Very high risk", "High risk", "Medium risk", "Low risk", etc.
The above histogram clearly shows that if the maximum risk score is equal to or greater
than 0.6, it has 100% risk. So it can be coded as "Very high risk".
The next class: if the maximum risk score is equal to or greater than 0.3, it has "High risk".
The next class: if the maximum risk score is equal to or greater than 0.2, it has "Medium risk".
The rest have "Low risk".
This classification produces risk distribution as in the following chart.

This chart shows how much risk each class had in the historical data.
This classification is coded using an RME model. You need two RME models:
one to combine scores and produce maximum scores for analysis,
and another to produce the classification for deployment.
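
The classification rules and the per-class risk check can be sketched in Python as follows. The thresholds 0.6, 0.3 and 0.2 are the ones read from the histogram on this page. The function names and the sample loans are hypothetical, and the code is not RME syntax.

```python
def classify_risk(max_score):
    """Translate a combined maximum score into a risk class."""
    if max_score >= 0.6:
        return "Very high risk"
    if max_score >= 0.3:
        return "High risk"
    if max_score >= 0.2:
        return "Medium risk"
    return "Low risk"


def risk_by_class(scored_loans):
    """Proportion of defaulted loans per class.

    scored_loans is an iterable of (max_score, defaulted) pairs, where
    defaulted is 1 for a default/delinquency and 0 otherwise.
    """
    totals = {}
    for score, defaulted in scored_loans:
        label = classify_risk(score)
        count, defaults = totals.get(label, (0, 0))
        totals[label] = (count + 1, defaults + defaulted)
    return {label: defaults / count for label, (count, defaults) in totals.items()}


# Made-up scored history.
past = [(0.7, 1), (0.65, 1), (0.35, 1), (0.4, 0), (0.25, 0), (0.1, 0), (0.15, 1)]
print(risk_by_class(past))
```

Thresholds should be re-read from a histogram of your own data, since the values used here come from artificially generated data.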

* Note that the charts used on this page are based on artificially generated data.
Your data may not produce similar outcomes.

* Expanded documentation of the above model can be found in
MyDataSay Android Application. Download is available here
MyDataSay Android App.

Step 4: Deploy Models for Users

Once models are fully trained, tested and combined into RME models, they
are ready to deploy to customer-facing users. We provide the following
deployment options:

Web Server: Rosella BI Server provides predictive models to users through the web.
It can support a large number of users. It is optimized for small-screen devices such as
smartphones and tablets as well as normal computers. In addition, it can be incorporated into
your web-based business applications using Java JSP pages.

Android Application: MyDataSay is an Android application which can be used by an unlimited
number of users.
Download is available here MyDataSay Android App.
We recommend that you download and try MyDataSay to learn how
predictive modeling can be used.

Android App for Credit Risk Predictive Models (Downloads)

An Android app for predictive models is available for download.
You can install it on your Android phones and tablets and try out how predictive models
are used in credit risk management. It is a perfect app for deploying your delinquency/default
prediction models to customer-facing staff, who are the eventual users of
such models.
Download is available here MyDataSay Android App.