“Please, explain.” Interpretability of machine learning models

In February 2019, the Polish government added an amendment to its banking law that gives customers the right to receive an explanation in the case of a negative credit decision.

It’s one of the direct consequences of implementing GDPR in the EU.

This means that a bank needs to be able to explain why a loan wasn’t granted if the decision process was automated.

In October 2018, world headlines reported on an Amazon AI recruiting tool that favored men.

Amazon’s model was trained on biased data that was skewed towards male candidates.

It built rules that penalized résumés containing the word “women’s”.

What the two examples above have in common is that both the models used in the banking industry and the one built by Amazon are very complex tools, so-called black-box classifiers, that don’t offer straightforward, human-interpretable decision rules.

Financial institutions will have to invest in model interpretability research if they want to continue using ML-based solutions.

And they probably will, because such algorithms are more accurate in predicting credit risk.

Amazon, on the other hand, could have saved a lot of money and avoided bad press if the model had been properly validated and understood.

Machine learning has stayed near the top of Gartner’s Hype Cycle since 2014, only to be replaced by deep learning (a form of ML) in 2018, suggesting that adoption hasn’t reached its peak yet.

Machine learning growth is predicted to accelerate further.

According to a report by Univa, 96% of companies are expected to use ML in production within the next two years.

The growth in ML adoption is accompanied by an increase in ML-interpretability research, driven by regulations like GDPR and the EU’s “right to explanation”, by concerns about safety (medicine, autonomous vehicles), reproducibility, and bias, and by end users’ expectations (debugging a model to improve it, or learning something new about the studied subject).

As data scientists, we should be able to provide end users with an explanation of how a model works.

However, this does not necessarily mean understanding every piece of the model or generating a set of decision rules.

There could also be cases where this is not required: if we look at the results of Kaggle’s Machine Learning and Data Science Survey from 2018, around 60% of respondents think they could explain most machine learning models (though some models were still hard for them to explain).

The most common approach to understanding a model is to analyze its features by looking at feature importance and feature correlations.

Feature importance analysis offers good first insights into what the model is learning and which factors might be important.
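
Here is a minimal sketch of this kind of first look, assuming scikit-learn, a tree-based model, and a synthetic dataset; the column names and model choice are illustrative only, not taken from the examples above:

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data stands in for real credit or recruiting features.
X, y = make_classification(n_samples=1000, n_features=6,
                           n_informative=3, random_state=0)
X = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(6)])

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances: a quick first look at what the model relies on.
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))

# Pairwise feature correlations: high values warn that importances
# may be split or shifted between correlated columns.
print(X.corr().round(2))
```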

However, this technique can be unreliable if features are correlated.

It can provide good insights only if model variables are interpretable.
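
A small illustration of that caveat, again only a sketch on synthetic data: duplicating an informative column (so the two columns are almost perfectly correlated) splits the importance between the copies, and neither looks as important as the underlying signal really is.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# shuffle=False keeps the informative columns first, so feature_0 is informative.
X, y = make_classification(n_samples=1000, n_features=5, n_informative=2,
                           n_redundant=0, shuffle=False, random_state=0)
X = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(5)])

# Add a near-duplicate of feature_0 (correlation close to 1).
rng = np.random.RandomState(0)
X["feature_0_copy"] = X["feature_0"] + rng.normal(0, 0.01, size=len(X))

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
importances = pd.Series(model.feature_importances_, index=X.columns)

# Expect feature_0 and feature_0_copy to share importance that would
# otherwise be attributed to feature_0 alone.
print(importances.sort_values(ascending=False))
```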