Machine Learning & Its Impact on the Future of Insurance

Interest in machine learning, and the appetite to drive business outcomes from such investments, continues to build. Over the past 18 months I’ve been talking to many insurance organisations about machine learning, and four consistent areas tend to arise as organisations grapple with its application and value.

First and foremost, machine learning WILL change the way insurers do business. The insurance industry is founded on forecasting future events and estimating the value and impact of those events, and it has used established predictive modeling practices – especially in claims loss prediction and pricing – for some time. With big data and new data sources such as sensors/telematics, external sources (Data.gov), digital interactions, and social and web sentiment, the opportunity to apply machine learning techniques across new areas of insurance operations has never been greater.

Machine learning has now become an essential tool for insurers, used extensively across the core value chain to understand risk, claims and customer experience. Specifically, it enables insurance companies to achieve higher predictive accuracy, because it can fit more flexible and complex models. Unlike traditional statistical methods, machine learning takes full advantage of data analytics and is capable of combining seemingly unrelated datasets, whether structured, semi-structured or unstructured.

By way of example, predictive models based on machine learning can now take unstructured data into consideration: more than ever, insurers are able to evaluate vast volumes of underwriting and claims notes and diary entries, in addition to more standard documentation.
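To make that concrete, here is a minimal sketch, assuming scikit-learn and pandas, of fitting a single model on structured claim fields combined with unstructured adjuster notes. The column names and toy records are hypothetical, not from any insurer’s schema.

```python
# Sketch: one model over structured fields plus free-text notes.
# All data and column names are hypothetical.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

claims = pd.DataFrame({
    "claim_amount": [1200.0, 54000.0, 800.0, 23000.0],
    "days_to_report": [1, 45, 2, 30],
    "notes": [
        "minor rear bumper damage, police report filed",
        "late report, no witnesses, prior similar claim",
        "windshield chip, photos attached",
        "injury claim, attorney involved immediately",
    ],
    "fraud": [0, 1, 0, 1],  # labels from audited closed claims
})

features = ColumnTransformer([
    ("num", StandardScaler(), ["claim_amount", "days_to_report"]),
    ("text", TfidfVectorizer(), "notes"),  # notes become term weights
])

model = Pipeline([("features", features),
                  ("clf", LogisticRegression(max_iter=1000))])
model.fit(claims[["claim_amount", "days_to_report", "notes"]], claims["fraud"])

# Score claims: probability that each one merits a closer look.
scores = model.predict_proba(claims[["claim_amount", "days_to_report", "notes"]])[:, 1]
```

The design point is that the text pipeline and the numeric pipeline feed the same classifier, so patterns in the notes can reinforce or contradict what the structured fields suggest.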

Pricing risk, estimating losses and monitoring fraud are critical areas that machine learning can support. Insurers have introduced machine learning algorithms primarily for risk similarity analytics, risk appetite and premium leakage. However, the techniques are also widely used to model claim frequency and severity, manage expenses, and support subrogation (in general insurance), litigation and fraud detection.

One of the most impactful machine learning use cases is the ability to learn from audits of closed claims: for the first time, leakage becomes controllable by the insurer. Claim audits are traditionally a manual process; machine learning techniques provide an uplift in the ability to learn from them by applying enhanced scoring and process methods throughout the claims lifecycle.

Those claim-handling algorithms can also be used to monitor and detect fraud. One limiting factor, however, may be the number of confirmed claims fraud cases an insurance company has on record, as labelled fraud datasets are fundamental for both traditional and machine learning models.
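Because confirmed fraud is typically a small fraction of all audited claims, any model trained on that history has to cope with heavily imbalanced labels. A minimal sketch, assuming scikit-learn and synthetic stand-in data, of one common mitigation, class weighting:

```python
# Sketch: training with scarce fraud labels via class weighting.
# X and y are synthetic stand-ins, not real claims data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))        # stand-in claim features
y = (X[:, 0] > 2.0).astype(int)       # ~2% "fraud", driven by one feature

# class_weight="balanced" up-weights the rare fraud class so the model
# is not dominated by the overwhelming majority of valid claims.
clf = LogisticRegression(class_weight="balanced", max_iter=1000)
recall = cross_val_score(clf, X, y, cv=5, scoring="recall")
print("cross-validated fraud recall:", recall.mean())
```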

I’m often asked whether machine learning can deliver a tangible decline in fraud rates, and I do believe it can have an impact through earlier identification, or ‘counter-fraud’, techniques. The key is to reduce false positives by applying machine learning algorithms that help determine which claims are potentially fraudulent and which are legitimate.

Insurance companies applying this technique reduce fraud in two ways: fraud is identified earlier, and investigators’ time is allocated to genuinely suspicious claims rather than to valid ones. This also increases customer satisfaction, as valid claims are paid faster.
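One concrete way to manage that trade-off is to tune the score threshold at which a claim is referred for investigation. A minimal sketch, assuming scikit-learn and fully synthetic data (no real claims), of choosing a cutoff from the precision/recall curve:

```python
# Sketch: pick the lowest referral threshold that keeps false positives
# in check. Synthetic data; names are illustrative only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, weights=[0.97], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
probs = model.predict_proba(X_te)[:, 1]

precision, recall, thresholds = precision_recall_curve(y_te, probs)
# Require, say, 80% precision: no more than 1 in 5 referred claims
# should turn out to be valid. Report the first threshold that hits it.
for p, r, t in zip(precision, recall, thresholds):
    if p >= 0.80:
        print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f}")
        break
```

Claims scoring below the chosen cutoff can be fast-tracked for payment, which is exactly where the customer-satisfaction gain comes from.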

Nothing evidences the impact of a technology more than how it is applied in the real world, and we are seeing exactly that with insurance fraud. Using machine learning, insurers can load claims data – whether structured, semi-structured or unstructured – into a huge repository, often called a “data lake”. This method differs from traditional predictive models, which only leverage structured data. Claims notes, diaries and documents are key to discovering fraud and developing fraud models. For fraud detection, the procedure consists of three phases (sketched in code after the list):

Learning Phase: learn from “training data” – claims known to be fraudulent and claims known to be valid. It consists of pre-processing (normalization, dimension reduction, and image processing if you are using photos, aerial images, etc.), learning (supervised, unsupervised, loss minimization, etc.) and error analysis (precision, recall, overfitting, test/cross-validation, etc.).

Prediction Phase: apply the model from the learning phase to new data; the deployed model detects and flags potentially fraudulent claims.

Continuous Learning Phase: it is key to continuously recalibrate your models with new data and behaviors.
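A minimal end-to-end sketch of these three phases, assuming scikit-learn and synthetic stand-in features (all names hypothetical). An incremental learner is used so the recalibration step can fold in newly confirmed labels without retraining from scratch:

```python
# Sketch: learning, prediction, and continuous-learning phases.
# Features and labels are synthetic stand-ins for extracted claim data.
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 8))
y = (X[:, 0] + rng.normal(scale=2.0, size=2000) > 2).astype(int)

# Learning phase: fit on labeled training claims; hold out a test set
# for error analysis (precision/recall, over-fitting checks).
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
model = SGDClassifier(loss="log_loss", random_state=1)
model.partial_fit(X_tr, y_tr, classes=np.array([0, 1]))
print(classification_report(y_te, model.predict(X_te)))

# Prediction phase: score newly arriving claims and flag suspects.
new_claims = rng.normal(size=(5, 8))
flags = model.predict(new_claims)

# Continuous learning phase: once investigators confirm outcomes,
# feed the new labels back in to recalibrate the model incrementally.
confirmed_labels = np.array([0, 1, 0, 0, 1])
model.partial_fit(new_claims, confirmed_labels)
```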

In addition to machine learning, the use of graph analytics is rapidly gaining popularity because of its ability to visualise fraud patterns.

Graph analytics with Apache Spark/GraphX is a newer method being leveraged, as it enables the analysis of relationship networks and social networks, which is key in claims fraud analysis. It is becoming quite popular versus traditional claims scoring or business rules, as those methods (considered “flagging models”) may result in too many false positives.

A graph analytics technique can help you understand the relationships in the data and is also used for investigating individual claims fraud cases. It allows insurance companies to visualize fraud patterns far more quickly than traditional scoring models do.
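GraphX itself exposes a Scala API on Spark; to illustrate the same idea in a self-contained way, the sketch below substitutes the networkx Python library. Claims that share identifying details – here hypothetical phone numbers and repair garages – fall into the same connected component, surfacing possible rings that per-claim scoring would miss:

```python
# Sketch: link analysis over claims that share entities.
# All claims, phone numbers and garages are hypothetical.
import networkx as nx

claims = {
    "CLM-1": {"phone": "555-0101", "garage": "A1 Auto"},
    "CLM-2": {"phone": "555-0101", "garage": "B2 Body"},
    "CLM-3": {"phone": "555-0202", "garage": "B2 Body"},
    "CLM-4": {"phone": "555-0303", "garage": "C3 Collision"},
}

G = nx.Graph()
for claim, attrs in claims.items():
    for kind, value in attrs.items():
        G.add_edge(claim, f"{kind}:{value}")  # bipartite claim/entity edges

# Connected components group claims tied together by shared entities:
# CLM-1, CLM-2 and CLM-3 form one cluster via a shared phone and garage.
for component in nx.connected_components(G):
    linked = sorted(n for n in component if n.startswith("CLM"))
    if len(linked) > 1:
        print("possible ring:", linked)
```

The same bipartite-graph construction scales out on Spark/GraphX or GraphFrames when the claim volume outgrows a single machine.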


Comments

Really interesting article and well written! I am an Actuarial Science/Computer Science student at Drake University. Personally, I predict that the application of machine learning to the actuarial science/insurance field will come sooner than anyone thought! With the rise of Google’s machine learning libraries, TensorFlow and Keras, I can foresee my actuarial science examinations preparing me for very little. These neural network implementations can predict mortality risk, evaluate future pricing, and even estimate optimal retirement plans with ease. I am not saying this without references; I have provided some links below for your reference:

Hi, thanks for the article. I am a business consultant working in the insurance industry. Can you provide examples of some machine-learning-based commercial products being used or trialled in the industry today? Everyone mentions IBM Watson, but I rarely hear of other commercial products in this space.

Suyash – for commercially available products, take a look at SynerScope with their ixiwa and iximeer products; H2O.ai; DataRobot; and CognitiveScale. Apache Spark would definitely be an open-source option, and you will see many of the commercially available products leveraging it in their solutions.

