External data presents a significant set of challenges to insurers looking to bolster existing analytics platforms and better define core customers. These barriers share a common thread: each increases time, cost and effort without guaranteeing a return on investment.

These challenges must be overcome because adding in external data is “more than a ‘nice to have,’ it’s a ‘must have,’” said Vandenberg. “There’s a general impression that you need to have external data to get a good model.”

The same set of barriers to implementing external data analytics exists for marketing, pricing, underwriting, claims and risk management in the property and casualty insurance market.

New data choices

It is very unlikely that an insurer will be able to know what data is important and predictive before the analysis process has begun. Choosing the right data to include in new analysis is never going to be a perfect process.

This barrier is made more difficult by the sheer amount of data available on the market. Not only are new sources cropping up that collect information on public records or aggregate social media speech around new topics, but relevant information often has to be extracted from large datasets.

The process of selecting, processing and integrating new external data adds time and expense to any analysis. Usefulness is a balance of cost, quality, and any impact on customer service that may come from shifting models or products.

To use this data successfully, it also must be verifiable.

Verification requires the insurer to find either other relevant data that supports some of the same conclusions or subject matter experts with whom it can discuss these trends.

Privacy concerns

For public concerns, the largest hurdle is ensuring that external datasets are properly gathered and monitored for privacy protections. Both the insurer and its data vendor must make sure they do not violate any terms of service or other conditions set by a digital service provider, such as Twitter for aggregated tweet data.

Privacy violations can do substantial harm to a company’s reputation and safety. They can also present an unclear danger because public reaction is notoriously hard to predict when it comes to information not readily thought of as publicly available.

Insurers must be confident that they are using the data in an appropriate and acceptable way. “You have to make sure that appropriate privacy protections are in place and that data is being used in an ethical manner,” said Johnson.

Time Delays

Adding new data to a current analytics process necessitates adding time for the process to complete and for data to become actionable. If this analysis takes too long, the insurer will miss the opportunity provided.

For example, property and casualty insurers can target users moving to highly residential areas or densely populated urban areas for new homeowners or renters insurance plans. This information may be announced on social media sites directly by users, but if processing it takes two months, the consumer may have made the move and selected their insurance before new marketing reaches them.

Renter’s insurance can be a prerequisite to signing a lease, so the external data pointing to a move must become actionable before the move itself takes place.
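The timing constraint above can be sketched as a simple date check. This is a minimal illustration, not a method described by the insurers interviewed: the function name, the parameters and the 14-day outreach lead time are all assumptions chosen for the example.

```python
from datetime import date, timedelta


def is_actionable(signal_date, expected_move_date, processing_days,
                  outreach_lead_days=14):
    """Return True if the processed signal is ready early enough to act on.

    outreach_lead_days is a hypothetical buffer: how long before the move
    marketing must reach the consumer to influence the insurance choice.
    """
    # Date on which the processed data becomes usable by marketing.
    ready_date = signal_date + timedelta(days=processing_days)
    # Last date on which outreach can still matter.
    latest_useful_date = expected_move_date - timedelta(days=outreach_lead_days)
    return ready_date <= latest_useful_date


# Illustrative dates only: a move announced six weeks ahead of time.
signal = date(2024, 3, 1)
move = date(2024, 4, 15)

fast = is_actionable(signal, move, processing_days=7)    # one-week pipeline
slow = is_actionable(signal, move, processing_days=60)   # two-month pipeline
print(fast, slow)
```

With these made-up dates, the one-week pipeline leaves room for outreach, while the two-month pipeline delivers the lead after the move has already happened.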

If a company has invested time in reviewing and processing a dataset but it turns out to have no correlative value, or it is predictive but has only a small influence and is not worth the complexity of integrating into existing systems, then that is simply “burnt time,” said Vandenberg.

Increasing Data Volume

Handling an increasing amount of data is a major challenge for insurers; the problems of collecting the data itself have been outstripped by problems associated with its maintenance.

The most commonly expressed concern about data volume is that analytics cannot maintain the same pace as data gathering and integration when adding in external sources. Insurers will need a clear structure or plan for maintaining actionable data at scale.

“Unstructured data is a new frontier in potentially valuable data; at the same time you have to apply different tools and techniques to it,” said Vandenberg. He added that the increased volume is exacerbated by the fact that fewer people can analyze it properly, and that companies need to have storage in place as soon as the data is selected for processing.

Volume-capable infrastructure

Insurers often use legacy infrastructure that cannot meet the storage and processing demands of data analysis. While storage costs have declined, available and collected data volumes have exploded.

This has prompted a move to cloud storage because it’s relatively low-cost and can scale as need increases. Without cloud options, few insurers could adopt external data analytics.

Cloud systems vary greatly across vendors and providers, adding security concerns for data that lives and is processed outside of an insurer’s own private network. In the event of a security breach, the more data an insurer has provided to a cloud analytics platform, the greater the potential harm.

Adding in cloud platforms can also exacerbate existing infrastructure problems that leave data siloed in different departments. Underwriting and claims departments must collaborate to develop a case for expanding platforms that can bridge their respective departments in order to leverage the data they own. To get an insurer’s executives on board, the plan must include a reasonable strategy for covering current costs and future costs or upgrades.

Room to innovate

Insurers tend to be risk averse, leaving little room to innovate or take chances developing and testing new systems that may not work or have the potential to unnecessarily change rate structure or customer composition.

This means that a cultural change can often be the biggest impediment to adding in external data. Many insurers view their industry and their products as “so unique that external data insights would only have a small impact,” a viewpoint that limits the scope in which data can assist their work and models, said Belhe.

To combat that, a chief recommendation by industry analysts is to develop a testing lab or an innovation space. However, this increases costs by adding simulation equipment and staff while extending time-to-market because of the general practice of including outside stakeholders in the development process.

Costs also rise as companies test different types of new datasets because of the time and skill it takes to verify that vendor-provided information can be adapted or integrated to fit within existing systems.

These labs can improve idea generation and development, but their cost may still be viewed as prohibitive.

The other barrier to innovation is that these labs and tests must be performed on cohorts within the insurer’s existing pool in order to be valid for its customer base. While some testing can focus on lead generation and new client acquisition, using analytics to enhance existing underwriting or coverage requires testing on current clients if the lessons and governance are to have any meaningful impact near-term.

Pre-determining needs

The best data model for property and casualty insurers is to have a small number of data points that are very actionable and correlate well to known issues around customer risk, engagement and conversion. To reach an optimal level of data for this model when starting a new project or expanding datasets, insurers need to have the proper data before a project starts.

The catch for insurers is that they must first analyze data in relation to their goals and existing information before they can determine if new datasets fall into a valuable use-category.

“The one fundamental thing about external data is that when we need to have it is actually before the project starts,” said Vandenberg.
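One way to picture this kind of up-front screening is to score each candidate external field against a known outcome and keep only a short list. The sketch below is an assumption-laden illustration, not the approach of any insurer quoted here: the field names, the sample values, the 0.3 correlation threshold and the three-field cap are all hypothetical, and plain Pearson correlation stands in for whatever measure an analytics team would actually use.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant field carries no signal
    return cov / sqrt(var_x * var_y)


def shortlist(candidates, outcome, min_abs_corr=0.3, max_fields=3):
    """Keep the few fields whose correlation with the outcome is strongest."""
    scored = [(name, pearson(values, outcome))
              for name, values in candidates.items()]
    kept = [(name, r) for name, r in scored if abs(r) >= min_abs_corr]
    kept.sort(key=lambda item: abs(item[1]), reverse=True)
    return kept[:max_fields]


# Made-up test batch: 1 means the policyholder filed a claim.
outcome = [0, 0, 1, 1]
candidates = {
    "prior_claims": [0, 1, 2, 3],    # hypothetical external field
    "constant_field": [5, 5, 5, 5],  # never varies, so it cannot predict
    "noise": [1, 2, 2, 1],           # unrelated to the outcome
}

selected = shortlist(candidates, outcome)
print(selected)
```

A screen like this only shows association on a small batch; it does not replace the verification against other data or subject matter experts that the article describes.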

A trend is for insurers to use smaller batches as test cases and try innovation in smaller settings, but this increases overall investment if initial data selections are inappropriate. Insurers must also view potential insights from every aspect of their business, increasing the team and expertise required for testing.

Insurers must move beyond the risk aversion that causes reliance on existing internal data. That reliance can cost them existing and new customers, who may move to an improved competitor that takes advantage of external data mining benefits.

Defining Bias

Social media sources are limited in scope and very context-specific, so their inclusion in internal data systems must be qualified by the possibility of sample bias and other validity concerns. Correlations can be validated by discussing the findings with subject matter experts, but this too expands the time required.
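A rough first check for sample bias is to compare how each customer group is represented in the social media sample against the insurer's own book of business. The following is a minimal sketch under stated assumptions: the age bands, the counts and the 0.5 tolerance are invented for illustration, and the function names are hypothetical.

```python
def representation_ratios(sample_counts, base_counts):
    """Ratio of each group's share in the sample to its share in the base.

    A ratio near 1.0 means the group is represented about as heavily in
    the external sample as in the insurer's own customer base.
    """
    sample_total = sum(sample_counts.values())
    base_total = sum(base_counts.values())
    ratios = {}
    for group, base_n in base_counts.items():
        if base_n == 0:
            continue  # no baseline share to compare against
        base_share = base_n / base_total
        sample_share = sample_counts.get(group, 0) / sample_total
        ratios[group] = sample_share / base_share
    return ratios


def flag_skewed(ratios, tolerance=0.5):
    """Groups whose ratio differs from 1.0 by more than the tolerance."""
    return sorted(g for g, r in ratios.items() if abs(r - 1.0) > tolerance)


# Made-up counts by age band.
customer_base = {"18-34": 250, "35-54": 500, "55+": 250}
social_sample = {"18-34": 600, "35-54": 350, "55+": 50}

ratios = representation_ratios(social_sample, customer_base)
flagged = flag_skewed(ratios)
print(flagged)
```

In this invented example the youngest band is heavily over-represented and the oldest band is nearly absent, which is the kind of skew that would argue against applying social media findings to the whole customer base without further validation.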

Pursuing a quick return on investment may lead the insurer to over-rely on the data for marketing and customer engagement while using up the allocated budget, leaving trend and cohort verification through other external data cost-prohibitive.

Social media data is often touted as a new resource in understanding risk and fraud determination, but the insurers we spoke to do not yet feel it is reliable enough because of its high propensity toward bias.

“When you’re talking about the social media data set, probably 95% is fluff,” said Belhe. “Effectively coming to that 3% to 4% where you can find something that is noteworthy is still not easy.”