Natura non facit saltus

(see John Wilkins’ article on the – interesting – history of that phrase http://scienceblogs.com/evolvingthoughts/…). We will see, this week in class, several smoothing techniques, for insurance ratemaking. As a starting point, assume that we do not want to use segmentation techniques: everyone will pay exactly the same price.

no segmentation of the premium

And that price should be related to the pure premium, which is proportional to the frequency (or the annualized frequency, as discussed previously), since

The probability measure is mentioned here just to recall that we can use any measure. Even (based on some covariates). Without any covariate, the expected frequency should be

Thus, if we do not want to take into account potential heterogeneity, we should assume that where is closed to 7.28%. Yes, as mentioned in class, it is rather common to see as a percentage, i.e. a probability, since

i.e. can be interpreted as the probability of not have a claim (see also the law of small numbers). Let us visualize this as a function of the age of the driver,

It is possible to compute not the expected frequency , but the ratio .

Above the horizontal blue line, the premium will be higher than the one obtained without segmentation, and (of course) lower below. Here, drivers younger than 44 year old will pay more, while driver older than 44 year old will be less. We have discussed, in the introduction, the necessity of segmentation. If we consider two companies, one segmenting, while the other one has a flat rate, then older drivers will go to the first company (since insurance is cheaper) while younger ones will go to the second one (again, it is cheaper). The problem is that the second company implicitly hopes that older drivers will compensate the risk. But since they’re gone, insurance will be too cheap, and the company will loose money (if not goes bankrupt). So companies have to use segmentation techniques to survive. Now, the problem is that we cannot be sure that this exponential decay of the premium is the proper way the premium should evolve as a function of the age. An alternative can be to use nonparametric techniques to visualize to true influence of the age on claims frequency.

A pure nonparametric model

A first model can be to consider a premium, per age. This can be done considering the age of the driver as a factor in the regression,

Here, the forecast for our 40 year old driver is slightly lower than be previous one, but the confidence interval is much larger (since we focus on a very small subclass of the portfolio: drivers aged exactly 40)

So here, we did not remove the discontinuity problem. An idea here can be to consider moving regions: if the goal is to predict the frequency for a 40 year old driver, perhaps the class should be (somehow) centered around 40. And center the interval around 35 for drivers aged 35. Etc.

Moving average

Thus, it is natural to consider some local regressions, where only drivers aged almost 40 should be considered. This almost concept is related to the bandwidth. For instance, drivers between 35 and 45 can be considered as being almost40. In practice we can either consider a subset function, or we can use weights in the regressions

We do obtain a curve that can be interpreted as a local regression. But here, we do not take into account that 35 is not as close to 40 as 39 could be. An here, 34 is assumed to be very far away from 40. Clearly, we could improve that technique: kernel functions can considered, i.e. the closer to 40, the larger the weight.

Somehow, one way or another, all those models are valid. So perhaps we should compare them,

On the graph above, we can visualize the upper and the lower bound of the prediction, for the 9 models. The horizontal line is the predicted value without taking into account heterogeneity. It is possible to consider relative values, with respect to this value,

Some
sort of unpretentious (academic) blog, by a surreptitious economist and
born-again mathematician. A blog activist, and an actuary, too. Always curious.

Used to live in Paris (France),
Leuven (Belgium), Hong-Kong (China), and Montréal (Canada). Professor and researcher in
Montréal, currently back in Rennes (France). Addicted to R. ENSAE ParisTech & KU Leuven Alumni