R code for Chapter 1 of Non-Life Insurance Pricing with GLM

Insurance pricing is backwards and primitive, harking back to an era before computers. One standard (and good) textbook on the topic is Non-Life Insurance Pricing with Generalized Linear Models by Esbjörn Ohlsson and Björn Johansson. We have been doing some work in this area recently. Needing a robust internal training course and a documented methodology, we have been working our way through the book again and converting the examples and exercises to R, the statistical computing and analysis platform. This post is part of a series containing elements of the R code.

Example 1.3

Here we are concerned with replicating Table 1.4. We do it slowly, step-by-step, for pedagogical reasons.

################### Example 1.3

if (!exists("table.1.2"))
    load("table.1.2.RData")

## We calculate each of the columns individually and slowly here
## to show each step

## First we have simply the labels of the table
rating.factor <-
    with(table.1.2,
         c(rep("Vehicle class", nlevels(premiekl)),
           rep("Vehicle age",   nlevels(moptva)),
           rep("Zone",          nlevels(zon))))

## The Class column
class.num <-
    with(table.1.2,
         c(levels(premiekl), levels(moptva), levels(zon)))

## The Duration is the sum of durations within each class
duration.total <-
    c(with(table.1.2, tapply(dur, premiekl, sum)),
      with(table.1.2, tapply(dur, moptva, sum)),
      with(table.1.2, tapply(dur, zon, sum)))

## Calculate relativities in the tariff
## The denominator of the fraction is the class with the highest exposure
## (i.e. the maximum total duration): we make that explicit with the
## which.max() construct. We also set the contrasts to use this as the base,
## which will be useful for the glm() model later.
class.base <- which.max(duration.total[1:2])
age.base   <- which.max(duration.total[3:4])
zone.base  <- which.max(duration.total[5:11])

rt.class <- with(table.1.2, tapply(helpre, premiekl, sum))
rt.class <- rt.class / rt.class[class.base]
rt.age   <- with(table.1.2, tapply(helpre, moptva, sum))
rt.age   <- rt.age / rt.age[age.base]
rt.zone  <- with(table.1.2, tapply(helpre, zon, sum))
rt.zone  <- rt.zone / rt.zone[zone.base]

contrasts(table.1.2$premiekl) <-
    contr.treatment(nlevels(table.1.2$premiekl))[rank(-duration.total[1:2],
                                                      ties.method = "first"), ]
contrasts(table.1.2$moptva) <-
    contr.treatment(nlevels(table.1.2$moptva))[rank(-duration.total[3:4],
                                                    ties.method = "first"), ]
contrasts(table.1.2$zon) <-
    contr.treatment(nlevels(table.1.2$zon))[rank(-duration.total[5:11],
                                                 ties.method = "first"), ]

The contrasts could also have been set with the base= argument, e.g. contrasts(table.1.2$zon) <- contr.treatment(nlevels(table.1.2$zon), base = zone.base), which would be closer in spirit to the SAS code. But I like the idiom presented here where we follow the duration order; it also extends well to other (i.e. not treatment) contrasts. I just wish rank() had a decreasing= argument like order() which I think would be clearer than using rank(-x) to get a decreasing sort order.
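To see the two idioms side by side outside the book's data, here is a standalone sketch with a made-up three-level factor (the names f, dur, and tot are ours, purely for illustration). Both versions make the highest-exposure level the base (its contrast row is all zeros); the rank() idiom additionally orders the dummy columns by decreasing duration.

```r
## Hypothetical factor and exposure, just to compare the two idioms
f   <- factor(c("a", "b", "c", "b", "b"))
dur <- c(1, 2, 3, 4, 5)
tot <- tapply(dur, f, sum)   # total duration per level: a = 1, b = 11, c = 3
base <- which.max(tot)       # level "b" has the most exposure

## Idiom 1: name the base level explicitly
c1 <- contr.treatment(nlevels(f), base = base)

## Idiom 2: reorder the rows of the default contrasts by decreasing duration
c2 <- contr.treatment(nlevels(f))[rank(-tot, ties.method = "first"), ]

## In both cases the base level's row is all zeros
stopifnot(all(c1[base, ] == 0), all(c2[base, ] == 0))
```

The matrices are not identical (the dummy columns come out in a different order), but they encode the same base level, so the fitted model is the same up to a relabelling of coefficients.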

That was the easy part. At this stage in the book you are not really expected to understand the next step so do not despair! We just show how easy it is to replicate the SAS code in R. An alternative approach using direct optimization is outlined in Exercise 1.3 below.

## Relativities of MMT; we use the glm approach here as per the book's
## SAS code at http://www2.math.su.se/~esbj/GLMbook/moppe.sas
m <- glm(riskpre ~ premiekl + moptva + zon,
         data = table.1.2, family = poisson("log"), weights = dur)

## If the next line is a mystery then you need to
## (1) read up on contrasts or
## (2) remember that the link function is log() which is why we use exp here
rels <- exp(coef(m)[1] + coef(m)[-1]) / exp(coef(m)[1])

rm.class <- c(1, rels[1])   # See rm.zone below for the
rm.age   <- c(rels[2], 1)   # general approach
rm.zone  <- c(1, rels[3:8])[rank(-duration.total[5:11], ties.method = "first")]

## Create and save the data frame
table.1.4 <-
    data.frame(Rating.factor = rating.factor,
               Class         = class.num,
               Duration      = duration.total,
               Rel.tariff    = c(rt.class, rt.age, rt.zone),
               Rel.MMT       = c(rm.class, rm.age, rm.zone))
save(table.1.4, file = "table.1.4.RData")
print(table.1.4, digits = 3)

rm(rating.factor, class.num, duration.total, class.base, age.base, zone.base,
   rt.class, rt.age, rt.zone, rm.class, rm.age, rm.zone, m, rels)
################

The result is something like this:

   Rating.factor Class Duration Rel.tariff Rel.MMT
1  Vehicle class     1  9833.20       1.00    1.00
2  Vehicle class     2  8825.10       0.50    0.43
3  Vehicle age       1  1918.40       1.67    2.73
4  Vehicle age       2 16739.90       1.00    1.00
5  Zone              1  1451.40       5.17    8.97
6  Zone              2  2486.30       3.10    4.19
7  Zone              3  2888.70       1.92    2.52
8  Zone              4 10069.10       1.00    1.00
9  Zone              5   246.10       2.50    1.24
10 Zone              6  1369.20       1.50    0.74
11 Zone              7   147.50       1.00    1.23

Note the rather unusual and apparently inconsistent rounding in the book: 147, 1.66, and 5.16 would be better as 148 (the value is 147.5), 1.67, and 5.17.
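For the record, R's own rounding (round-half-to-even per IEC 60559) agrees with 148 here:

```r
## 147.5 is exactly representable in binary, and R rounds halves to the
## even neighbour, so three significant digits gives 148, not 147
signif(147.5, 3)
```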

Exercise 1.3

Here it gets interesting, as we get slightly different values from the authors'. Possibly a small bug on our part, but at least we provide the code so you can check. If you spot a problem, let us know in the comments.

################## Exercise 1.3

## The values from the book
g0  <- 0.03305
g12 <- 2.01231
g22 <- 0.74288

dim.names <- list(Milage = c("Low", "High"), Age = c("New", "Old"))
pyears <- matrix(c(47039, 56455, 190513, 28612),
                 nrow = 2, dimnames = dim.names)
claims <- matrix(c(0.033, 0.067, 0.025, 0.049),
                 nrow = 2, dimnames = dim.names)

## Function to calculate the error of the estimate
GvalsError <- function (gvals) {
    ## The current estimates
    g0  <- gvals[1]
    g12 <- gvals[2]
    g22 <- gvals[3]
    ## The current estimates in convenient matrix form
    G  <- matrix(c(1, 1, g12, g22), nrow = 2)
    G1 <- matrix(c(1, g12), nrow = 2, ncol = 2)
    G2 <- matrix(c(1, g22), nrow = 2, ncol = 2, byrow = TRUE)
    ## The calculated values
    G0  <- addmargins(claims * pyears)["Sum", "Sum"] /
        (sum(pyears * G1 * G2))
    G12 <- addmargins(claims * pyears)["High", "Sum"] /
        (g0 * addmargins(pyears * G2)["High", "Sum"])
    G22 <- addmargins(claims * pyears)["Sum", "Old"] /
        (g0 * addmargins(pyears * G1)["Sum", "Old"])
    ## The sum of squared errors
    error <- (g0 - G0)^2 + (g12 - G12)^2 + (g22 - G22)^2
    return(error)
}

## Minimize the error function to obtain our estimate
gamma <- optim(c(g0, g12, g22), GvalsError)
stopifnot(gamma$convergence == 0)
gamma <- gamma$par

values <- data.frame(legend = c("Our calculation", "Book value"),
                     g0  = c(gamma[1], g0),
                     g12 = c(gamma[2], g12),
                     g22 = c(gamma[3], g22),
                     row.names = "legend")
print(values, digits = 4)
## Close, but not the same.

rm(g0, g12, g22, dim.names, pyears, claims, gamma, values)
################

The resulting table is something like:

                    g0    g12    g22
Our calculation 0.0334 1.9951 0.7452
Book value      0.0331 2.0123 0.7429

Close, but not the same. Perhaps they used a different error function.
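One way to probe that speculation is to change the objective and see where the optimum lands. The sketch below swaps the squared absolute errors for squared relative errors; both the variant GvalsRelError() and the choice of relative errors are our guesses, not anything from the book.

```r
## Data as in Exercise 1.3 above
dim.names <- list(Milage = c("Low", "High"), Age = c("New", "Old"))
pyears <- matrix(c(47039, 56455, 190513, 28612),
                 nrow = 2, dimnames = dim.names)
claims <- matrix(c(0.033, 0.067, 0.025, 0.049),
                 nrow = 2, dimnames = dim.names)

## Same fixed-point equations as GvalsError() above, but summing
## squared *relative* errors instead of squared absolute errors
GvalsRelError <- function (gvals) {
    g0  <- gvals[1]
    g12 <- gvals[2]
    g22 <- gvals[3]
    G1 <- matrix(c(1, g12), nrow = 2, ncol = 2)
    G2 <- matrix(c(1, g22), nrow = 2, ncol = 2, byrow = TRUE)
    G0  <- addmargins(claims * pyears)["Sum", "Sum"] /
        sum(pyears * G1 * G2)
    G12 <- addmargins(claims * pyears)["High", "Sum"] /
        (g0 * addmargins(pyears * G2)["High", "Sum"])
    G22 <- addmargins(claims * pyears)["Sum", "Old"] /
        (g0 * addmargins(pyears * G1)["Sum", "Old"])
    ((g0 - G0) / G0)^2 + ((g12 - G12) / G12)^2 + ((g22 - G22) / G22)^2
}

## Start from the book's values, as before
fit <- optim(c(0.03305, 2.01231, 0.74288), GvalsRelError)
stopifnot(fit$convergence == 0)
print(fit$par, digits = 4)
```

Whether this (or some other objective) reproduces the book's published figures we leave for you to check; do tell us in the comments if you work it out.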
