apc

This package is for age-period-cohort and extended chain-ladder analysis. It supports model estimation and inference, visualization, misspecification testing, distribution forecasting and simulation. The package covers binomial, (generalized) log-normal, normal, over-dispersed Poisson and Poisson models, all of which share a linear age-period-cohort predictor. It uses the identification method by Kuang et al. (2008), implemented as described by Nielsen (2015), who also discusses the R package apc that inspired this package.
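For orientation, the linear age-period-cohort predictor can be sketched as below. The symbols (mu, alpha, beta, gamma, delta) and the convention that age i, period j and cohort k all start at 1 are assumptions made for this illustration, not notation taken from the package.

```latex
% Predictor for the cell with age i and cohort k; the period index j is implied.
\mu_{ik} = \alpha_i + \beta_j + \gamma_k + \delta, \qquad j = i + k - 1.

% For any constants a, b, c, d, the shifted effects
\tilde\alpha_i = \alpha_i + a + d\,(i-1), \quad
\tilde\beta_j  = \beta_j  + b - d\,(j-1), \quad
\tilde\gamma_k = \gamma_k + c + d\,(k-1), \quad
\tilde\delta   = \delta - a - b - c
% give the same predictor, because
(i-1) - (j-1) + (k-1) = i + k - j - 1 = 0.
```

Since levels and linear trends can be moved between the three time effects without changing the predictor, the individual effects are not identified from the data alone. This is the problem that an identification method, such as the one cited above, has to address.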

Latest changes

Version 1.0.1 fixes some typos and refactors production code.

Version 1.0.0 adds a number of new features. Among them are

Vignettes to replicate papers (start with those!)

Distribution forecasting

Misspecification tests

More plotting

Simulating from an estimated model

Sub-sampling

Usage

Import the package: import apc

Set up a model: model = apc.Model()

Attach and format the data: model.data_from_df(pandas.DataFrame)

Plot data

Plot data sums: model.plot_data_sums()

Plot data heatmaps: model.plot_data_heatmaps()

Plot data groups of one time-scale across another: model.plot_data_within()

Included Data

Asbestos

These data are counts of mesothelioma deaths in the UK in age-period space. They may be modeled with a Poisson model with an "APC" or "AC" predictor. The data can be loaded by calling apc.asbestos().

Source: Martinez Miranda et al. (2015).

Belgian Lung Cancer

These data include counts of deaths from lung cancer in Belgium in age-period space, together with a measure of exposure. They can be analyzed using a Poisson model with an "APC", "AC", "AP" or "Ad" predictor. The data can be loaded by calling apc.Belgian_lung_cancer().

Source: Clayton and Schifflers (1987).

Run-off triangle by Barnett and Zehnwirth (2000)

Data for an insurance run-off triangle in cohort-age (accident-development year) space. These data are pre-formatted. They are well known to require a period/calendar effect for modeling. They may be modeled with an over-dispersed Poisson model with "APC" predictor. The data can be loaded by calling apc.loss_BZ().

Source: Barnett and Zehnwirth (2000).
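To make the cohort-age layout concrete, here is a minimal sketch of how the period (calendar) index follows from the accident and development indices of a run-off triangle. It assumes that both indices start at 1 and are spaced one year apart. The function names period_index and triangle_cells and the 3-by-3 size are made up for illustration and are not part of apc.

```python
def period_index(cohort, age):
    """Calendar period of a cell, assuming cohort and age both start at 1."""
    return cohort + age - 1


def triangle_cells(n):
    """Split an n-by-n cohort-age square into observed and future cells.

    Cells with period <= n lie on or before the latest calendar diagonal
    and are treated as observed; the rest form the lower triangle.
    """
    observed, future = [], []
    for cohort in range(1, n + 1):
        for age in range(1, n + 1):
            cell = (cohort, age, period_index(cohort, age))
            if cell[2] <= n:
                observed.append(cell)
            else:
                future.append(cell)
    return observed, future


# Hypothetical 3-by-3 example: each tuple is (cohort, age, period).
observed, future = triangle_cells(3)
print(observed)
print(future)
```

Cells that share a period lie on the same calendar diagonal, so a period/calendar effect is one that is common to each diagonal of the triangle.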

Run-off triangle by Taylor and Ashe (1983)

Data for an insurance run-off triangle in cohort-age (accident-development year) space. These data are pre-formatted. They may be modeled with an over-dispersed Poisson model, for instance with "AC" predictor. The data can be loaded by calling apc.loss_TA().

Source: Taylor and Ashe (1983).

Run-off triangle by Verrall et al. (2010)

Data for an insurance run-off triangle of paid amounts (units not reported) in cohort-age (accident-development year) space. The data are from Codan, the Danish subsidiary of Royal & Sun Alliance, and cover a portfolio of third party liability from motor policies. The time units are years. Apart from the paid amounts, counts for the number of reported claims are available. The paid amounts may be modeled with an over-dispersed Poisson model with "APC" predictor. The data can be loaded by calling apc.loss_VNJ().

Source: Verrall et al. (2010).

Run-off triangle by Kuang and Nielsen (2018)

These US casualty data are from the insurer XL Group. Entries are gross paid and reported loss and allocated loss adjustment expense in 1000 USD. Kuang and Nielsen (2018) consider a generalized log-normal model with "AC" predictor for these data. The data can be loaded by calling apc.loss_KN().

Known Issues

Index-ranges such as 1955-1959 don't work with forecasting if the initial data_format was not CA or AC. The reason is that the forecasting design is generated by first casting the data into an AC array, from which the future period index is then generated.

Index-ranges such as 1955-1959 in data_vector, as output by Model().data_as_df(), are strings. Sorting may therefore yield unintuitive results when the range components differ in their number of digits. For example, sorting 1-3, 4-9, 10-11 yields the ordering 1-3, 10-11, 4-9. This results in mislabeling of the coefficient names later on, since those are taken from the sorted indices. A possible, if ugly, fix could be to pad the ranges with zeros as needed.
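The sorting problem and the zero-padding workaround can be reproduced with plain Python strings. This is only a sketch of the idea, not code from the package: the helper pad_range_label is hypothetical, and it assumes every label has the form start-end with non-negative integers.

```python
def pad_range_label(label, width):
    """Zero-pad both ends of a 'start-end' label to a common width."""
    start, end = label.split("-")
    return f"{start.zfill(width)}-{end.zfill(width)}"


labels = ["1-3", "4-9", "10-11"]

# Strings are compared character by character, so "10-11" sorts before "4-9".
naive = sorted(labels)

# Pad every component to the length of the longest one, then sort again.
width = max(len(part) for label in labels for part in label.split("-"))
padded = sorted(pad_range_label(label, width) for label in labels)

print(naive)   # ['1-3', '10-11', '4-9']
print(padded)  # ['01-03', '04-09', '10-11']
```

With padded labels, the string order agrees with the numeric order of the ranges, at the cost of changing how the labels are displayed.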

Martinez Miranda, M. D., Nielsen, B., & Nielsen, J. P. (2015). Inference and forecasting in the age-period-cohort model with unknown exposure with an application to mesothelioma mortality. Journal of the Royal Statistical Society: Series A (Statistics in Society), 178(1), 29–55.