Author Information

*Patrick Brockett is the Gus S. Wortham chaired professor of risk management at the University of Texas at Austin. Richard Derrig is senior vice president, Automobile Insurers Bureau of Massachusetts and vice president of research, Insurance Fraud Bureau of Massachusetts. Linda Golden is the Marlene and Morton Meyerson Centennial Professor in Business at the University of Texas. Arnold Levine is professor emeritus in the Department of Mathematics, Tulane University. Mark Alpert is a professor of marketing at the University of Texas.

According to a study of Massachusetts auto insurance, 72 percent of auto injury tort claims filed in 1991 did not result in litigation. Of those that did result in litigation, about 99 percent were settled before an actual jury verdict was reached (Weisberg and Derrig, 1991). Consequently, the share of claims for which one can actually observe the true value of the dependent variable (legal determination of fraud or legally not provable as fraud) is quite small: about 1 percent of the 28 percent that are litigated, or 0.28 percent of all claims. This differs from situations such as random audits by the IRS, in which a sample of known results exists upon which to base modeling estimates and which are therefore useful for developing classification functions. Randomly pursuing litigation for fraud in insurance claims can result in excessive legal fees, terrible public relations, regulatory sanctions, and multiple damages.

The information assumptions necessary to perform PRIDIT analysis are less stringent than those required of regression, probit, logit, discriminant analysis, or other “classification” methodologies that rely on having “training samples” from the fraud and the nonfraud groups. The PRIDIT methodology does not require knowledge of which respondents engaged in fraud in order to implement it. That is, it requires neither a sample of cases in which the occurrence of fraud, together with its covariates, is known nor a sample of cases in which the absence of fraud, together with its covariates, is known. This is how it differs from regression, discriminant analysis, and so on: one cannot fit a regression model without a dependent variable. In the artificial intelligence literature, the information scenario of regression, probit, logit, discriminant, and neural network analysis is known as “supervised learning.” One knows the correct classification for some group of respondents and can “supervise,” or optimize, the estimation process for selecting parameter values in such a manner that the model optimally distinguishes between the groups within the known, correctly classified subset of cases. Regression, probit, logit, discriminant, and neural network analyses require both the existence and the knowledge of an already classified set of data, whereas PRIDIT analysis does not. In the artificial intelligence literature this latter situation is known as “unsupervised learning,” wherein one does not have a training set containing the covariates and the dependent variable. Consequently, one cannot supervise or optimize the estimation process in this way.

Like the PRIDIT methodology introduced in this article, the Kohonen Self-Organizing Feature Map methodology uses unsupervised learning (that is, learning without feeding back knowledge of the dependent variable) in a neural network-type format. However, the Kohonen method is more difficult to implement and fails to give the same information to the fraud investigator as the proposed methodology.

Fraud indicators in this automobile insurance context are often binary response statements about the circumstances of a claim, with a positive response indicating increased likelihood of fraud (Canadian Coalition Against Insurance Fraud, 1997).

A brief description of an epidemiological study using this methodology is given subsequently.

This score is actually a linear transformation of the RIDIT scoring method first introduced by Bross (1958) for epidemiological studies, but the version in Equation (1) is more convenient for our analysis.
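The linear relationship mentioned in this note can be checked numerically. The sketch below is illustrative only: Equation (1) is not reproduced in this section, so the code assumes the score for an ordered category is the proportion of responses below it minus the proportion above it. The function names and the example proportions are hypothetical.

```python
def bross_ridit(props):
    """Bross-style RIDIT: mass below the category plus half the mass in it."""
    scores, below = [], 0.0
    for p in props:
        scores.append(below + 0.5 * p)
        below += p
    return scores


def signed_ridit(props):
    """Assumed form of the article's score: mass below minus mass above."""
    total = sum(props)
    scores, below = [], 0.0
    for p in props:
        above = total - below - p
        scores.append(below - above)
        below += p
    return scores


# Hypothetical binary indicator: 25 percent "yes", 75 percent "no",
# with "yes" treated as the lower (more suspicious) category.
props = [0.25, 0.75]
bross = bross_ridit(props)
signed = signed_ridit(props)

# Under the assumed form, signed = 2 * bross - 1 for every category.
for b, s in zip(bross, signed):
    print(b, s, 2 * b - 1)
```

Under this assumption the signed score lies between -1 and 1 and has mean zero over the response distribution, which is one reason a centered version can be more convenient than the original RIDIT.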

The description of all fraud indicators used in the study is attached as Appendix B.

This initial value of suspicion scores corresponds to the naive model of counting the number of fraud indicators. This simple counting method for indicator variables is common in several areas (such as in an early warning system for insurer insolvency used previously by the Texas Department of Insurance). As we shall see, this simple summative scoring method for claim files can be improved by further analysis.

Note that the use of regression or other supervised learning type techniques would be impossible in this unsupervised learning environment (as is also the case in fraud for bodily injury claims). Only exogenous validation is possible.

Rigorous statistical methods also exist for estimating θ and for reassigning membership in the two groups based on the estimate. See Levine (1991) for details. However, by this methodology, only those with scores close to zero may be reassigned. For practical purposes, this reassignment does not affect the procedures proposed here.

In this case, θ was known in the sense that 62 claims were assessed to be fraud by at least one coder (27 by adjuster but not investigator, 16 by investigator but not adjuster, and 19 by both adjuster and the investigator) and 65 claims were randomly chosen from claims assessed by all coders as nonfraud (Weisberg and Derrig, 1998).

Appendix B shows the 65 weights when considered by category and when all indicators are considered at once. Table 1 and Appendix B give the verbal description of the variables.

Significant regression coefficients are used as weights in a scoring model on claims subsequent to the training (regression) data.

The internal inconsistency of the claims adjusters and claims investigators is also noted in Brockett et al. (1998) and occurs, in part, because investigators and adjusters focus differently on variables. The PRIDIT method is, however, nonsubjective and does not focus on specific variables, but rather gives higher weight to variables that seem to distinguish the high group (nonfraud) from the low group (fraud), yielding an ultimate claim file score. In addition, the regression weights reflect, in part, the subjective evaluation of the investigator or adjuster. In fact, the turnover rate of adjusters and investigators (whose subjective opinions depend on training, economic conditions, and other uncontrollable variables) means that when a model such as regression is trained using the subjective opinions of a particular evaluator, a change of evaluator may make the parametric “trained” model no longer applicable to a newly hired evaluator. PRIDIT analysis does not have this weakness and can even be used for continuous monitoring to determine when significant changes in the underlying claim variable weights or RIDIT scores have occurred, necessitating a recalibration of the model. The PRIDIT weights can thus be thought of as more reliable.

Note, however, that both objective and subjective fraud indicators exist, and that experienced professionals are still needed to decide how to score some of the individual subjective fraud indicator variables, although the level of expertise needed to score the individual variables is lower. Moreover, the psychological literature has established that raters are more reliable when rating particular attributes or specific components of an overall multiattribute problem than when judging the overall problem, so there is still a gain to be had here.

These types of consistency checks apply to any other similar data scheme for which PRIDIT analysis is feasible and a separate, but related, ranking or classification is present (such as when exogenous classifications exist, as with insurance investigators or adjusters).

The PRIDIT weights are shown in Appendix B. An Excel Workbook is available from the authors containing the data and calculations for these comparisons.

The use of natural integer scoring produces lumpy rankings with many ties by virtue of the discretized zero-to-ten scores.

We suggest correlations of 0.50 and above (moderate to full consistency).

The cross-product, or odds ratio, for a 2 × 2 table of counts or probabilities (mij) equals (m11 × m22)/(m12 × m21), assuming all cells are nonzero. Several additional statistics measuring similarity, such as Yule's Q, chi-squared, and Guttman's Lambda, could also be considered (Upton, 1978).

The models were estimated on nearly identical fraud indicators applied to data four accident years removed (1989 versus 1993) from the 127-claim data. For this illustration, the regression models derived using the 1993 data are treated as the old models.

In practice, sample sizes would be chosen to be much larger than the 127-claim example shown here. Since the variation in the cross-product is a function of the sum of the reciprocals of the cell counts, 95 percent intervals for practical settings would be much narrower than those shown here.
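The effect of sample size on the cross-product interval can be illustrated with a short sketch. It assumes the usual large-sample approximation in which the standard error of the log cross-product ratio is the square root of the sum of the reciprocals of the four cell counts; the article's own interval construction may differ. The function names and the cell counts are hypothetical and are not taken from the study.

```python
import math


def cross_product_ratio(m11, m12, m21, m22):
    """Cross-product (odds) ratio of a 2 x 2 table with nonzero cells."""
    if min(m11, m12, m21, m22) <= 0:
        raise ValueError("all four cells must be nonzero")
    return (m11 * m22) / (m12 * m21)


def cross_product_interval(m11, m12, m21, m22, z=1.96):
    """Approximate 95 percent interval, built on the log scale (assumed method)."""
    ratio = cross_product_ratio(m11, m12, m21, m22)
    # Standard error of the log ratio: sqrt of the sum of reciprocal counts.
    se = math.sqrt(1 / m11 + 1 / m12 + 1 / m21 + 1 / m22)
    log_ratio = math.log(ratio)
    return math.exp(log_ratio - z * se), math.exp(log_ratio + z * se)


# Made-up table with 127 claims, then the same proportions with 1,270 claims.
small = (40, 22, 20, 45)
large = tuple(10 * m for m in small)

print(cross_product_ratio(*small), cross_product_interval(*small))
print(cross_product_ratio(*large), cross_product_interval(*large))
```

Both tables give the same point estimate, but multiplying every cell by ten divides the sum of reciprocals by ten, so the interval for the larger table is markedly narrower.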

Many fraud indicators contain linguistic variables (old, low-valued car) that may be more accurately reported as fuzzy numbers, but that is beyond the scope of this article.

Accordingly, if one is forced to use standard statistical methods, we would advocate using RIDIT scoring as a precursor to applying the statistical method in the situation wherein one encounters rank-ordered categorical but not interval-level data in a supervised learning context (that is, actual training samples are available from the fraud and nonfraud groups). One could, for example, extend regression or probit or logit analysis as has been used previously for fraud detection modeling to the situation wherein rank-ordered non-interval categorical fraud variables are used (see Golden and Brockett, 1987).

Experienced claims adjusters and investigators have a long history of using these variables. Moreover, in those situations in which one has a categorical but non-ordinal variable, one can often rearrange the variable categories in such a way that the newly rearranged variable satisfies the stochastic dominance assumption. This can be done as follows: First run the PRIDIT analysis without this variable and then rank-order the resulting claim files in accordance with overall score. Then take the upper and lower quartiles of the ordered files and calculate a probability for each categorical response for each quartile. Finally, rearrange the categories in such a manner that the stochastic dominance relation holds. Since order means nothing in non-ordinal categorical data, this transforms the variable in a manner appropriate for the assumptions of the model to be valid.
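The final rearrangement step can be sketched as follows. This is one possible way to carry it out, not necessarily the authors': categories are sorted by the ratio of their upper-quartile probability to their lower-quartile probability, which yields a monotone likelihood ratio and therefore stochastic dominance of the upper quartile over the lower. The function names, category labels, and counts are hypothetical.

```python
import math


def reorder_categories(upper_counts, lower_counts):
    """Order categories by upper-quartile / lower-quartile probability ratio."""
    n_up = sum(upper_counts.values())
    n_lo = sum(lower_counts.values())

    def ratio(cat):
        p_up = upper_counts.get(cat, 0) / n_up
        p_lo = lower_counts.get(cat, 0) / n_lo
        return math.inf if p_lo == 0 else p_up / p_lo

    cats = sorted(set(upper_counts) | set(lower_counts))
    return sorted(cats, key=ratio)


def upper_dominates(order, upper_counts, lower_counts, tol=1e-12):
    """True if the lower quartile's cumulative share is never below the upper's."""
    n_up = sum(upper_counts.values())
    n_lo = sum(lower_counts.values())
    cum_up = cum_lo = 0.0
    for cat in order:
        cum_up += upper_counts.get(cat, 0) / n_up
        cum_lo += lower_counts.get(cat, 0) / n_lo
        if cum_up > cum_lo + tol:
            return False
    return True


# Made-up response counts for a three-category, non-ordinal variable.
upper = {"A": 5, "B": 20, "C": 10}   # upper quartile of overall scores
lower = {"A": 15, "B": 5, "C": 15}   # lower quartile of overall scores

new_order = reorder_categories(upper, lower)
print(new_order, upper_dominates(new_order, upper, lower))
print(["A", "B", "C"], upper_dominates(["A", "B", "C"], upper, lower))
```

In this made-up example the original labeling order does not satisfy the dominance relation, while the rearranged order does.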

Note that in PRIDIT analysis, the tasks of determining variable discriminatory ability, scoring of variables, and obtaining overall fraud suspicion scores are not done separately as in most other fraud analyses. Rather, our analysis shows that the principal component analysis of RIDITs technique introduced here provides an inner consistency between these aspects of fraud suspicion analysis, which is also related to measures of external validity.
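How variable weights and claim scores can come out of a single computation is illustrated by the sketch below. The article's full estimation procedure is not given in this section, so the code makes several assumptions: each binary indicator is scored as the proportion of responses below a category minus the proportion above it, the weights are taken as the leading eigenvector of F'F (where F is the claims-by-indicators score matrix), and that eigenvector is found by power iteration starting from equal weights. Any further normalization the article applies is omitted. All names and data are hypothetical.

```python
import math


def score_binary(column):
    """Signed score for a yes/no indicator (1 = yes, the more suspicious answer)."""
    p_yes = sum(column) / len(column)
    # "yes" has nothing below it and (1 - p_yes) above; "no" has p_yes below.
    return [-(1 - p_yes) if x == 1 else p_yes for x in column]


def pridit_sketch(F, iterations=100):
    """Leading eigenvector of F'F by power iteration, plus weighted claim scores."""
    n_vars = len(F[0])
    w = [1.0 / math.sqrt(n_vars)] * n_vars          # equal weights: simple sum
    for _ in range(iterations):
        s = [sum(f * wt for f, wt in zip(row, w)) for row in F]           # F w
        w = [sum(row[t] * si for row, si in zip(F, s)) for t in range(n_vars)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("score matrix has no variation")
        w = [x / norm for x in w]
    scores = [sum(f * wt for f, wt in zip(row, w)) for row in F]
    return w, scores


# Made-up data: 6 claims x 3 indicators; indicators 1 and 2 always agree.
raw = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0]]
columns = [score_binary([row[t] for row in raw]) for t in range(3)]
F = [list(row) for row in zip(*columns)]

weights, scores = pridit_sketch(F)
print(weights)
print(scores)
```

In this toy data the two mutually consistent indicators receive larger weights than the third, and the claim with all three indicators present receives the lowest score, so one pass produces both the indicator weights and the claim ranking. The sign of an eigenvector is arbitrary in general; here the equal-weight starting point keeps the weights positive.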

Abstract

This article introduces to the statistical and insurance literature a mathematical technique for an a priori classification of objects when no training sample exists for which the exact correct group membership is known. The article also provides an example of the empirical application of the methodology to fraud detection for bodily injury claims in automobile insurance. With this technique, principal component analysis of RIDIT scores (PRIDIT), an insurance fraud detector can reduce uncertainty and increase the chances of targeting the appropriate claims so that an organization will be more likely to allocate investigative resources efficiently to uncover insurance fraud. In addition, other (exogenous) empirical models can be validated relative to the PRIDIT-derived weights for optimal ranking of fraud/nonfraud claims and/or profiling. The technique at once gives measures of the individual fraud indicator variables’ worth and a measure of the suspicion level for the entire claim file that can be used to cogently direct further fraud investigation resources. Moreover, the technique does so at a lower cost than utilizing human insurance investigators or insurance adjusters, but with similar outcomes. More generally, this technique is applicable to other commonly encountered managerial settings in which a large number of assignment decisions are made subjectively based on “clues,” which may change dramatically over time. This article explores in detail the application of these techniques to automobile bodily injury insurance claims.