Risk assessment tools and criminal reoffending: Does bias determine who is “high risk”?

In the Netherlands in 1993, a man named Thomas was convicted of arson in a case in which no one was injured. Thomas was sentenced to four years in prison and involuntary treatment in a secure mental health facility. Based solely on professional opinion, Thomas was deemed too dangerous for release. If risk assessment tools had been available and used to evaluate Thomas, he would likely have been deemed a low risk to reoffend [1]. On the other hand, the professionals were firmly convinced that Thomas was dangerous. Would risk assessment tools, therefore, have led to a different outcome?

What is the likelihood that someone who has committed a crime will reoffend? Courts and review boards (e.g., parole boards) seeking an answer to this question often want the input of criminal justice or mental health professionals, whose decisions affect both the freedom of individuals and public safety [2]. However, unstructured professional judgments about risk, which are based solely on a professional’s expertise, are frequently wrong. In fact, a report published in 1981 indicated that clinicians’ predictions about the risk of future violent behavior by mentally ill offenders who were released into the community were correct in only about one out of three cases [3]. Therefore, researchers have invested significant effort in developing and refining structured risk assessment tools with the goal of improving the accuracy of risk predictions.

The use of risk assessment tools is now common practice in many aspects of criminal justice decision-making. Risk assessment tools are used by police, probation officers, psychologists, and psychiatrists to assess the risk of criminal offending, sexual offending, and violent offending in at least 44 countries [4]. These tools are also often used by professionals who make recommendations about whether an offender should be placed in long-term psychiatric care in a number of countries (e.g., sex offender civil commitment in the United States [5]; preventive detention in Canada [6]; and court-ordered hospitalization for long-term treatment (TBS) in the Netherlands [7]).

What risk assessment tools are and why they are used

Extensive research indicates that risk assessment tools improve the accuracy of professional judgments about the likelihood of future criminal behavior, violence, and sexual offending [8]. Actuarial risk assessment instruments (ARAIs) are one type of tool, and they are based on statistical models of weighted factors supported by research as being predictive of the likelihood of future offending. A risk score is calculated by assigning numeric values to risk factors such as criminal history, mental illness, and substance abuse problems, among many others. Some actuarial risk assessment tools include only static/historical risk factors, such as age of the offender and criminal history. However, some ARAIs also measure dynamic, changeable factors, such as pro-criminal attitudes. Actuarial risk assessment instruments tend to vary widely regarding the precise risk factors that are measured. Decisions about which factors to include depend on the tool developer, research findings, and the type of risk being measured. A second category of risk assessment tools involves structured professional judgment (SPJ), which guides the evaluator’s consideration of both static and dynamic risk factors, as well as the evaluator’s formulation of a risk management plan. However, unlike ARAIs, SPJ tools do not produce an automatic “risk score”; the evaluator determines the final risk estimate and makes suggestions about how the risk might be managed [9].
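To make the idea of a weighted actuarial score concrete, the following is a minimal sketch in Python. It is not drawn from any real instrument: the factor names, the weights, and the cutoffs for the risk bands are all hypothetical and chosen only to show the arithmetic of summing weighted factors and mapping the total to a category.

```python
# Hypothetical weights for illustration only; not taken from any real instrument.
ILLUSTRATIVE_WEIGHTS = {
    "prior_convictions": 2,      # static/historical factor
    "age_under_25": 1,           # static factor
    "substance_abuse": 1,        # dynamic factor
    "procriminal_attitudes": 1,  # dynamic factor
}

# Hypothetical cutoffs: (minimum score, label), checked from highest to lowest.
ILLUSTRATIVE_BANDS = [(4, "high"), (2, "moderate"), (0, "low")]


def actuarial_score(factors: dict) -> int:
    """Sum the weights of every risk factor coded as present."""
    unknown = set(factors) - set(ILLUSTRATIVE_WEIGHTS)
    if unknown:
        raise ValueError(f"unrecognized factors: {sorted(unknown)}")
    return sum(
        ILLUSTRATIVE_WEIGHTS[name] for name, present in factors.items() if present
    )


def risk_band(score: int) -> str:
    """Map a total score to the first band whose minimum it meets."""
    for minimum, label in ILLUSTRATIVE_BANDS:
        if score >= minimum:
            return label
    raise ValueError("score must be non-negative")


# A made-up coding of one case: only one factor is present.
coded_case = {
    "prior_convictions": False,
    "age_under_25": True,
    "substance_abuse": False,
    "procriminal_attitudes": False,
}
score = actuarial_score(coded_case)
print(score, risk_band(score))
```

The point of the sketch is that once the factors are coded, the score and the band follow mechanically. An SPJ tool, by contrast, would stop at the coded factors and leave the final estimate to the evaluator.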

Structured risk assessment tools are also believed to reduce the likelihood that the evaluator’s estimate of an offender’s reoffending risk will be influenced by bias. Bias is a systematic error in reasoning or logic that occurs as the result of the automaticity with which the human mind processes information based on expectations and experience [10]. Perhaps the most well-known example of this phenomenon is confirmation bias, which occurs when attention is drawn to evidence that supports a favored scenario or outcome, while evidence that weakens or contradicts the preferred hypothesis is discounted or ignored altogether [11]. For example, research suggests that criminal investigators and police trainees tend to view evidence such as witness statements, DNA evidence, and photo evidence as less credible and reliable if the evidence contradicts their beliefs about the guilt of the suspect [12] [13].

Structured risk assessment tools and bias

Bias on the part of an evaluator engaged in unstructured professional judgment [14] is believed to contribute to inaccurate predictions of risk [15]. Bias on the part of justice system officials who use their professional judgment to assess an individual’s risk of reoffending is also thought to contribute to the unfair treatment of minority groups [16] [17] [18]. Therefore, proponents of the use of risk assessment tools (particularly ARAIs) in criminal justice decision-making believe the tools will reduce the chances that a criminal offender will be treated unfairly based on stereotypes and bias [19].

However, there is currently limited information about whether risk assessment tools actually eliminate the influence of bias [20]. In fact, as discussed in this article, research about the use of ARAIs points to several common practices that compromise both the accuracy and objectivity of risk evaluations and recommendations about interventions or treatment to help reduce risk. These findings suggest that a presumption of objectivity in the risk estimate or recommendations, solely because they appear to be the product of a structured risk assessment tool, is unwarranted.

There are two primary categories of research that address questions of whether and how bias affects the use of structured risk assessment tools. The first category we will refer to as “objectivity research.” Objectivity research addresses the influence of internal and external factors on evaluators’ use of risk assessment tools.

The second category we will refer to as “outcome research.” Outcome research relates to how risk assessment tools are completed and how risk scores are used by criminal justice actors (e.g., judges, probation officers) to make their decisions. When decisions are not aligned with risk scores, this suggests the potential influence of bias, which would undermine the accuracy of the very tools intended to curb its influence. In other words, the simple fact of completing a structured risk assessment instrument may not necessarily eliminate the influence of bias in an evaluator’s conclusions about an offender’s level of risk for reoffending.

Objectivity research: Moral judgments and being on the “right side”

The process of a risk evaluation can be influenced by the personal characteristics or attitudes of the evaluator (internal) and external sources of information that are of little or no relevance to accurate predictions of risk [21]. An example of an internal source of bias is when an evaluator is more strongly influenced by his or her moral judgment of the offender or the offender’s actions than by risk factors relevant to reoffending. Context is an example of an external source of potential bias, as is demonstrated when different evaluators reach different conclusions about the same offender, with each conclusion favoring the party that is paying that evaluator.

Moral judgments by the evaluator about sexual offenders may affect his or her judgments about an offender’s risk of reoffending. In one recent study, for example, forensic psychologists and psychiatrists (n = 151) in the Netherlands generally assigned more importance to risk factors for sexual offenses that carry a negative moral connotation than to factors that do not [22]. Two factors with a negative moral connotation are an offender’s lack of empathy for the victim and lack of motivation for treatment. Experts assigned more weight to these factors than to other factors that are empirically more predictive of risk, such as never having had an intimate relationship or exhibiting behavioral problems at school [23]. There is no demonstrated empirical link between lack of victim empathy and sexual offense recidivism [24], but this lack of empathy may be perceived as morally wrong (i.e., going against generally accepted values), thereby leading evaluators to treat such factors as important. Although this particular study did not utilize a risk assessment tool, the standard among the participating professionals is to utilize SPJ tools that require professional judgment about the importance of various risk factors [25]. These findings suggest that some experts may be influenced more by the moral dimensions of certain risk factors than by those factors for which there is more substantive scientific support.

Contextual factors can facilitate bias by influencing what an evaluator observes, his or her perception or interpretation of information, and thereby the conclusions he or she reaches [26]. Studies about the scoring of ARAIs indicate that evaluators are not as objective regarding their observations and interpretations as they might believe [27]. For example, research regarding legal context has uncovered an “adversarial allegiance effect” [28] [29] [30]. The adversarial allegiance effect suggests that ARAI scores can be influenced by an evaluator’s commitment to a particular legal outcome [31], or subtle pressure to reach a particular conclusion [32]. In plain terms, evaluators appear to be influenced in their risk evaluations by the side that hired them (i.e., the prosecution or the defense) [33]. Although this effect appears to be more significant with factors that require professional judgment on the part of the evaluator (e.g., judgments of attitudes or mental status of the offender), the effect is still measurable with the use of tools that assess only static factors (e.g., number of previous offenses) [34]. This finding is surprising because static factors should generate the same score regardless of who the evaluator is.

The research regarding moral judgments and adversarial allegiance does not imply that professionals intentionally manipulate results, but rather that they may not always be aware of the factors that influence their coding decisions or ultimate risk judgments. Limiting evaluator exposure to potentially biasing information (e.g., the referral source) and conscious efforts by the evaluator to consider alternative explanations early in the process may help minimize the effects of bias [35].

Case study: At the time Thomas was convicted, many psychological professionals believed that people who committed arson were very disturbed and sexually perverse individuals at very high risk to reoffend [36]. How might these views have affected scoring if risk assessment tools had been used to evaluate him?

Outcome research: Overrides, outgroups, and inertia

Criminal justice professionals who use ARAIs sometimes question the results and decide to override them [37]. A professional override in an ARAI is a discretionary decision to lower or raise the final risk judgment obtained by scoring the ARAI. When overrides are used to raise the risk category, this may be in an effort to protect the evaluator from potential blame if an offender goes on to commit a new crime [38]. Although some manuals for administering ARAIs allow evaluators to override the results based on their professional judgment [39], research indicates these decisions tend to significantly decrease the accuracy of the risk prediction [40] [41] [42], so overrides should be used only in exceptional cases. The inadequately justified use of overrides tends to decrease the predictive accuracy of ARAIs [43] [44] and runs contrary to one of the primary reasons why risk assessment tools were developed.

Unfortunately, professional override decisions may also create opportunities for bias to overshadow the presumed objective nature of ARAIs [45]. For example, research indicates that overrides are not applied equally across different demographic groups. At least two published studies have found that overrides in risk assessment of juveniles for decisions about detention were associated with demographic characteristics, such as race and gender [46] [47]. In one study, for instance, African American youths were approximately 33% less likely to receive a downward override than white youths [48]. Troublingly, if overrides are applied in a biased manner, racial and ethnic minorities will likely bear the negative consequences of the reduced accuracy that overrides bring.

Evaluators also tend to use professional overrides to increase, rather than decrease, the risk level more often for some types of offenders than for others [49]. For example, studies reveal overrides are used in 33-74% of cases involving sexual offenders compared to 15-41% of cases involving nonsexual offenders [50] [51]. Professional overrides may therefore reflect biased judgments about an offender. Requiring evaluators to document the justification for an override should improve accountability in these risk judgments by making the stated reasons for the override available for review. Oversight of the stated justifications for overrides would make it possible to identify override patterns and to determine when their use is appropriate.
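One way such oversight could look in practice is a simple tally of override direction by group. The sketch below is an assumption about how an audit might be set up, not a method taken from the studies cited above. The record format, the group labels, the three-band scale, and the example records are all made up to show the shape of the calculation.

```python
from collections import Counter

# Hypothetical three-band scale; real tools define their own categories.
BAND_ORDER = {"low": 0, "moderate": 1, "high": 2}


def override_direction(tool_band: str, final_band: str) -> str:
    """Classify a case as 'up', 'down', or 'none' by comparing the
    evaluator's final band with the band the tool produced."""
    diff = BAND_ORDER[final_band] - BAND_ORDER[tool_band]
    if diff > 0:
        return "up"
    if diff < 0:
        return "down"
    return "none"


def override_rates(records):
    """Return {group: {direction: share of that group's cases}}."""
    counts = {}
    for group, tool_band, final_band in records:
        direction = override_direction(tool_band, final_band)
        counts.setdefault(group, Counter())[direction] += 1
    return {
        group: {d: c[d] / sum(c.values()) for d in ("up", "down", "none")}
        for group, c in counts.items()
    }


# Made-up records purely to show the input shape: (group, tool band, final band).
example_records = [
    ("sexual offense", "low", "moderate"),
    ("sexual offense", "moderate", "moderate"),
    ("nonsexual offense", "moderate", "low"),
    ("nonsexual offense", "high", "high"),
]
rates = override_rates(example_records)
print(rates["sexual offense"]["up"])  # share of upward overrides in that group
```

A tally like this only flags a difference between groups. Whether a given override was justified still depends on reviewing the documented reasons for it.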

Case study: Given the views about arsonists at the time Thomas was convicted, an evaluator might have decided to override the results of a risk assessment tool, even one indicating low risk, based on prevailing inaccurate beliefs about arson offenders.

Sometimes professionals simply disregard the results of the tool completely in making treatment and supervision recommendations [52] [53] [54] [55]. For example, a survey of over one thousand probation officers in the United States revealed that despite scoring a risk assessment instrument, about 40% of the officers reported they were unlikely to base their decisions or recommendations on the score [56]. Although using a risk assessment tool is no guarantee of objectivity in the evaluation, deliberately ignoring the results to substitute one’s own judgment compromises any potential benefits that might be realized by using a structured risk assessment approach.

Case study: If a risk assessment tool indicated that Thomas was a low risk to reoffend, that still may not have changed the judges’ decision that Thomas should spend an indeterminate amount of time in confinement in a psychiatric treatment facility.

Conclusion

Making accurate predictions about the likelihood of future criminal behavior is a complex task, and even the best statistical models of risk factors under ideal conditions yield accurate predictions in only about 66-74% of cases [57]. Furthermore, when evaluators are influenced by information unrelated to risk factors, or when they make adjustments to risk scores, the accuracy of the risk prediction decreases. Thomas was finally released in 2008, after spending fifteen years in confinement. Inaccurate risk assessment, as illustrated by Thomas’s case, can lead to significant unnecessary costs in terms of money and years of life lost.

The current objectivity and outcome research [58] [59] [60] [61] [62] suggests that when an evaluator uses a risk assessment tool, the results do not necessarily reflect an objective evaluation. Therefore, the identification of the sources and operation of evaluator bias and testing of the efficacy of debiasing strategies should be research priorities. Although the use of risk assessment tools improves accuracy over unstructured clinical judgment in estimating the likelihood of reoffending, the tools are not a panacea for bias. Risk assessment tools should be used to identify ways to manage and reduce risk - but their vulnerability to evaluator bias indicates they should not be used to justify significant deprivations of freedom.