Background: The aim of this study was to evaluate the calibration and discriminatory power of three predictive
models of breast cancer risk.
Methods: We included 13,760 women who were first-time participants in the Sabadell-Cerdanyola Breast Cancer
Screening Program in Catalonia, Spain. Three- and five-year risk projections for invasive cancer were obtained
using the Gail, Chen and Barlow models. Incidence and mortality data were obtained from the Catalan registries.
The calibration and discrimination of the models were assessed using the Hosmer-Lemeshow C statistic, the area
under the receiver operating characteristic curve (AUC) and Harrell’s C statistic.
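As an illustration of the two discrimination measures named above, the following is a minimal sketch in plain Python, not the study's own code. The function names `auc` and `harrell_c` and all argument names are hypothetical, and tied follow-up times are simply skipped, which is a simplifying assumption.

```python
from itertools import combinations


def auc(risks, events):
    """Probability that a randomly chosen case has a higher predicted
    risk than a randomly chosen non-case (ties count as 0.5)."""
    cases = [r for r, e in zip(risks, events) if e == 1]
    non_cases = [r for r, e in zip(risks, events) if e == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    score = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                score += 1.0
            elif c == n:
                score += 0.5
    return score / (len(cases) * len(non_cases))


def harrell_c(risks, times, events):
    """Concordance for censored follow-up: among comparable pairs, the
    share in which the woman with the shorter time to event had the
    higher predicted risk (ties in risk count as 0.5)."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(risks)), 2):
        if times[i] == times[j]:
            continue  # assumption: pairs with tied times are skipped
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if events[a] != 1:
            continue  # shorter follow-up is censored: pair not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Both measures equal 0.5 for a model that ranks women no better than chance and 1.0 for perfect ranking.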
Results: The Gail and Chen models showed good calibration, whereas the Barlow model overestimated the number of
cases: the ratio of estimated to observed cases at 5 years ranged from 0.86 to 1.55 for the first two models
and from 1.82 to 3.44 for the Barlow model. The 5-year projection for the Chen and Barlow models had the highest
discrimination, with an AUC around 0.58. Harrell’s C statistic showed very similar values in the 5-year projection
for each of the models. Although they passed the calibration test, the Gail and Chen models overestimated the
number of cases in some breast density categories.
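The estimated-to-observed ratio reported above can be sketched per subgroup as follows. This is an illustrative sketch only: the function name `expected_observed_ratios` and the grouping labels are hypothetical, and it assumes that the expected number of cases in a group is the sum of the individual predicted risks.

```python
from collections import defaultdict


def expected_observed_ratios(risks, events, groups):
    """Return {group: expected / observed} where expected is the sum of
    predicted risks and observed is the count of cases in that group."""
    expected = defaultdict(float)
    observed = defaultdict(int)
    for risk, event, group in zip(risks, events, groups):
        expected[group] += risk
        observed[group] += event
    # A group with no observed cases has an undefined ratio.
    return {
        g: (expected[g] / observed[g]) if observed[g] else float("nan")
        for g in expected
    }
```

A ratio above 1 means the model predicts more cases than were observed in that group (overestimation), and a ratio below 1 means underestimation.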
Conclusions: These models cannot be used as a measure of individual risk in early detection programs to
customize screening strategies. The inclusion of longitudinal measures of breast density or other risk factors in joint
models of survival and longitudinal data may be a step towards personalized early detection of breast cancer.