Abstract

Two main questions arise when assessing a woman for interventions to reduce her
risk of developing or dying from breast cancer, and the answers will determine
her access to those interventions: What are her chances of carrying a mutation in a high-risk gene such as
BRCA1 or BRCA2? What are her risks of developing breast cancer with or without such a mutation?
These risks taken together with the risks and benefits of the intervention will then
determine whether an intervention is appropriate. A number of models have been developed
for assessing these risks with varying degrees of validation. With further improvements
in our knowledge of how to integrate risk factors and to eventually integrate further
genetic variants into these models, we are confident we will be able to discriminate
with far greater accuracy which women are most likely to develop breast cancer.

Introduction

Breast cancer is the most common form of cancer affecting women. One in eight to one
in 12 women will develop the disease in their lifetime in the developed world. Every
year over 44,000 women develop the disease in the UK (population 61 million) and more
than 12,500 die from it [1].

While the widely quoted general population risk of breast cancer (one in eight to
one in 12) is a lifetime risk, the risk in any given decade is never greater than
one in 30. Furthermore, the proportion of all female deaths due to breast cancer per
decade is never greater than 20%. The proportion is greatest in middle age, from 35
to 55 years, with cardiovascular deaths exceeding breast cancer deaths at all older
ages and lung cancer causing more cancer deaths in women in the age group 60–85 years
[2]. These comparisons underline the need for risk models for breast cancer and also
the need to put these risks in the perspective of other diseases.
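
The relationship between per-decade risk and lifetime risk can be illustrated with a minimal sketch. The decade risks below are hypothetical values chosen only so that none exceeds one in 30; they are not taken from the cited incidence data, and competing causes of death are ignored.

```python
from math import prod

# Hypothetical 10-year risks for successive decades (ages 20-29 up to 70-79).
# Illustrative values only; none exceeds one in 30.
decade_risks = [1 / 2000, 1 / 230, 1 / 70, 1 / 40, 1 / 30, 1 / 30]


def cumulative_risk(risks):
    """Probability of developing the disease in at least one interval,
    ignoring competing causes of death."""
    return 1 - prod(1 - r for r in risks)


lifetime = cumulative_risk(decade_risks)
print(f"largest decade risk: 1 in {1 / max(decade_risks):.0f}")
print(f"cumulative risk: {lifetime:.3f} (about 1 in {1 / lifetime:.1f})")
```

With these assumed inputs the cumulative figure falls within the one in eight to one in 12 range, even though the risk in any single decade stays at or below one in 30.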

The presence of a significant family history is a highly important risk factor for
the development of breast cancer. Even at extremes of age, the presence of a BRCA1 mutation will confer much greater risk than population risks. For instance, a 25-year-old
woman who carries a mutation in BRCA1 has a greater risk within the next decade than a woman aged 70 years from the general
population. About 4–5% of breast cancer is thought to be due to inheritance of a dominant
cancer-predisposing gene [3,4]. While hereditary factors are virtually certain to play a part in a high proportion
of the remainder, these are harder to evaluate at present, although genome-wide association
studies are likely to unravel these in the next 10 years [5]. Except in very rare cases such as Cowden's disease [6], there are no phenotypic clues that help to identify those who carry pathogenic mutations.
Evaluation of the family history therefore remains necessary to assess the likelihood
that a predisposing gene is present within a family. Inheritance of a germline mutation
or deletion in a predisposing gene results in early-onset, and frequently bilateral,
breast cancer. Certain mutations also confer an increased susceptibility to other
malignancies, such as cancers of the ovary, and sarcomas [7-9]. Multiple primary cancers in one individual or related early-onset cancers in a pedigree
are highly suggestive of a predisposing gene. Indeed, we have recently shown that
at least 20% of breast cancers in patients aged 30 years and younger are due to mutations
in the known high-risk genes BRCA1, BRCA2 and TP53 [10].

Although deaths from breast cancer have been decreasing in many western countries,
the incidence of the disease is continuing to increase. In particular, breast cancer
rates are rising rapidly in countries with an historically low incidence, making it
presently the world's most prevalent cancer [2]. The increase in incidence is almost certainly related to changes in dietary and
reproductive patterns associated with western lifestyles. These are not just a reflection
of an ageing population obtaining extra surveillance, as age-specific risks in eastern
countries adopting more western lifestyles are increasing [2]. Indeed, there is evidence from genetic studies in the United States, Iceland and
the United Kingdom of a threefold increase in incidence not only in the general population,
but also in those at the highest level of risk with BRCA1 and BRCA2 mutations in the past 80 years [11-13]. There is a need not only to predict which women will develop the disease, but also
to apply drug and lifestyle measures in order to prevent the disease.

Types of risk assessment

There are two main types of risk assessment:

• The chances of developing breast cancer over a given timespan, including the lifetime.

• The chances of there being a mutation in a known high-risk gene such as BRCA1 or BRCA2.

While some risk-assessment models are aimed primarily at solving one of the questions,
many also have an output for the other. For instance, the BRCAPRO model is primarily
aimed at assessing the mutation probability but can have an output to assess breast
cancer risk over time. The Cuzick–Tyrer model was developed to assess breast cancer
risk over time but does have a readout for BRCA1/2 probability for the individual. To assess breast cancer risks over time as accurately
as possible, all known risk factors for breast cancer need to be assessed.

Risk factors

Family history of breast cancer in relatives

• Age at onset of breast cancer.

• Bilateral disease.

• Degree of relationship (first or greater).

• Multiple cases in the family (particularly on one side).

• Other related early-onset tumours (for example, ovary, sarcoma).

• Number of unaffected individuals (large families with many unaffected relatives
will be less likely to harbour a high-risk gene mutation).

Hormonal and reproductive risk factors

Hormonal and reproductive factors have long been recognised to be important in the
development of breast cancer. Prolonged exposure to endogenous oestrogens is an adverse
risk factor for breast cancer [14-22]. Early menarche and late menopause increase breast cancer risk as they prolong exposure
to oestrogen and progesterone.

Long-term combined hormone replacement therapy treatment (> 5 years) after the menopause
is associated with a significant increase in risk. However, shorter treatments may
still be associated with risk to those with a family history [15]. In a large meta-analysis the risk appeared to increase cumulatively by 1–2% per
year, but to disappear within 5 years of cessation [16]. The risk from oestrogen-only hormone replacement therapy appears much less and may
be risk neutral [17-20]. Another meta-analysis also suggested there may be a 24% increase in the risk of
breast cancer both during current use of the combined oral contraceptive and 10 years
post use [14].

The age at first pregnancy influences the relative risk of breast cancer as pregnancy
transforms breast parenchymal cells into a more stable state, potentially resulting
in less proliferation in the second half of the menstrual cycle. As a result, early
first pregnancy offers some protection, while women having their first child over
the age of 30 have double the risk of women delivering their first child under the
age of 20 years.

Hormonal factors may indeed have different effects on different genetic backgrounds.
It has been suggested that in BRCA2 mutation carriers, for example, an early pregnancy does not confer protection against
breast cancer [21]. Most studies are now, however, showing that risk factors in the general population
have a similar effect on those women with BRCA1/2 mutations [22,23].

Other risk factors

A number of other risk factors for breast cancer are being further validated. Obesity,
diet and exercise are probably interlinked [24,25]. Mammographic density is perhaps the single largest risk factor that is assessable
but may have a substantial heritable component [26]. Other risk factors such as alcohol intake have a fairly small effect, and protective
factors such as breast-feeding are also of small effect unless a number of years of
total feeding have taken place. None of these factors are currently incorporated into
available risk assessment models.

Risk factors included

Current risk-prediction models are based on combinations of risk factors and have
good overall predictive power, but are still weak at predicting which particular women
will develop the disease. New risk-prediction methods are likely to result from examination
of a range of high-risk genes as well as single nucleotide polymorphisms in several
genes associated with lower risks [5]. These methods would be married in a prediction programme with other known risk factors
to provide a far more accurate individual prediction.

At present, many of the known nonfamily-history risk factors are not included in risk
models (Table 1). In particular, perhaps the greatest factor apart from age – mammographic density
[26] – is not yet included. Further studies are in progress to determine whether inclusion
of additional factors into existing models, such as mammographic density, weight gain
[25] and serum steroid hormone measurements [26], will improve prediction. These are not straightforward additions as there may be
significant interactions between risk factors and, although breast density is an independent
risk factor for BRCA1 and BRCA2 cancer risk [25], the density itself may be heritable and may not increase risk in a similar way in
the context of family history of breast cancer alone.

Table 1. Known risk factors and their incorporation into existing risk models

Breast cancer risk over time

Manual risk estimation

One of the best ways to assess risk is to consider the strongest risk factor, which
in many assessment clinics is family history. If first-line risk can be assessed on
this basis, then adjustments can be made for other factors [28,29].

Manual breast cancer risk assessment is largely based on the published Claus risk
tables [30] and use of data in clearcut BRCA1/2 families from penetrance data for breast cancer [31]. At the very least, a good manual assessment will alert the assessor to any spurious
readout from a computer model.

Risk estimation models

Until recently the two most frequently used models were the Gail model and the Claus
model.

Gail model

Gail and colleagues [32,33] described a risk-assessment model that focuses primarily on nongenetic risk factors,
with limited information on family history. The model is an interactive tool designed
by scientists at the National Cancer Institute and at the National Surgical Adjuvant
Breast and Bowel Project to estimate a woman's risk of developing invasive breast
cancer. The risk factors used were age at menarche, age at first live birth, number
of previous breast biopsies, and number of first-degree relatives with breast cancer.
A model of relative risks for various combinations of these factors was developed
from case–control data from the Breast Cancer Detection Demonstration Project.

Individualised breast cancer probabilities are generated from information on relative
risks and the baseline hazard rate. These calculations take competing risks and
the interval of risk into account. The data depend on having periodic breast examinations.
The Gail model was originally designed to determine eligibility for the Breast Cancer
Prevention Trial, and has since been modified (in part to adjust for race) and made
available on the National Cancer Institute website [34]. The model has been validated in a number of settings and probably works best in
general assessment clinics, where family history is not the main reason for referral
[27,33,35].
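
The way a relative risk, a baseline hazard and competing risks combine into an absolute risk over an interval can be sketched as follows. This is an illustration of the general approach only: the hazard functions and relative risks are hypothetical, and the published Gail coefficients and baseline rates are not reproduced here.

```python
import math


def absolute_risk(start_age, end_age, relative_risk, bc_hazard, other_hazard):
    """Probability of developing breast cancer between start_age and end_age.

    bc_hazard(age)    -> baseline annual breast cancer hazard (hypothetical)
    other_hazard(age) -> annual hazard of death from other causes (hypothetical)
    Hazards are treated as constant within each one-year step.
    """
    risk = 0.0
    event_free = 1.0  # probability of being alive and cancer-free so far
    for age in range(start_age, end_age):
        h_bc = relative_risk * bc_hazard(age)
        h_total = h_bc + other_hazard(age)
        # Probability of any event this year, apportioned to breast cancer
        p_event = 1 - math.exp(-h_total)
        risk += event_free * p_event * (h_bc / h_total)
        event_free *= math.exp(-h_total)
    return risk


# Hypothetical hazard functions, for illustration only
def baseline_bc(age):
    return 0.0005 + 0.00005 * max(age - 30, 0)


def competing_mortality(age):
    return 0.0005 * math.exp(0.08 * (age - 30))


for rr in (1.0, 2.0):
    print(rr, round(absolute_risk(40, 50, rr, baseline_bc, competing_mortality), 4))
```

Because the competing hazard removes women from the at-risk group, doubling the relative risk in this sketch slightly less than doubles the absolute risk over the interval.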

The major limitation of the Gail model is the inclusion of only first-degree relatives,
which results in underestimating risk in the 50% of families with cancer in the paternal
lineage; the model also takes no account of the age at onset of breast cancer. As such it
performed less well in our own validation set from a family history clinic (Table
1), substantially underestimating risk overall and in most subgroups assessed [27].

Claus model

Claus and colleagues developed a risk model for familial risk of breast cancer in
a large population-based, case–control study conducted by the Centers for Disease
Control [3]. The data were based on 4,730 histologically confirmed breast cancer cases aged 20–54
years and on 4,688 controls who were frequency matched to cases on the basis of both
geographic region and 5-year categories of age. Family histories were obtained through
interviews with the cases and controls, regarding breast cancer in mothers and sisters.

The authors' segregation analysis provided evidence for the existence of a single
rare autosomal dominant allele carried by one in 300 people leading to increased susceptibility
to breast cancer. The effect of genotype on the risk of breast cancer was shown to
be a function of a woman's age. Carriers of the risk allele were at greater risk at
all ages, although the ratio of age-specific risks was greatest at young ages and
declined steadily thereafter. The proportion of cases predicted to carry the allele
was highest (36%) among cases aged 20–29 years. This proportion gradually decreased
to 1% among cases aged 80 years or older. The cumulative lifetime risk of breast cancer
for women who carried the susceptibility allele was predicted to be high, approximately
92%, while the cumulative lifetime risk for noncarriers was estimated to be 10% [3].
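
As a rough arithmetic check on how these figures fit together, the sketch below combines the carrier frequency and the two lifetime risks quoted above into a population lifetime risk and the share of lifetime cases arising in carriers. Treating the population as a simple two-group mixture is an assumption made for illustration; it is not the segregation analysis itself.

```python
carrier_frequency = 1 / 300   # one in 300 people, as quoted in the text
risk_carrier = 0.92           # cumulative lifetime risk in carriers, from the text
risk_noncarrier = 0.10        # cumulative lifetime risk in noncarriers, from the text

# Population lifetime risk as a weighted average of the two groups
population_risk = (carrier_frequency * risk_carrier
                   + (1 - carrier_frequency) * risk_noncarrier)

# Share of all lifetime cases that arise in carriers
carrier_share = carrier_frequency * risk_carrier / population_risk

print(f"population lifetime risk: {population_risk:.3f}")
print(f"share of cases in carriers: {carrier_share:.3f}")
```

Under this simplification the population lifetime risk is about 10.3% and about 3% of lifetime cases arise in carriers, which sits between the 36% predicted for the youngest cases and the 1% predicted for the oldest.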

Three years after publication of the model, lifetime risk tables for most combinations
of affected first-degree and second-degree relatives were published [30]. Although these do not give figures for some combinations of relatives (for example,
mother and maternal grandmother), an estimation of this risk can be garnered using
the mother–maternal aunt combination. An expansion of the original Claus model estimates
breast cancer risk in women with a family history of ovarian cancer [36]. The major drawback of the Claus model is that it does not include any of the nonhereditary
risk factors.

Concordance of the Gail and Claus models has been shown to be relatively poor, with
the greatest discrepancies seen with nulliparity, multiple benign breast biopsies,
and a strong paternal or first-degree family history [37,38]. Indeed, a particular problem with the use of the Claus model is the discrepancy
in results obtained when using the published tables [30], compared with computerised versions of the model [27-29,39]. While the tables make no adjustments for unaffected relatives, the computerised
version is able to reduce the likelihood of the 'dominant gene' with an increasing
number of unaffected women. The tables give consistently higher risk figures than the
computer model, however, suggesting that either a population risk element is not added
back into the calculation or that the adjustment for unaffected relatives is made
from the original averaged figure rather than from assuming that each family will
have already had an 'average' number of unaffected relatives. The latter appears to
be the probable explanation, as inputting families with zero unaffected female relatives
gives risk figures close to the Claus table figure.

Another potential drawback of the Claus tables is that they reflect risks for women
in the 1980s in the USA. These are lower than the current incidence in both North
America and most of Europe. As such, an upward adjustment of 3–4% for lifetime risk
is necessary for lifetime risks below 20%. Our own validation of the Claus computer
model showed that it substantially underestimated risks in the family history clinic.
Manual use of the Claus tables, however, provided accurate risk estimation (Table
1) [27]. A modified version of the Claus model has now been validated as the 'Claus extended'
model, by adding risk for bilateral disease, ovarian cancer and three or more affected
relatives [40].

BRCAPRO model

Parmigiani and colleagues [41] developed a Bayesian model that incorporated published BRCA1 and BRCA2 mutation frequencies, cancer penetrance in mutation carriers, cancer status (affected,
unaffected, or unknown), and age of the consultee's first-degree and second-degree
relatives. An advantage of this model is that it includes information on both affected
and unaffected relatives. In addition, it provides estimates for the likelihood of
finding either a BRCA1 mutation or a BRCA2 mutation in a family. An output that calculates breast cancer risk using the likelihood
of BRCA1/2 can be utilised [27]. None of the nonhereditary risk factors can yet be incorporated into the model (Table
1).
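
The Bayesian principle, and the reason information on unaffected relatives matters, can be sketched in a few lines. This is not the BRCAPRO algorithm: the prior, the per-relative likelihoods and the assumption that relatives contribute independently are all hypothetical simplifications, whereas the real model works through the full pedigree with published mutation frequencies and penetrances.

```python
def posterior_mutation_probability(prior, evidence):
    """Bayes' rule for 'a mutation segregates in this family'.

    evidence: list of (p_observed_if_mutation, p_observed_if_no_mutation),
    one pair per relative, treated as independent (a simplification).
    """
    like_mutation = 1.0
    like_none = 1.0
    for p_if_mutation, p_if_none in evidence:
        like_mutation *= p_if_mutation
        like_none *= p_if_none
    numerator = prior * like_mutation
    return numerator / (numerator + (1 - prior) * like_none)


# Hypothetical likelihoods, for illustration only
affected_young = (0.20, 0.01)   # relative with breast cancer before age 40
unaffected_old = (0.60, 0.92)   # relative still cancer-free at age 70

prior = 0.01  # hypothetical prior probability of a family mutation
p_two_affected = posterior_mutation_probability(
    prior, [affected_young, affected_young])
p_with_unaffected = posterior_mutation_probability(
    prior, [affected_young, affected_young] + [unaffected_old] * 3)

print(round(p_two_affected, 3), round(p_with_unaffected, 3))
```

In this sketch the same two affected relatives give a lower mutation probability once three older unaffected relatives are added, which is the behaviour that tables without unaffected relatives cannot capture.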

The major drawback from the breast cancer risk-assessment aspect is that no other
'genetic' element is allowed for. As such, this model will underestimate risk in breast-cancer-only
families. The BRCAPRO model produced the least accurate breast cancer risk estimation
from our family history clinic validation [27]. The model predicted only 49% of the breast cancers that actually occurred in the
screened group of 1,900 women.

Cuzick–Tyrer model

Until recently, no single model integrated family history, surrogate measures of endogenous
oestrogen exposure and benign breast disease in a comprehensive fashion. The Cuzick–Tyrer
model, based partly on a dataset acquired from the International Breast Intervention
Study and other epidemiological data, has now done this. The major advantage over
the Claus model and the BRCAPRO model is that the Cuzick–Tyrer model allows for the
presence of multiple genes of differing penetrance. It does produce a readout of BRCA1/2, but also allows for a lower penetrance of BRCAX. As can be seen in Table 1, the Cuzick–Tyrer model addresses many of the pitfalls of the previous models; significantly,
the combination of extensive family history, endogenous oestrogen exposure and benign
breast disease (atypical hyperplasia). In our validation process, the Cuzick–Tyrer
model performed by far the best at breast cancer risk estimation [27].

Model validation

The goodness of fit and discriminatory accuracy of the above four models were assessed
using data from 1,933 women attending the Family History Evaluation and Screening
Programme in Manchester, UK, of whom 52 developed cancer. All models were applied
to these women over a mean follow-up of 5.27 years to estimate the risk of breast
cancer. The ratios of expected to observed numbers of breast cancers (95% confidence
interval) were 0.48 (0.37–0.64) for the Gail model, 0.56 (0.43–0.75) for the Claus
model, 0.49 (0.37–0.65) for the BRCAPRO model and 0.81 (0.62–1.08) for the Cuzick–Tyrer
model (Table 1). The accuracy of the models for individual cases was evaluated using receiver-operating
characteristic curves. These showed that the area under the curve was 0.735 for the
Gail model, 0.716 for the Claus model, 0.737 for the BRCAPRO model and 0.762 for the
Cuzick–Tyrer model.
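
The two measures reported above, calibration (the expected-to-observed ratio) and discrimination (the area under the receiver-operating characteristic curve), can be sketched as follows. The confidence interval uses a log-normal approximation based on the observed count, which is an assumption; the source does not state which method was used. The expected count is back-calculated from the reported Gail ratio, and the predicted risks used for the area under the curve are hypothetical.

```python
import math


def expected_observed_ratio(expected, observed, z=1.96):
    """E/O ratio with an approximate 95% interval (log-normal approximation,
    treating the observed count as Poisson)."""
    ratio = expected / observed
    half_width = z / math.sqrt(observed)
    return ratio, ratio * math.exp(-half_width), ratio * math.exp(half_width)


def auc(risks_cases, risks_noncases):
    """Probability that a randomly chosen case has a higher predicted risk
    than a randomly chosen non-case (ties count as one half)."""
    wins = 0.0
    for rc in risks_cases:
        for rn in risks_noncases:
            if rc > rn:
                wins += 1.0
            elif rc == rn:
                wins += 0.5
    return wins / (len(risks_cases) * len(risks_noncases))


observed = 52                 # cancers observed, from the text
expected = 0.48 * observed    # back-calculated from the reported Gail ratio
ratio, lower, upper = expected_observed_ratio(expected, observed)
print(f"E/O = {ratio:.2f} ({lower:.2f} to {upper:.2f})")

# Hypothetical predicted risks, for illustration only
cases = [0.08, 0.05, 0.12, 0.03]
noncases = [0.02, 0.05, 0.04, 0.01, 0.06]
print(f"AUC = {auc(cases, noncases):.3f}")
```

For the Gail figures this approximation gives an interval of roughly 0.37 to 0.63, close to the reported 0.37–0.64. A ratio below 1 whose interval excludes 1 indicates underestimation, while an area under the curve of 0.5 would indicate no discrimination.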

The Cuzick–Tyrer model was the most consistently accurate model for prediction of
breast cancer. The Gail, Claus and BRCAPRO models all significantly underestimated
risk, although with a manual approach the accuracy of Claus tables may be improved
by making adjustments for other risk factors ('manual method'), for instance by subtracting from
the denominator of the 'one in N' lifetime risk for an adverse endocrine risk factor (for example, a lifetime risk
may change from one in five to one in four with late age of first pregnancy). The
Gail, Claus and BRCAPRO models all underestimated risk, particularly in women with
a single first-degree relative affected with breast cancer. The Cuzick–Tyrer model
and the manual model were both accurate in this subgroup. Conversely, all the models
accurately predicted risk in women with multiple relatives affected by breast cancer
(that is, two first-degree relatives, or one first-degree relative plus two other
relatives). This implies that the effect of a single affected first-degree relative
is higher than may have been previously thought. The Gail model is likely to have
underestimated risk in this group as it does not take into account the age at breast
cancer and most women in our single first-degree relative category had a relative
diagnosed at younger than 40 years of age. The BRCAPRO, Cuzick–Tyrer and manual models
were the only models to accurately predict risk in women with a family history of
ovarian cancer. As these were the only models to take account of ovarian cancer in
their risk-assessment algorithm, this confirmed that ovarian cancer has a significant
effect on breast cancer risk.

The Gail, Claus and BRCAPRO models all significantly underestimated risk in women
who were nulliparous or whose first live birth occurred after the age of 30 years.
Moreover, the Gail model appeared to increase risk with pregnancy at age < 30 years
in the familial setting. It is not clear why such a modification to the effects of
age at first birth should be made, unless it is as a result of modifications to the
model made after early results suggested an increase with BRCA1/2 mutation carriers [21]. The Gail model, however, has determined an apparent increase in risk with early
first pregnancy and it would appear to be misplaced from our results, and from subsequent
studies published on BRCA1/2 [22,23]. Furthermore, the Gail, Claus and BRCAPRO models also underestimated risk in women
whose menarche occurred after the age of 12. The Cuzick–Tyrer and manual models accurately
predicted risk in these subgroups. These results suggest that age at first live birth
also has an important effect on breast cancer risk, while age at menarche perhaps
has a lesser effect. The effect of pregnancy at age < 30 years appeared to reduce
risk by 40–50% compared with an older first pregnancy or late-age nulliparity, whereas
at the extremes of menarche there was only a 12–14% effect.

Our study remains the only one to validate a risk model prospectively, and clearly
further such studies are necessary to gauge the accuracy of these and newer models.
Indeed, the tendency to modify models to adapt for new risk factors without prospective
revalidation in an independent dataset is a problem and can lead to erroneous risk
prediction.

BOADICEA model

Using segregation analysis, a group in Cambridge UK have derived a susceptibility
model – the Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation
Algorithm (BOADICEA) model – in which susceptibility is explained by mutations in
BRCA1 and BRCA2 together with a polygenic component reflecting the joint multiplicative effect of
multiple genes of small effect on breast cancer risk. The group has shown that the
overall familial risks of breast cancer predicted by the model are close to those
observed in epidemiological studies. The predicted prevalences of BRCA1 and BRCA2 mutations among unselected cases of breast and ovarian cancer were also consistent
with observations from population-based studies. They also showed that their predictions
were closer to the observed values than those obtained using the Claus model and the
BRCAPRO model. The predicted mutation probabilities and cancer risks in individuals
with a family history can now be derived from this model. Early validation studies
have been carried out on mutation probability but not yet on cancer risk prediction.

BRCA1/2 risk estimation

A number of models/scoring systems have been derived to assess the probability of
a BRCA1 mutation or a BRCA2 mutation in a given individual dependent on their family history. Some of the earlier
models, such as the Couch model [42] and the Shattuck-Eidens model [43], were derived before widespread genetic testing had been performed. Two tabular scoring
systems have been derived from the Myriad laboratories genetic testing programme [44,45], with the second based on testing in over 10,000 individuals [45]. The most widely used and validated model is the BRCAPRO model [46-48], which requires computer entry of the family history information. More simple scoring
systems have been developed, such as the Manchester system [49]. While simple tabular or scoring systems are easy to use and can generate probabilities
in 1–2 minutes, computer-based programmes take 10–20 minutes to input. Nonetheless,
these may well be carried out in clinics in order to generate pedigrees and store
family information. Model-based approaches are also able to take into account unaffected
relatives.

Validation studies for the BRCA1/2 risk estimation models are much more widespread than for breast cancer risk over time
[46-55]. Perhaps the most useful aspect of these is the development of a cutoff point for
the intervention of a genetic test at the 10% or 20% level. An assessment of using
a cutoff point for several of the models is presented in Table 2. In practice, most genetic testing has been carried out on high-risk families. While
a pretesting assessment of the chances of BRCA1/2 involvement is useful, it does not alter the decision on whether or not to
test a family member when the difference is between a 20% chance and a 60% chance
of a mutation.

Table 2. Sensitivity and specificity at the 10% cutoff point for prediction of identifying
a BRCA1/2 mutation in non-Ashkenazi Jewish families for four models/scoring systems based on
four validation studies

With genetic testing for BRCA1/2 costing around $3,000, insurance companies and healthcare systems require a threshold
for test use. In the United Kingdom this is set at 20% mutation probability [56], but in most of the rest of Europe and North America this is 10%. In order to adequately
assess the models at a 10% threshold, a range of families is necessary to test around
the threshold. Ideally around 10% of the samples should be mutation positive. As can
be seen from Table 2, apart from our own study [50], all the remaining studies had detection rates above 20% [53-55]. None of the models, tables or scoring systems work perfectly and they need to be
adjusted for new information such as the triple-negative grade 3 breast cancer histology
associated with BRCA1. For a simple first-line test in the clinic, using a scoring system [50] or using a table [45] will at least give a guide as to whether a family will qualify for testing. With
improvements in the computer models such as the BOADICEA model, we will hopefully
achieve a more accurate and discriminatory cutoff point.
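
How a testing threshold translates into sensitivity and specificity can be illustrated with a short sketch. The predicted probabilities and test results below are hypothetical and are not drawn from any of the validation studies; the function simply flags a family for testing when its predicted mutation probability reaches the cutoff.

```python
def sensitivity_specificity(predicted, mutation_found, cutoff=0.10):
    """Sensitivity and specificity of 'predicted probability >= cutoff'
    as a rule for selecting families for testing.
    Assumes at least one mutation-positive and one mutation-negative family."""
    tp = fn = tn = fp = 0
    for p, found in zip(predicted, mutation_found):
        flagged = p >= cutoff
        if found and flagged:
            tp += 1
        elif found:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical families: model output and eventual test result
predicted = [0.02, 0.05, 0.08, 0.12, 0.15, 0.25, 0.40, 0.60, 0.07, 0.30]
mutation_found = [False, False, True, False, False, True, True, True, False, False]

for cutoff in (0.10, 0.20):
    sens, spec = sensitivity_specificity(predicted, mutation_found, cutoff)
    print(f"cutoff {cutoff:.0%}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

In this made-up sample, raising the cutoff from 10% to 20% improves specificity without changing sensitivity; in a real validation set the trade-off depends on how many families lie close to the threshold, which is why a range of families around the cutoff is needed.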

Conclusion

There are a number of models available to assess both breast cancer risk and the chances
of identifying a BRCA1/2 mutation. Some models perform both tasks, but none are yet totally discriminatory
as to which family has a mutation and who will develop breast cancer. Improvements
in the models are being made, but these require revalidation processes. The discovery
of all the alleles associated with breast cancer risk will add a new layer of complexity
to all these models.