Screening for Cervical Cancer: A Systematic Evidence Review for the U.S. Preventive Services Task Force [Internet].

3. Results

Key Question 1. When Should Cervical Cancer Screening Begin, and Does This Vary By Screening Technology or By Age, Sexual History, or Other Patient Characteristics?

Various factors should be considered when determining the age of onset for cervical cancer screening and type of screening test to be used. These factors include the prevalence and incidence of CIN2, CIN3, and cervical cancer among women in their teens and early 20s, as well as the rate of progression of CIN2 and CIN3 to cervical cancer. Screening for cervical cancer or CIN2 and CIN3 may be of little net benefit if these conditions are rare or progress slowly in younger age groups. This is particularly true if screening for and treatment of preinvasive disease causes excess harm. For screening to be beneficial, it should lead to earlier detection of disease that, when treated, results in decreased morbidity and mortality from cervical cancer. Given HPV’s significant association with preinvasive and invasive disease of the cervix, the natural history of HPV infections in young women is also important, including persistence and progression from HPV infection to cervical cancer. All of this evidence informs whether women in their teens and early 20s should be screened for cervical cancer and, if so, how they should be screened.

The ideal study for determining when cervical cancer screening should begin would be an RCT in which women are randomized to begin screening at different ages and then followed for the development of CIN3 and cancer, including morbidity and mortality. We did not identify any RCTs evaluating the age at which screening should begin. We considered cohort studies that evaluated the incidence and prevalence of cervical cancer in young, screened populations by age and other risk factors, the natural history of CIN and HPV infection in young women, and the effects of population-based screening programs targeting young women. We identified one large, fair-quality, population-based, case-control study evaluating the association between cervical cancer screening at ages 20 to 69 years and future cervical cancer detection among 11,901 women in the United Kingdom;23 two fair-quality cohort studies (from the United States104 and United Kingdom32) examining age-specific screening outcomes in 199,707 women aged 15 years and older; one fair-quality population-based correlational study evaluating screening in all 20- to 34-year-olds attending routine screening in Iceland;105 and one good-quality prospective cohort study (from the United Kingdom) describing the natural history of incident HPV infection and CIN in 1,075 15- to 19-year-olds106 (Appendix C Table 1). The best evidence comes from the case-control study and is supported by the cohort studies.

Although evidence is based on a limited number of studies, these studies show that high-risk HPV infections and cytologic abnormalities among women younger than age 20 years are common and transient, whereas CIN3+ is much less common in this group than in women aged 25 years and older. They also show that screening in younger women (younger than age 25) has lower detection rates and higher false positives than in older women. Screening these women also does not result in decreased incidence of cervical cancer among women younger than age 30 years.

A population-based, case-control study conducted within the United Kingdom’s National Health Service (NHS) used prospectively recorded data on cervical cancer screening to estimate the association between having an adequate smear test taken in a particular 3-year age group (such as ages 22 to 24) and the incidence of cervical cancer in a subsequent 5-year age group (such as ages 25 to 29).23 The study cohort included 4,012 women with invasive cancer and 7,889 controls (two controls per case). The authors found that cervical cancer screening among women younger than age 25 years was not associated with a decreased incidence of cervical cancer diagnosis prior to age 30 years. However, the authors could not rule out the possibility that screening women in the age group of 20 to 24 years would be effective in reducing stage IB+ cervical cancer in women aged 25 to 27 years, because the group was small (65 women) and thus confidence intervals were wide (OR, 0.52 [95% CI, 0.23 to 1.2]). A statistically significant protective effect of screening was not demonstrated until age 32 years, when screening was associated with a 45 percent reduction (OR, 0.55 [95% CI, 0.44 to 0.69]) in the incidence of ICC diagnosis between ages 35 and 39 years.23

This study’s major strength was that it was designed specifically to answer the question of when cervical screening should begin, measuring the association between age at screening and the outcome of greatest interest—cervical cancer incidence. Additional strengths of this study include a study population comparable to that of the United States and the extensive electronic database from which the data were abstracted. The authors’ use of random control selection minimized selection bias, and their use of prospectively recorded data on screening history reduced the recall bias that would otherwise be a weakness of the study’s retrospective design. However, confounding is another limitation of observational studies, and the authors did not report or adjust for many potential risk factors for cervical cancer.

A population-based study in Iceland evaluated the value of cervical cancer screening in the 20- to 34-year-old age group by analyzing trends in CIN2, CIN3, and cervical cancer.105 Iceland’s national cervical cancer screening program commenced in 1969 for women aged 25 to 69 years at 2- to 3-year intervals. After 1987, women aged 20 years or older were also invited to screening at 2-year intervals. In the years following the introduction of cervical cancer screening for women aged 20 to 24 years, the rate of ICC did not change among this age group. This rate did decrease significantly among women aged 35 to 39 years, however, with a stage shift toward earlier disease detection. In contrast, the detection rate of CIN2 and CIN3 increased among women aged 20 to 29 years, whereas detection of CIN2 increased among women aged 30 to 34 years but detection of CIN3 decreased.105 In Iceland, the usual practice is to observe smears with low-grade cytologic abnormalities (≤LSIL) and refer women with high-grade smears for colposcopy with biopsy and endocervical curettage. However, screening and treatment practices may change over time or vary across institutions or providers. So, at an ecologic level, it is difficult to establish causality or to attribute changes in CIN2, CIN3, and cervical cancer detection rates solely to the initiation of screening younger women. This study’s strength was the large timeframe over which the data were reviewed, which allowed for evaluation of shifts in trends of disease detection. This study and the UK study were rated fair quality. One advantage of the UK study’s design is the use of prospectively collected individual-level data rather than population data. The Icelandic study supports initiation of screening in women in their early 20s, whereas the UK study was limited in power to determine whether screening among this group of women is beneficial. Neither study provided sufficient detail to allow determination of a specific age at which screening should be initiated.

A large cohort study conducted among 150,052 women aged 15 years or older enrolled in a Kaiser Permanente health plan between 1997 and 2002 examined age-specific cervical cancer screening outcomes.104 In this population, the 25- to 29-year-old age group was the most frequently screened of all age groups—650 screened per 1,000 females enrolled compared to 217 per 1,000 for women aged 15 to 19 years.104 The likelihood of detecting CIN3 was lower in women younger than age 25 years, compared to women aged 25 to 29 years, but the risk of having a false-positive smear was higher. The proportion of smears yielding CIN3 was 0.2 percent for women aged 15 to 19 years and 0.2 percent for women aged 20 to 24 years, compared to 0.6 percent for women aged 25 to 29 years and 0.4 percent for women aged 30 to 39 years. False-positive smears occurred in 3.1 percent of women aged 15 to 19 years, 3.5 percent of women aged 20 to 24 years, 2.1 percent of women aged 25 to 29 years, and 2.6 percent of women aged 30 to 39 years. One limitation of these data is that they were drawn from a screened population of women with relatively stable health insurance, and therefore may not be generalizable to the U.S. population. The remaining included studies also evaluated populations of women with health insurance.

A second large cohort study, which provides important evidence on the prevalence of high-risk HPV by age in a population of women similar to those in the United States, evaluated 49,655 British women of any age presenting for routine screening between 1988 and 1993.32 This study’s goal was to describe the relationship between HPV detection at entry and cytologic and histologic followup. The 78,062 cervical samples obtained during the study were stratified according to the 12-month period in which they were taken and into 5-year age groups. HPV testing (PCR) was performed on an age- and period-stratified random sample to limit cost (n=6,462). The authors found that the prevalence of high-risk HPV was greatest in women aged 15 to 19 years (20%) and decreased with increasing age to 2.6 percent among women aged 50 to 54 years (Figure 6).32 Across all age groups, the prevalence of high-risk HPV positivity was much lower for women with normal smears than those with abnormal smears (17.2% vs. 73.7% among women aged 15 to 19 years; 1.6% vs. 40.7% for women aged 50 to 57 years). Although the prevalence of high-risk HPV peaked among adolescents, prevalent cases of CIN3+ peaked among women aged 35 to 39 years, and incident cases of CIN3+ peaked among women aged 25 to 29 years (Figure 6). Among women with no previous smear, the prevalence of CIN3+ was 0.2 percent among women aged 15 to 19 years compared to 1.7 percent among women aged 35 to 39 years. No prevalent cancer cases were identified among women younger than age 20 years. The annual incidence of new cases of CIN3+ for women with a screening interval of less than 5 years following a normal smear was 1.56/1,000 per year among women aged 15 to 19 years, peaked at 4.07/1,000 for women aged 25 to 29 years, and decreased with increasing age to the lowest incidence rate, which was 0.19/1,000 among women aged 60 to 64 years.

In a prospective cohort study of 1,075 British women aged 15 to 19 years with normal cytology and negative high-risk HPV tests, each woman was followed with serial smears and HPV testing at 6-month intervals.106 The study’s goal was to describe the natural history of incident HPV infection and its temporal relation to the occurrence of cytologic and histologic abnormality; it provides valuable information on the acquisition and remission of high-risk HPV among adolescents, and the risk of development of CIN2+ in relation to HPV status. All women with cytologic abnormalities underwent colposcopy and biopsy of abnormal areas. Treatment was postponed until there was histologic evidence of CIN2 or greater. The median number of visits was four, and median duration of followup was 29 months. This study demonstrated the frequent occurrence of new high-risk HPV infections and their transient nature as well as the transient nature of cytologic abnormalities among young women. The authors also identified a small percentage of young women who developed CIN2+ despite continuing to test negative for high-risk HPV.

During study followup, 38 percent of women became positive for any HPV type, and 26 percent of women became positive for high-risk HPV types (16, 18, 31, 33, 52, or 58). The cumulative risk at 3 years of any HPV type was 44 percent, and at 5 years the risk was 60 percent. The median duration of the first HPV positive episode was 13.7 months (interquartile range [IQR], 8.0 to 25.4) for any HPV type, 10.3 months (IQR, 6.8 to 17.3) for HPV 16, and 7.8 months (IQR, 6.0 to 12.6) for HPV 18. The cumulative risk at 3 years of any cytologic abnormality was 28 percent (95% CI, 25 to 32). The median duration of the first episode of cytologic abnormality was 8.7 months (IQR, 5.8 to 13.8). In this cohort, 28 women (2.6%) developed CIN2 (1.3%) or CIN3 (1.3%) during a median of 36 months of followup. Five of these women consistently tested HPV negative. The risk of being diagnosed with CIN2+ was 8 times greater for women who became HPV positive during followup than for those who remained negative (RR, 7.8 [95% CI, 2.7 to 22.0]).

We identified two RCTs,107,108 one cohort,109 and one cross-sectional study110 that provide data comparing LBC (ThinPrep) and CC. Only one RCT, a cluster-randomized trial rated as good quality (Netherlands ThinPrep versus Conventional Cytology [NETHCON]), set out with the primary purpose of comparing LBC and CC.108 The other RCT, the New Technologies for Cervical Cancer Study (NTCC) Phase I, rated as fair quality, was designed to compare CC with LBC in combination with HPV testing.107 Both provide relative test performance data only. The two remaining observational studies, both rated as fair quality, provide absolute test performance data, since colposcopy and/or biopsy was systematically applied to all women.109,110 The NTCC and NETHCON trials included a total of 134,162 eligible women, and the nonrandomized studies included a total of 7,404 women. All of the studies were conducted in primary care settings in non-U.S. populations (periurban South Africa, France, the Netherlands, and Italy) (Table 3).

The NTCC and NETHCON trials showed no difference between LBC and CC in relative detection ratio of CIN2+ or CIN3+ (Table 4).107,108 The NETHCON trial demonstrated no difference between LBC and CC in relative PPV for detection of CIN2+ and a higher PPV of borderline statistical significance (p=0.036) favoring LBC for the detection of CIN3+,108 while the NTCC trial demonstrated a lower PPV for LBC for the detection of both CIN2+ and CIN3+, compared to CC.107 The NTCC trial found a higher relative proportion of false-positive test results for LBC compared to CC (1.97 for detection of CIN2+ and 1.93 for detection of CIN3+),107 whereas the NETHCON trial found a slightly lower proportion of false-positive test results with LBC (0.90 for detection of CIN2+ and 0.89 for detection of CIN3+).108

The cluster-randomized NETHCON trial was designed to compare LBC to CC among women aged 30 to 60 years participating in the Dutch cervical screening program.108 Overall, this is a well-designed study with good applicability to the United States, and it provides the best available evidence to address KQ2 in terms of the use of LBC in a large cervical cancer screening program. Randomization was by clinical site, with 88,988 women at 246 family practices included in the analysis. Exclusion criteria were not reported. Followup for screen-positive women followed Dutch clinical guidelines, with colposcopy referral and directed biopsy for high-grade or persistent low-grade abnormalities.

Among women with cytology results of ASC-US or worse, 280 cases of CIN2+ and 190 cases of CIN3+ were detected in the CC arm among 40,047 women screened, of whom 420 underwent colposcopy only (n=2) or colposcopy and biopsy (n=418). In the LBC arm, 346 cases of CIN2+ and 253 cases of CIN3+ were detected among 48,941 women screened, of whom 484 underwent colposcopy only (n=4) or colposcopy and biopsy (n=480).108

The NETHCON trial found no significant difference between LBC and CC in the adjusted relative detection ratio (adjusted for age, site, urbanization, and study period) of either CIN2+ (1.00 [95% CI, 0.84 to 1.20]) or CIN3+ (1.05 [95% CI, 0.86 to 1.29]). The unadjusted relative PPV (adjusted results not provided by the authors) for CIN2+ was similar between the two screening tests (PPV for ASC-US+, 1.09 [95% CI, 0.95 to 1.25]; PPV for LSIL+, 1.04 [95% CI, 0.93 to 1.15]). For detection of CIN3+, the relative PPV for LBC bordered on statistical significance, compared to CC (PPV for ASC-US+, 1.17 [95% CI, 0.99 to 1.39]; PPV for LSIL+, 1.17 [95% CI, 1.01 to 1.36]). The relative false-positive proportion (RFPP) for LBC was 0.90 (95% CI, 0.82 to 0.99) for detection of CIN2+ and 0.89 (95% CI, 0.82 to 0.98) for detection of CIN3+, compared to CC.108

The NTCC study was not a randomized trial of LBC versus CC. Rather, this study was a randomized screening program of LBC plus the HC2 HPV test (experimental group) versus CC (control group).107 The referral threshold for colposcopy was ASC-US for the experimental arm, and either ASC-US (72%) or LSIL (28%) for the control arm. Since the referral criterion differed for the two study groups, we present results for the centers that used the same referral criterion for both tests. In their comparison of LBC versus CC, the authors included CIN2 lesions or worse that were detected during the recruitment phase of the trial, within 1 year of referral to colposcopy.

Among women with cytology results of LSIL or worse, 70 cases of CIN2+ and 44 cases of CIN3+ were detected in the CC arm among 22,466 women, of whom 317 underwent colposcopy. In the LBC arm, 73 cases of CIN2+ were detected (relative detection ratio, 1.03 [95% CI, 0.74 to 1.43]; relative PPV, 0.58 [95% CI, 0.43 to 0.78]) and 32 cases of CIN3+ were detected (relative detection ratio, 0.72 [95% CI, 0.46 to 1.13]; relative PPV, 0.40 [95% CI, 0.26 to 0.62]) among 22,708 women screened, of whom 1,337 underwent colposcopy. Overall, more colposcopies were required in the LBC group (15/1,000 for CC vs. 27/1,000 for LBC). The relative detection ratio and PPV values noted for cytology results of ASC-US or worse (confidence intervals provided per correspondence with primary author, Dr. Ronco, on March 11, 2008) were also not significantly different.111 The RFPP for LBC compared to CC was 1.97 (95% CI, 1.75 to 2.21) for detection of CIN2+ and 1.93 (95% CI, 1.72 to 2.21) for detection of CIN3+ for cytology results of ASC-US or worse, and 1.80 (95% CI, 1.48 to 2.19) for detection of CIN2+ and 1.72 (95% CI, 1.42 to 2.07) for detection of CIN3+ for cytology results of LSIL or worse.107
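The relative detection ratios quoted above can be approximately reproduced from the case counts and arm sizes given in the text. The following is a minimal sketch, not the trial authors' analysis: the function name is hypothetical, and the log-scale Wald confidence interval is an assumption (the review does not state which interval method was used). Under that assumption, the results round to the reported values for CIN2+ (1.03 [0.74 to 1.43]) and CIN3+ (0.72 [0.46 to 1.13]).

```python
import math


def relative_detection_ratio(cases_a, n_a, cases_b, n_b, z=1.96):
    """Ratio of detection proportions (arm A vs. arm B).

    Returns (ratio, lower, upper). The interval is a log-scale Wald
    interval, which is an assumed method, not one stated in the review.
    """
    ratio = (cases_a / n_a) / (cases_b / n_b)
    # Standard error of log(ratio) for two independent binomial proportions
    se_log = math.sqrt(1 / cases_a - 1 / n_a + 1 / cases_b - 1 / n_b)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


# Counts from the text (LSIL+ threshold): LBC arm vs. CC arm
cin2 = relative_detection_ratio(73, 22708, 70, 22466)
cin3 = relative_detection_ratio(32, 22708, 44, 22466)

for label, (ratio, lower, upper) in (("CIN2+", cin2), ("CIN3+", cin3)):
    print(f"{label}: {ratio:.2f} (95% CI, {lower:.2f} to {upper:.2f})")
```

Both intervals include 1.0, consistent with the statement that the trial showed no difference in relative detection between LBC and CC.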

One substantial limitation of this study is that the colposcopists were not blinded to study arm. Therefore, they would have known the women’s HPV test results.107 Furthermore, no data were provided to determine that randomization provided comparable groups for this secondary analysis in which some women were excluded from the LBC arm because of positive HPV test results, but not from the control arm (since this group was not tested for HPV).

Two studies, one conducted in South Africa109 and one in France,110 provided absolute test performance results for comparison of LBC and CC for the detection of CIN2+ (Table 4). Only the South African study provides data on detection of CIN3+.109 For the detection of CIN2+ and CIN3+, the sensitivity of both LBC and CC decreased with increasing cytologic threshold, whereas specificity increased. LBC and CC did not significantly differ in sensitivity, specificity, false positive rates, or PPV for detection of CIN2+ and CIN3+, although wide, overlapping confidence intervals suggest limited power to detect a difference in sensitivity.

The French study was limited by a split-sample study design, which could bias the results because the smear prepared first may have the best sample of cells.110 If that is so, then it might be expected that in this study, where the conventional smear was prepared first, the sensitivity of CC would be higher than LBC, but that was not the case. Both tests performed similarly.

The South African study has limited applicability, as the women in this study had never been screened prior to enrollment, which would be unusual for most U.S. women in the same age group.109 Second, a high proportion of women in the study were infected with HIV. Finally, 14.5 percent had recently been treated for CIN. The proportion of women with HIV infection and/or recent CIN treatment was similar in both arms, so this is unlikely to bias the results in the direction of either cytology method; however, it is unclear how the absolute test performance of either method was impacted. The method of cervical sampling was not randomized or blinded, so there is some potential for introduction of bias through unequal allocation. The strength of this study lies in the fact that the gold standard was systematically applied to all study participants after collection of the screening test, therefore limiting differential application of the gold standard and verification bias.

Unsatisfactory Slides

Both the NETHCON and NTCC trials demonstrated a lower proportion of unsatisfactory cytology samples for LBC than CC, with 0.37 and 2.6 percent of LBC slides considered unsatisfactory, compared to 1.09 and 4.1 percent of CC slides, respectively (Table 4).107,108 These findings differ from what had previously been demonstrated in the cohort and cross-sectional studies, in which LBC had more unsatisfactory samples. However, study design might explain these earlier results in at least one of the nonrandomized studies, in which the collected sample was first used to prepare the CC slide and the residual material was used to perform the LBC test.110

Key Question 3. What Are the Benefits of Using HPV Testing as a Screening Test, Either Alone or in Combination With Cytology, Compared With Not Testing for HPV?

We identified 22 unique studies in 48 publications that assessed the benefits of using HPV testing, either alone or in combination with cytology, as an initial screening or to triage abnormal initial screening cytology. These strategies were compared with cytology screening strategies that did not involve HPV testing. Results from these studies are summarized here, with more individual study details provided in Appendix C.

These studies address four different cervical cancer screening strategies using HPV: 1) primary screening with HPV test alone; 2) HPV testing with cytology triage of positive HPV (reflex cytology); 3) combination HPV and cytology testing (co-testing); and 4) cytology testing with HPV triage of positive cytology (reflex HPV). Within each HPV screening strategy, we found at least one fair- or good-quality RCT specifically testing that strategy compared with cytology (Tables 5a-c and 6). Although most trials evaluated only one type of HPV screening strategy, the Italian NTCC trial addressed two different HPV screening strategies through separate recruitment phases—combined HPV and cytology testing (Phase I)112 and HPV testing alone (Phase II).113 The HPV test used in most trials was HC2, to detect 13 high-risk types of HPV (types 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, and 68) at a positive test cut-off of >1 pg/ml, except for two trials that tested PCR using general primers GP5+ and GP6+ to detect 14 high-risk HPV types (types 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66, and 68).114,115 Similarly, most trials compared various HPV screening strategies to cytology performed using CC, except for two that used LBC.116,117 Colposcopy referral threshold varied between studies, and in three studies, different cytology thresholds were used by different study sites.112,113,115

Characteristics of RCTs of Cytology Testing With HPV Triage of Positive Cytology (KQ3).

All RCTs except one118 (which enrolled previously unscreened women in rural India to evaluate one-time HPV testing versus cytology) were conducted in developed countries (i.e., United States, Italy, Sweden, England, the Netherlands, Finland) where cervical cancer screening is well established. The Sankaranarayanan trial is important in that it establishes a mortality benefit in reduced cervical cancer deaths (adjusted hazard ratio [HR], 0.52 [95% CI, 0.33 to 0.83]) with one-time HPV screening in never-screened Indian women aged 30 to 59 years compared to no screening. Trials addressed HPV screening strategies appropriate to unvaccinated women. Three trials limited recruitment to middle-aged women (excluding those younger than age 30 years or older than ages 56 to 64 years),114,115,118 while six included women younger than age 30 years.112,113,116,117,119,120 Only one study included women older than age 60 years.116 We provide data stratified by age where possible for two primary reasons: 1) the FDA has approved the use of HC2 in women aged 30 years and older as an adjunct to cytology to assess the absence or presence of high-risk HPV types;70,71 and 2) the prevalence of high-risk HPV is much lower in women aged 30 years and older than in women younger than age 30 years, dropping sharply from a prevalence of 35 percent for women aged 15 to 19 years to less than 15 percent for women aged 30 to 39 years (Figure 4).34

Five RCTs112–115,117 reported program results after two rounds of screening, while the other four reported results after a single round of screening.116,118–120 Treatment was generally offered for patients with CIN2+ histology, although in several studies this information was not clearly reported114,115,118 or a different threshold was used.119,120 Most RCTs reported relative results (estimating HPV screening strategy performance relative to cytology) by providing relative test performance characteristics and detection of CIN2+/CIN3+ for a single screening round and/or comparing cumulative disease detection after multiple screening rounds. The NTCC Phase I and II trials reported invasive cancer separately from CIN2 and CIN3,112,113 but the author provided recalculated results for CIN2+ and CIN3+ by age to allow cross-study comparability. One trial was designed to allow randomized comparison of a differing order in which cytology and HPV specimens were collected.121 This trial reported cross-sectional data most comparable with other observational studies, and is reported with these.

Primary Screening With HPV Test Alone

Countries with developed cervical cancer screening programs. One fair-quality RCT within the national screening program in Italy (NTCC Phase II) compared HC2 to cervical cytology for primary cervical cancer screening in 49,196 women aged 25 to 60 years (13,725 younger than age 35 years).113 In the second screening round 3 years later, both groups were screened with cytology alone (Table 5a). Immediate colposcopy referral occurred for positive HC2 tests or for ASC-US+ cytology (Table 5b). The author provided data reported here for CIN3+ and CIN2+, since published data separated out invasive cancer cases. Published outcome data reporting CIN3 and adenocarcinoma in situ (AIS) (with or without CIN2) are provided in Appendix C, along with authors’ analyses combining ICC results across two protocols in NTCC (HPV screening and HPV-LBC co-testing). Trial data are supplemented by one good-quality122 and five fair-quality cohort studies in community settings in the United States, Canada, Switzerland, Germany, and France (Table 7).110,121,123–125 These studies compared the sensitivity and specificity of one-time HC2 screening to cervical cytology (primarily ASC-US+) in 40,732 women aged 17 to 93 years, with less than 10 percent (n=3,301) younger than age 30 years.

Women aged 30 or 35 years and older. After two rounds of screening in NTCC Phase II (one round of HPV screening) and a median of 3.5 years of followup, cumulative detection of CIN3+ (CIN3, AIS, or ICC) was increased in 17,724 women screened with HC2 relative to 17,747 women screened with cytology alone (55 vs. 35 CIN3+ lesions; RR, 1.57 [95% CI, 1.03 to 2.40]), with about the same number of invasive cancer cases detected in both arms (HC2 arm: 4 ICC/AIS cases; cytology alone: 5 ICC/AIS cases) (Table 8a).113 Trial investigators pooled invasive cancer cases from these primary HC2 results (NTCC Phase II) with HC2-CC co-testing results (NTCC Phase I) due to insignificant statistical heterogeneity between trials.113 Pooled results suggested decreased invasive cancer in women aged 35 years and older who were screened with HPV (6 total ICC cases in the HPV screening arms compared to 15 in the CC only arms; p=0.052). However, cancer outcomes would ideally come from comparable screening strategies and reflect clearly similar opportunities for diagnosis through comparable delivery of colposcopies and/or long enough followup with registry linkages to allow disease ascertainment outside the screening program. Because cumulative results are not reported for PPV, false-positive results, or colposcopy, it is difficult to assess the relative harms of HC2 versus cytology alone, or the net benefit of the two screening approaches.

Results for RCTs of HPV Screening Strategies in Cervical Cancer Screening, Women ≥30 or 35 Years of Age (KQ3).

Reported baseline colposcopy referrals were higher in HC2 screened women (5.8%), compared with cytology screened women (2.5%). Colposcopy referral data are not reported for the second screening round and are incomplete for the entire first round of screening. However, baseline colposcopy referrals in this trial may be a close approximation for the entire Round 1 screening, since women in both arms had a low threshold for immediate colposcopy referral, so few would undergo repeat testing strategies. Thus, about 3.3 percentage points more women were referred for colposcopy after a single HPV test in Round 1 than after cytology (ASC-US+ referral threshold) among women aged 35 years and older. In addition to incomplete reporting of harms and the use of different screening tests in Rounds 1 and 2 (with cytology alone in both arms in Round 2), another limitation of NTCC Phase II is that referral criteria differed by site in the control arm; two sites referred patients to colposcopy for LSIL+, and seven sites referred patients for ASC-US+.
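The difference between the two baseline referral percentages can be expressed on an absolute or a relative scale, and the two read very differently. The short sketch below works through both using only the 5.8 percent and 2.5 percent figures reported above; the function and key names are illustrative.

```python
def compare_referral_rates(pct_a, pct_b):
    """Compare two referral percentages on absolute and relative scales."""
    absolute_points = pct_a - pct_b   # difference in percentage points
    return {
        "absolute_points": absolute_points,
        "extra_per_1000": absolute_points * 10,  # additional referrals per 1,000 screened
        "relative_ratio": pct_a / pct_b,          # referrals in A per referral in B
    }


# Baseline referrals, women aged 35 years and older: HC2 vs. cytology
result = compare_referral_rates(5.8, 2.5)
print(f"{result['absolute_points']:.1f} percentage points")
print(f"{result['extra_per_1000']:.0f} additional referrals per 1,000 women screened")
print(f"{result['relative_ratio']:.2f} times the cytology referral rate")
```

On the absolute scale the gap is 3.3 percentage points (about 33 additional referrals per 1,000 women screened); on the relative scale, HC2 screening produced a little more than twice as many baseline referrals as cytology.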

Six community-based studies in both urban and rural settings in Europe, North America, and Asia reported absolute test performance of HPV alone compared with cytology. In a large study among women aged 30 years and older (n=7,908), one-time HC2 testing was much more sensitive than cytology (threshold of ASC-US+) for CIN3+ (HC2: 97.3% [95% CI, 83.2 to 99.6]; cytology: 46.0% [95% CI, 30.8 to 61.9]) and slightly less specific (HC2: 95.2% [95% CI, 93.4 to 96.5]; cytology: 98.0% [95% CI, 96.7 to 98.8]) (Table 9a).123 A second, much smaller, study (n=774) provided similar estimates of greatly improved sensitivity with slightly reduced specificity, but with very wide confidence intervals.122 This study’s applicability to women older than age 30 years is limited, since more than 80 percent of women enrolled were younger than 30 years of age. More studies reported sensitivity and specificity for CIN2+, which generally showed the same pattern of markedly higher sensitivity for HC2, with slightly decreased specificity (Table 9a and Appendix C).110,121,123–128 One study with notably different sensitivity and specificity estimates for HC2 than the rest may have been affected by misclassification of women; this study attempted to report results for primary screening separately using an “enriched” screening sample (i.e., 26% of women were already referred for abnormalities detected in previous screening, while 74% were presenting for primary screening).

Absolute Test Performance By Age of Primary Screening With HPV Test Alone and Combination HPV and Cytology Testing Among Developed Countries Only, Women ≥30 Years of Age (KQ3).

Women younger than age 30 or 35 years. The pattern of results in 13,725 younger women was similar to that in older women, but with a much higher rate of colposcopy referrals after HC2 screening (Table 8b). After two rounds of screening in NTCC Phase II, cumulative detection of CIN3+ also increased in younger women screened with HC2 relative to cytology alone (47 vs. 21 CIN3+ lesions; RR, 2.19 [95% CI, 1.31 to 3.66]), with few ICC cases detected in either arm (HC2 arm: 1 case; cytology arm: 0 cases). Relative CIN3+/CIN2+ detection was increased after HC2 screening in Round 1 to a much greater degree than in older women, with a possibly greater decrease in Round 2. Colposcopy referrals (reported for Round 1 only) were much higher in HC2 screened younger women (13.1%), compared to those screened with cytology (3.6%). One study (n=3,301) provided absolute test performance of HC2 compared with cytology in women younger than age 30 years.122 HC2 sensitivity (for CIN3+ or CIN2+) was much higher (23 to 27%) than cytology, similar to markedly increased HC2 sensitivity in older women. Specificity of HC2, however, was relatively reduced compared to cytology to a much greater degree in younger women (about 11%) (Table 9b).

Absolute Test Performance By Age of Primary Screening With HPV Test Alone and Combination HPV and Cytology Testing Among Developed Countries Only, Women <30 Years of Age (KQ3).

Countries without developed cervical cancer screening programs. A fair-quality cluster-randomized RCT of 131,806 never-screened women aged 30 to 59 years in rural India compared cervical cancer deaths and incidence up to 8 years after one-time HPV, CC, or visual inspection with acetic acid (VIA) screening to a never-screened control group.118 One-time HPV testing significantly reduced the incidence of cervical cancer deaths (adjusted HR, 0.52 [95% CI, 0.33 to 0.83]) and Stage II or higher cervical cancer (adjusted HR, 0.47 [95% CI, 0.32 to 0.69]), compared to not screening. Neither VIA nor CC significantly reduced either cervical cancer deaths or incidence of Stage II or higher cervical cancer. Per 100,000 person-years of followup, there were 19.6 fewer Stage II or higher cervical cancer cases and 13.1 fewer cervical cancer deaths in the HPV screening group, compared with the unscreened controls. Since 25 percent of the cervical cancer deaths in the HPV screening group were in women who were not screened (about 20% of those randomized to the HPV arm), there is potential for even greater benefit if a larger proportion of never-screened women received a single HPV screening. This study’s intent was to improve cervical cancer screening in a country developing its population-based screening, so applicability to the U.S. population or other developed countries is very limited (poor). Differences in treatment protocols and clinical care between rural India and the United States also suggest that cancer mortality data should be interpreted with caution. Another important limitation is that about 20 percent of eligible women randomized to one of the three screening interventions were neither screened nor included in the analysis.

We found four fair- or good-quality observational studies of primary HPV screening compared with cytology among 37,245 women aged 25 to 65 years in countries in the process of developing more robust cervical cancer screening.129–132 All except one of these studies130 show a pattern consistent with the observational studies conducted in developed countries (i.e., HPV testing is more sensitive but less specific than cytology). These studies were all judged to have poor129–131 or fair-to-poor132 applicability to the U.S. population, so they are not discussed further, but are included in Appendix C Table 3.

HPV Testing With Cytology Triage of Positive HPV (Reflex Cytology)

We identified one fair-quality RCT120 of 71,337 women aged 25 to 65 years (approximately 16% younger than age 35 years) within the Finnish national screening program comparing HC2 testing (with CC testing to triage positive HPV results) to cytology alone.133 HPV+ women with LSIL+ results on cytology triage were referred for immediate colposcopy, with retesting for ASC-US or HPV+/normal cytology results (Table 5b). Unlike most other studies, CIN1+ results were treated in all but the latter years of the trial, during which CIN1 in women younger than age 30 years was surveyed. A second round of screening (5 years after the initial round) is planned, but results from this second round have not yet been reported. Additional limitations of the Finnish trial include a high proportion of post-randomization loss (approximately one-third of women randomized to each study arm did not attend screening); unequal cross-over between study arms (more women in HPV arm screened with cytology [8%] than the converse [<0.1%]); and incomplete reporting of colposcopy referral rates (reported for baseline only) and false positives, particularly important for women aged 35 years and younger.

Women aged 35 years and older. After a single screening round (minimum 2 years of followup), HC2 testing with CC triage using an LSIL+ threshold nonsignificantly increased detection of CIN3+, compared to cytology, in women older than 35 years (32 vs. 23 CIN3+ cases; RR, 1.38 [95% CI, 0.81 to 2.36]) and increased CIN2+ detection with borderline statistical significance (RR, 1.36 [95% CI, 0.98 to 1.89]) (Table 8a). Six cases of invasive cancer were detected with HPV screening and four with conventional screening. Colposcopy referrals were modest in women older than age 35 years and similar between HPV screening (0.9%) and cytology alone (1.0%). Based on test positivity, these data appear to reflect immediate referrals for LSIL+ and appear not to include colposcopy due to retesting during initial or extended followup.

Extended followup (mean, 3.3 years; maximum, 5.0 years) of this first screening round with linkage to registry data in 38,670 screened women aged 30 to 64 years found significantly increased CIN3+ (and cancer) after cytology triage of HC2 testing, compared with cytology alone (HC2: 59 CIN3+ cases, including 11 ICC/ACIS; CC: 33 CIN3+ cases, including 6 ICC/ACIS; RR, 1.77 [95% CI, 1.16 to 2.74]).134 Extended followup included just over half of the original cohort, with women from eight of the original nine municipalities and only women older than age 30 years. Additional cases were detected in those who were invited but did not attend program-based screening. However, relative detection of CIN3+ was also increased using an intention-to-screen analysis among all women invited (1.44 [95% CI, 1.01 to 2.05]). The majority of women who tested positive in both arms (1244/1354 in HPV with triage arm and 1053/1125 in cytology arm) were not referred for immediate colposcopy, but had retesting recommended (data not shown). Almost half of CIN3+ cases detected in both arms came from the groups recommended for retesting. It also took longer for a relative CIN3+ detection advantage to emerge between women immediately referred for colposcopy and those who underwent repeat testing. Within 1 year of initial screening, cases of CIN3+ from women with LSIL+ cytology were detected, while it took 3 to 3.5 years for all CIN3+ cases to accrue among women undergoing repeat testing for less abnormal results (HPV+ with or without ASC-US+ cytology). Thus, adequate length and completeness of followup appears important in determining the comparative detection impact of screening strategies. Women who screened HPV negative tended toward a relatively lower risk of CIN3+ compared with cytology negative women (0.28 [95% CI, 0.04 to 1.17]) (data not shown).
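The significance statements in this paragraph follow the usual convention that a relative detection ratio differs from 1 at the 5 percent level when its 95% CI excludes 1. As an illustrative sketch only (the helper name is hypothetical and the code is not part of the trial analyses), the check can be applied to the three intervals reported above:

```python
# Illustrative only: the function name is hypothetical; the intervals are the
# 95% CIs quoted in the text above for the Finnish trial extended followup.

def ci_excludes_one(lower: float, upper: float) -> bool:
    """Return True when a 95% CI for a ratio lies entirely above or below 1."""
    return lower > 1.0 or upper < 1.0

reported = {
    "CIN3+, screened women (RR 1.77)": (1.16, 2.74),
    "CIN3+, intention-to-screen (1.44)": (1.01, 2.05),
    "CIN3+, HPV negative vs. cytology negative (0.28)": (0.04, 1.17),
}

results = {label: ci_excludes_one(lo, hi) for label, (lo, hi) in reported.items()}
for label, excludes in results.items():
    print(f"{label}: CI excludes 1 -> {excludes}")
```

The first two intervals exclude 1, consistent with the increases described as significant. The third spans 1, consistent with the "tended toward" wording.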

Women younger than age 35 years. Round 1 results (without extended followup) in 11,580 women younger than age 35 years found no enhanced CIN3+ or CIN2+ detection and little difference between HC2 screening with cytology triage and cytology alone in immediate colposcopy referrals (2.8 vs. 2.7%) (Table 8b). Complete Round 1 colposcopy referrals are likely higher in the HC2-cytology triage arm, since 15.8 percent of younger women in this arm were targeted for repeat testing, about twice as many as in the cytology arm alone (data not shown).

Combination HPV and Cytology Testing (Co-Testing)

We found four fair-quality RCTs within national screening programs in Italy, the United Kingdom, Sweden, and the Netherlands comparing cytology screening alone to combination testing (co-testing) in a total of 127,149 women aged 20 to 64 years (16,976 younger than age 30 or 35 years).112,114,115,117 Trials tested HC2 plus LBC against CC (NTCC Phase I); HC2 plus LBC against LBC (A Randomised Trial in Screening to Improve Cytology [ARTISTIC]); or PCR using GP5+/6+ plus CC against CC (Population Based Screening Study Amsterdam Program [POBASCAM], Swedescreen). Colposcopy referral thresholds for cytology results varied considerably between trials (HSIL+ for ARTISTIC and POBASCAM, HSIL+ or ASC-US+ for different sites within Swedescreen, ASC-US+ or LSIL+ for different sites within NTCC Phase I). A cytology referral threshold of ASC-US+ or LSIL+ is probably most applicable to U.S. screening practice. Three trials (POBASCAM, Swedescreen, ARTISTIC) based immediate colposcopy referral on cytology results alone in both arms, using HPV positive results (alone or in combination with milder cytology abnormalities) to determine enhanced followup testing protocols (Table 5b).114,117,115 NTCC Phase I followed a similar approach in women younger than age 35 years (retesting for HPV+ in persons with normal cytology), but referred older women with either HPV positive or ASC-US+ cytology results for immediate colposcopy.112 No trials represented screening and retesting protocols identical to U.S. practice (as represented by ASCCP guidelines) for ASC-US or LSIL in combination with HPV results. Two trials changed screening strategies in the second round: POBASCAM screened both arms with PCR plus cytology after 5 years, while NTCC Phase I screened both arms after 3 years with cytology only (Table 5b).112,114

Duration of overall followup and completeness of followup for the whole sample varied between studies, which potentially affected complete ascertainment of outcomes. Followup interval from baseline was reported as 4.1 years (mean) in Swedescreen, up to 7 years in ARTISTIC, at least 6.5 years (median, 7.2) in POBASCAM, and up to 3.5 years after Round 2 invitation in NTCC Phase I. Based on incomplete followup, program impact could not be reported for a substantial portion of the sample in POBASCAM (not reported for the two-thirds without full 6.5 years of followup), while 29 percent of the sample in ARTISTIC had less than the minimal (2.5 years) followup after Round 2. Reporting of results after a third screening round in ARTISTIC did not remedy this.135 Followup after a second screening round at 3 years in Swedescreen averaged less than 1 year, and did not include retesting of low-grade abnormalities. Another limitation of the co-testing trials was adherence to trial protocols. Just 50 to 60 percent of POBASCAM participants complied with repeat testing recommendations in each screening round. Twenty percent of POBASCAM participants in each arm had opportunistic screening outside the study, and 10 percent of ARTISTIC participants had no HPV test in Round 2; these deviations from protocol would be expected to attenuate measured differences between screening strategies. Given the variability in HPV-cytology co-testing strategies between trials and the lack of complete implementation and reporting for all trials at this time, we did not try to quantitatively combine results.

We supplemented trial data with test performance data from four cohort studies (three of fair quality and one of good quality) in community settings in the United States, Canada, Germany, and France (Table 7).110,121–123 These studies compared the sensitivity and specificity of one-time HC2 plus cervical cytology (defining a positive result using various combinations of test results) to cytology alone (ASC-US+) or HC2 alone in 25,040 women aged 18 to 69 years (3,301 younger than age 30 years) (Tables 9a and b).

Women older than age 30 or 35 years. In contrast with HPV screening alone, HPV plus cytology co-testing (using any of the variable screening, retest, and referral protocols) did not detect more CIN3+ after two rounds of screening than cytology alone in any of the trials (Table 8a). This finding may reflect the more stringent colposcopy referral protocols employed in most co-testing trials, compared with the one primary HPV screening trial (NTCC Phase II) (Table 5b). Round-specific screening results were somewhat mixed between trials, but generally detected relatively more CIN2+ with co-testing compared with cytology alone after Round 1, and less CIN3+ after Round 2. In all but one trial,117 51 to 78 percent more women with CIN2+ were detected in Round 1. ARTISTIC reported a 21 percent increase in CIN2+ detection that was not statistically significant, but also included all ages when reporting round-specific data (21% of women younger than age 30 years). All trials found 47 to 54 percent less CIN3+ detected in Round 2, although not all differences were statistically significant. Most trials detected the same or slightly fewer cancer cases overall in the HPV-tested arm, with few reporting impact on cancer incidence (i.e., second round relative cancer detection). Cumulative CIN2+ detection was relatively increased in the co-testing arm of a single trial (NTCC Phase I) that referred women with a positive HPV test or ASC-US+ cytology for immediate colposcopy.112

Given the many between-trial differences, it is difficult to interpret the mixed pattern of results. Findings to date do not reflect full followup of the second round of screening for any trial except NTCC Phase I (Table 5c). Reported results for colposcopy referral/attendance were also incomplete (NTCC Phase I, Swedescreen),112,115 or round-specific colposcopy by age was not reported (ARTISTIC).117 Colposcopy compliance was rarely reported. Cumulative colposcopies in ARTISTIC were higher in women randomized to co-testing (8.3%) than in those in the LBC only arm (6.4%), with a much higher colposcopy burden carried by women younger than age 30 years receiving co-testing (17.1%), compared with co-tested older women (6.0%) or with similarly young women receiving cytology only (12.0%) (Tables 8a and b).117 Cumulative colposcopies reported in POBASCAM were low (3.4% for co-testing, 2.8% for cytology alone) and inadequate for estimating the burden associated with co-testing (compared with cytology alone), since both arms received HPV testing in Round 2.114 Also, as with ARTISTIC, POBASCAM’s immediate colposcopy referral threshold (HSIL+, with retesting protocols for ASC-US or LSIL results with or without HPV positivity) does not replicate recommended U.S. practice, so complete results will need to be judged for applicability. Nonetheless, most co-testing studies report reduced CIN3+ in the second round of screening compared with cytology screening. Reduced CIN3+ after a second screening round was used as the primary outcome for power calculations in several co-testing trials (ARTISTIC, POBASCAM) and an HPV with cytology triage trial (Finnish trial), indicating its perceived value.

Findings from trials are complemented by fair- or good-quality studies of one-time combined HC2 plus cytology (co-testing) test performance (Table 7). In these cross-sectional studies of 17,885 women aged 30 to 60 years, a one-time co-test was generally more sensitive than cytology (for detection of CIN2+ or CIN3+), but also less specific. Reported sensitivity and specificity are not completely comparable across studies since most used different thresholds for test positivity (Table 9a). Two studies, in which co-testing was positive if either HPV or cytology was abnormal, reported very high sensitivity for HC2 plus cytology co-testing that was clearly superior to cytology alone (44 to 56% more sensitive) at an ASC-US+ threshold, but not clearly more sensitive than HC2 alone.121,123 This co-testing strategy (either test positive) was also 4 to 5 percent less specific than cytology alone, and appeared similar to HC2 testing alone.121,123 Other co-testing strategies required both HPV and cytology tests to be positive, unless cytology met a threshold. One co-testing study based a positive result on HSIL+ cytology or a co-test result of HPV positive with ASC-US+ cytology,110 while the other based a positive result on LSIL+ cytology or HPV+/ASC-US+ cytology results.122 With these strategies, sensitivity of co-testing for CIN2+ or CIN3+ was the same or somewhat better than cytology alone, but worse than HC2 alone (although confidence intervals were very wide). As expected, specificity for this more stringent definition of a positive co-test was better than HC2 alone, and similar or better than cytology alone. These co-test strategies are more similar to testing with either test alone, followed by triage if HPV+ or ASC-US cytology results using the other test, than to administering and acting on both tests.
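Because the studies defined a "positive" co-test differently, it can help to see the three definitions side by side. The following is a minimal sketch, assuming a simplified four-level ordering of cytology results (normal, ASC-US, LSIL, HSIL); the class and function names are hypothetical and do not come from the studies themselves.

```python
from enum import IntEnum

class Cytology(IntEnum):
    # Simplified ordering (an assumption for illustration), used only so that
    # thresholds such as "ASC-US+" can be written as comparisons.
    NORMAL = 0
    ASC_US = 1
    LSIL = 2
    HSIL = 3

def either_test_positive(hpv_positive: bool, cytology: Cytology) -> bool:
    """Positive if HPV is positive or cytology is ASC-US+ (refs 121, 123)."""
    return hpv_positive or cytology >= Cytology.ASC_US

def hsil_or_both_positive(hpv_positive: bool, cytology: Cytology) -> bool:
    """Positive if cytology is HSIL+, or HPV is positive with ASC-US+ (ref 110)."""
    return cytology >= Cytology.HSIL or (hpv_positive and cytology >= Cytology.ASC_US)

def lsil_or_both_positive(hpv_positive: bool, cytology: Cytology) -> bool:
    """Positive if cytology is LSIL+, or HPV is positive with ASC-US+ (ref 122)."""
    return cytology >= Cytology.LSIL or (hpv_positive and cytology >= Cytology.ASC_US)

# Two discordant results that the definitions classify differently.
cases = {
    "HPV+ / normal cytology": (True, Cytology.NORMAL),
    "HPV- / ASC-US cytology": (False, Cytology.ASC_US),
}
for label, (hpv, cyt) in cases.items():
    print(label,
          either_test_positive(hpv, cyt),
          hsil_or_both_positive(hpv, cyt),
          lsil_or_both_positive(hpv, cyt))
```

Under the "either test positive" rule both discordant results count as positive, while the two stricter rules count neither, which is the mechanism behind the higher sensitivity and lower specificity of the first strategy.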

Women younger than age 30 or 35 years. Only two co-testing trials (NTCC Phase I, ARTISTIC) included women younger than age 30 or 35 years (Table 8b).112,117 Complete age-specific data were reported in NTCC Phase I only, and ARTISTIC is discussed with the results for women aged 30 or 35 years and older, since it largely reflects older women. NTCC Phase I compared one round of co-testing followed by cytology with two rounds of cytology. In contrast with the general pattern in older women (in NTCC Phase I and other co-testing trials), NTCC Phase I found no impact on CIN3+ in Round 1, Round 2, or cumulatively in 11,810 women aged 25 to 34 years. CIN2+ detection in younger women, however, was significantly increased in Round 1 and cumulatively. We had particular quality concerns for younger women in NTCC Phase I. Per protocol, the trial did not refer HPV positive/cytology negative younger women for immediate colposcopy, as it did with older women. Instead, younger women were retested at 1 year (Table 5b), a strategy reflecting the higher prevalence of HPV infection and likelihood of regression in young women. However, this difference in testing protocols led to differential loss to followup between the intervention and control arms, as many participants did not comply with repeat testing protocols. No cumulative data on colposcopy referrals in NTCC Phase I are available, and the higher baseline rate of colposcopy in younger women after co-testing compared with cytology alone (11.9 vs. 4.1%) is likely an underestimate, since these data reflect only immediate referrals.

In the only co-testing test performance study conducted primarily in younger women,122 co-test positives were defined as both ASC-US+ and HPV+ (Table 9b). Co-testing was significantly less sensitive for CIN3+ (64.0% [95% CI, 51.1 to 77.6]) than HC2 alone (92.5% [95% CI, 83.5 to 97.3]), but not different than cytology alone (65.4% [95% CI, 51.9 to 79.1]). Specificity (87.6% [95% CI, 86.7 to 88.4]) was significantly higher than cytology alone (81.5% [95% CI, 80.7 to 82.3]) and HC2 alone (70.1% [95% CI, 66.5 to 73.1]). This strategy is dissimilar to that used in NTCC Phase I, and primarily mimics testing with either test alone, followed by triage if HPV+ or ASC-US+ cytology results using the other test. Positive results on both tests would be considered necessary for immediate colposcopy referral.
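One rough way to read the comparisons above is to check whether the reported 95% CIs overlap. The sketch below applies that check to the quoted intervals. Non-overlap is used here only as a conservative heuristic, not as the formal test used by the study authors, and the function name is hypothetical.

```python
# Illustrative only: values are the 95% CIs quoted in the text above.
# Non-overlap of two intervals is a conservative heuristic, not a formal test;
# overlapping intervals do not by themselves prove the absence of a difference.

def intervals_overlap(a, b):
    """Return True when closed intervals a=(low, high) and b=(low, high) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]

sensitivity_ci = {"co-test": (51.1, 77.6), "HC2": (83.5, 97.3), "cytology": (51.9, 79.1)}
specificity_ci = {"co-test": (86.7, 88.4), "HC2": (66.5, 73.1), "cytology": (80.7, 82.3)}

for name, table in (("sensitivity", sensitivity_ci), ("specificity", specificity_ci)):
    for other in ("HC2", "cytology"):
        overlap = intervals_overlap(table["co-test"], table[other])
        print(f"{name}: co-test vs. {other}: intervals overlap -> {overlap}")
```

Only the sensitivity intervals for co-testing and cytology overlap, which matches the description of co-testing as not different from cytology in sensitivity but different in the other three comparisons.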

Cytology Testing With HPV Triage of Positive Cytology (Reflex HPV)

We identified two good-quality RCTs in the United States116 and Sweden119 (Table 6) that addressed HPV triage of positive cytology. Neither study, however, compared HPV testing alone to repeat cytology in women referred with ASC-US or LSIL cytology. We also located four prospective cohort studies in countries with developed cervical cancer screening programs (United States, Sweden, France, Italy), three of fair quality136–138 and one of good quality,100 that evaluated the use of HPV testing using HC2 for triaging 2,261 women aged 15 to 78 years with ASC-US and LSIL cytology to colposcopy (Table 10). Three of the cohort studies compared one-time HPV screening to repeat cytology for triage of women referred with ASC-US or LSIL cytology.100,136,137 In the fourth cohort study, women with ASC-US received repeat cytology and HPV testing at enrollment.138 Women who tested positive on either test were invited for repeat HPV and cytology testing 6 months later. All women received HPV and cytology testing at 12-month followup. All studies compared HC2 to repeat CC, except ALTS, which compared HC2 and LBC to LBC alone (ThinPrep).

ALTS was a three-armed RCT that compared immediate colposcopy referral to HPV testing (HC2) plus repeat LBC or cytology (LBC) retesting alone (conservative management) to determine colposcopy referral in 5,060 U.S. women aged 18 to 81 years with community Pap smear diagnoses (69% ASC-US, 31% LSIL).116 ALTS participants were primarily young (77.5% younger than age 35 years) and were racially and ethnically diverse. Criteria for immediate colposcopy referral were HSIL+ on repeat testing in either arm or positive HPV results in the intervention group. All women received colposcopy at 2 years. The reported sensitivity of the three arms for detection of CIN3+ was actually a calculation of the cases of CIN3+ detected using each of the management strategies within the a priori-defined period for the strategy (enrollment period for HPV arm and enrollment plus followup periods for conservative management) out of the total CIN3+ detected in that arm over the 2-year study period. This calculation focuses on comparing the efficacy of a single test event (HPV plus cytology) with ongoing testing (with cytology alone).

Within the a priori-defined periods for each strategy, CIN3+ was detected in 6.3 percent of women in the HPV-LBC triage arm and 5.1 percent in the cytology triage arm, for a relative CIN3+ detection ratio of 1.24 (95% CI, 0.88 to 1.73) (Table 11). CIN3+ included two cases of ICC and one case of AIS across all arms (one per arm) and similar cumulative CIN3+ cases (97 in immediate colposcopy arm, 101 in HPV triage, 109 in conservative management). HPV testing diagnosed a greater percentage of the CIN3+ cases at baseline, rather than during followup or at the exit visit (75.2% of all cases detected over the 2 years of the study), than immediate colposcopy (59.8%) or followup cytology (40.7%). At a cytology threshold of HSIL+, repeat cytology triage over 2 years referred significantly fewer women to colposcopy than HPV-cytology triage (12.3% vs. 55.6%; p<0.001), with no colposcopy referrals in the HPV arm based on cytology alone. Colposcopy compliance was reduced slightly when delayed (90.1% after HPV triage, 98.7% with immediate colposcopy referral).
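The relative detection ratio quoted above is simply the ratio of the two detection proportions. A minimal arithmetic check, using only the rounded percentages given in the text (so the result could differ slightly from a ratio computed from raw counts), is:

```python
# Illustrative arithmetic only, using the rounded percentages quoted above.
hpv_lbc_triage_pct = 6.3    # CIN3+ detected, HPV-LBC triage arm
cytology_triage_pct = 5.1   # CIN3+ detected, cytology triage arm

relative_detection = hpv_lbc_triage_pct / cytology_triage_pct
print(round(relative_detection, 2))  # 1.24
```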

Results of RCTs for Cytology Testing With HPV Triage of Positive Cytology (KQ3).

Among women with LSIL enrolled in ALTS, HPV testing diagnosed a greater percentage of the CIN3+ cases at baseline, rather than during followup or at the exit visit (68.3% of all cases detected over the 2 years of the study), than immediate colposcopy (62.7%) or followup cytology (36.6%). HPV testing to triage LSIL, however, referred a vast majority of women (85%) to colposcopy. CIN3+ was detected in 12.1 percent of women in the HPV-LBC triage arm and 6.7 percent in the cytology triage arm. The HPV arm was closed early due to very high HPV positivity, leading to an unequal number of women in each arm. Therefore, relative detection ratios are not valid in women with LSIL.

The ALTS trial was rated as good quality, but had some limitations, particularly related to applicability. The study design likely does not reflect current standard U.S. practice for ASC-US cytology. This study used a repeat cytology threshold of HSIL for referral to colposcopy, while recent guidelines recommend referral of women to colposcopy if ASC-US or worse is identified on repeat cytology.68 The results of the ALTS trial are reasonable estimates of what the study arms would produce in real life. While theoretical estimates are reported by the authors, the data provided do not allow for calculation of realistic estimates of comparative referral rates for usual care. More women would potentially have been referred to colposcopy, and sensitivity for CIN3+ might have been higher in the conventional management arm if a lower cytology threshold for referral had been employed. Whether this would have differed from the HPV triage strategy is unknown.

A second, smaller RCT compared HPV testing (HC2) plus repeat CC to repeat CC alone in 674 women aged 23 to 60 years with ASC-US or LSIL Pap smears identified through the Swedish national screening program.119 After one round of triage, 132 CIN3+ lesions (including one ICC) were detected, and relative CIN3+ detection tended to be higher with HPV-CC triage than CC triage alone (1.20 [95% CI, 0.88 to 1.63]) (Table 11). These results were very similar in magnitude to the single round in ALTS, but were not statistically significant. Although power was limited, HPV-CC triage tended to improve relative CIN3+ detection primarily in women aged 30 years and older. HPV-CC triage significantly improved CIN2+ detection compared with CC alone (1.32 [95% CI, 1.04 to 1.67]), with relatively greater detection of CIN2+ in younger women. A very large proportion of women (62% in the HPV-CC triage arm and 41% of women in the CC triage arm) were referred to colposcopy after triage. The relative false-positive proportion for CIN3+ was 1.74 (95% CI, 1.38 to 2.20), meaning there were seven false positives in the HPV-CC triage arm for every four in the CC-only triage arm. Age-specific results suggested worse relative false-positive performance for women younger than age 30 years, compared with older women.
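The "seven for every four" phrasing is a small-integer approximation of the reported ratio of 1.74. As a hedged illustration (not a calculation from the trial report), the nearest fraction with a denominator of at most four can be found with the Python standard library:

```python
from fractions import Fraction

# Illustrative only: approximate the reported relative false-positive
# proportion (1.74) by the closest fraction whose denominator is at most 4.
relative_false_positive = Fraction("1.74")
approximation = relative_false_positive.limit_denominator(4)
print(approximation)  # 7/4
```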

This trial differed from ALTS in several important ways: 1) women were referred at a threshold of ASC-US (and/or HPV positive results) after one triage test, rather than through a program of repeat testing; and 2) all triage positive women were treated with LEEP, laser conization, or hysterectomy, providing good histological confirmation of disease. While this trial was rated as good quality, its small sample size limits its power. Additionally, the applicability of this trial to U.S. practice is limited because the authors do not present results separately for women referred with ASC-US versus LSIL cytology. As seen in the ALTS trial and the observational studies discussed below, the HPV test does not perform well as a triage test in women referred with LSIL cytology due to low specificity, whereas the specificity of HPV is similar to repeat cytology in the triage of women with ASC-US cytology to colposcopy.

Three fair-quality studies136–138 and one good-quality study100 that included 2,299 women aged 15 to 78 years with ASC-US on initial CC screening compared the absolute sensitivity and specificity of HPV alone or combined with cytology to repeat cytology alone for the detection of CIN2+ (Table 12). Studies primarily reported CIN2+ using a referral threshold of ASC-US+ or HC2 >1 pg/ml. All but one of these studies trended toward higher sensitivity for HPV.136 The confidence intervals, however, were wide, and the differences were not statistically significant.100,137,138 HPV tended to have similar100 or worse specificity136,137 than repeat cytology in all but one study that showed slightly improved specificity with HPV,138 although power was an issue in most comparisons. HPV plus cytology tended to improve sensitivity but reduce specificity (with limited power, because fewer than 1,000 women with ASC-US were evaluated for these comparisons).136,138

The cohort studies were small (total n=2,299), but data could be pooled for three of the studies (n=1,550) to provide combined test performance estimates for the comparison of HPV testing to repeat cytology for the detection of CIN2+ among women with ASC-US referral cytology (Figures 7 and 8).100,136,137 The pooled difference in sensitivity between HC2 and repeat cytology was estimated to be 12 percent (95% CI, 0.2 to 23.9), suggesting a better sensitivity for HC2. The confidence interval was wide even after pooling due to small sample sizes. No difference in specificity between HC2 and repeat cytology was observed (P=0.65 for the combined difference).
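The lower bound of the pooled sensitivity difference (0.2) sits very close to zero. A back-of-the-envelope sketch shows how close, assuming the reported interval is a symmetric normal-approximation (Wald-type) interval; the report does not state the method, so this is an assumption, and the resulting values are approximations rather than figures from the review.

```python
from statistics import NormalDist

# Illustrative only: back-calculates an approximate standard error, z, and
# two-sided p-value from the pooled difference and 95% CI quoted above,
# ASSUMING a symmetric normal-approximation interval (not stated in the report).
difference, lower, upper = 12.0, 0.2, 23.9  # percentage points

standard_normal = NormalDist()
z_critical = standard_normal.inv_cdf(0.975)          # about 1.96
standard_error = (upper - lower) / (2 * z_critical)  # half-width / z_critical
z_score = difference / standard_error
p_two_sided = 2 * (1 - standard_normal.cdf(z_score))

print(f"SE ~ {standard_error:.2f}, z ~ {z_score:.2f}, p ~ {p_two_sided:.3f}")
```

Under that assumption the p-value falls just below 0.05, which fits the cautious wording that the pooled estimate only suggests better sensitivity for HC2.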

The fourth cohort study was not pooled because it provided cumulative test performance over 1 year for detection of CIN2+ among women with ASC-US cytology results.138 In this study, HC2 alone was more sensitive than cytology for the detection of CIN2+ (93.1% [95% CI, 91.3 to 94.9] vs. 74.1% [95% CI, 70.9 to 77.3]) and more specific (78.6% [95% CI, 75.7 to 81.6] vs. 72.3% [95% CI, 69.0 to 75.6]). The combination of both cytology and HPV testing was 100 percent sensitive, but less specific (62.5% [95% CI, 58.9 to 66.0]) than either HC2 alone or cytology alone. Age-specific results showed significantly better sensitivity and a tendency toward better specificity (but worse PPV) with repeat cytology in women aged 35 years and older, compared with younger women. In older women, HPV triage had a significantly higher area under the curve (AUC) (0.92) than in younger women (AUC, 0.74).

One study confirmed the ALTS finding that immediate colposcopy was less sensitive than HPV testing (but not repeat Pap testing) in all women with ASC-US, regardless of age.138 When data were reported for triaging initial LSIL results using HPV,137 with or without repeat cytology,136 these studies confirmed findings from ALTS of very poor specificity of HPV testing strategies for triaging LSIL.

Findings from observational studies generally represented older women (mean or median age, 34 to 42 years) and confirmed an increased detection of CIN2+ with HPV triage of ASC-US cytology (compared with repeat CC) and no further sensitivity advantage of adding CC to HPV triage. Trial results suggest reduced specificity (more false positives and colposcopies) with an HPV triage strategy, and most observational studies agree. In one small study (n=749) of ASC-US only that reported age-specific results for women younger and older than age 35 years, sensitivity for CIN2+ did not differ by age. However, in women aged 35 years and older, specificity for HC2 was better than for repeat cytology (84.8% vs. 74.7%), while in women younger than age 35 years, specificity for HC2 tended to be lower than for repeat cytology (60.4% vs. 65.5%).138

Key Question 4. What Are the Harms of Liquid-Based Cytology?

Potential harms of screening with LBC (which are also potential harms of CC) include harm from collecting the cytologic sample itself, harm from unnecessary evaluation of false-positive smears, psychological distress associated with a false-positive result, and the economic burden related to recall for repeated sampling due to an inadequate or insufficient LBC specimen. We did not identify any studies that specifically addressed direct harm from collection of the LBC sample or psychological distress. Additionally, we did not systematically review the harms of diagnosis with colposcopy and biopsy.

Key Question 5. What Are the Harms of Using HPV Testing as a Screening Test, Either Alone or in Combination With Cytology?

Potential harms of HPV testing include harm from collecting the sample, psychological distress associated with a false-positive result or unnecessary evaluation of a false-positive result, partner discord, and the economic burden related to recall for repeated sampling due to an inadequate or insufficient specimen. Seven of the studies included for KQ3 reported on insufficient HPV test samples (Appendix C Table 3).112,113,115–117,122,132 The proportion of insufficient HPV test samples in these studies (including both HC2 and PCR) ranged from 0.08 to 6.0 percent of samples taken. No studies reported direct harm from collection of the cervical sample itself.

We found four fair-quality observational studies that examined the psychological impact of HPV testing (Tables 13 and 14).139–142 Three were conducted in the United Kingdom, two cross-sectional surveys140,141 and one consecutive series139 of patients evaluated from a randomized trial of combined HPV and LBC testing, which included 4,155 women aged 20 to 64 years presenting for routine cervical screening. These three studies focused on the psychological impact of knowing HPV test results. The fourth, a randomized trial of HPV triage of ASC-US Pap smears conducted in Australia (n=314), evaluated the psychological impact of HPV triage versus repeat cytology versus having an informed choice of either an HPV test or repeat cytology.142 All study details are included in Appendix C.

Two of the three studies evaluating the psychological impact of knowing HPV test status (known test positive versus test negative, known test result versus no test result) evaluated only the immediate impact of the HPV test results.139,141 The third evaluated participants both at 1 week and 6 months after receiving test results.140 These studies found testing positive resulted in short-term increases in anxiety and distress among women who knew their HPV test result, but these findings resolved by 6-month followup (Table 14). Among women who did not know their test results, there were no differences in anxiety and distress between women who tested positive and those who tested negative. In the fourth study that evaluated the short- and long-term psychological impact of HPV triage of ASC-US cytology versus repeat cytology, long-term followup suggested greater satisfaction with care and less distress among women undergoing HPV testing.142

A second study evaluated the short-term psychological impact of HPV testing in a consecutive sample of women enrolled in the ARTISTIC trial with normal or mildly abnormal cytology (Table 13).139 Women in the ARTISTIC trial underwent both cytology and HPV testing, but in one arm the HPV test result was concealed. Overall, 2,700 women in the revealed arm and 882 women in the concealed arm were mailed questionnaires assessing psychological distress, anxiety, and sexual satisfaction at 2 weeks after they had received the results of their baseline cytology. This study was rated as fair quality, with the primary concern that no baseline General Health Questionnaire (GHQ), STAI, or Sexual Rating Scale (SRS) assessment was performed for study participants prior to undergoing cervical cancer screening. In addition, the followup questionnaire had a response rate of about 70 percent.139 The primary comparison was made among women with normal cytology who either knew (revealed arm) or did not know (concealed arm) they were HPV positive.139

The two groups did not differ in distress or anxiety (Table 14). Women in the HPV revealed arm did indicate lower sexual satisfaction with their current partner (adjusted mean difference in SRS score, −7.28 [95% CI, −12.60 to −1.96]). Planned subgroup analyses of women who had borderline or mild abnormalities on cervical cytology revealed no significant differences in distress, anxiety, or sexual satisfaction between the two groups. The study also compared women within the revealed arm of the study who knew their HPV test result.139 Among women with negative cytology results, the odds of psychological distress (GHQ score ≥4) were increased (age-adjusted OR, 1.70 [95% CI, 1.33 to 2.17]), and mean GHQ scores were higher (mean difference, 1.43 [95% CI, 0.75 to 2.10]) for women who knew they were HPV positive compared to women who knew they were negative. STAI scores indicated higher state (mean difference, 2.90 [95% CI, 1.40 to 4.39]) and trait (mean difference, 1.53 [95% CI, 0.16 to 2.92]) anxiety levels for women who were HPV positive. There were no statistically significant differences between HPV positive and negative women with mild or borderline cytology results, except that women who were HPV positive had higher sexual satisfaction scores with their current partners than those who were HPV negative (mean difference, 8.66 [95% CI, 4.30 to 13.02]; p<0.0001).

The third study evaluating the psychological impact of HPV test results was conducted by Maissi and colleagues,140,143 who evaluated 2,183 women attending routine cervical screening at two of the three centers taking part in an English HPV/LBC pilot study (Table 14). In addition to assessing distress and anxiety, they also assessed health-related quality of life (EuroQol EQ-5D). Outcomes were assessed by mailed questionnaire within 1 week after women had received their HPV and cytology results. Sixty-three percent of women returned the 1-week questionnaire, and 74 percent completed a followup questionnaire at 6 months. No data were provided to assess differences between responders and nonresponders. The analysis compared psychological outcomes among four groups of participants: women with 1) normal cytology results and no HPV test, 2) borderline or mildly abnormal cytology results and no HPV test, 3) borderline or mildly abnormal cytology results and an HPV negative test, and 4) borderline or mildly abnormal cytology results and an HPV positive test.140,143 The study groups varied significantly in baseline characteristics; however, attempts were made to control for potential confounders in multivariate analyses.140,143

Results from immediate followup showed that the groups differed significantly in anxiety (F=4.44; p=0.004), distress (F=5.37; p=0.001), and concern about test result scores (F=242.46; p<0.001) (Table 14).140,143 Women with abnormal cytology who were HPV positive had significantly higher anxiety (mean, 39.6 vs. 37.6; p<0.00), distress (mean, 2.8 vs. 2.1; p<0.05), and concern (mean, 9.7 vs. 8.8; p<0.05) than women who were HPV negative. There was no difference in anxiety, distress, or concern between women who had negative HPV tests and women who were not tested for HPV.

At 6 months, the groups still differed significantly in concern about test results (F=83.39; p<0.001), but not in anxiety or distress (Table 14).140,143 Levels of anxiety, distress, and concern did not differ significantly between the HPV positive and HPV negative groups. Groups did not differ in health-related quality of life scores at baseline or followup. All four groups had low scores on the Psychological Effects of Abnormal Pap Smears Questionnaire, indicating low levels of sexual health worries, but women who were HPV positive had significantly higher scores than women who were HPV negative (p<0.05).

The fourth study, by McCaffery and colleagues, evaluated the psychological impact of HPV triage versus repeat cytology versus having an informed choice of either an HPV test or repeat cytology among women from family planning clinics across Australia with ASC-US equivalent cytology results (Table 13).142 Overall, this was a well-designed, pragmatic, nonblinded RCT with good followup. Outcomes were assessed by questionnaire at baseline and at 2 weeks, 3 months, 6 months, and 12 months after the triage test. The primary outcome measure was health-related quality of life (36-Item Short-Form Health Survey, mental health combined score). The investigators also assessed cognitive measures (perceived disease severity and risk, intrusive thoughts, worry, and satisfaction with care), emotional measures (anxiety, distress, and concerns about infectivity and effects on relationships), and behavioral measures (effects on sexual health, help-seeking behavior, and visits to a primary care physician).

At 2 weeks, no significant differences were seen between the three groups in psychosocial outcomes, except in the proportion reporting intrusive thoughts (57%, 43%, and 32% for the HPV, informed choice, and repeat cytology groups, respectively; p=0.02) and in satisfaction with care (p=0.04). Over 1 year, however, distress was significantly lower in the HPV group than in either the repeat cytology or informed choice groups, with mean CSQ scores of 16.6 in the HPV group, 18.4 in the cytology group, and 17.5 in the informed choice group (p<0.01).142 Mean satisfaction scores were highest among women randomized to HPV testing and informed choice. The authors hypothesized that the longer wait for results in those who chose or were assigned to cytology likely accounted for higher distress and lower satisfaction in this group.