Cancer Biology Program, Program in the Biomedical Sciences, Rackham Graduate School, University of Michigan, Ann Arbor, Michigan.
Department of Otolaryngology/Head and Neck Surgery, University of Michigan, Ann Arbor, Michigan.

Abstract

High-risk HPV (hrHPV) is the leading etiologic factor in oropharyngeal cancer. HPV-positive oropharyngeal tumors generally respond well to therapy, with complete recovery in approximately 80% of patients. However, it remains unclear why some patients are nonresponsive to treatment, with 20% of patients recurring within 5 years. In this study, viral factors were examined for possible clues to differences in tumor behavior. Oropharynx tumors that responded well to therapy were compared with those that persisted and recurred. Viral oncogene alternate transcripts were assessed, and cellular sites of viral integration were mapped and sequenced. Effects of integration on gene expression were assessed by transcript analysis at the integration sites. All of the tumors demonstrated active viral oncogenesis, indicated by expression of HPV E6 and E7 oncogenes and alternate E6 splicing. In the responsive tumors, HPV integration occurred exclusively in intergenic chromosome regions, except for one tumor with viral integration into TP63. Each recurrent tumor exhibited complex HPV integration patterns into cancer-associated genes, including TNFRSF13B, SCN2A, SH2B1, UBE2V2, SMOC1, NFIA, and SEMA6D. Disrupted cellular transcripts were identified in the region of integration in four of the seven affected genes.

Introduction

High-risk human papillomaviruses (hrHPV) are known factors in the etiology of head and neck squamous cell carcinoma (HNSCC), particularly in association with the increasing incidence of oropharynx cancers. Conventional treatment for these patients includes high-dose radiotherapy often combined with concurrent chemotherapy. These treatments are associated with significant acute and long-term morbidity. In oropharyngeal tumors, hrHPV is associated with better prognosis, suggesting that hrHPV-positive tumors may be responsive to alternate therapies that are more tolerable than those currently used (1–5). However, a reduction in treatment intensity is hampered by our current inability to distinguish the most responsive HPV-positive oropharynx tumors from those that would fail if given reduced-intensity treatment or that fail to respond to current therapies.

Carcinogenesis in hrHPV-induced tumors is driven by sustained expression of viral E6 and E7 oncogenes, which is secondary to disruption of the viral E2 gene that regulates E6-E7 expression (6). The HPV16 E6 gene contains multiple splice sites, generating alternate E6*I-E7 and E6*II-E7 transcripts that have been linked to increased expression of E7, considered to be the more potent oncoprotein, at the expense of full-length E6 (7, 8). The E6*I and E6*II alternate transcripts result from a single donor site at nucleotide (nt) 226 of the viral genome and two acceptor sites at nt 407 (E6*I) and at nt 526 (E6*II; Fig. 1A). Secondary carcinogenic mechanisms of viral integration could include disruption of tumor suppressor genes or upregulation of genes that promote cell-cycle progression. Integration of hrHPV into the host cellular genome has been reported to be associated with high E6 and E7 transcription and carcinogenic progression from cervical intraepithelial neoplasia (CIN) to invasive disease in many cervical cancer studies (6, 9, 10). Cellular sites of viral integration in cervical cancer are primarily into gene-poor regions or chromosome-common fragile sites (11–13), but there are studies that report viral integration into known genes in cervical cancer (14–17). Similarly, in head and neck cancers, viral integration into the host genome and into cellular genes is now being appreciated. A study by Parfenov and colleagues that analyzed TCGA data from 35 HPV-positive head and neck tumors noted frequent integration of HPV into regions of microhomology between the viral and host genomes. Furthermore, they observed that more than half of these integration events occurred into a known gene, and another subset (19%) occurred within 20 kb of a gene. 
They also found that the integration of HPV often altered the expression of cellular genes and was associated with focal amplifications; however, there was no significant association of HPV integration status with clinical outcome (18).

We and others have previously demonstrated transcriptionally active hrHPV integration into known cancer-related genes (those that are known to be involved in cancer pathways or that have been reported to have altered expression in one or more types of cancer) in HPV16-positive HNSCC cell lines (19–21). Parfenov and colleagues (18) postulate that integration events affecting expression and function of cellular genes may be a secondary driver of HPV-positive head and neck tumors. Vojtechova and colleagues (21) reported integrated, extrachromosomal, or mixed integrated and extrachromosomal HPV in 14 fresh and 186 archival tumors and found that HPV positivity, tumor size, and lymph node positivity were associated with survival, but integration status was not. In this study, we investigate viral copy number, viral oncogene transcript production, sites of viral integration, and effects of viral integration on cellular gene transcripts across the integration sites in 10 patients who differed in tumor resolution after therapy.

Materials and Methods

Tumor specimens

Oropharyngeal tumors were obtained from the Head and Neck Cancer SPORE Biorepository. We evaluated 10 HPV-positive oropharyngeal tumor pretreatment biopsies from patients who had also provided fresh-frozen tumor tissue and written informed consent to investigate their tissue under a study approved by the Institutional Review Board for the University of Michigan (Ann Arbor, MI) medical school. Tumor information, patient gender, age, smoking status, year of diagnosis, treatment, and outcome are listed in Table 1. Areas in the paraffin tissue block enriched for tumor cells were identified and marked by a pathologist on a freshly cut hematoxylin and eosin–stained slide prior to tumor sampling. Genomic DNA was extracted from formalin-fixed, paraffin-embedded (FFPE) tumor cores using the DNeasy Blood and Tissue Kit (Qiagen) or from fresh-frozen tumor sections using a standard phenol extraction. Tumor tissue was microdissected for RNA from fresh-frozen tumor sections immediately following histologic evaluation. Total RNA was isolated using the RNeasy Mini Kit with QIAzol (Qiagen), followed by on-column DNase treatment.

HPV genotyping and copy number analysis

hrHPV genotyping was performed on DNA from all tumors using the HPV PCR-MassArray assay (22, 23). Type-specific TaqMan quantitative PCR was used to determine HPV copies per cell, assessing both E6 and E7 amplicons, with a GAPDH assay as an endogenous two-copy/cell reference control.
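The copies-per-cell arithmetic implied by a two-copy/cell reference can be sketched as follows. This is a minimal illustration only, assuming that absolute quantities for the HPV amplicon and for GAPDH have already been read from standard curves; the function name and the example quantities are hypothetical and are not taken from the study.

```python
def hpv_copies_per_cell(hpv_copies: float,
                        gapdh_copies: float,
                        ref_copies_per_cell: float = 2.0) -> float:
    """Estimate HPV copies per cell from qPCR quantities.

    hpv_copies          -- measured copies of an HPV amplicon (E6 or E7)
    gapdh_copies        -- measured copies of the GAPDH reference amplicon
    ref_copies_per_cell -- reference copies per diploid cell (two for GAPDH)
    """
    if gapdh_copies <= 0:
        raise ValueError("GAPDH quantity must be positive")
    # Number of cell equivalents represented in the reaction.
    cell_equivalents = gapdh_copies / ref_copies_per_cell
    return hpv_copies / cell_equivalents


# Hypothetical example: 5,000 HPV copies against 100 GAPDH copies
# corresponds to 50 cell equivalents.
example = hpv_copies_per_cell(5000, 100)
```

Because the E6 and E7 amplicons are assessed separately, the same function could be applied to each and the two estimates compared.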

HPV E6 and E7 transcript analysis

HPV16 E6 and E7 transcripts were evaluated by RT-PCR with gel electrophoresis and TaqMan quantitative RT-PCR. To analyze the expression of HPV16 E6 and E7, transcript-specific assays were used that exclusively amplify each product: the intact, nonspliced, full-length E6-E7 transcript, the spliced E6*I-E7 transcript, and the spliced E6*II-E7 transcript, as illustrated in Fig. 1A (primer sets are listed in Supplementary Table S1). An assay for human endogenous GAPDH was included to verify the absence of contaminating genomic DNA. Quantitative RT-PCR was performed using similar transcript-specific TaqMan assays that individually interrogate each HPV E6 and E7 transcript: nonspliced full-length E6, spliced E6*I, spliced E6*II, and E7 (primer sequences are listed in Supplementary Table S2; Fig. 1B). A TaqMan quantitative assay for GAPDH was included as an endogenous control to calculate relative viral gene expression.
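One common way to express viral transcript levels relative to an endogenous control is the comparative Ct calculation, sketched below. The text does not state which calculation was used, so this is an assumption for illustration; it also assumes roughly 100% amplification efficiency (a doubling per cycle). The function name and Ct values are hypothetical.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Expression of a target transcript relative to a reference (2^-dCt).

    Assumes each PCR cycle doubles the product, so one cycle of
    difference in Ct corresponds to a two-fold difference in template.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values for the four viral assays and for GAPDH.
ct_values = {"E6 full length": 27.0, "E6*I": 24.0, "E6*II": 28.0, "E7": 24.0}
ct_gapdh = 22.0

relative = {name: relative_expression(ct, ct_gapdh)
            for name, ct in ct_values.items()}
```

With these made-up values, E6*I and E7 come out eight-fold more abundant than full-length E6, which is the kind of comparison reported in the Results.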

Detection of integrated papillomavirus sequences-PCR

Viral integration was evaluated using an adaptation of the detection of integrated papillomavirus sequences-PCR (DIPS-PCR) method previously published (13). Genomic DNA from each tumor was subjected to Taqα1 restriction enzyme digestion, producing fragmented DNA. There are approximately 1.5 million Taqα1 restriction sites within the human cellular genome, but only one in the nonvariant HPV16 genome, located in the E6 open reading frame (ORF) at nucleotide 505. Additional HPV16 Taqα1 restriction sites have been described in HPV16 variants at positions 311 and 2608. Following restriction digest, a ligation reaction attached a double-strand adapter oligo (5′-CGCAACGTGTAAGTCTG-NH2-3′ annealed to 5′-GGGCCATCAGTCAGCAGTCGTAGCCGGATCCAGACTTACACGTTG-3′) to the overhanging ends of each fragment. Linear amplification of the ligated fragments was performed using 11 viral-specific primers, generating amplicons that originate in the viral genome, extend into adjacent cellular sequence, and terminate at the end of the adapter. This was followed by a second, logarithmic, PCR using 11 nested viral primers with a reverse adapter-specific primer (primers are listed in Supplementary Table S3). Thermocycling conditions used for linear and exponential PCR included 3-minute extension cycles, limiting amplicon size to 3 kb or less and therefore excluding production of large (>3 kb), episome-only fragments. PCR products were separated by gel electrophoresis.
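The digestion and adapter-ligation steps can be illustrated with a short sketch. It assumes the standard TaqI recognition sequence TCGA with cleavage after the T, which leaves a two-base 5′-CG overhang; the adapter oligo sequences are those given above, while the helper names and the toy input sequence are hypothetical.

```python
from typing import List, Optional

# Adapter oligos as listed in the text (5' to 3').
SHORT_OLIGO = "CGCAACGTGTAAGTCTG"
LONG_OLIGO = "GGGCCATCAGTCAGCAGTCGTAGCCGGATCCAGACTTACACGTTG"


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def digest(seq: str, site: str = "TCGA", cut_offset: int = 1) -> List[str]:
    """Cut a linear top-strand sequence at every occurrence of `site`.

    cut_offset=1 places the cut after the T of TCGA (assumed TaqI cut).
    """
    seq = seq.upper()
    bounds = [0]
    start = seq.find(site)
    while start != -1:
        bounds.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds.append(len(seq))
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def adapter_overhang(short: str, long_: str) -> Optional[str]:
    """Return the unpaired 5' bases of `short` when annealed to the 3' end of `long_`."""
    for k in range(len(short)):
        if long_.endswith(revcomp(short[k:])):
            return short[:k]
    return None


fragments = digest("AATCGAGGTCGATT")          # toy sequence with two sites
overhang = adapter_overhang(SHORT_OLIGO, LONG_OLIGO)
```

Under these assumptions, the annealed adapter presents an unpaired 5′-CG, which is complementary to the 5′-CG ends left on each digested fragment and so would allow ligation at every cut site.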

Sequence analysis of HPV16 integration products

Viral–cellular fragments were distinguished from episomal virus fragments based on predicted viral-only amplicon sizes of 2,750 bp or larger (Supplementary Table S3). DIPS-PCR amplicons of approximately 2,500 bp or smaller were identified, the corresponding bands were excised, and the amplicons were purified and sequenced. Integration events into known cellular genes were confirmed by direct PCR and sequencing of the original tumor genomic DNA, using primers designed for each viral and cellular region.
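The size-based triage of DIPS-PCR bands described above can be sketched as a simple rule. The 2,500 bp and 2,750 bp thresholds come from the text; the function name, the category labels, and the handling of bands that fall between the two thresholds are assumptions made for illustration.

```python
def classify_amplicon(size_bp: int,
                      candidate_max_bp: int = 2500,
                      viral_only_min_bp: int = 2750) -> str:
    """Classify a DIPS-PCR band by its estimated size in base pairs."""
    if size_bp <= 0:
        raise ValueError("amplicon size must be positive")
    if size_bp <= candidate_max_bp:
        # Shorter than any predicted viral-only product: excise and sequence.
        return "candidate viral-cellular"
    if size_bp >= viral_only_min_bp:
        # Consistent with a predicted episomal (viral-only) product.
        return "predicted viral-only"
    # Between thresholds: gel sizing is approximate, so flag for review.
    return "ambiguous"


# Hypothetical band sizes read from a gel.
bands = [450, 1200, 2500, 2600, 2750, 2900]
calls = [classify_amplicon(size) for size in bands]
```

Size alone only nominates bands; in the study, the viral–cellular junction was established by sequencing and confirmed by direct PCR on the original tumor DNA.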

Integration site transcript analysis

RT-PCR assays were designed to amplify viral–cellular fusion transcripts and cellular transcripts from tumor RNA in cases expected to be altered by confirmed viral integration into known cellular genes. Assays included virus–cellular fusion transcripts (although expected only in the single case where the integration into the cellular gene followed the same orientation as the virus) from HPV ORFs into cellular gene exons, cellular gene exon–exon transcripts spanning the integration site, and exon–exon or within-exon transcripts outside of the integration site region. All successfully amplified transcripts were sequenced for verification.

Results

Patient material

Of the 227 HPV16-positive oropharynx tumors, there were 19 with sufficient FFPE and fresh-frozen tumor tissue for our study. Of these, tumor DNA and RNA of sufficient quantity and quality and patient follow-up of at least 2 years were obtained from 10 patients, 5 with recurrent and 5 with responsive tumors. Patient information, tobacco and alcohol use, HPV copy number, year of diagnosis, tumor staging, treatment, and outcome are listed by patient in Table 1. Tumors were from patients treated with (i) surgery alone (1 patient); (ii) surgery and chemotherapy with weekly cisplatin (40 mg/m²; 1 patient); or (iii) concurrent chemotherapy with weekly carboplatin (AUC 1) and paclitaxel (30 mg/m²) and intensity-modulated radiotherapy to defined tumor targets to maximize tumor dose and minimize dose to normal tissue (8 patients). All patients had advanced stage III or IV disease. Patients with recurrent tumors all had advanced stage disease (IVA or IVB) and large (T3 and T4) tumors. Of the patients who remained free of disease, three had stage III, one had stage IVA, and one had stage IVB disease; two had T1, two had T2, and one had T3 tumors. At the time of publication, all of the patients with responsive tumors were free of disease [38–56 (mean, 48.6) months following diagnosis], and 4 of the 5 patients with recurrent tumors had died, with survival time from diagnosis ranging from 11 to 75 (mean, 29.5) months. The single surviving patient in the recurrent group is alive with disease at 109 months.

HPV genotyping and copy number analysis

All 10 tumors were positive for HPV16 and negative for all 13 other hrHPV types included in the PCR-MassArray assay. HPV16 copy number for the responsive tumors ranged from 16 to more than 500 copies per cell with a mean viral copy number of 242.8. The recurrent tumors had overall lower copy number, ranging from 6 to 279 HPV16 copies per cell with a mean of 92.6 (Table 1). Although the mean copy number was lower in the nonresponsive tumors, the difference in the mean copy number values for responsive and recurrent tumors was not significant (P = 0.26).

HPV E6 and E7 transcript analysis

HPV16 E6 and E7 transcripts were expressed in all 10 tumors. The alternate E6*I transcript was the most highly expressed transcript in 8 of the 10 tumors. Only in responsive tumor 1971 and recurrent tumor 0732 was the E6*I transcript less abundant than the full-length E6 transcript. The E7 transcript was expressed at levels equivalent to the E6 (full length or E6*I) transcript in all responsive tumors except one (2148). In all of the nonresponsive tumors, the E7 transcript was expressed at higher levels than any of the E6 transcripts (Figs. 1 and 2).

DIPS-PCR

All 10 HPV16-positive tumor specimens demonstrated viral integration (representative DIPS-PCR gels for responsive and recurrent tumors are shown in Supplementary Figs. S1 and S2, respectively). A total of 207 hybrid viral–cellular amplicons were isolated and sequenced: 99 amplicons generated from the responsive tumors and 108 amplicons from the recurrent tumors. The numbers of viral–cellular amplicons generated and sequenced from each tumor are listed in Supplementary Table S4. All amplicons were analyzed, and viral–host DNA fusions were identified by sequence and BLAST analysis. The sequence reads mapped to viral-only sequence or to viral–cellular hybrids, or were unmapped due to poor sequence resolution. All identifiable integrations are reported; in each tumor, multiple amplicons represented the same integration.

Analysis of integration events

Each integration into a cellular gene was confirmed on the undigested tumor genomic DNA by direct PCR and sequencing. Representations of the viral integration events are depicted in Fig. 3 and summarized in Table 2, indicating the chromosome locus, cellular gene, and the region of integration. On the basis of the integration results from DIPS-PCR, transcript analysis was used to investigate expression of the affected gene. Electrophoretic gel images of transcript amplicons are shown in Supplementary Fig. S3 and cellular gene transcript RT-PCR and sequencing results are summarized in Fig. 4 and in Supplementary Table S5.

Integration events in responsive tumors

Eleven of the 12 HPV integration events identified in the responsive tumors involved intergenic chromosome regions (Fig. 3B–F). Tumor 1733 had an HPV E2 integration into a known chromosome fragile site in 2p16 (24); tumor 1979 had three intergenic integration events (HPV E2 into 9q21, HPV L1 into 16q11.2, and another L1 into 4q27); tumor 1804 had four intergenic integration events (HPV E1 into 6q16, HPV L2 into 10p11.1, HPV E5 into 16q11.2, and HPV E2 into 16q11.2); and tumor 2148 had a single integration of HPV L2 into 7p22, a known chromosome fragile site (24).

Only responsive tumor 1971 had integration into a cellular gene. Of the three integrations in tumor 1971, two were intergenic: HPV E1 into chromosome fragile site 7p22.3 (24) and HPV L2 into 4p16.3, also a known chromosome fragile site (24). The third, HPV L1, was into intron 4 of TP63 in 3q28 (Table 2; Fig. 3E). This integration site is located within the region that codes for the DNA-binding domain of this tumor suppressor protein. Upon transcript analysis of this integration event, we found that a fusion transcript between HPV L1 and TP63 exon 4 was not produced (Fig. 4A; Supplementary Table S6). The transcript across TP63 exons 4 and 5, spanning the viral integration site in intron 4, was produced, and the sequence was in-frame. In addition, a transcript across TP63 exons 5 and 6 (outside of the integration region) was generated and was spliced in-frame.

HPV integration into recurrent tumors

The majority of the HPV integrations in the recurrent tumors were viral integrations into cellular genes. Tumor 2049 had two integration events into cellular genes, the first involving a rearrangement of HPV E1 (a duplicated region of E1 was inserted into E1 upstream of the integration) into 8q11.21, at intron 1 of UBE2V2, which codes for ubiquitin-conjugating enzyme E2 variant 2 (Table 2; Fig. 3G). A fusion transcript was generated between HPV E1 and UBE2V2. Sequence analysis of this fusion transcript revealed the entire UBE2V2 exon 1 fused to a portion of HPV L1 reading into nonsense sequence then into HPV E1, followed by a segment of chromosome 17q11.2; the distal end of the transcript amplicon included the expected region of HPV E1 (Fig. 4B; Supplementary Table S6). However, a UBE2V2 transcript across exons 1 and 2, spanning the integration site in intron 1, and the transcript outside of the integration region across exons 2 and 3 were both produced, and the sequences were spliced in-frame, suggesting that these came from a different chromosome or that the transcript was incomplete. The protein product of UBE2V2 mediates transcriptional activation of target genes, regulates cell-cycle progression and cellular differentiation, and is involved in DNA repair and cell survival after DNA damage. Deregulation of UBE2V2 expression has been reported to be associated with gastric cancer (25), and in ER-positive/HER2-negative breast cancer, UBE2V2 was linked to poor prognosis (26).

The second integration event identified in tumor 2049 was HPV E1 into 14q24.1, at intron 1 of SMOC1, the gene for SPARC-related modular calcium binding 1 (Table 2; Fig. 3G). The fusion transcript between HPV E1 and SMOC1 was sequenced and contained SMOC1 exon 1 linked to chromosome 3p23, followed by nonsense sequence (Fig. 4B, Supplementary Table S6). There were other transcripts generated across SMOC1 exons 1 and 2 (spanning the intron 1 integration) and exons 3 and 4 (outside of the integration region), but the transcript sequences did not contain any homology to SMOC1 and were determined to be nonsense sequence. As there was no intact SMOC1 transcript, it appears that SMOC1 was inactivated by the viral integration. SMOC1 codes for a secreted protein localized to the basement membrane that is involved in cellular differentiation and has been associated with brain cancer (27).

DIPS-PCR and sequencing revealed an HPV early gene rearrangement in recurrent tumor 0843, where the distal half of E6 was duplicated and joined within the E2 ORF (Fig. 3H). Viral integration in tumor 0843 was identified from HPV L2 into 2q24.3 at intron 16 of SCN2A, which codes for the voltage-gated type II sodium channel α subunit (Table 2). Transcript analysis of the HPV L2 integration that mapped to intron 16 of the cellular gene SCN2A demonstrated that no fusion transcript was created in tumor 0843 between HPV L2 and cellular SCN2A exon 17 (Fig. 4C, Supplementary Table S6). Transcript primers in exon 16 and exon 17 of SCN2A amplified a cDNA transcript generated across the integration site in intron 16, but the sequence analysis identified a portion of HPV L1 flanked on one side by the cellular gene for the ATP-binding cassette, sub-family A, member 12 (ABCA12) located on chromosome 2q34, and on the other side by an intergenic region of chromosome 1q32, indicating a very complex chromosome rearrangement involving multiple chromosomes. Furthermore, there was no transcript generated when SCN2A was queried downstream from the HPV L1 integration event, across exons 18 and 19, indicating that SCN2A was fully disrupted by this integration event. This integration takes place in the second helical transmembrane S6 region of the protein, which participates in a complex for action potential initiation and propagation in excitable cells, as well as proliferation, migration, and adhesion in nonexcitable cells (28). It has been reported that differential expression of voltage-gated sodium channels is associated with the metastatic activity of multiple malignancies, such as leukemia and prostate, breast, and lung cancer, and these ion channels are currently being investigated as targets for cancer therapies (28).

Two integration events were identified in tumor 2238, the first was composed of a rearrangement within HPV, where the L2/L1 overlapping region was inserted into the E1 ORF and inserted into 1p31.3, at intron 9 of NFIA, which codes for nuclear factor I/A (Table 2; Fig. 3I). No fusion transcript was generated between HPV L1 and NFIA (Fig. 4D; Supplementary Table S6). There was no transcript generated across NFIA exons 9 and 10, spanning the intron 9 integration, suggesting that the integration disrupted the normal NFIA transcript, but the transcript across exons 10 and 11, outside of the integration site, was produced and that sequence was in-frame. The NFIA protein product is a sequence-specific transcription factor that regulates numerous adenoviral and cellular genes and is independently proficient in activating cellular transcription and replication. It was recently reported that an investigation of acute erythroid leukemia containing t(1;16)(p31;q24) uncovered a gene fusion between NFIA/CBFA2T3 (29).

The second integration in tumor 2238 was HPV E2 into 15q21.1, at intron 4 of SEMA6D, the gene for semaphorin 6D (Table 2; Fig. 3I). No fusion transcript was generated between HPV E2 and SEMA6D, or across SEMA6D exons 4 and 5, spanning the intron 4 integration site (Fig. 4D; Supplementary Table S6). The SEMA6D transcript across exons 5 and 6, outside of the integration site, was generated but was found to be nonsense upon sequence analysis, which is consistent with disruption of SEMA6D gene expression. The product of SEMA6D is a transmembrane protein historically characterized as an axon guidance molecule but has more recently been shown to participate in differentiation, organogenesis, and angiogenesis, mediated by plexin-A1 as the major Sema6D-binding receptor (30, 31). Furthermore, it has been reported that the Sema6D/plexin-A1 complex binds VEGFR-2 to mediate survival and anchorage-independent growth of tumor cells (31, 32).

Tumor 0732 had integration from HPV E2 into an intergenic region at 10p11.1, as well as HPV L2 into 17p11.2, inserting at intron 3 of TNFRSF13B, the gene coding for a member of the TNF receptor superfamily (Table 2; Fig. 3J). Transcript analysis of this integration event revealed that no fusion transcript was generated between HPV L2 and TNFRSF13B exon 3 (Fig. 4E; Supplementary Table S6). The transcript across TNFRSF13B exons 3 and 4, spanning the viral integration site in intron 3, and the TNFRSF13B transcript across exons 4 and 5, outside of the integration site, were both generated, and both sequences were in-frame. This viral integration occurs within the region that produces the extracellular topological domain of the receptor protein, which participates in immunity by interacting with a TNF ligand. TNFRSF13B induces B-cell maturation and differentiation and activates multiple transcription factors, including NFAT, AP1, and NF-κB. It has been reported that hematologic malignancies are induced by B-cell survival and aberrant proliferation caused by dysregulated signaling by TNFRSF family members (33), but how this pathway may be involved in HNSCC is not clear.

Tumor 1040 had integration of HPV L2 into an intergenic region of 10p11.1, as well as HPV L1 into 16p11.2, at intron 3 of SH2B1, the gene for Src Homology 2B (SH2B) adapter protein 1 (Table 2; Fig. 3K). In recurrent tumor 1040, there was no fusion transcript generated between HPV L1 and cellular SH2B1 exon 3 (Fig. 4F; Supplementary Table S6). There were, however, transcripts generated across SH2B1 exons 3 and 4 (spanning the intron 3 integration site) and within exon 5 (outside of the integration region), and both sequences were in-frame. SH2B1 is a mediator protein for tyrosine kinase receptors and is involved in JAK and receptor tyrosine kinase signaling pathways.

HPV integration into the intergenic chromosome region 16q11.2 was identified three times among the responsive tumors examined: once in tumor 1769 and in two different events in tumor 1804. A second intergenic region, 10p11.1, was involved in three integration events, one each in responsive tumor 1804 and recurrent tumors 0732 and 1040. These parallels suggest that there may be sequence or structural similarities that increase the probability of viral integration into these intergenic regions.

To further assess the importance of these genes to HNSCC, we used the Oncomine database to examine the mutation, copy number, and gene fusion status of each gene harboring an integrated copy of HPV across all publicly available head and neck next-generation sequencing data, including data from the Head and Neck Cancer Genome Atlas project (HNSCC-TCGA). Importantly, each of the identified genes harbored a genomic alteration in at least one tissue sample (Supplementary Tables S6–S9). For example, the most frequently altered gene among this set, TP63, was mutated in 12 of 380 (3.2%; detailed in Supplementary Table S7), amplified in 16 of 390 (4.1%; detailed in Supplementary Table S8), and rearranged in 1 of 302 (0.3%; detailed in Supplementary Table S9) of all head and neck samples (summarized in Supplementary Table S6). Genomic amplification of the TP63 gene correlated with increased RNA expression, as it did for UBE2V2, but not for TNFRSF13B, where almost no reads supported the presence of TNFRSF13B expression (Supplementary Fig. S4). Interestingly, as observed from the sample ID numbers, the genes were altered in a mutually exclusive manner, suggesting that they may individually function as unique cancer drivers or suppressors.
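The TP63 alteration frequencies quoted above follow directly from the stated counts, as the short check below shows. The counts are those given in the text; the dictionary and function names are hypothetical.

```python
# (altered samples, samples assessed) for TP63, as stated in the text.
TP63_ALTERATIONS = {
    "mutated": (12, 380),
    "amplified": (16, 390),
    "rearranged": (1, 302),
}


def alteration_frequency(altered: int, total: int) -> float:
    """Percentage of samples carrying an alteration, to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * altered / total, 1)


frequencies = {kind: alteration_frequency(altered, total)
               for kind, (altered, total) in TP63_ALTERATIONS.items()}
```

The denominators differ because each alteration type was assessed in a different number of samples, so the three percentages are not directly additive.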

Discussion

The incidence of HPV-positive oropharyngeal cancer is rising, and there remains a lack of understanding around factors that determine or influence tumor response to treatment (1–5, 34, 35). There is significant interest in reducing treatment intensity for patients with HPV-positive tumors, but a decrease would risk the possibility of undertreating some patients who are cured by intensive concurrent chemotherapy–radiotherapy regimens (3, 36). In addition, our studies and others have found a subset of patients with HPV-positive tumors nonresponsive to concurrent chemotherapy–radiotherapy (1, 37). Thus, it is necessary to identify the differences between (i) tumors that fail current intensive multimodality treatments and require alternate therapies; (ii) those that respond to current therapies but will not respond to reduced-intensity treatment; and (iii) tumors that are highly likely to respond to reduced-intensity therapy with lower treatment morbidity. On the basis of our previous work and what is known from HPV in cervical cancer, we examined viral copy number, transcriptional activity, and viral integration of hrHPV in responsive and recurrent tumors to determine whether these factors might be useful as clinically relevant factors to predict response.

The 10 tumors studied were positive for HPV16 and negative for all other high-risk HPV types assessed. HPV copy number was established for each tumor; the ranges of viral load values were similar for responsive and recurrent tumors (16–539 copies/cell for responsive tumors, 6–298 copies/cell for recurrent tumors). It is important to note that the values obtained for viral copy number may not be exact because the tumor DNA was extracted from tissue cores that may have contained normal cells. Nevertheless, the average viral copy number for the responsive tumors (242.8 copies/cell) was more than twice that of the recurrent tumors (92.6 copies/cell). Although the small number of tumors and wide ranges of copy number values limit our ability to draw conclusions from this result, it does agree with our hypothesis based on earlier work (4) that less advanced tumors contain higher numbers of HPV copies, perhaps representing multiple episomal copies of HPV, whereas more advanced cancers have fewer copies due to loss of episomes unless the extra copies represent integrated or extrachromosomal concatenated copies of the viral genome that may or may not contribute to cancer cell proliferation (38, 39).

All of the tumors demonstrated expression of the E6 and E7 oncogenes, indicating that both the responsive and recurrent tumors are HPV driven and that the virus is not an incidental passenger to an alternate carcinogenic mechanism. Four of the five tumors in each group (responsive and recurrent) exhibited the alternate E6*I as the most abundant E6 transcript. Only one tumor from each group had the full-length E6 transcript as the most abundant oncogene transcript. E7 transcripts were also very abundant in 9 of 10 tumors, and only one had lower E7 expression than the E6*I transcript. The E6-E7 transcripts (full-length E6-E7, E6*I-E7, and E6*II-E7) are produced as polycistronic mRNAs derived from the first p97 promoter. The E6 oncoprotein is translated from the full-length E6-E7 transcript, and E7 is translated from the spliced E6*I-E7 transcript (7, 8, 40). This suggests that the tumors with more abundant full-length E6 transcripts would have higher levels of the E6 oncoprotein, while the tumors with more abundant E6*I transcripts would produce higher levels of the E7 oncoprotein (7, 8, 40).

All of the tumors evaluated exhibited HPV16 integration into the cellular genome. We suspect that in most cases integration occurs into intergenic regions that account for greater than 90% of the genome. Although the number of viral integrations that we detected and identified varied among the tumors, it is important to acknowledge that the DIPS-PCR method may not detect all integration events, and so the number of integration events detected by this method cannot be used as a prognostic measure as suggested in Liu and colleagues (38). Integration events in both responsive and recurrent tumors demonstrated forward and reverse orientations into the cellular genome. This is consistent with the “looping” model of viral integration described recently by Akagi and colleagues (19) or with other rearrangement mechanisms. In each of the responsive tumors, at least one viral integration event was identified into intergenic regions known to be chromosome fragile sites (2p16, 7p22, and 4p16 in tumors 1733, 1971, and 2148) or into intergenic regions that were common to more than one tumor (16q11.2 and 10p11.1 in tumors 1769 and 1804). These results suggest that such integrations are not always entirely random and that viral integration is more likely to occur in gene-poor regions of the cellular genome that are already unstable or into regions with sequence or structural characteristics that favor integration. Zhang and colleagues (41) analyzed 14 publications and concluded that in cervical cancers, HPV integration showed a preference for intragenic areas and transcriptionally active regions of the human chromosomes.

We postulate that the effect of intergenic viral integrations seen in both the responsive and recurrent tumors relates to the primary mechanism of HPV-driven carcinogenesis through disruption of the E6 and E7 transcriptional repressor E2, leading to upregulated and unopposed expression of the E6 and E7 oncogenes. Consistent with this idea, Parfenov and colleagues (18) showed that the most common integration events among the HPV-positive TCGA head and neck tumors occurred within the HPV E1 gene disrupting the polycistronic early region transcripts of E1 and E2. Viral integration into intergenic and chromosome fragile sites occurs frequently in cervical cancer (11–15), resulting in disruption of E2 and enhanced expression of E6 and E7 (9, 42, 43).

One of the responsive tumors also had integration into TP63. Interestingly, we and others (19, 20) observed integration of HPV E2 into distal TP63 in the HNSCC cell line UM-SCC-47. Viral integration into TP63 has also been reported in cervical cancer, and susceptibility to integration into this gene may be due to short segments of homologous sequence shared by HPV E1 and chromosome 3q28 within the TP63 gene (14). TP63 belongs to the p53 family of tumor suppressor genes and is a sequence-specific DNA-binding transcriptional repressor and activator. The p63 protein participates in TGFβ and WNT signal transduction as well as differentiation and cell-cycle regulation (44), and as such, HPV integration into the TP63 gene could disrupt these processes and may result in increased proliferation. In contrast to the case with HPV16 integration into UM-SCC-47, in which fusion transcripts were produced, we found no fusion transcript of reverse L1 into TP63 in tumor 1971. In fact, we did find an intact TP63 exon4–exon5 transcript. Thus, TP63 may not have been disrupted by the HPV insertion in tumor 1971.

The integration analysis of the recurrent tumors revealed viral integrations into cellular genes in each case. Alterations of cellular genes as a consequence of viral integration may provide a second mechanism of oncogenesis in HNSCC. Cellular gene disruption caused by viral integration has been reported in rare cases of malignant transformation by low-risk HPV types that lack E6 and E7 oncogenic activity (45, 46). Thus, viral integration events into genic regions cause disruptions that are likely to alter cellular gene expression and mediate additional carcinogenic mechanisms, resulting in a more aggressive tumor phenotype. Parfenov and colleagues (18) noted that the majority of the HPV breakpoints they analyzed colocalized with somatic copy number variants. In the current study, not only was integration into a cellular gene identified in every recurrent tumor, but each of the genes disrupted by viral integration (TNFRSF13B, UBE2V2, SCN2A, SH2B1, SMOC1, NFIA, and SEMA6D) is also involved in a pathway or mechanism that is related to cancer or is differentially expressed in some cancers (Supplementary Tables S6–S8).

Transcription analysis across the integration sites showed that the effects of integration vary from tumor to tumor and, in some cases, are associated with very complex genomic rearrangements. In the five recurrent tumors, there were seven integration events into cellular genes. In three of the seven events (TNFRSF13B, UBE2V2, and SH2B1), intact transcripts were detected both across the integration site and elsewhere in the gene. In all of these cases, the integration was intronic, and it is possible that the gene was spliced in-frame across the integration, eliminating the virus from the intron without disrupting the local transcript. However, it is also possible in these cases that the intact transcripts come from a different copy of the gene. Whether this results in reduced gene dosage or whether the full transcript is intact is not known. It is important to note that while intact UBE2V2 transcripts were identified both across the integration site and elsewhere in the gene, a fusion transcript between HPV E1 and UBE2V2 was also generated. Sequence analysis of this fusion transcript demonstrates the severity of chromosome disorder in the cellular genome, as well as within the viral genome, with rearrangements resulting in production of a transcript containing exon 1 of UBE2V2 (located on chromosome 8), HPV E1, nonsense sequence, HPV L1, and an intergenic region of chromosome 17q11.2.

In the remaining four integration events, viral integrations affecting SCN2A, SMOC1, NFIA, and SEMA6D resulted in disruption of gene transcription. In two of these cases, genomic instability was again demonstrated by complex chromosome rearrangements involving the viral integration site. In tumor 0843, a transcript spanning the integration site in SCN2A was found to involve HPV L1, a portion of chromosome 2q34 (including part of the ABCA12 gene), and an intergenic region of chromosome 1q32. In tumor 2049, a fusion transcript generated between HPV E1 and SMOC1 included both exon 1 of SMOC1 (located on chromosome 14) and a region of chromosome 3p23.

Viral integration into a gene does not inevitably cause loss of gene expression; we have shown that transcription of some or all of the genes can persist, possibly from additional, unaltered copies of the gene, or by splice removal of intron-integrated virus. Likewise, detection of portions of gene transcripts does not necessarily indicate that full-length transcripts are intact and that the appropriate protein is produced. In recurrent tumor 0843, a complex fusion transcript across the HPV insertion in exon 16 of SCN2A was produced, but a downstream transcript across exons 18–19 was not. We cannot eliminate the possibility that the gene transcripts that were found were incomplete, inactive, or otherwise defective. Full transcript analysis will help to provide a better understanding of the effect of viral integration on gene expression and function.

Upregulation of cellular genes is a possible consequence of viral integration as well, either through disruption of transcriptional repression, generation of fusion transcripts, or other mechanisms. Viral integration can both result from genomic instability and contribute to genomic instability. Oncogenic activities of E6 and E7 promote instability through unregulated cellular proliferation and aberrant progression through the cell cycle, bypassing important checkpoints for genomic integrity. These effects of HPV E6 and E7 allow the unstable cell to incur viral integration, resulting in increased viral oncoprotein expression (through disruption of E2). Furthermore, prolonged suppression of p53 function and associated aberrant checkpoint function can cause further chromosomal damage (47). The process of HPV integration into the cellular genome appears to be highly clastogenic, in some cases leading to additional dsDNA breaks, resulting in further rearrangement of the viral and cellular genomes. A recent publication supports this postulate (19). As tumor cells progressively acquire chromosome rearrangements from oncogenic processes, the genome becomes more disorganized and abnormal (47–49).

The limitations of the DIPS-PCR method restrict detection of cellular integration sites to those that have a restriction site in relatively close proximity. Viral rearrangement or convoluted integrations (multiple concatenated copies, alternate orientations) can reduce the sensitivity of the method and increase the complexity of analyzing the results as described in the rolling loop integration model by Akagi and colleagues (19). Nevertheless, identification of cellular genes affected by viral integration in all five recurrent tumors, together with detection of rearranged chromosomes, demonstrates the validity of the approach and the extent of cellular disorder present in the nonresponsive and recurrent tumor cells.

Our evaluation of hrHPV transcriptional activity and integration in these tumors provides support for the hypothesis that viral integration analysis may identify additional virus-induced gene disruptions. Clearly, much more work is needed before we fully understand the impact of integration sites on tumor behavior. Eventually, locating cellular genes with viral integration and assessing subsequent alterations in cellular expression may be included in a schema of factors for assessing patients for reduced, alternate, or targeted therapy.

Grant Support

This study was supported by NIH-NCI Head and Neck SPORE P50 CA097248, NIH-NIDCR R01 DE019126, NIH-NCI P30 Cancer Center Support Grant P30 CA46592, NIH-NIDCD P30 DC05188, and NIH-NCI R01 CA194536. H.M. Walline was supported by the Cancer Biology Training Grant NIH-NCI T32 CA09676 and the Eleanor Lewis Scholarship through the University of Michigan Rackham Graduate School.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Acknowledgments

The authors acknowledge the assistance of the University of Michigan DNA core, especially Dr. Robert Lyons and Ellen Pedersen. We also thank the members of the University of Michigan Head and Neck SPORE team, as well as our nurses and clinical colleagues, including Drs. Vasu Divi, Lance Oxford, and Theodoros Teknos, who are now at other institutions. Most of all, we thank our patients who so graciously allowed us to study their tissues and clinical information.

Footnotes

Note: Supplementary data for this article are available at Molecular Cancer Research Online (http://mcr.aacrjournals.org/).