Abstract

Background

Chromosomal and genomic aberrations are common features of human cancers. However, chromosomal numerical and structural aberrations, breakpoints and disrupted genes have yet to be identified in esophageal squamous cell carcinoma (ESCC).

Methods

Using multiplex-fluorescence in situ hybridization (M-FISH) and oligo array-based comparative genomic hybridization (array-CGH), we identified aberrations and breakpoints in six ESCC cell lines. Furthermore, we detected recurrent breakpoints in primary tumors by dual-color FISH.

Conclusions

The combination of M-FISH and array-CGH helps produce more accurate karyotypes. Our data provide significant, detailed information for appropriate uses of these ESCC cell lines in cytogenetic and molecular biological studies. The aberrations and breakpoints detected in both the cell lines and primary tumors will help identify affected genes involved in the development and progression of ESCC.

Background

Chromosomal and genomic rearrangements are significant features of malignant human tumors. Rearrangements are often associated with structural aberrations, such as translocations, insertions and inversions. They can also result in copy number alterations (CNAs) [1, 2]. Characterizing rearrangements, and the genes affected by the resulting aberrations and breakpoints, may help us better understand tumor development and progression.

The products and implications of chromosomal rearrangements (e.g., fusion genes, truncated genes, and gene dysregulation by ectopic promoters) have been described in leukemia, lymphoma, sarcomas, and epithelial cancers [3, 4]. It was initially difficult to detect chromosomal rearrangements and affected genes in epithelial cancers, mainly because of the technical difficulty of preparing metaphase spreads from primary epithelial tumors and the complexity of their karyotypes. More recently, multiple gene rearrangements, and even genomic landscapes that reflect structural aberrations throughout the genome, have been identified in several types of epithelial cancer, including prostate cancer [5, 6], breast cancer [7, 8], lung cancer [9, 10], colorectal cancer [11], gastric cancer [12], head and neck cancer [13], and hepatocellular carcinoma [14].

Recently, it has been reported that recurrent rearrangements can affect genes at the boundaries of CNAs [2, 15]; thus, recurrent breakpoints might be important for screening and identifying frequent unbalanced rearrangements and the genes involved. Multiplex-fluorescence in situ hybridization (M-FISH) [16] and spectral karyotyping (SKY) [17] were designed to replace traditional G-banding in chromosomal analyses of tumor cells, but the resolution of these techniques is not sufficient to detect small rearrangements. Array-based comparative genomic hybridization (array-CGH) was developed to analyze CNAs, including genomic gains, losses, amplifications and deletions [18, 19]. It was recently demonstrated that array-CGH can be used to identify unbalanced breakpoints of rearrangements in many types of cancer cells at a potentially higher resolution [20–24]. Array-CGH has also been used, in combination with cytogenetic information, to determine the breakpoints in reciprocal translocations [25].

Esophageal cancer (EC) is a common malignant epithelial cancer worldwide, causing more than 400,000 deaths each year [26]. The most prevalent type of EC is esophageal squamous cell carcinoma (ESCC), and China is among the highest-risk areas [26, 27]. Recently, our group reported the karyotypes of the ESCC cell lines KYSE180 [28] and KYSE450 [29] by 12-color M-FISH, as well as the karyotype of KYSE410-4 by 6-color M-FISH [30]. CGH [31–34], SKY and CGH [35], and array-CGH [36–38] experiments have also been performed by other groups on ESCC cell lines and primary tumors. These studies have revealed numerical and structural chromosomal aberrations. However, the genomic rearrangements, breakpoints and genes involved in ESCC remain to be clarified.

This study aimed to identify candidate recurrent breakpoints that might affect genes at or near CNA boundaries. We describe CNAs and unbalanced genetic rearrangements in six ESCC cell lines through a combination of M-FISH and 44K array-CGH. We found recurrent breakpoint regions in the cell lines, and breakage of several of these regions was also present in primary ESCC tumors, which may contribute to the disruption of critical genes.

Methods

Cell lines and sample collection

ESCC cell lines KYSE30, KYSE150, KYSE180, KYSE450, KYSE510 and YES2 were kindly provided by Yutaka Shimada (Kyoto University, Japan). KYSE150 and KYSE510 were established from female patients, and KYSE30, KYSE180, KYSE450 and YES2 were from male patients. Each cell line was cultured in RPMI-1640 (Invitrogen, USA) supplemented with 10% fetal calf serum (FCS). ESCC tissue samples were procured from the Chinese Academy of Medical Sciences Cancer Hospital. All samples used in this study were residual specimens collected after diagnostic sampling. None of the patients had received treatment before surgery, and all signed separate informed consent forms for the sampling and molecular analyses. This study was approved by the Ethics Committee/IRB of Cancer Institute (Hospital), PUMC/CAMS.

Metaphase chromosomes and interphase cell nuclei preparations

Metaphase chromosomes from ESCC cell lines and normal peripheral blood lymphocytes were harvested after incubation with 0.04 μg/ml Colcemid (Invitrogen) at 37°C for 1-2 hours, followed by treatment with a hypotonic solution (0.075 mol/L KCl) for 30 minutes and three successive changes of the fixative solution (methanol/acetic acid, 3:1). ESCC tissue samples were cut into small pieces in phosphate-buffered saline (PBS), and the interphase nuclei were then prepared following the procedures described above. Metaphase chromosomes and interphase cell nuclei in suspensions were stored at 4°C overnight and then stored at -20°C until use. The nuclear suspensions were dropped onto clean slides and aged at room temperature for 2-3 days prior to the FISH experiments.

The slides for M-FISH and dual-color break-apart FISH analyses were pretreated with RNase A (100 mg/ml in 2 x saline sodium citrate [SSC]) and pepsin (50 mg/ml in 0.01 mol/l HCl). The slides were subsequently denatured in 70% formamide/2 x SSC at 73°C-75°C for 3 minutes, quickly cooled with two rinses of 2 x SSC at 4°C, dehydrated in a graded ethanol series (75%, 85% and 100%), and air dried. The labeled probes were precipitated, redissolved in the hybridization solution (50% formamide, 10% dextran sulfate, 1% Tween-20, 2 x SSC), denatured at 75°C for 8 minutes, and quick-chilled on ice for 2 minutes. Hybridization was performed in a humid chamber at 37°C for 24-48 hours. Post-hybridization washes were performed in 50% formamide/2 x SSC for 15 minutes at 43°C, followed by two washes of 3 minutes each in 2 x SSC. The slides were dehydrated in 75%, 85% and 100% ethanol, air dried, counterstained with 4′,6-diamidino-2-phenylindole (DAPI) (1 mg/ml) and covered with coverslips.

For 12-color FISH analysis [28], metaphase spreads were hybridized twice as previously described, a procedure referred to as two-round FISH. After digital fluorescence images had been acquired, the coverslips were removed by dipping the slides in 100% ethanol for 30 min. The slides were then washed twice in 100% ethanol for 3 min each, air dried, and denatured again following the procedure described above.

Genomic DNA from ESCC cell lines was isolated using DNeasy Blood & Tissue Kit (Qiagen, Germany). Genome-wide copy number studies were then performed using an Agilent 44K oligo array platform (Agilent Technologies, USA), with sex-matched normal human DNA (Promega Corporation, USA) used as the reference. Briefly, 1 μg samples of the tested and reference DNA were digested with AluI and RsaI, and differentially labeled with Cy3-dUTP and Cy5-dUTP using Agilent Genomic DNA Enzymatic Labeling Kit Part Number 5190-0449 (Agilent Technologies), respectively. Then Microcon YM-30 (Millipore) was used to clean up the labeled probes. Tested and reference DNA probes were combined and hybridized onto the microarrays enclosed in Agilent SureHyb-enabled hybridization chambers for 40 hours. After hybridization, slides were washed sequentially and scanned with an Agilent DNA Microarray Scanner. Annotations for the probes were based on UCSC hg18 (NCBI Build 36). CNAs and breakpoint data were analyzed via the Agilent Genomic Workbench Software 5.0, set to use the ADM-2 algorithm, an aberration threshold of 5.0 and an absolute average log2 ratio ≥ 0.5.

Statistical analysis

Statistical analyses were carried out using the SPSS 17.0 software package. Associations between splitting of breakpoint regions and clinico-pathological characteristics were assessed by the χ2 test, Fisher’s exact test or the Kruskal–Wallis test. Logistic regression analysis was performed to determine the independent predictors of lymph node metastasis. P values < 0.05 were considered significant.

Results

Copy number alterations

M-FISH was performed on the metaphase chromosomes of four ESCC cell lines (KYSE30, KYSE150, KYSE510 and YES2). Modal karyotypes of the cell lines are shown in Figure 1. M-FISH karyotypes of the two other cell lines, KYSE180 [28] and KYSE450 [29], have been previously reported by our laboratory. We found multiple numerical alterations in the six cell lines, which exhibited a high level of aneuploidy. An overview of CNAs indicated that imbalances occurred throughout the entire genome of the cell lines. Gains were observed at 1p, 1q, 3p, 3q, 4p, 5q, 7p, 8q, 9q, 11q, 14q, 16p, 16q, 17p, 17q, 18p, 19p, 19q, 20q, and 22q. Losses were primarily detected at 3p, 4p, 4q, 6p, 6q, 9p, and 18q.

Figure 1

M-FISH profiling of ESCC cell lines. A 24-color analysis technique was used for KYSE30 (A), KYSE150 (B), and YES2 cells (C) and 12-color detection was used for KYSE510 cells (D).

Detailed CNAs of these cell lines were detected by array-CGH, and the profiles of gains and losses are shown in Figure 2 and Additional file 1: Table S1. Our results were compared with the data available from the Cancer Cell Line Project on the Wellcome Trust Sanger Institute Cosmic website (http://www.sanger.ac.uk/genetics/CGP/cosmic). The copy number data for KYSE150, KYSE450 and KYSE510 on the website were generated using Affymetrix SNP6.0 arrays. Copy number profiles derived from our Agilent 44K platform were very similar to those from the Affymetrix platform. We then compared CNAs among the six cell lines according to the array-CGH data; frequent gains and losses found in at least two cell lines are summarized in Table 1. More gains were found than losses. The results were combined with the data from 17 other ESCC cell lines available on the Cosmic website, including KYSE70, KYSE140, KYSE270, KYSE410, KYSE520, Colo-68N, EC-GI-10, HCE-4, TE-1, TE-5, TE-6, TE-8, TE-9, TE-10, TE-11, TE-12 and TE-15. The gains with high frequencies are shown in Additional file 2: Table S2.

Figure 2

CNAs and unbalanced breakpoints in six ESCC cell lines detected by array-CGH. Gains and amplifications are presented as lines on the right side of the chromosomes, while the lines for losses and deletions are on the left side. Unbalanced breakpoints are at the boundaries of CNAs. Numbers above the lines indicate the cell lines. 30: KYSE30, 150: KYSE150, 180: KYSE180, 450: KYSE450, 510: KYSE510, 2: YES2.

Table 1

Frequent gains and losses in six ESCC cell lines analyzed by array-CGH

Regions with average log2 ratio values greater than 1 were defined as amplifications. High-level amplifications (HLAs) and homozygous deletions (HDs) were called when the absolute average value was at least 2. Based on the positions of the HLA and HD boundaries, the smallest HLA and HD regions and the genes involved in these cell lines are listed in Table 2. HLAs included 7p11 (2/6, 33%), 8q24.21 (2/6, 33%) and 11q13.3-q13.4 (3/6, 50%), which harbor several oncogenes, including EGFR, MYC and CTTN (Table 2). Homozygous deletion of 9p21.3, containing tumor suppressor genes CDKN2A, CDKN2B and CDKN2B-AS1, occurred in 67% (4/6) of the cell lines.
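The cut-offs quoted in the Methods and in the paragraph above can be read as a simple classification rule. The following is a minimal sketch, assuming that the reported thresholds (absolute average log2 ratio ≥ 0.5 for a CNA, > 1 for an amplification, and an absolute value of at least 2 for HLAs and HDs) are applied directly to the average log2 ratio of an already segmented region; the function name is hypothetical and the sketch does not reproduce the ADM-2 algorithm itself.

```python
def classify_segment(mean_log2_ratio: float) -> str:
    """Label a segment by its average log2 ratio (test vs. reference DNA).

    Thresholds follow the values stated in the text; applying them in this
    exact order is an assumption made for illustration.
    """
    if mean_log2_ratio >= 2.0:
        return "high-level amplification"
    if mean_log2_ratio > 1.0:
        return "amplification"
    if mean_log2_ratio >= 0.5:
        return "gain"
    if mean_log2_ratio <= -2.0:
        return "homozygous deletion"
    if mean_log2_ratio <= -0.5:
        return "loss"
    return "neutral"


# Illustrative values only; these are not measurements from the study.
for value in (2.4, 1.3, 0.6, 0.1, -0.8, -2.5):
    print(f"{value:+.1f} -> {classify_segment(value)}")
```

Checking the highest thresholds first keeps the categories mutually exclusive, so a segment with an average log2 ratio of 2.4 is reported once, as a high-level amplification, rather than also as a gain.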

Table 2

High level amplifications and homozygous deletions in different cell lines

HLA: high level amplification; HD: homozygous deletion; the Start and Stop positions are annotated by hg18.

a Regions of amplifications or deletions in at least two cell lines.

Unbalanced breakpoints

Breakpoints were defined as the boundaries between two adjacent DNA fragments with significantly different log2 ratio values, reflecting different copy numbers. Using this scheme, 261 candidate unbalanced breakpoints were identified (Additional file 3: Table S3). Among these candidates, 39 occurred in the centromeric regions, and the other 224 were present on chromosome arms. Fifty-seven of the arm breakpoints were localized in the vicinity of fragile sites. Breakpoints on chromosome arms and the copy number status of the regions on both sides of the breakpoints are listed in Additional file 3: Table S3. When the cell lines were ranked according to the number of breakpoints, the top three were KYSE30, KYSE510 and YES2, in that order. This trend was similar to that seen in the M-FISH results.
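The breakpoint definition above can be illustrated with a short sketch. It assumes that segmentation has already been done (in the study this was handled by the Agilent software) and simply reports the interval between the last probe of one segment and the first probe of the next whenever their average log2 ratios differ. The `Segment` type, the `min_delta` parameter and the coordinates are hypothetical and serve only as an illustration.

```python
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    chrom: str        # chromosome name
    first_probe: int  # position of the first probe in the segment
    last_probe: int   # position of the last probe in the segment
    mean_log2: float  # average log2 ratio of the segment


def cn_state(mean_log2: float) -> str:
    """Copy number label using the thresholds quoted in the text (assumed)."""
    if mean_log2 <= -2.0:
        return "Del"
    if mean_log2 <= -0.5:
        return "Loss"
    if mean_log2 > 1.0:
        return "Amp"
    if mean_log2 >= 0.5:
        return "Gain"
    return "Neutral"


def call_breakpoints(
    segments: List[Segment], min_delta: float = 0.3
) -> List[Tuple[str, int, int, str]]:
    """Return (chrom, interval start, interval end, left/right CN status).

    `segments` must be sorted by chromosome and position. `min_delta` is a
    hypothetical stand-in for the significance test of the real algorithm.
    """
    breakpoints = []
    for left, right in zip(segments, segments[1:]):
        if left.chrom != right.chrom:
            continue  # a chromosome end is not a breakpoint
        if abs(left.mean_log2 - right.mean_log2) < min_delta:
            continue  # copy numbers too similar to call a boundary
        status = f"{cn_state(left.mean_log2)}/{cn_state(right.mean_log2)}"
        breakpoints.append((left.chrom, left.last_probe, right.first_probe, status))
    return breakpoints


# Made-up coordinates and ratios, for illustration only.
example_segments = [
    Segment("11", 100, 500, 0.1),
    Segment("11", 650, 900, 2.3),
    Segment("11", 1000, 1500, -0.7),
    Segment("12", 50, 400, -0.7),
]
example_breakpoints = call_breakpoints(example_segments)
```

Because the exact breakpoint lies somewhere between two probes, the sketch reports an interval rather than a single position, which is also how the breakpoint intervals are presented in Table 4.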

Chromosomal structural aberrations

M-FISH results for four cell lines (KYSE30, KYSE150, KYSE510 and YES2), together with those for the two previously reported cell lines (KYSE180 and KYSE450), showed that a total of 156 derivative chromosomes resulted from translocations, most of which were unbalanced; only 12.8% (20/156) were reciprocal. Approximately 35% of the translocation derivative chromosomes were fused at the centromeric regions. Chromosomes 1, 2, 3, 5, 6, 7, 8, 9, 11, 12, 14, 15 and X were frequently rearranged. Combining M-FISH with array-CGH, we further characterized multiple rearrangements present in these cell lines (Table 3). KYSE30 is the cell line with the most complex rearrangements, and the array-CGH results also indicated that many more breakpoints were present in KYSE30 than in the other cell lines, which is consistent with the M-FISH results.

Table 3

Chromosomal structural aberrations analyzed by a combination of M-FISH and array-CGH

Breakpoint positions were compared among the cell lines, and the distances between neighboring breakpoint regions were calculated. Five regions (11q13.4, 9p21.3, 15q25.3, 3q28 and 10q26.3) had breakpoints less than 1 Mb apart, and twelve (11q13.3, 4p13, 11p13, 8q24.21, 2q35, 1q31.1, 21q21.1, 9p21.3, 18q12.2, 3p14.2, 3q12.1-q12.2 and 6p12.3-p12.2) had breakpoints less than 2 Mb apart in different cell lines (Table 4). For example, breakpoints at 11q13.4 were detected in KYSE30 and KYSE510, while breakpoints at 11q13.3 were detected in KYSE30 and KYSE180 (Figure 3A). These three cell lines showed gain of 11q and amplification of 11q13, in which the copy number of the region between the centromere and the 11q13 amplicon was higher than that of the region distal to the amplicon. Losses of the region distal to 11q13 were also found, as del(11q13.4-qter) in KYSE30 and as del(11q14.3-q21) and del(11q22.3-qter) in KYSE510. Translocations of highly amplified regions were also observed (Table 3).

Table 4

Recurrent breakpoint regions analyzed by array-CGH in ESCC cell lines

| Region | Cell line | CN status^a | BP interval (hg18) in array-CGH | Distance between cell lines^b | Genes^c | CFSs |
|---|---|---|---|---|---|---|
| 9p21.3 a | KYSE180 | Neutral/Del | 21958099-21968346 | 115.1 kb | C9orf53§, CDKN2A*, CDKN2B†, CDKN2B-AS1† | FRA9C |
| | KYSE450 | Neutral/Del | 21853263-21968346 | | MTAP§, C9orf53§, CDKN2A* | |
| | KYSE510 | Loss/Del | 21958099-21968346 | | C9orf53§, CDKN2A*, CDKN2B†, CDKN2B-AS1† | |
| | YES2 | Neutral/Del | 21853263-21957548 | | MTAP§, C9orf53†, CDKN2A† | |
| 15q25.3 | KYSE450 | Neutral/Gain | 86361096-86429254 | 154.2 kb | LINC00052§, NTRK3*, MRPL46†, MRPS11† | |
| | KYSE510 | Neutral/Gain | 86275066-86429254 | | LINC00052§, NTRK3*, MRPL46†, MRPS11† | |
| 11q13.4 | KYSE30 | Amp/Loss | 70964483-71305189 | 340.1 kb | KRTAP5-11§, FAM86C1, DEFB108B, RNF121† | FRA11H |
| | KYSE510 | Amp/Neutral | 70964483-71305189 | | KRTAP5-11§, FAM86C1, DEFB108B, RNF121† | |
| 3q28 | KYSE180 | Gain/Gain | 191171376-191222891 | 517.0 kb | TP63§, LEPREL1*, CLDN1† | |
| | KYSE510 | Neutral/Gain | 191610761-191688399 | | CLDN1§, CLDN16, TMEM207†, IL1RAP† | |
| 10q26.3 | KYSE30 | Neutral/Gain | 133045086-133476780 | 801.8 kb | TCERG1L§, PPP2R2D† | |
| | YES2 | Neutral/Gain | 133795639-133846905 | | BNIP3§, JAKMIP3*, DPYSL4† | |
| 11q13.3 | KYSE30 | Amp/Amp | 69339391-69569221 | 1.05 Mb | FGF3§, ANO1†, FADD† | FRA11H |
| | KYSE180 | Amp/Amp | 70182767-70386856 | | CTTN§, SHANK2*, DHCR7† | |
| 4p13 | KYSE150 | Amp/Loss | 41832777-42109513 | 1.15 Mb | SLC30A9§, BEND4, SHISA3†, ATP8A1† | |
| | KYSE510 | Neutral/Loss | 40955943-41226036 | | UCHL1§, LIMCH1*, PHOX2B† | |
| 11p13 | KYSE510 | Neutral/Gain | 33107818-33136537 | 1.2 Mb | TCP11L1§, PIGCP1§, CSTF3*, HIPK3† | |
| | YES2 | Neutral/Gain | 34278741-34307224 | | NAT10§, ABTB2*, CAT† | |
| 8q24.21 | KYSE450 | Amp/Neutral | 129216964-129574570 | 1.23 Mb | FAM84B§, POU5F1B, MYC†, PVT1† | |
| | KYSE510 | Amp/Loss | 129972316-130159085 | | FAM84B§, POU5F1B, MYC†, PVT1† | |
| | YES2 | Amp/Neutral | 130159144-130451718 | | far from genes, PVT1§, GSDMC† | |
| 2q35 | KYSE30 | Loss/Neutral | 217432472-218386863 | 1.37 Mb | DIRC3§, TNS1, CXCR2P1† | |
| | KYSE150 | Loss/Gain | 218517852-218801703 | | TNS1§, CXCR2P1, CXCR2, CXCR1, HMGB1P9, ARPC2† | |
| 1q31.1 | KYSE150 | Gain/Loss | 186080345-186315797 | 1.40 Mb | PLA2G4A§, FAM5C† | FRA1K |
| | KYSE450 | Gain/Neutral | 184912220-185051701 | | PTGS2§, PLA2G4A† | |
| 21q21.1 | KYSE30 | Gain/Neutral | 17060792-17145790 | 1.48 Mb | USP25§, C21orf34§, CXADR† | |
| | YES2 | Gain/Neutral | 18435266-18540695 | | CHODL-AS1§, CHODL†, TMPRSS15† | |
| 9p21.3 b | KYSE180 | Del/Neutral | 21999029-22136626 | 1.45 Mb | CDKN2A*, CDKN2B, CDKN2B-AS1, DMRTA1† | FRA9C |
| | KYSE450 | Del/Neutral | 21980581-21993651 | | CDKN2A*, CDKN2B, CDKN2B-AS1*, DMRTA1† | |
| | KYSE510 | Del/Loss | 22992377-23425976 | | DMRTA1*, ELAVL2† | |
| | YES2 | Del/Neutral | 21999029-22136626 | | CDKN2B, CDKN2B-AS1, DMRTA1† | |
| 18q12.2 | KYSE450 | Gain/Loss | 33583906-33747373 | 1.64 Mb | CELF4§, LOC647946† | FRA18A |
| | KYSE510 | Amp/Loss | 32107441-32200063 | | MOCOS§, FHOD3, C18orf10† | |
| 3p14.2 | KYSE150 | Gain/Loss | 58573676-58887412 | 1.69 Mb | FAM107A§, FAM3D, C3orf67, FHIT† | FRA3B |
| | KYSE450 | Neutral/Del | 59933661-60267262 | | FHIT# | |
| 3q12.1-q12.2 | KYSE180 | Neutral/Amp | 100190484-100877203 | 1.86 Mb | TFG§, ABI3BP*, IMPG2† | |
| | KYSE450 | Neutral/Gain | 102009730-102076392 | | DCBLD2§, COL8A1† | |
| 6p12.3-p12.2 | KYSE150 | Loss/Amp | 51932658-52161439 | 1.9 Mb | PKHD1§, IL17A†, MCM3† | |
| | YES2 | Gain/Gain | 50261630-50627364 | | DEFB112§, TFAP2D† | |

a Copy number status on the left and right side of the breakpoint regions. CN: copy number, Amp: amplification, Del: deletion.

b The distance between two outermost breakpoints of all the different cell lines.

c These genes are located at or close to breakpoints in each cell line. “*”: Obvious breakpoints were detected inside of genes. “§” and “†”: Genes at the left and right side of the breakpoint regions, respectively. Genes that are not labeled are located in the breakpoint regions, but positions of the exact breakpoints are not determined. “#”: Genes with an inside homozygous deletion (HD), and thus might also be disrupted.
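The distance described in footnote b can be sketched as follows, assuming that it is measured from the smallest interval start to the largest interval end among the cell lines sharing a region. The intervals are copied from Table 4; the function names and the two-tier labelling are hypothetical and only mirror the 1 Mb and 2 Mb limits used in the text.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # (start, end) of a breakpoint interval in hg18

# Breakpoint intervals copied from Table 4, one entry per cell line.
TABLE4_INTERVALS: Dict[str, List[Interval]] = {
    "9p21.3 a": [
        (21958099, 21968346),  # KYSE180
        (21853263, 21968346),  # KYSE450
        (21958099, 21968346),  # KYSE510
        (21853263, 21957548),  # YES2
    ],
    "3q28": [(191171376, 191222891), (191610761, 191688399)],
    "10q26.3": [(133045086, 133476780), (133795639, 133846905)],
}


def outermost_distance(intervals: List[Interval]) -> int:
    """Distance in bp between the two outermost breakpoint positions."""
    return max(end for _, end in intervals) - min(start for start, _ in intervals)


def recurrence_class(distance_bp: int) -> str:
    """Group a region by the distance limits mentioned in the text."""
    if distance_bp < 1_000_000:
        return "< 1 Mb"
    if distance_bp < 2_000_000:
        return "< 2 Mb"
    return "not recurrent"


for region, intervals in TABLE4_INTERVALS.items():
    d = outermost_distance(intervals)
    print(f"{region}: {d / 1000:.1f} kb ({recurrence_class(d)})")
```

Under this assumption the three regions give 115.1 kb, 517.0 kb and 801.8 kb, which agrees with the distances listed for them in Table 4.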

Figure 3

11q13.3 and 11q13.4 are recurrent breakpoint regions in both the ESCC cell lines and primary tumors. (A) Breakpoints in 11q13.3 and 11q13.4 in the cell lines detected by array-CGH. Breakpoints fall between the probes that displayed significant copy number discrepancies (blue arrows). Amplifications are indicated with red signals. (B) The 1-Mb BAC clones that are used in dual-color FISH experiments are shown in the ideogram. Also, 11q13.3 and 11q13.4 are divided into smaller regions according to the positions of BAC DNA clones: 11q13.3(1) (from NONSC16D6 to Cancer_1D11), 11q13.3(2) (from Cancer_1D11 to NONSC2E5) and 11q13.4(1) (from NONSC2E5 to NONSC3C5). (C) Detection of 11q13.3-q13.4 breakpoint regions by dual-color FISH in ESCC tumors. BAC DNA probes at two sides of the breakpoint regions are labeled with Cy3-dUTP (red) and Green-dUTP (green). The BAC clones used for each region are listed beside the panel. Two examples of positive tumors are shown for each region. “Tumor 1” and “Tumor 2” samples shown for different regions may not be from the same cases. Most splitting-positive nuclei exhibited amplifications on one side, even high-level amplifications, revealing the breakages between red and green signals. Normal: peripheral blood from normal persons.

Genes that might be interrupted by the recurrent breakpoints in each cell line are listed in Table 4. Ten of these common breakpoint regions were localized in the vicinity of fragile sites. Genes with internal breakpoints in these cell lines included CDKN2A, LEPREL1, JAKMIP3, LIMCH1, CSTF3, ABTB2, CDKN2B-AS1, FHIT and ABI3BP. For these genes, one breakpoint could be detected. Small HDs were also observed inside some genes, resulting in two breakpoints, as in the FHIT gene in KYSE450. Other genes flanking or close to the boundaries might also be influenced by the breakpoints.

To determine whether the genomic aberrations found in these cell lines are also present in primary tumors, we first tested a small sample of 15 ESCC tumors by dual-color FISH. This analysis revealed splitting of regions 11q13.3-q13.4, 9p21, 15q25.3 and 3q28, which showed the highest frequency of disruption in the cell lines. Splitting of these regions occurred in 5, 1, 2 and 3 out of 15 tumors, respectively. We also examined online data for ESCC cell lines. The results showed that both high-level amplifications and breakages existed at positions 67-72 Mb in 11q13 (Figure 3). Multiple breakpoints were present in most of the cell lines, suggesting that these positions may be highly rearranged.

Because 11q13.3-q13.4 showed the highest splitting frequency in the initial 15 cases, we expanded the sample pool to further characterize splitting of this region in primary ESCC cases (Figures 3B and 3C). Splitting frequencies of 11q13.4 and 11q13.3 were 36.6% (49/134) and 23.4% (32/137), respectively. Overall, breakage of 11q13.3-q13.4 was observed in 58 out of 134 cases (43.3%). Next, we divided the whole 11q13.3-q13.4 region into several parts, including 11q13.3(1) (CPT1A, MRPL21, IGHMBP2, MRGPRD, MRGPRF, TPCN2, MYEOV), 11q13.3(2) (CCND1, ORAOV1, FGF19, FGF4, FGF3, ANO1, FADD, PPFIA1, CTTN and SHANK2) and 11q13.4(1) (DHCR7, NADSYN1, KRTAP5-7, KRTAP5-8, KRTAP5-9, KRTAP5-10, KRTAP5-11, FAM86C1, DEFB108B, RNF121, IL18BP, NUMA1, LRTOMT, FOLR3, FOLR1, FOLR2, INPPL1, PHOX2A and CLPB). Regions 11q13.3(1), 11q13.3(2), 11q13.4(1), 11q13.3(2)-q13.4 and 11q13.3(2)-q13.4(1) were split in 16.3% (22/135), 8.4% (11/131), 24.1% (32/133), 41.0% (55/134) and 30.0% (39/130) of the primary ESCC tumors, respectively. Almost all of the array-CGH images of the cell lines in Figure 3A and Additional file 4: Figure S1 showed amplification of the region proximal or distal to the breakpoints. Similarly, most of the splitting-positive ESCC tumors examined by FISH showed focal high-level amplification of the region. The majority of breakpoints between NONSC16D6 and Cancer_1D11 were proximal to the amplicon, while most of the breakpoints between Cancer_1D11 and NONSC2E5, as well as those between NONSC2E5 and NONSC15F5, were distal to the amplicon (Figure 3C and Additional file 5: Table S4).

Correlations between split and amplified regions and clinicopathological characteristics

Clinicopathological parameters of each patient are listed in Additional file 6: Table S5, and the relationships between regional splitting events and clinicopathological characteristics are summarized in Table 5. Splitting of 11q13.3-q13.4 was significantly correlated with lymph node metastasis (LNM) (P = 0.004) and advanced stages (P = 0.004). In the LNM-positive group, 54.9% (39/71) of the tumors exhibited splitting, compared with 30.2% (19/63) in the LNM-negative group. Tumors at stages IIb/III/IV (53.9%, 41/76) showed higher frequencies of splitting than those at stages I/IIa (29.3%, 17/58). Breakpoint regions 11q13.4, 11q13.4(1), 11q13.3(2)-q13.4(1) and 11q13.3(2)-q13.4 were also associated with LNM and advanced stages (P < 0.05, Additional file 7: Table S6). No correlations were observed between splitting in these regions and other clinico-pathological parameters, including gender, age, tumor size, and differentiation (Table 5 and Additional file 7: Table S6). We also tested the relationship between amplification of this region and clinical features. Amplification of 11q13.3-q13.4 was defined as the presence of at least three FISH signals (Additional file 5: Table S4). A positive correlation was observed between 11q13.3-q13.4 amplification and LNM (P = 0.022) or advanced stages (P = 0.039) (Table 5). Amplification of 11q13.3 or 11q13.4 alone was not associated with these parameters (Additional file 8: Table S7).
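As an illustration of the Fisher's exact test used for the amplification data, the sketch below recomputes a two-sided P value from the lymph node metastasis counts in Table 5 (N0: 50 of 63 tumors amplified; N1: 69 of 74 amplified). It is a plain-Python approximation of the usual two-sided definition (summing all tables that are no more probable than the observed one) and is not the SPSS procedure used in the study; the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is at most as likely as the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Rows: N0, N1; columns: non-amplified, amplified (counts from Table 5).
p_amplification_lnm = fisher_exact_two_sided(13, 50, 5, 69)
print(f"Fisher's exact test, amplification vs. LNM: P = {p_amplification_lnm:.3f}")
```

With these counts the result should be close to the P = 0.022 reported for the association between amplification and LNM.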

Table 5

Relationship between splitting and amplification of 11q13.3-q13.4 and clinical features of primary ESCC tumors

| Clinical features | Category | Splitting: frequency | Splitting: P value | Amplification: frequency | Amplification: P value |
|---|---|---|---|---|---|
| Gender | Male | 47.2% (50/106) | 0.089^a | 86.2% (94/109) | 1.000^a |
| | Female | 28.6% (8/28) | | 89.3% (25/28) | |
| Age | < 60 | 47.8% (33/69) | 0.274 | 87.1% (61/70) | 1.000^a |
| | ≥ 60 | 38.5% (25/65) | | 86.6% (58/67) | |
| Tumor size | T1, T2 | 50.0% (10/20) | 0.511 | 85.0% (17/20) | 0.728^a |
| | T3, T4 | 42.1% (48/114) | | 87.2% (102/117) | |
| Lymph node metastasis | N0 | 30.2% (19/63) | 0.004 | 79.4% (50/63) | 0.022^a |
| | N1 | 54.9% (39/71) | | 93.2% (69/74) | |
| Stage | I, IIa | 29.3% (17/58) | 0.004 | 79.3% (46/58) | 0.039^a |
| | IIb, III, IV | 53.9% (41/76) | | 92.4% (73/79) | |
| Differentiation | G1 | 40.7% (11/27) | 0.762^b | 85.2% (23/27) | 0.118^b |
| | G2 | 41.7% (30/72) | | 91.9% (68/74) | |
| | G3 | 48.6% (17/35) | | 77.8% (28/36) | |

a Fisher’s test.

b Kruskal–Wallis test.

The P value which is not labeled with “a” or “b” is assessed by χ2 test.
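The χ2 test for a 2 × 2 table can be reproduced with a few lines of arithmetic. The sketch below uses the splitting counts for lymph node metastasis in Table 5 (N0: 19 of 63 tumors split; N1: 39 of 71 split) and assumes the Pearson statistic without continuity correction; whether SPSS was run with this exact setting is not stated in the text, and the function name is hypothetical.

```python
from math import erfc, sqrt
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns the statistic and the P value for one degree of freedom.
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    statistic = numerator / denominator
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Rows: N0, N1; columns: splitting, non-splitting (counts from Table 5).
chi2_lnm, p_lnm = chi_square_2x2(19, 44, 39, 32)
print(f"chi-square = {chi2_lnm:.3f}, P = {p_lnm:.3f}")
```

Under these assumptions the statistic is about 8.34 and the P value rounds to 0.004, in line with the value given for splitting and LNM in Table 5.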

In order to create a multivariate model describing the risk for LNM, univariate and multivariate logistic regression analyses were performed with respect to gender, age, tumor size, differentiation status, as well as 11q13.3-q13.4 splitting and amplification. Multivariate analysis indicated that only splitting of 11q13.3-q13.4 was an independent predictor for LNM in ESCC (P = 0.026, RR = 2.357, Table 6).

Table 6

Logistic regression analyses of the impact of clinico-pathological factors and 11q13.3-q13.4 splitting and amplification on LNM

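| Clinical features | Category | LNM: N0 | LNM: N1 | Univariate: P value | Univariate: RR (95% CI) | Multivariate: P value^a | Multivariate: RR (95% CI) |
|---|---|---|---|---|---|---|---|
| Gender | Male | 48 | 61 | 0.368 | 1.466 (0.637-3.374) | - | |
| | Female | 15 | 13 | | | | |
| Age | < 60 | 32 | 38 | 0.948 | 0.978 (0.499-1.915) | - | |
| | ≥ 60 | 31 | 36 | | | | |
| Tumor size | T1, T2 | 11 | 9 | 0.384 | 1.528 (0.589-3.964) | - | |
| | T3, T4 | 52 | 65 | | | | |
| Differentiation | G1 | 16 | 11 | 0.295 | 1.308 (0.792-2.161) | - | |
| | G2 | 31 | 43 | | | | |
| | G3 | 16 | 20 | | | | |
| Splitting status | Non-splitting | 44 | 32 | 0.004 | 2.822 (1.384-5.757) | 0.026 | 2.357 (1.110-5.004) |
| | Splitting | 19 | 39 | | | | |
| Amplification status | Non-amp | 13 | 5 | 0.022 | 3.588 (1.202-10.712) | 0.165 | 2.265 (0.715-7.175) |
| | Amp | 50 | 69 | | | | |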

a The P value of each variable which is not significantly correlated with LNM in univariate analysis is indicated with “-” in the multivariate analysis. Multivariate logistic regression analysis is performed using forward procedures.
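For a single binary predictor, univariate logistic regression gives the same estimate as the odds ratio of the corresponding 2 × 2 table, with a Wald confidence interval on the log scale. The sketch below applies this to the splitting and amplification counts in Table 6. It assumes that the values labelled RR in the table were obtained in this way, which the text does not state explicitly; the function name is hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist
from typing import Tuple


def odds_ratio_wald_ci(
    exposed_cases: int,
    exposed_controls: int,
    unexposed_cases: int,
    unexposed_controls: int,
    level: float = 0.95,
) -> Tuple[float, float, float]:
    """Odds ratio with a Wald confidence interval computed on the log scale."""
    odds_ratio = (exposed_cases * unexposed_controls) / (
        exposed_controls * unexposed_cases
    )
    standard_error = sqrt(
        1 / exposed_cases
        + 1 / exposed_controls
        + 1 / unexposed_cases
        + 1 / unexposed_controls
    )
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    lower = exp(log(odds_ratio) - z * standard_error)
    upper = exp(log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper


# Cases are N1 tumors and controls are N0 tumors (counts from Table 6).
splitting = odds_ratio_wald_ci(39, 19, 32, 44)
amplification = odds_ratio_wald_ci(69, 50, 5, 13)
for name, (ratio, low, high) in (("Splitting", splitting), ("Amplification", amplification)):
    print(f"{name}: {ratio:.3f} ({low:.3f}-{high:.3f})")
```

Under this assumption the calculation gives approximately 2.822 (1.384-5.757) for splitting and 3.588 (1.202-10.712) for amplification, matching the univariate columns of Table 6. The multivariate values cannot be recovered this way because they depend on the joint model.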

Discussion

Genomic numerical and structural alterations are common features of ESCC. Our study characterized CNAs, structural aberrations, and recurrent breakpoints in six ESCC cell lines by a combination of M-FISH and array-CGH analyses, which helps provide accurate karyotypes of these cell lines. We further found a correlation between splitting of the amplified region 11q13.3-q13.4 and lymph node metastasis.

Genomic CNAs may influence gene expression through the following mechanisms. A well known mechanism is that gains or losses may result in gene amplifications or deletions, and thus upregulate or downregulate the protein expression [40]. Different situations may occur on genes at the boundaries of gain or loss regions. CNA boundaries inside of the genes usually indicate gene breakage. Gene rearrangements may result from such breakages, leading to the formation of an aberrant gene product [41]. If the CNA boundaries occur in non-coding regions flanking genes, expression may be controlled by proximity to regulatory sequences from other genes. Alternatively, the recurrent breakpoint may indicate loss of a tumor suppressor gene distal to the CNA boundary [42]. Small deletions inside of the genes may result in structural aberrant proteins, truncated proteins, or even loss-of-function proteins. Small amplifications and deletions inside of genes may also indicate gene breakage, and the gene products may also be affected by rearrangements with the partner gene. On the other hand, many recurrent rearrangements occurred at boundaries of the breakpoints, resulting in fusion genes, truncated genes, as well as other structural variants [2]. Therefore, we focused on the breakpoints with CNAs involved in genomic rearrangements and breakpoints mapped to specific sites.

Copy number profiles of ESCC have been analyzed in several studies. Gains involving 5p, 7p, 7q, 8q, 11q, 12p, 14q, 16p, 19q and 20q have been reported in ESCC cell lines by SKY and CGH [35], as well as in at least 60% of primary tumors by 32K array-CGH [36]. Recently, gains of 19p13.3-q13.43, 11q13.1-q13.4, 20p13-q13.33, 3q24-q29 and 22q11.21-q12.1 have been reported [38]. In the six cell lines analyzed here, taken together with the online database, the high-frequency gains mainly included 3q26.33-qter, 5p14.1-p11, 7pter-p12.3, 8q24.13-q24.21, 9q31.1-qter, 11p13-p11, 11q11-q13.4, 17q23.3-qter, 18pter-p11, 19p, 19q and 20q13.32-qter. Gain of 3q is prevalent in ESCC, and gain of 3q26-qter was found in 76.5% (39/51) of primary tumors [43] and 66.7% (4/6) of cell lines [44], suggesting that cancer-related genes may be present in 3q26-qter. Evidence has been found that PIK3CA [35, 45], PRKCI [46], and ZNF639 [47] are amplified and overexpressed in ESCC. PIK3CA [45] and PRKCI [46] are associated with LNM, and overexpression of PIK3CA and TFRC is associated with poor prognosis [48]. We found gain of 5p14.1-p11 in 83.3% (5/6) of the cell lines. In a previous study, gain of 5p13 was detected in 10% (3/29) of ESCC cell lines. SKP2 on 5p13 was amplified and overexpressed in 50% (23/46) of ESCC tumors, and was associated with LNM and stage. SKP2 overexpression could protect cancer cells from anoikis, which was mediated in part by the phosphatidylinositol 3-kinase-Akt pathway [49]. Gains of 18p11.2-p11.3 and 18p11.3 were also found in 20.7% and 17.2% of cell lines by CGH [31] and FISH [50]. AURKA at 20q13 encodes a cell cycle-regulated kinase. Yang et al. found overexpression of AURKA in 85.7% (6/7) of cell lines and 93.1% (27/29) of tumors [51]. Recent studies have reported that AURKA is a direct target of the MAPK pathway [52]. Overexpression of AURKA is independently associated with chromosomal instability in colorectal cancer [53], and AURKA expression has prognostic value in ovarian carcinoma [54].

High-level amplifications of 11q13.3-q13.4, 7p11.2, and 8q24.21 were observed in this study. Amplification of 11q13.3-q13.4 will be discussed later. Amplification of 7p11.2, which harbors the important oncogene EGFR, has also been found in ESCC in other studies [36]. Amplification and overexpression of the EGFR gene may play roles in cancer invasion and progression [55], and elevated expression may be an indicator of tumor recurrence and lower survival in head and neck squamous cell carcinoma (HNSCC) [56]. Amplification of MYC in 8q24.21 may contribute to the progression of breast cancer [57].

Losses were previously found on 3p, 5q, 8p, 9p and 11q [36], as well as 4p16.3-q35.2, 13q12.11-q34, 18p11.32-q23, and chromosome Y [38]. In our study, losses were observed on 18q12.2-qter, 3p14.1-p11, 4p15.32-p14, 4q22.1-q32.3, 9pter-p24.1, 9p23-p11, 11q23.3-qter, 18q12.2-q21.1, Xpter-p11 and Xq21.1-q23 in at least 50% (3/6) of the cell lines, indicating that several tumor suppressor genes may be located in these regions. 9p21.3 (CDKN2A and CDKN2B) is homozygously deleted in some ESCCs [58, 59]. It has been found that CDKN2A was fused to IGH through the translocation t(9;14)(p21;q32) in a pre-B acute lymphoblastic leukemia cell line [60]. A functional study demonstrated that restoring wild-type CDKN2A in ESCC cells in which the gene was deleted significantly inhibited cell invasion, suggesting that inactivation of CDKN2A may be involved in ESCC invasion [59]. Our array-CGH results confirmed that the HD frequency of 9p21.3 was 66.7% (4/6). Interestingly, CDKN2A was deleted in only one cell line, and the other three harbored at least one breakpoint inside CDKN2A. For CDKN2B, an internal breakpoint was detected in one cell line, while the gene was homozygously deleted in the other three.

Recurrent breakpoint regions, defined as those detected in at least two cell lines, included 1q31.1, 2q35, 3p14.2, 3q12.1-q12.2, 3q28, 4p13, 4q22.1, 6p12.3-p12.2, 6p22.2-p22.1, 7q22.2-q22.3, 8q24.21, 9p21.3, 10q26.3, 11p13, 11q13.3, 11q13.4, 13q21.32, 15q25.3, 18q12.2 and 21q21.1. Many of these breakpoints differed from those detected by SKY in other ESCC cell lines [35]. The correlation between breakpoints of fusion genes and fragile sites has been emphasized in previous studies. Burrow et al. analyzed 444 pairs of genes involved in cancer-specific recurrent translocations and found that, in 52% of the fusion-gene pairs, the breakpoint in at least one gene was located in the vicinity of a fragile site [61]. Thus, understanding breakpoints near fragile sites may be helpful for further discovery of cancer-related gene rearrangements.
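The recurrence criterion used above (a breakpoint region seen in at least two cell lines) can be illustrated with a minimal sketch. The function name, the cell line labels and the band assignments below are hypothetical and serve only as an illustration; they are not the data of this study.

```python
from collections import Counter


def recurrent_breakpoints(breakpoints_by_line, min_lines=2):
    """Return the bands that carry a breakpoint in at least `min_lines` cell lines."""
    counts = Counter()
    for bands in breakpoints_by_line.values():
        # Count each band once per cell line, even if it is listed repeatedly.
        counts.update(set(bands))
    return sorted(band for band, n in counts.items() if n >= min_lines)


# Hypothetical toy input (NOT the study's data): cell line label -> breakpoint bands.
toy = {
    "line_A": ["11q13.3", "9p21.3", "3p14.2"],
    "line_B": ["11q13.3", "8q24.21", "11q13.3"],
    "line_C": ["9p21.3", "1q31.1"],
}

print(recurrent_breakpoints(toy))
```

Counting each band once per cell line keeps a region with several breakpoints in a single line from being scored as recurrent.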

11q13 is an important region that shows various aberrations in many malignancies. Gain of 11q13 has previously been described in ESCC [34, 36–38, 62, 63] and other solid tumors, including oral [42], gastric [64], breast [65–67], ovarian [68], prostate [69], bladder [70], laryngeal [71], nasopharyngeal [72] and liver tumors [73], as well as head and neck cancer [74, 75]. Gain frequencies of 11q13 varied between studies. The probable target genes that are amplified and/or overexpressed in different cancers have been reported to include CCND1 [62, 76, 77], FGF4 [62, 78], PPFIA1 [67], CTTN [62, 76] and ORAOV1 [79]. We observed that 11q13.3-q13.4 was a region of high-level amplification adjacent to the breakpoint boundaries. Breakpoints in 11q13.3 and in 11q13.4 were each found in two cell lines, and the 11q13.4 breakpoints identified in KYSE30 and KYSE510 were close to each other. Furthermore, breakpoints in the entire region of 11q13.3-q13.4 were present in more than 40% of ESCC tumors, suggesting that 11q13.3-q13.4 may be a frequently split region in ESCC.

11q13 is also involved in various chromosomal rearrangements in both hematological malignancies and epithelial carcinomas. The t(11;14)(q13;q32) translocation is associated with 70%-90% of mantle cell lymphomas (MCL) [80, 81], a small proportion of multiple myeloma (MM) cases [81, 82], acute myeloid leukemia (AML) [83] and other lymphoproliferative disorders [84]. As a result of this translocation, CCND1 is fused to the enhancer of the immunoglobulin heavy chain gene (IGH) and is thus overexpressed in MCL and MM [81]. MYEOV, proximal to CCND1, was also overexpressed in a subset of t(11;14)-positive MM cell lines, in which both MYEOV and CCND1 were under the control of IGH enhancers due to translocations [85]. Other partner genes are also fused to CCND1, including IGK, involved in t(2;11)(p11;q13) in leukemic small-cell B-non-Hodgkin lymphoma (NHL) [86], and an unknown partner gene in AML with t(5;11)(q35;q13) [87]. 11q13 rearrangements with 6p21, 7q11.2, 1p13 and 5q35 were observed in renal oncocytoma. Jhang et al. demonstrated that CCND1 was overexpressed only in the cases with an 11q13 translocation; however, not all cases with 11q13 translocations showed CCND1 overexpression [88]. Fusions involving other genes in 11q13 have also been reported. NUMA-RARA and RUNX1-MACROD1 were present in the monocytic leukemia with t(11;17)(q13;q21) [89–91] and APL with t(11;21)(q13;q22) [92], respectively. Rearrangement of LRP5 was found in AML, although the partner gene has not been identified [93]. Most of the above translocations in lymphomas and leukemia are balanced and simple, while more complex rearrangements of 11q13 have been detected in epithelial carcinomas, including cervical carcinoma cell lines [94], serous ovarian adenocarcinoma [95], hepatocellular carcinoma [96], gastric cancer [97] and oral squamous cell carcinoma (OSCC) [98, 99]. Our M-FISH results were very similar to the observations in these carcinomas.
We found that chromosome 11 was frequently rearranged, especially in KYSE30, KYSE150 and KYSE510. In each of these cell lines, five to six derivative chromosomes involving chromosome 11 were readily identified, and some complex derivative chromosomes involved band q13 of the chromosome. The genes affected by the breakpoints of these rearrangements remain to be clarified.

The current array-CGH profiling enabled us to set the boundaries of 11q13 amplicons in ESCC cell lines. We observed that multiple breakpoints within the high-level amplification regions involving 11q13.3 were located at 67-72 Mb in three of the ESCC cell lines we analyzed and in ten online cell lines, which is similar to the amplification peak in HNSCC [75]. The formation of several amplicons has been well described by the breakage-fusion-bridge (BFB) cycle model. According to this model, the formation of amplicons is initiated by distal DNA breakage at fragile sites. During DNA replication, a dicentric chromosome with an inverted duplication may result from sister chromatid fusion (SCF). The BFB cycle may continue when another break occurs between the two centromeres. The cycle may then be stabilized by a telomere or by translocation [42, 98, 100–102]. Albertson suggested that amplicon boundaries might also be set by selection for overexpressed genes in the amplicons, or by selection against expression changes of genes outside of the amplified regions induced by CNAs [101, 103]. 11q13 harbors three fragile sites, FRA11A, FRA11H and FRA11F [104]. FRA11A is a rare fragile site, while FRA11H and FRA11F are common fragile sites. FRA11A is located between RIN1 (11q13.2) and CCND1 (11q13.3) [98]. FRA11H is positioned at 11q13, but its exact location still needs to be characterized. FRA11F is located between the BAC clones RP11-281H14 and RP11-841F15 in 11q14.2 [42]. Reshmi et al. found that OSCC cell lines with complex 11q rearrangements were affected by FRA11F, and that gene amplification in the 11q13 region in OSCC cell lines may be initiated by breakage at FRA11F [42]. Shuster et al. demonstrated that breakage at FRA11A between RIN1 and CCND1 may promote BFB cycles [98]. They also found the involvement of FRA11H in some OSCC cases with amplification of genes in 11q13 [42, 105].
In the present study, the distal boundaries of amplicons in the majority of ESCC cell lines and primary tumors with 11q13 amplification clustered within the 67-72 Mb region of 11q13.3, which may involve FRA11H breakage in these cases. Another breakpoint was observed at the NAALAD2 gene in KYSE510 and was located within FRA11F. In addition, the proximal breakpoints of 11q13 amplicons in KYSE180, KYSE510 and five online cell lines were located in the FRA11A region in 11q13.3, while the proximal breakpoints in KYSE30 and the other online cell lines were distal to FRA11A or in FRA11H. In the tested ESCC tumors, the majority of breakpoints in 11q13.3(1) were proximal to the amplicons, and most of those in 11q13.3(2) and 11q13.4 were distal to the amplicons. Thus, we speculate that the initial distal breakages may occur primarily at FRA11H, and that the process may involve FRA11F in some cases. FRA11A or FRA11H may contribute to setting amplicon boundaries by promoting subsequent steps of the BFB cycle. Given the multiple breakpoint boundaries present in some ESCC cell lines and primary tumors with high-level amplification of 11q13, these cases may have undergone several cycles of random breakage.
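The distinction drawn above between breakpoints proximal and distal to an amplicon can be sketched as a simple coordinate comparison. The sketch assumes positions in Mb on the long arm, where smaller coordinates lie closer to the centromere. The function names, the amplicon boundaries and the breakpoint positions are hypothetical; only the 67-72 Mb window is taken from the text.

```python
# Assumption: q-arm coordinates in Mb, so smaller values are closer to the centromere.
WINDOW_MB = (67.0, 72.0)  # the 67-72 Mb interval mentioned in the text


def classify_breakpoint(bp_mb, amp_start_mb, amp_end_mb):
    """Label a q-arm breakpoint relative to an amplicon spanning [amp_start_mb, amp_end_mb]."""
    if amp_start_mb > amp_end_mb:
        raise ValueError("amplicon start must not exceed amplicon end")
    if bp_mb <= amp_start_mb:
        return "proximal"  # on the centromere side of the amplicon
    if bp_mb >= amp_end_mb:
        return "distal"  # on the telomere side of the amplicon
    return "internal"


def in_window(bp_mb, window=WINDOW_MB):
    """Check whether a breakpoint falls inside the given interval (inclusive)."""
    return window[0] <= bp_mb <= window[1]


# Hypothetical amplicon and breakpoints (NOT the study's data).
amplicon = (68.5, 70.5)
for bp in (68.2, 69.4, 71.3, 76.0):
    print(bp, classify_breakpoint(bp, *amplicon), in_window(bp))
```

On a short arm the orientation would be reversed, so the comparison holds only under the stated q-arm assumption.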

In the current study, we noticed that the copy numbers of the regions from the centromere to the boundaries at the initial breaks were higher than those of the regions distal to the breakpoints in most ESCC cell lines with 11q13 amplification. Gains of proximal regions, losses of distal regions, and intrachromosomal or interchromosomal rearrangements of 11q13 have been found in cell lines or primary tumors of human cancers and have been shown to be indicators of the BFB cycle [98, 101]. At the end of BFB cycles, the distal breakpoints of the 11q13.3-q13.4 amplicon may undergo intrachromosomal rearrangement or translocation to other chromosomes, which may affect genes at the distal boundaries by forming intragenic rearrangements or fusions with other genes. Notably, high-level amplification of 11q13 proximal or distal to the breakpoints was seen in most cases with 11q13.3-q13.4 splitting, both in ESCC cell lines (according to our array-CGH data and the online data) and in primary tumors. Moreover, amplicons involved in intrachromosomal or interchromosomal rearrangements have also been detected. Thus, recurrent breakage at 11q13.3-q13.4 may reflect two aspects. On the one hand, genes between the two BACs flanking the regions may be amplified, with proximal gain and gene overexpression. On the other hand, breakage between the two BACs, and thus rearrangement of genes at the amplicon boundaries, may also dysregulate the expression of these genes.

The relationship between gain of 11q13 and LNM or prognosis has been analyzed and discussed in several studies, but conflicting findings remain. Tada et al. conducted CGH on 36 ESCC specimens and demonstrated that gain of 11q13 did not occur at a significantly different rate between the LNM and non-LNM groups [106]. Individual genes located in 11q13 have also been analyzed. Amplification of CTTN was correlated with LNM, while no significant association was found between CCND1 amplification and LNM [76]; however, CCND1 amplification predicted from plasma DNA may be an independent prognostic factor in ESCC [107]. Komatsu et al. found that overexpression of ORAOV1 showed a significant association with LNM and stage. Gain of 11q13.2 was determined to be an independent prognostic factor for poor outcome, and amplification of CPT1A in 11q13.2 was correlated with shorter overall survival in ESCC. Here, we found a correlation between 11q13.3-q13.4 amplification and LNM as well as advanced stages. The relationship between gene status in 11q13 and LNM has also been evaluated in other cancers. Amplification of 11q13 DNA is associated with lymph node involvement in HNSCC [108]. CCND1 amplification and overexpression are significantly associated with LNM and survival in OSCC [109]. Another study confirmed amplification of 11 genes in 11q13 and found two amplification cores: core 1 (TPCN2 and MYEOV) and core 2 (from CCND1 to CTTN). Amplification of CTTN (core 2) and/or TPCN2/MYEOV (core 1) was further demonstrated to be associated with LNM in OSCC [110]. However, Huang et al. reported that there was no correlation between LNM and amplification or expression of the tested genes in 11q13 in OSCC [111]. Fortin et al. also found that 11q13 amplification did not appear to be a reliable marker for predicting subclinical LNM in oral and oropharyngeal carcinomas [112]. A study by Xia et al. indicated that amplifications of ORAOV1 and CTTN are associated with LNM [113]. In breast cancer, PPFIA1 is coamplified with CCND1, which is significantly associated with a high-grade phenotype but not with tumor stage or nodal stage [67].

In this report, we demonstrated that both splitting and amplification of 11q13.3-q13.4 were significantly correlated with LNM and advanced stages, indicating that breakage and amplification of this region may play important roles in tumor progression. According to multivariate logistic regression analysis, however, it was splitting rather than amplification that could be an independent predictor of a higher tendency toward metastasis. In addition, two smaller regions, 11q13.4(1) (SHANK2, DHCR7, NADSYN1, KRTAP5-7, KRTAP5-8, KRTAP5-9, KRTAP5-10, KRTAP5-11, FAM86C1, RNF121, IL18BP, NUMA1, LRTOMT, FOLR3, FOLR1, FOLR2, INPPL1, PHOX2A and CLPB) and 11q13.3(2)-q13.4(1) (CCND1, ORAOV1, FGF19, FGF4, FGF3, ANO1, FADD, PPFIA1, CTTN, SHANK2, DHCR7, NADSYN1, KRTAP5-7, KRTAP5-8, KRTAP5-9, KRTAP5-10, KRTAP5-11, FAM86C1, RNF121, IL18BP, NUMA1, LRTOMT, FOLR3, FOLR1, FOLR2, INPPL1, PHOX2A and CLPB), were also associated with these parameters. However, no significant difference was found for 11q13.3(1) (CPT1A, MRPL21, IGHMBP2, MRGPRD, MRGPRT, TPCN2, MYEOV). These results suggest that genes located in 11q13.3(2) and 11q13.4(1) may play more important roles in LNM than those in 11q13.3(1). It will be interesting to further investigate the gene and protein alterations caused by genomic breakage. Since the exact breakpoint locations may differ between cases, further studies will focus on determining which of these genes are rearranged or disrupted in specific cases, and on identifying the possible rearranged forms and the roles these alterations may play in ESCC development.

Conclusions

Our data provide detailed information on chromosomal and genomic aberrations present in six ESCC cell lines. Using a combination of M-FISH and array-CGH enabled us to produce more accurate karyotypes, which will help to determine appropriate applications of these cell lines for cytogenetic and molecular biological studies. The recurrent genomic breakpoints present in both the cell lines and primary tumors may help to identify aberrant genes associated with the development and progression of ESCC.

Declarations

Acknowledgements

We thank Professor Hong Chen (Basic Medical Institute, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China) for the kind gift of Whole-chromosome painting (WCP) probes. This work was supported by Chinese Hi-Tech R&D Program Grant (2012AA02A503), National Science Fund (81021061, 30872936) and Specialized Research Fund for the Doctoral Program of Higher Education of China (20101106110015).

Competing interests

There was no conflict of interest in this study.

Authors' contributions

HJJ carried out the M-FISH experiments, participated in the data analyses, and drafted the manuscript. SZZ carried out the array-CGH experiments. ZZX organized the clinico-pathological information. ZY participated in the design of the study. GT carried out part of the FISH experiments. LCX carried out part of the FISH experiments. ZT performed some of the statistical analysis. CY performed some of the statistical analysis. DJT provided FISH probe templates (BAC-DNA) and gave experimental suggestions. FSB provided suggestions on the statistical analysis. ZQM gave experimental design suggestions. WMR conceived of the study, participated in its design and coordination, and helped to draft the manuscript. All authors have read and approved the final manuscript.

Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.