Abstract

The cells of origin of the five subtypes (I-V) of germ cell tumors (GCTs) are assumed to be germ cells from different maturation stages. This is (potentially) reflected in their methylation status as fetal maturing primordial germ cells are globally demethylated during migration from the yolk sac to the gonad. Imprinted regions are erased in the gonad and later become uniparentally imprinted according to fetal sex. Here, 91 GCTs (type I-IV) and four cell lines were profiled (Illumina’s HumanMethylation450BeadChip). Data were pre-processed controlling for cross hybridization, SNPs, detection rate, probe-type bias and batch effects. The annotation was extended, covering snRNAs/microRNAs, repeat elements and imprinted regions. A Hidden Markov Model-based genome segmentation was devised to identify differentially methylated genomic regions. Methylation profiles allowed for separation of clusters of non-seminomas (type II), seminomas/dysgerminomas (type II), spermatocytic seminomas (type III) and teratomas/dermoid cysts (type I/IV). The seminomas, dysgerminomas and spermatocytic seminomas were globally hypomethylated, in line with previous reports and their demethylated precursor. Differential methylation and imprinting status between subtypes reflected their presumed cell of origin. Ovarian type I teratomas and dermoid cysts showed (partial) sex specific uniparental maternal imprinting. The spermatocytic seminomas showed uniparental paternal imprinting while testicular teratomas exhibited partial imprinting erasure. Somatic imprinting in type II GCTs might indicate a cell of origin after global demethylation but before imprinting erasure. This is earlier than previously described, but agrees with the totipotent/embryonic stem cell like potential of type II GCTs and their rare extra-gonadal localization. The results support the common origin of the type I teratomas and show strong similarity between ovarian type I teratomas and dermoid cysts.
In conclusion, we identified specific and global methylation differences between GCT subtypes, providing insight into their developmental timing and underlying developmental biology. Data and extended annotation are deposited at GEO (GSE58538 and GPL18809).

Data Availability: All data is available via GEO (GSE58538). The extended annotation for the Illumina 450K platform including its documentation is available at GEO (GPL18809).

Funding: MR is supported by a Translational Grant, Erasmus MC. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

During fetal development primordial germ cells (PGC) migrate from the yolk sac, via the hindgut to the genital ridge and enter the gonad where they undergo further maturation into the sex specific lineage, i.e. oogonia for females and spermatogonia for males. During migration and maturation an epigenetic “reset” takes place. This includes global DNA CpG demethylation during the early phases of migration. Specific areas like imprinted regions remain methylated until the PGCs arrive in the developing gonads where imprinting is subsequently gradually erased. After these maturing gonadal germ cells reach mitotic (male) or meiotic (female) arrest, de novo methylation is initiated and uniparental sex specific imprinting is acquired [1–8]. Another informative marker of developmental stage is X chromosome reactivation which occurs in female germ cells before the initiation of oogenesis. Studies report varying results regarding the exact timing of the various steps of the epigenetic reset, i.e. during migration or after arrival in the gonads. However, PGCs with an XX chromosomal constitution have been shown to lack X chromosome reactivation if they never reach the gonad [9–12]. For ethical reasons, most of these data have been experimentally investigated and validated in mice. Even though germ cell development differs between mice and men [13], methylation patterns during germ cell development are reported to be highly similar [14,15].

Germ cell tumors (GCT) originate from germ cells at different developmental stages and are thought to inherit their methylation profile from their ancestors. The WHO classification supports five GCT subtypes. Each subtype has specific molecular, clinical and histopathological properties [16–19]. GCT subtypes have been put in context of normal germ cell development (Fig 1A) based on gene/microRNA expression, (targeted) epigenetic analysis and genomic constitution as described below and reviewed extensively elsewhere [13,16,17,20–22]. Most of these studies were targeted at specific genes/genomic regions or concerned a subset of the GCT subtypes only, most prominently type I or II.

(A) GCT subtypes in the hypothesized context of normal germ cell development as proposed in earlier studies (grey box). Developmental schemes are indicated in blue (male), red (female) or when possible in both sexes (white). DG does not originate from CIS but is indicated together with SE for reasons of consistency. (B) Samples included in this study. Abbreviations match Fig 1A and roman numbers indicate the GCT type to which the histological subtypes belong. n indicates the number of tumor samples per group. All samples are from male patients except the DGs, DCs and a subset of the type I TEs. Please note that when only TE is denoted, this indicates the group of all type I TEs together. Otherwise II.TE (type II pure TE) or the abbreviations for specific localizations are used as indicated in this figure. Four GCT cell lines were included; tumor of origin between brackets. (C) Reference to (abbreviations of) the functional genomic regions as mentioned in the rest of the manuscript. Probes were classified according to their relation to gene coding regions, micro-RNA (MIR) coding regions, CpG islands and/or transposon elements (LINE/SINE). The distance to the transcription start site (TSS) was used in accordance with the Illumina manifest: 200 or 1500 bp. Of note, the TSSAssociated category contains all probes with a distance < 1500 bp to the TSS, in contrast to the TSS1500 category from Illumina, which only contains probes 200-1500 bp from the TSS. Probes within imprinting associated regions were classified as (1) mapped inside a known imprinting control region (ICR) or (2) either mapped inside an ICR or mapped close to the TSS of a transcript of an imprinted gene (200/1500 bp upstream, not mutually exclusive). P/M indicates the expressed allele, i.e. paternal/maternal respectively. Numbers between brackets indicate the number of valid probes within each specific category (total number of valid probes: 437,881).
*The visualization did not permit including the probe count for all categories. The counts for the empty categories are: 5’UTR = 59,338; ISLAND = 136,339; IMPR_P200 = 638; IMPR_P1500 = 1,659; IMPR_M200 = 610; IMPR_M1500 = 2,265.
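
The relation between the TSS200, TSS1500 and TSSAssociated categories can be illustrated with a minimal sketch. The function name, the input (distance in bp upstream of the TSS) and the exact boundary handling are assumptions made for illustration only; the Illumina manifest and the extended annotation (GPL18809) remain the authoritative definitions.

```python
def tss_categories(distance_upstream_bp):
    """Return the TSS-related categories for one probe.

    distance_upstream_bp is a hypothetical input: the distance (bp) from the
    probe to the transcription start site, measured upstream (>= 0).
    Inclusive/exclusive boundaries are an assumption of this sketch.
    """
    categories = set()
    if 0 <= distance_upstream_bp <= 200:
        categories.add("TSS200")
    elif 200 < distance_upstream_bp <= 1500:
        # Illumina's TSS1500 only covers the 200-1500 bp window.
        categories.add("TSS1500")
    if 0 <= distance_upstream_bp <= 1500:
        # TSSAssociated pools every probe within 1500 bp of the TSS.
        categories.add("TSSAssociated")
    return categories


for distance in (100, 800, 5000):
    print(distance, sorted(tss_categories(distance)))
```

In this sketch a probe 100 bp upstream falls in both TSS200 and TSSAssociated, whereas a probe 800 bp upstream falls in TSS1500 and TSSAssociated.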

Type I (“infantile”) GCTs manifest clinically as teratoma (TE) and/or yolk sac tumor (YS) along the migration route of developing PGCs, i.e. the midline of the body. Extra-gonadal, sacral TEs occur most frequently and are mostly benign. Typically these rare tumors (incidence 0.12/100 000) arise before the age of 6 and no Carcinoma In Situ (CIS, see below) is found. They show global methylation patterns that are reminiscent of their embryonic stem cell progenitor (i.e. bimodal with modes at ≈0 and ≈100% methylation). These tumors showed somatic/biparental (≈50%) imprinting status in earlier studies. Therefore, type I GCTs have been suggested to originate from PGCs at an early stage, prior to global demethylation and imprinting erasure [16–18,23–25].

Type II GCTs present most frequently in the gonads and are also called germ cell cancer (GCC). The incidence of these tumors peaks between 25–35 years of age depending on the subtype [16,17,19]. They comprise ≈1% of all solid cancers in Caucasian males and are responsible for 60% of all malignancies diagnosed in men between 20 and 40 years with increasing incidence in the last decades [26] (8.38/100,000 in the Dutch population; Dutch Cancer Registration (IKNL), www.cijfersoverkanker.nl). Risk factors have been thoroughly investigated and are integrated in a genvironmental risk model, in which risk is determined by a combination of micro/macro-environmental and (epi)genetic factors [19,26–32]. A common precursor lesion called CIS or intratubular germ cell neoplasia unclassified (IGCNU, WHO definition [18]) is identified for type II GCT [16,17,33,34]. Because of the non-epithelial origin of these tumors, CIS is technically not a proper term but will be used throughout this article in the interest of consistency with existing literature. Type II GCT consist of non-seminomatous (NS) and seminomatous (SE) tumors (Fig 1A), which differ in clinical behavior and molecular profile. SE and embryonal carcinoma (EC) are the stem cell components of type II GCT and EC can further differentiate into the other NS subtypes: TE, YS and choriocarcinoma (CH) [16,17]. Type II GCT originate from maturation arrested, germ line committed PGCs or gonocytes and historically have been suggested to exhibit erasure of genomic imprinting [13,16–19,22,35].

Type III, IV and V GCTs originate from more differentiated germ cell progenitor cells. Type III GCTs are also known as spermatocytic seminoma (SS) and occur solely in the testis. They arise after the age of 50 and are generally benign and rare (incidence: 0.2/100,000). Their presentation in elderly males, morphology and immunohistochemical profile separate SS from SE. They originate from germ cells around the spermatogonium stage and are paternally imprinted [16,36–40]. Type IV tumors are historically hypothesized to originate from a maternally imprinted, committed female germ cell. Type V GCT were excluded from this study because they show an independent pathogenesis. They originate from the fertilization of an empty ovum by two sperm cells, resulting in a completely paternally imprinted genomic constitution. This explains their mono-directional lineage of differentiation, unrelated to the germ cell origin [16–18].

This study aims to identify specific and global differences between the genome-wide methylation profiles of GCT subtypes. Type I, II, III and IV GCTs and four cell lines representative of type II GCTs are investigated (Fig 1A and 1B). Differences in methylation profile provide insight into the developmental timing and underlying biology of GCTs. The findings ultimately relate GCT subtypes to specific stages of (early) developing (embryonic) germ cells. Emphasis was placed on combining the results with the available literature and on providing extensive accompanying data to supply an integrated, hypothesis generating data source for future research.

SS and SE/DG show global hypomethylation when compared to EC/mNS and TE

Fig 2A shows the methylation distributions for all probes, probes associated with the TSS, 3' UTR, LINEs, microRNAs and CpG Islands, respectively. The distributions of the remaining functional categories are presented in S2A Fig. SS showed global hypomethylation (Fig 2A), i.e. a large concentration of probes showing a low percentage of methylation and few probes showing a high methylation percentage. Hypermethylated configurations contain a large concentration of probes showing a high percentage of methylation and few probes showing a low methylation percentage. Hypomethylation was also shown in DG and SE samples albeit to a lesser extent, as can be observed from the mode at 50–60% methylation (Fig 2A). The SE group showed consistent hypomethylation (S2B Fig, page 2), in contrast to the study of Nettersheim et al., which showed separate groups of hypo- and hypermethylated SE in a larger sample series [42]. In contrast to the SE and DG samples, the EC and partly differentiated mNS, type I TE and DC samples consistently showed a bimodal pattern with one mode around 10% and one around 90% (Fig 2A and Fig 1: relation between subtypes). This bimodal pattern was also observed in three EC cell lines and a single SE cell line (Fig 2A, CL_SE & CL_EC). In line with previous reports [14,43], the EC cell lines were more methylated than the SE cell line TCam-2 (Fig 2A). The transcription regulatory region upstream of the TSS (TSSAssociated, TSS200) was generally hypomethylated in all tumor types as were regions annotated as first exon, 5’UTR and CpG islands. The gene body, 3’-UTR, micro-RNAs and LINE/SINE elements were generally hypermethylated except in SS, which showed a bimodal pattern (Fig 2A and S2A Fig). At these sites, SE/DG showed a median methylation level of 50% in line with the maximal methylation of their global profile and previous reports [20,44].
Hypermethylation of LINE/SINE elements in NS and hypomethylation in SE (Fig 2A) were in line with a recent genome wide study [20] but contrasted with a targeted study that showed hypomethylation of 3 specific repetitive elements in both SE and NS [45].
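
The terms hypomethylated, hypermethylated and bimodal as used above describe the shape of the distribution of β values. A minimal sketch of such a description is given below. The cut-offs (0.2 and 0.8 for low/high methylation, 0.6 and 0.3 for the fractions) and both function names are illustrative assumptions and were not used in this study; the β values are toy numbers, not data.

```python
from statistics import median


def summarize_beta(betas, low=0.2, high=0.8):
    """Summarize methylation fractions (0..1) of one sample or group.

    The low/high thresholds are assumptions chosen for illustration.
    """
    n = len(betas)
    frac_low = sum(b < low for b in betas) / n
    frac_high = sum(b > high for b in betas) / n
    return {
        "median": median(betas),
        "frac_low": frac_low,
        "frac_high": frac_high,
        "frac_mid": 1 - frac_low - frac_high,
    }


def describe(summary, dominant=0.6, substantial=0.3):
    """Map a summary onto a coarse label (hypothetical cut-offs)."""
    if summary["frac_low"] >= dominant:
        return "hypomethylated"
    if summary["frac_high"] >= dominant:
        return "hypermethylated"
    if summary["frac_low"] >= substantial and summary["frac_high"] >= substantial:
        return "bimodal"
    return "intermediate"


toy_profile = [0.05, 0.1, 0.1, 0.15, 0.9]  # toy values, mostly unmethylated
print(describe(summarize_beta(toy_profile)))
```

The violin plots in Fig 2A convey the same information in more detail; a coarse label of this kind is only useful as a quick sanity check per sample.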

To illustrate differences in methylation status between histological GCT subtypes two (visualization) methods were applied. Firstly, the methylation pattern over the whole genome and specific functional categories (Fig 1C) is visualized using the distribution of the methylation percentage β in all samples of a certain GCT subtype. Next, the discriminatory power of the methylation pattern for each individual sample is shown using principal component analysis. (A) Distribution of methylation percentage. Violin plots: grey areas indicate a kernel density plot of the methylation percentage (β) of all probes in all samples in a certain category. The boxplot indicates the interquartile range (black bars) and median (white squares). X-axis labels indicate histological subgroup according to Fig 1A and 1B. TE indicates type I TE only. (B) Principal Component Analysis. The first two principal components (PC) are plotted to evaluate the discriminative power of the methylation pattern between the subtypes. Abbreviations of histological subtypes are explained in Fig 1A. CL indicates cell lines. Please note that in the legend of the PCA the TE group is subdivided based on gender and localization: I = type I; II = type II/formally part of the mNS group, s = sacrum, t = testis, o = ovary, m = male, f = female. A more detailed visualization of the TE classes is provided in S2 Fig, which also includes the full series of 18 functional categories, bootstrap validation of the PCA and an estimation of the variance explained by the first two principal components.

GCT subtypes can be distinguished based on their methylation profile

Principal component analysis (PCA) showed robust separation of homogeneous clusters of EC/mNS, SE/DG, TE/DC and SS samples when all probes were considered (Fig 2B and S2A Fig). In line with the larger inter-sample variation (S2B Fig), SE/DG and SS were more scattered in the PCA plot. Some mNS, which consist partly of differentiated tissue, showed a tendency towards the differentiated TE/DC group. The type I TE and DC showed an indistinguishable global methylation profile. Similar observations were made when subsets of probes were considered that were annotated to specific functional genomic regions (Figs 2B and S2A).
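
To make the principle of the PCA-based separation concrete, the sketch below extracts the first principal component of a small samples × probes matrix by power iteration on the covariance matrix. This is a toy illustration under stated assumptions, not the implementation used in this study: the function name, the start vector and the β values are hypothetical, and building a full covariance matrix is only feasible for a handful of probes.

```python
from math import sqrt


def first_principal_component(samples, iterations=200):
    """Return (direction, scores, explained fraction) for the first PC.

    samples: list of samples, each a list of beta values of equal length.
    """
    n, p = len(samples), len(samples[0])
    means = [sum(s[j] for s in samples) / n for j in range(p)]
    centered = [[s[j] - means[j] for j in range(p)] for s in samples]
    # Probe x probe covariance matrix (toy scale only).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    total_variance = sum(cov[i][i] for i in range(p))
    if total_variance == 0:
        raise ValueError("all samples are identical; no variance to explain")

    def times_cov(v):
        return [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]

    v = [float(j + 1) for j in range(p)]  # arbitrary non-symmetric start vector
    for _ in range(iterations):
        w = times_cov(v)
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("start vector lies in the null space of the covariance")
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, times_cov(v)))
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores, eigenvalue / total_variance


# Toy beta values: two hypomethylated and two hypermethylated samples.
toy = [[0.1, 0.2, 0.1], [0.2, 0.1, 0.2], [0.9, 0.8, 0.9], [0.8, 0.9, 0.8]]
direction, scores, explained = first_principal_component(toy)
print(scores, explained)
```

In the toy data the two groups receive scores of opposite sign on the first component, which is the behavior underlying the cluster separation in Fig 2B. The sign of a principal component is arbitrary, so only relative positions are meaningful.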

Zooming in: GCT subtype specific methylation patterns

To further pinpoint differences between pairs of GCT subtypes, DMPs were identified (Table 1, S1 Table), tested for functional and chromosomal enrichment (Fig 3, S3 Fig, Table 1 and S2 Table) and grouped into DMRs (Fig 4, S4 Fig, Table 1, S3 Table, GSE58538: File S1). SE + DG and EC + mNS (including type II pure TE) subtypes were merged because of high similarity of the observed methylation profile (Figs 2A and 2B, S3A Fig), in line with literature regarding their similar origin [46] and their close relation in the current WHO classification [16,18]. Recurrent DMRs were identified as genes occurring more than once within or between comparisons, which may indicate regions of importance (S3 Table, n = 149).
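
The definition of recurrent DMRs (genes occurring more than once within or between comparisons) amounts to a simple count per gene. A minimal sketch follows; the function name, the input layout (one `(comparison, gene)` pair per DMR) and the placeholder gene names are assumptions for illustration, not the data structure used to produce S3 Table.

```python
from collections import Counter


def recurrent_genes(dmrs):
    """Return genes that occur in more than one DMR.

    dmrs: iterable of (comparison, gene) pairs, one pair per DMR.
    Repeats within one comparison and across comparisons both count.
    """
    counts = Counter(gene for _comparison, gene in dmrs)
    return sorted(gene for gene, count in counts.items() if count > 1)


# Placeholder gene names; not results from this study.
toy_dmrs = [
    ("SE/DG vs EC/mNS", "GENE_A"),
    ("SE/DG vs SS", "GENE_A"),      # recurs between comparisons
    ("SE/DG vs SS", "GENE_B"),
    ("EC/mNS vs TE", "GENE_C"),
    ("EC/mNS vs TE", "GENE_C"),     # recurs within one comparison
]
print(recurrent_genes(toy_dmrs))
```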

DMPs were classified according to their functional genomic location (Fig 1C). Statistical over- and underrepresentation of probes in certain categories provides clues to differences between GCT subtypes regarding the function of methylation. Enrichment was assessed by comparing the number of probes in a functional category in a subset of DMPs with that in the total dataset (Fisher’s Exact test, see Materials & Methods section). Results are shown for four pairwise (A vs B) comparisons of histological subtypes: (A) SE/DG versus EC/mNS; (B) SE/DG vs type I TE; (C) EC/mNS vs type I TE and (D) SE/DG vs SS. (LEFT) The number (n) of DMPs identified in either the DMP[A-b] (hypermethylated in A, green) or DMP[a-B] (hypermethylated in B, red) group. (MIDDLE/RIGHT) Functional enrichment in the DMP[A-b] and DMP[a-B] group respectively. X-axis: positive numbers indicate a significant overrepresentation of DMPs in a functional category compared to non-DMPs while negative numbers indicate a significant underrepresentation. Depicted is the log2 ratio of (1) the % of either DMP group assigned to a category and (2) the % of non-DMPs assigned to that category. Only significant enrichments are depicted (2-sided Fisher’s Exact test, see Methods section for Bonferroni corrected α threshold). DMPs[se/dgvsSS].IMPR_P1500 showed significant underrepresentation, but could not be plotted on log scale (0 probes in DMP group). Details of calculations and raw counts and percentages are presented in S2 Table. Y-axis: functional categories as specified in Fig 1C.
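
The enrichment measure described in this legend can be sketched as follows. This is an illustration under stated assumptions rather than the code used for Fig 3: the function names are hypothetical, the two-sided Fisher's exact test is written out with the hypergeometric distribution so that the example needs only the standard library, the counts are toy numbers, and the Bonferroni-corrected threshold from the Methods section is not reproduced.

```python
from math import exp, lgamma, log2


def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def log_p(x):
        return _log_comb(row1, x) + _log_comb(row2, col1 - x) - _log_comb(n, col1)

    observed = log_p(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as probable as the observed one.
    p = sum(exp(log_p(x)) for x in range(lowest, highest + 1)
            if log_p(x) <= observed + 1e-7)
    return min(1.0, p)


def enrichment(dmp_in, dmp_total, non_in, non_total):
    """log2 ratio of (% DMPs in category) over (% non-DMPs in category).

    Returns (log2 ratio or None, p-value). The ratio is None when a count
    is zero, since it cannot be placed on a log scale.
    """
    p = fisher_exact_two_sided(dmp_in, dmp_total - dmp_in,
                               non_in, non_total - non_in)
    if dmp_in == 0 or non_in == 0:
        return None, p
    return log2((dmp_in / dmp_total) / (non_in / non_total)), p


# Toy counts: 20 of 100 DMPs versus 10 of 100 non-DMPs in a category.
print(enrichment(20, 100, 10, 100))
```

A positive ratio indicates overrepresentation of DMPs in the category and a negative ratio indicates underrepresentation. A category without any DMPs yields no ratio, which corresponds to the IMPR_P1500 case that could not be plotted.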

Visualization of the methylation percentage at specific loci is used to zoom in on a predefined region and investigate local methylation differences between GCT subtypes. (A) DMRT3, (B) SOX2, (C) POU5F1 (OCT3/4), (D) TEX14. (Visualizations) From top to bottom the following is depicted: (1) Four-color heat map indicating methylation % for each individual probe in the depicted region. For the sample groups specified on the left the median methylation % is shown. (2) Position of all probes in the region of interest (ROI) is annotated as black rectangles. (3) HMM segments are displayed as grey boxes spanning the segment’s width and grouped per state. Numbers indicate the state of each (group of) segment(s). (5) GC% was obtained from the UCSC genome browser database (gc5Base table). (6) Transcripts overlapping with the ROI are plotted at the bottom. Plot generated using the Gviz package. Abbreviations of histological subtypes are explained in Fig 1A. Please note that the TE group is subdivided based on gender and localization: I = type I; II = type II/formally part of the mNS group, s = sacrum, t = testis, o = ovary, m = male, f = female. CL indicates cell lines.

Comparing SE/DG, EC/mNS and type I TE.

Regardless of their presumed common origin, EC/mNS and SE/DG showed vastly different methylation profiles. The relative hypermethylation in EC/mNS versus SE/DG was concentrated in regions not involved in transcription regulation (Fig 3A). This pointed to a global difference in methylation status rather than differential methylation of specific regulatory elements. This also held for the hypermethylation of type I TE when compared to SE/DG (Fig 3B). The 61 DMPs hypermethylated in SE/DG relative to type I TE were concentrated at three specific genes: NCOR2, ALOX12 and ECEL1P2 (Table 1, S3 Table, S4A Fig).

DMPs between type I TE and EC/mNS indicated a more methylated profile of the EC/mNS group (Fig 3C). Moreover, the majority of the probes hypermethylated in type I TE were located on the X chromosome and can therefore be traced back to hemi-methylation of chromosome X in females (TE = male/female, EC/mNS = male only) (Table 1, S3B Fig). DMRs included many genes involved in male gametogenesis like DMRT3 (Fig 4A). The EC marker SOX2 [17,64] was present as one of the only 15 hypermethylated autosomal DMRs in type I TE (Fig 4B). These DMRs presumably relate to the cell of origin as well as to the sex of the patient (S4B Fig, Table 1 and S3 Table).

Type III (SS) versus type II seminomatous GCT (SE/DG).

In general, probes significantly hypomethylated in SS as compared to SE/DG were enriched for regions associated with paternal expression (Fig 3D). DMRs hypermethylated in SE/DG predominantly included recurrent DMRs and DMRs within genes associated with germ cell and testis development (Table 1 and S3 Table). The promoter of POU5F1 was relatively hypomethylated in SS, while it is a marker for the stem cell component of type II GCTs and not expressed in SS [17,46,65] (Fig 4C, discussed in Table 2). DMRs hypermethylated in SS also included genes associated with male germ cell determination, fertility and GCTs, reinforcing the epigenetic relation between GCT cells and their cell of origin (Table 1 and S3 Table).

Specific GCT associated genes.

A number of genes have been associated with (methylation in) GCTs, both regarding pathogenesis and diagnosis. Table 2 summarizes the literature for these genes and combines this with the methylation data from this study, e.g. overlap with DMRs and methylation profile of these genes (see also Fig 5 and S5A Table). A recent meta-analysis of GCT GWAS studies identified 19 SNPs associated with 13 genes [29]. For most genes their methylation profile was non-discriminative between the GCT subtypes, the exceptions being TEX14, which was also independently identified as a DMR[SE/DG-ss] (Fig 4D), and BAX1, which also contained a DMR[se/dg-SS] (all SNP related genes: S5B Table).

Visualization of the methylation percentage at specific loci is used to zoom in on a predefined region and investigate local methylation differences between GCT subtypes. The genes are reviewed in Table 2. (A) AR, (B) miR-371-2-3, (C) NANOG, (D) SOX17. (Visualizations) From top to bottom the following is depicted: (1) Four-color heat map indicating methylation % for each individual probe in the depicted region. For the sample groups specified on the left the median methylation % is shown. (2) Position of all probes in the region of interest (ROI) is annotated as black rectangles. (3) HMM segments are displayed as grey boxes spanning the segment’s width and grouped per state. Numbers indicate the state of each (group of) segment(s). (5) GC% was obtained from the UCSC genome browser database (gc5Base table). (6) Transcripts overlapping with the ROI are plotted at the bottom. Plot generated using the Gviz package. Abbreviations of histological subtypes are explained in Fig 1A. Please note that the TE group is subdivided based on gender and localization: I = type I; II = type II/formally part of the mNS group, s = sacrum, t = testis, o = ovary, m = male, f = female. CL indicates cell lines.

Imprinting status and X chromosome reactivation

As reviewed in the introduction, gradual and tightly controlled establishment of uniparental imprinting and X chromosome reactivation (female only) has been demonstrated in developing germ cells, which is at least partly mirrored in their malignant counterparts. Regarding imprinting control regions (Fig 1C and S4 Table), probes covering regions that regulate paternally expressed genes (ICR_P) showed somatic methylation in type I and II GCTs, with a trend towards hypermethylation in DC (Fig 6A). SS and the cell lines showed hypomethylation of ICR_Ps, a distinction also visible in the PCA plots. In IMPR_P200/1500 the pattern of the ICR_P probes seemed to be pooled with a set of unmethylated probes (type I, II, IV GCT), presumably indicating contamination by non-imprinting related regions, and was hence not informative for imprinting status (S2A Fig, pages 15 and 16). A somatic methylation state was shown for ICR_M except in the SS (bimodal) and the CL_SE (hypomethylated); a difference corroborated by the separation of these groups in the PCA plot (Fig 6B). IMPR_M200/IMPR_P1500 probes showed hypomethylation similar to non-imprinted genes in all groups (S2A Fig, pages 18 and 19). No reactivation of chromosome X was seen in GCTs from female patients, which is reflected by the consistent 50% median methylation of the X chromosome in these cases (Fig 6C). The cell lines did not reflect the imprinting status of their in vivo counterpart, warranting caution when using the cell lines as a GCT model system in methylation based experiments.

Fig 6. Methylation of imprinting control regions and the X chromosome.

Analogous to Fig 2, the differences in methylation status between histological GCT subtypes are illustrated by two methods. Firstly, the methylation pattern is visualized using the distribution of the methylation percentage β. Next, the discriminatory power of the methylation pattern for each individual sample is shown using principal component analysis. (A) All probes associated with paternally expressed genes (ICR_P). (B) All probes associated with maternally expressed genes (ICR_M). (C) All probes located on the X chromosome. (D) Distribution of methylation in individual TE samples ordered by sex and localization. To compare type I and II TE the n = 3 type II pure TEs from the mNS were included in this visualization. Methylation levels of all probes, and probes associated with ICRs (P/M) and probes on the X chromosome are subsequently shown. (Distribution plots of methylation percentage.) Violin plots: grey areas indicate a kernel density plot of the methylation percentage (β) of all probes in all samples in a certain category. The boxplot indicates the interquartile range (black bars) and median (white squares). X-axis labels indicate histological subgroup according to Fig 1A and 1B. TE indicates type I TE only. (Principal Component Analysis.) The first two principal components (PC) are plotted to evaluate the discriminative power of the methylation pattern between the subtypes. Abbreviations of histological subtypes are explained in Fig 1A. CL indicates cell lines. Please note that in the legend of the PCA the TE group is subdivided based on gender and localization: I = type I; II = type II/formally part of the mNS group, s = sacrum, t = testis, o = ovary, m = male, f = female.

Methylation status of ICR_Ps and ICR_Ms was similar between individual samples of the same histology (S2B Fig) with the exception of type I TE and DC (Fig 6D and S2B Fig). In addition to the earlier analysis, where the type II TEs were grouped with the mNS and the type I TEs were assessed as one class, TEs were also investigated individually, grouped according to sex and anatomical site, in line with sex specific imprinting occurring during fetal/germ cell development (Fig 6D). The genome-wide methylation pattern was similar for all TEs. No reactivation of chromosome X was seen in the GCTs from female patients. Sacral type I TEs showed somatic imprinting patterns both in males and females. In line with sex specific imprinting, ICR_P sites in testicular type I TEs were relatively hypomethylated compared to sacral TEs. In contrast, ovarian type I TEs showed a tendency towards hypermethylation. Of note, testicular type I TE also showed a trend towards hypomethylation in ICR_M (only 18 probes). On the other hand, the expected inverse pattern of ICR_P was seen in the ovarian TEs at the ICR_M sites. A pattern similar to ovarian type I TE was observed in the individual DC samples: heterogeneity and gradual deviation from biparental imprinting towards uniparental maternal imprinting. Two out of three type II TEs showed a somatic imprinting pattern of both ICR_P and ICR_M.

Validated ICRs (S4 Table) were also studied individually. After merging overlapping validated ICRs from literature, 28 unique ICRs remained of which 21 were covered by the 450K array (4 ICR_M, 16 ICR_P, 1 unknown). ICRs controlling the expression of H19/IGF2, SNURF/SNRPN and MEST have been studied in GCTs previously (review & results in Table 2). In the ICR_Ps, which constitute the majority of the validated ICRs, the dominating pattern is: (1) somatic methylation in the type II tumors, (2) hypomethylation in the type I testicular TEs and SS and (3) a trend towards hypermethylation in DC and ovarian TE (Fig 7A and 7B, S6 Fig).
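
Merging overlapping regions reported by different sources is a standard interval operation, sketched below. The function name, the tuple layout and the coordinates are hypothetical and serve only as illustration; the validated ICRs and their actual coordinates are listed in S4 Table.

```python
def merge_overlapping(regions):
    """Merge overlapping regions on the same chromosome.

    regions: list of (chromosome, start, end) tuples with start <= end.
    Regions that share at least one position are combined into one.
    """
    merged = []
    for chrom, start, end in sorted(regions):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Made-up coordinates, not actual ICR positions.
toy_regions = [
    ("chr7", 100, 200),
    ("chr7", 150, 300),   # overlaps the previous region
    ("chr11", 50, 80),
    ("chr7", 400, 500),
]
print(merge_overlapping(toy_regions))
```

Sorting first guarantees that each region only needs to be compared with the most recently merged one.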

Visualization of the methylation percentage at specific loci is used to zoom in on a predefined region and investigate local imprinting differences between GCT subtypes. Two illustrative regions are depicted. (A) ICR_P: MEST. (B) ICR_M: H19-IGF2. The overlapping H19 transcript is an aberrant, long alternative transcript (H19-012, ENST00000428066). This ICR regulates H19 and IGF2 expression and lies upstream of all other transcripts of H19. The other ICRs are visualized in S6 Fig and listed in S4 Table. (Visualizations) From top to bottom the following is depicted: (1) Four-color heat map indicating methylation % for each individual probe in the depicted region. For the sample groups specified on the left the median methylation % is shown. (2) Position of all probes in the region of interest (ROI) is annotated as black rectangles. (3) HMM segments are displayed as grey boxes spanning the segment’s width and grouped per state. Numbers indicate the state of each (group of) segment(s). (5) GC% was obtained from the UCSC genome browser database (gc5Base table). (6) Transcripts overlapping with the ROI are plotted at the bottom. Plot generated using the Gviz package. Abbreviations of histological subtypes are explained in Fig 1A. Please note that the TE group is subdivided based on gender and localization: I = type I; II = type II/formally part of the mNS group, s = sacrum, t = testis, o = ovary, m = male, f = female. CL indicates cell lines.

Discussion

This study provides a detailed overview of the differences in global and local methylation status between type I-IV GCTs (Fig 1) and relates them to their cell of origin during normal germ cell development. Normal germ cell maturation includes complete de- and subsequent remethylation. Establishment of sex specific uniparental imprinting is physiological as is reactivation of chromosome X in female gametes. The largest methylation differences were detected between the hypermethylated EC/mNS + type I TE and hypomethylated SS + SE/DG clusters, in line with previous reports [14,43,117,135] (Fig 2A). However, the methylation profiles also allowed for a more detailed separation of EC/mNS, SE/DG, TE/DC and SS clusters, which is in line with the differentiation status of the tumors and their cell of origin. This distinction was also apparent when specific functional genomic regions were evaluated (Fig 2B). Hypermethylation in EC/mNS and type I TE is concentrated at non-transcription related regions when compared to SE/DG, pointing to a global difference in methylation status rather than differential methylation of specific regulatory elements. Moreover, EC/mNS is somewhat more methylated than type I TE and shows specific differences at transcription regulating genomic regions including genes implicated in male germ cell development. Regarding type III tumors, differential hypomethylation in SS relative to SE/DG is enriched for paternally expressed imprinting associated regions and DMRs cover male germ cell related genes (Figs 3, 4 and 5, Tables 1 and 2). In addition, marked differences in imprinting status were observed. Ovarian type I TE and DC showed partial uniparental maternal imprinting, inverse of the uniparental paternal imprinting of SS. Testicular type I TE showed a trend towards imprinting erasure and type II GCTs (SE/DG/EC/mNS) showed somatic imprinting status (Figs 6 and 7).
The local and global methylation differences observed between GCTs could be matched to physiological germ cell development, but did not match with their respective cell line models (Fig 8).

The top and bottom line charts depict normal germ cell development in female and male respectively (stages specified in the middle black bar). Methylation status during normal germ cell development is depicted for the global genome, ICRs and chromosome X (see Discussion). Putative cells of origin of the various types of GCTs are indicated in the brown boxes. ICR_P/M = ICR regulating paternally/maternally expressed genes. Bimodal indicates a methylation pattern peaking at 0 and 100% with the exception of SE/DG (between 0 and ≈50%). The table (bottom) provides a summary of the results, mainly Figs 2 and 6. Abbreviations: pf = primordial follicle. Type I tumors are indicated with their type (I), sex (m = male, f = female) and location (s = sacral, t = testis, o = ovary). Other GCT subtypes are indicated with their type (I, II, IV) and the abbreviation of each histological class, which are explained in the main text. Gradient bars indicate percentages of methylation (0→100%, green-white-grey-red) analogous to the gradient used in the other figures.

Limited knowledge exists about the progenitor of type I tumors. The absence of CIS and their clinically different presentation (pediatric, frequently extra-gonadal, fully differentiated histology: TE/YST) set them apart from the type II tumors [16–18]. Their bimodal global methylation status could reflect a pattern generally observed in normal differentiated tissues and in very early germ cell progenitors (pre-migration). Historically, type I and II tumors are also thought to differ with regard to their imprinting status. Imprinting status in these tumors was earlier shown to be somatic (biparental) or partially erased in the case of the type I tumors and erased in the case of the type II GCTs [16]. This positions the progenitor cell of type I tumors before imprinting erasure in the gonad. Indeed, biparental (somatic) imprinting status in extra-gonadal TE was confirmed in this study and by Amatruda and colleagues [20]. There is a trend towards imprinting erasure in testicular type I TE. Ovarian type I TE show a trend towards completely maternal imprinting, but starting from a biparental status (50%), without any evidence of prior complete erasure (Fig 6D). This (partial) mimicking of female germ cells in ovarian type I TE is in line with several studies [20,131,132]. However, the non-erased imprinting status, inactivated X chromosome and generally methylated state fit with a cell of origin at the very early PGC stage, which is then blocked in physiological complete demethylation, erasure and X reactivation and, when subjected to a gonadal micro-environment, shows partial erasure/uniparental imprinting [16–18] (Fig 8).

Most data are available on the epigenetic constitution of the type II tumors, as reviewed before [13,21]. A strongly hypomethylated state was recently shown for all CIS, the common precursor of SE and EC [136]. Earlier studies have suggested separate NS-CIS and SE-CIS types [135], but the lack of methylation in CIS combined with the absence of SOX2 (EC marker) expression [64,136,137] increases the likelihood of a single precursor that progresses into SE or NS. The CIS-like state is evident in the hypomethylated profile of SE/DG as shown in this article and previous research [14,43,117,135,136]. EC and mNS show a (de novo) methylated profile (Fig 2A). This is in line with the previously reviewed increased methylation in the transition of CIS into NS [13,14,43,138], possibly illustrating reversal to a hypermethylated ES-like state [7,16,139–142] or a bimodal methylation state normally present in differentiated tissues, as shown in the differentiated NS. The consistent somatic imprinting pattern in general and at specific ICRs (Fig 6, S6 Fig and S4 Table) was in line with an earlier report [20] but contrasted with targeted studies suggesting erased imprinting status at specific ICRs in these tumors, which mainly used indirect methods (allele specific expression analysis) and/or non-quantitative methylation analysis (bisulfite specific restriction enzymes) (for review see Table 2). The hypomethylated progenitor and somatic imprinting pattern (Fig 6A and 6B) situate the cell of origin of the type II tumors possibly earlier than previously described [16]: after global demethylation but before imprinting erasure, which is also in line with the occurrence of extra-gonadal type II GCTs (brain, anterior mediastinum) and their totipotent, embryonic stem cell like potential [16,139–142] (Fig 8).

The other GCT subtypes are historically hypothesized to originate from more mature germ cell progenitors. Their marker profile has placed the cell of origin of the type III tumors at the pre-spermatogonium stage [36–39,46]. Earlier epigenetic data showed heterogeneous histone modification and methylation profiles, not corresponding with a pre-spermatogonial origin [143]. Our limited series of SS shows a consistent pattern of distinct hypomethylation and loss of imprinting at the paternally expressed ICRs (ICR_M: heterogeneous ≥ 50%, Fig 2B). This matches with a cell of origin between the gonocyte and spermatogonium stage, after establishment of uniparental imprinting but before initiation of de novo methylation. The type IV tumors (DC) show a pattern comparable to other differentiated tissues (ovarian type I TE) and show a general trend towards uniparental maternal imprinting, but not starting from a completely erased state. This potentially places their cell of origin and pathogenesis parallel to the type I ovarian TE and not as a separate entity originating from a completely maternally imprinted and differentiated female germ cell as described before [16] (Figs 2B, 6 and 8).

In conclusion, this exploratory study of genome wide methylation profiles of GCT subtypes identified specific and global methylation differences, providing novel insight into the developmental timing and underlying biology of the various subtypes of GCTs and their (embryonic) cells of origin (Fig 8). Methylation profiles allowed for separation of clusters of NS, SE/DG, SS and TE/DC, largely in line with the current WHO classification. SE/DG/SS were globally hypomethylated, in line with previous reports and the demethylated state of their precursor. Differential methylation between subtypes reflected the presumed cell of origin, as did imprinting status. However, somatic imprinting in type II GCTs might indicate a cell of origin after global demethylation but before imprinting erasure. This is earlier than previously described, but agrees with the totipotent/embryonic stem cell like potential of type II GCTs and their rare extra-gonadal localization. The results support the common origin of the type I TEs and show strong similarity between ovarian type I TE and DC. However, the limited sample size and conflicting results with some of the current literature warrant careful interpretation of the results and validation in a larger/extended dataset. Moreover, to interpret the function of differential methylation between GCT subtypes, targeted validation of the findings using matched expression data or careful evaluation of the effects of methylation in cell line models of GCTs is a crucial next step, even though validation of a biologically relevant and representative DMR in microRNA-371/2/3 (Table 2) showed an excellent match with the results of bisulfite sequencing. The in-depth review of related literature and extensive accompanying online data (supplementary and on GEO) serve as a hypothesis generating source for future research.

Materials and Methods

Samples

Patient samples.

Use of tissue samples remaining after diagnosis for scientific reasons was approved by the Medical Ethical Committee (MEC) of the Erasmus MC Rotterdam (The Netherlands), permission 02.981. This included the permission to use the secondary tissue without further consent. Samples were used according to the “Code for Proper Secondary Use of Human Tissue in The Netherlands” developed by the Dutch Federation of Medical Scientific Societies (FMWV (Version 2002, update 2011)). An overview of the samples in this study is presented in Fig 1A and 1B. Samples were collected when submitted to the pathology department and stored in liquid nitrogen.

Methylation profiling

DNA was isolated as described in [110]. The GCT material used contained > 75% tumor cells. Bisulfite conversion and methylation detection were performed using Illumina’s HumanMethylation450 BeadChip (450K array) and exported as described in [69]. This array does not distinguish between DNA methylation variants like 5mC and 5hmC [153].

Data analysis

Data (pre-)processing.

Further processing was carried out in R using the LUMI package [154] according to [155,156]. In the raw data, no structural differences in quality or batch effects were observed. Poorly performing probes (not reaching detection p<0.01 in > 95% of the samples), cross hybridizing probes and probes with a SNP at or within 10 bp of the target CpG (allele frequency ≥ 0.05) were excluded [156]. As a result 44,540 probes were discarded, leaving 437,881 valid, methylation related probes for processing and analysis. Finally, color adjustment, quantile normalization and BMIQ-based correction for probe type bias (Infinium I vs II) were performed [154,155,157]. Data processing resulted in two quantifications of a CpG site’s methylation status: the methylation percentage β and an associated M-value (logit2(β), i.e. log2(β/(1−β)) with β expressed as a fraction). M-values were used for statistical computations because of a more favorable tradeoff between true positive rate and detection rate [41]. All data are available via GEO (GSE58538).
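The relation between β and the M-value can be illustrated with a minimal Python sketch. This is not the LUMI implementation: the function names and the small `eps` guard against β values of exactly 0 or 1 are assumptions made for illustration only.

```python
import math

def beta_to_m(beta, eps=1e-6):
    """Convert a methylation fraction beta (0..1) to an M-value, log2(beta / (1 - beta))."""
    # eps is a hypothetical guard (not from the paper) that avoids log(0) and
    # division by zero when beta is exactly 0 or 1.
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))

def m_to_beta(m):
    """Inverse transform: beta = 2^M / (2^M + 1)."""
    return 2.0 ** m / (2.0 ** m + 1.0)

# beta = 0.5 corresponds to M = 0; M grows steeply towards the extremes of beta.
for beta in (0.1, 0.5, 0.9):
    print(beta, round(beta_to_m(beta), 3))
```

Because the transform is symmetric around β = 0.5, equal-sized differences in M near 0% or 100% methylation correspond to much smaller differences in β than they do near 50%.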

(Additional) annotation 450K array.

The 450K annotation manifest (v1.2) as supplied by Illumina contains a number of functional genomic classes like a probe’s association with CpG islands, gene coding regions, etc. The manifest was extended with (additional) functional genomic classes, based on the GRCh37/hg19 assembly. Briefly, probes close to small nuclear RNAs and microRNAs from snoRNABase and miRBase were identified, as were probes within repeats defined by RepeatMasker (source: UCSC). Probes close to the transcription start site (TSS) of imprinted genes were also identified (geneimprint.com / igc.otago.ac.nz). Known imprinting control regions (ICR) and their association with either paternal or maternal expression were retrieved from WAMIDEX and igc.otago.ac.nz. Imprinting is indicated using the expressed allele. Illumina probe classes were extended with a number of merged categories. Where applicable, the upstream (-) and downstream (+) margins reported in this manuscript are analogous to the Illumina annotation (-1500+0; -200+0). The eighteen functional categories of primary interest to this manuscript are illustrated in Fig 1C. The extended annotation including its documentation is available at GEO (GPL18809).

Analysis protocol.

Below, the subsequent steps of the data analysis are described. More details are presented in S1 Fig. Depending on the context, “feature” can refer to a probe or a segment. All results are based on the GRCh37/hg19 assembly.

Global methylation: Violin plots were created per histological subtype using all (global) or functional subsets of 450K probes. Violin plots (vioplot package) integrate the benefits of a boxplot and a kernel density plot. Two-dimensional principal component analysis (PCA) was applied and validated using bootstrapping to assess how well the methylation values of (subsets of) the probes separated the histological subtypes. Formal statistical testing of the distribution of the methylation values identified (small) significant differences between almost all tumor classes (data not shown, Kruskal-Wallis test followed by pairwise, Benjamini-Hochberg corrected Mann-Whitney U tests). The PCA and violin plot based approach was therefore used to identify the largest, most relevant differences.

Defining genomic segments and discriminative methylation states: To detect regions of interest rather than only selecting individual differentiating probes (CpG sites), a Hidden Markov Model (HMM) was trained on the tumor samples. Without a priori information about tumor type, the HMM combines adjacent probes into segments and assigns these segments to k mutually exclusive states, each with distinct methylation profiles over all tumor samples. k = 20 was used as the likelihood of the model saturated around this number of states (S1 Fig, page 11). In total, 133,730 segments were identified. The median methylation value (M or β) of all probes in a segment or state was taken as methylation proxy. As a proof of concept, S1 Fig (page 17) shows clear separation of male and female samples based on state 15, which almost exclusively contains probes on the X chromosome. The result of the HMM is included in the GEO submission of the data (GSE58538) and its properties/procedures are summarized in S1 Fig.

Differentiating features (probes or segments): Features showing low variability over all samples were excluded before formally testing for differential methylation (σM,probes<0.8, n = 77,154/437,881 (17.62%) & σM,segments<0.6, n = 13,229/133,730 (9.89%), S1 Fig, page 8). A Mann-Whitney U test was applied to each feature, comparing the distribution of M values between two histological subtypes. If significant (p<0.05, Benjamini-Hochberg corrected [158]), the subtype specificity was validated in 100 stratified bootstrap samples. If the feature proved to be significant in ≥95% of the validation samples and showed an absolute difference in median M values > 0.9 between the pair of histological subtypes, it was considered potentially discriminating. The value of 0.9 was chosen as the mean of the cut-off range recommended by Du and coworkers (0.4–1.4) [41]. Although a less stringent setting might result in a higher detection rate, it will considerably reduce the true positive rate [41]. The sign of the difference in median M value was used to assign a relative methylation status (hyper/hypo) in either of the two subtypes under pairwise consideration.
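The selection criteria above can be sketched in Python as follows. This is an illustrative sketch, not the authors' R code: the function names are hypothetical, and the per-feature p-values and the number of bootstrap samples in which a feature was significant are assumed to have been computed elsewhere (e.g. by Mann-Whitney U tests).

```python
from statistics import median

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def is_discriminating(m_a, m_b, p_adj, bootstrap_hits, n_boot=100,
                      alpha=0.05, min_boot_frac=0.95, min_delta=0.9):
    """Apply the three criteria; returns (selected, status of subtype A relative to B)."""
    delta = median(m_a) - median(m_b)           # difference in median M value
    selected = (p_adj < alpha                    # corrected significance
                and bootstrap_hits / n_boot >= min_boot_frac  # bootstrap validation
                and abs(delta) > min_delta)      # effect size threshold
    status = "hyper" if delta > 0 else "hypo"
    return selected, status

# Hypothetical M values for one feature in two subtypes.
p_adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(is_discriminating([2.0, 2.5, 3.0], [0.5, 1.0, 1.2], p_adj[0], bootstrap_hits=97))
```

Requiring all three criteria at once makes the selection conservative: a feature with a small corrected p-value is still rejected if its median difference is small or if it is unstable under resampling.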

Differentiating Hidden Markov Model (HMM) states: To identify non-adjacent regions that showed similar patterns of methylation, a logistic LASSO regression model was fitted on the M values of the HMM states (glmnet package) [159]. Coefficients > 0 were selected from the most regularized regression model within 1 standard deviation of the model with minimal cross validation error. A 10 fold cross validated λ was used. Features included in the selected state(s) and showing an absolute difference in median M values > 0.9 (see above) between the pair of histological subtypes compared were considered potentially discriminating. The sign of the difference in median M value was used to assign a relative methylation status (hyper-/hypomethylated) in each of the subtypes.
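The choice of the "most regularized model close to the minimum" can be sketched as below. The sketch assumes the rule corresponds to the one-standard-error rule that glmnet reports as `lambda.1se` (largest λ whose mean cross validation error lies within one standard error of the minimum); whether the authors used exactly this value is an assumption, and the function name and example numbers are hypothetical.

```python
def select_lambda_1se(lambdas, cv_mean, cv_se):
    """Pick the largest lambda whose mean CV error is within one SE of the minimum.

    lambdas : candidate regularization strengths
    cv_mean : mean cross validation error per lambda
    cv_se   : standard error of the cross validation error per lambda
    """
    best = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    threshold = cv_mean[best] + cv_se[best]
    # Larger lambda means stronger regularization, hence fewer non-zero coefficients.
    candidates = [lam for lam, err in zip(lambdas, cv_mean) if err <= threshold]
    return max(candidates)

# Hypothetical 10-fold cross validation summary.
lambdas = [1.0, 0.5, 0.1, 0.01]
cv_mean = [0.40, 0.31, 0.30, 0.33]
cv_se = [0.02, 0.02, 0.02, 0.02]
print(select_lambda_1se(lambdas, cv_mean, cv_se))
```

Preferring the more regularized model trades a marginally higher cross validation error for a sparser set of selected HMM states, which is the property the state selection relies on.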

Final selection of differentially methylated probes (DMPs): Features of interest were identified in the intersection of (1) all probes in discriminating states, (2) all probes in discriminating segments and (3) all individually discriminating probes (S1 Fig (p. 3/5), also see above). This was done separately for probes that showed relative hypermethylation in either of the subtypes under pairwise comparison. This way, two groups of differentially methylated probes (DMPs) were identified, showing relative hypermethylation in one subtype and relative hypomethylation in the other.

Functional enrichment: The sets of DMPs were subjected to enrichment analysis for 18 functional categories (Fig 1C) using a two sided Fisher’s Exact test. Analogously, association with chromosome and state was tested. p<0.05/(18+24+20) = 0.00080645161 was considered significant, hence retaining a Bonferroni corrected Type I error rate of 5% (18 functional categories, 24 chromosomes, 20 states).
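As an illustration of this test, the following Python sketch computes a two-sided Fisher's Exact p-value for a 2×2 table and compares it with the Bonferroni corrected threshold. It is a plain standard-library sketch, not the authors' R code; the function name and the example counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's Exact test for the table [[a, b], [c, d]].

    Rows could be DMP / non-DMP, columns in-category / not-in-category.
    Sums the probabilities of all tables that are at most as likely as the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in-category probes among the DMPs.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))  # small tolerance for float ties
    return min(1.0, total)

# Bonferroni threshold: 18 functional categories + 24 chromosomes + 20 states.
alpha = 0.05 / (18 + 24 + 20)

# Hypothetical counts: 40 of 200 DMPs vs 500 of 10,000 non-DMPs in a category.
p = fisher_exact_two_sided(40, 160, 500, 9500)
print(p < alpha)
```

Dividing by the total number of tests (62) keeps the family-wise Type I error rate at 5% at the cost of reduced power for categories with small effect sizes.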

Differentially methylated regions (DMRs): Regions with ≥ 5 adjacent DMPs and a maximal inter-DMP distance ≤ 1 kb were identified as DMRs between the tumor groups. Annotations were retrieved for DMRs including flanking regions of 20% of the length of each DMR.
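The DMR definition can be sketched in Python as follows. The sketch makes two assumptions that the text does not settle: DMPs are grouped on genomic distance alone (non-DMP probes lying between two DMPs are ignored), and the 20% flank is added on both sides of the DMR. Function names and example positions are hypothetical.

```python
def call_dmrs(dmp_positions, min_dmps=5, max_gap=1000):
    """Group DMP positions on one chromosome into DMRs.

    Returns (start, end, n_dmps) tuples for runs of at least min_dmps DMPs
    in which consecutive DMPs are at most max_gap bp apart.
    """
    dmrs, run = [], []
    for pos in sorted(dmp_positions):
        if run and pos - run[-1] > max_gap:
            # Gap too large: close the current run and keep it if long enough.
            if len(run) >= min_dmps:
                dmrs.append((run[0], run[-1], len(run)))
            run = []
        run.append(pos)
    if len(run) >= min_dmps:
        dmrs.append((run[0], run[-1], len(run)))
    return dmrs

def flanked(start, end, frac=0.20):
    """Extend a DMR by frac of its length on each side for annotation lookup."""
    pad = int(round((end - start) * frac))
    return max(0, start - pad), end + pad

# Hypothetical DMP coordinates (bp) on a single chromosome.
positions = [100, 300, 900, 1500, 2400, 5000, 5200]
for start, end, n in call_dmrs(positions):
    print(start, end, n, flanked(start, end))
```

In this example the first five DMPs form one DMR, while the last two are dropped because the run contains fewer than five DMPs.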

Analysis of the cell lines: Cell lines were compared to tumor samples in the evaluation of specific regions of interest in the tumor samples and with regard to their global methylation profile. Moreover, they were analyzed using the DMRforPairs package to identify specific DMRs in these unique samples ([160], using the default settings except min_dM = 0.9, see above). For the NCCIT and TCam-2 cell lines this analysis matches the one performed in [69].

Supporting Information

A detailed flowchart and description of the analysis protocol is presented, followed by a visual illustration of the selection of differentially methylated features. The motivation for the threshold when filtering low variability probes is presented next. Finally, the properties of the Hidden Markov Model (HMM) are presented together with a detailed description of its construction. HMM state 15 is presented as a proof of concept, discriminating between male and female samples based on X-chromosomal localization of the large majority of the probes in this class.

In addition to the selected categories presented in Fig 2 this figure contains all 18 functional categories presented in Fig 1C and includes the primary PCA as well as its validation (i.e. robustness of the result). PCA was performed on the total dataset (left) and validated using stratified bootstrapping (middle: training, right: validation). (Distribution plot of methylation percentage) Violin plots: grey areas indicate a kernel density plot of the methylation percentage (β) of all probes in all samples in a certain category. The boxplot indicates the interquartile range (black bars) and median (white squares). X-axis labels indicate histological subgroup according to Fig 1A and 1B. TE indicates type I TE only. (Principal Component Analysis) The first two principal components (PC) are plotted to evaluate the discriminative power of the methylation pattern between the subtypes. Abbreviations of histological subtypes are explained in Fig 1A. CL indicates cell lines. Please note that in the legend of the PCA the TE group is subdivided based on gender and localization: I = type I; II = type II/formally part of the mNS group, s = sacrum, t = testis, o = ovary, m = male, f = female.

S2B Fig. Methylation patterns in GCT subtypes and cell lines: global methylation patterns in individual samples. X-axis indicates arbitrary sample ID. The sex of the patient from which the sample originates is indicated in blue (male) or red (female). Density plots are explained in the legend of Fig 2. Distributions are shown for all probes individually per sample. The ICR_P and ICR_M categories are presented separately to facilitate the discussion about imprinting. The red dashed line indicates somatic imprinting (50%). Please note that details on the TE group are presented in the main text (Fig 6D) and that this category is therefore omitted here. This also holds for the n = 3 type II pure TE included in the mNS group. (Distribution plot of methylation percentage) Violin plots: grey areas indicate a kernel density plot of the methylation percentage (β) of all probes in all samples in a certain category. The boxplot indicates the interquartile range (black bars) and median (white squares). X-axis labels indicate histological subgroup according to Fig 1A and 1B. TE indicates type I TE only.

The SE+DG and EC+mNS categories were merged because of high similarity in biological classification and methylation profile. Despite their similarities, the DC and type I TE were not merged because they belong to different histological classes.

S3B Fig. Enrichment of differentially methylated probes (DMPs) for chromosomal position and HMM state: association between DMPs and chromosome / HMM state. Stacked bar charts indicate the fraction of probes in a subset (DMP[A-B], DMP[A-B], non-DMP) that is mapped to a specific chromosome or assigned to a specific state. Grey indicates the non-DMPs and red and green indicate the DMPs hypermethylated in the subtype with the matching color in the figure (alternating green/white = A, alternating red/white = B). * = significant over-/underrepresentation of DMPs relative to the non-DMP subset (tested per chromosome/state, 2-sided Fisher’s exact test, see Methods for Bonferroni corrected α threshold). In the bottom right of each figure the coefficients of the LASSO regression model are depicted. These roughly match the strongest over- and underrepresentations identified by the Fisher’s Exact tests on the states. The LASSO selected states are marked orange in the table indicating the significant associations between each state and either DMP group.

This figure depicts the DMRs between GCT subtypes discussed in the main text in addition to those already visualized in Fig 4. (Visualizations) From top to bottom the following is depicted: (1) Four-color heat map indicating methylation % for each individual probe in the depicted region. For the sample groups specified on the left the median methylation % is shown. (2) Position of all probes in the region of interest (ROI) is annotated as black rectangles. (3) HMM segments are displayed as grey boxes spanning the segment’s width and grouped per state. Numbers indicate the state of each (group of) segment(s). (5) GC% was obtained from the UCSC genome browser database (gc5Base table). (6) Transcripts overlapping with the ROI are plotted at the bottom. Plot generated using the Gviz package. Abbreviations of histological subtypes are explained in Fig 1A. Please note that the TE group is subdivided based on gender and localization: I = type I; II = type II/formally part of the mNS group, s = sacrum, t = testis, o = ovary, m = male, f = female. CL indicates cell lines.

S4B Fig. Methylation profile at GCT subtype specific differentially methylated regions (DMRs), continued: EC/mNS versus type I TE. This figure depicts the DMRs between GCT subtypes discussed in the main text in addition to those already visualized in Fig 4. (Visualizations) From top to bottom the following is depicted: (1) Four-color heat map indicating methylation % for each individual probe in the depicted region. For the sample groups specified on the left the median methylation % is shown. (2) Position of all probes in the region of interest (ROI) is annotated as black rectangles. (3) HMM segments are displayed as grey boxes spanning the segment’s width and grouped per state. Numbers indicate the state of each (group of) segment(s). (5) GC% was obtained from the UCSC genome browser database (gc5Base table). (6) Transcripts overlapping with the ROI are plotted at the bottom. Plot generated using the Gviz package. Abbreviations of histological subtypes are explained in Fig 1A. Please note that the TE group is subdivided based on gender and localization: I = type I; II = type II/formally part of the mNS group, s = sacrum, t = testis, o = ovary, m = male, f = female. CL indicates cell lines.

S4C Fig. Methylation profile at GCT subtype specific differentially methylated regions (DMRs), continued: SE/DG versus SS. This figure depicts the DMRs between GCT subtypes discussed in the main text in addition to those already visualized in Fig 4. (Visualizations) From top to bottom the following is depicted: (1) Four-color heat map indicating methylation % for each individual probe in the depicted region. For the sample groups specified on the left the median methylation % is shown. (2) Position of all probes in the region of interest (ROI) is annotated as black rectangles. (3) HMM segments are displayed as grey boxes spanning the segment’s width and grouped per state. Numbers indicate the state of each (group of) segment(s). (5) GC% was obtained from the UCSC genome browser database (gc5Base table). (6) Transcripts overlapping with the ROI are plotted at the bottom. Plot generated using the Gviz package. Abbreviations of histological subtypes are explained in Fig 1A. Please note that the TE group is subdivided based on gender and localization: I = type I; II = type II/formally part of the mNS group, s = sacrum, t = testis, o = ovary, m = male, f = female. CL indicates cell lines.

This figure (together with S5B Fig) depicts the genes discussed in the main text and Table 2 in addition to those already visualized in Fig 5. Genes are annotated 1.5kb upstream of their TSS and 1.5kb downstream of their transcription termination site. (Visualizations) From top to bottom the following is depicted: (1) Four-color heat map indicating methylation % for each individual probe in the depicted region. For the sample groups specified on the left the median methylation % is shown. (2) Position of all probes in the region of interest (ROI) is annotated as black rectangles. (3) HMM segments are displayed as grey boxes spanning the segment’s width and grouped per state. Numbers indicate the state of each (group of) segment(s). (5) GC% was obtained from the UCSC genome browser database (gc5Base table). (6) Transcripts overlapping with the ROI are plotted at the bottom. Plot generated using the Gviz package. Abbreviations of histological subtypes are explained in Fig 1A. Please note that the TE group is subdivided based on gender and localization: I = type I; II = type II/formally part of the mNS group, s = sacrum, t = testis, o = ovary, m = male, f = female. CL indicates cell lines.

S5B Fig. Methylation status of genes with SNPs significantly associated with GCTs. This figure (together with S5A Fig) depicts the genes discussed in the main text and Table 2 in addition to those already visualized in Fig 5. Genes are annotated 1.5kb upstream of their TSS and 1.5kb downstream of their transcription termination site. (Visualizations) From top to bottom the following is depicted: (1) Four-color heat map indicating methylation % for each individual probe in the depicted region. For the sample groups specified on the left the median methylation % is shown. (2) Position of all probes in the region of interest (ROI) is annotated as black rectangles. (3) HMM segments are displayed as grey boxes spanning the segment’s width and grouped per state. Numbers indicate the state of each (group of) segment(s). (5) GC% was obtained from the UCSC genome browser database (gc5Base table). (6) Transcripts overlapping with the ROI are plotted at the bottom. Plot generated using the Gviz package. Abbreviations of histological subtypes are explained in Fig 1A. Please note that the TE group is subdivided based on gender and localization: I = type I; II = type II/formally part of the mNS group, s = sacrum, t = testis, o = ovary, m = male, f = female. CL indicates cell lines.

S6 Fig. Methylation status of known imprinting control regions (ICRs).

ICRs identified as described in the Materials and Methods section were checked for coverage on the 450K array. 21/28 unique ICRs were covered by one or more probes. These are visualized here (overview: S4 Table). H19_IGF2 regions: the overlapping transcript is an aberrant, long alternative transcript (H19-012, ENST00000428066). These ICRs regulate H19 and IGF2 expression and lie upstream of all other transcripts of H19. (Visualizations) From top to bottom the following is depicted: (1) Four-color heat map indicating methylation % for each individual probe in the depicted region. For the sample groups specified on the left the median methylation % is shown. (2) Position of all probes in the region of interest (ROI) is annotated as black rectangles. (3) HMM segments are displayed as grey boxes spanning the segment’s width and grouped per state. Numbers indicate the state of each (group of) segment(s). (5) GC% was obtained from the UCSC genome browser database (gc5Base table). (6) Transcripts overlapping with the ROI are plotted at the bottom. Plot generated using the Gviz package. Abbreviations of histological subtypes are explained in Fig 1A. Please note that the TE group is subdivided based on gender and localization: I = type I; II = type II/formally part of the mNS group, s = sacrum, t = testis, o = ovary, m = male, f = female. CL indicates cell lines.

(A) SE/DG vs EC/mNS; (B) SE/DG vs type I TE; (C) EC/mNS vs type I TE; (D) SE/DG vs SS. Rows indicate the functional categories. Columns indicate the number of probes in the non-DMP and both subtype specific DMP sets. Next, the fraction (%) of this count relative to all non-DMPs or either set of DMPs is presented. The log-scores are calculated as log2(%DMP/%non-DMPs) and visually presented in Fig 3 for those categories showing significant over-/underrepresentation. Significance of the enrichment was evaluated using a two-sided Fisher Exact test with a Bonferroni corrected α threshold as specified in the Materials & Methods section.

List of DMRs for each pair of GCT subtypes. (Recurrent tumor DMRs) Gene symbols that occurred in more than one DMR; either irrespective of DMR subset (n.total.occurences) or in multiple independent DMR subsets (n.dmr.lists) (Overlap tumor and CL DMRs) Gene symbols involved in DMRs identified between both the tumor groups and the cell lines. The second column indicates in which tumor comparisons the gene symbol was involved in a DMR.

Acknowledgments

The authors thank the Department of Bioinformatics, Erasmus MC, Rotterdam, for their support. They especially thank Ms Sylvia de Does and Mr Ivo Palli. Prof. dr. J.W. Oosterhuis, Department of Pathology, Erasmus MC, Rotterdam is kindly acknowledged for his critical reading of the manuscript.