Abstract

Background

Proper expression and functioning of transcription factors (TFs) are essential for the regulation of different traits and could therefore be crucial for the development of complex diseases. Subjects with Down syndrome (DS) have a higher incidence of acute lymphoblastic leukemia (ALL), whereas solid tumors such as breast cancer (BC) and oral cancer (OC) are rare. Triplication of human chromosome 21 in DS is associated with altered genetic dosage of different TFs. V-ets erythroblastosis virus E26 oncogene homolog 2 (ETS2) and Single Minded 2 (SIM2) are two such TFs that regulate several downstream genes involved in developmental and neurological pathways. Here we studied functional single nucleotide polymorphisms (fSNPs) in the genes encoding ETS2 and SIM2 in a group of patients and control subjects to better understand the association of these variants with DS phenotypes.

Methods

We employed an in silico approach to identify potential target pathways of ETS2 and SIM2. fSNPs in the genes encoding these two TFs were identified using available databases. Selected sites were genotyped in individuals with DS, their parents, patients with ALL, BC or OC, and ethnically matched control individuals. We further analyzed these data by population-based statistical methods.

Conclusions

We infer from the present investigation that the difference in frequencies of fSNPs and their independent as well as interactive effects may be the cause for altered expression of SIM2 and ETS2 in DS and malignant groups, which affects different downstream biological pathways. Thus, altered expression of SIM2 and ETS2 could be one of the reasons for variable occurrence of different malignant conditions in DS.

Background

Transcription factors (TFs) regulate disease-related pathways either by acting directly on target genes or by controlling downstream pathways, which makes them important candidates for investigating the etiology of complex diseases. Several TF-encoding genes are located on human chromosome 21 (HSA21), and deregulated expression of any of them could influence downstream pathways. Because of trisomy of HSA21 in Down syndrome (DS) (MIM# 190685), genetic overdosage of a number of TF-encoding genes is a distinct possibility. DS patients are prone to acute leukemia, including acute lymphoblastic leukemia (ALL), whereas solid tumors, especially breast cancer (BC), are rare [1]. We hypothesized that DS-related abnormalities such as intellectual disability, immunological imbalance, hormonal alteration, and predisposition to childhood acute leukemia could be due to improper expression and functioning of TFs located on HSA21. Because disease association studies have revealed a higher differential expression ratio in different tissues for the HSA21 TF genes encoding Single minded 2 (SIM2) and V-ets erythroblastosis virus E26 oncogene homolog 2 (ETS2) [2], we explored the role of these two TFs in the DS phenotype and related malignancies.

SIM2 is important for normal neuronal development. SIM2 can heterodimerize with aryl hydrocarbon receptor nuclear translocator (ARNT) and translocate to the nucleus to transcriptionally regulate gene expression [3]. Expression of SIM2 mRNA has been detected in fetal brain regions associated with DS pathology [4]. SIM2 also plays an important role in carcinogenesis. After entry into a cell, carcinogenic compounds bind to the cytoplasmic Aryl hydrocarbon receptor (AhR) and are carried to the nucleus. Ligand-bound AhR together with ARNT [5] bind to the Xenobiotic Response Element present in the promoter region of certain genes encoding for oxidative enzymes [6–8]; transcriptional activation of these enzymes accelerates carcinogen metabolism [9]. SIM2 inhibits AhR/ARNT dimerization, thereby inhibiting carcinogen metabolism and promoting carcinogenesis [5, 10]. In addition, SIM2 is the second most consistently over expressed gene in prostate cancer [11] and over expression of the short isoform of SIM2 (SIM2s) is reported in malignant colon, pancreas, and prostate tissues as compared to the corresponding normal tissues [9–11]. SIM2 has further been proposed to have a breast tumor suppressive activity [12] and a genome-wide linkage scan identified three putative breast cancer susceptibility loci, one of which (21q22) harbors SIM2 [13]. Therefore, SIM2 functions as a tumour selective marker and drug target in several types of malignancies [10].

Besides SIM2, ETS2 over expression induces craniofacial defects as well as skeletal anomalies in transgenic mice resembling DS [14]. Increased rate of neuronal apoptosis [15] and amyloid precursor protein (APP) gene transactivation are also observed upon ETS2 over expression [16], which might play an important role in the early onset of Alzheimer’s disease and neuronal abnormalities in DS [16].

ETS2 can act both as a transcriptional activator and as a repressor during cellular proliferation, differentiation and tumorigenesis [17–26]. For instance, cell cycle regulator genes such as bcl-xL, c-myc, cyclin D1 and p53 are activated by ETS2 [27, 28], while BRCA1 expression is repressed in breast cancer tissue [29]. Interestingly, certain genetic translocations in ETS2 were observed in DS patients suffering from leukemia [30]. An interaction of ETS2 and ERG with GATA1 mutations was reported in DS subjects with acute megakaryoblastic leukemia, and activation of the JAK/STAT pathway, a frequent attribute of megakaryocytic malignancies, was identified as a common phenomenon in this malignant transformation [31].

Given the functional importance of SIM2 and ETS2, we sought to investigate alterations in their expression in disease etiology. Functional single nucleotide polymorphisms (fSNPs) in candidate genes are important indicators of their association with disease phenotypes. In the present study, fSNPs of SIM2 and ETS2 were analyzed for their potential role in Indian individuals suffering from DS, ALL and solid tumors, namely BC and oral cancer (OC).

Methods

In Silico analysis to predict pathways regulated by SIM2 and ETS2

We used computational methods to determine the probable pathways regulated by SIM2 and ETS2. Promoter sequences (from −5000 bp to +1000 bp) were retrieved from the Eukaryotic Promoter Database (EPD) (http://www.epd.isb-sib.ch/) and the Transcriptional Regulatory Element Database (TRED) (http://rulai.cshl.edu/TRED). SIM2 and/or ETS2 binding sites in the promoter sequences were identified with a Perl-based program, “Consensus-Finder”.
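The kind of consensus search described above can be illustrated with a minimal Python sketch. It is not the authors' Perl program: the function names, the toy promoter and the motif "GGAW" (a stand-in for an ETS-like GGA(A/T) core) are assumptions chosen only for illustration. The sketch scans both strands of a promoter for an IUPAC consensus string and reports 0-based forward-strand positions.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(promoter, consensus):
    """Return sorted (position, strand) hits of an IUPAC consensus.

    Positions are 0-based offsets on the forward strand; a lookahead
    pattern is used so that overlapping hits are all reported.
    """
    promoter = promoter.upper()
    body = "".join(IUPAC[c] for c in consensus.upper())
    pattern = re.compile("(?=(" + body + "))")
    n, k = len(promoter), len(consensus)
    hits = [(m.start(), "+") for m in pattern.finditer(promoter)]
    # Hits on the reverse strand are mapped back to forward coordinates.
    rc = reverse_complement(promoter)
    hits += [(n - m.start() - k, "-") for m in pattern.finditer(rc)]
    return sorted(hits)


# Hypothetical promoter fragment and illustrative motif (assumptions).
example_promoter = "TTAGGAAGCCTTCCTA"
example_hits = find_sites(example_promoter, "GGAW")
print(example_hits)
```

A gene would then count as a putative common target when its promoter yields at least one hit for each factor's consensus; the actual consensus sequences and any mismatch tolerance used in the study are not stated here.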

We retrieved tissue expression profiles of the genes harboring binding sites for SIM2 and ETS2 from the GNF SymAtlas database (http://symatlas.gnf.org/SymAtlas/). Tissue-specific differential expression patterns (fold change) were calculated separately by comparison with the median expression value; a value greater than the median was considered over expression and a value less than the median was considered under expression. We then determined co-expression of putative target genes in the tissues where SIM2 and ETS2 were over expressed or under expressed, and the promoter sites of these genes were analyzed by GENEDOC and Promoter Scan (http://www-bimas.cit.nih.gov/molbio/proscan/). Functions of these putative target genes, along with SIM2 and ETS2, in various biological pathways were analyzed by Panther (http://www.pantherdb.org/pathway/) and KEGG pathway (http://www.genome.ad.jp/kegg/pathway.html). The entire process is presented schematically in Figure 1.
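The median-based labelling and the co-expression step can be sketched as follows. This is only an illustration under stated assumptions: the median is taken across tissues for each gene, the tissue names and expression values are invented, and the function names are hypothetical rather than part of any tool used in the study.

```python
from statistics import median


def classify_expression(values_by_tissue):
    """Label each tissue relative to the gene's median across tissues."""
    med = median(values_by_tissue.values())
    labels = {}
    for tissue, value in values_by_tissue.items():
        if value > med:
            labels[tissue] = "over"
        elif value < med:
            labels[tissue] = "under"
        else:
            labels[tissue] = "median"
    return labels


def co_expressed_tissues(tf_labels, target_labels, state):
    """Tissues in which the TF and a candidate target share a state."""
    return sorted(
        tissue for tissue, label in tf_labels.items()
        if label == state and target_labels.get(tissue) == state
    )


# Invented expression values (assumptions, not data from the study).
tf_profile = {"brain": 120.0, "liver": 40.0, "heart": 80.0}
target_profile = {"brain": 15.0, "liver": 9.0, "heart": 3.0}

tf_labels = classify_expression(tf_profile)
target_labels = classify_expression(target_profile)
shared_over = co_expressed_tissues(tf_labels, target_labels, "over")
print(tf_labels, shared_over)
```

Under this reading, a target gene would fall in a co-expressed set when the shared-state list covers every tissue in which the TF is over (or under) expressed, and in a reverse-pattern set when the labels are opposite.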

Figure 1

Schematic presentation of the in silico methods used for identification of putative target pathways of SIM2 and ETS2.

CEU: Caucasians from Utah with ancestry from western and northern Europe; YRI: Yoruba from Ibadan, Nigeria; HCB: Han Chinese from Beijing, China; JPT: Japanese from Tokyo, Japan.

Subjects

Five ethnically matched groups of individuals were recruited for analysis of fSNPs. Healthy volunteers without any clinical history of intellectual disability or malignant disorder were recruited as controls (N = 149). Nuclear families having a child with DS (N = 132) were recruited from the outpatient department of Manovikas Kendra, Kolkata, and the trisomic status of the probands was confirmed by karyotyping. ALL patients (N = 38) were recruited from the Netaji Subhash Chandra Bose Cancer Research Institute, Kolkata. Genomic DNA from post-operative normal tissue adjacent to malignant BC (N = 49) and OC (N = 54) was collected from Chittaranjan National Cancer Research Institute and Indian Institute of Chemical Biology, Kolkata, respectively. All samples were acquired after obtaining informed written consent for participation. The Institutional Human Ethical Committee approved the study protocol.

Sample collection, DNA isolation and genotyping

Peripheral blood (~5 ml) collected from control individuals, DS probands, their parents and ALL patients was used for extraction of genomic DNA [33]. Target sequences were amplified and PCR amplicons were subjected to genotyping (Table 2).

Statistical analyses

Differences in allelic and genotypic frequencies of the studied fSNPs between each study group and the controls were calculated with a simple r × c contingency table (http://www.physics.csbsju.edu/stats/contingency_NROW_NCOLUMN_form.html). Minor allele frequency (MAF) of eleven fSNPs in Indian control individuals was also compared with that of four major populations studied in the HapMap project [Caucasians from Utah with ancestry from western and northern Europe (CEU), Han Chinese from Beijing, China (HCB), Japanese from Tokyo, Japan (JPT) and Yoruba from Ibadan, Nigeria (YRI)]. Allelic odds ratios were calculated with an odds ratio calculator (http://www.hutchon.net/ConfidORnulhypo.htm). All P values obtained by allelic and genotypic association tests were corrected for multiple testing using PLINK [34] and R (http://www.r-project.org/). Linkage disequilibrium (LD) between the SNPs was measured by Haploview 4.1 using default settings. Haplotype frequencies of fSNPs were examined with the Unphased program (version 2.404) [35]. Interaction among the genotypes of SIM2 and ETS2 was analyzed by multifactor dimensionality reduction (MDR) software (version 2.0 beta 8.1) [36], and values were expressed as information gain (IG). Power of all chi-square tests was calculated with the Piface program [37]. Genotype data of four fSNPs (rs461155 and rs1051425 in ETS2; rs2073601 and rs2073416 in SIM2) were also included in the LD, haplotype and SNP-SNP interaction analyses. For convenience, in DS probands triplicate homozygous genotypes were treated as diploid homozygous genotypes and triplicate heterozygous genotypes as diploid heterozygous genotypes in all calculations, so that they could be compared with the respective diploid reference groups [32, 38].
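The core calculations named above (collapsing trisomic genotypes, building allele counts, an r × c chi-square test and an allelic odds ratio) can be sketched in plain Python. This is an illustration only and does not reproduce the web calculators, PLINK or R: the function names and genotype strings are hypothetical, the P value uses closed forms that hold only for 1 and 2 degrees of freedom, and the confidence interval uses the Woolf (log) method, which is one common choice and an assumption here.

```python
import math
from collections import Counter


def collapse_trisomic(genotype):
    """Map a trisomic genotype (e.g. 'AAG') to its diploid class ('AG')."""
    alleles = sorted(set(genotype))
    if len(alleles) == 1:
        return alleles[0] * 2        # AAA -> AA
    if len(alleles) == 2:
        return "".join(alleles)      # AAG or AGG -> AG
    raise ValueError("expected a biallelic genotype: " + genotype)


def allele_counts(genotypes, alleles):
    """Count alleles over diploid genotypes, in the order of `alleles`."""
    counts = Counter("".join(genotypes))
    return [counts[a] for a in alleles]


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom (r x c table)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, (len(table) - 1) * (len(table[0]) - 1)


def chi_square_p(stat, df):
    """Upper-tail P value; closed forms cover only df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise NotImplementedError("use a statistics library for df > 2")


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio (a*d)/(b*c) with a Woolf (log-based) 95% interval."""
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, estimate * math.exp(-z * se), estimate * math.exp(z * se)


# Invented genotypes and counts (assumptions, not data from the study).
proband_genotypes = [collapse_trisomic(g) for g in ["AAG", "GGG", "AGG", "AAA"]]
proband_alleles = allele_counts(proband_genotypes, "AG")
stat, df = chi_square([[10, 20], [20, 10]])
p_value = chi_square_p(stat, df)
```

As a consistency check on the closed forms, `chi_square_p(6.333, 1)` evaluates to about 0.012 and `chi_square_p(6.41, 2)` to about 0.041, in line with the allelic and genotypic P values reported for rs2269188 in ALL.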

Results

In silico analysis to predict pathways regulated by SIM2 and ETS2

Computational expression analysis by GNF SymAtlas showed that all the splice variants of SIM2 and ETS2 were over expressed in 13 tissues and under expressed in 12 tissues (Additional file 1: Table S1). Binding sites for both SIM2 and ETS2 were identified in 464 genes from the Eukaryotic Promoter Database by the ‘Consensus-Finder’ program. These putative target genes of SIM2 and ETS2 were sorted into four groups (Additional file 1: Table S2). Gene set I contains 71 genes that were over expressed in all the tissues where SIM2 and ETS2 were also over expressed. Gene set II comprises 9 genes that were down regulated in all the tissues where SIM2 and ETS2 were also down regulated. Gene sets III and IV exhibited the reverse pattern of expression compared with SIM2 and ETS2. In addition, SP1 and AP2 were identified as common TFs for both SIM2 and ETS2 target genes.

In silico identification of functional variants

Different in silico tools identified functional genetic variants in SIM2 and ETS2. Among them, thirty five (seven in SIM2 and twenty eight in ETS2) SNPs were genotyped in this study. Functional significance of the SNPs is indicated in Table 1.

Allelic and genotypic frequency distribution

Comparative analysis of MAF in different populations revealed significant difference in many SNPs (rs374575, rs2070529, rs2070530, rs1051476, rs11254 and rs711 in CEU; rs2269188 and rs7276961 in HCB; rs2269188, rs2070529 and rs2070530 in JPT; rs2269188, rs374575, rs2070529 and rs711 in YRI) (Table 4). Among seven SNPs studied in SIM2 (Table 1), only rs2269188 was polymorphic in the studied population. This SNP showed significant difference in allelic (χ2 =6.333, P = 0.012, Power = 82.3%) and genotypic (χ2 =6.41, P = 0.041, Power = 74.17%) frequency only in ALL compared to the control (Additional file 1: Table S3). However, the differences were not significant after correction for multiple testing.

Table 4

Minor allele frequencies in different populations as compared to the Indian control individuals

Twenty eight SNPs in the ETS2 genomic region were analyzed and ten of them were polymorphic in the studied population. Eight of the ten ETS2 SNPs (rs374575, rs2070529, rs2070530, rs2070531, rs6517481, rs7276961, rs1051475 and rs1051476) did not show any significant difference in allelic frequency in DS probands, their parents or the malignant groups (Additional file 1: Table S3). rs11254 showed a marginal allelic association in DS probands (P = 0.04712), which did not survive Bonferroni (BF) or Benjamini-Hochberg (BH) correction for multiple testing (Table 5). rs711 showed a significant increase in the ‘G’ allele frequency in probands with DS (χ2 = 8.51, BF P and BH P = 0.03, Power = 43.47%) as compared to controls. Although a significant increase in the ‘G’ allele (χ2 = 6.83, P = 0.00895, Power = 85.03%, OR = 2.6) was noticed in ALL patients, it was only marginally significant after correction for multiple testing (BH P = 0.06). On the other hand, a significant increase in the ‘A’ allele (χ2 = 9.91, BF P and BH P = 0.01, Power = 88.26%) was observed in BC patients (Table 5).
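For readers unfamiliar with the two corrections, the following Python sketch shows how Bonferroni and Benjamini-Hochberg adjusted P values are obtained from a list of raw P values. It illustrates the standard procedures rather than the PLINK or R calls used in the study; the input values are invented and the number of tests in the actual analysis is not assumed.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values
    # never increase as the raw P value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented raw P values (assumptions, not results from the study).
raw_p = [0.01, 0.04, 0.03, 0.005]
bf_p = bonferroni(raw_p)
bh_p = benjamini_hochberg(raw_p)
print(bf_p, bh_p)
```

Because BH scales each P value by the number of tests divided by its rank, it is never more conservative than Bonferroni, which is why a result can fail BF correction yet remain close to significance under BH.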

LD and haplotype analysis

SNP pairs that showed higher LD (high D’ or r2 value) in at least one combination or different LD patterns in control and case groups during pair wise analysis by Haploview 4.1 were sorted out. In control individuals and parents of probands with DS, all the studied SNPs exhibited strong LD (Table 6). In particular, rs6517481-rs7276961, rs1051475-rs1051476, rs2070529-rs2070530, rs2070531-rs6517481, rs2070531-rs7276961 pairs exhibited strong LD in other studied groups. Some paired combinations showed different LD pattern in different disease groups. For instance, rs11254 showed weak LD with all the sites in DS and BC groups, while rs461155 showed weak LD in OC. Statistically significant differences in frequency of several haplotypes were noticed between test and control groups (Figure 2). Notably the ‘A-C-C-C-T-C-C-A-A-T-C-C-C-G-G’ haplotype showed significant frequency difference in BC, DS proband and their parents when analyzed by Unphased. However, comparison by simple Chi-square test followed by analysis of the power of association by Piface (Additional file 1: Table S4) showed statistically significant difference only for the DS probands and BC groups (p value 0.054 and 0.013 respectively).
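The two pairwise LD measures used here, D′ and r2, follow from haplotype and allele frequencies as in the short Python sketch below. It shows the standard definitions, not Haploview's implementation (which also has to estimate haplotype frequencies from unphased genotypes); the function name and the frequencies are invented for illustration.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A (locus 1) and allele B (locus 2)
    """
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b
    # D' rescales D by its maximum possible magnitude given the
    # allele frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / denom
    return d, d_prime, r2


# Invented frequencies: allele B occurs only on haplotypes carrying A.
d, d_prime, r2 = ld_measures(p_ab=0.2, p_a=0.5, p_b=0.2)
print(round(d_prime, 3), round(r2, 3))
```

The example gives D′ = 1 with r2 = 0.25, which shows why a SNP pair can have complete D′ while its r2 is well below 1 when the allele frequencies differ; r2 never exceeds D′.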

Table 6

Pair wise LD pattern of studied SNPs

D′ values are shown above the diagonal and r2 values below the diagonal.

SNP        rs6517481  rs7276961  rs1051475  rs1051476  rs2070529  rs2070530  rs2070531  rs461155   rs11254

Control
rs6517481  -          1          0.926      0.926      0.909      0.930      0.949      0.735      0.719
rs7276961  1          -          0.926      0.926      0.909      0.930      0.949      0.735      0.719
rs1051475  0.774      0.774      -          1          0.899      0.923      0.944      0.765      0.722
rs1051476  0.774      0.774      1          -          0.899      0.923      0.944      0.765      0.722
rs2070529  0.499      0.499      0.441      0.441      -          1          0.955      0.836      0.660
rs2070530  0.501      0.501      0.445      0.445      0.958      -          0.977      0.849      0.680
rs2070531  0.902      0.902      0.805      0.805      0.551      0.553      -          0.757      0.718
rs461155   0.366      0.366      0.362      0.362      0.622      0.615      0.389      -          0.612
rs11254    0.491      0.491      0.503      0.503      0.250      0.255      0.490      0.240      -

Father of DS proband
rs6517481  -          1          0.856      0.853      0.871      0.954      0.905      0.907      0.845
rs7276961  1          -          0.856      0.853      0.871      0.954      0.905      0.907      0.845
rs1051475  0.579      0.579      -          1          0.710      0.797      0.888      0.799      0.897
rs1051476  0.560      0.560      0.974      -          0.717      0.802      0.886      0.805      0.931
rs2070529  0.423      0.423      0.369      0.387      -          0.974      0.956      0.807      0.876
rs2070530  0.450      0.450      0.399      0.415      0.841      -          1          0.848      1
rs2070531  0.796      0.796      0.606      0.586      0.494      0.479      -          0.903      0.905
rs461155   0.417      0.417      0.398      0.414      0.579      0.703      0.400      -          0.861
rs11254    0.714      0.714      0.643      0.674      0.442      0.494      0.795      0.376      -

Mother of DS proband
rs6517481  -          1          0.828      0.830      0.919      0.918      0.979      0.769      0.916
rs7276961  1          -          0.828      0.830      0.919      0.918      0.979      0.769      0.916
rs1051475  0.549      0.549      -          1          0.658      0.671      0.849      0.548      0.926
rs1051476  0.563      0.563      0.981      -          0.679      0.693      0.849      0.571      0.927
rs2070529  0.538      0.568      0.346      0.362      -          1          1          0.745      0.918
rs2070530  0.528      0.528      0.346      0.363      0.982      -          1          0.750      0.916
rs2070531  0.920      0.920      0.552      0.565      0.612      0.601      -          0.759      0.937
rs461155   0.370      0.370      0.228      0.243      0.535      0.563      0.346      -          0.728
rs11254    0.823      0.823      0.663      0.677      0.527      0.515      0.861      0.316      -

DS proband
rs6517481  -          1          0.865      0.862      0.875      0.975      0.939      0.761      0.255
rs7276961  1          -          0.865      0.862      0.875      0.975      0.939      0.761      0.255
rs1051475  0.710      0.710      -          1          0.830      0.926      0.903      0.774      0.274
rs1051476  0.680      0.680      0.966      -          0.807      0.903      0.885      0.756      0.266
rs2070529  0.501      0.501      0.476      0.458      -          0.949      0.825      0.812      0.256
rs2070530  0.593      0.593      0.565      0.546      0.859      -          0.911      0.823      0.273
rs2070531  0.754      0.754      0.735      0.718      0.523      0.608      -          0.699      0.269
rs461155   0.375      0.375      0.410      0.404      0.659      0.645      0.376      -          0.167
rs11254    0.039      0.039      0.043      0.039      0.026      0.028      0.037      0.011      -

ALL
rs6517481  -          1          1          1          1          1          1          0.761      1
rs7276961  1          -          1          1          1          1          1          0.761      1
rs1051475  0.939      0.939      -          1          0.926      1          1          0.698      1
rs1051476  0.939      0.939      1          -          0.926      1          1          0.698      1
rs2070529  0.698      0.698      0.638      0.638      -          1          0.926      0.748      0.858
rs2070530  0.625      0.625      0.665      0.665      0.894      -          1          0.731      1
rs2070531  0.939      0.939      1          1          0.638      0.665      -          0.698      1
rs461155   0.428      0.428      0.383      0.383      0.529      0.451      0.383      -          0.636
rs11254    0.883      0.883      0.940      0.940      0.582      0.708      0.940      0.339      -

BC
rs6517481  -          1          0.884      0.884      1          1          1          0.5        0.122
rs7276961  1          -          0.884      0.884      1          1          1          0.5        0.122
rs1051475  0.741      0.741      -          1          1          1          0.884      0.891      0.077
rs1051476  0.741      0.741      1          -          1          1          0.884      0.891      0.077
rs2070529  0.631      0.631      0.605      0.605      -          1          1          0.495      0.274
rs2070530  0.605      0.60       0.582      0.582      1          -          1          0.495      0.269
rs2070531  1          1          0.741      0.741      0.631      0.605      -          0.5        0.122
rs461155   0.125      0.125      0.350      0.350      0.204      0.204      0.125      -          0.425
rs11254    0.013      0.013      0.001      0.001      0.041      0.038      0.013      0.088      -

OC
rs6517481  -          1          0.948      0.948      1          1          1          0.334      0.899
rs7276961  1          -          0.948      0.948      1          1          1          0.334      0.899
rs1051475  0.697      0.697      -          1          1          1          0.950      0.632      0.907
rs1051476  0.697      0.697      1          -          1          1          0.950      0.632      0.907
rs2070529  0.607      0.607      0.471      0.471      -          1          1          0.391      0.856
rs2070530  0.630      0.630      0.488      0.488      0.963      -          1          0.306      0.861
rs2070531  0.960      0.960      0.728      0.728      0.582      0.605      -          0.306      0.902
rs461155   0.060      0.060      0.168      0.168      0.137      0.049      0.049      -          0.656
rs11254    0.656      0.656      0.787      0.787      0.360      0.378      0.687      0.189      -

Figure 2

Haplotypes showing significantly different frequency in Father of probands with DS-DSF (A), Mother of probands with DS-DSM (B), probands with DS- DSP (C), ALL (D), BC (E), OC (F). Order of the SNPs in the haplotypes is rs461155-rs1051425-rs11254-rs374575-rs2070529-rs2070530-rs2070531-rs6517481-rs7276961-rs1051475-rs1051476-rs2269188-rs2073601-rs2073416-rs711.

Analysis of gene-gene interaction

Gene-gene interaction analysis by MDR 2.0 beta 8.1 indicated that different combinations of SNPs interacted with each other in different ways within these groups. No highly synergistic interaction was observed in DS probands, whereas the individual effects of several SNPs were high in DS [rs2073601 (1.92%), rs461155 (2.64%), rs1051425 (1.21%), rs2070529 (1.47%), rs2070530 (3.04%), rs2070531 (1.09%), rs6517481 (1.66%), rs7276961 (1.66%), rs1051475 (2.24%), rs1051476 (2.56%), rs11254 (29.78%) and rs711 (4.26%)] (Table 7).

While there was no high individual effect of rs2070529, rs2070530, rs2070531, rs6517481 and rs7276961, high synergistic interaction of these SNPs with rs11254 was noticed in mother of probands with DS (Table 7). In father of probands with DS, rs2070531, rs6517481 and rs7276961 also made a cluster together with rs2073416. High individual effect of rs2073601 (1.77%), rs1051475 (2.09%), rs1051476 (2.29%) and rs711 (3.08%) was observed in father of probands with DS (Table 7), while rs2073601 (1.38%), rs2073416 (1.96%), rs2269188 (17.17%), rs461155 (1.06%), rs1051425 (1.11%), rs374575 (1.10%), rs1051475 (1.47%), rs1051476 (1.34%) and rs711 (4.03%) showed high individual effect in mothers (Table 7).
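The distinction drawn above between an individual effect and a synergistic interaction can be made concrete with an entropy-based sketch in Python. This is not the MDR software: the function names and toy data are assumptions, the quantities are reported in bits, and the percentages quoted in the text are presumably such quantities normalised by the entropy of case-control status (that normalisation is assumed, not shown).

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits from paired observations."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # p(x,y) / (p(x) p(y)) simplifies to count * n / (count_x * count_y).
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def interaction_gain(a, b, status):
    """IG(A;B;C) = I(A,B;C) - I(A;C) - I(B;C).

    A positive value suggests synergy: the two SNPs together tell
    more about status than the sum of their individual effects.
    """
    pair = list(zip(a, b))
    return (mutual_information(pair, status)
            - mutual_information(a, status)
            - mutual_information(b, status))


# Toy data (assumption): status depends only on the combination of
# the two genotypes, so neither SNP has an individual effect.
snp_a = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
status = [0, 1, 1, 0]

individual_a = mutual_information(snp_a, status)
synergy = interaction_gain(snp_a, snp_b, status)
print(individual_a, synergy)
```

In this toy case each SNP alone carries no information about status while the pair carries one full bit, the pattern described for SNPs that show low individual effects but cluster in a synergistic interaction.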

Discussion

The present study was aimed at identifying possible involvement of SIM2 and ETS2, two TFs known to have gene overdosage in probands with DS exhibiting trisomy of HSA21. To identify SIM2 and ETS2 targets, we focused on 464 genes containing binding site for both these factors in their regulatory regions (−5000 bp to +1000 bp). Following categorization based on expression pattern by GNF SymAtlas, 91 genes were identified as up- or down regulated by these TFs (Additional file 1: Table S2). Genes like ABP1, HRB2, S100A8, THBS1, CYB561, GATA1, GATA3, SP1 and AP2 indicated potential activation by SIM2 and ETS2 (gene set I and II), while genes such as GCNT2, MASP1, LOC338328, PCSK4, ICAM1, LPPR4, SLC25A21, H1F0 and ATP1A1 indicated potential repression by these TFs (gene set III and IV). Many of the genes with binding sites for SIM2 and ETS2, viz. KLK8, LCK, TRRAP, GATA3, etc. were earlier reported to have role in neurological as well as malignancy related pathways [28, 39, 40]. Analysis in the present study by Panther also revealed that genes such as KLK8, KRT16, and LCK carrying binding sites for SIM2 and ETS2, are involved in the development and function of the neurological system. Hence over expression of SIM2 and ETS2 might alter expression of the downstream target genes leading to different DS phenotypes.

Previous analysis of DS revealed ambiguous observations on expressions of genes in HSA21 and other autosomes. For instance, a dosage dependent increase in transcription across different tissue/cell types was noticed in DS [41]. Analysis of lymphoblastoid cell lines generated from unrelated individuals revealed over expression of several HSA21 genes even in normal healthy volunteers [42]. In contrast, gene expression profile analysis of hearts of human fetuses with trisomy of HSA21 showed significant downregulation of 278 genes and upregulation of 195 genes as compared to controls [43]. On the other hand, serial analysis of gene expression in lymphocytes from children with DS revealed modest deregulation of autosomal genes [44]. Whole genome microarray in adult DS brains showed upregulation of 27% of genes on HSA21 as compared to 4.4% of genes on other autosomes [45]. Contrary to that, microarray analysis of cultured amniocytes and chorionic villus cells from fetuses with trisomy 13, 18, or 21 revealed lack of over expression of most of the HSA21 genes with only modest changes for genes on all other chromosomes [46]. It is possible that the differences in gene expression in HSA21 and other autosomes are due to the tissue of origin [47].

Differential expression of SIM2 and ETS2 target genes was also reported in different malignancies. For instance, the TRRAP gene, involved in transcriptional regulation and DNA repair, was found to be high in bone metastases from prostate cancer, intermediate in BC, and low in lung and kidney cancers [39]. KLK8 was upregulated in colorectal cancer and ovarian cancer while underexpressed in esophageal and cervical cancer [40, 48]. Differential gene expression profiling of approximately 8000 genes in sixty different cancer cell lines revealed difference in gene expression pattern to be correlated with the tissue of origin and the physiological properties (e.g., doubling time, drug metabolism, and interferon response) of cell lines [49]. Difference in expression between specific cancer cell line and their nonmalignant counterparts was also noticed [49]. It is intriguing to note that genes like MAGEA3 and ATP1A1, which indicated potential over expression in our study, are also over expressed in leukemia/lymphoma [48, 50–52]. However, THBS1, which also indicated potential upregulation in the present study, was down regulated in leukemia and upregulated in lymphoma [48, 53]. Thus, it remains unclear whether differential expression is also taking place for genes identified by our present in silico analysis. Further validation, involving expression analysis in various tumor tissues of individuals with DS, will be necessary.

Our next goal was to identify fSNPs in these two TFs. A number of SNPs with deleterious effects were identified in both genes by our in silico approach. Analysis of allelic frequencies showed significant differences in MAF between the Indian control population and other Asian (Japanese and Chinese) as well as Caucasian populations. Frequency distribution analysis revealed that the rs2269188 ‘G’ allele was significantly more frequent in ALL subjects, although this did not survive correction for multiple testing. Haplotypes showing significant differences in the ALL, BC and OC groups harbored the rs2269188 ‘G’ allele. MDR analysis revealed a high individual effect of this SNP in ALL (2.56%) and in mothers of DS probands (17.17%) but not in any other group. The ‘G’ allele is responsible for AhR binding to SIM2. AhR binding with ARNT is an important step in carcinogen metabolism, which is inhibited by SIM2 [6–10]. We speculate that increased frequency of the rs2269188 ‘G’ allele may result in inappropriate metabolism of carcinogenic compounds, thus contributing to the development of leukemia.

rs711 is a site for SR protein mediated splicing regulation and may generate splice variants. In the Korean population, rs711 was reported to be associated with increased risk for acute myeloid leukemia [54]. In the present study, difference in allelic frequency for this site showed a trend to be significant in ALL even after correction for multiple testing, while DS probands, parents of DS probands and BC showed significant differences. MDR analysis supported evidence of individual effect of this SNP in all the studied groups.

rs11254, rs2070530 and rs1051476 showed significant difference in genotype distribution in DS probands (BH P = 0.001, 0.01 and 0.03 respectively). Though there was individual effect of these SNPs (29.78%, 3.04%, and 2.56% respectively), no significant synergistic effect was observed. rs11254 showed a very high individual effect (29.78%) in DS probands which could be due to 100% reduction in heterozygosity. On the other hand in malignant groups, rs11254 showed interactive effect in synergistic mode with other SNPs (rs2070530, rs2070531, rs6517481, rs7276961, rs1051475 and rs1051476). Therefore, this SNP may act differently in DS and other malignant groups.

While comparing differences in haplotype frequencies generated by fifteen SNPs, we analyzed each pair by simple Chi square tests to avoid errors due to multiple comparisons. The ‘A-C-C-C-T-C-C-A-A-T-C-C-C-G-G’ haplotype showed statistically significant higher occurrence in the control group compared to DS probands and BC. Frequency of this haplotype was also higher compared to other haplotypes generated from these 15 SNPs, which may be conferring protection towards the diseases.

MDR analysis showed a high individual entropy value for rs461155 in both the BC and OC groups. Involvement of the risk allele of rs461155 in subjects with these two solid tumors has also been reported earlier [32]. Therefore, from the present study we predict that rs461155 may individually play an important role in the solid tumor groups (BC and OC). On the other hand, rs2070530, rs2070531, rs6517481, rs7276961, rs1051475, rs1051476 and rs11254 may act together in the ALL, BC and OC groups, where rs11254 acts as a nodal SNP. In silico analysis revealed that rs11254 has the potential to alter miRNA and TF binding sites in the 3'UTR of ETS2. Presence of the risk allele and inappropriate interaction of rs11254 could hamper proper expression of ETS2. There are various reports on loss of heterozygosity (LOH) of different genes under different malignant conditions such as ovarian tumors [55], BC [56], head and neck squamous cell carcinoma [57], pituitary tumors [58] and AML [59]. We found 100% LOH for rs11254 in DS probands.

Analysis of LD pattern of studied SNPs exhibited that rs6517481, rs7276961, rs1051475, rs1051476, rs2070529, rs2070530, rs2070531, rs461155 and rs11254 are in high LD in the studied population. MDR analysis also provided evidence of interaction between these SNPs in the malignant groups and parents of probands with DS and thus, may suggest combined effect of these fSNPs in the studied groups.

Similar to the present observation, SNP pairs rs2070529-rs2070530 were found to be in high LD in other populations studied in the HapMap; LD data for other SNP pairs were not available. Both haplotype distribution pattern and LD between different SNPs were found to vary in different groups examined in the present investigation, which could be attributed to the difference in allelic frequencies. Whether the observed difference is contributing to the disease etiology requires further analysis.

Our results do not imply that ETS2 and SIM2 are the only TFs on HSA21 with a role in oncogenesis, because several other TFs located on HSA21 are also associated with malignancies [31, 60]. For example, increased expression of BACH1 (a transcriptional regulator of the megakaryocytic differentiation process) and SON (which shares sequence homology with the MYC family of oncoproteins) was reported in association with myeloid leukemia in DS [61]. RUNX1 and ERG were hypothesized as candidates for leukemia in non-DS patients; however, triplicate dosage of these two genes was unable to generate transient myeloproliferative leukemia in Ts1Cje mice, and thus these two genes may not be directly responsible for the development of leukemia in individuals with DS [62]. Further analysis of these TFs, in association with SIM2 and ETS2, would help us to understand their actual role in DS-associated malignancies.

Conclusions

We summarize that, a) the rs2269188 ‘G’ allele, showing trend for higher occurrence in ALL patients (BH P = 0.06, OR = 2.6), may play a regulatory role in ALL by altering carcinogen metabolism; in mother of probands with DS also, this SNP may contribute some regulatory role as the individual effect of this SNP calculated by MDR analysis was very high (Table 7); that b) rs711 may have very important role in DS and associated malignancies; that c) the fSNP rs11254 may act as a core SNP in the interaction cluster of rs6517481, rs7276961, rs1051475, rs1051476, rs2070529, rs2070530 and rs2070531, thus playing a role in malignant development in BC, OC, ALL; in parents of DS probands, these SNPs also showed strong interaction while in DS, a high individual effect of rs11254 was found; and that d) rs2070530, rs711 and rs11254 (with 100% LOH) showed strong genotypic association with DS. This prominent difference in status of fSNPs of SIM2 and ETS2 may indicate a significantly different pattern of SIM2 and ETS2 regulation in the studied groups, eventually leading to altered expression of their downstream genes associated with distinct disease phenotypes.

Declarations

Acknowledgement

Authors are thankful to all the study participants and the Indian Council of Medical Research for providing senior research fellowship to AC (#45/1/2010-Hum/BMS).

Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.