Abstract

Background

Recent meta-analyses of genome-wide association studies have identified single nucleotide polymorphisms (SNPs) within/near 54 genes associated with lung function measures. Current understanding of the contribution of these genes to human lung development is limited. We set out to further define i) the expression profile of these genes during human lung development using a unique set of resources to examine both mRNA and protein expression and ii) the link between key polymorphisms and genes using expression quantitative trait loci (eQTL) approaches.

Methods

The mRNA expression profile of lung function associated genes across pseudoglandular and canalicular stages of lung development were determined using expression array data of 38 human fetal lungs. eQTLs were investigated for selected genes using blood cell and lung tissue data. Immunohistochemistry of the top 5 candidates was performed in a panel of 24 fetal lung samples.

Results

Twenty-nine lung function associated genes were differentially expressed during lung development at the mRNA level. The greatest magnitude of effect was observed for 5 genes; TMEM163, FAM13A and HHIP which had increasing expression and CDC123 and PTCH1 with decreased expression across developmental stages. Focussed eQTL analyses investigating SNPs in these five loci identified several cis-eQTL’s. Protein expression of TMEM163 increased and CDC123 decreased with fetal lung age in agreement with mRNA data. Protein expression in FAM13A, HHIP and PTCH1 remained relatively constant throughout lung development.

Conclusions

We have identified that > 50 % of lung function associated genes show evidence of differential expression during lung development and we show that in particular TMEM163 and CDC123 are differentially expressed at both the mRNA and protein levels. Our data provides a systematic evaluation of lung function associated genes in this context and offers some insight into the potential role of several of these genes in contributing to human lung development.

Keywords

Lung functionDevelopmentExpressionGeneticsTMEM163CDC123

Background

Determinants of lung function and adult lung diseases, such as Chronic Obstructive Pulmonary Disease (COPD) may have early origins [1–4]. Current knowledge of human lung development is based on evidence provided by anatomical dissections and histological staging. Historically, lung development has been divided into 5 stages of growth due to the presence of select morphological features, Embryonic (26 days post conception −5 weeks), Pseudoglandular (5–16 weeks), Canalicular (16–26 weeks), Saccular (26 weeks-birth) and Alveolar (birth to 6 months). Previous work by Kho et al. sought to further our understanding of human lung development and consider differences in expression at the molecular level [5]. The authors investigated the hypothesis that the temporal changes in gene expression regulated the morphological transformation of early cells to a fully functional differentiated organ. Profiling the gene expression signatures in a panel of human fetal lungs identified differential gene expression during organ development and revealed the presence of several molecular stages. The early pseudoglandular stage was enriched for genes involved in chromosomal organisation processes associated with mitosis. The late pseudoglandular stage was enriched for genes related to surfactant function-gas exchange and immunological-MHC class II attributes providing evidence of at least a second distinct molecular phase of development [5].

Recently, five Genome-Wide Association Study (GWAS) meta-analyses have been completed that identified single nucleotide polymorphisms (SNPs) within or near 54 genes associated with spirometry measures including; Forced expiratory volume in 1 s (FEV1), forced vital capacity (FVC) and the ratio of FEV1/FVC [6–10]. These measures of lung function are commonly used to define respiratory diseases such as COPD. To date, no studies have explored the potential role of these genes in human lung development by expression profiling this set of genes in human fetal lung. We hypothesised that these SNPs identified by GWAS tag lung function genes which may contribute to the lung developmental signatures described [5]. This underlying hypothesis is supported by data from large meta-analyses of these lung function GWAS as association signals were observed both in childhood and adult cohorts suggesting early origins of effects [6]. We therefore set out to identify whether the genes potentially underlying these associations identified in lung function GWAS were i) differentially expressed across stages of human lung development using mRNA analyses; ii) if we could identify expression quantitative trait loci (eQTL) in these genomic regions contributing to the levels of these specific genes underlying mechanisms and (iii) if the alteration in mRNA is translated to altered protein levels for selected candidates.

Methods

Selection of genes and gene expression array analysis

We identified genes to study by focussing on studies using GWAS meta-analysis approaches and taking the gene(s) in each loci that were reported as the likely candidate in these loci by the authors based on available evidence [6–10]. This approach generated a list of 54 genes for analyses (Additional file 1. Table S1). Publically available Affymetrix U133 Plus 2 expression array data of 38 fetal lung samples with gestational age 7–22 weeks (Gene Expression Omnibus (GEO) dataset, GSE14334 [5, 11]) were used to identify differential expression during lung development for the 54 genes. A total of 190 probes were interrogated with some genes represented by a single probe, while others had up to 12 probes (Additional file 2: Table S2). Briefly, differential gene expression analysis relative to gestational age was performed using a linear regression model [11]. Multiple testing was corrected for using the Benjamini and Hochberg method. Adjusted p values of < 0.05 were used to show a significantly differential gene expression pattern over the gestational ages studied.

In silico linkage disequilibrium and eQTL analyses

The top 5 differentially expressed genes were taken forward for in silico investigation. We focussed our analyses to the pivotal SNPs reported in the original manuscript and all other SNPs in linkage disequilibrium with these SNP (r2 > 0.8), identified using Haploreg version 2 which is data based on 1000 Genomes Phase I individuals (db SNP 137). These SNPs were subsequently assessed for cis- and trans-eQTL’s in (i) blood using the online blood eQTL browser (http://genenetwork.nl/bloodeqtlbrowser/) detected at a false discovery rate (FDR) of 5 % [12] and in (ii) 1111 lung tissue resections from a study by Hao et al. [13] at a FDR of 10 %.

All probes in adjusted p value of <0.05. Probe = Affymetrix U133 Plus 2 array probe ID. Beta coefficient corresponds to the mean change in gene expression per day during the studied period (7–22 weeks of gestational age)

Identification of genes for further study

Genes with at least 2 of the most significantly differentially expressed probes, a consistent direction of effect and the majority of tested probes showing this effect were taken forward for further analysis. These included Hedgehog interacting protein (HHIP), Transmembrane protein 163 (TMEM163), Family with sequence similarity 13, member A (FAM13A), Patched1 (PTCH1) and Cell division cycle 123 (CDC123). HHIP, TMEM163 and FAM13A showed an increased mRNA expression with fetal lung age (Table 1), whilst PTCH1 and CDC123 showed a significant decrease in expression (Table 2). For visualisation, the most significant probe results for each of these selected genes were plotted (Figs. 1 and 2, Additional file 5: Figure S2, Additional file 6: Figure S3 and Additional file 7: Figure S4). Expression intensities were plotted against gestational age of the developing lung samples.

To investigate whether lung function associated SNPs within the top 5 differentially expressed genes during lung development had evidence of regulated expression, in silico analyses were performed to identify (i) SNPs in Linkage Disequilibrium (LD) with the SNPs of interest (R2 > 0.8) and (ii) expression quantitative trait loci (eQTL) for the SNP of interest and those in LD. Apart from HHIP SNP rs1032295, all lung function associated SNPs had SNPs in LD (Additional file 8: Table S4). eQTL were identified for SNPs in LD with lung function associated SNPs in PTCH1, TMEM163 and FAM13A using the blood cell dataset (Additional file 8: Table S4). 12 cis-eQTLs were identified for SNPs in LD with rs16909898 and rs10512249 (PTCH1) which potentially regulate the expression levels of PTCH1 mRNA. Additionally 2 trans-eQTLs were identified for lung function associated SNP rs16909898 and rs10512249 (PTCH1), both regulating the expression of NEFH (Neurofilament, Heavy Polypeptide, chromosome 22). 3 cis-eQTLs were identified for SNPs in LD with TMEM163 SNP rs1942055, regulating the expression of either MGAT5 (encoding a member of the glycosyltransferase family) or CCNT2 (encoding Cyclin T2). eQTL were only identified for SNPs in LD with HHIP using the human lung dataset (Additional file 9: Table S5). Both rs1980057 and rs12504628 shared 2 SNPs in LD (rs12509311 and rs13141641) and were cis-eQTL’s for HHIP (Additional file 9: Table S5). eQTLs in both Blood and Lung cohorts studied were not identified for SNPs in LD with CDC123 SNP rs7068966.

Discussion

Recent meta-analyses of genome-wide association studies have identified genetic associations which contribute to determination of lung function. We hypothesised that the genes underlying these associations may contribute to lung development potentially through differential expression. To specifically test this hypothesis we expression profiled lung function associated gene mRNA levels in human fetal lung samples and used eQTL analyses to link associated SNPs to the genes of interest, followed by expression profiling at the protein level in human fetal tissue. We found that all 54 genes evaluated had measurable mRNA expression in human fetal lung and 29/54 genes showed evidence of differential expression across the Pseudoglandular and Canalicular stages of lung development. We identified cis-eQTLs in Blood for SNPs in LD with lung function associated SNPs in PTCH1, TMEM163 and FAM13A and cis-eQTLs in HHIP in the lung. Although no cis-eQTL were identified for SNPs in LD with CDC123 or the sentinel SNP associated with lung function and CDC123, we still sought to explore the protein expression of this highly differentially expressed candidate. The rationale for this inclusion was that while eQTL analyses can be useful for functional translation of SNP effects these analyses are tissue, cell and context specific [14]. Therefore based on our analyses of blood and lung eQTL data we cannot exclude that eQTL mechanisms may exist for CDC123. Overall, these data provided evidence that the associated SNPs in the relevant loci showed some evidence of regulating the genes that were differentially expressed in our initial analyses and selected for further study.

Exploring the top 5 candidates further (based on magnitude of effect and reproducibility across probes), we found that TMEM163 protein expression generally increased with fetal age and CDC123 protein expression decreased in agreement with the mRNA data. Of the 5 candidates taken forward for the immunohistochemical study in fetal lungs we found both CDC123 expression (decreased) and TMEM163 (increased) provided the greatest supporting evidence at the protein level, whilst the mRNA and protein level data for PTCH1, FAM13A and HHIP poorly correlated.

In our first set of analyses we identified that 29/54 (>53 %) genes identified in lung function associated loci in GWAS were differentially expressed across human lung development. This is nearly double the percentage of probes differentially expressed during the analysed period of lung development (28 %, see [11]). This enrichment is striking and adds supporting evidence that genetic determinants of lung function may have early origins for a large proportion of loci. While differential expression is not conclusive evidence of a significant biological role, it is also important to note that many of the lung function genes have prior evidence for a role in lung development from both mouse and human studies; e.g., HHIP, PTCH1 [15, 16].

Interestingly, these genes were in our 5 top candidates and are part of the Sonic hedgehog (SHH) signalling pathway. PTCH1 encodes the receptor potentiating signalling, whilst expression of HHIP represses the transcription of HH target genes. We found expression of PTCH1 mRNA and protein expression was generally high throughout lung development, with probe expression intensities > 9 and strong immunopositivity in the majority of fetal lungs which is consistent with other studies in human fetal lung [15]. High protein levels of PTCH1 have been seen in nearly all tissues in adulthood including pneumocytes, macrophages and epithelium of the lung (www.proteinatlas.org). Thus, the evidence suggests that PTCH1 not only has a role in lung development but may also play an important role in determining lung function later in adult life [17]. The second SHH pathway candidate investigated was HHIP, an important morphogen in a variety of developmental processes during embryonic development and SNPs near this gene have previously been associated with risk of COPD [18, 19]. In 2013, S. Collins et al., summarised that polymorphisms in HHIP affect fetal, childhood and adult lung function [20]. HHIP is known to have a role in lung development through fibroblast growth factor 10 (FGF10) and its control of lung branching [21].

Of the most robust findings; i.e., where mRNA differential expression was supported by differential protein expression in the same direction; CDC123 expression (decreased) and TMEM163 (increased) warrant further study. CDC123 is thought to be required for translation initiation and thus could facilitate the biogenesis of the eukaryotic initiation factor 2 (eIF2) and is also potentially involved in protein modifications [22, 23]. At present, there is evidence that TMEM163 (SV13) is a zinc finger binding protein involved in vesicular transport [24] and more recent work has shown TMEM163’s modulation of cellular zinc levels alongside TRPML1 [25].

A separate candidate was FAM13A, which has previously been associated with other lung conditions, e.g., idiopathic pulmonary fibrosis and COPD. It has been hypothesised that FAM13A has a role in Rho GTPase signalling pathways [26] and most recently Jin et al. has identified the regulation of nuclear–cytoplasmic shuttling of FAM13A by B56-containing PP2As and Akt [27]. Additionally, they identified that FAM13A had the ability to activate the Wnt pathway. However, on production of FAM13A-/- mutant mice, they were found to be viable and healthy, showing FAM13A was not essential for embryonic development and physiological functions.

Working with fetal tissues has both strengths and limitations. Key strengths of the current study are the unique use of human fetal samples across gestational ages for both mRNA and protein analyses with associated translational potential. However, it was not possible to use the same sample for both mRNA and protein analyses which introduces sample variation potentially confounding correlations between mRNA and protein expression. Correlation between mRNA and protein throughout lung development may not be present due to the complexities of post-transcriptional regulation. Other limitations of the study were the availability of only a small number of samples for immunohistochemistry at the Canalicular stage of lung development and that the protein analyses were based on immunohistochemistry rather than a more quantitative method e.g., Western blotting, prohibiting quantitative or semi-quantitative analyses.

Conclusion

This is the first study to comprehensively investigate the mRNA expression of the genes closest to association signals seen from GWAS of lung function during human lung development with focussed analyses for 5 candidates at both the RNA and protein level. The data presented demonstrate that >50 % of these genes show some evidence of differential expression during normal human lung development. We have provided evidence that TMEM163 (a transmembrane protein) and CDC123 (a cell cycle control protein) are differentially expressed at both the mRNA and protein level during lung development. These candidates now warrant further investigation as they may play an important role in determining lung function later in adult life.

Declarations

Acknowledgements

We would like to thank The Children’s Brain Tumour Research Centre, University of Nottingham, for the use of their Olympus BX14 light microscope. Human fetal lung samples were collected for in collaboration with the Human Developmental Biology Resource (HDBR) and control tissues for immunohistochemistry were provided by the Nottingham Health Science Biobank.

Funding

This study was funded by a grant from the Medical Research Council, UK (G1000861). EM has received funding from The Swedish Research Council, The Swedish Heart-Lung Foundation and Stockholm County Council (ALF). The human embryonic and fetal material was provided by the Joint MRC/Wellcome Trust (grant # 099175/Z/12/Z), Human Developmental Biology Resource (www.hdbr.org). Funding bodies did not have roles in the design of the study, the collection, analysis and interpretation of data or the writing of the manuscript.

Availability of data and materials

Authors’ contributions

IS and IH conceived the study. SM completed the eQTL searches and IHC. EM and SKM collaborated on the fetal lung tissue Affymetrix data and analyses. SM and IS drafted the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Consent for publication

Not applicable.

Ethics approval and consent to participate

All samples collected for immunohistochemical staining were consented for in accordance with national bio banking procedures and the UK Human Tissue Act (2004). The human embryonic and fetal material from the Human Developmental Biology Resource (www.hdbr.org) was used under ethics approvals: 08/H0906/21 + 5 (NRES Committee North East - Newcastle & North Tyneside 1) and 08/H0712/34 + 5 (NRES Committee London - Fulham). Control tissues for immunohistochemistry were provided by the Nottingham Health Science Biobank (Nottingham, UK) under ethical approval (08/H0407/1).

Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.