On Expression Patterns and Developmental Origin of Human Brain Regions

Abstract

Anatomical substructures of the human brain have characteristic cell types, connectivity and local circuitry, which are reflected in area-specific transcriptome signatures, but the principles governing area-specific transcription and their relation to brain development are not yet fully understood. In adult rodents, areal transcriptome patterns agree with the embryonic origin of brain regions, but the processes and genes that preserve an embryonic signature in regional expression profiles have not been quantified. Furthermore, it is not clear how embryonic-origin signatures of adult-brain expression interact with changes in expression patterns during development. Here we first quantify which genes have regional expression patterns related to the developmental origin of brain regions, using genome-wide mRNA expression from post-mortem adult human brains. We find that almost all human genes (92%) exhibit an expression pattern that agrees with developmental brain-region ontology, but that this agreement changes at multiple phases during development. Agreement is particularly strong in neuron-specific genes, but also in genes that are not spatially correlated with neuron-specific or glia-specific markers. Surprisingly, agreement is also stronger in early-evolved genes. We further find that pairs of similar genes with high agreement with the developmental region ontology tend to be more strongly correlated or anti-correlated, and that the strength of spatial correlation changes more strongly in gene pairs with stronger embryonic signatures. These results suggest that transcription regulation of most genes in the adult human brain is spatially tuned in a way that changes through life, but in agreement with development-determined brain regions.

(A) Illustration of the ontology region tree showing the 16 brain structures studied. The full ontology contains 1534 regions, not shown. (B) A 3D model brain illustrating the 16 brain regions using the same colors as in A. The left cortex is not shown in order to expose the inner structures. (C) Hierarchical clustering of the 16 human brain structures. Agglomerative linking of regions by their average expression profile yields a tree structure that agrees with the ontology tree. The color above each region name matches the colors in the region ontology tree in Fig 1A. (D) The joint distribution of expression distances and ontology distances across all pairs of tissue samples, as computed for the gene NEUROD1. The two distance measures are strongly correlated (Spearman ρ = 0.65, n = 6.85M, p-value < 10^-15), showing that the spatial expression pattern agrees with the ontology.
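The comparison in panel D can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the toy ontology `PARENT`, the sample values, the use of tree path length as ontology distance, the absolute expression difference as expression distance, and the helper names (`ontology_distance`, `average_ranks`, `spearman`, `agreement_score`) are all hypothetical choices made for this example.

```python
from itertools import combinations

# Hypothetical toy ontology (child -> parent); NOT the real 1534-region tree.
PARENT = {
    "brain": None,
    "forebrain": "brain",
    "hindbrain": "brain",
    "cortex": "forebrain",
    "striatum": "forebrain",
    "cerebellum": "hindbrain",
    "pons": "hindbrain",
}

def path_to_root(region):
    """List of regions from `region` up to the root, inclusive."""
    path = []
    while region is not None:
        path.append(region)
        region = PARENT[region]
    return path

def ontology_distance(a, b):
    """Number of tree edges on the path between regions a and b (assumed metric)."""
    pa, pb = path_to_root(a), path_to_root(b)
    depth_in_b = {r: i for i, r in enumerate(pb)}
    for i, r in enumerate(pa):
        if r in depth_in_b:  # first common ancestor
            return i + depth_in_b[r]
    raise ValueError("regions do not share a root")

def average_ranks(values):
    """1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

def agreement_score(samples):
    """samples: list of (region, expression level) pairs for a single gene."""
    expr_d, onto_d = [], []
    for (r1, e1), (r2, e2) in combinations(samples, 2):
        expr_d.append(abs(e1 - e2))               # assumed expression distance
        onto_d.append(ontology_distance(r1, r2))  # assumed ontology distance
    return spearman(expr_d, onto_d)

# Made-up expression values for one hypothetical gene.
samples = [("cortex", 5.1), ("cortex", 5.3), ("striatum", 4.6),
           ("cerebellum", 1.2), ("pons", 1.9)]
score = agreement_score(samples)
print(f"agreement score: {score:.2f}")
```

A positive score here means that samples from regions far apart in the tree also tend to differ more in expression, which is the qualitative pattern the panel reports for NEUROD1.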

(A) Heatmap showing the joint distribution of BRO-agreement scores of all genes in 16 regions of ABA6-2013 (abscissa) and Kang-2011 (ordinate). Colors correspond to the density. (B) A scatter plot showing BRO-agreement scores for the two datasets in A. Each light-grey dot corresponds to a single gene (a total of 17K genes). Dark-grey dots correspond to permuted data (see ). The BRO scores are significantly correlated across the two datasets (Spearman ρ = 0.53, n = 16947, p-value < 10^-16). (C) Marginal distribution of BRO scores in the ABA6-2013 dataset. BRO scores for most genes are significantly greater than randomized scores. (D) Same as C, for the Kang-2011 data. (E) BRO-agreement scores traced through life based on the full developmental dataset in Kang-2011. Samples are aggregated based on the developmental stages defined in []. Numbers above the line denote the number of subjects in each age group.

(A) The median BRO score as a function of the evolutionary age of genes. Older genes receive on average higher BRO scores than evolutionarily recent genes. (B) Focus on the distributions of BRO scores for the oldest gene group and for the most recent (primates) gene group. Genes in the cellular organism group have a median BRO score of 0.129, while genes in the primate group have a median BRO score of 0.068. The two distributions are significantly different (Wilcoxon two-tailed test, p-value = 10^-6).
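A two-sided rank-based comparison of two score distributions, as in panel B, might look like the sketch below. It assumes the test is the Wilcoxon rank-sum test for independent groups and uses the normal approximation without a tie correction, which is a simplification. The function name `rank_sum_test` and all score values are invented for illustration and are not taken from the data.

```python
import math

def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie correction).

    Returns (z, p_value). Tied values receive the mean of their ranks.
    """
    pooled = [(v, 0) for v in a] + [(v, 1) for v in b]
    pooled.sort(key=lambda t: t[0])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        i = j + 1
    n1, n2 = len(a), len(b)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p_value

# Invented BRO-like scores for two hypothetical gene groups.
old_genes = [0.13, 0.15, 0.12, 0.14, 0.16, 0.11, 0.17, 0.125]
recent_genes = [0.06, 0.07, 0.05, 0.08, 0.065, 0.09, 0.04, 0.075]
z, p = rank_sum_test(old_genes, recent_genes)
print(f"z = {z:.2f}, p = {p:.4f}")
```

For groups as large as those in the figure, a library routine with tie handling would be the more natural choice; the hand-written version is used here only to keep the sketch dependency-free.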

(A) Distribution of gene pairs with anti-correlated spatial expression. KEGG-based gene pairs include 1496 pairs with (1) sequence similarity > 30%, and (2) a shared sub-component in one of 17 KEGG synaptic pathways (see ). Ensembl-based paralogs include 3503 pairs of paralogs (as defined by Ensembl) where both genes in a pair are included in one of the same 17 pathways (see ). Baseline corresponds to the distribution expected at random. (B) Consistency of spatial correlations across two datasets, ABA2013 and Kang-2011. The spatial correlations of paralog pairs across the two datasets show a significant agreement (Spearman ρ = 0.48, p-value < 10^-78). Each point corresponds to the median correlation across adult subjects in one gene pair (total of 1496 pairs). (C-F) Examples of the development of spatial correlations in the serotonin system. (C) The pair of genes coding for the serotonin receptors HTR2A and HTR1F exhibits a continuous rise in spatial correlation, rising from slightly negative in early embryonic development to strongly positive. (D) The pair of genes coding for the serotonin receptors HTR5A and HTR2C shows a sharp transition from positive to negative spatial correlation in early development, which is then preserved through life. (E) The paralogs CACNA1A and CACNA1D exhibit a rise in spatial correlation. (F) The paralogs HTR7 and HTR5A show a continuous change in spatial correlation, from positive correlation during embryonic development to negative correlation in adulthood.
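The developmental trajectories in panels C-F amount to computing, at each developmental stage, the correlation between two genes across brain regions. The following sketch illustrates that idea under assumptions: Pearson correlation is used because the caption does not say which coefficient was applied, and the stage names, gene labels, and expression values are all made up.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Made-up expression of two hypothetical genes across four regions per stage.
# Each list position corresponds to the same brain region.
stages = {
    "fetal":     {"gene_a": [1.0, 2.0, 3.0, 4.0], "gene_b": [4.1, 2.9, 2.2, 1.0]},
    "childhood": {"gene_a": [1.0, 2.0, 3.0, 4.0], "gene_b": [2.0, 2.6, 1.9, 2.4]},
    "adult":     {"gene_a": [1.0, 2.0, 3.0, 4.0], "gene_b": [1.2, 1.9, 3.1, 4.2]},
}

# Spatial correlation of the gene pair at each stage, in developmental order.
trajectory = {
    stage: pearson(values["gene_a"], values["gene_b"])
    for stage, values in stages.items()
}

for stage, r in trajectory.items():
    print(f"{stage:>9}: r = {r:+.2f}")
```

In this toy example the pair moves from anti-correlated to correlated across stages, mirroring the kind of sign change the caption describes for some receptor pairs.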