Abstract

The enormous complexity of the human brain ultimately derives from a finite set of molecular instructions encoded in the human genome. These instructions can be directly studied by exploring the organization of the brain's transcriptome through systematic analysis of gene coexpression relationships. We analyzed gene coexpression relationships in microarray data generated from specific human brain regions and identified modules of coexpressed genes that correspond to neurons, oligodendrocytes, astrocytes and microglia. These modules provide an initial description of the transcriptional programs that distinguish the major cell classes of the human brain and indicate that cell type-specific information can be obtained from whole brain tissue without isolating homogeneous populations of cells. Other modules corresponded to additional cell types, organelles, synaptic function, gender differences and the subventricular neurogenic niche. We found that subventricular zone astrocytes, which are thought to function as neural stem cells in adults, have a distinct gene expression pattern relative to protoplasmic astrocytes. Our findings provide a new foundation for neurogenetic inquiries by revealing a robust and previously unrecognized organization to the human brain transcriptome.

Network analysis of gene expression in human cerebral cortex, caudate nucleus and cerebellum identifies distinct modules of coexpressed genes. (a–d) Dendrograms produced by average linkage hierarchical clustering of genes on the basis of topological overlap (Methods). Modules of coexpressed genes were assigned colors and numbers as indicated by the horizontal bar beneath each dendrogram. Modules from different networks with significant overlap (corrected hypergeometric P < 0.05) were assigned the same color and number, with networks denoted by letters (for example, M9A for CTX, M9B for CTX_95, M9C for CN and M9D for CB). We used 67 samples and 5,549 probe sets for CTX (a), 42 samples and 3,203 probe sets for CTX_95 (b), 27 samples and 4,050 probe sets for CN (c), and 24 samples and 4,029 probe sets for CB (d). Samples from a, c and d were analyzed on Affymetrix U133A microarrays, whereas samples from b were analyzed on Affymetrix U95A/v2 microarrays.
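
The topological overlap measure used for clustering can be illustrated with a minimal sketch. It assumes the weighted definition commonly used in coexpression network analysis, TOM_ij = (Σ_u a_iu·a_uj + a_ij) / (min(k_i, k_j) + 1 − a_ij), where a is a symmetric adjacency matrix with entries in [0, 1] and k is node connectivity. The caption defers the actual definition to the Methods, so the formula, the function name `topological_overlap` and the toy adjacency matrix are assumptions made for illustration.

```python
def topological_overlap(adj):
    """Topological overlap for a symmetric adjacency matrix with entries in [0, 1].

    Assumed definition (not taken from the caption):
    TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)
    """
    n = len(adj)
    # Connectivity of each node: sum of its adjacencies, ignoring the diagonal.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Adjacency shared through every third node u.
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            value = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
            tom[i][j] = tom[j][i] = value
    return tom


# Hypothetical three-gene chain: gene 0 and gene 2 are linked only through gene 1.
chain = [[0.0, 1.0, 0.0],
         [1.0, 0.0, 1.0],
         [0.0, 1.0, 0.0]]
tom = topological_overlap(chain)
print(tom[0][2])  # 0.5: no direct link, but one shared neighbor
```

A dissimilarity such as 1 − TOM would then be passed to an average linkage hierarchical clustering routine to produce dendrograms like those in panels a–d.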

Many gene coexpression modules are present in multiple human brain networks. Comparison of the 19 gene coexpression modules identified in CTX with modules identified in CTX_95, CN and CB. Modules with significant overlap (corrected hypergeometric P < 0.05) are depicted by horizontal bars (CTX_95, bottom; CN, middle; CB, top). For example, M1 did not show significant overlap between CTX and any modules in CTX_95, but overlapped 80% with a module found in CN (P = 4.0 × 10⁻¹⁸) and 92% with a module found in CB (P = 5.9 × 10⁻²⁴). Numbers in parentheses on the right indicate the maximum possible number of shared genes per pair of modules (that is, the denominator used to calculate percent overlap). NS, not significant.
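
The overlap statistics described above can be sketched as follows. This is an illustrative calculation, not the authors' code. It assumes that the "maximum possible number of shared genes" is the size of the smaller module and that the P value is the upper tail of a hypergeometric distribution over a common gene universe. The caption does not name the multiple-testing correction, so Bonferroni is used here purely as a stand-in. All function names and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(shared, size_a, size_b, universe):
    """P(X >= shared) when module B (size_b genes) is drawn from a universe
    of `universe` genes that contains module A (size_a genes)."""
    max_shared = min(size_a, size_b)
    total = comb(universe, size_b)
    tail = sum(comb(size_a, k) * comb(universe - size_a, size_b - k)
               for k in range(shared, max_shared + 1))
    return tail / total


def percent_overlap(shared, size_a, size_b):
    """Shared genes as a percentage of the maximum possible number shared
    (assumed here to be the size of the smaller module)."""
    return 100.0 * shared / min(size_a, size_b)


def bonferroni(p_value, n_tests):
    """Stand-in correction (assumption): multiply by the number of tests, cap at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical modules: 10 and 25 genes sharing 8, in a universe of 4,000 genes.
p = hypergeom_upper_tail(8, 10, 25, 4000)
print(percent_overlap(8, 10, 25))  # 80.0
print(bonferroni(p, 19) < 0.05)    # True
```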

Module membership identifies groups of genes that are consistently coexpressed in the human brain. Expression levels of genes with the highest average module membership for M9 (a), M15 (b), and M16 (c) (columns) across all four networks (rows). In each network, module membership was averaged for genes represented by multiple probe sets. Module membership values for all genes were then averaged across all networks (genes that were not represented on both Affymetrix U133A and U95A/v2 microarrays were excluded). Log2-transformed expression levels of the top ten genes ranked by average module membership are shown for all samples in each network (gene symbols appear in legends) (similar plots for all modules can be found in ).
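
The two-stage averaging described in this legend can be sketched as below. The code assumes module membership values have already been computed for each probe set. It treats "represented on both microarray platforms" as "present in every network supplied", which is a simplification. The function name, the input layout and the toy values are hypothetical.

```python
from statistics import mean


def rank_genes_by_average_membership(networks, top_n=10):
    """networks maps a network name to a list of (gene_symbol, module_membership)
    pairs, one pair per probe set. Returns the top_n (gene, average) pairs."""
    per_network = {}
    for name, probe_rows in networks.items():
        by_gene = {}
        for gene, membership in probe_rows:
            by_gene.setdefault(gene, []).append(membership)
        # Stage 1: average over multiple probe sets for the same gene.
        per_network[name] = {g: mean(vals) for g, vals in by_gene.items()}
    # Exclude genes missing from any network (simplifying assumption).
    common = set.intersection(*(set(d) for d in per_network.values()))
    # Stage 2: average each remaining gene across networks.
    averaged = {g: mean(d[g] for d in per_network.values()) for g in common}
    # Highest average first; ties broken alphabetically for a stable order.
    return sorted(averaged.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Toy input: gene A has two probe sets in CTX; gene C is absent from CTX.
toy = {
    "CTX": [("A", 1.0), ("A", 0.5), ("B", 0.25)],
    "CB": [("A", 0.25), ("B", 0.5), ("C", 1.0)],
}
print(rank_genes_by_average_membership(toy))  # [('A', 0.5), ('B', 0.375)]
```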

Overlap and functional characterizations reveal a meta-network of gene coexpression modules in the human brain. Summary of module characteristics and overlap. Black lines connect modules from different networks with significant overlap (corrected hypergeometric P < 0.05) that were assigned the same color and number, with the width of the line corresponding to the extent of the overlap; red lines connect modules from different networks with significant overlap that were assigned different colors and numbers ( and Methods).

M13C identifies genes that are coexpressed in the adult subventricular neurogenic niche. We carried out immunohistochemistry for proteins encoded by genes with strong membership for M13C in 7-μm sections of adult human brain from anterior subventricular zone (SVZ). (a) Hematoxylin staining showed nuclei of cells in SVZ. The four major layers in this region are demarcated by brackets: ependymal cells (I), hypocellular gap (II), SVZ astrocyte ‘ribbon’ (III) and a transitional region (IV). (b) As expected, GFAP expression was weak in I and strong in II and III. (c) Expression of TuJ1 (absent in the network, but considered to be a marker of immature neurons) was primarily restricted to III, consistent with previous reports. (d) PLTP expression was strongest in I and III, with some staining of cell bodies in II. (e) CD24 was expressed in I–III, with prominent staining of processes in II. Expression in I was consistently targeted to the basal surface of ependymal cells. (f) Double labeling for CD24 and PLTP revealed overlapping expression patterns in I and III. (g,h) DPYSL3 (g) and ASCL1 (h) were expressed in I and III, with occasional staining of cell bodies in II and IV. (i) Double labeling for CD24 and ASCL1 revealed overlapping expression patterns in I and III. (j) GJA1 was expressed in I–III. (k) Double labeling for CD24 and GJA1 revealed overlapping expression patterns in I–III. (l) Unlike GFAP and GJA1, expression of ALDH1L1 was absent from processes in II (asterisks), despite robust staining of astrocytes in brain parenchyma (inset; image taken from same section). Single images are from one of three different individuals. Scale bars represent 50 μm.