Abstract

Recent genome-wide association studies have identified many promising schizophrenia candidate genes and demonstrated that common polygenic variation contributes to schizophrenia risk. However, it remains unknown whether these genes converge on a common but limited set of underlying molecular processes (pathways or networks) that modulate risk of schizophrenia or instead act through distinct pathways. In addition, the theoretical and genetic mechanisms underlying the strong genetic heterogeneity of schizophrenia remain largely unknown. Using 4 well-defined data sets that contain top schizophrenia susceptibility genes and applying protein-protein interaction (PPI) network analysis, we investigated the interactions among the proteins encoded by these genes. We found that the proteins encoded by top schizophrenia susceptibility genes formed a densely interconnected network and that, compared with random networks, these PPI networks showed significantly greater direct and indirect connectivity. We further validated these results using empirical functional data (transcriptome data from a clinical sample). These findings indicate that top schizophrenia susceptibility genes encode proteins that interact directly more often than expected by chance and form a densely interconnected network, suggesting perturbations of common underlying molecular processes or pathways that modulate risk of schizophrenia. Our finding that schizophrenia susceptibility genes encode a highly interconnected protein network may also provide a novel explanation for the observed genetic heterogeneity of schizophrenia, ie, mutation in any member of this molecular network would lead to the same functional consequences that eventually contribute to risk of schizophrenia.

Proteins encoded by genes that were defined by the top 81 single-nucleotide polymorphisms (SNPs) from the Schizophrenia Psychiatric Genome-Wide Association Study Consortium (PGC) form a highly significant interconnected network. The protein-protein interaction (PPI) network was constructed from genes defined by the top 81 SNPs from the Schizophrenia PGC. In total, 104 disease proteins participated in the direct network, with 343 direct interactions. This degree of interconnectivity is statistically highly significant (P = 9.9 × 10⁻⁴, corrected) compared with 10 000 random networks, in which only 206 direct edges are expected by chance. The core of this highly interconnected network is composed of genes that are involved in nucleosome assembly (pink circle), suggesting an enrichment of nucleosome assembly genes in schizophrenia susceptibility loci. KEGG pathway analysis of the genes that participate in the direct network is shown in the red box. P values were corrected by the Benjamini-Hochberg procedure in DAVID.
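For readers unfamiliar with the Benjamini-Hochberg correction mentioned above, the following is a minimal sketch of the textbook step-up procedure. It is not DAVID's implementation, and the function name `benjamini_hochberg` is hypothetical, chosen only for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values, in the input order.

    Illustrative sketch of the standard step-up procedure; not taken
    from DAVID or from the original analysis.
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Each raw P value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw values increase.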

Protein products encoded by genome-wide significant schizophrenia susceptibility genes interact significantly. (A) Protein-protein interaction (PPI) network constructed with genes that were significantly associated with schizophrenia in recent genome-wide association studies (GWAS) of schizophrenia. (B) The network is significant (P = .0001) compared with 10 000 random networks, suggesting significant physical interactions between the protein products of top schizophrenia susceptibility genes. Structurally equivalent random networks were built using a within-degree node-label permutation method. An empirical distribution was constructed for the direct connectivity count and used to assess the significance of the networks. Numbers on the x-axis represent the direct network connectivity (the number of edges in the direct network), which was enumerated for the disease network and for the 10 000 random networks. The plotted histogram represents the random networks (the dashed arrowhead marks the random expectation), and the solid arrowhead indicates the observed schizophrenia network. The y-axis represents the percentage of permuted networks (eg, the value for networks with 5 edges in the direct network is 0.0001, which is statistically significant relative to the 10 000 random networks).
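The permutation test described in this legend can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the background PPI network is assumed to be an undirected graph without self-loops, stored as a dictionary mapping each protein to the set of its interaction partners, and the function names (`count_direct_edges`, `permute_within_degree`, `empirical_p`) are hypothetical. The add-one correction in the empirical P value is a common convention that the source does not state.

```python
import random
from collections import defaultdict


def count_direct_edges(adjacency, proteins):
    """Count background-network edges whose two endpoints are both in `proteins`."""
    proteins = set(proteins)
    # Each undirected edge is seen from both endpoints, so halve the total.
    return sum(len(adjacency[p] & proteins) for p in proteins) // 2


def permute_within_degree(adjacency, proteins, rng):
    """Replace each protein with a randomly drawn node of the same degree."""
    # Group all nodes of the background network by degree.
    bins = defaultdict(list)
    for node, neighbors in adjacency.items():
        bins[len(neighbors)].append(node)
    # Count how many disease proteins fall in each degree bin.
    wanted = defaultdict(int)
    for p in proteins:
        wanted[len(adjacency[p])] += 1
    # Draw the same number of nodes from each bin, without replacement.
    sampled = []
    for degree, k in wanted.items():
        sampled.extend(rng.sample(bins[degree], k))
    return sampled


def empirical_p(adjacency, proteins, n_perm=10_000, seed=0):
    """Return the observed direct edge count and its empirical P value."""
    rng = random.Random(seed)
    observed = count_direct_edges(adjacency, proteins)
    # Count permuted protein sets that are at least as connected as the observed set.
    hits = sum(
        count_direct_edges(adjacency, permute_within_degree(adjacency, proteins, rng))
        >= observed
        for _ in range(n_perm)
    )
    # Add-one correction (an assumption here) keeps the estimate above zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Under this convention, the smallest P value attainable with 10 000 permutations is 1/10 001, or about .0001, which is reached when no random network has as many direct edges as the observed network.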