Database Description

Protein-protein interaction networks are usually assembled by merging, across space and time, the results of various experimental and prediction methods (1). Such networks often ignore localization data, which degrades both data quality and predictive potential. ComPPI is a cellular compartment-specific database of proteins and their interactions that enables extensive, compartmentalized analysis of protein-protein interaction networks (URL: http://comppi.linkgroup.hu/). ComPPI lets the user filter out biologically unlikely interactions, in which the two interacting proteins have no common subcellular localization, and predict novel properties such as compartment-specific biological functions. Because the overlap of data among databases can be very low (2), integration combined with manual curation protocols improves data quality. ComPPI compiles 9 protein-protein interaction and 8 subcellular localization datasets (http://comppi.linkgroup.hu/help/input_databases) through 4 curation steps. These include a manually built, comprehensive hierarchical structure of cellular localization data, which maps proteins to one or several of >1,600 Gene Ontology cellular component terms (3) and assigns them to distinct subcellular compartments (http://comppi.linkgroup.hu/help/subcell_locs). The proteome-wide dataset contains localizations for 5 main subcellular organelles (nucleus, mitochondrion, cytosol, secretory pathway, membrane) and the extracellular compartment. The ComPPI database includes comprehensive, integrated data for four species (S. cerevisiae, C. elegans, D. melanogaster and H. sapiens), cataloguing 125,757 proteins, their 791,059 interactions and 195,815 major subcellular localizations in its current version, 1.1 (http://comppi.linkgroup.hu/help/statistics). ComPPI provides confidence scores for protein subcellular localizations and protein-protein interactions (http://comppi.linkgroup.hu/help/scores).
ComPPI offers user-friendly search options for individual proteins, returning their subcellular localizations, their interactions and the likelihood of those interactions given the subcellular localizations of the interacting partners (http://comppi.linkgroup.hu/protein_search). Search results, whole proteomes, organelle-specific interactomes and subcellular localization data can be downloaded from the website (http://comppi.linkgroup.hu/downloads). Owing to its novel features, ComPPI is an integrative, open-source database useful for the analysis of experimental results in biochemistry and molecular biology, as well as for proteome-wide studies in bioinformatics and network science that support cellular biology, medicine and drug design.
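
The compartment-based filtering described above can be illustrated with a minimal Python sketch. It is not the ComPPI implementation: the protein identifiers, the toy data and the function names (shared_compartments, filter_interactions, compartment_interactome) are all hypothetical, and the sketch assumes that a protein with no known localization shares no compartment with any partner. It also ignores the confidence scores that ComPPI attaches to localizations and interactions.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy data: protein identifier -> set of major compartments.
localizations: Dict[str, Set[str]] = {
    "P1": {"nucleus", "cytosol"},
    "P2": {"nucleus"},
    "P3": {"mitochondrion"},
    "P4": {"extracellular"},
}

# Hypothetical interaction list (unordered protein pairs).
interactions: List[Tuple[str, str]] = [
    ("P1", "P2"),
    ("P1", "P3"),
    ("P2", "P4"),
]


def shared_compartments(a: str, b: str, loc: Dict[str, Set[str]]) -> Set[str]:
    """Return the compartments in which both proteins are localized.

    Assumption of this sketch: a protein missing from `loc` has no
    known localization and therefore shares nothing with its partner.
    """
    return loc.get(a, set()) & loc.get(b, set())


def filter_interactions(
    pairs: List[Tuple[str, str]], loc: Dict[str, Set[str]]
) -> Dict[Tuple[str, str], Set[str]]:
    """Keep only interactions whose partners share at least one compartment."""
    kept: Dict[Tuple[str, str], Set[str]] = {}
    for a, b in pairs:
        common = shared_compartments(a, b, loc)
        if common:  # an empty set marks a biologically unlikely interaction
            kept[(a, b)] = common
    return kept


def compartment_interactome(
    pairs: List[Tuple[str, str]], loc: Dict[str, Set[str]], compartment: str
) -> List[Tuple[str, str]]:
    """Return the interactions in which both partners occur in `compartment`."""
    return [
        (a, b) for a, b in pairs if compartment in shared_compartments(a, b, loc)
    ]


if __name__ == "__main__":
    print(filter_interactions(interactions, localizations))
    print(compartment_interactome(interactions, localizations, "nucleus"))
```

In this toy example only the P1-P2 pair is retained, because both proteins are assigned to the nucleus; the other two pairs have no compartment in common and would be treated as biologically unlikely.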

Acknowledgements

The authors thank the members of the LINK-Group (http://LinkGroup.hu) and the Cellular Network Biology Group (http://NetBiol.elte.hu) for their advice, and Judit Gyurkó for the design of the ComPPI website. This work was supported by research grants from the Hungarian Science Foundation [grant number: OTKA-K83314]; by a fellowship in computational biology at The Genome Analysis Centre (Norwich, UK), in partnership with the Institute of Food Research (University of Norwich, UK) and strategically supported by the Biotechnology and Biological Sciences Research Council, UK [TK]; and by a János Bolyai Scholarship of the Hungarian Academy of Sciences [TK].