Fagerberg, Linn

Abstract [en]

The fundamental goal of proteomics is to gain an understanding of the expression and function of the proteome on the level of individual proteins, on the level of defined cell types and on the level of the entire organism. In this thesis, the human proteome is explored using membrane protein topology prediction methods to define the human membrane proteome and by global protein expression profiling, which relies on a complex study of the location and expression levels of proteins in tissues and cells.

A whole-proteome analysis was performed based on the predicted protein-coding genes of humans using a selection of membrane protein topology prediction methods. The study used a majority decision-based method, which estimated that approximately 26% of the human genes encode a membrane protein. The prediction results are displayed in a visualization tool to facilitate the selection of antigens to be used for antibody generation.

Global protein expression profiles in a large number of cells and tissues in the human body were analyzed for more than 4000 protein targets, based on data from the antibody-based immunohistochemistry and immunofluorescence methods within the framework of the Human Protein Atlas project. The results revealed few cell-type specific proteins and a high fraction of human proteins expressed in most cells, suggesting that cell and tissue specificity is attained by a fine-tuned regulation of protein levels. The expression profiles were also used to analyze the relationship between 45 cell lines by hierarchical clustering and principal component analysis. The global protein expression patterns overall reflected the tumor origin of the cells, and also allowed for identification of proteins of importance for distinguishing different categories of cell lines, as defined by the phenotype of the progenitor cell. In addition, the protein distribution in 16 subcellular compartments in three of the human cell lines was mapped. A large fraction of proteins were localized in two or more compartments and, in line with previous results, a majority of proteins were detected in all three cell lines.

Finally, mass spectrometry-based protein expression levels were compared to RNA-seq-based transcript expression levels in three cell lines. Highly ubiquitous mRNA expression was found and the changes of expression levels between the cell lines showed high correlations between proteins and transcripts. Large general differences in abundance of proteins from various functional classes were observed. A comparison between categories based on expression levels revealed that, in general, genes with varying expression levels between the cell lines or only expressed in one cell line were highly enriched for cell-surface proteins.

These studies show a path for a systematic analysis to characterize the proteome in human cells, tissues and organs.

Berglund, Lisa

Abstract [en]

Membrane proteins are key molecules in the cell, and are important targets for pharmaceutical drugs. Few three-dimensional structures of membrane proteins have been obtained, which makes computational prediction of membrane proteins crucial for studies of these key molecules. Here, seven membrane protein topology prediction methods based on different underlying algorithms, such as hidden Markov models, neural networks and support vector machines, have been used for analysis of the protein sequences from the 21 416 annotated genes in the human genome. The number of genes coding for a protein with predicted α-helical transmembrane region(s) ranged from 5508 to 7651, depending on the method used. Based on a majority decision method, we estimate 5539 human genes to code for membrane proteins, corresponding to approximately 26% of the human protein-coding genes. The largest fraction of these proteins has only one predicted transmembrane region, but there are also many proteins with seven predicted transmembrane regions, including the G-protein coupled receptors. A visualization tool displaying the topologies suggested by the eight prediction methods, for all predicted membrane proteins, is available on the public Human Protein Atlas portal (www.proteinatlas.org).
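The majority decision step can be illustrated with a minimal sketch. The abstract does not specify the exact voting rule or how ties are handled, so the strict-majority threshold below is an assumption, and the gene identifiers and votes are hypothetical toy values rather than real predictions.

```python
from typing import Dict, List


def majority_membrane_call(predictions: Dict[str, List[bool]]) -> Dict[str, bool]:
    """Call a gene 'membrane' when more than half of the methods agree.

    `predictions` maps a gene identifier to one boolean per prediction
    method (True = at least one transmembrane region predicted).
    Assumption: strict majority; the source does not state the exact rule.
    """
    return {gene: sum(votes) > len(votes) / 2 for gene, votes in predictions.items()}


# Hypothetical votes from seven methods for three made-up genes.
example = {
    "GENE_A": [True, True, True, True, False, True, True],
    "GENE_B": [False, False, True, False, False, False, True],
    "GENE_C": [True, True, True, True, False, False, False],
}
calls = majority_membrane_call(example)
membrane_fraction = sum(calls.values()) / len(calls)

# The reported estimate as a percentage: 5539 of 21 416 genes.
reported_percent = 100 * 5539 / 21416
```

With the figures given in the abstract, `reported_percent` evaluates to roughly 25.9, which matches the stated "approximately 26%".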

Wester, Kenneth

Uhlén, Mathias

Abstract [en]

Defining the protein profiles of tissues and organs is critical to understanding the unique characteristics of the various cell types in the human body. In this study, we report on an anatomically comprehensive analysis of 4842 protein profiles in 48 human tissues and 45 human cell lines. A detailed analysis of over 2 million manually annotated, high-resolution, immunohistochemistry-based images showed a high fraction (>65%) of expressed proteins in most cells and tissues, with very few proteins (<2%) detected in any single cell type. Similarly, confocal microscopy in three human cell lines detected expression of more than 70% of the analyzed proteins. Despite this ubiquitous expression, hierarchical clustering analysis, based on global protein expression patterns, shows that the analyzed cells can still be subdivided into groups according to the current concepts of histology and cellular differentiation. This study suggests that tissue specificity is achieved by precise regulation of protein levels in space and time, and that different tissues in the body acquire their unique characteristics by controlling not which proteins are expressed but how much of each is produced. Molecular Systems Biology 5: 337; published online 22 December 2009; doi:10.1038/msb.2009.93

Fagerberg, Linn

Strömberg, Sara

El-Obeid, Adila

Gry, Marcus

Nilsson, Kenneth

Uhlén, Mathias

KTH, School of Biotechnology (BIO), Proteomics.

Ponten, Fredrik

Asplund, Anna

(English) Manuscript (preprint) (Other academic)

Abstract [en]

Human cancer cell lines grown in vitro are frequently used to decipher basic cell biological phenomena but also to specifically study different forms of cancer. Here we present the first large-scale study of protein expression patterns in cell lines using an antibody-based proteomics approach. We analyzed the expression pattern of 5436 proteins in 45 different cell lines using hierarchical clustering, principal component analysis and two-group comparisons for the identification of differentially expressed proteins. The results show that protein profiles of cell lines, as determined using immunohistochemistry, allow for a hierarchical clustering that overall reflects tumor tissues of origin. Hematological cell lines appear to retain their protein profiles to a higher degree than cell lines established from solid tumors, resulting in a clustering that closely reflects progenitor cell types. The discrepancy may reflect different levels of in vitro induced alterations in adherent and suspension-grown cell lines, respectively. In addition, multiple myeloma cells and cells of myeloid origin were found to share a protein profile, relative to the protein profiles of lymphoid leukemia and lymphoma cells, possibly reflecting their common dependence on the bone marrow microenvironment.
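To make the clustering step concrete, the following is a minimal sketch of agglomerative hierarchical clustering on expression profiles. The abstract does not state the distance metric or linkage used, so Euclidean distance with average linkage is an assumption here, and the cell line names and staining scores are hypothetical toy values.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Cluster = Tuple[str, ...]


def euclidean(a: List[float], b: List[float]) -> float:
    """Euclidean distance between two equally long expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_linkage(profiles: Dict[str, List[float]]) -> List[Tuple[Cluster, Cluster, float]]:
    """Repeatedly merge the two closest clusters; return the merge history.

    Assumption: average linkage on Euclidean distance (not stated in the source).
    """
    clusters: List[Cluster] = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Mean pairwise distance between the members of the two clusters.
            d = sum(euclidean(profiles[i], profiles[j]) for i in a for j in b)
            d /= len(a) * len(b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges


# Hypothetical staining scores (0 = negative, 3 = strong) for five proteins.
toy_profiles = {
    "lymphoid_1": [3, 3, 0, 0, 1],
    "lymphoid_2": [3, 2, 0, 0, 1],
    "epithelial_1": [0, 0, 3, 3, 1],
    "epithelial_2": [0, 1, 3, 2, 1],
}
merges = average_linkage(toy_profiles)
```

In this toy input the two lymphoid profiles merge first and the two epithelial profiles second, which mirrors the idea that cell lines sharing an origin group together; it says nothing about the actual data in the study.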

Abstract [en]

The subcellular locations of proteins are closely related to their function and constitute an essential aspect for understanding the complex machinery of living cells. A systematic effort has been initiated to map the protein distribution in three functionally different cell lines with the aim of providing a subcellular localization index for at least one representative protein from all human protein-encoding genes. Here, we present the results of over 4,000 proteins mapped to 16 subcellular compartments. The results indicate a ubiquitous protein expression with a majority of the proteins found in all three cell lines and a large portion localized to two or more compartments. The inter-relationships between the subcellular compartments are visualized in a protein-compartment network based on all detected proteins. Hierarchical clustering was performed to determine how closely related the organelles are in terms of protein constituents and to compare the proteins detected in each cell type. Our results show distinct organelle proteomes, well conserved across the cell types, and demonstrate that biochemically similar organelles are grouped together.

Abstract [en]

An essential question in human biology is how cells and tissues differ in gene and protein expression and how these differences delineate specific biological function. Here, we have performed a global analysis of both mRNA and protein levels based on sequence-based transcriptome analysis (RNA-seq), SILAC-based mass spectrometry analysis and antibody-based confocal microscopy. The study was performed in three functionally different human cell lines and, based on the global analysis, we estimated the fractions of mRNA and protein that are cell specific or expressed at similar/different levels in the cell lines. A highly ubiquitous RNA expression was found, with >60% of the gene products detected in all cells. The changes of mRNA and protein levels in the cell lines using SILAC and RNA ratios show high correlations, even though the genome-wide dynamic range is substantially higher for the proteins as compared with the transcripts. Large general differences in abundance for proteins from various functional classes are observed and, in general, the cell-type specific proteins are of low abundance and highly enriched for cell-surface proteins. Thus, this study shows a path to characterize the transcriptome and proteome in human cells from different origins.
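The comparison of expression changes between two cell lines can be sketched as a correlation of log2 ratios. The abstract does not say which correlation measure was used, so the Pearson coefficient is an assumption, and all abundance values below are hypothetical toy numbers for four made-up genes.

```python
import math
from typing import List


def log2_ratios(line_a: List[float], line_b: List[float]) -> List[float]:
    """Per-gene log2 ratio of abundance in cell line A over cell line B."""
    return [math.log2(a / b) for a, b in zip(line_a, line_b)]


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical transcript and protein abundances in two cell lines.
rna_a, rna_b = [2.0, 1.0, 4.0, 1.0], [1.0, 1.0, 1.0, 2.0]
protein_a, protein_b = [4.0, 2.0, 16.0, 1.0], [1.0, 1.0, 1.0, 4.0]

rna_change = log2_ratios(rna_a, rna_b)
protein_change = log2_ratios(protein_a, protein_b)
r = pearson(rna_change, protein_change)
```

Working on log ratios rather than raw abundances treats up- and down-regulation symmetrically, which is the usual reason for this choice when comparing fold changes across platforms.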