Abstract

Please see related commentary: http://www.biomedcentral.com/1741-7015/10/21/abstract

Age-related macular degeneration (AMD) is a leading cause of blindness that affects the central region of the retinal pigmented epithelium (RPE), choroid, and neural retina. Initially characterized by an accumulation of sub-RPE deposits, AMD leads to progressive retinal degeneration, and in advanced cases, irreversible vision loss. Although genetic analysis, animal models, and cell culture systems have yielded important insights into AMD, the molecular pathways underlying AMD's onset and progression remain poorly delineated. We sought to better understand the molecular underpinnings of this devastating disease by performing the first comparative transcriptome analysis of AMD and normal human donor eyes.

RPE-choroid and retina tissue samples were obtained from a common cohort of 31 normal, 26 AMD, and 11 potential pre-AMD human donor eyes. Transcriptome profiles were generated for macular and extramacular regions, and statistical and bioinformatic methods were employed to identify disease-associated gene signatures and functionally enriched protein association networks. Selected genes of high significance were validated using an independent donor cohort.

We identified over 50 annotated genes enriched in cell-mediated immune responses that are globally over-expressed in RPE-choroid AMD phenotypes. Using a machine learning model and a second donor cohort, we show that the top 20 global genes are predictive of AMD clinical diagnosis. We also discovered functionally enriched gene sets in the RPE-choroid that delineate the advanced AMD phenotypes, neovascular AMD and geographic atrophy. Moreover, we identified a graded increase of transcript levels in the retina related to wound response, complement cascade, and neurogenesis that strongly correlates with decreased levels of phototransduction transcripts and increased AMD severity. Based on our findings, we assembled protein-protein interactomes that highlight functional networks likely to be involved in AMD pathogenesis.

We discovered new global biomarkers and gene expression signatures of AMD. These results are consistent with a model whereby cell-based inflammatory responses represent a central feature of AMD etiology, and depending on genetics, environment, or stochastic factors, may give rise to the advanced AMD phenotypes characterized by angiogenesis and/or cell death. Genes regulating these immunological activities, along with numerous other genes identified here, represent promising new targets for AMD-directed therapeutics and diagnostics.

Abstract

Pluripotent stem cells derived from both embryonic and reprogrammed somatic cells have significant potential for human regenerative medicine. Despite similarities in developmental potential, however, several groups have found fundamental differences between embryonic stem cell (ESC) and induced-pluripotent stem cell (iPSC) lines that may have important implications for iPSC-based medical therapies. Using an unsupervised clustering algorithm, we further studied the genetic homogeneity of iPSC and ESC lines by reanalyzing microarray gene expression data from seven different laboratories. Unexpectedly, this analysis revealed a strong correlation between gene expression signatures and specific laboratories in both ESC and iPSC lines. Nearly one-third of the genes with lab-specific expression signatures are also differentially expressed between ESCs and iPSCs. These data are consistent with the hypothesis that in vitro microenvironmental context differentially impacts the gene expression signatures of both iPSCs and ESCs.
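
The kind of check described above, whether samples group by laboratory rather than by cell type, can be sketched with a plain agglomerative clustering. This is only an illustration: it uses average linkage on a 1 - Pearson correlation distance with a fixed cluster count, not the unsupervised algorithm used in the study, and the sample names and expression values are made up.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_linkage(profiles, k):
    """Merge the two closest clusters (mean 1 - r distance) until k remain."""
    clusters = [[name] for name in profiles]

    def dist(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(1 - pearson(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > k:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

# Hypothetical toy profiles (made-up values): two labs, two cell types.
profiles = {
    "labA_ESC":  [5.0, 1.0, 4.8, 1.2, 3.0],
    "labA_iPSC": [5.1, 1.1, 4.6, 1.0, 2.0],
    "labB_ESC":  [1.0, 5.0, 1.2, 4.9, 3.0],
    "labB_iPSC": [1.2, 5.2, 1.0, 4.7, 2.0],
}
clusters = average_linkage(profiles, k=2)
print(clusters)
```

In this toy input the two clusters separate by lab prefix, not by ESC versus iPSC, which is the pattern a lab-specific signature would produce.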

Abstract

The mechanism and magnitude by which the mammalian kidney generates and maintains its proximal tubules, distal tubules, and collecting ducts remain controversial. Here, we use long-term in vivo genetic lineage tracing and clonal analysis of individual cells from kidneys undergoing development, maintenance, and regeneration. We show that the adult mammalian kidney undergoes continuous tubulogenesis via expansions of fate-restricted clones. Kidneys recovering from damage undergo tubulogenesis through expansions of clones with segment-specific borders, and renal spheres developing in vitro from individual cells maintain distinct, segment-specific fates. Analysis of mice derived by transfer of color-marked embryonic stem cells (ESCs) into uncolored blastocysts demonstrates that nephrons are polyclonal, developing from expansions of singly fated clones. Finally, we show that adult renal clones are derived from Wnt-responsive precursors, and their tracing in vivo generates tubules that are segment specific. Collectively, these analyses demonstrate that fate-restricted precursors functioning as unipotent progenitors continuously maintain and self-preserve the mouse kidney throughout life.

Abstract

Stem cells have the unique properties of differentiation and self-renewal and play critical roles in normal development, tissue repair, and disease. To promote systems-wide analysis of cells and tissues, we developed AutoSOME, a machine-learning method for identifying coordinated gene expression patterns and correlated cellular phenotypes in whole-transcriptome data, without prior knowledge of cluster number or structure. Here, we present a facile primer demonstrating the use of AutoSOME for identification and characterization of stem cell gene expression signatures and for visualization of transcriptome networks using Cytoscape. This protocol should serve as a general foundation for gene expression cluster analysis of stem cells, with applications for studying pluripotency, multi-lineage potential, and neoplastic disease.
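
As a rough illustration of preparing a transcriptome network for Cytoscape, the sketch below turns pairwise co-expression into lines of Cytoscape's simple interaction format (SIF: source node, interaction type, target node). It is not AutoSOME's own export; the gene identifiers, the expression values, the `coexpr` interaction label, and the 0.9 correlation cutoff are all assumptions chosen for the example.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coexpression_sif(expr, min_r=0.9):
    """Return one tab-delimited SIF line per gene pair with r >= min_r."""
    lines = []
    for a, b in combinations(sorted(expr), 2):
        if pearson(expr[a], expr[b]) >= min_r:
            lines.append(f"{a}\tcoexpr\t{b}")
    return lines

# Hypothetical expression values across four samples (made up).
expr = {
    "g1": [9.1, 8.7, 2.0, 1.5],
    "g2": [8.8, 8.9, 1.8, 1.2],
    "g3": [1.0, 1.4, 7.9, 8.3],
}
sif_lines = coexpression_sif(expr)
print("\n".join(sif_lines))
```

The printed lines can be saved to a file with a `.sif` extension and imported into Cytoscape as a network.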

Abstract

Botryllus schlosseri is a colonial urochordate that follows the chordate plan of development after sexual reproduction, but invokes a stem cell-mediated budding program during subsequent rounds of asexual reproduction. As urochordates are considered to be the closest living invertebrate relatives of vertebrates, they are ideal subjects for whole genome sequence analyses. Using a novel method for high-throughput sequencing of eukaryotic genomes, we sequenced and assembled 580 Mbp of the B. schlosseri genome. The genome assembly comprises nearly 14,000 intron-containing predicted genes and 13,500 intron-less predicted genes, 40% of which could be confidently parceled into 13 (of 16 haploid) chromosomes. A comparison of homologous genes between B. schlosseri and other diverse taxonomic groups revealed genomic events underlying the evolution of vertebrates and lymphoid-mediated immunity. The B. schlosseri genome is a community resource for studying alternative modes of reproduction, natural transplantation reactions, and stem cell-mediated regeneration.

DOI: http://dx.doi.org/10.7554/eLife.00569.001

Abstract

Signaling via protein lysine methylation has been proposed to play a central role in the regulation of many physiologic and pathologic programs. In contrast to other post-translational modifications such as phosphorylation, proteome-wide approaches to investigate lysine methylation networks do not exist.

In the current study, we used the ProtoArray® platform, containing over 9,500 human proteins, and developed and optimized a system for proteome-wide identification of novel methylation events catalyzed by the protein lysine methyltransferase (PKMT) SETD6. This enzyme had previously been shown to methylate the transcription factor RelA, but it was not known whether SETD6 had other substrates. By using two independent detection approaches, we identified novel candidate substrates for SETD6, and verified that all targets tested in vitro and in cells were genuine substrates.

We describe a novel proteome-wide methodology for the identification of new PKMT substrates. This technological advance may lead to a better understanding of the enzymatic activity and substrate specificity of the large number (more than 50) of PKMTs present in the human proteome, most of which are uncharacterized.

Abstract

Clustering the information content of large high-dimensional gene expression datasets has widespread application in "omics" biology. Unfortunately, the underlying structure of these natural datasets is often fuzzy, and the computational identification of data clusters generally requires knowledge about cluster number and geometry.

We integrated strategies from machine learning, cartography, and graph theory into a new informatics method for automatically clustering self-organizing map ensembles of high-dimensional data. Our new method, called AutoSOME, readily identifies discrete and fuzzy data clusters without prior knowledge of cluster number or structure in diverse datasets including whole genome microarray data. Visualization of AutoSOME output using network diagrams and differential heat maps reveals unexpected variation among well-characterized cancer cell lines. Co-expression analysis of data from human embryonic and induced pluripotent stem cells using AutoSOME identifies >3400 up-regulated genes associated with pluripotency, and indicates that a recently identified protein-protein interaction network characterizing pluripotency was underestimated by a factor of four.

By effectively extracting important information from high-dimensional microarray data without prior knowledge or the need for data filtration, AutoSOME can yield systems-level insights from whole genome microarray expression studies. Due to its generality, this new method should also have practical utility for a variety of data-intensive applications, including the results of deep sequencing experiments. AutoSOME is available for download at http://jimcooperlab.mcdb.ucsb.edu/autosome.
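
One way to picture how an ensemble of clustering runs can yield both discrete and fuzzy clusters is a co-association (consensus) analysis. The sketch below is a generic illustration of that idea and should not be read as AutoSOME's actual procedure: the run labels are invented, and the 0.5 threshold and all function names are assumptions. Items that share a cluster in most runs are linked, connected components become consensus clusters, and the co-clustering fraction serves as a rough membership confidence.

```python
from itertools import combinations

def co_association(runs):
    """Fraction of runs in which each pair of items shares a cluster label."""
    items = sorted(runs[0])
    freq = {}
    for a, b in combinations(items, 2):
        together = sum(1 for labels in runs if labels[a] == labels[b])
        freq[(a, b)] = together / len(runs)
    return items, freq

def consensus_clusters(runs, threshold=0.5):
    """Connected components of the graph linking pairs with freq > threshold."""
    items, freq = co_association(runs)
    parent = {item: item for item in items}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), f in freq.items():
        if f > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for item in items:
        groups.setdefault(find(item), []).append(item)
    return sorted(groups.values())

# Hypothetical labels from three independent clustering runs (made up).
runs = [
    {"g1": 0, "g2": 0, "g3": 1, "g4": 1, "g5": 1},
    {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1},
    {"g1": 1, "g2": 1, "g3": 0, "g4": 0, "g5": 0},
]
items, freq = co_association(runs)
clusters = consensus_clusters(runs)
print(clusters)
print(freq[("g3", "g4")])
```

Here `g3` joins the second cluster with lower support than `g4` and `g5` give each other, which is the sense in which its membership is fuzzy. Cluster label numbers need not agree between runs, since only co-membership is compared.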

Abstract

Biological sequence repeats arranged in tandem patterns are widespread in DNA and proteins. While many software tools have been designed to detect DNA tandem repeats (TRs), useful algorithms for identifying protein TRs with varied levels of degeneracy are still needed.

To address limitations of current repeat identification methods, and to provide an efficient and flexible algorithm for the detection and analysis of TRs in protein sequences, we designed and implemented a new computational method called XSTREAM. Running time tests confirm the practicality of XSTREAM for analyses of multi-genome datasets. Each of the key capabilities of XSTREAM (e.g., merging, nesting, long-period detection, and TR architecture modeling) is demonstrated using anecdotal examples, and the utility of XSTREAM for identifying TR proteins was validated using data from a recently published paper.

We show that XSTREAM is a practical and valuable tool for TR detection in protein and nucleotide sequences at the multi-genome scale, and an effective tool for modeling TR domains with diverse architectures and varied levels of degeneracy. Because of these useful features, XSTREAM has significant potential for the discovery of naturally evolved modular proteins with applications for engineering novel biostructural and biomimetic materials, and identifying new vaccine and diagnostic targets.
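
To make the notion of a tandem repeat concrete, the sketch below finds exact (non-degenerate) tandem repeats by brute force. It is a much simpler technique than XSTREAM and does not reproduce its algorithm: there is no handling of degeneracy, merging, or nesting. The function names, parameters, and the example sequence are hypothetical.

```python
def is_primitive(unit):
    """True if unit is not itself a repetition of a shorter string."""
    return unit not in (unit + unit)[1:-1]

def find_exact_tandem_repeats(seq, min_period=2, max_period=10, min_copies=2):
    """Return (start, period, copies, unit) for exact tandem repeats in seq."""
    hits = []
    for period in range(min_period, max_period + 1):
        i = 0
        while i + min_copies * period <= len(seq):
            unit = seq[i:i + period]
            copies = 1
            # Count how many times the unit recurs back to back.
            while seq[i + copies * period:i + (copies + 1) * period] == unit:
                copies += 1
            if copies >= min_copies and is_primitive(unit):
                hits.append((i, period, copies, unit))
                i += copies * period  # skip past the repeat just reported
            else:
                i += 1
    return hits

# Made-up protein-like sequence containing three copies of "GS".
hits = find_exact_tandem_repeats("MKPAGSGSGSTTLQ")
print(hits)
```

The primitive-unit check keeps a run such as `TT` from being reported again as a period-2 repeat when only periods of 2 or more are searched. Tolerating substitutions or indels between copies, as a degenerate-repeat detector must, would require replacing the exact string comparison with a scored alignment.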