[{"id":483522,"pmid":20372783,"pmcid":null,"title":"Hs.137007 is a novel epigenetic marker hypermethylated and up-regulated in breast cancer.","year":2010,"pages":null,"doi":null,"keywords":[],"mesh":[],"abstractText":"This study was conducted to mine novel breast-specific unigenes and analyze their epigenetic regulation in breast cancer. Differential digital display and methylation analysis identified the Hs.137007 gene containing a Kelch domain as a candidate novel epigenetic marker. In 50 pairs of breast cancer tissues and nearby normal tissues the methylation level of the 14 CpG sites at the promoter region (-778 to -485) of the gene was higher in cancer tissues (72-93%) than in normal tissues (31-83%), with a high correlation rate (p<0.05). End-point RT-PCR and real-time RT-PCR revealed that Hs.137007 was up-regulated in cancer tissues. A clear relationship between high methylation levels and up-regulated expression was also observed in the cultured breast cell lines. The MCF7 (90-100%) and MDAMB468 (100%) cancer cell lines that showed higher methylation than the BT549 (20-90%) and 184B5 (10-100%) at the 14 CpGs also showed elevated gene expression. Taken together, these results indicate that the Hs.137007 gene is a novel gene specifically expressed in the breast that can be utilized as an epigenetic marker of breast cancer.","journal":null,"figures":[],"_authors":null},{"id":8807,"pmid":20139978,"pmcid":null,"title":"Genome-wide association study of hematological and biochemical traits in a Japanese population.","year":2010,"pages":null,"doi":null,"keywords":[],"mesh":[],"abstractText":"We report genome-wide association studies for hematological and biochemical traits from approximately 14,700 Japanese individuals. We identified 60 associations for 8 hematological traits and 29 associations for 12 biochemical traits at genome-wide significance levels (P < 5 × 10^-8). Of these, 46 associations were new to this study and 43 replicated previous reports. 
We compared these associated loci with those reported in similar GWAS in European populations. When the minor allele frequency was >10% in the Japanese population, 32 (94.1%) and 31 (91.2%) of the 34 hematological loci previously reported to be associated in a European population were replicated with P-values less than 0.05 and 0.01, respectively, and 31 (73.8%) and 27 (64.3%) of the 42 European biochemical loci were replicated.","journal":null,"figures":[],"_authors":null},{"id":5,"pmid":15489334,"pmcid":null,"title":"The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC).","year":2004,"pages":null,"doi":null,"keywords":[],"mesh":[],"abstractText":"The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. 
Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline.","journal":null,"figures":[],"_authors":null},{"id":2,"pmid":12477932,"pmcid":null,"title":"Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences.","year":2002,"pages":null,"doi":null,"keywords":[],"mesh":[],"abstractText":"The National Institutes of Health Mammalian Gene Collection (MGC) Program is a multiinstitutional effort to identify and sequence a cDNA clone containing a complete ORF for each human and mouse gene. ESTs were generated from libraries enriched for full-length cDNAs and analyzed to identify candidate full-ORF clones, which then were sequenced to high accuracy. The MGC has currently sequenced and verified the full ORF for a nonredundant set of >9,000 human and >6,000 mouse genes. Candidate full-ORF clones for an additional 7,800 human and 3,500 mouse genes also have been identified. All MGC sequences and clones are available without restriction through public databases and clone distribution networks (see http://mgc.nci.nih.gov).","journal":null,"figures":[],"_authors":null},{"id":3583,"pmid":10591208,"pmcid":null,"title":"The DNA sequence of human chromosome 22.","year":1999,"pages":null,"doi":null,"keywords":[],"mesh":[],"abstractText":"Knowledge of the complete genomic DNA sequence of an organism allows a systematic approach to defining its genetic components. The genomic sequence provides access to the complete structures of all genes, including those without known function, their control elements, and, by inference, the proteins they encode, as well as all other biologically important sequences. Furthermore, the sequence is a rich and permanent source of information for the design of further biological studies of the organism and for the study of evolution through cross-species sequence comparison. 
The power of this approach has been amply demonstrated by the determination of the sequences of a number of microbial and model organisms. The next step is to obtain the complete sequence of the entire human genome. Here we report the sequence of the euchromatic part of human chromosome 22. The sequence obtained consists of 12 contiguous segments spanning 33.4 megabases, contains at least 545 genes and 134 pseudogenes, and provides the first view of the complex chromosomal landscapes that will be found in the rest of the genome.","journal":null,"figures":[],"_authors":null},{"id":12,"pmid":8889548,"pmcid":null,"title":"Normalization and subtraction: two approaches to facilitate gene discovery.","year":1996,"pages":null,"doi":null,"keywords":[],"mesh":[],"abstractText":"Large-scale sequencing of cDNAs randomly picked from libraries has proven to be a very powerful approach to discover (putatively) expressed sequences that, in turn, once mapped, may greatly expedite the process involved in the identification and cloning of human disease genes. However, the integrity of the data and the pace at which novel sequences can be identified depends to a great extent on the cDNA libraries that are used. Because altogether, in a typical cell, the mRNAs of the prevalent and intermediate frequency classes comprise as much as 50-65% of the total mRNA mass, but represent no more than 1000-2000 different mRNAs, redundant identification of mRNAs of these two frequency classes is destined to become overwhelming relatively early in any such random gene discovery programs, thus seriously compromising their cost-effectiveness. With the goal of facilitating such efforts, previously we developed a method to construct directionally cloned normalized cDNA libraries and applied it to generate infant brain (INIB) and fetal liver/spleen (INFLS) libraries, from which a total of 45,192 and 86,088 expressed sequence tags, respectively, have been derived. 
While improving the representation of the longest cDNAs in our libraries, we developed three additional methods to normalize cDNA libraries and generated over 35 libraries, most of which have been contributed to our Integrated Molecular Analysis of Genomes and Their Expression (IMAGE) Consortium and thus distributed widely and used for sequencing and mapping. In an attempt to facilitate the process of gene discovery further, we have also developed a subtractive hybridization approach designed specifically to eliminate (or reduce significantly the representation of) large pools of arrayed and (mostly) sequenced clones from normalized libraries yet to be (or just partly) surveyed. Here we present a detailed description and a comparative analysis of four methods that we developed and used to generate normalized cDNA libraries from human (15), mouse (3), rat (2), as well as the parasite Schistosoma mansoni (1). In addition, we describe the construction and preliminary characterization of a subtracted liver/spleen library (INFLS-SI) that resulted from the elimination (or reduction of representation) of ~5000 INFLS-IMAGE clones from the INFLS library.","journal":null,"figures":[],"_authors":null},null,null,null,null]