[{"id":184,"pmid":20237496,"pmcid":null,"title":"New genetic associations detected in a host response study to hepatitis B vaccine.","year":2010,"pages":null,"doi":null,"keywords":[],"mesh":[],"abstractText":"The immune response to hepatitis B vaccination differs greatly among individuals, with 5-10% of healthy people failing to produce protective levels of antibodies. Several factors have been implicated in determining this response, chiefly individual genetic variation and age. Aiming to identify genes involved in the response to hepatitis B vaccination, a two-stage investigation of 6091 single-nucleotide polymorphisms (SNPs) in 914 immune genes was performed in an Indonesian cohort of 981 individuals showing normal levels of anti-HBs versus 665 individuals displaying undetectable levels of anti-HBs 18 months after initial dose of the vaccine. Of 275 SNPs identified in the first stage (476 normal/372 nonresponders) with P<0.05, significant associations were replicated for 25 polymorphisms in 15 genes (503 normal/295 nonresponders). We validated previous findings (HLA-DRA, rs5000563, P-value combined=5.57 x 10(-10); OR (95%CI)=0.61 (0.52-0.71)). In addition, we detected a new association outside of the human leukocyte antigen loci region that passed correction for multiple testing. 
This SNP is in the 3' downstream region of FOXP1, a transcription factor involved in B-cell development (P-value combined=9.2 x 10(-6); OR (95%CI)=1.38 (1.2-1.6)). These findings might help to understand the biological reasons behind vaccine failure and other aspects of variation in the immune responses of healthy individuals.","journal":null,"figures":[],"_authors":null},{"id":764,"pmid":16710414,"pmcid":null,"title":"The DNA sequence and biological annotation of human chromosome 1.","year":2006,"pages":null,"doi":null,"keywords":[],"mesh":[],"abstractText":"The reference sequence for each human chromosome provides the framework for understanding genome function, variation and evolution. Here we report the finished sequence and biological annotation of human chromosome 1. Chromosome 1 is gene-dense, with 3,141 genes and 991 pseudogenes, and many coding sequences overlap. Rearrangements and mutations of chromosome 1 are prevalent in cancer and many other diseases. Patterns of sequence variation reveal signals of recent selection in specific genes that may contribute to human fitness, and also in regions where no function is evident. Fine-scale recombination occurs in hotspots of varying intensity along the sequence, and is enriched near genes. 
These and other studies of human biology and disease encoded within chromosome 1 are made possible with the highly accurate annotated sequence, as part of the completed set of chromosome sequences that comprise the reference human genome.","journal":null,"figures":[],"_authors":null},{"id":5,"pmid":16344560,"pmcid":null,"title":"Diversification of transcriptional modulation: large-scale identification and characterization of putative alternative promoters of human genes.","year":2006,"pages":null,"doi":null,"keywords":[],"mesh":[],"abstractText":"By analyzing 1,780,295 5'-end sequences of human full-length cDNAs derived from 164 kinds of oligo-cap cDNA libraries, we identified 269,774 independent positions of transcriptional start sites (TSSs) for 14,628 human RefSeq genes. These TSSs were clustered into 30,964 clusters that were separated from each other by more than 500 bp and thus are very likely to constitute mutually distinct alternative promoters. To our surprise, at least 7674 (52%) human RefSeq genes were subject to regulation by putative alternative promoters (PAPs). On average, there were 3.1 PAPs per gene, with the composition of one CpG-island-containing promoter per 2.6 CpG-less promoters. In 17% of the PAP-containing loci, tissue-specific use of the PAPs was observed. The richest tissue sources of the tissue-specific PAPs were testis and brain. It was also intriguing that the PAP-containing promoters were enriched in the genes encoding signal transduction-related proteins and were rarer in the genes encoding extracellular proteins, possibly reflecting the varied functional requirement for and the restricted expression of those categories of genes, respectively. The patterns of the first exons were highly diverse as well. On average, there were 7.7 different splicing types of first exons per locus partly produced by the PAPs, suggesting that a wide variety of transcripts can be achieved by this mechanism. 
Our findings suggest that use of alternate promoters and consequent alternative use of first exons should play a pivotal role in generating the complexity required for the highly elaborated molecular systems in humans.","journal":null,"figures":[],"_authors":null},{"id":1175,"pmid":16303743,"pmcid":null,"title":"Signal sequence and keyword trap in silico for selection of full-length human cDNAs encoding secretion or membrane proteins from oligo-capped cDNA libraries.","year":2005,"pages":null,"doi":null,"keywords":[],"mesh":[],"abstractText":"We have developed an in silico method of selection of human full-length cDNAs encoding secretion or membrane proteins from oligo-capped cDNA libraries. Fullness rates were increased to about 80% by combination of the oligo-capping method and ATGpr, software for prediction of translation start point and the coding potential. Then, using 5'-end single-pass sequences, cDNAs having the signal sequence were selected by PSORT ('signal sequence trap'). We also applied 'secretion or membrane protein-related keyword trap' based on the result of BLAST search against the SWISS-PROT database for the cDNAs which could not be selected by PSORT. Using the above procedures, 789 cDNAs were primarily selected and subjected to full-length sequencing, and 334 of these cDNAs were finally selected as novel. Most of the cDNAs (295 cDNAs: 88.3%) were predicted to encode secretion or membrane proteins. In particular, 165 (80.5%) of the 205 cDNAs selected by PSORT were predicted to have signal sequences, while 70 (54.2%) of the 129 cDNAs selected by 'keyword trap' preserved the secretion or membrane protein-related keywords. Many important cDNAs were obtained, including transporters, receptors, and ligands, involved in significant cellular functions. 
Thus, an efficient method of selecting secretion or membrane protein-encoding cDNAs was developed by combining the above four procedures.","journal":null,"figures":[],"_authors":null},{"id":4,"pmid":15489334,"pmcid":null,"title":"The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC).","year":2004,"pages":null,"doi":null,"keywords":[],"mesh":[],"abstractText":"The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. 
Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline.","journal":null,"figures":[],"_authors":null},{"id":8935,"pmid":15340161,"pmcid":null,"title":"Signal peptide prediction based on analysis of experimentally verified cleavage sites.","year":2004,"pages":null,"doi":null,"keywords":[],"mesh":[],"abstractText":"A number of computational tools are available for detecting signal peptides, but their abilities to locate the signal peptide cleavage sites vary significantly and are often less than satisfactory. We characterized a set of 270 secreted recombinant human proteins by automated Edman analysis and used the verified cleavage sites to evaluate the success rate of a number of computational prediction programs. An examination of the frequency of amino acid in the N-terminal region of the data set showed a preference of proline and glutamine but a bias against tyrosine. The data set was compared to the SWISS-PROT database and revealed a high percentage of discrepancies with cleavage site annotations that were computationally generated. The best program for predicting signal sequences was found to be SignalP 2.0-NN with an accuracy of 78.1% for cleavage site recognition. The new data set can be utilized for refining prediction algorithms, and we have built an improved version of profile hidden Markov model for signal peptides based on the new data.","journal":null,"figures":[],"_authors":null},{"id":3,"pmid":14702039,"pmcid":null,"title":"Complete sequencing and characterization of 21,243 full-length human cDNAs.","year":2004,"pages":null,"doi":null,"keywords":[],"mesh":[],"abstractText":"As a base for human transcriptome and functional genomics, we created the \"full-length long Japan\" (FLJ) collection of sequenced human cDNAs. 
We determined the entire sequence of 21,243 selected clones and found that 14,490 cDNAs (10,897 clusters) were unique to the FLJ collection. About half of them (5,416) seemed to be protein-coding. Of those, 1,999 clusters had not been predicted by computational methods. The distribution of GC content of nonpredicted cDNAs had a peak at approximately 58% compared with a peak at approximately 42% for predicted cDNAs. Thus, there seems to be a slight bias against GC-rich transcripts in current gene prediction procedures. The rest of the cDNAs unique to the FLJ collection (5,481) contained no obvious open reading frames (ORFs) and thus are candidate noncoding RNAs. About one-fourth of them (1,378) showed a clear pattern of splicing. The distribution of GC content of noncoding cDNAs was narrow and had a peak at approximately 42%, relatively low compared with that of protein-coding cDNAs.","journal":null,"figures":[],"_authors":null},{"id":98563,"pmid":12928397,"pmcid":null,"title":"Functional requirements for interactions between CD84 and Src homology 2 domain-containing proteins and their contribution to human T cell activation.","year":2003,"pages":null,"doi":null,"keywords":[],"mesh":[],"abstractText":"Cell surface receptors belonging to the CD2 subset of the Ig superfamily of molecules include CD2, CD48, CD58, 2B4, signaling lymphocytic activation molecule (SLAM), Ly9, CD84, and the recently identified molecules NTB-A/Ly108/SLAM family (SF) 2000, CD84H-1/SF2001, B lymphocyte activator macrophage expressed (BLAME), and CRACC (CD2-like receptor-activating cytotoxic cells)/CS-1. Some of these receptors, such as CD2, SLAM, 2B4, CRACC, and NTB-A, contribute to the activation and effector function of T cells and NK cells. Signaling pathways elicited via some of these receptors are believed to involve the Src homology 2 (SH2) domain-containing cytoplasmic adaptor protein SLAM-associated protein (SAP), as it is recruited to SLAM, 2B4, CD84, NTB-A, and Ly-9. 
Importantly, mutations in SAP cause the inherited human immunodeficiency X-linked lymphoproliferative syndrome (XLP), suggesting that XLP may result from perturbed signaling via one or more of these SAP-associating receptors. We have now studied the requirements for SAP recruitment to CD84 and lymphocyte activation elicited following ligation of CD84 on primary and transformed human T cells. CD84 was found to be rapidly tyrosine phosphorylated following receptor ligation on activated T cells, an event that involved the Src kinase Lck. Phosphorylation of CD84 was indispensable for the recruitment of SAP, which was mediated by Y(262) within the cytoplasmic domain of CD84 and by R(32) within the SH2 domain of SAP. Furthermore, ligating CD84 enhanced the proliferation of anti-CD3 mAb-stimulated human T cells. Strikingly, this effect was also apparent in SAP-deficient T cells obtained from patients with XLP. These results reveal a novel function of CD84 on human lymphocytes and suggest that CD84 can activate human T cells via a SAP-independent mechanism.","journal":null,"figures":[],"_authors":null},{"id":2,"pmid":12477932,"pmcid":null,"title":"Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences.","year":2002,"pages":null,"doi":null,"keywords":[],"mesh":[],"abstractText":"The National Institutes of Health Mammalian Gene Collection (MGC) Program is a multiinstitutional effort to identify and sequence a cDNA clone containing a complete ORF for each human and mouse gene. ESTs were generated from libraries enriched for full-length cDNAs and analyzed to identify candidate full-ORF clones, which then were sequenced to high accuracy. The MGC has currently sequenced and verified the full ORF for a nonredundant set of >9,000 human and >6,000 mouse genes. Candidate full-ORF clones for an additional 7,800 human and 3,500 mouse genes also have been identified. 
All MGC sequences and clones are available without restriction through public databases and clone distribution networks (see http://mgc.nci.nih.gov).","journal":null,"figures":[],"_authors":null},{"id":443385,"pmid":11313408,"pmcid":null,"title":"Cloning, expression, and function of BLAME, a novel member of the CD2 family.","year":2001,"pages":null,"doi":null,"keywords":[],"mesh":[],"abstractText":"The CD2 family is a growing family of Ig domain-containing cell surface proteins involved in lymphocyte activation. Here we describe the cloning and expression analysis of a novel member of this family, B lymphocyte activator macrophage expressed (BLAME). BLAME shares the structural features of the CD2 family containing an IgV and IgC2 domain and clusters with the other family members on chromosome 1q21. Quantitative PCR and Northern blot analysis show BLAME to be expressed in lymphoid tissue and, more specifically, in some populations of professional APCs, activated monocytes, and DCs. Retroviral forced expression of BLAME in hematopoietic cells of transplanted mice showed an increase in B1 cells in the peripheral blood, spleen, lymph nodes, and, most strikingly, in the peritoneal cavity. These cells do not express CD5 and are CD23(low)Mac1(low), characteristics of the B1b subset. BLAME may therefore play a role in B lineage commitment and/or modulation of signal through the B cell receptor.","journal":null,"figures":[],"_authors":null}]