Datamining

Manteia is an integrative database available online at http://manteia.igbmc.fr which provides a large array of OMICs data related to the development of the mouse, chicken, zebrafish and human. The system is designed to use different types of data together in order to perform advanced datamining, test hypotheses or provide candidate genes involved in biological processes or responsible for human diseases. In this new version of the database, Manteia has been enhanced with new expression data originating from microarray and next generation sequencing experiments...

Histone deacetylation plays an important role in transcriptional repression. Previous results showed that the genetic interaction between ttk and rpd3, which encodes a class I histone deacetylase, is required for tll repression. This study investigated the molecular mechanism by which Ttk69 recruits Rpd3. Using yeast two-hybrid screening and datamining, a novel protein that weakly interacts with Ttk69 and Sin3A was found and designated Protein interacting with Ttk69 and Sin3A (Pits). Pits protein was expressed in the early stages of embryos and bound to the region of the tor response element in vivo...

The region known as pX at the 3' end of the human T-cell lymphotropic virus type 1 (HTLV-1) genome contains four overlapping open reading frames (ORFs) that encode regulatory proteins. HTLV-1 ORF-I produces the protein p12 and its cleavage product p8. The functions of these proteins have been linked to immune evasion and to viral infectivity and persistence. HTLV-1 infection does not necessarily lead to the development of pathological processes; here we evaluated whether natural mutations in HTLV-1 ORF-I can influence the proviral load and the clinical manifestation of HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP)...

F420 is a low-potential redox cofactor that mediates the transformations of a wide range of complex organic compounds. Considered one of the rarest cofactors in biology, F420 is best known for its role in methanogenesis and has only been chemically identified in two phyla to date, the Euryarchaeota and Actinobacteria. In this work, we show that this cofactor is more widely distributed than previously reported. We detected the genes encoding all five known F420 biosynthesis enzymes (cofC, cofD, cofE, cofG and cofH) in at least 653 bacterial and 173 archaeal species, including members of the dominant soil phyla Proteobacteria, Chloroflexi and Firmicutes...

tRNAScan-SE is a tRNA detection program that is widely used for tRNA annotation; however, the false positive rate of tRNAScan-SE is unacceptable for large sequences. Here, we used a machine learning method to try to improve the tRNAScan-SE results. A new predictor, tRNA-Predict, was designed. We obtained real and pseudo-tRNA sequences as training data sets using tRNAScan-SE and constructed three different tRNA feature sets. We then set up an ensemble classifier, LibMutil, to predict tRNAs from the training data...

The purpose of this panel is to discuss milestones in, and experiences of, a standardized nursing terminology for the documentation of nursing practice, using the Clinical Care Classification (CCC) as an example. The aim is to describe the value of using the CCC as the standardized nursing terminology and framework for multidisciplinary care plans, and how its interoperability with SNOMED CT, LOINC, and other required terminologies can be used in electronic health record systems. A further aim is to discuss the advantages of a multidisciplinary documentation system and how it affects nursing practice, management, and research, as well as to highlight the monitoring of nursing documentation...

MOTIVATION: Identifying drug-target interactions is an important task in drug discovery. To reduce the heavy time and financial costs of experimental approaches, many computational approaches have been proposed. Although these approaches rely on many different principles, their performance is far from satisfactory, especially in predicting drug-target interactions for new candidate drugs or targets. METHODS: Machine learning approaches to this problem can be divided into two types: feature-based and similarity-based methods...

The use of ontologies has increased rapidly over the past decade and they now provide a key component of most major databases in biology and biomedicine. Consequently, datamining over these databases benefits from considering the specific structure and content of ontologies, and several methods have been developed to use ontologies in datamining applications. Here, we discuss the principles of ontology structure, and datamining methods that rely on ontologies. The impact of these methods in the biological and biomedical sciences has been profound and is likely to increase as more datasets are becoming available using common, shared ontologies...

Recent computational approaches in bioinformatics can achieve high performance and can therefore provide powerful support for real biological experiments, leading biologists to pay more attention to bioinformatics than before. In immunology, predicting peptides that can bind to MHC alleles is an important task that has been tackled by many computational approaches. However, the large number of available methods makes it difficult for immunologists to select an appropriate one. To overcome this problem, we develop an ensemble prediction-based Web server, which we call MetaMHCpan, consisting of two parts: MetaMHCIpan and MetaMHCIIpan, for predicting peptides that can bind MHC-I and MHC-II, respectively...

Deciphering gene-disease associations is an important goal in biomedical research. In this paper, we use a novel relevance measure, called HeteSim, to prioritize candidate disease genes. Two methods are presented, both based on heterogeneous networks constructed from protein-protein interactions, gene-phenotype associations, and phenotype-phenotype similarity. In HeteSim MultiPath (HSMP), HeteSim scores of different paths are combined with a constant that dampens the contributions of longer paths. In HeteSim SVM (HSSVM), HeteSim scores are combined with a machine learning method...
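
As a minimal sketch of the damped combination described for HSMP: the abstract does not give the formula, so the weighting below (a damping constant raised to the path length) is an assumption, and the function name `combine_path_scores`, the parameter `beta`, and the input scores are all hypothetical.

```python
def combine_path_scores(path_scores, beta=0.5):
    """Combine per-path relevance scores into one gene-disease score.

    path_scores: iterable of (path_length, score) pairs, one per path.
    beta: assumed damping constant in (0, 1); longer paths get weight
          beta ** path_length, so they contribute less.
    """
    if not 0 < beta < 1:
        raise ValueError("beta must lie strictly between 0 and 1")
    return sum((beta ** length) * score for length, score in path_scores)


# Illustrative input only: one path of length 2 and one of length 3.
example = [(2, 0.8), (3, 0.4)]
combined = combine_path_scores(example, beta=0.5)
```

With this weighting, a path of length 3 counts half as much as a path of length 2 carrying the same score, which is one simple way to realize "dampening the contributions of longer paths".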

OBJECTIVES: To summarize excellent current research in the field of Bioinformatics and Translational Informatics with application in the health domain and clinical care. METHOD: We provide a synopsis of the articles selected for the IMIA Yearbook 2015, from which we attempt to derive a synthetic overview of current and future activities in the field. As last year, a first step of selection was performed by querying MEDLINE with a list of MeSH descriptors completed by a list of terms adapted to the section...

PURPOSE: Large amounts of routine radiotherapy (RT) data are available, which can potentially add clinical evidence to support better decisions. A developing collaborative Australian network, with a leading European partner, aims to validate, implement and extend European predictive models (PMs) for Australian practice and assess their impact on future patient decisions. Wider objectives include: developing multi-institutional rapid learning, using distributed learning approaches; and assessing and incorporating radiomics information into PMs...

PURPOSE: The purpose of this study is to investigate the relationship between computed tomographic (CT) texture features of primary lesions and metastasis-free survival for rectal cancer patients; and to develop a datamining prediction model using texture features. METHODS: A total of 220 rectal cancer patients treated with neoadjuvant chemo-radiotherapy (CRT) were enrolled in this study. All patients underwent CT scans before CRT. The primary lesions on the CT images were delineated by two experienced oncologists...

PURPOSE: A national Australian inter-university medical physics (MP) group was formed in 2011/12, supported by Department of Health Better Access to Radiation Oncology (BARO) seed funding. Core membership includes the six universities providing postgraduate MP courses. Objectives include increasing the capacity, development and efficiency of national academic MP structures/systems, and hence supporting education, clinical training and research for the MP workforce. Although the BARO scheme focuses on Radiation Oncology, the group has wider MP interests...

Prescriptions containing Polygoni Multiflori Caulis that were formulated by Prof. Yan were collected to build a database based on the traditional Chinese medicine (TCM) inheritance assist system. Association rule mining with the Apriori algorithm was used to obtain the frequency of single drugs, the frequency of drug combinations, the association rules between drugs, and the core drug combinations. The datamining results indicated that, in the prescriptions containing Polygoni Multiflori Caulis, the most frequently used drugs were parched Ziziphi Spinosae Semen, Ostreae Concha, Ossis Mastodi Fossilia, Salviae Miltiorrhizae Radix Et Rhizoma, Paeoniae Rubra Radix, and so on...
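
The frequency counts and association rules mentioned above can be sketched as follows. This is an illustration only, restricted to single drugs and drug pairs (the first two levels of an Apriori-style search), not the TCM inheritance assist system itself. The function name `pair_rules`, the thresholds, and the placeholder drug names are assumptions.

```python
from collections import Counter
from itertools import combinations


def pair_rules(prescriptions, min_support=0.5, min_confidence=0.8):
    """Return (lhs, rhs, support, confidence) rules between pairs of drugs.

    support    = fraction of prescriptions containing both drugs
    confidence = fraction of prescriptions with lhs that also contain rhs
    """
    n = len(prescriptions)
    item_counts = Counter()
    pair_counts = Counter()
    for prescription in prescriptions:
        items = sorted(set(prescription))  # ignore repeats within a prescription
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue  # infrequent pair: pruned, as in Apriori
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Placeholder data, not taken from the study.
toy = [
    ["herb_a", "herb_b"],
    ["herb_a", "herb_b", "herb_c"],
    ["herb_a", "herb_c"],
    ["herb_b"],
]
rules = pair_rules(toy)
```

On the toy data the only rule that passes both thresholds is `herb_c -> herb_a`: the pair appears in half of the prescriptions, and every prescription containing `herb_c` also contains `herb_a`.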

MOTIVATION: Multiple sequence alignment (MSA) is important work, but bottlenecks arise in the massive MSA of homologous DNA or genome sequences. Most of the available state-of-the-art software tools cannot address large-scale datasets, or they run rather slowly. The similarity of homologous DNA sequences is often ignored. Lack of parallelization is still a challenge for MSA research. RESULTS: We developed two software tools to address the DNA MSA problem. The first employed trie trees to accelerate the centre star MSA strategy...
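
For readers unfamiliar with the centre star strategy, a minimal sketch of its first step is shown here: choosing as centre the sequence with the smallest total distance to all others. It uses plain edit distance and does not reproduce the trie-tree acceleration or parallelization the abstract refers to. The function names and example sequences are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def choose_centre(seqs):
    """Index of the sequence minimizing the summed distance to all others."""
    totals = [sum(edit_distance(s, t) for t in seqs) for s in seqs]
    return totals.index(min(totals))


# Illustrative sequences only.
seqs = ["ACGT", "ACGA", "TCGT"]
centre = choose_centre(seqs)
```

In the full strategy, every other sequence is then aligned pairwise to the chosen centre and the pairwise alignments are merged into one multiple alignment. The all-pairs distance computation in this sketch is quadratic in the number of sequences, which is the kind of cost that faster centre selection aims to avoid.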

BACKGROUND: Computational prediction of major histocompatibility complex class II (MHC-II) binding peptides can assist researchers in understanding the mechanisms of the immune system and in developing peptide-based vaccines. Although many computational methods have been proposed, the performance of these methods is far from satisfactory. The difficulty of MHC-II peptide binding prediction comes mainly from the large variation in the length of binding peptides. METHODS: We develop a novel method based on multiple instance learning, called MHC2MIL, to predict MHC-II binding peptides...

BACKGROUND: DNA-binding proteins are vital for the study of cellular processes. In recent genome engineering studies, the identification of proteins with certain functions has become increasingly important and needs to be performed rapidly and efficiently. In previous years, several approaches have been developed to improve the identification of DNA-binding proteins. However, the currently available resources are insufficient to accurately identify these proteins. Because of this, the previous research has been limited by the relatively unbalanced accuracy rate and the low identification success of the current methods...