Research Description

Modern biological research is characterized by an integration of scale. Scientists study phenomena in a context ranging from the molecular to the organismal and population levels. Information from vast and diverse databases is brought to bear on any particular research question. Such approaches are now often referred to as "systems biology". Within this larger context, our group focuses mainly on "genome informatics", or more broadly "computational genome science". On the genomic scale, we try to understand the functional units in a genome (identification of protein-coding genes and RNA genes; organization of transposable elements and other repetitive sequences), how their expression is controlled, and how they may have come to be what and where they are (comparative genomics, genome dynamics, molecular phylogeny). On the molecular scale, we are particularly interested in the process of pre-mRNA splicing (identification of splice sites and characterization of splicing factors). Most of our data work in recent years has focused on plant genomes, but we are also exploring other genomes (in particular arthropod genomes), and our tools apply widely.

We pursue our research with a combination of computational and experimental approaches. Typically, we start with a biological question, then derive a statistical model for evaluating the data, develop algorithms and software to organize and analyze the data for the study, and finally interpret the results, which more often than not leads to further biological questions and another iteration of research.

A key aspect of this approach is efficient data management, and thus we devote much of our effort to the development of bioinformatics databases and data management tools. Our large-scale studies increasingly involve cyberinfrastructure-enabled high-performance computing, and we seek to contribute to the development of relevant domain-specific cyberinfrastructure.