PhD Proposal: Algorithms for characterizing microbial communities

Complex microbial communities of bacteria, archaea, fungi, and viruses play a crucial role in environmental and human health. With the advent of high-throughput sequencing, the field of metagenomics has emerged to study the genomic characteristics of a microbial community as a whole. While metagenomics has provided a new lens on the microbial world, many computational challenges remain in downstream analyses. Determining the taxonomic origin of sequenced genomic fragments is one of the first analyses performed in a typical metagenomic study. Substantial research has focused on the development of taxonomic classification methods, which often trade off computational efficiency against classification accuracy. Most methods yield coarse-grained taxonomic resolution because of ambiguities in taxonomic assignments. To address this issue, we have developed a new taxonomic classification method, called ATLAS, that uses significant outliers within database search results to cluster sequences in the database into taxonomic groups. We show that ATLAS provides the maximal taxonomic resolution attainable given the database used and the dataset being classified. We conclude with proposed work in the areas of taxonomic classification, metagenomic binning, and inference of microbial interaction networks. With the wealth of data being generated in metagenomic studies, we believe our algorithms and software tools will facilitate biological discovery.

Examining Committee: