Pathway-Based Functional Analysis of Metagenomes

Abstract

Metagenomic data enables the study of microbes and viruses through their DNA as retrieved directly from the environment in which they live. Functional analysis of metagenomes explores the abundance of gene families, pathways, and systems, rather than their taxonomy. Through such analysis researchers are able to identify those functional capabilities most important to organisms in the examined environment. Recently, a statistical framework for the functional analysis of metagenomes was described that focuses on gene families. Here we describe two pathway level computational models for functional analysis that take into account important, yet unaddressed issues such as pathway size, gene length and overlap in gene content among pathways. We test our models over carefully designed simulated data and propose novel approaches for performance evaluation. Our models significantly improve over current approach with respect to pathway ranking and the computations of relative abundance of pathways in environments.