Abstract

The rapid growth of metagenomics has produced many metagenomics classification tools, yet classifying closely related species remains a challenge. Here, we compared the performance of MetaPhlAn2, kallisto, and Kraken in two metagenomics settings: human metagenomics and environmental metagenomics. In our comparison, kallisto showed higher sensitivity than MetaPhlAn2 and Kraken, and better quantification accuracy than Kraken at the species level. We also found that classification tools that run on full reference genomes reported many species that were not truly present. To reduce these false positives, we introduced the marker genes from MetaPhlAn2 into our pipeline, which uses kallisto for the classification step, as an additional filter for species detection.