Abstract

Mycobacterium tuberculosis is an obligate human pathogen capable of persisting in individual hosts for decades. We sequenced the genomes of 21 strains representative of the global diversity and six major lineages of the M. tuberculosis complex (MTBC) at 40- to 90-fold coverage using Illumina next-generation DNA sequencing. We constructed a genome-wide phylogeny based on these genome sequences. Comparative analyses of the sequences showed, as expected, that essential genes in MTBC were more evolutionarily conserved than nonessential genes. Notably, however, most of the 491 experimentally confirmed human T cell epitopes showed little sequence variation and had a lower ratio of nonsynonymous to synonymous changes than seen in essential and nonessential genes. We confirmed these findings in an additional data set consisting of 16 antigens in 99 MTBC strains. These findings are consistent with strong purifying selection acting on these epitopes, implying that MTBC might benefit from recognition by human T cells.