Mitigating the impact of rogue genes in phylogenomic studies

Several recent studies have shown that support for contentious relationships in phylogenomic studies can be driven by a few genes or even a single gene. By nearly quadrupling the number of genomes (from 86 to 332) used to reconstruct the phylogeny of budding yeasts, we were able to robustly infer several previously contentious relationships and reduce the overall proportion of contentious relationships on the phylogeny. Remarkably, we found that the unusually large influence of a single rogue gene on a specific branch was ameliorated by the targeted addition of the genomes of three species.


The advent of genomics and the ever-increasing amount of new DNA sequence data have given a tremendous boost to phylogenetics, the study of the evolutionary relationships among organisms and their genes. We can now seriously contemplate sequencing the genomes of all living organisms, and with those genomes at hand, reconstructing the entire tree of life no longer seems a pipe dream but a realistic and attainable goal.

Although our new study's emphasis is on understanding the evolution of genes and traits involved in metabolism, one of the key achievements of our work is the generation of a genome-scale phylogeny and timetree that captures the diversity of budding yeasts (our analyses included genomes from 79 of the 92 recognized budding yeast genera!). In the accompanying image of the phylogeny, species names have been removed and each lineage is shown in a different color (Ascoideaceae is part of the CUG-Ser2 clade, shown in light green at around the 5 o'clock position).

Painstaking examination of the robustness of this new budding yeast phylogeny, using both different types of analyses and different subsets of genes, revealed that ~10% (32 / 331) of branches show conflict between analyses. Interestingly, this level of conflict is lower than the ~13% (11 / 85) we observed in our 2016 analyses of the 86-species budding yeast phylogeny.
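For readers who want to check the arithmetic behind those two percentages, here is a minimal Python sketch. The branch counts are the ones quoted above; the dictionary labels and the helper name `conflict_percent` are made up for this illustration and do not come from the study's own analysis code.

```python
# Branch counts as quoted in the text: (conflicting branches, total branches).
# The labels and the helper below are illustrative names, not from the study.
counts = {
    "332-genome phylogeny": (32, 331),
    "86-genome phylogeny (2016)": (11, 85),
}


def conflict_percent(conflicting: int, total: int) -> float:
    """Return the percentage of branches that conflict between analyses."""
    if total <= 0:
        raise ValueError("total number of branches must be positive")
    return 100.0 * conflicting / total


percents = {name: conflict_percent(c, t) for name, (c, t) in counts.items()}

for name, pct in percents.items():
    print(f"{name}: {pct:.1f}% of branches show conflict")
```

Rounded to whole numbers, these values match the ~10% and ~13% figures given above.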

But what about the placement of the family Ascoideaceae and the CUG-Ser2 clade in the new, expanded budding yeast phylogeny? The addition of the genomes of three more species appears to have solidified support for this topology.

And the DPM1 gene, you may ask? Remarkably, inclusion of the DPM1 sequences from the three new species in the Ascoideaceae / CUG-Ser2 clade appears to have dramatically reduced the gene's phylogenetic signal. Interestingly, when we exclude these three species from the data matrix, DPM1 regains its unusually strong phylogenetic signal:

These results are consistent with simulation studies showing that adding species can dramatically increase phylogenetic accuracy. This is good news for efforts to assemble life's family tree from genomic data. If current controversies in phylogenetics are any indication, robustly inferring life's entire family tree will require all the help that we can get!

