Tuesday, January 17, 2017

What is the null hypothesis for a phylogeny?

As noted in the previous blog post (Why do we need Bayesian phylogenetic information content?), phylogeneticists rarely consider whether their data actually contain much phylogenetic information. Nevertheless, the existence of information content in a dataset implies the existence of a null hypothesis of "no information", relative to the objective of the data analysis.

In this regard, Alexander Suh (2016), in a paper on the phylogenetics of birds, makes two important general points:

Every phylogenetic tree hypothesis should be accompanied by a phylogenetic network for visualization of conflicts.

Hard polytomies exist in nature and should be treated as the null hypothesis in the absence of reproducible tree topologies.

It is difficult to argue with the first point, of course. However, the second point is also an interesting one, and deserves some consideration. Suh notes that: "In contrast to ‘soft polytomies’ that result from insufficient data, ‘hard polytomies’ reflect the biological limit of phylogenetic resolution because of near-simultaneous speciation". That is, the distinction is whether polytomies result from simultaneous branching events (hard) or from insufficient sequence information (soft).

The matter of a suitable null hypothesis in phylogenetics has been considered before, for example by Hoelzer and Melnick (1994) and Walsh et al. (1999), who come to essentially the same conclusion as Suh (2016). Clearly, a network cannot be the null hypothesis for a phylogeny, and nor can a resolved tree (even partially resolved); the only logical possibility is a polytomy.

However, it seems to me that the null hypothesis in current practice is effectively a soft polytomy, although most workers never state a hypothesis explicitly. Any evidence that resolves a polytomy seems to be accepted, with the evidence taken in descending order of strength so that stronger evidence overrides any weaker, conflicting evidence. This inevitably produces a tree that is at least partly resolved, which is the alternative hypothesis.
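To make this procedure concrete, here is a minimal sketch of what "taking evidence in descending order of strength" amounts to. It is only an illustration: the clades, the support values, and the function names (compatible, greedy_resolve) are all hypothetical, and clades are treated as groups in a rooted tree for simplicity.

```python
def compatible(a, b):
    """Two clades can coexist in one rooted tree if they are nested or disjoint."""
    return a.isdisjoint(b) or a <= b or b <= a


def greedy_resolve(support):
    """Accept clades in descending order of support, skipping any clade that
    conflicts with one already accepted. No minimum support is required."""
    accepted = []
    ranked = sorted(support.items(), key=lambda kv: kv[1], reverse=True)
    for clade, _ in ranked:
        if all(compatible(clade, other) for other in accepted):
            accepted.append(clade)
    return accepted


# Hypothetical support values for three mutually conflicting groupings.
support = {
    frozenset("AB"): 0.40,
    frozenset("BC"): 0.35,
    frozenset("AC"): 0.25,
}

resolved = greedy_resolve(support)
print(resolved)
```

Even though no grouping here has much more support than its rivals, the procedure still returns a resolved clade (A with B), which is the sense in which a soft-polytomy null is rejected by almost any evidence.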

On the other hand, resolving a hard polytomy requires unambiguous evidence for each branch in the phylogeny. If there is substantial conflict then it can only be resolved as a reticulation, or it must remain a polytomy. The existence of a reticulation, of course, results in a network, not a tree, so that the alternative hypothesis is a network, which may in practice be very tree-like.
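By way of contrast, a hard-polytomy null could be sketched as follows. Again this is only an illustration under stated assumptions: the thresholds min_support and max_conflict are arbitrary, hypothetical values, as are the function names and the support values, and real analyses would need a properly justified criterion for "unambiguous".

```python
def compatible(a, b):
    """Two clades can coexist in one rooted tree if they are nested or disjoint."""
    return a.isdisjoint(b) or a <= b or b <= a


def resolve_strict(support, min_support=0.95, max_conflict=0.05):
    """Keep a clade only if it is strongly supported AND no conflicting clade
    has more than negligible support. Clades with a substantial rival are
    flagged: they remain a polytomy or become candidates for a reticulation.
    Both thresholds are hypothetical values chosen for illustration."""
    kept, conflicted = [], []
    for clade, s in support.items():
        rival = max(
            (t for other, t in support.items() if not compatible(clade, other)),
            default=0.0,
        )
        if s >= min_support and rival <= max_conflict:
            kept.append(clade)
        elif rival > max_conflict:
            conflicted.append(clade)
    return kept, conflicted


# Hypothetical support values: one uncontested clade, three conflicting ones.
support = {
    frozenset("ABC"): 0.99,
    frozenset("AB"): 0.40,
    frozenset("BC"): 0.35,
    frozenset("AC"): 0.25,
}

kept, conflicted = resolve_strict(support)
print(kept)
print(conflicted)
```

With these numbers only the uncontested clade (A, B and C together) is retained. The three conflicting groupings within it are not forced into a tree; they are left as a polytomy, or passed on to a network analysis as a possible reticulation.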

As a final point, Suh claims that: "Neoaves comprise, to my knowledge, the first empirical example for a hard polytomy in animals." This is incorrect. There is also a hard polytomy at the root of the Placental Mammals, as discussed in this blog post: Why are there conflicting placental roots?