Pages

Tuesday, March 28, 2017

force.ultrametric method for ultrametric phylogenies that fail ape::is.ultrametric due to numerical precision

Sometimes in R a phylogenetic tree will fail the check
is.ultrametric even though we know our tree is
ultrametric. Why?

Generally speaking, this is because a tree in R is actually never*
precisely ultrametric due to limitations in numerical precision
inherent to all computer machinery. (*The exception is when our tree has
integer branch lengths, in which case it could be exactly
ultrametric.)

Consequently, is.ultrametric has an argument
tol = .Machine$double.eps0.5 which sets the tolerance
level - basically how much deviation from precise ultrametricity (is this
a word?) should be permitted.

Unfortunately, sometimes trees that should be ultrametric that have been
written to file & then read into R will fail this test with tol
set at its default value. In addition, because the default value for
tol is a function of the maximum numerical precision of the
machine, this value can theoretically vary between computers & R versions!

Here is a quick example of how a tree that seems to be ultrametric
can fail a check of is.ultrametric:

I don't know about you - but I have certainly seen a lot of 'ultrametric'
trees written to file with 7 digit precision!

Here is a very simple function that will force our “nearly ultrametric”
tree to be precisely ultrametric. It basically wraps two different methods.
One is the function nnls.tree in the package phangorn by
Klaus Schliep in my lab. For a given
tree, this function can find the set of edge lengths with implied distances
with minimum sum-of-squared differences to the true distances - in this
case the patristic distances on our phylogeny.

The second method is very simple. It just computes the total amount of edge
length that would have to be added to all the tips of tree to render the
tree ultrametric. This seems nice, but it will concentrate the fudge
we are applying to our tree edges to external edges only. I believe that
this method is also implemented in a popular package called
BioGeoBEARS
by Nick Matzke.

Note that neither of these is a formal statistical method for estimating
an ultrametric tree from a tree in which branch lengths are proportional
to (for instance) molecular sequence evolution - there are a number of
different approaches for doing this. Rather, this method is designed
merely to 're-adjust' the edge lengths of a tree that has lost numerical
precision from being written to file & thus fails R checks such as
is.ultrametric.

9 comments:

is.ultrametric() has the option 'option' to choose the method used to test ultrametricity: the default is the scaled range. The other possibility is the variance (option = 2) which used to be the default before ape 4.0.

To avoid issues with the number of printed digits in write.tree/nexus, one can use readRDS instead: it makes smaller files that are faster to read/write, and no digits are lost. Of course, you won't be able to open such files with other software than R.

A third way to make a tree ultrametric: branching.times() does not test whether the tree is ultrametric, so the values output by this function from an "almost ultrametric" tree can then be input to compute.brtime().

For context, I should add that Emmanuel discovered this post when I shared it in a bammtools issue discussion regarding recent changes to is.ultrametric in ape, which may be relevant to this discussion:

This post was very well written, and it also contains a lot of useful facts. I enjoyed your distinguished way of writing the post. Thanks, you have made it easy for me to understand. download instagram videos

About this blog

This web-log chronicles the development of new tools for phylogenetic analyses in the phytools R package. Unless you a reading a very recent page of the blog, I recommend that you install the latest CRAN version of phytools (or latest beta release) before attempting to replicate any of the analyses of this site. That is because the linked functions may be archived, and very likely have been replaced by newer versions.