
Tuesday, March 25, 2014

I just submitted a new version (Rphylip 0.1-23) of the Rphylip package ('an R interface for PHYLIP') to CRAN. This version fixes a couple of bugs in the first CRAN version - including taxon name length limits in Rconsense and Rtreedist (which are present in the corresponding PHYLIP programs CONSENSE and TREEDIST, but can easily be circumvented when calling PHYLIP from within R). It also introduces the new function Rseqboot, which is an R interface to SEQBOOT. SEQBOOT implements a range of non-parametric bootstrapping, jackknife, and data permutation methods. Because it can take a variety of different character types as input, writing the interface was a bit of a pain in the butt - but as of today it is finished to my satisfaction.

Here's a demo of bootstrapping, distance matrix calculation from all bootstrap samples, and then consensus phylogeny estimation using the Rphylip package:
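For readers without PHYLIP installed, here is a rough sketch of the same three steps (bootstrap the alignment, compute a distance matrix from each replicate, build a consensus tree) using only functions from ape. This is a stand-in under stated assumptions, not the Rphylip calls themselves: the woodmouse example data, the K80 distance model, neighbor-joining, and the choice of 50 replicates are all assumptions made for illustration.

```r
library(ape)

data(woodmouse)   # example DNA alignment shipped with ape (assumed input)
set.seed(1)

nboot   <- 50     # number of bootstrap replicates (arbitrary choice)
n.sites <- ncol(woodmouse)

# Resample alignment columns with replacement, then estimate a
# neighbor-joining tree from the distances of each pseudo-replicate.
boot.trees <- lapply(seq_len(nboot), function(i) {
  cols <- sample(n.sites, replace = TRUE)
  nj(dist.dna(woodmouse[, cols], model = "K80"))
})
class(boot.trees) <- "multiPhylo"

# Majority-rule consensus of the bootstrap trees.
cons <- consensus(boot.trees, p = 0.5)
```

Calling plot(cons) would then draw the consensus topology.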

Cool. Note that RETREE in PHYLIP does midpoint rooting, but this program cannot yet be called with Rphylip. The same general pipeline could be used with ML, MP, or other phylogeny inference methods in PHYLIP (although this would be slower, of course).

This version of Rphylip is already available from GitHub, but hopefully will also be accepted on CRAN soon.

Friday, March 14, 2014

Today a phytools user contacted me about creating a plot that looks like this. Well, I'm not going to try to duplicate this exactly, but here is a quick demo of how to put a bar plot next to a plotted tree with a continuous character map overlain:
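The general layout idea can be sketched with ape and base graphics alone. This is a minimal illustration, not the full demo: the random tree and trait values are hypothetical, and the continuous character map itself (e.g., from contMap in phytools) is left out so that only the tree-plus-bars alignment is shown.

```r
library(ape)

set.seed(42)
tree <- rcoal(12)                                           # hypothetical tree
x <- setNames(runif(12, min = 1, max = 5), tree$tip.label)  # hypothetical trait

layout(matrix(1:2, nrow = 1), widths = c(0.6, 0.4))

# Left panel: the tree, without tip labels.
par(mar = c(4, 1, 1, 0))
plot(tree, show.tip.label = FALSE)
pp <- get("last_plot.phylo", envir = .PlotPhyloEnv)

# Tip names ordered from the bottom of the plot to the top.
tip.order <- tree$tip.label[order(pp$yy[seq_len(pp$Ntip)])]

# Right panel: unit-width bars with no gaps are centred at 0.5, 1.5, ...,
# so shifting the tree's y limits by 0.5 lines bar i up with the tip at y = i.
par(mar = c(4, 0, 1, 1))
barplot(x[tip.order], horiz = TRUE, width = 1, space = 0,
        ylim = pp$y.lim - 0.5, axisnames = FALSE, xlab = "trait value")
```

The two panels share the same top and bottom margins, which is what keeps the bars level with the tips.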

Wednesday, March 12, 2014

A slightly more fully functional version of roundPhylogram (described on the blog earlier today) is now part of the phytools package. The function source code can be viewed here, and updated phytools package source can be downloaded from the phytools page.

Tuesday, March 11, 2014

Rphylip, an R interface for J. Felsenstein's PHYLIP phylogeny methods program package, is now on CRAN.

Rphylip is a collaborative project with Scott Chamberlain at Simon Fraser University. Although I did most of the programming of Rphylip, Scott wrote some of the initial interface code that helped me to get started on the project, he has helped to debug the package, he has contributed some of the example datasets, and he is co-authoring the program note that we are presently preparing for journal submission.

Rphylip now contains interface functions for close to 90% of the programs of the PHYLIP package. In almost every case, all or nearly all of the functionality of the PHYLIP programs is transferred to the R user. The Rphylip functions, like PHYLIP itself, cover an enormous range of applications - from phylogeny inference, to distance matrix calculation, to phylogenetic comparative methods. In every case, we have done our best to preserve all the functionality of the PHYLIP programs while allowing them to be integrated seamlessly into an R workflow. Rphylip also contains a number of helper functions which both broaden the functionality of PHYLIP (albeit slightly), and allow it to be more easily used. For instance, the user does not necessarily need to supply the path to the PHYLIP executable. If a path is not supplied, then Rphylip will search common locations for the PHYLIP executables (such as in C:\Program Files\ on a Windows machine). Rphylip even includes a function (setupOSX) that automates the somewhat complicated procedure of installing PHYLIP on a Mac OS X computer.

Of course, before Rphylip can be used, PHYLIP must first be installed. Furthermore, any use of Rphylip in publication should automatically trigger a citation of PHYLIP (as well as relevant references for the particular method employed).

This release of Rphylip in advance of submitting our program note for publication should be considered highly beta. We welcome any feedback on the package or its use. At the moment of writing, only the package source and Mac OS binary are available. I expect that a Windows package binary should be posted soon.

Finally (oops!) I accidentally misentered the package name in the R package DESCRIPTION file which is responsible for the double ("Rphylip: Rphylip: .....") package header on CRAN. I'm aware of this & it will be fixed in future releases.

I've been using your phytools package (and the findMRCA function with type="height") to find the height of a specific node in a phylogenetic tree. However, the computation time is rather long for my purpose and I was thinking about using fastMRCA. However, this outputs only the node number of the parental node. Is there a chance to get the coalescent time of this node (as with the type argument of findMRCA)? Or do you have another general suggestion to increase speed? (I would really only need to calculate the time of coalescence of two tips of the tree.)

Unfortunately, it doesn't matter which (findMRCA or fastMRCA) is used in this case because the bottleneck is nodeHeights, which is used to compute the height of the common ancestor above the root. Here's a quick demo to illustrate that:

However, we can take a different approach. What if we instead (1) found all the ancestors (back to the root) of each tip, (2) identified the intersection of these two sets, and (3) summed the parent branch length for each ancestral node above the root in the intersection? Here's a function that does this and it runs pretty fast!
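The three steps above can be sketched in base R as follows. This is a minimal illustration rather than the function from the post: the name mrcaHeight is hypothetical, and it assumes an ape-style "phylo" object in which tips are numbered 1 to Ntip and the root is node Ntip + 1.

```r
# Height above the root of the MRCA of two tips.
# Assumes an ape-style "phylo" list: `edge` is a (parent, child) matrix,
# tips are numbered 1..Ntip and the root is node Ntip + 1.
mrcaHeight <- function(tree, sp1, sp2) {
  root <- length(tree$tip.label) + 1
  # (1) all ancestors of a tip, walking back to the root
  ancestors <- function(tip) {
    node <- match(tip, tree$tip.label)
    anc <- integer(0)
    while (node != root) {
      node <- tree$edge[tree$edge[, 2] == node, 1]
      anc <- c(anc, node)
    }
    anc
  }
  # (2) shared ancestors; the root has no parent edge, so drop it
  shared <- setdiff(intersect(ancestors(sp1), ancestors(sp2)), root)
  # (3) sum the parent branch lengths of the shared ancestors
  sum(tree$edge.length[match(shared, tree$edge[, 2])])
}

# Tiny hand-built tree, ((A:1,B:1):2,C:3);
tree <- structure(list(
  edge = matrix(c(4, 5,
                  5, 1,
                  5, 2,
                  4, 3), ncol = 2, byrow = TRUE),
  edge.length = c(2, 1, 1, 3),
  tip.label = c("A", "B", "C"),
  Nnode = 2L), class = "phylo")

mrcaHeight(tree, "A", "B")  # 2: the (A,B) ancestor sits 2 units above the root
mrcaHeight(tree, "A", "C")  # 0: the MRCA of A and C is the root itself
```

Because only the two root-ward paths are traversed, the cost scales with the depth of the tips rather than with the size of the whole tree, which is why this avoids the nodeHeights bottleneck.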

Thursday, March 6, 2014

I just added a new function cladelabels to the phytools package. This function is in some ways analogous to nodelabels, tiplabels, etc. in ape. It basically implements the method I gave here, while taking advantage of the trick described here.

The code for this function is here, and it is also in a new phytools version, which can be downloaded here.

Here's a demo:

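## this is just code to get a "realistic" looking tree
library(phytools)
## assumes pbtree as the simulator (it accepts tip.label and scale)
tree<-pbtree(n=26,tip.label=
	paste(LETTERS,"._",sapply(round(runif(n=26,min=3,max=6)),
	function(x) paste(sample(letters,x),collapse="")),sep=""),
	scale=1)
## assumes the chi-squared noise is added to the existing branch lengths
tree$edge.length<-tree$edge.length+rchisq(n=
	length(tree$edge.length),df=10)/10
plotTree(tree) ## also can use plot.phylo
nodelabels()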
nodelabels()

Now let's label the clades descended from nodes 46 and 33, and then also node 28 (which is inclusive of node 33):

Cool - this was more or less what we were going for. We can also do this without sending cladelabels the tree, although in this case we need to provide some guidance on the space that cladelabels should leave for tip labels - otherwise it will assume a fixed width of 8 characters.

One caveat important to mention is that at present this works only for rightward facing phylograms (or cladograms, for tree=NULL). This is not theoretically difficult to extend to other plot types, it just requires more work.
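To give a sense of what a function like this has to do, here is a rough sketch of the underlying idea using ape and base graphics. It is not the cladelabels source: the helper names descTips and labelClade, the gap argument, and the random tree are all hypothetical, and like cladelabels it assumes a rightward facing phylogram.

```r
library(ape)

# Tip numbers descended from an internal node (hypothetical helper).
descTips <- function(tree, node) {
  n <- length(tree$tip.label)
  nodes <- node
  tips <- integer(0)
  while (length(nodes) > 0) {
    kids  <- tree$edge[tree$edge[, 1] %in% nodes, 2]
    tips  <- c(tips, kids[kids <= n])
    nodes <- kids[kids > n]
  }
  tips
}

# Draw a vertical bar and a label beside the clade descended from `node`
# (hypothetical helper; must be called after the tree has been plotted).
labelClade <- function(tree, text, node, gap = 0.02) {
  pp   <- get("last_plot.phylo", envir = .PlotPhyloEnv)
  tips <- descTips(tree, node)
  y    <- range(pp$yy[tips])
  # place the bar to the right of the longest tip label
  x <- max(pp$xx) + max(strwidth(tree$tip.label)) + gap * diff(pp$x.lim)
  segments(x, y[1], x, y[2])
  text(x, mean(y), text, pos = 4)
  invisible(y)
}

set.seed(1)
tree <- rtree(20)
h <- max(node.depth.edgelength(tree))
plot(tree, x.lim = c(0, 1.5 * h))          # leave room for the clade label
node <- tree$edge[tree$edge[, 2] == 1, 1]  # parent node of tip 1
yr <- labelClade(tree, "clade A", node)
```

The vertical extent of the bar comes straight from the plotted y coordinates of the descendant tips, which is why the plot orientation matters.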

Wednesday, March 5, 2014

I stumbled on this trick while working on something else. When plot.phylo in ape, or plotTree or plotSimmap in phytools, is used to plot a tree, the object last_plot.phylo is created in the environment .PlotPhyloEnv. This object is used by nodelabels, tiplabels, and other functions that add elements to the plotted tree. It contains mostly information about the plotted tree - coordinates of vertices, etc. Today I realized that (almost) the entire "phylo" object can be reconstructed from this variable.
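Here is a minimal sketch of that reconstruction using ape's plot.phylo. It assumes a rightward facing phylogram plotted with branch lengths, so that the x coordinate of each node is its height above the root. Tip labels are not stored in last_plot.phylo, so placeholder labels are used (which is the "almost").

```r
library(ape)

set.seed(1)
tree <- rtree(10)   # hypothetical example tree
plot(tree)

pp <- get("last_plot.phylo", envir = .PlotPhyloEnv)

# Each branch length is the difference in x between child and parent node.
rebuilt <- list(
  edge        = pp$edge,
  edge.length = pp$xx[pp$edge[, 2]] - pp$xx[pp$edge[, 1]],
  Nnode       = pp$Nnode,
  tip.label   = paste0("tip", seq_len(pp$Ntip))  # labels are not recoverable
)
class(rebuilt) <- "phylo"
```

Matching edges by their child node number, the recovered branch lengths should agree with the originals up to floating point error.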

Tuesday, March 4, 2014

Luke Harmon, Andrew Crawford, and I will be co-teaching a phylogeny methods in R workshop at the Universidad de los Andes, Bogotá, Colombia this summer from the 8th to the 11th of July. This course is funded by the NSF and co-sponsored by U. los Andes and the University of Massachusetts Boston (my home institution).

We are pleased to announce a new graduate-level intensive short course
on the use of R for phylogenetic comparative analysis. The course will
be four days in length and will take place at the Universidad de los
Andes, Bogotá, Colombia from the 8th to the 11th of July, 2014. This
course is partially funded by the National Science Foundation, with
additional support from the University of Massachusetts Boston and the
Universidad de los Andes. There are a number of full stipends available
to cover the cost of travel, room and board for qualified students and
post-docs. Applicants are welcome from any country; however we expect
that most admitted students will come from Colombia and the Andean
region. Accepted students from further afield may be offered only
partial funding for their travel expenses. Topics covered will include:
an introduction to the R programming language, tree manipulation,
independent contrasts and phylogenetic generalized least squares,
ancestral state reconstruction, models of character evolution,
diversification analysis, and community phylogenetic analysis. Course
instructors will include Dr. Liam Revell (University of Massachusetts
Boston), Dr. Luke Harmon (University of Idaho), and Dr. Andrew J.
Crawford (Universidad de los Andes).

Instruction in the course will be primarily in English; however some of
the instructors and TAs of the course are competent or fluent in Spanish
and English. Discussion, exercises, and activities will be conducted in
both languages.

To apply for the course, please submit your CV along with a short
(maximum 1 page) description of your research interests, background, and
reasons for taking the course. Admission is competitive, and preference
will go towards students with background in phylogenetics and a
compelling motivation for taking the course. Applications should be
submitted by email to bogota.phylogenetics.course@gmail.com by May 1st,
2014. Applications may be written in English or Spanish; however all
students must have a basic working knowledge of scientific English.
Questions can be directed to liam.revell@umb.edu.

Monday, March 3, 2014

Some comments on an earlier version of the function rateshift, which identifies one or multiple shifts in the Brownian rate of evolution on the tree, suggested that there were some difficulties in converging to the ML solution. Indeed, this is not too surprising. I have now posted a new version of rateshift that has more robust optimization. Here's a quick demo:

In addition, I was recently informed that the package extrafont has been removed from CRAN. phytools depends on extrafont for the plotting functions xkcdTree and fancyTree. To address this dependency issue I have now removed xkcdTree from phytools (source code is still available from the phytools page) and modified fancyTree so that it no longer uses extrafont. This new version has now been submitted to CRAN. It is already available on phytools.org.

About this blog

This web-log chronicles the development of new tools for phylogenetic analyses in the phytools R package. Unless you are reading a very recent page of the blog, I recommend that you install the latest CRAN version of phytools (or latest beta release) before attempting to replicate any of the analyses of this site. That is because the linked functions may be archived, and very likely have been replaced by newer versions.