Thursday, March 17, 2011

Computing phylogenetic signal

I have posted a new function to compute phylogenetic signal for continuous traits using two methods: the lambda method of Pagel (1999) and the K statistic of Blomberg et al. (2003). Rich Glor already created a very helpful wiki page on how to do this in R using several different functions. My function, phylosig() (code here), should make this even easier. With phylosig() you can specify the method (presently either "K" or "lambda") and optionally return a P-value, either from the randomization test of Blomberg et al. (for K) or from a likelihood ratio test against the null hypothesis that lambda=0 (for lambda).

For instance, after loading our function from source, to compute K and conduct a hypothesis test of the null hypothesis of no signal we just type:
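A call along these lines might look like the sketch below. It assumes the phylosig() shipped in the current CRAN phytools rather than the sourced file, and it simulates a placeholder tree and trait with pbtree() and fastBM(); the object names, the seed, and the number of tips are illustrative only.

```r
library(phytools)  # assumption: CRAN phytools, which also loads ape

set.seed(1)
tree <- pbtree(n = 50)  # hypothetical 50-tip pure-birth tree
x <- fastBM(tree)       # named trait vector simulated under Brownian motion

# Blomberg et al.'s K, with a randomization test of the null of no signal
res_K <- phylosig(tree, x, method = "K", test = TRUE, nsim = 1000)
print(res_K)

# Pagel's lambda, with a likelihood ratio test against lambda = 0
res_lambda <- phylosig(tree, x, method = "lambda", test = TRUE)
print(res_lambda)
```

With real data, x should be a vector whose names match the tip labels of the tree.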

32 comments:

Nice function. It seems a bit faster than K estimation by phylosignal (especially at large numbers of iterations), and MUCH MUCH faster than geiger's lambda estimation.

I had a question: Do you know the relative strengths of K vs. lambda relative to tree size? I know a paper a few years back suggested that trees with ~20 species are too small to estimate K robustly, but what about lambda?

@Scott - you could try investigating this by simulating and looking at the variance in K across simulations. Just an idea.
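One way to try this is sketched below, assuming phytools is installed. The tree sizes, the number of replicates, and the choice of pure-birth trees with Brownian motion traits are arbitrary assumptions for illustration, not a recommended design.

```r
library(phytools)  # assumption: CRAN phytools

set.seed(42)
sizes <- c(20, 50, 100)  # hypothetical tree sizes to compare
nrep <- 50               # simulations per tree size

# For each size: simulate a tree and a Brownian trait, then record K
k_var <- sapply(sizes, function(n) {
  k <- replicate(nrep, {
    tree <- pbtree(n = n)
    x <- fastBM(tree)
    as.numeric(phylosig(tree, x, method = "K"))
  })
  var(k)  # spread of K across simulations at this size
})
names(k_var) <- sizes
print(k_var)
```

The same loop with method = "lambda" would give a comparable look at lambda, although it will run more slowly because lambda is estimated by likelihood.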

@Dave - please let me know about your success (or failure) using this or any of my other functions with especially large phylogenies. I have not tested these functions, in most cases, for more than a few hundred species.

Hi Nicole. Are some species in the tree missing from the data? (Note that the data should be a vector with names equal to the species names.) To check, you can try the name.check() function in the "geiger" package. This function, I believe, returns a list containing two elements: the species found in the data but not the tree, and the species found in the tree but not the data. I hope this is of some help. - Liam
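The same kind of check can be sketched in base R with setdiff(), without relying on geiger. The species names and trait values below are invented, and tip_labels stands in for tree$tip.label.

```r
# Hypothetical tip labels (in practice: tree$tip.label)
tip_labels <- c("sp_a", "sp_b", "sp_c", "sp_d")

# Hypothetical trait data: a numeric vector named by species
x <- c(sp_a = 1.2, sp_b = 0.7, sp_e = 2.1)

# Species in the tree that have no data
in_tree_not_data <- setdiff(tip_labels, names(x))

# Species with data that are absent from the tree
in_data_not_tree <- setdiff(names(x), tip_labels)

print(in_tree_not_data)
print(in_data_not_tree)
```

If both results are empty, the names in the data and the tree match exactly.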

This works great! I entered a data set with 65 species, then read in a nexus file with 76 species (I did not add lines with NA for the extra 11 species). The analysis worked and I confirmed the result by doing the calculations from the distance matrix. I also assume the code re-orders the data after it removes taxa from the tree (which I think is this code: tree<-drop.tip(tree,tree$tip.label[-match(names(x),tree$tip.label)])). I also had trouble with my analysis as I had branch lengths of 0 at the tips, but I overcame this by replacing them with very small numbers (i.e., 0.000001). Anyway, great code, thanks Liam!
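The two steps mentioned in this comment, nudging zero-length terminal branches and pruning the tree to the species with data, can be sketched with ape as follows. The tree, the trait values, and the 1e-6 replacement are made up for illustration, and this is not necessarily what phylosig() does internally.

```r
library(ape)

# Hypothetical 4-tip tree with one zero-length terminal branch (tip "b")
tree <- read.tree(text = "((a:1,b:0):1,(c:1,d:1):1);")

# Hypothetical trait data for 3 of the 4 species, not in tip order
x <- c(c = 0.4, a = 1.3, b = 1.1)

# Replace zero-length terminal branches with a very small positive length
is_terminal <- tree$edge[, 2] <= Ntip(tree)
zero_tips <- is_terminal & tree$edge.length == 0
tree$edge.length[zero_tips] <- 1e-6

# Drop tips that have no data, then put the data in tip-label order
pruned <- drop.tip(tree, setdiff(tree$tip.label, names(x)))
x <- x[pruned$tip.label]

print(pruned$tip.label)
print(x)
```

Using setdiff() here avoids a pitfall of the negative-index form quoted above: when no tips need dropping, setdiff() simply returns an empty set of labels.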

This function is amazingly faster than geiger!!! I am having a small problem, though. I can get everything to work for the K statistic with no problems, but when I try to simulate no phylogenetic signal for lambda I get this error message: "Error in invCl %*% x : non-conformable arguments". I was hoping you might have some useful suggestions.

Hi Liam, I was actually referring to the physig_sim that compares the response of K or lambda to tree size. I want to know which is best for my data of over 200 species. I tried to call the function with physig_sim(tree, trait), but it didn't work.

Also, it would probably be better to state that phylosig() doesn't estimate signal for multiple traits at the same time.

This just made my day. I've been trying to figure out why the fitContinuous examples under "Phylogenetic Signal" compare lnl under a lambda vs. BM model rather than under lambda=0 vs lambda=observed. And today I also ran into the arbitrary upper limit on beta you explained on R-sig-phylo. Not a problem here. So thanks very much, this is so much simpler.

BTW, I don't have a good explanation as to why you would get this error even with old versions of phylosig/phytools, but it is more difficult for me to error check if you are not using the current version. (As well as less useful for other readers.)

Hi Liam, I need to test the phylogenetic signal in morphological traits of several rodent species. The traits are: incisor procumbency, basilar length, mandibular width, out-lever arm and in-lever arms of the jaw adductor, and mechanical advantages (in-lever arm divided by out-lever arm). I've tested phylogenetic signal using phylosig, one variable at a time, and I used raw data. But I'm not really sure if this is correct. Can you give me some advice?

About this blog

This web-log chronicles the development of new tools for phylogenetic analyses in the phytools R package. Unless you are reading a very recent page of the blog, I recommend that you install the latest CRAN version of phytools (or the latest beta release) before attempting to replicate any of the analyses on this site. That is because the linked functions may be archived, and very likely have been replaced by newer versions.