Convenience functions

Get the parent of a clade

The Tree data structures in Bio.Phylo don't store parent references for each clade. Instead, the get_path method can be used to trace the path of parent-child links from the tree root to the clade of choice:
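A minimal sketch of such a helper, assuming `tree` is a Bio.Phylo tree and `child_clade` is one of its clades. It relies on `get_path` returning the clades below the root, ending with the target, so a direct child of the root is handled as a special case. The handling of the root and of missing clades (returning `None`) is a choice made for this illustration.

```python
def get_parent(tree, child_clade):
    """Return the parent of child_clade, or None if it is the root or absent."""
    # get_path lists the clades between the root and the target,
    # excluding the root itself and ending with the target.
    node_path = tree.get_path(child_clade)
    if not node_path:
        # Empty list: the target is the root. None: the target was not found.
        return None
    if len(node_path) == 1:
        # The target hangs directly off the root clade.
        return tree.root
    return node_path[-2]
```

Calling `get_parent(tree, some_clade)` walks the tree once per call, which is why the cost matters inside loops.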

Note that get_path runs in time linear in the size of the tree, so for best performance avoid calling get_parent or get_path inside a time-critical loop. If possible, call get_path once outside the loop, and look up parents in the list it returns.

Alternatively, if you need to look up the parents of arbitrary tree elements repeatedly, create a dictionary mapping all nodes to their parents:
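One way to build that dictionary is sketched here, assuming `tree` is a Bio.Phylo tree. The function name `all_parents` is chosen for this illustration. It visits every clade once and records each child's parent, so later lookups cost a single dictionary access.

```python
def all_parents(tree):
    """Map every non-root clade in the tree to its parent clade."""
    parents = {}
    # Iterating over a clade yields its direct children.
    for clade in tree.find_clades(order="level"):
        for child in clade:
            parents[child] = clade
    return parents
```

After `parents = all_parents(tree)`, the parent of a clade is `parents[some_clade]`. The root has no parent and therefore does not appear as a key.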

A potential issue: The above implementation of lookup_by_names doesn't include unnamed clades, which are generally internal nodes. We can fix this by giving each clade a unique identifier. Here, all clade names are prefixed with a unique number (which can be useful for searching, too):
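A sketch of that numbering scheme, assuming `tree` is a Bio.Phylo tree. The function name `tabulate_names` and the `index_name` format are choices made for this illustration. Note that it renames the clades in place.

```python
def tabulate_names(tree):
    """Give every clade a unique name and return a dict of name -> clade."""
    names = {}
    for idx, clade in enumerate(tree.find_clades()):
        if clade.name:
            # Named clade: prefix the existing name with its index.
            clade.name = "%d_%s" % (idx, clade.name)
        else:
            # Unnamed clade (typically internal): the index becomes the name.
            clade.name = str(idx)
        names[clade.name] = clade
    return names
```

Because every clade now has a distinct name, internal nodes can be retrieved from the returned dictionary just like terminal ones.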

Test for "semi-preterminal" clades

Suggested by Joel Berendzen

The existing tree method is_preterminal returns True if all of the direct descendants are terminal. This snippet instead returns True if any direct descendant is terminal, but still False if the given clade itself is terminal.

def is_semipreterminal(clade):
    """True if any direct descendant is terminal."""
    for child in clade:
        if child.is_terminal():
            return True
    return False

In Python 2.5 and later, this is simplified with the built-in any function:
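A sketch of the shorter form, assuming `clade` is a Bio.Phylo clade, so that iterating over it yields its direct children and each child has an `is_terminal` method:

```python
def is_semipreterminal(clade):
    """True if any direct descendant is terminal."""
    # A terminal clade has no children, so any() over it returns False.
    return any(child.is_terminal() for child in clade)
```

The generator expression stops at the first terminal child, so the behaviour matches the explicit loop.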

Root at the midpoint between the two most distant nodes (or "center" of all tips)

Graphics

TODO:

Party tricks with draw_graphviz, covering each keyword argument

Exporting to other types

Convert to an 'ape' tree, via Rpy2

The R statistical programming environment provides support for phylogenetics through the '[ape|http://ape.mpl.ird.fr/]' package and several others that build on top of 'ape'. The Python package [rpy2|http://rpy.sourceforge.net/rpy2.html] provides an interface between R and Python, so it's possible to convert a Bio.Phylo tree into an 'ape' tree object:
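One possible conversion is sketched here, assuming Biopython, rpy2, R, and the R package 'ape' are all installed. The function name `to_ape` is chosen for this illustration. It serializes the tree to a Newick string and hands that string to ape's `read.tree` through its `text` argument. Node annotations that Newick cannot represent are lost along the way.

```python
from io import StringIO


def to_ape(tree):
    """Convert a Bio.Phylo tree to an R object of class 'phylo' (sketch)."""
    # Imported lazily so this module still loads where Biopython, rpy2,
    # or R itself is not installed.
    from Bio import Phylo
    from rpy2.robjects.packages import importr

    # Serialize the tree to a Newick string in memory.
    buffer = StringIO()
    Phylo.write(tree, buffer, "newick")

    # importr exposes R's read.tree as read_tree; its `text` argument
    # parses a Newick string instead of reading from a file.
    ape = importr("ape")
    return ape.read_tree(text=buffer.getvalue())
```

Passing the Newick text as an argument, rather than pasting it into a string of R code, avoids having to escape quotes in clade names.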

Convert to a NumPy array or matrix

import numpy


def to_adjacency_matrix(tree):
    """Create an adjacency matrix (NumPy array) from clades/branches in tree.

    Also returns a list of all clades in tree ("allclades"), where the position
    of each clade in the list corresponds to a row and column of the numpy
    array: a cell (i,j) in the array is 1 if there is a branch from allclades[i]
    to allclades[j], otherwise 0.

    Returns a tuple of (allclades, adjacency_matrix) where allclades is a list
    of clades and adjacency_matrix is a NumPy 2D array.
    """
    allclades = list(tree.find_clades(order='level'))
    # Map each clade to its row/column index
    lookup = {}
    for i, elem in enumerate(allclades):
        lookup[elem] = i
    adjmat = numpy.zeros((len(allclades), len(allclades)))
    for parent in tree.find_clades(terminal=False, order='level'):
        for child in parent.clades:
            adjmat[lookup[parent], lookup[child]] = 1
    if not tree.rooted:
        # Branches can go from "child" to "parent" in unrooted trees
        adjmat = adjmat + adjmat.transpose()
    return (allclades, numpy.matrix(adjmat))