Abstract

Motivation: In recent years there has been increased interest in
producing large and accurate phylogenetic trees using statistical
approaches. However, for a large number of taxa, it is not feasible
to construct large and accurate trees using only a single processor.
A number of specialized parallel programs have been produced in an
attempt to address the huge computational requirements of maximum
likelihood. We express a number of concerns about the current set
of parallel phylogenetic programs, which are severely limiting
the widespread availability and use of parallel computing in maximum
likelihood-based phylogenetic analysis.
Results: We have identified the suitability of phylogenetic analysis
to large-scale heterogeneous distributed computing. We have completed
a distributed and fully cross-platform phylogenetic tree building
program called Distributed Phylogeny Reconstruction by Maximum Likelihood.
It uses an already proven maximum likelihood-based tree
building algorithm and a popular phylogenetic analysis library for all
its likelihood calculations. It offers one of the most extensive sets
of DNA substitution models currently available. We are the first, to
our knowledge, to report the completion of a distributed phylogenetic
tree building program that can achieve near-linear speedup while only
using the idle clock cycles of machines. For those in an academic
or corporate environment with hundreds of idle desktop machines,
we have shown how distributed computing can deliver a "free" ML
supercomputer.
Availability: The software (and user manual) is publicly available
under the terms of the GNU General Public License from the system
webpage at http://www.cs.may.ie/distributed
Contact: tom.naughton@may.ie