Tuesday, July 8, 2014

Touching the Data, report 1

We have now completed two days of the workshop. We have had a relaxed approach to progress, and are thus currently running behind the nominal schedule. Nevertheless, we are progressing splendidly.

We had three talks on the first day and one today. I tried to kick things off by asking a series of what I consider to be unanswered questions from observing practitioners and computationalists in action, although apparently several members of the audience already had their own answers to some of these. The bottom line is that phylogenetic analysis focuses on data patterns while interpretation focuses on processes / mechanisms, and this constitutes a large part of the apparent separation of practitioners and computationalists.

Steven Kelk and Luay Nakhleh introduced the diversity of computational approaches that we already have. These presentations neatly complemented each other, providing a valuable summary of the field as well as an overview of current limitations and future prospects. This topic was taken up later by various members of the audience, as one of the inherent problems for practitioners is how to navigate through the methods to choose a suitable one -- there are methods based on parsimony, likelihood and Bayesian analysis, and methods that tackle de novo network construction, gene tree / species tree reconciliation, gene tree scoring, and network presentation.

This topic was followed up today by presentations introducing some of the currently available software. Some of these programs have progressed significantly in recent years, notably PhyloNet and Dendroscope, and there are some relatively new ones, as well as even newer ones in the pipeline. Judging by the literature, these programs are being used far less often than their usefulness warrants.

This morning Scot Kelchner introduced us to the application of Zen Buddhism to science in general and phylogenetics in particular. This went down much better than he seemed to be expecting -- there were apparently a lot of "Zen" people in the room. The basic idea is not to get trapped by preconceived expectations, especially arbitrary categorical notions, when interpreting the output of a phylogenetic analysis. You can consult The Nine-Headed Dragon River, by Peter Matthiessen, if you would like further information.

Finally, we got to the topic implied by the workshop's title: Touching the Data. We had a brief run-through of the pre-existing datasets stored with this blog (see the upper right-hand corner), which cover some of the diversity of what practitioners have provided to date in the way of usable datasets with "known" phylogenetic patterns.

By far the most interesting, however, was the presentation of some recent datasets made available by members of the workshop, notably Axel Janke (bear species), Scot Kelchner (bamboo species) and Mattis List (Indo-European languages); Jim Whitfield will present his datasets tomorrow morning. These datasets generated much interest, as they provide a diversity of different possible applications for phylogenetic networks. The idea from here on in the workshop is to address what can currently be done with these datasets and what we might like to do with them if the tools were available. This will help focus the participants on specific practical issues, which should lead to the progress that we hope to achieve.

It has rained most of the day, which is actually unusual -- intermittent rain is more common in this climate. We are currently waiting for the football to start: Germany versus Brazil. Tomorrow will be the Netherlands versus Argentina. It is risky being in this country this week! The current local betting is for an all-European final, an assessment that involves no cultural bias whatsoever.