Rants, raves (and occasionally considered opinions) on phyloinformatics, taxonomy, and biodiversity informatics. For more ranty and less considered opinions, see my Twitter feed. ISSN 2051-8188

Wednesday, November 29, 2006

As part of the TreeBASE name mapping exercise, I've come across some interesting names, such as "Diplura". This is a homonym, meaning that more than one taxon has this name. This can complicate life somewhat.

In TreeBASE, the taxon Diplura is a spider genus (TreeBASE taxon T4182), part of the study by Fredrick Coyle (hdl:2246/1665).

NCBI has "Diplura" (Taxonomy ID 29997), but this is the insect class (or order, depending on which classification you use). NCBI mistakenly links "Diplura" in NCBI to "Diplura" in TreeBASE, but links correctly to the insect record in ITIS (Taxonomic Serial No:99228).

To make matters worse, there is also an algal genus Diplura, which ITIS also has (Taxonomic Serial No:10873).

The problem comes when we look up this name in uBio. The name Diplura is listed as appearing in several classifications, including NCBI, ITIS, etc., as well as its occurrence as a butterfly name (Diplura Ranbur, 1866). However, in the metadata for this name there is the tag <ubio:taxonomicGroup>Phaeophyceae</ubio:taxonomicGroup> (the Phaeophyceae are algae). Clearly, a name that is used by a spider, an insect, and an alga (never mind a butterfly) can't be assigned to a single taxonomic group. Perhaps one solution would be to have multiple instances of the <ubio:taxonomicGroup> tag, one for each major taxonomic group the name came from.
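To make the suggestion concrete, here is a minimal Python sketch of what "one tag per group" might look like. The identifiers are the ones quoted above, but the group labels ("spider", "insect", "alga") and the function names are my own illustrative choices, not uBio vocabulary or a uBio API.

```python
from collections import defaultdict

# (name string, source, identifier, broad group), taken from the examples above.
# The group labels are illustrative assumptions, not terms uBio actually uses.
USAGES = [
    ("Diplura", "TreeBASE", "T4182", "spider"),
    ("Diplura", "NCBI", "29997", "insect"),
    ("Diplura", "ITIS", "99228", "insect"),
    ("Diplura", "ITIS", "10873", "alga"),
]


def groups_by_name(usages):
    """Map each name string to the sorted list of groups it is used in."""
    groups = defaultdict(set)
    for name, _source, _identifier, group in usages:
        groups[name].add(group)
    return {name: sorted(found) for name, found in groups.items()}


def group_tags(name, usages):
    """Render one <ubio:taxonomicGroup> element per group for a name string."""
    return [
        f"<ubio:taxonomicGroup>{group}</ubio:taxonomicGroup>"
        for group in groups_by_name(usages).get(name, [])
    ]


def is_homonym_across_groups(name, usages):
    """True when the same name string turns up in more than one group."""
    return len(groups_by_name(usages).get(name, [])) > 1
```

Calling `group_tags("Diplura", USAGES)` yields three tags rather than one, so a consumer of the metadata can at least see that the name string is ambiguous.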

My motivation in all this is to start thinking about taxonomic names as simple "tags", with a view to using some of the vocabularies for taxonomies and "folksonomies" being developed elsewhere, such as SKOS Core. Under this approach, I'd need GUIDs for name strings, independent of their usage. uBio pretty much does this, except for the <ubio:taxonomicGroup> tag.
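As a rough sketch of "GUIDs for name strings, independent of their usage", the Python below derives an identifier from the name string alone and lets each usage point at it. Everything here is an assumption for illustration: the `example.org` namespace is made up, and using `uuid5` is just one convenient way to mint a stable identifier, not how uBio does it.

```python
import uuid
from dataclasses import dataclass

# Hypothetical namespace for minting identifiers; not a real uBio or SKOS URI.
NAME_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "http://example.org/names/")


def name_guid(name_string):
    """Derive a stable identifier from the name string alone."""
    normalised = " ".join(name_string.split())  # collapse stray whitespace
    return uuid.uuid5(NAME_NAMESPACE, normalised)


@dataclass(frozen=True)
class Usage:
    """One occurrence of a name in a database; the tag is shared across usages."""
    tag: uuid.UUID
    source: str
    identifier: str


# The spider, the insect, and the alga all carry the same tag.
usages = [
    Usage(name_guid("Diplura"), "TreeBASE", "T4182"),
    Usage(name_guid("Diplura"), "NCBI", "29997"),
    Usage(name_guid("Diplura"), "ITIS", "10873"),
]
```

The point of the design is that the tag says nothing about which taxon is meant; that information lives only in the individual usages.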

Friday, November 24, 2006

As noted on the Society of Systematic Biology (SSB) web site, the journal Phyloinformatics has disappeared. It only published eight papers, but this still represents lost effort, and some of the papers are highly relevant to issues I'm interested in. Luckily, with the help of the Internet Archive's "wayback machine", and a PDF sent by Paul Sereno, I've put all the PDFs on the SSB web site. You can get them here.

Friday, November 17, 2006

Molecular Phylogenetics and Evolution carries two articles debating the application of names to trees, which reflects tensions between two codes of nomenclature (ICZN and Phylocode). Alain Dubois (doi:10.1016/j.ympev.2006.06.007) and David Hillis (doi:10.1016/j.ympev.2006.08.001) present rather different views. The paper that brought things to a head is Hillis and Wilcox (2005) "Phylogeny of the New World true frogs (Rana)" (doi:10.1016/j.ympev.2004.10.007, TreeBASE S1186). I've not had time to digest this (it's Friday evening, after all), but I think it's interesting to see to what extent the systems can coexist (which is what Hillis seems to argue, if only as a transitional stage), or whether they are simply incompatible.

Tuesday, November 14, 2006

Just wanted to write this example down before I lose it. I was browsing Bill Piel's trees in Google Earth, and looking at Lee et al.'s paper (doi:10.1111/j.1365-294X.2005.02707.x) on Physalaemus pustulosus. Searching in iSpecies.org led to records in GenBank, whereupon I stumbled on the fact that in GenBank it is Engystomops pustulosus (Taxonomy ID: 76066). Following up the reference on the NCBI taxonomy page, I found a PDF of the paper freely available (although with only a URL for an identifier). Browsing the GenBank records (e.g., DQ337249), I found Ron et al. (doi:10.1016/j.ympev.2005.11.022). Among other things, this paper refines the genus Engystomops:

Then, paddling off to HerpNET, I query for "Engystomops pustulosus" and get one record, whereas for "Physalaemus pustulosus" I get lots of records (although the geographic range doesn't include all the localities in doi:10.1111/j.1365-294X.2005.02707.x).

My point? Well, I don't really have one, except that again we are clicking around different web sites to get a complete picture of what is going on, important data are attached to different names for the same animal, and the nature of those names themselves may vary (for example, Nascimento et al. define Engystomops as a set with a type species (E. petersi), whereas Ron et al. (doi:10.1016/j.ympev.2005.11.022) define Engystomops as a least common ancestor of two taxa on a tree). All of this makes integration a challenge, to say the least.
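For what it's worth, the clicking around amounts to something like the Python sketch below: expand a name into every name string used for the same animal, then gather whatever is filed under each. The synonym table, the choice of Engystomops pustulosus as the "preferred" name (simply following GenBank), and the record labels are all placeholders invented for the example; none of this is real HerpNET or GenBank data or any service's API.

```python
# Names treated above as referring to the same frog. Picking the GenBank
# name as "preferred" is an assumption made only for this sketch.
SYNONYMS = {
    "Physalaemus pustulosus": "Engystomops pustulosus",
    "Engystomops pustulosus": "Engystomops pustulosus",
}

# Placeholder records, invented for the example.
RECORDS = {
    "Engystomops pustulosus": ["record-a"],
    "Physalaemus pustulosus": ["record-b", "record-c"],
}


def names_for(name, synonyms):
    """Return every name string that shares a preferred name with `name`."""
    preferred = synonyms.get(name, name)
    same = {n for n, p in synonyms.items() if p == preferred}
    return sorted(same | {name})


def merged_records(name, synonyms, records_by_name):
    """Collect records filed under any synonym, remembering the name used."""
    merged = []
    for candidate in names_for(name, synonyms):
        for record in records_by_name.get(candidate, []):
            merged.append((candidate, record))
    return merged
```

Keeping the name each record was filed under matters, because (as the Engystomops example shows) two sources using the same string may not mean quite the same thing by it.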

There are some messages here for an open access movement that places belief in the ability of digital solutions to realise access to information. The experience of systematics suggests that too great a focus on the movement, and too much emphasis on the ability of particular technologies to realise a desired effect can be counter productive. A belief in the inevitability of digital solutions can sideline consideration of potential users and transform it into a simple belief that they will come. From this perspective open access looks like a low cost technical fix to issues of inequality, and of course nothing is that simple. However, we can expect that within the “open access movement” a wide diversity of initiatives may proliferate, and these will make sense to those most directly involved in a variety of ways which cross and blur the distinction between providers and users of information. There will be a need to remain open to non-digital solutions, and to respect the capacity of practitioners to craft their own appropriate technologies, even whilst we celebrate the ability of grand visions of open access to inspire, stimulate and offer a way of making sense of diverse experience.