Datasets

This is a compilation of links to empirical datasets that might prove useful for validating mathematical algorithms associated with those phylogenetic networks intended to represent evolutionary history. In each case an aligned datafile is provided, along with annotation notes.

For each dataset, the following information is provided:

Name: a unique name for the dataset

Source: the publication used as the source for the data

Zip file: contains the nexus file, the annotation notes, and a PDF copy of the source paper

Nexus file: a text version of the nexus-formatted file for quick viewing

Notes: a brief explanation of what the data are about, and what phylogenetic history they represent
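
Since each entry includes a nexus-formatted text file, one quick check before running a network algorithm is to read the number of taxa and characters the file declares. The sketch below assumes the file carries a standard DIMENSIONS command with NTAX and/or NCHAR. The function name `nexus_dimensions` and the sample text are illustrative and are not taken from any of the datasets listed here. It does not skip bracketed nexus comments, so treat it as a first look rather than a parser.

```python
import re

def nexus_dimensions(text):
    """Return (ntax, nchar) as declared in DIMENSIONS commands; None if absent."""
    dims = {"ntax": None, "nchar": None}
    # A file may declare NTAX and NCHAR in separate blocks, so scan every command.
    for command in re.finditer(r"\bdimensions\b([^;]*);", text, flags=re.IGNORECASE):
        pairs = re.findall(r"\b(ntax|nchar)\s*=\s*(\d+)", command.group(1),
                           flags=re.IGNORECASE)
        for key, value in pairs:
            if dims[key.lower()] is None:  # keep the first declaration found
                dims[key.lower()] = int(value)
    return dims["ntax"], dims["nchar"]

# Hypothetical file content, made up for this example.
sample = """#NEXUS
BEGIN DATA;
  DIMENSIONS NTAX=3 NCHAR=6;
  FORMAT DATATYPE=DNA MISSING=? GAP=-;
  MATRIX
    taxon_a ACGTAC
    taxon_b ACATGC
    taxon_c ACRTRC
  ;
END;
"""

print(nexus_dimensions(sample))  # (3, 6)
```

Comparing the returned values with the counts given in an entry's notes (for example, 29 extant organisms) is a cheap way to confirm that the intended file was opened.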

Datasets where the history is known from simulation

(1)
Name: Camin
Source: Sokal RR (1983) A phylogenetic analysis of the Caminalcules. I. The data base. Systematic Zoology 32: 159-184.
Zip file: Camin.zip
Nexus file: 2 separate files (see the Zip file)
Notes: morphological features of artificial organisms; there are two data files, one containing only the 29 extant organisms (and for which the tree is provided) and one with both the 29 extant organisms and the 48 fossil organisms
Caveat emptor: These are simulated data, and do not therefore necessarily match real data in all ways

Part 2

Datasets where the history is reticulated

Datasets where the evidence of reticulation is independent of the dataset.

Datasets where the history is known from experimentation

(i) Hybridization and Introgression

(1)
Name: Feliner
Source: Fuertes Aguilar J, Rosselló JA, Nieto Feliner G (1999) Nuclear ribosomal DNA (nrDNA) concerted evolution in natural and artificial hybrids of Armeria (Plumbaginaceae). Molecular Ecology 8: 1341-1346.
Zip file: Feliner.zip
Nexus file: Feliner.nex
Notes: one gene sequence from Armeria plants; there are three artificial hybrids, which differ only by having additive polymorphic nucleotides in some of the six positions at which the parents differ

(2)
Name: McDade
Source: McDade LA (1997) Hybrids and phylogenetic systematics. III. Comparison with distance methods. Systematic Botany 22: 669-683.
Zip file: McDade.zip
Nexus file: McDade.nex
Notes: morphology from Aphelandra plants; there are 17 artificial hybrids, originally intended to be analyzed with each F1 hybrid added individually to the set of F0 species

(3)
Name: Atchley
Source: Atchley WR, Fitch WM (1991) Gene trees and the origins of inbred strains of mice. Science 254: 554-558.
Zip file: Atchley.zip
Nexus file: Atchley.nex
Notes: percentage allelic differences for 144 gene loci from laboratory mice; SEA, CBA and C3H are hybrids, but only the first one appears to be detectable in the dataset

(8)
Name: Moody
Source: Moody ML, Rieseberg LH (2012) Sorting through the chaff, nDNA gene trees for phylogenetic inference and hybrid identification of annual sunflowers (Helianthus sect. Helianthus). Molecular Phylogenetics and Evolution 64: 145–155.
Zip file: Moody.zip
Nexus files: 11 separate files (see the Zip file)
Notes: eleven partial gene sequences from Helianthus plants, with multiple accessions for many of the species, and multiple alleles for many of the accessions; Helianthus anomalus, Helianthus deserticola and Helianthus paradoxus are hybrids; some recombinants have also been detected
Caveat emptor: There are discrepancies between Table 1 and Figure 1 in the paper, and between both of these and the dataset; these are detailed in the Excel spreadsheet in the Zip file
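
The Feliner notes above describe artificial hybrids that show additive polymorphic nucleotides at positions where the parents differ. The following is a minimal sketch of that idea, assuming two aligned parental sequences of equal length and the standard IUPAC ambiguity codes for two-base polymorphisms. The sequences and the function name `additive_hybrid` are invented for illustration and are not drawn from Feliner.nex.

```python
# IUPAC ambiguity codes for two-base polymorphisms, keyed by the unordered pair.
IUPAC_PAIRS = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}

def additive_hybrid(parent_a, parent_b):
    """Return the sequence expected if a hybrid is additive at every differing site."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parental sequences must be aligned to the same length")
    sites = []
    for base_a, base_b in zip(parent_a.upper(), parent_b.upper()):
        if base_a == base_b:
            sites.append(base_a)  # parents agree: the hybrid shows the same state
            continue
        code = IUPAC_PAIRS.get(frozenset((base_a, base_b)))
        if code is None:
            # Assumption: only unambiguous A, C, G, T states are handled here.
            raise ValueError(f"unsupported states: {base_a!r}, {base_b!r}")
        sites.append(code)  # parents differ: the hybrid shows both bases
    return "".join(sites)

# Made-up parental sequences that differ at two positions.
print(additive_hybrid("ACGTAC", "ACATGC"))  # ACRTRC
```

Comparing such an expected sequence with an observed hybrid would show at which differing positions the hybrid is additive and at which it carries only one parental state, which is the pattern the notes describe for only some of the six positions.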