The Phylogeny of a Dataset

Andrea K. Thomer, Nicholas M. Weber
Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign

Tuesday, Nov. 4, 2:00pm

Summary

In this paper, we use phylogenetic methods to study the evolution of a long-standing, widely used dataset from the earth sciences. We argue that a quantitative evolutionary approach to studying how digital objects change over time can provide insight that other historical methods – such as citation analysis or provenance modeling – cannot.

Our analysis shows that clustering algorithms developed specifically for phylogenetic studies in evolutionary biology can be successfully adapted to the study of digital objects and their known offspring. However, we note a number of limitations of the approach taken here, as well as potential ways to refine these methods in future work. We conclude with a discussion of how phylogeny can be an informative tool for information science.
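
As a rough illustration of what adapting a phylogenetic clustering algorithm to digital objects might look like, the sketch below applies UPGMA (average-linkage clustering, one distance-based method used in phylogenetics) to a few dataset versions encoded as presence/absence characters. The summary does not say which algorithm or character coding was used in the study, so the choice of UPGMA, the Hamming distance, the version names, and the character vectors are all assumptions made purely for illustration.

```python
from itertools import combinations

# Hypothetical data: four invented dataset versions, each coded as
# presence (1) or absence (0) of six features (e.g. variables or fields).
VERSIONS = {
    "v1":  (1, 1, 0, 0, 0, 0),
    "v2":  (1, 1, 1, 0, 0, 0),
    "v3a": (1, 1, 1, 1, 1, 0),
    "v3b": (1, 0, 1, 0, 1, 1),
}


def hamming(a, b):
    """Count the characters on which two versions differ."""
    return sum(x != y for x, y in zip(a, b))


def leaves(tree):
    """Flatten a nested-tuple tree into a list of leaf names."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def upgma(characters):
    """Average-linkage (UPGMA) clustering; returns a nested tuple of names."""
    clusters = list(characters)  # every version starts as its own cluster

    def dist(c1, c2):
        # Mean pairwise Hamming distance between the leaves of two clusters.
        pairs = [(a, b) for a in leaves(c1) for b in leaves(c2)]
        total = sum(hamming(characters[a], characters[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > 1:
        # Join the two closest clusters into a new internal node.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: dist(*p))
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append((c1, c2))
    return clusters[0]


tree = upgma(VERSIONS)
print(tree)
```

In this toy input, the two versions that differ on a single character are joined first, and the most divergent version attaches last. A real analysis would need a defensible character coding for the dataset and, as the summary notes, attention to the limitations of applying such methods to digital objects.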