Abstract

Divergence time estimation, the calibration of a phylogeny to geological time, is an integral first step in modelling the tempo of biological evolution (traits and lineages). However, despite increasingly sophisticated methods to infer divergence times from molecular genetic sequences, the estimated ages of many nodes across the tree of life contrast significantly and consistently with timeframes conveyed by the fossil record. This is perhaps best exemplified by crown angiosperms, where molecular clock (Triassic) estimates predate the oldest (Early Cretaceous) undisputed angiosperm fossils by tens of millions of years or more. While the incompleteness of the fossil record is a common concern, issues of data limitation and model inadequacy are viable (if underexplored) alternative explanations. In this vein, Beaulieu et al. (2015) convincingly demonstrated how methods of divergence time inference can be misled by both (i) extreme state-dependent molecular substitution rate heterogeneity and (ii) biased sampling of representative major lineages. While these (essentially model-violation) results are robust (and probably common in empirical data sets), we note a further alternative: that the configuration of the statistical inference problem alone precluded the reconstruction of the paleontological timeframe for the crown age of angiosperms. We demonstrate, through sampling from the joint prior (formed by combining the tree (diversification) prior with the various calibration densities specified for fossil-calibrated nodes), that with no data present at all, an Early Cretaceous age for crown angiosperms is rejected (i.e., has essentially zero probability). More worrisome, however, is that for the 24 nodes calibrated by fossils, almost all have indistinguishable marginal prior and posterior age distributions, indicating an absence of relevant information in the data.
Given that these calibrated nodes are strategically placed in disparate regions of the tree, they essentially anchor the tree scaffold, and so the posterior inference for the tree as a whole is largely determined by the pseudo-data present in the (often arbitrary) calibration densities. We recommend, as for any Bayesian analysis, that marginal prior and posterior distributions be carefully compared, especially for parameters of direct interest. Finally, we note that the results presented here do not refute the biological modelling concerns identified by Beaulieu et al. (2015). Both sets of issues remain apposite to the goals of accurate divergence time estimation, and only by considering them in tandem can we move forward more confidently. [Angiosperms; divergence time estimation; fossil record; marginal priors; information content; diptych; wild speculation.]
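
The recommended comparison of marginal prior and posterior distributions can be illustrated with a minimal sketch. It is not the procedure used in this study: the overlap coefficient, the function name `overlap_coefficient`, the bin count, and the synthetic node ages below are all assumptions chosen purely for illustration. In practice the two sample sets would be the ages of a single node drawn from a prior-only (no data) run and from the full posterior run.

```python
import random


def overlap_coefficient(prior_samples, posterior_samples, n_bins=50):
    """Histogram-based overlap of two sample sets, in [0, 1].

    Values near 1 mean the two marginal distributions are nearly
    indistinguishable; values near 0 mean they barely overlap.
    (Hypothetical helper, not part of any dating software.)
    """
    lo = min(min(prior_samples), min(posterior_samples))
    hi = max(max(prior_samples), max(posterior_samples))
    if hi == lo:
        return 1.0  # all samples identical
    width = (hi - lo) / n_bins

    def bin_fractions(samples):
        counts = [0] * n_bins
        for x in samples:
            # Clamp the maximum value into the last bin.
            idx = min(int((x - lo) / width), n_bins - 1)
            counts[idx] += 1
        return [c / len(samples) for c in counts]

    p = bin_fractions(prior_samples)
    q = bin_fractions(posterior_samples)
    return sum(min(a, b) for a, b in zip(p, q))


# Synthetic node ages (arbitrary units), NOT taken from the study.
rng = random.Random(1)
prior = [rng.gauss(140.0, 5.0) for _ in range(2000)]
posterior_uninformed = [rng.gauss(140.0, 5.0) for _ in range(2000)]
posterior_informed = [rng.gauss(180.0, 5.0) for _ in range(2000)]

# High overlap: the data added little beyond the prior for this node.
print(overlap_coefficient(prior, posterior_uninformed))
# Low overlap: the data moved the estimate away from the prior.
print(overlap_coefficient(prior, posterior_informed))
```

Under these assumptions, a calibrated node whose overlap is close to 1 is one where the posterior age is essentially the calibration density restated. Any threshold for "indistinguishable" is a judgment call, and plotting the two distributions together is a natural complement.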

Copyright

The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.