Abstract

The fossil record is our only direct means for evaluating shifts in biodiversity through Earth's history. However, analyses of fossil marine invertebrates have demonstrated that geological megabiases profoundly influence fossil preservation and discovery, obscuring true diversity signals. Comparable studies of vertebrate palaeodiversity patterns remain in their infancy. A new species-level dataset of Mesozoic marine tetrapod occurrences was compared with a proxy for temporal variation in the volume and facies diversity of fossiliferous rock (number of marine fossiliferous formations: FMF). A strong correlation between taxic diversity and FMF is present during the Cretaceous. Weak or no correlation of Jurassic data suggests a qualitatively different sampling regime resulting from five apparent peaks in Triassic–Jurassic diversity. These correspond to a small number of European formations that have been the subject of intensive collecting, and represent ‘Lagerstätten effects’. Consideration of sampling biases allows re-evaluation of proposed mass extinction events. Marine tetrapod diversity declined during the Carnian or Norian. However, the proposed end-Triassic extinction event cannot be recognized with confidence. Some evidence supports an extinction event near the Jurassic/Cretaceous boundary, but the proposed end-Cenomanian extinction is probably an artefact of poor sampling. Marine tetrapod diversity underwent a long-term decline prior to the Cretaceous–Palaeogene extinction.

1. Introduction

Fossils are our only direct record of the history of animal diversification, providing a window onto processes that have shaped the history of life on Earth, such as mass extinctions, clade-replacement events and the tempo of diversification through time. Large-scale studies of marine invertebrate diversity through the Phanerozoic designed to address these points quantitatively were first carried out during the 1970s and 1980s (e.g. Raup 1972, 1976; Sepkoski et al. 1981). Initially, little effort was made to correct for biases introduced to palaeodiversity signals by uneven sampling (but see Raup 1972, 1976) owing to differences in the area of rock preserved at outcrop, in preservation potential of ancient organisms through geological time, and in the collection history of different regions of the globe or geological formations. Recent years have seen increasing investigation into the effects of these factors, and uneven sampling is now recognized as a major confounding influence on our view of palaeodiversity (e.g. Peters & Foote 2001; Smith 2001, 2007; Crampton et al. 2003; Peters 2005; Smith & McGowan 2007; McGowan & Smith 2008; Alroy et al. 2008).

Here, we present the first thorough investigation of the palaeodiversity patterns and geological sampling biases in Mesozoic marine tetrapods, often considered as an adaptive assemblage (e.g. Massare 1987). These taxa represent six taxonomically and morphologically diverse clades that formed an important component of Mesozoic marine ecosystems for ca 185 Ma (figure 1). Several major extinction events affecting these clades have been proposed, and examination of these provides insight into the timing and severity of mass extinctions more generally. Thus, tetrapods allow hypotheses of marine extinction and diversification to be assessed independently from invertebrate-dominated datasets, and provide comparative data for studies of the effects of sampling biases on other vertebrate groups.

Figure 1. Stratigraphic distributions of marine tetrapod occurrences. Dashed lines indicate the inferred presence of clades in geological stages for which they have not been sampled. Squamate silhouette adapted from Lindgren et al. (2007), others drawn by A.S.S.

2. Material and methods

(a) Data collection

We collected data on marine tetrapod species occurrences in 29 stage-level Mesozoic time bins. This resulted in 572 species occurrences by stage, representing 447 nominal species, one of the largest vertebrate palaeodiversity datasets yet compiled (S1 in the electronic supplementary material). Recent reviews of basal sauropterygians (Rieppel 2000), ichthyosaurs (McGowan & Motani 2003), marine chelonians (Hirayama 1997) and metriorhynchoids (Young et al. in press) were complemented by data collection from more recent publications, and from the primary literature on thalattosaurs, teleosauroids, squamates and plesiosaurians. Because most species are limited to a single stage, we counted taxa sampled in time bins rather than observing patterns of origination/extinction. These taxic diversity estimates (TDE) do not include phylogenetic ghost lineages.
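The in-bin counting described above can be illustrated with a minimal sketch. The species names, the short list of stages and the function name `taxic_diversity` are hypothetical and are not drawn from the dataset in S1; the sketch only shows how duplicate records collapse to one occurrence per species per stage, and why the number of occurrences by stage can exceed the number of nominal species.

```python
from collections import defaultdict

def taxic_diversity(occurrences, bins):
    """Count distinct species sampled in each time bin (no range-through)."""
    species_by_bin = defaultdict(set)
    for species, stage in occurrences:
        species_by_bin[stage].add(species)
    return {b: len(species_by_bin[b]) for b in bins}

# Hypothetical toy records: (species, stage) pairs, not taken from the paper.
occurrences = [
    ("Species A", "Toarcian"),
    ("Species B", "Toarcian"),
    ("Species A", "Toarcian"),   # duplicate record, counted once
    ("Species A", "Aalenian"),   # same species sampled in a second stage
    ("Species C", "Callovian"),
]
bins = ["Toarcian", "Aalenian", "Bajocian", "Callovian"]

tde = taxic_diversity(occurrences, bins)
occurrences_by_stage = sum(tde.values())              # 4 in this toy example
nominal_species = len({sp for sp, _ in occurrences})  # 3 in this toy example
```

A stage with no sampled species (here the Bajocian) receives a count of zero rather than an interpolated value, which is the property that distinguishes a taxic estimate from a range-through or phylogenetic one.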

We used an estimate of the total number of fossiliferous marine formations (FMFs) within each time bin (S1 in the electronic supplementary material) as a proxy for temporal variation in research effort, facies diversity and the volume of fossiliferous rock available for palaeontologists to sample (e.g. Raup 1976; Peters & Foote 2001). These were counted from all records of Mesozoic marine fossils, downloaded from The Paleobiology Database (PBDB) (181 829 records; 12 May 2009). An alternative approach uses sub-sampling of collections-level occurrence data to standardize sampling between time bins (e.g. Alroy et al. 2008). Unfortunately, assembling these data for marine tetrapods would require an immense international databasing project that is beyond the scope of the present study.

(b) Correlation between rock record estimates and taxon occurrences

We used Pearson's product–moment correlation coefficient (ρ), Spearman's rank correlation coefficient (rs) and Kendall's rank correlation coefficient (τ) to test for correlation between estimates of rock availability (FMF) and taxic diversity. Significance thresholds were adjusted for multiple comparisons within families following the false discovery rate procedure of Benjamini & Hochberg (1995). Tests of correlation within Triassic, Jurassic and Cretaceous data subsets were treated as a family of three comparisons; the Triassic–Jurassic and the Cretaceous data were treated as a family of two comparisons (which did not alter the results for the Cretaceous data); the Mesozoic (i.e. total) data did not require correction for multiple comparisons. The Jarque-Bera test detected no significant departure from normality in any of the data after log10 transformation (S1 in the electronic supplementary material). Short-term autocorrelation was only detected in our Cretaceous data, and first-difference transformation of these data only slightly weakened statistical correlations (S2 in the electronic supplementary material).
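As a rough illustration of these steps, the sketch below computes Pearson and Spearman coefficients on log10-transformed series, applies a first-difference transformation, and runs the Benjamini & Hochberg step-up procedure on a family of p-values. All numbers are hypothetical, the function names are our own, and the p-values are assumed to have been obtained separately (for example from a statistics package); Kendall's τ and the Jarque-Bera test are omitted for brevity.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def first_differences(series):
    """Change between successive time bins, used to reduce autocorrelation."""
    return [b - a for a, b in zip(series, series[1:])]

def benjamini_hochberg(p_values, q=0.05):
    """Return True for each test rejected at false discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0  # largest rank k with p_(k) <= (k / m) * q
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        rejected[idx] = rank <= k_max
    return rejected

# Hypothetical counts for five time bins (not values from the paper).
fmf = [12, 30, 18, 55, 41]
tde = [3, 9, 4, 20, 11]
log_fmf = [math.log10(v) for v in fmf]
log_tde = [math.log10(v) for v in tde]

r_pearson = pearson(log_fmf, log_tde)
r_spearman = spearman(log_fmf, log_tde)
r_differenced = pearson(first_differences(log_fmf), first_differences(log_tde))

# Hypothetical p-values for a family of three comparisons.
significant = benjamini_hochberg([0.004, 0.030, 0.200])
```

Treating each group of tests as its own family, as in the text, simply means calling `benjamini_hochberg` once per family, so that the threshold for each p-value depends only on the number of comparisons in that family.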

(c) Correction of raw diversity data for FMF

Species occurrence data were corrected for our proxy of geological sampling intensity (FMF) using the method of Smith & McGowan (2007; see also Barrett et al. 2009; Butler et al. 2009). This method calculates a modelled diversity estimate (MDE), which represents the diversity expected if observed diversity biases are solely the result of sampling intensity. The model was constructed by independently rank-ordering taxic diversity and FMF. The ordered data were then paired-off and log10-transformed, applying the function [f(TDE)=log10(TDE + 1)]. A least-squares regression line was calculated for this re-ordered data, representing a relationship in which FMF accurately predicted taxic diversity. The equation of this line was then used to calculate predicted diversity for each time interval based on FMF; this is the MDE. The difference between taxic diversity and modelled diversity is the residual diversity, not explained solely by variation in FMF. High positive or negative residual values are most likely explained by either exceptional sampling events (e.g. Lagerstätten) or genuine changes in palaeodiversity. Because Triassic and Jurassic modelled diversities may be distorted by Lagerstätten effects, a diversity model and residual diversity were also calculated separately for the Cretaceous data (figure 2b).
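The residual calculation described above can be sketched as follows. This is an illustrative reading of the procedure rather than the code used for the analysis: the input values are hypothetical, the function names are our own, and we assume that FMF is log10-transformed without the added constant (so every bin must have FMF > 0) and that residuals are taken on the log10 scale.

```python
import math

def least_squares(x, y):
    """Slope and intercept of the ordinary least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def residual_diversity(tde, fmf):
    """Modelled diversity (MDE) and residuals for each time bin."""
    log_tde = [math.log10(t + 1) for t in tde]  # f(TDE) = log10(TDE + 1)
    log_fmf = [math.log10(f) for f in fmf]      # assumption: FMF > 0 in every bin
    # Rank-order each series independently and pair them off, so the fitted
    # line describes a world in which FMF perfectly predicts diversity.
    slope, intercept = least_squares(sorted(log_fmf), sorted(log_tde))
    # Apply the line to the real (unsorted) FMF values to obtain the MDE.
    mde = [intercept + slope * x for x in log_fmf]
    residuals = [obs - pred for obs, pred in zip(log_tde, mde)]
    return mde, residuals

# Hypothetical bins: the second has few formations but many species,
# as would be expected for a Lagerstätte.
tde = [5, 30, 4, 12]
fmf = [40, 10, 25, 60]
mde, residuals = residual_diversity(tde, fmf)
```

Because sorting changes neither series' mean, the residuals from this model always sum to zero across the bins used to build it; a positive residual therefore indicates only that a bin is richer than its FMF predicts relative to the other bins in the same model, which is one reason a separate Cretaceous-only model can give different residuals.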

Figure 2. Plots of diversity and FMFs through time. (a) Uncorrected (raw) data. (b) Residual data after correction for a model in which FMF predicts diversity (see text for an explanation). Dashed line shows FMF, black line shows TDE, grey line shows PDE and dash-dot line shows TDE corrected for a model based solely on Cretaceous data.

Phylogenetic diversity estimates (PDE) incorporate information from tree structure by adding counts of ghost lineages to counts of taxa included in phylogenetic trees (Norell 1992). They can be used to partially correct for incomplete sampling, but should be interpreted cautiously because the correction is still dependent on fossil sampling and only extends backwards in time (Wagner 2000). Nevertheless, phylogenetic diversity may give a more reliable indication of mass extinction events, as lineages that range through geological stages are counted as species occurrences rather than absences. When phylogenetic diversity exceeds taxic diversity it is likely that taxic diversity is an underestimate of relative palaeodiversity, but this can also arise from phylogenetic error (Wagner 2000). If phylogenetic diversity is lower than taxic diversity within a given time bin, this indicates that many taxa of that age have not yet been incorporated into phylogenetic analyses. The cladograms used to compute phylogenetic diversity are included in S1 in the electronic supplementary material.
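The way ghost lineages raise diversity counts in bins older than a taxon's first appearance can be shown with a deliberately simplified sketch. It assumes that sister-group relationships are supplied as pairs of terminal taxa, whereas a full implementation would traverse the cladogram and handle internal nodes; the taxon names, the bin indices (0 = oldest) and the function name are hypothetical.

```python
def phylogenetic_diversity(sampled, sister_pairs, n_bins):
    """Sampled taxa plus ghost lineages per time bin (bin 0 is the oldest)."""
    counts = [0] * n_bins
    # Taxic component: every bin in which a taxon has actually been sampled.
    for bins in sampled.values():
        for b in bins:
            counts[b] += 1
    first = {taxon: min(bins) for taxon, bins in sampled.items()}
    # Ghost lineages: the later-appearing sister must already have existed
    # when the earlier-appearing sister is first sampled.
    for a, b in sister_pairs:
        older, younger = sorted((a, b), key=lambda t: first[t])
        for i in range(first[older], first[younger]):
            counts[i] += 1
    return counts

# Hypothetical example with four time bins.
sampled = {"Taxon A": {0}, "Taxon B": {2, 3}, "Taxon C": {1}}
sister_pairs = [("Taxon A", "Taxon B")]

taxic = phylogenetic_diversity(sampled, [], n_bins=4)
pde = phylogenetic_diversity(sampled, sister_pairs, n_bins=4)
```

The correction only ever adds lineages to bins older than a first appearance, which is the sense in which it extends backwards in time, and a taxon absent from the sister-pair list (Taxon C here) contributes nothing beyond its sampled occurrences.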

3. Results

(a) Comparison between the rock record and taxonomic diversity

Correspondence between the trend lines of taxic diversity and FMFs is supported by statistically significant correlation (table 1). Both lines follow an approximate pattern (figure 2a) in which overall values decline through the Triassic and Early Jurassic, reaching a low level at the beginning of the Middle Jurassic (Aalenian). They then increase into the late Middle Jurassic (Callovian), and remain approximately constant until the end-Jurassic. The Early Cretaceous is characterized by low taxic diversity, while FMF remains at Late Jurassic levels. Both trends rise to a peak around the Albian–Cenomanian. They then decline until the Coniacian before increasing to a Campanian–Maastrichtian high. Correspondence between taxic diversity and FMF is particularly marked during the Cretaceous (table 1). By contrast, Triassic and Jurassic data are punctuated by deviations, that is, high values of one trend that are not matched in the other: taxic diversity is high in the Anisian–Ladinian, Sinemurian, Toarcian, Callovian and Kimmeridgian (figure 2a), whereas FMF is high in the Carnian–Norian, Pliensbachian, Bajocian and Oxfordian. Accordingly, the combined Triassic–Jurassic data correlate less strongly than those for the Cretaceous. This weakness is driven by deviations concentrated in the Jurassic, and correlations of the Jurassic data are weakest (table 1). Although the Triassic data are not significantly correlated, this may result partly from the small number of Triassic time bins (n = 6). Indeed, the non-parametric tests (Spearman's rs and Kendall's τ) recover higher correlation coefficients for the Triassic than for the Cretaceous.

Table 1. Tests of the correlation between log10FMF and log10TDE over various time intervals. Identical correlation statistics were obtained for the relationship between MDE and log10TDE because MDE is a linear transformation of log10FMF. Statistically significant correlations are indicated by single (*p < 0.05) or double (**significant after correction for multiple comparisons: Benjamini & Hochberg (1995); correction not applied to Mesozoic data) asterisks.

(b) Corrected diversity estimates

The trend of residual taxic diversity shows an oscillating pattern in the Triassic and Jurassic, in which Anisian–Ladinian, Hettangian–Sinemurian, Toarcian, Callovian and Kimmeridgian peaks alternate with stages of low or negative residuals (figure 2b). Residuals are negative in the Early Cretaceous and then positive in the Late Cretaceous until the Campanian and Maastrichtian, which have residual values close to zero.

(c) Phylogenetic diversity estimates

The PDE approximately corresponds to taxic diversity in many time bins (figure 2a). However, it is consistently higher in the Norian–Pliensbachian, Aalenian–Oxfordian, Berriasian–Aptian, and Coniacian, suggesting that relative taxic diversity is underestimated for these intervals. This effect is particularly marked in the Oxfordian. Despite this approximate correspondence with taxic diversity (which correlates with FMF), phylogenetic diversity does not correlate with FMF except in the Jurassic and Cretaceous, when marginally significant correlations are recovered from some tests prior to correction for multiple comparisons (table 2). Lack of correlation presumably arises from the inclusion of inferred, non-sampled data in the PDE.

4. Discussion and conclusions

(a) The nature of geological megabias

The significant correlation between taxic diversity and FMF suggests that a portion of the observed pattern of taxic diversity might be explained by temporal variation in the quantity of fossiliferous rock and range of facies sampled by palaeontologists (ρ², the proportion of variation in taxic diversity explained by FMF, is given in table 1). If this is the case, caution is required when testing macroevolutionary hypotheses using uncorrected taxic diversity. Alternatively, sea level change may drive both FMF and taxic diversity (e.g. Peters 2005), in which case uncorrected taxic diversity may reflect genuine patterns. The relative importance of such effects is still under debate (Benton & Emerson 2007; Smith 2007; Wall et al. 2009) and differentiating between direct causation and ‘common cause’ is not possible given our data. However, the magnitude of Lagerstätten effects on our data (below) indicates that at least some aspects of sampling heterogeneity cannot be ignored when interpreting palaeodiversity.

A striking feature of our residual diversity trend is the presence of five Triassic–Jurassic peaks (figure 2b). These are not explained by variation in FMF, and could be interpreted as periods of genuinely high diversity. However, these peaks correspond to a small number of European formations from which marine tetrapods have been intensively sampled for ca 200 years, and are thus classified as Lagerstätten effects. These formations have yielded a high proportion of species occurrences from their respective stages: the Anisian (55%) and Ladinian (66%) formations of central Europe, the primarily Sinemurian Lower Lias Group of the UK (100%), the Toarcian Posidonienschiefer Lagerstätte of Germany (52%), the Callovian Peterborough Member of the Oxford Clay Formation (73%) and the primarily Kimmeridgian Kimmeridge Clay Formation of the UK (55%). It is clear that the exceptional sampling of these formations distorts observed diversity patterns to the extent that a significant correlation between taxic diversity and FMF is difficult to recover for Triassic–Jurassic data (table 1). The Triassic data include only one Lagerstätte (Anisian–Ladinian) and, despite the small sample size, which may obstruct statistical detection of true correlation, only marginally non-significant correlations arise from non-parametric tests (table 1). This possible correlation in the Triassic suggests that Lagerstätten effects are temporally localized departures from an underlying correspondence between taxic diversity and FMF across the Mesozoic. The absence of such departures from the Cretaceous record is probably because few European Cretaceous formations yielding well-preserved marine tetrapods have been sampled so extensively.
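The percentages quoted above express, for each stage, the share of species occurrences contributed by a single formation. A minimal sketch of that calculation is given below, assuming occurrence records of the form (species, stage, formation); the records, the placeholder names and the function name are hypothetical and do not reproduce the values in the text.

```python
from collections import defaultdict

def top_formation_share(records):
    """For each stage, the most productive formation and its share of species."""
    by_stage = defaultdict(lambda: defaultdict(set))
    for species, stage, formation in records:
        by_stage[stage][formation].add(species)
    shares = {}
    for stage, formations in by_stage.items():
        all_species = set().union(*formations.values())
        name, species = max(formations.items(), key=lambda kv: len(kv[1]))
        shares[stage] = (name, len(species) / len(all_species))
    return shares

# Hypothetical records: (species, stage, formation).
records = [
    ("Sp1", "Stage 1", "Formation X"),
    ("Sp2", "Stage 1", "Formation X"),
    ("Sp3", "Stage 1", "Formation X"),
    ("Sp4", "Stage 1", "Formation Y"),
    ("Sp5", "Stage 2", "Formation Z"),
    ("Sp6", "Stage 2", "Formation W"),
    ("Sp7", "Stage 2", "Formation W"),
]
shares = top_formation_share(records)
```

In this toy example Formation X supplies three of the four species known from Stage 1; a stage whose diversity is dominated by one formation in this way is a candidate Lagerstätten effect, although the share alone cannot show whether the underlying diversity was genuinely high.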

Previous analyses of the relationship between estimates of geological sampling and vertebrate diversity have illustrated that megabiases strongly influence palaeodiversity signals. However, our results demonstrate that qualitatively different sampling regimes may predominate at different points in Earth's history: marine tetrapod palaeodiversity is primarily influenced by variation in FMF in the Cretaceous, by isolated instances of exceptional sampling in the Jurassic and by a possible combination of these factors in the Triassic. Understanding this temporal heterogeneity in the nature of sampling bias is critical to interpreting patterns in vertebrate diversity through time because a single correctional regime may not be sufficient to unpick the patchwork of geological megabias affecting vertebrate preservation. Crucially, it may be difficult to calculate a single line representing ‘true’ palaeodiversity across all time bins. We explain residual diversity (figure 2b) using a combination of intensely sampled single formations and genuine fluctuations in biological diversity. The former mechanism is most influential in the Triassic–Jurassic.

(b) Marine reptile diversification and extinctions

The uncorrected taxic diversity trend line (figure 2a) shows detailed similarity to that used by Bardet (1992, 1994) to suggest marine tetrapod extinction events at the ends of the Ladinian, Tithonian, Cenomanian and Maastrichtian. Bakker (1993) independently proposed end-Tithonian and mid-Cretaceous extinctions. However, residual diversity corrected for variation in FMF shows a different pattern (figure 2b). A high taxonomic and morphological (e.g. Rieppel 2000; McGowan & Motani 2003) diversity of marine tetrapods was already present by the early Middle Triassic (Anisian), less than 6 Ma after the Permian–Triassic boundary. This suggests either a missing Permian record for at least some clades, or a rapid radiation of marine tetrapods after the end-Permian mass extinction. The high diversity of the Anisian–Ladinian is followed by negative residual diversity through the Late Triassic, despite an increase in marine fossil sampling in the early Late Triassic (Carnian). This is corroborated by a decline in phylogenetic diversity, and suggests an extinction event after the Ladinian, encompassing the demise of nothosauroids (Ladinian), pachypleurosaurs and thalattosaurs (Carnian). However, it is difficult to constrain the timing and severity of this event because of pronounced Lagerstätten effects, which distort observed and residual palaeodiversity (figure 2). Subsequent Triassic–Jurassic fluctuations in residual diversity (figure 2b) are impossible to distinguish from the presence or absence of intensive sampling but high phylogenetic diversity suggests that taxic diversity is underestimated during the intervals of apparent low diversity. There is no evidence that an end-Triassic extinction event affected marine reptiles. This conflicts with the controversial evidence for a major mass extinction among marine invertebrates, terrestrial vertebrates and plants (e.g. Tanner et al. 2004).

A sharp end-Tithonian (terminal Jurassic) drop in taxic diversity results in pronounced negative residual diversity, from which marine reptiles did not recover until the early Late Cretaceous. This corresponds to a well-documented decline in the diversity of thalattosuchian crocodiles (Young et al. in press) and ichthyosaurs (e.g. McGowan & Motani 2003). This result complements hypotheses of a wider extinction event coinciding with a major end-Jurassic reorganization of terrestrial dinosaur faunas (Bakker 1978; Upchurch & Barrett 2005). However, recognition of a marine tetrapod extinction here is complicated by evident differences in sampling regimes between the Jurassic and the Cretaceous. The earliest Late Cretaceous (Cenomanian) is marked by residual diversity close to zero, increasing to high positive values in subsequent stages. This is not consistent with an end-Cenomanian extinction event (Bardet 1992, 1994), instead suggesting progressive diversification driven by the radiation of marine chelonians (despite the final extinction of ichthyosaurs), but obscured by the low numbers of Cenomanian–Santonian marine formations. It is also possible that the observed end-Cenomanian marine invertebrate extinction event is an artefact of geological megabias (Smith et al. 2001).

Diversity declined in the latest Cretaceous (Campanian–Maastrichtian). This does not represent an ‘edge effect’ (e.g. Signor & Lipps 1982) because it results from the in-bin taxon counts and not from range-through data. Barrett et al. (2009) recovered a similar result for theropod and ornithischian dinosaurs. In combination, these results suggest that global ecosystems, marine and terrestrial, underwent a progressive decline in vertebrate biodiversity during the terminal stages of the Cretaceous.

Our data provide evidence for significant extinction events in the early Late Triassic, terminal Jurassic and terminal Cretaceous, but not at the end of the Cenomanian. The end-Triassic extinction event, one of the ‘big five’ mass extinctions, had little effect on marine tetrapods. This is surprising as many taxa were at the apices of marine food chains, and thus might be expected to be highly susceptible to major ecosystem changes. The apparent long-term decline in marine tetrapod diversity prior to the Cretaceous/Palaeogene boundary contradicts prevailing wisdom that the end-Cretaceous extinction event was geologically rapid and cataclysmic (marine tetrapods: Bardet 1992, 1994; Ross 2009). These patterns provide important new insights into the evolutionary history of Mesozoic marine communities.

Acknowledgements

This study benefited from data compiled within The Paleobiology Database (www.paleodb.org) by numerous colleagues, and is Paleobiology Database official publication 104. We thank P. M. Barrett, P. Mannion and P. Upchurch for comments on earlier versions of the manuscript. H. F. Ketchum, M. Young and S. Pierce provided access to manuscripts in press. Two anonymous reviewers provided comments that greatly improved this manuscript. R.J.B. is supported by an Alexander von Humboldt Research Fellowship.

References

Smith, A. B. & McGowan, A. J. 2007 The shape of the Phanerozoic marine palaeodiversity curve: how much can be predicted from the sedimentary rock record of western Europe? Palaeontology 50, 765–774. (doi:10.1111/j.1475-4983.2007.00693.x)