Abstract

The study of spatial patterns in biotic compositional variability in deep time is key to understanding the macroecological response of species assemblages to global change. Globally warm climatic phases are marked by the expansion of megathermal climates into currently extra-tropical areas. However, there is currently little information on whether vegetation in these “paratropical” regions resembled spatially modern tropical or extra-tropical biomes. In this paper we explore spatial heterogeneity in extra-tropical megathermal vegetation, using sporomorph (pollen and spore) data from the late Paleocene Calvert Bluff and Tuscahoma Formations of the formerly paratropical U.S. Gulf Coast (Texas, Mississippi, and Alabama). The data set comprises 139 sporomorph taxa recorded from 56 samples. Additive diversity partitioning, nonmetric multidimensional scaling, and cluster analysis show compositional heterogeneity both spatially and lithologically within the U.S. Gulf Coastal Plain (GCP) microflora. We then use sporomorph data from Holocene lake cores to compare spatial patterns in the late Paleocene GCP with modern tropical and extra-tropical biomes. Distance decay analysis of the Holocene data reveals a higher rate of spatial turnover in tropical versus extra-tropical vegetation types, consistent with a latitudinal gradient in floral compositional heterogeneity. The specific combination of rate and scale dependency of distance decay in the Holocene assemblages prevented us from associating the late Paleocene GCP with any particular modern biome. Our results demonstrate the importance of spatial scale, taphonomy, and lithology in determining patterns of spatial heterogeneity, and show the potential of the fossil sporomorph record for studying spatial patterns and processes in deep time.

During the “greenhouse” phase of the early Paleogene (Zachos et al. 2001, 2008) tropical climates extended into modern extra-tropical areas (Wilf et al. 2003; Fine and Ree 2006; Jaramillo et al. 2006; Jablonski 2008), and broad bands of “paratropical” forests covered much of the midlatitudes (Fine and Ree 2006). To date there is little information on whether plant communities in these areas spatially resembled modern tropical or extra-tropical biomes. This is significant because it provides information on the relative importance of the regional climatic regime in determining spatial floral patterns. Given similar levels of environmental heterogeneity along a latitudinal gradient, it is expected that spatial heterogeneity will increase with decreasing latitude, due to more equable climates facilitating greater habitat specialization towards the tropics (Stevens 1989). This pattern has been demonstrated for vascular plants and mammals in North America (Rodriguez and Arita 2004; Qian and Ricklefs 2007; Qian et al. 2009), although direct comparisons with tropical latitudes are currently lacking.

The U.S. Gulf Coastal Plain (hereafter GCP) was part of the expanded tropical climatic zone during the early Paleogene. Leaf morphologies, from which megathermal climatic estimates have been derived (Wolfe 1978; Wolfe and Dilcher 2000), testify to tropically functional plant types, whereas previous sporomorph studies have shown that taxonomically the vegetation was a mix of modern temperate and tropical lineages (Harrington 2001; Harrington et al. 2004; Jardine and Harrington 2008). The lack of a land bridge between North and South America limited compositional similarity between the GCP and Neotropics to ∼0.7% in the late Paleocene (Jaramillo and Dilcher 2001; Harrington and Jaramillo 2007). Therefore, the spatial dynamics of the GCP paratropics were not simply “inherited” from range expansions from the tropical biome. The late Paleocene GCP sporomorph record is thus ideal for measuring spatial heterogeneity in an extra-tropical, megathermal setting.

Terms used to describe compositional heterogeneity through space have varied greatly among authors (Vellend 2001; Magurran 2004). Beta diversity and spatial turnover have commonly been used interchangeably, but here we follow Vellend (2001) in differentiating between them. In this scheme beta diversity is simply heterogeneity within a group of samples. Although our main interest here is in spatial heterogeneity (Fig. 1A), this can also relate to samples distributed temporally or environmentally. Note that beta diversity is not intrinsically tied to any specific spatial scale, such as habitat or local area (Gering et al. 2003), making its use more flexible across different sampling schemes. Turnover is defined as compositional change along a gradient. Spatial turnover therefore refers to compositional heterogeneity along a geographic gradient (Fig. 1B). As with beta diversity this is not bound to any specific spatial scale; thus the “geodisparity” concept of Miller et al. (2009) is simply spatial turnover writ large. Where there is a strong environmental control on species membership (as predicted by niche-assembly theories of spatial heterogeneity [Chave 2008]; see Fig. 1B, lower row), beta diversity may be high but spatial turnover low.

β diversity (among-sample heterogeneity) and spatial turnover in ecological systems. Large squares represent samples distributed in space; symbols represent different taxa found in those samples. α (within-sample) richness is held constant at α = 5 throughout. A, Left group: many taxa are shared between samples, suppressing both β and γ (total) richness. Right group: no taxa are shared between samples, increasing both β and γ richness. B, The distribution of taxa along a geographic gradient. Background fill within squares represents environmental conditions, e.g., soil type or climate. Upper row: compositional similarity declines with geographic distance, regardless of the environmental characteristics of each sample. Lower row: within-sample composition is determined entirely by environmental variables, and compositional similarity declines with environmental, rather than geographic, distance.

In this paper we use the late Paleocene GCP sporomorph record to explore spatial heterogeneity in megathermal, extra-tropical floras. We address two principal questions. First, how important was spatial heterogeneity for structuring the GCP palynoflora during the Paleocene? Second, how did the spatial structure of the GCP paratropical vegetation relate to modern tropical and extra-tropical biomes? We address this second question by using Holocene sporomorph data as a proxy record for modern plant communities.

We find that there are significant compositional differences geographically and between lithologies (clastic samples versus lignites) even though many taxa have widespread geographic distributions and occur in repeated associations across samples and formations. Our data are not able to associate the Paleocene GCP more closely with either modern tropical or extra-tropical biomes. Our results do demonstrate the importance of spatial scale in determining spatial patterns, and support an increased level of compositional heterogeneity in the modern tropics compared to the extra-tropics.

Sampling Spatial Patterns in the Fossil Sporomorph Record

A variety of biases can confound studies of spatial patterns in the fossil record. Here, we briefly discuss those that are relevant to the sporomorph record. The spatial area represented by a sample, termed “grain size” in the ecological literature (Nekola and White 1999), will influence the patterns described. Larger grain sizes are expected to reduce the rate of spatial turnover between samples (Vellend 2001; Mac Nally et al. 2004; Qian et al. 2005). Furthermore, species-area relationships dictate that, all else being equal, a smaller area will contain fewer species than a larger area (Rosenzweig 1995; Nekola and White 1999). Studies of spatial heterogeneity must therefore comprise samples of a similar grain size.

As with spatial grain size, time-averaging (the amount of time represented by a sample) will influence the spatial pattern represented in a data set (White et al. 2010). Increasing the amount of time represented by a sample concomitantly increases the spatial area represented, by including more taxa that are only occasionally present in the sampled area and therefore missed in shorter sampling intervals (Magurran and Henderson 2003; White et al. 2010). It is thus necessary to ensure that temporal sampling is similar among samples being compared. It is also essential that spatial comparisons be between coeval deposits, so that differences are not due to compositional changes through time rather than through space.

Finally, the taxonomic rank used will influence the observed patterns. Supra-specific taxa are more geographically widespread than individual species, resulting in lower levels of spatial heterogeneity among samples (Qian and Ricklefs 2007). Sporomorphs typically allow identification only to generic or familial level at best (Traverse 1988). However, Qian and Ricklefs (2007) demonstrated that the latitudinal gradient in spatial turnover is still detectable at these taxonomic levels, albeit in a suppressed form compared to species-level comparisons. Thus there is no reason that low levels of taxonomic resolution imposed by the fossil sporomorph record (Traverse 1988; Mander et al. 2010) should preclude its use for studying spatial patterns in plant assemblages.

The Wilcox Group in Texas crops out along a northeast-southwest trending belt (Nichols and Traverse 1971). The Calvert Bluff Formation in east-central Texas comprises a suite of paralic sediments very similar to those found in the Tuscahoma Formation, with sands, silts, clays and lignite horizons. Much of this region during the late Paleocene was dominated by the Rockdale Delta system, which was at least partly fed by the Mount Pleasant Fluvial System to the northeast (Fisher and McGowen 1967; Nichols and Traverse 1971). Flanking the delta system to the southwest and east are sediments representative of strandline, bay, and lagoon systems. The Calvert Bluff Formation is the main lignite-bearing unit in the Texas Gulf Coast (Middleton and Luppens 1995); lignite seams occur in association with fluvial, deltaic and lagoonal sediments (Nichols and Traverse 1971).

Materials

Sampling Strategy

Fifty-six samples were collected from late Paleocene deposits on the GCP, the majority in May 2008 (Fig. 2). Twenty-eight samples were gathered from outcrops of the Calvert Bluff Formation at five localities in east-central and east Texas: U.S. Silica Mine near Kosse (Limestone County; 31°18′30″N, 96°30′30″W; 1 sample), Walnut Creek Mining Company Calvert Mine near Bremond (Robertson County; 31°04′54″N, 96°39′30″W; 13 samples), Bastrop Golf Course in Bastrop (Bastrop County; 30°04′24″N, 97°16′54″W; 1 sample), Red Bluff on the Colorado River in Bastrop (Craddock 1947) (Bastrop County; 30°04′18″N, 97°16′42″W; 10 samples), and a road cut eight miles north of Mt. Enterprise (Rusk County; 32°01′18″N, 94°43′24″W; 3 samples). The U.S. Silica Mine and Calvert Mine deposits are located near the base of the formation, and the Bastrop Golf Course, Red Bluff, and Mt. Enterprise sections are at the top of the formation, just below the contact with the Carrizo Formation (Craddock 1947; Galloway 2002). Although the Sabinetown Formation is typically placed between the Calvert Bluff and Carrizo Formations (Fairchild and Elsik 1969; Elsik 1978; Crabaugh and Elsik 2000; Elsik and Crabaugh 2001), it is highly discontinuous in surface outcrop and was absent from the sections sampled. Samples representing marine deposition in the upper part of the formation (Janus and Stidham 2008) were barren of sporomorphs, limiting coverage to the formation top and base.

Following the depositional model of Fisher and McGowen (1967), all localities represent fluvial-deltaic sedimentation. According to this model, the Bastrop Golf Course and Red Bluff sections may be more distal than the Silica Mine, Calvert Mine, and Mt. Enterprise sections. Some samples from Calvert Mine and Bastrop Golf Course contained dinoflagellate cysts (<15 counted per sample), indicating some level of marine influence or reworking. Twenty-two of the Texas samples are clastic sediments (interlaminated clays, silts, and sands) and six samples are lignites, with three lignite samples from the base of the formation (U.S. Silica Mine and Calvert Mine sections) and three from the top (Mt. Enterprise section).

Twenty-eight samples were taken from the Tuscahoma Formation in Mississippi and Alabama. Twenty-six samples were collected from the OSM #2 Wahalak core (Mancini 1981, 1984; Carroll 1999) from Choctaw County, Alabama (32°03′00″N, 88°17′36″W). The Tuscahoma Formation in the Wahalak core is 154 m thick. Sporomorph yields were excellent throughout most of the formation, and sample coverage is much more complete than for the Calvert Bluff Formation. Twenty-three samples are clastics, and three lignite samples were taken from near the top of the formation. Two additional lignite samples were added from the Red Hills Mine (Jardine and Harrington 2008) in Ackerman, Mississippi (33°22′42″N, 89°13′54″W). These samples were collected in November 2000, and were included to provide a more similar sampling strategy for each study area, with lignite samples occurring at the top and bottom of each formation. This distribution ensures that temporal changes between lignite samples within each formation can be accounted for when assessing spatial differences between formations.

Sample Processing

Preparation of the clastic samples followed a basic procedure of maceration of 20 g of material with a mortar and pestle, successive treatments of 10% HCl, 40% HF, and 10% HCl to digest carbonates and silicates, and a light oxidation to remove excess amorphous organics (2 min in 70% HNO3). Between most stages, residues were sieved with a 10 µm mesh. Lignite samples were processed by crushing finely 2 g of sediment, and oxidizing for 25 min in 70% HNO3 and 10 s in household bleach. Residues were sieved at 10 µm after each oxidation treatment. Large fragments of carbonized material were sieved off with a 180-µm mesh. Finally residues were stained with safranin and aliquots were mounted onto coverslips.

Data Collection

A target sample size of 300 sporomorph grains was set for each sample. This was not achieved for all samples, however; samples with counts of <150 grains were discarded from the analysis. Where one morphotype accounted for >100 grains, and where sample sizes permitted, counts were increased so that at least 200 grains of the non-dominant type were included. Previous studies of GCP sporomorphs have typically focused on a restricted taxonomic group within an assemblage (e.g., Tschudy 1973a, 1975; Frederiksen and Christopher 1978; Frederiksen 1994) or on just those taxa that are relevant to biostratigraphy (Frederiksen 1979, 1980a, 1991, 1998). Because our interest in this study is robust estimation of compositional differences between study sites, we included all taxa encountered on the slides in the sample count. Those that were not recognized from the published literature were assigned an informal name based on likely form-genus membership.

A total of 139 morphospecies and 19,430 specimens were recorded from the 56 samples (see Supplementary Table 1 for the sporomorph data set). Table 1 provides summary statistics for all of the samples analyzed. The widely used Chao2 species richness estimator (Colwell and Coddington 1994) gave a richness estimate of 153.9 taxa ±15.3 (95% confidence interval) for the whole data set, which suggests that the data are not severely undersampled. Prior to data analysis all taxa that occur only in one sample were discarded from the data set. This led to the removal of 25 taxa, 21 of which were represented by only one specimen. This procedure ensures that compositional dissimilarities are not artificially inflated by differing sampling intensities (Foote and Raup 1996; Olszewski and Patzkowsky 2001a) and removes noise from the data set (Jaramillo 2002).

Using sporomorphs from across the U.S. Gulf Coast, Fairchild and Elsik (1969) broadly correlated the Tuscahoma Formation of Mississippi and Alabama with the Rockdale Formation (now regarded as the Calvert Bluff Formation and the underlying Simsboro and Hooper Formations [Elsik 1978]) of Texas. We used biostratigraphically important sporomorph taxa within our samples to confirm and refine this correlation. The Tuscahoma Formation extends over approximately the last 1 Myr of the Paleocene and into the earliest Eocene. The Paleocene/Eocene boundary in Mississippi and Alabama lies in the upper part of the Tuscahoma Formation, as evidenced by the presence of early Wasatchian mammal teeth in the uppermost Tuscahoma Formation at the Red Hot Truck Stop in Meridian, Mississippi (Beard 2008). The Paleocene/Eocene boundary is missing in the OSM#2 Wahalak core (Harrington et al. 2004), and all Tuscahoma Formation samples in this study are late Paleocene in age.

Fourteen samples taken from the three localities at the very top of the Calvert Bluff Formation yielded no diagnostic Eocene taxa known from the uppermost part of the Tuscahoma Formation (e.g., Brosipollis spp., Celtis tschudyi, Granulatisporites luteticus, Interpollis microsupplingensis, Nuxpollenites spp., Platycarya spp., Retistephanocolporites sp., Symplocos? contracta and Thompsonipollis sabinetownensis [Tschudy 1973b; Frederiksen 1998; Harrington 2003; Harrington et al. 2004; Harrington and Jaramillo 2007]); only Paleocene marker taxa were recorded (Holkopollenites chemardensis, Insulapollenites rugulatus, Lanagiopollis cribellata, Lanagiopollis lihoka, Momipites strictus, Retitrescolpites anguloluminosus, Spinaepollis spinosa and Trudopollis spp. [Frederiksen 1998; Harrington 2008]). This concurs with Crabaugh and Elsik (2000) and Elsik and Crabaugh (2001), who placed the first appearance of Platycarya spp. at the contact between the Calvert Bluff Formation and the overlying Sabinetown Formation. From the localities that we have studied, we therefore consider the Calvert Bluff Formation to be entirely late Paleocene in age.

Only a few sparsely represented pollen taxa can be used to differentiate the Tuscahoma Formation from the underlying Nanafalia Formation (Frederiksen 1998). Pistillipollenites mcgregorii first appears at the base of the Tuscahoma Formation (Frederiksen 1998), and this is present in samples from the base and top of the Calvert Bluff Formation. Furthermore, Dickey and Yancey (2010) recovered Bagelopollis verrucatus, which first appears in the middle of the Tuscahoma Formation (Frederiksen 1998), from the Red Bluff locality at Bastrop (uppermost Calvert Bluff), along with the Paleocene marker taxon Spinaepollis spinosa. The illustrated Platycarya of Dickey and Yancey (2010) is atypical of the form genus and not representative of the classical form that is found in the early Eocene. The Paleocene part of the Tuscahoma Formation and the Calvert Bluff Formation are thus treated as being laterally equivalent and broadly coeval, with both formations representing approximately 1 Myr of the latest Paleocene.

The Biogeographic Structure of the Paleocene GCP Palynoflora

Methods

Data analysis was carried out using R, version 2.11.1 (R Development Core Team 2010), with the package “vegan,” version 1.17–4 (Oksanen et al. 2010). The R code used in these analyses is provided in the online supplementary information. We used additive diversity partitioning (Lande 1996) to assess the distribution of diversity within and between samples and formations, and two exploratory multivariate techniques, ordination and cluster analysis, to identify compositional patterns and taxonomic associations within the data set. Both ordination and cluster analysis can be used with abundance or incidence data. Although we have abundance information in the form of sporomorph sample counts, we have limited our analyses to incidence data. Sporomorphs vary in their transportation potential because of differences in size, shape, and density (Heusser 1988; Holmes 1994). Sorting by water flow therefore may bias the relative abundances of sporomorph taxa differently in each formation, given some level of environmental variation across the GCP. Furthermore, ordinations and cluster analyses carried out with abundance data also gave results highly similar to those based on incidence data. We used the Jaccard similarity coefficient for these analyses because this metric is considered to perform well in ecological studies (Shi 1993; Legendre and Legendre 1998).

We have correlated the Calvert Bluff and Tuscahoma Formations by using biostratigraphically important sporomorph taxa. Using these same taxa to assess compositional similarity between formations may bias our results by introducing circularity. Arguably, however, these taxa were a part of the GCP flora, and excluding them from comparisons would be expected to increase the differentiation between the two formations, increasing the chances of a Type 1 statistical error (i.e., spuriously finding a significant difference [Hammer and Harper 2006]). Repeating the analyses with the biostratigraphic taxa removed did not alter the results; therefore we present our findings for the full data set, with biostratigraphic taxa included.

ADP decomposes the total diversity (richness or an appropriate measure of evenness) of a set of samples into α and β components using the general formula γ = α + β (Lande 1996). The β component can easily be calculated by rearranging this equation, so that β = γ − α. This treatment of β diversity has the attractive property of using the same unit of measurement for α, β, and γ diversity (Crist et al. 2003). Therefore γ is the total diversity of the pooled set of samples, α is the mean within sample diversity, and β is the mean among sample diversity, or the average diversity not found in any one sample (Veech et al. 2002). Applying ADP to the scenarios in Figure 1A gives α = 5, β = 2 and γ = 7 for the left-hand set of samples (“Low β”), and α = 5, β = 20, and γ = 25 for the right-hand set of samples (“High β”).

Where the sampling scheme is nested and hierarchical, the additive approach can be extended to take into account ever increasing scales of sampling (Wagner et al. 2000). In this study, samples (sampling level one) are nested within formations (sampling level two), which are nested within the total “regional” sampling scale γ (Fig. 3). Thus γ = α2(formation) + β2(formation), and α2(formation) = α1(sample) + β1(sample). By substitution, γ = α1(sample) + β1(sample) + β2(formation). It is therefore possible to describe which sampling level contributes most to total diversity. Given our operational definition of β diversity as the heterogeneity within a group of samples, ADP is the ideal tool to quantify it in a complex, nested sampling design.

Rather than using individual-based rarefaction (Gotelli and Colwell 2001), ADP corrects for varying sample sizes by using a weighting factor in obtaining mean α diversity. The mean α diversity of j samples at each hierarchical level i is obtained by αi = ∑Dijqij, where Dij is the diversity of each sample, and qij are the sample weights calculated by the proportion of the total number of individuals in level i that are found in each sample j (Crist et al. 2003). Larger samples are therefore given more weight, under the assumption that they provide a more reliable estimate of diversity (Layou 2007).

Crist et al. (2003) increased the explanatory power of ADP by developing a statistical framework for hypothesis testing. Crist et al. (2003) used randomization procedures to test for departures from random chance, by randomizing either individuals or samples. We used the individual-based approach to test the hypothesis that the observed diversity partitioning could have been explained by a random distribution of individuals among samples at all hierarchical scales (Crist et al. 2003; Gering et al. 2003). We can then ask whether the between-formation (β2) level is larger than expected if individuals were randomly distributed in space (i.e., part of the same floristic assemblage). Therefore, this approach provides a test of whether composition is significantly different between the plant assemblages on the eastern and western GCP. If so, the within-formation (α2) level must be smaller than expected by random chance, because in each sampling level γ = α + β. This pattern is what we would expect given high levels of intra- and interspecific aggregation of individuals within the two formations. Additive partitioning combined with a null model has been used in various settings in the ecological and conservation biology literature (e.g., Crist et al. 2003; Gering et al. 2003; Summerville et al. 2003a,b, 2006; Veech and Crist 2007); to our knowledge this is the first study to apply it to deep-time paleoecology.

The randomization test works by shuffling individuals between samples, while preserving the original sample sizes and overall taxon abundances (Crist et al. 2003). Samples are then pooled into formations, and the randomized data set is partitioned into alpha and beta components, using the same procedure as with the actual data. By repeating this process many times (9999 in the present case), the null distributions of α and β estimates for each sampling level are produced. Probabilities can then be attached to the observed (actual) partitions by assessing the proportion of null values that are greater or less than them (Crist et al. 2003; Gering et al. 2003). For example, if one in 9999 of the expected null diversity values is greater than or equal to the observed value, then there is a probability (p-value) of 0.0001 of obtaining an estimate the same or greater than the observed value by chance. In this case the result is highly significant, and the null hypothesis is rejected (Legendre et al. 1994; Crist et al. 2003).

ADP can be carried out on any diversity measure that is concave, i.e., where γ is always equal to or greater than α (Lande 1996). Such measures include species richness and the Simpson and Shannon dominance metrics. Here we have used species richness because we are interested in taxon co-occurrence patterns between samples and formations, which is most clearly tested by considering the average number of taxa present (or not present) in each sample and formation compared to the null model, rather than the evenness/dominance structures at each sampling level. The interpretability of relative abundance distributions, which are described by evenness metrics, is also not clear for sporomorph samples. Different pollen dispersal mechanisms (especially animal versus wind dispersal) greatly influence the quantity of pollen produced by different taxa (Jackson 1994; Bush 1995; Bush and Rivera 2001), which likely will distort the true relative abundances of the parent plant species (Odgaard 1999).

Sample Ordination

To further understand the compositional structure of the Paleocene GCP palynoflora, we ordinated the samples using nonmetric multidimensional scaling (NMDS). Ordination techniques project complex multivariate data onto a minimal number of axes; samples that are similar in composition appear close to one another in ordination space, and those that are less similar appear farther away. Ordinations are powerful techniques for detecting compositional patterns and gradients in community data sets (Legendre and Legendre 1998), whereas ADP can only provide information on predefined groups of samples (formations in the present case). ADP also averages across samples at each level in the sampling hierarchy, and so it does not inform about the relationships between individual samples. NMDS is a nonparametric ordination approach that uses ranked distances between samples, rather than the distances themselves (ter Braak 1995; Legendre and Legendre 1998; Hammer and Harper 2006). NMDS has a long history of use in ecological studies, and has been shown to portray intersample distances well (Kenkel and Orloci 1986; Minchin 1987; Clarke and Green 1988; Shi 1993; Bush and Brame 2010).

Cluster Analysis

We used agglomerative, hierarchical cluster analysis as a complement to NMDS, and to explore taxonomic associations across the GCP. Like ordination methods, cluster analysis is a multivariate technique that depicts structure within the data. Cluster analysis classifies samples (Q-mode) or species (R-mode) into groups based on the similarity of their compositions or associations respectively (Sneath and Sokal 1973; Legendre and Legendre 1998). By its nature hierarchical cluster analysis imposes a nested, hierarchical structure on the data whether one is present or not (Springer and Bambach 1985); however, comparison with NMDS ensures that spurious groupings will not be misinterpreted.

We carried out a two-way cluster analysis (Springer and Bambach 1985; Olszewski and Patzkowsky 2001b), in which the dendrograms from both Q-mode and R-mode cluster analyses are used to reorder the rows and columns of the original samples by species matrix. This allows clusters of taxa to be related to clusters of samples, to better understand the taxonomic structure of sample groupings. We used the Unweighted Pair Group Method with Arithmetic Averaging (UPGMA) linkage algorithm (van Tongeren 1995; Legendre and Legendre 1998) to produce cluster dendrograms. The UPGMA linking algorithm is an intermediate between the single- and complete-linkage algorithms; it minimizes the artificial gradients imposed by the former and artificial discrete groupings imposed by the latter (van Tongeren 1995).

Results

Additive Diversity Partitioning

When partitioning is carried out on all samples (Fig. 4), the observed data show that the largest partition is the within-formation (α2) level ( = α1(sample) + β1(sample)), which makes up 90% of total richness. Partitioning this into α and β components shows the importance of the among-sample (β1) level, which makes up 56% of the total richness. The between-formation (β2) level is the smallest partition, constituting only 10% of the total. However, when compared to the expected partitioning produced by the randomization procedure, it can be seen that both the β1 and β2 levels are significantly larger than expected by chance (p < 0.001). Average within-sample (α1) and within-formation (α2) richness contribute significantly less to the total richness than would be expected by chance (p < 0.001). A significant compositional difference between formations is therefore supported, although among-sample heterogeneity is clearly important.

Additive partitioning of species richness for all samples (left) and lignite samples only (right). “Observed” refers to partitions for the actual data, and “Expected” to partitions derived from the randomization procedure (9999 randomizations). Plus sign indicates that the observed partition is significantly larger than expected (p < 0.001), and minus sign that it is significantly smaller than expected (p < 0.001). N.S. = not significant at p = 0.05. α2(formation) = α1(sample) + β1(sample).

When partitioning is carried out on the lignite samples only (Fig. 4), the α2 and β2 levels are no longer significantly different from the null model. As with the analysis of the whole data set, observed α1 richness contributes significantly less to the total richness than would be expected by chance (p < 0.001), and β1 contributes significantly more (p < 0.001). Both α1 and α2 richness constitute proportionally more of the total richness in the lignite samples than in the whole data set (Fig. 4). Total richness is lower (55 taxa as opposed to 114 for all samples combined) because not all taxa occur in the 11 lignite samples.

The significant compositional difference between formations was confirmed with nonparametric multivariate analysis of variance (NPMANOVA) tests (Anderson 2001). NPMANOVA is analogous to a traditional multivariate ANOVA, but being nonparametric, it makes no assumptions about multivariate normality (Anderson 2001). We ran the tests on incidence data using the Jaccard similarity metric (Shi 1993). Highly significant differences were found both for the whole data set (F = 4.55, p = 0.001) and, in contrast to the ADP analysis, for lignite samples only (F = 1.74, p = 0.048). This suggests that NPMANOVA may be more robust to smaller sample sizes, or lower between-group dissimilarities.

NMDS

When plotted (Fig. 5), the samples from the Tuscahoma (open data points) and Calvert Bluff (black data points) formations are well separated in ordination space, with only minimal overlap. Importantly, the lignite samples (circles) maintain this separation, demonstrating that they show the same overall pattern as the space-averaged marginal marine clastic samples. Whereas the samples from the two formations are separated along NMDS axis two, the lignite and clastic samples are separated along NMDS axis one. This demonstrates a clear lithological control on composition, with clastic samples having additional compositional inputs from outside the immediate area of the coastal swamps.

To explore fine-scale compositional patterns within the formations, we ran NMDS on each formation separately. When the samples from the Calvert Bluff Formation are ordinated (Fig. 6A), there is a clear distinction between the lignite (circles) and clastic (triangles) samples on NMDS axis 1. Clastic samples from the top (black symbols) and bottom (open symbols) of the formation occur in two overlapping clusters, whereas the lignite samples occur in one cluster. The Tuscahoma Formation (Fig. 6B) shows a general compositional separation between the clastic and lignite (circled) samples. As with the Calvert Bluff lignites there is no obvious separation temporally or spatially. The two clastic samples occurring close to the lignites occur immediately above and below a lignite seam in the OSM#2 Wahalak core. The remaining clastic samples from the OSM#2 Wahalak core do not show any obvious gradient or grouping through time.

Cluster Analysis

Several main points can be drawn from the two-way cluster analysis (Fig. 7; see Appendix for taxon names, and see also Supplementary Figure 1 for a large-scale version with taxon and sample names included). Samples (the upper dendrogram) form clusters based primarily on lithology: clastic samples occur mostly in clusters A to F, and lignite samples occur exclusively in clusters G and H. Clusters within these groups are dominated by one formation or the other, but there is no single break between Calvert Bluff and Tuscahoma samples. This is to be expected, given that the NMDS (Fig. 5) revealed some overlap between formations. Taxa (the dendrogram to the right) form less clear clusters than the samples, with many taxa occurring individually rather than in groups. Because “chaining” is not a characteristic bias of the UPGMA algorithm (van Tongeren 1995), we are confident that this represents a genuine gradient in the data, reflecting an increase in the incidence of taxa among samples from cluster I to cluster VIII. Samples occupy a smaller range of similarity values (between 0.3 and 0.6) than do taxa (0.0 to 1.0). Using both R- and Q-mode analyses to relate taxon associations to sample clusters reveals that most taxa are not limited to one group of samples; instead associations of taxa are diffuse among samples, formations, and lithologies. This is unsurprising, given that 92 of the 114 taxa in the data set ( = 81%) are shared between the two formations. The most common ranked taxa (those that occur in multiple samples) are also largely the same in each formation (Fig. 8). Some relationships can be picked out from the two-way cluster analysis, however. Lignites (clusters G and H) are marked by the lack of taxa from cluster VI (e.g., Cingulatisporites sp., Ericipites spp., Ulmipollenites krempii, Camarozonosporites sp., Momipites waltmanensis, Bombacacidites reticulatus, Caryapollenites wodehousei; and the biostratigraphic marker taxa Retitrescolpites anguloluminosus, Momipites strictus, and Pistillipollenites mcgregorii) and by the presence of taxa from cluster V (e.g., Aesculiidites circumstriatus, Momipites triradiatus, Caprifoliipites spp., Horniella spp. and Rhoipites sp. 1). Cluster A contains mainly Calvert Bluff clastics and is separated by cluster II taxa (e.g., Intratriporopollenites pseudinstructus, Insulapollenites rugulatus, Spinizonocolpites sp. 1, Sequoiapollenites spp. and Toroisporites spp.); clusters B and E are delineated by cluster III taxa (e.g., Ailanthipites berryi; the Normapolles Basopollis obscurocostatus, Trudopollis plenus, and Pseudoplicapollis limitatus; and the spore Zlivosporis novamexicanum).

Incidence plots for the 15 most common taxa within each formation. The y-axis shows the number of samples in which each taxon occurs. Asterisks denote the taxa in each formation that are not among the 15 most common taxa in the other formation.

Compositional Heterogeneity in the Paleocene GCP Palynoflora

We have demonstrated that the flora across the GCP during the late Paleocene was not compositionally homogeneous. A separation between the Calvert Bluff Formation in Texas and the Tuscahoma Formation in Mississippi and Alabama was demonstrated by the ADP analysis (when both clastic and lignite samples were used), NPMANOVA (when using the whole data set or with the lignite samples only), NMDS, and cluster analysis. Similar floral discontinuities on regional scales have been discovered in Amazonia and have been related to changes in soil characteristics and underlying geology (Pitman et al. 2008; Coronado et al. 2009; Duque et al. 2009). Ecological processes within the GCP floras can only be speculated upon, given the lack of knowledge of fine-scale environmental and habitat changes within this region and of the environmental tolerances of plant taxa (or even what the plant taxa are below family level). No geographic barrier to species interchange across the GCP is known, apart from the subsiding Mississippi embayment (Cox and Van Arsdale 2002).

In the ∼1-Myr time window of this study, compositional differentiation was mainly in the spatial dimension; compositional turnover through time was only of secondary importance in driving among-sample heterogeneity. NMDS showed some differentiation between clastic samples from the top and bottom of the Calvert Bluff Formation (Fig. 6), but this was less important than the differentiation between the two formations (Fig. 5). All lignite samples from the Calvert Bluff Formation plotted close together, demonstrating stability in these swamps through time and space (∼200 km). The Tuscahoma samples did not reveal any clear temporal trends (Fig. 6), although the differences in sampling distribution and intensity between the two formations may have influenced this. A general pattern of change over time has previously been recorded within the Tuscahoma Formation (Harrington and Jaramillo 2007), with samples taken from the base of the formation being compositionally different from those at the top. The pattern of change on both western and eastern sides of the U.S. Gulf Coastal Plain can be attributed to the long-term climate warming in the late Paleocene evident from isotope and other climate proxy data (Wilf 2000; Fricke and Wing 2004; Zachos et al. 2008).

The most pervasive segregation observed in the multivariate analyses is lithological rather than spatial. Clastic and lignite samples clearly separate out in the NMDS and cluster analysis plots (Figs. 5–⇑7), although no taxa are completely restricted to the swamp environments (Fig. 7). This lithological differentiation demonstrates the need to take taphonomic and environmental factors into account in spatial analyses. Lower within-sample richness in the lignites relative to the clastic samples, as demonstrated by the ADP analysis (Fig. 4), is a common feature of such deposits (Nichols and Traverse 1971; Jaramillo et al. 2007; Harrington 2008; Jardine and Harrington 2008). This may be caused by the restricted sporomorph source area for swamps relative to marginal marine and lacustrine settings (Nichols and Traverse 1971), or the special adaptations needed by plant taxa to inhabit a swamp environment (Fitter and Hay 2002). The clastic sediments sample both the local and regional vegetation (Harrington 2008). The maintenance of the geographic separation between formations in the lignite samples, as demonstrated by the NMDS and NPMANOVA test, means those extra-local taxa are not biasing the signal displayed by the data set as a whole.

Despite the prevalence of compositional heterogeneity in this biome (both among samples and between formations), many common taxa are widespread across the GCP (Figs. 7, 8). Taxonomic associations are diffuse and recur within and between formations, which suggests a regional species pool arranged into repeated, co-occurring groups of taxa distributed across the GCP. Similar patterns have been recorded in both temperate (Nekola and White 1999) and tropical (Pitman et al. 1999, 2001; Macia and Svenning 2005; Pitman et al. 2008) forests in areas of low environmental heterogeneity.

Relating the Paleocene GCP to Holocene biomes

Methods

Holocene Data Set Details

To compare the spatial dynamics of the Paleocene GCP with modern biomes, we used previously published Holocene lake core data. All data were taken either from the online Neotoma database (the new repository for the Global Pollen Database; http://www.neotomadb.org/), or from personal communications (details of Holocene data sets used are available in Supplementary Table 2 and Supplementary Figure 2). Tropical sites (n = 5) were taken from the Colombian and Brazilian Neotropics, close to the equator. Extra-tropical sites were taken from eastern North America, and were divided into warm mixed forest (25°N to 36°N; n = 11) and temperate deciduous forest (26°N to 45°N; n = 11). North American biome names and coverage were taken from Williams et al. (2000). The warm mixed forest (also called broadleaved evergreen forest by Williams et al. [2000]) includes the modern GCP vegetation. Sites were chosen so that each biome covers a similar area, and all are cores from lakes less than 600 m above sea level.

Results for distance-decay regression of Holocene lake sites. See text for details.

We selected pollen samples dating from 3000 to 3500 Ka as a suitable time interval for comparing spatial patterns between Holocene biomes, because this post-dates the major climatic changes of the earlier Holocene and pre-dates major human alteration of the landscape (Mayle et al. 2000; Williams et al. 2000; Berrio et al. 2002; Munoz and Gajewski 2010). Holocene pollen samples typically represent years to tens of years (Jacobson and Bradshaw 1981); assuming a constant rate of sedimentation, samples from the OSM#2 Wahalak core represent 70–140 years (given a 1-Myr duration and 150 m of core for the Tuscahoma Formation, and 1–2 cm of core thickness per sample). Sedimentation rates were not uniform throughout deposition of the Tuscahoma Formation (Mancini and Tew 1991, 1995), so time-averaging for the GCP data probably extends into several hundred years for some samples. To emulate this increased time-averaging in the Holocene data, two or three consecutive samples within each core were pooled. The Holocene samples used in this analysis therefore encompassed 200 to 700 years.

Data Analysis

ADP is not appropriate for comparing the Paleocene and Holocene data because most of the Paleocene localities produced multiple samples arranged through time, leading to temporal pseudoreplication (Srivastava 1999) in within- and between-sample comparisons that is not present in the Holocene data. We instead analyzed the distance decay of compositional similarity in the Holocene biomes, by regressing inter-site compositional similarities onto inter-site geographical distances (Nekola and White 1999). A steeper slope demonstrates a more rapid decay of similarity with distance, and so a higher level of spatial turnover. Compositional similarity was measured as the natural logarithm of the Jaccard coefficient, which has given the best linear fit in similar studies of modern ecological systems (Nekola and White 1999). Geographic distance was measured as the great circle distance between sites, calculated using the R package “spdep,” version 0.4–7 (Bivand et al. 2007).

Using pairwise distances violates assumptions of independence in parametric linear regression (Legendre et al. 1994; Nekola and White 1999). Because parametric p-values could therefore not be used, we used a permutation procedure, as outlined in Legendre et al. (1994), to assess the significance of the regressions. This procedure is analogous to the randomization test used with the ADP analyses (Crist et al. 2003). The response (compositional similarity) matrix is randomized, the regression performed, and the F-value stored (the R2-value can also be used, as F is a monotonically increasing function of R2 [Legendre et al. 1994]). This procedure is repeated many times (9999 here), to create a distribution of null values under the expectation of random chance. The p-value is then the proportion of expected (permuted) F-values smaller than the observed F-value (i.e., the F-value from the original, unpermuted data).

The pervasive temporal pseudoreplication in the Paleocene GCP data prevented us from using linear regression as described above. Hence, we calculated the mean intersample compositional distance between the Calvert Bluff and Tuscahoma Formations, and compared this with the Holocene data by plotting it on the same distance-decay graphs, using 819 km as the geographic distance between sampling locality midpoints for each formation. Only clastic samples were used, to maintain isotaphonomic consistency. The intersample distances were not normally distributed (Shapiro-Wilk test of normality, W = 0.98, p < 0.001), so 95% confidence intervals for the mean distance were calculated through a bootstrapping procedure. The inter-formation, intersample compositional similarities were resampled, with replacement, 10,000 times. The mean was calculated for each resampling run, and the 2.5% and 97.5% quantiles taken from this distribution.

Lake sediments are dominated by sporomorphs produced by vegetation in the surrounding tens to hundreds of meters, and maximum sporomorph source radii are typically on the order of tens of kilometers (Jacobson and Bradshaw 1981; Sugita 1993; Davis 2000). The sporomorph source areas for the marginal marine samples of the Paleocene GCP are uncertain. However, given that the sedimentary systems on both sides of the GCP are associated with major deltaic complexes, and are therefore likely to have received significant fluvial sporomorph input, the source areas for these deposits are expected to be larger than for the Holocene lakes. As discussed above, increasing the duration of the sampling interval through artificial time-averaging is expected to increase the effective sampling area (White et al. 2010). To further investigate the effects of increased spatial grain on the distance-decay analysis, we pooled Holocene sites into zones with diameters of 300 km (see Supplementary Table 2). This diameter was chosen as a trade-off between number of zones and number of sites/zone. Both temperate and warm mixed biomes were divided into four zones of two or three sites. This procedure was not carried out for the tropical biome, owing to the low number of sites. We then carried out regressions as described previously, using the great circle distance between zone midpoints.

Results

Regression of compositional similarity onto geographic distance (Table 2, Fig. 9A) reveals that the rate of spatial turnover in the Holocene tropical biome is greater than that of the two extra-tropical biomes. The tropical biome also has a higher intercept than the extra-tropical biomes. This means that geographically proximal sites are more similar in the tropics than in extra-tropical biomes, up to the intersection of lines at ∼900 km between sites. At greater distances between sites tropical floras are relatively less similar than extra-tropical ones.

When the regression is rerun with the spatially averaged temperate and warm mixed data sets (Fig. 9B), the warm mixed forests show both a steeper slope and a higher intercept than the temperate forests. Here the two regression lines cross at about 800 km between sites. As with the tropical versus extra-tropical biomes, this shows a higher rate of distance decay in the warm mixed forests, and scale dependency in relative compositional similarity between biomes. As expected, similarity values are higher for the spatially averaged (pooled sites) data set than for the individual sites regression.

The GCP data (the black cross in Fig. 9) plot out very close to the intersection point of the Holocene biome regression lines in the individual sites regression (Fig. 9A). Thus, given the distance between sampling localities in the Paleocene data and the spatial structure of the Holocene biomes, we cannot tell which biome the Paleocene GCP most closely resembles. Comparison with the spatially averaged data (Fig. 9B) shows that the GCP is more dissimilar than either the warm mixed or temperate biomes at this geographic distance. The GCP mean similarity value shown is for clastic samples only; the same value for the entire data set (with the 11 lignite samples included) is −0.88, versus −0.83 for clastic samples only. This pushes the GCP data point down toward the Holocene regression line intersection in Figure 9A, and further away from the Holocene data in Figure 9B.

The Paleocene GCP versus Holocene Tropical and Extra-Tropical Biomes

Directly relating the Paleocene GCP palynoflora to the Holocene biomes depends on how the sporomorph source area for the GCP marginal marine deposits is reconstructed. If these sediments are assumed to have a sporomorph source area approximately similar to that of the Holocene lakes then the regional-scale spatial structure of the GCP vegetation was consistent with those of modern biomes. This suggests that essentially modern spatial patterns were established in plant assemblages by at least the start of the Cenozoic. Alternatively the GCP clastics had a sporomorph source considerably larger than the Holocene lakes, which means that the spatially averaged distance-decay analyses were a better comparison. In this case the GCP flora was more heterogeneous than that of modern extra-tropical biomes. How it would relate to modern tropical vegetation requires further study.

The comparison between modern biomes has also yielded data useful for both modern and deep-time studies. Making contrasts between modern tropical and extra-tropical biomes are currently problematic because direct comparisons are scarce in the ecological literature (although see Gilbert et al. 2010). Furthermore, spatial processes that have previously been associated with hyper-diverse tropical plant communities (such as stochastic, dispersal-driven assembly [Hubbell 2001] and density-dependent mortality of conspecifics [e.g., the Janzen-Connell hypothesis; Comita et al. 2010]) have recently been shown to be important in structuring temperate forests as well (Nakashizuka 2001; Gazol and Ibáñez 2009; Shibata et al. 2010; Zhang et al. 2010). These results are therefore of interest not only for relating deep-time regional-scale spatial patterns to those of the modern day, but also because they provide a valuable comparison of spatial patterns in modern tropical and extra-tropical biomes.

According to our results, tropical biomes have higher rates of spatial turnover than extra-tropical biomes, which is consistent with expectations of a latitudinal gradient in spatial turnover (Stevens 1989; Qian and Ricklefs 2007) that extends to the equator. The distance-decay relationship between biomes is also scale dependent, with tropical biomes being more similar at smaller spatial scales than extra-tropical biomes, despite the more rapid decline in similarity with geographic distance. These findings deserve testing more thoroughly with species-level data in modern floras, but they potentially provide diagnostic signatures of tropical versus extra-tropical vegetation that can be recognized and assessed in the terrestrial plant fossil record.

Summary and Conclusions

Our results have demonstrated the importance of spatial heterogeneity in the Paleocene GCP palynoflora. A pervasive compositional separation between the Calvert Bluff and Tuscahoma Formations was revealed by all analyses, despite the presence of many geographically widespread taxa. This separation was also maintained by the lignite samples, which means that the pattern was not caused by extra-regional taxa being transported into the GCP, and is a genuine property of the regional vegetation type. A lithological separation between clastic and lignite samples, driven by both shared presences and absences of taxa among the lignites, shows the importance of taking lithological and taphonomic considerations into account when analyzing compositional variability in fossil assemblages.

The Holocene sporomorph record provides a bridge between neoecological studies of modern plant communities and shallow and deep-time paleoecological studies of those in the past (Harrington 2004; Jaramillo et al. 2006). The combination of rate and scale dependency of spatial turnover in the Holocene biomes confounded our efforts to associate the Paleocene GCP paratropical flora more closely with any specific modern vegetation type. Artificially increasing the spatial grain of Holocene sampling by combining samples from multiple lake cores decreased the rate of distance decay of compositional similarity, and suggests that the Paleocene GCP flora was more heterogeneous than modern extra-tropical biomes. These results therefore illustrate the importance of both spatial grain and scale dependency in determining spatial patterns in biotic assemblages, and the potential of incorporating spatially explicit data into paleoecological studies. The Holocene data support a latitudinal gradient in spatial turnover in modern biomes, with higher levels of turnover in tropical versus extra-tropical vegetation types.

More generally, our results show that with careful analysis useful spatial data can be extracted from the fossil sporomorph record, despite low levels of taxonomic resolution imposed by pollen and spore morphology and, in the present case, the relatively low level of spatial resolution for most samples. Sporomorph data are capable of determining large-scale floral discontinuities, and taphonomic and environmental differences between samples related to lithology. Given the interest in spatial patterns and processes in modern plant communities (e.g., Hubbell 2001; Condit et al. 2002; Duque et al. 2009; Gazol and Ibáñez 2009), the fossil sporomorph record will be invaluable for assessing their maintenance and origins in deep time.

Acknowledgments

We thank C. Jaramillo and S. Punyasena for detailed reviews that greatly improved the content of this paper. H. Behling, J. C. Berrio, and F. Mayer are thanked for providing Holocene data. We thank D. Kowalski of Walnut Creek Mining Company, K. Wallis of U.S. Silica Mine, D. Dockery III, and R. Carroll for their help and logistic support during fieldwork on the U.S. Gulf Coast. T. Janus also aided with fieldwork. A. Moss provided technical and laboratory support. This research was funded by a Sylvester-Bradley award from the Palaeontological Association, a grant from the Geological Society of London Timothy Jefferson Fund, and a School Ph.D. Studentship at the University of Birmingham to P.E.J. A Hodson Award to G.J.H. is acknowledged for help with fieldwork expenses.

2010.
Multiple paleoecological controls on the composition of marine fossil assemblages from the Frasnian (Late Devonian) of Virginia, with a comparison of ordination methods.
Paleobiology36:
573–
591.

1994.
Relationships of palynofacies to coal-depositional environments in the upper Paleocene of the Gulf Coast Basin, Texas, and the Powder River Basin, Montana and Wyoming. Pp.
217–
237inTraverseA.ed.
Sedimentation of organic particles.
Cambridge University Press,
Cambridge.