Abstract

Penguins are a remarkable group of birds, with the 18 extant species living in diverse climatic zones from the tropics to Antarctica. The timing of the origin of these extant penguins remains controversial. Previous studies based on DNA sequences and fossil records have suggested widely differing times for the origin of the group. This has given rise to widely differing biogeographic narratives about their evolution. To resolve this problem, we sequenced five introns from 11 species representing all genera of living penguins. Using these data and other available DNA sequences, together with the ages of multiple penguin fossils to calibrate the molecular clock, we estimated the age of the most recent common ancestor of extant penguins to be 20.4 Myr (17.0–23.8 Myr). This time is half of the previous estimates based on molecular sequence data. Our results suggest that most of the major groups of extant penguins diverged 11–16 Ma. This overlaps with the sharp decline in Antarctic temperatures that began approximately 12 Ma, suggesting a possible relationship between climate change and penguin evolution.

1. Introduction

Penguins are a group of flightless birds living in a wide range of habitats from the tropical Galapagos Islands to the frozen Antarctic continent. Although ancient penguin fossils have been found close to the equator [1], it has been suggested that the most recent common ancestor (MRCA) of extant penguins originated in the Antarctic and later diversified out of this continent [2]. Here and throughout this paper, we refer to all extant penguins as crown penguins, which belong to the family Spheniscidae, and the term Sphenisciformes is used to denote crown plus stem groups. Palaeontological studies suggest that the ages of the oldest crown penguin fossils are less than or equal to 10 Myr [1,3–5]. Previous studies using morphological and molecular datasets suggested that the MRCA of crown penguins was approximately 16 Myr [1,5,6]. By contrast, studies based solely on molecular sequence data estimated the age of the MRCA of crown penguins to be 41–51 Myr and the ages of different crown penguin groups to be 21–38 Myr [2,7–9].

The molecular dating analyses in previous studies used only mitochondrial genes [7–9] or mitochondrial genes and a single nuclear protein-coding gene [2] to estimate the divergence times. Since mitochondrial genes are inherited as a single linked unit, the above data could strictly be regarded as a single locus or two independent loci. Furthermore, some of these studies used deep external calibrations (more than 100 Myr), which may not be appropriate for dating more recent divergence events such as the crown penguin history. To address these limitations, we sequenced five introns (belonging to four genes) from 11 penguin species and used recent crown penguin fossils to calibrate the molecular clock.

2. Material and methods

Blood samples were obtained from the 11 penguin species shown in figure 1. All samples were PCR-amplified for introns of four genes: adenylate kinase intron 5 (AK1i5), myelin proteolipid protein intron 4 (MPP4), ornithine decarboxylase introns 6 and 7 (ODC6) and ubiquitin carboxyl-terminal esterase L3 intron 5 (UCHL3). All products were sequenced in both directions, using the forward and reverse primers. The PCR and sequencing methodologies are detailed in the electronic supplementary material.

Figure 1. Bayesian chronogram showing the divergence times between major groups of penguins. Numbers are node identifiers and the details of the time estimates for each node are given in table 1. Arrows indicate calibration nodes (Myr): 1 = 0–86, 3 = 61–86, 6 = 7.6–20, 7 = 10–20 and 8 = 9.2–20 (see the electronic supplementary material).

DNA sequences from the five introns were concatenated. In addition, sequence data for the mitochondrial genes 16S rRNA, 12S rRNA, COX1 and CYTB as well as for the nuclear gene RAG from penguins, kelp gull (Larus dominicanus), peregrine falcon (Falco peregrinus) and zebra finch (Taeniopygia guttata) were obtained from GenBank (see the electronic supplementary material). For the protein-coding sequences, first and second codon positions were separated from third codon positions. We concatenated the 16S and 12S rRNA genes and combined the first plus second codon sites from the COX1 and CYTB genes. Hence our final data (9253 nucleotide sites) consist of five partitions, including separate partitions for the first plus second and for the third codon positions of the nuclear RAG gene. To avoid estimation errors caused by saturation, we did not include the third codon sites of mitochondrial genes. To determine the best model of evolution for each dataset, we used the program ModelTest [10] implemented in the software MEGA5 [11]. However, using different models of evolution produced largely similar divergence time estimates (see the electronic supplementary material).
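The codon-position partitioning described above can be illustrated with a minimal Python sketch. The function name `split_codon_positions` and the toy sequence are hypothetical and are not taken from the study; the sketch assumes an in-frame coding sequence without gaps.

```python
def split_codon_positions(cds: str) -> tuple[str, str]:
    """Split an in-frame coding sequence into (positions 1+2, position 3).

    Illustrative helper only; assumes the sequence starts at codon
    position 1 and that its length is a multiple of three.
    """
    if len(cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    # Take the first two bases of every codon.
    first_second = "".join(cds[i:i + 2] for i in range(0, len(cds), 3))
    # Take every third base, starting at index 2.
    third = cds[2::3]
    return first_second, third


# Toy example: codons ATG, GCC, AAA.
pos12, pos3 = split_codon_positions("ATGGCCAAA")
print(pos12)  # ATGCAA
print(pos3)   # GCA
```

Applied column-wise to an alignment, the same split yields one partition for first plus second positions and another for third positions, so that the fast-evolving third sites can be modelled separately or excluded.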

We used the software BEAST to estimate divergence times [12] using a relaxed (uncorrelated lognormal) molecular clock. Although all five datasets were combined to determine the tree and estimate the age of extant penguins, we unlinked the substitution and clock models (see the electronic supplementary material). To calibrate the molecular clock, we used an external calibration of 61 Myr, which is the age of the oldest stem penguin fossil (Waimanu), as the minimum age constraint for the penguins plus kelp gull node [13]. In addition, we used 86 Myr as the maximum age of the Neoaves as recommended previously [9,14]. We also used three internal calibrations using penguin fossils representing three independent splits between the major crown groups. The age of the fossil of Madrynornis mirandis (10 Myr) was used to calibrate the Eudyptes/Megadyptes split [4,5]. In addition, we used the age of the fossil of the extinct penguin, Spheniscus muizoni (9.2 Myr) to calibrate the node separating Spheniscus and Eudyptula [5,15]. Finally, we used 7.6 Myr for the age of Pygoscelis based on the fossil of Pygoscelis grandis [3]. All the above times were used as minimum constraints and 20 Myr was used as the maximum constraint for the internal calibrations. We used various calibration priors based on uniform, lognormal and normal distributions and obtained similar age estimations. The results shown in figure 1 and table 1 are based on normal calibration priors. The details of calibration and other priors used are given in the electronic supplementary material. To examine the phylogenetic relationship between penguins and outgroup species, we constructed a maximum-likelihood tree by concatenating all genes and tested the strength of statistical support for each node using a bootstrap resampling procedure (1000 replicates).
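As a rough illustration of how minimum and maximum constraints can be turned into a normal calibration prior, the Python sketch below places the mean midway between the two bounds and chooses the standard deviation so that 95% of the prior mass lies between them. This parameterisation is an assumption made for illustration; the parameters actually used are given only in the electronic supplementary material. The helper `normal_prior_for_bounds` is hypothetical, and the fossil labels in the comments are inferred by matching the minimum ages in the text to the bounds listed for figure 1.

```python
from statistics import NormalDist

# Calibration bounds in Myr (minimum, maximum) for the calibration nodes
# listed for figure 1. Fossil labels are matched by minimum age.
CALIBRATIONS = {
    1: (0.0, 86.0),   # maximum age of 86 Myr only
    3: (61.0, 86.0),  # Waimanu minimum, Neoaves maximum
    6: (7.6, 20.0),   # Pygoscelis grandis
    7: (10.0, 20.0),  # Madrynornis mirandis
    8: (9.2, 20.0),   # Spheniscus muizoni
}


def normal_prior_for_bounds(lower, upper, mass=0.95):
    """Return a normal distribution with central `mass` between the bounds.

    Assumed convention (not taken from the study): mean at the midpoint,
    standard deviation set from the matching standard-normal quantile.
    """
    mean = (lower + upper) / 2
    z = NormalDist().inv_cdf(0.5 + mass / 2)
    sd = (upper - lower) / (2 * z)
    return NormalDist(mean, sd)


for node, (lower, upper) in CALIBRATIONS.items():
    prior = normal_prior_for_bounds(lower, upper)
    print(f"node {node}: mean = {prior.mean:.2f} Myr, sd = {prior.stdev:.2f} Myr")
```

A prior built this way is a soft constraint: a small fraction of its mass falls outside the fossil-based bounds, unlike a uniform prior, which excludes ages outside them entirely.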

3. Results

The phylogenetic relationship between extant penguins and other outgroup species is shown in the Bayesian chronogram (figure 1). The topology of the maximum-likelihood tree was identical to that shown in figure 1. The topology of this tree is slightly different from that identified in previous studies [2,5]. While our tree identified a separate clade for the penguin genera Aptenodytes and Pygoscelis, previous studies suggested that Aptenodytes was basal and first branched off from remaining penguin species, followed later by Pygoscelis. Although the Bayesian posterior probability for the Aptenodytes plus Pygoscelis node is high (0.99), the ML-based bootstrap support is low (66%) (table 1). Furthermore, a recent study using molecular and morphology data found a polytomy for Antarctic penguins [16], supporting the uncertainty observed in this study.

Our Bayesian analyses based on the internal and external calibrations indicate a time of 20.4 Myr (highest posterior density (HPD) 17.0–23.8) for the age of the MRCA of all extant penguins (figure 1 and table 1). Any difference in the rate of evolution between penguin and non-penguin outgroups might affect the age estimation. Therefore, we estimated the age of crown penguins after excluding all outgroup species and using only the penguin sequences and three internal fossil calibrations (nodes 6, 7 and 8). The age obtained by this method was 19.0 Myr (HPD 16.0–22.1). We also estimated the age of crown penguins using only deeper external calibrations (nodes 1 and 3) and this indicated a time of 22.3 Myr (HPD 16.0–29.1). All these estimates are similar, suggesting the robustness of our results. All the divergence times of major crown penguin clades estimated in this study ranged between 11 and 16 Myr (table 1).
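One simple way to check that the three estimates agree is to see whether their HPD intervals share a common range. The Python sketch below does this with the values reported above; the function `common_interval` is a hypothetical helper, and interval overlap is only an informal consistency check, not a formal statistical test.

```python
# (point estimate, HPD lower, HPD upper) in Myr, as reported in the text.
ESTIMATES = {
    "internal + external calibrations": (20.4, 17.0, 23.8),
    "internal calibrations only": (19.0, 16.0, 22.1),
    "external calibrations only": (22.3, 16.0, 29.1),
}


def common_interval(intervals):
    """Return the range shared by all (lower, upper) intervals, or None."""
    lows, highs = zip(*intervals)
    low, high = max(lows), min(highs)
    return (low, high) if low <= high else None


shared = common_interval([(lo, hi) for _, lo, hi in ESTIMATES.values()])
print(shared)  # (17.0, 22.1)
```

All three intervals share the range 17.0–22.1 Myr, which matches the statement that the estimates are similar.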

4. Discussion

The age of the ancestor of all living penguins estimated in this study (20.4 Myr, 17.0–23.8 Myr) is much more recent than the estimates from previous molecular studies. Using molecular sequence data from all extant penguins, Baker et al. [2] estimated the MRCA of penguins to be 40.5 Myr (34.2–47.6 Myr). Another study estimated the divergence time between Aptenodytes and Eudyptes plus Eudyptula (which corresponds to node 4 in figure 1) to be 51 Myr (31–71 Myr) [7]. Hence the age of the MRCA of crown penguins reported in this study is about half, or less than half, of the previous estimates. Similarly, the divergence times estimated in this study for the splits between major penguin groups were much more recent than those of previous studies [2,7–9]. For example, the divergence times reported in earlier studies for the Eudyptes–Eudyptula split (21–35 Myr) were up to twice as high as our estimate (15.8 Myr). Furthermore, previous studies reported much older estimates (32–38 Myr) compared with ours (20 Myr) for the separation of the Pygoscelis lineage from the Eudyptes plus Eudyptula clade [2,8]. On the other hand, our divergence time estimates are supported by previous studies using morphological characters, which estimated the age of Spheniscidae to be approximately 16 Myr [1,5,6]. The available fossil record suggests that the minimum age for crown penguins ranges between 7 and 10 Myr [3–5,15].
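The comparison with earlier molecular estimates can be made explicit with a short calculation. The Python sketch below uses only the point estimates quoted above; the function `age_ratio` is a hypothetical helper introduced for illustration.

```python
THIS_STUDY_MRCA = 20.4  # Myr, crown penguin MRCA estimated in this study

# Point estimates (Myr) from the previous molecular studies cited above.
PREVIOUS_ESTIMATES = {
    "Baker et al. [2], MRCA of penguins": 40.5,
    "ref. [7], Aptenodytes vs Eudyptes plus Eudyptula": 51.0,
}


def age_ratio(new_age, old_age):
    """Ratio of a new age estimate to an earlier one."""
    return new_age / old_age


for label, age in PREVIOUS_ESTIMATES.items():
    print(f"{label}: ratio = {age_ratio(THIS_STUDY_MRCA, age):.2f}")
```

The ratios come out at roughly 0.50 and 0.40, respectively.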

The discrepancy between our estimates and those of the previous studies could be due to the difference in the calibration methods used. Since rates of evolution vary significantly between different avian lineages [17], calibrations based on deep divergences are not appropriate to estimate divergence times of shallow branches or specific groups. By contrast, a calibration using penguin-specific fossils should capture the lineage-specific evolutionary rate, because the physiology and life-history traits are typically similar in species within the same clade. Furthermore, this discrepancy might be owing to the type of datasets used. Unlike previous studies that used one or two loci, our data come from six independent loci.

In this study, we used additional data from multiple nuclear loci together with penguin-specific calibrations to estimate the time of origin of crown penguins. Our results suggest a Miocene origin of extant penguins, which is consistent with previous palaeontological and morphological studies. Furthermore, our results show that the major penguin lineages diverged around 11–16 Ma. Interestingly, around 12 Ma, Antarctica experienced a sharp decline in temperatures that resulted in a permanent ice cover over the continent [1,18]. Since the temperature in Antarctica continued to fall steeply after 12 Ma [1,18], this might have played a role in the diversification of penguin lineages.

Funding statement

We thank the Australia India Strategic Research Fund, Massey University and the Australian Research Council for financial support.

Acknowledgements

We are grateful to Alan Baker and Oliver Haddrath for providing penguin samples.