Abstract

The human mutation rate is an essential parameter for studying the evolution of our species, interpreting present-day genetic variation, and understanding the incidence of genetic disease. Nevertheless, our current estimates of the rate are uncertain. Most notably, recent approaches based on counting de novo mutations in family pedigrees have yielded significantly smaller values than classical methods based on sequence divergence. Here, we propose a new method that uses the fine-scale human recombination map to calibrate the rate of accumulation of mutations. By comparing local heterozygosity levels in diploid genomes to the genetic distance scale over which these levels change, we are able to estimate a long-term mutation rate averaged over hundreds or thousands of generations. We infer a rate of (1.61 ± 0.13) × 10⁻⁸ mutations per base per generation, which falls in between phylogenetic and pedigree-based estimates, and we suggest possible mechanisms to reconcile our estimate with previous studies. Our results support intermediate-age divergences among human populations and between humans and other great apes.

(A) Ancestral recombinations separate chromosomes into blocks of piecewise-constant TMRCA (and hence expected heterozygosity). (B) From the data, we measure local heterozygosity as a function of genetic distance; red and blue circles represent heterozygous and homozygous sites, respectively, along a diploid genome. (C) Our statistic H_S(d) is an average heterozygosity as a function of genetic distance over many starting points with similar local heterozygosities, yielding a smooth relaxation toward the genome-wide average.

(A) Overview: from the data, we compute both the statistic H_S(d) and other parameters necessary to create matching calibration curves with known values of μ. (B) Details of capturing aspects of the real data for the calibration data. (C) Computation of H_S(d): the statistic captures the average heterozygosity as a function of genetic distance d from a starting point with heterozygosity in a defined range S, averaged over many such points. (D) For the final inferred value of μ, we compare matched H_S(d) curves for the real data and calibration data (with known values of μ).
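
The computation described in panel (C) can be illustrated with a minimal sketch. It is a simplified illustration under stated assumptions, not the pipeline used for the reported estimates: the genome is assumed to be pre-summarized into bins with a known genetic position and a local heterozygosity value, and the function name `hs_curve`, its parameters, and the "first bin at or beyond the target position" lookup rule are all hypothetical choices made for this example.

```python
from bisect import bisect_left


def hs_curve(gen_pos, het, s_range, distances):
    """Average heterozygosity at genetic distance d from starting bins in S.

    All inputs are hypothetical simplifications for illustration:
    gen_pos   -- sorted genetic positions of the bins (e.g. in cM)
    het       -- local heterozygosity of each bin, same length as gen_pos
    s_range   -- (low, high) bounds defining the starting-point range S
    distances -- genetic distances d at which to evaluate the curve
    """
    low, high = s_range
    # Starting points: bins whose local heterozygosity falls inside S.
    starts = [i for i, h in enumerate(het) if low <= h <= high]

    curve = []
    for d in distances:
        total, count = 0.0, 0
        for i in starts:
            # Assumed lookup rule: first bin at or beyond gen_pos[i] + d.
            j = bisect_left(gen_pos, gen_pos[i] + d)
            if j < len(gen_pos):  # skip starts that run off the chromosome
                total += het[j]
                count += 1
        # Average over all usable starting points; NaN if there are none.
        curve.append(total / count if count else float("nan"))
    return curve


if __name__ == "__main__":
    # Toy data: a low-heterozygosity block followed by a higher one.
    positions = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25]
    local_het = [1.0, 1.0, 2.0, 4.0, 4.0, 4.0]
    print(hs_curve(positions, local_het, (0.5, 1.5), [0.0, 0.25, 0.5]))
```

In this toy input the starting points are chosen from the low-heterozygosity block, so the averaged value rises with d as the lookups cross into the neighboring block. This mirrors qualitatively how the statistic moves away from the heterozygosity of the starting range as genetic distance increases.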

(A) All eight individuals together; the inferred rate is μ = (1.61 ± 0.13) × 10⁻⁸ per generation. (B) Results for the four Europeans; the inferred rate is μ = (1.72 ± 0.14) × 10⁻⁸. (C) Results for the four East Asians; the inferred rate is μ = (1.55 ± 0.14) × 10⁻⁸. For all real-data results, the curves displayed are for representative calibrations matching the overall means. The reported values are also corrected for gene conversion, genotype error, and base content, which explains the apparent discrepancy between the final estimates and the curves (for example, the estimate (A) is corrected from a raw value of 2.00 × 10⁻⁸).