The ability to recognize individual animals has substantially increased our knowledge of the biology and behavior of many taxa. However, not all species lend themselves to this approach, either because of insufficient phenotypic variation or because tag attachment is not feasible. The use of genetic markers ('tags') represents a viable alternative to traditional methods of individual recognition, as they are permanent and exist in all individuals. We tested the use of genetic markers as the primary means of identifying individuals in a study of humpback whales in the North Atlantic Ocean. Analysis of six microsatellite loci (2, 3) among 3,060 skin samples collected throughout this ocean allowed the unequivocal identification of individuals. Analysis of 692 'recaptures', identified by their genotype, revealed individual local and migratory movements of up to 10,000 km, limited exchange among summer feeding grounds, and mixing in winter breeding areas, and also allowed the first estimates of animal abundance based solely on genotypic data. Our study demonstrates that genetic tagging is not only feasible, but generates data (for example, on sex) that can be valuable when interpreting the results of tagging experiments.

Skin biopsy (4) or sloughed skin (5) samples from free-ranging humpback whales (Megaptera novaeangliae) were collected across the North Atlantic between 1988 and 1995. Total-cell DNA was extracted (6), and the sex (7) and genotype at six mendelian inherited (8) microsatellite loci (9) were determined for each sample. From the 3,060 samples analysed, we detected 2,368 unique genotypes. The expected number of samples collected from different individuals with identical genotype arising by chance was estimated at less than one (see Methods). Because of this, and the fact that all samples with identical genotypes were of consistent sex, we believe that the 3,060 samples represented 2,368 individual whales. Of the 692 recaptures observed during the study, 216 occurred on the summer feeding grounds (Fig 1). Of these, 96% (n=207) occurred within the same feeding area (Fig 1), confirming previous behavioural (10, 11) and genetic (12, 13) observations of maternally directed fidelity to specific feeding grounds. The remaining 4% (n=9) of the recaptures on the summer feeding grounds were detected in different but adjacent feeding grounds (Fig 1). However, significantly more recaptures were detected within these sampling areas than would be expected if the areas constituted one intermixing feeding aggregation, which is consistent with the notion of maternally directed site fidelity (see Methods).

Of the 114 individuals recorded on both summer feeding and winter breeding grounds (Fig 1), two had migrated from the West Indies to either Jan Mayen or Bear Island (in the Barents Sea). These genetic recaptures represent the most extensive one-way movements (6,435 and 7,940 km, respectively) recorded in this study, and support recent findings (13, 14) that whales feeding in the Barents Sea share a common breeding ground with other North Atlantic humpback whales. In three other individuals, which were each sampled on three occasions, movements were documented from a feeding ground to the breeding range and back, involving minimum migration distances of up to 10,000 km between the first and last sampling event. No feeding ground was disproportionately represented among the recaptures in the West Indies (Fig 1), supporting the current view that humpback whales in the North Atlantic constitute a single panmictic population (13, 15, 16) (G-test, G = 4.68, P < 0.46).

As with traditional identification methods (1), microsatellite data lend themselves to abundance estimation using mark-recapture statistical methods (17), although to our knowledge this has not previously been attempted. Using breeding-ground samples collected during 1992 and 1993, we estimated the North Atlantic humpback whale population at 4,894 (95% confidence interval, 3,374-7,123) males and 2,804 (95% confidence interval, 1,776-4,463) females. This total of 7,698 whales is substantially (albeit not significantly) higher than the most recent previous photographic-based estimate of 5,505 (ref. 10) (95% confidence interval, 2,888-8,122). Preliminary results from new and more reliable photographic estimates are also larger than previous estimates (T.S. et al., manuscript in preparation), a difference that could partly be due to population growth during the decade since the previous estimate (18). The significantly different estimates for males and females are unexpected given the even sex ratio observed on the feeding grounds (19) (Table 1) and among 198 calves that we sampled in the breeding range (data not shown). The estimates are independent of between-sex sampling biases, and so the observed deficit of females probably reflects within-sex behavioural differences, for example that individual females display a higher degree of preference with respect to region and/or residence time in the breeding range than do males.

Our results demonstrate that genetic tagging is effective even in a large population of wide-ranging and inaccessible mammals such as cetaceans. Further, the data obtained from genetic tags can be used to address evolutionary (20), demographic (19) and behavioural (21) questions to which traditional tagging methods are unsuited. Because all eukaryotes possess microsatellites (22), individuals within any taxon can in principle be identified reliably from minute quantities of tissue. Such tissue is commonly derived from biopsies (23), but can also come from sloughed skin (5), shed hair (24) or fecal material (25, 26), thus potentially allowing genotyping and individual recognition even of unobserved animals. However, the validity of a genetic tag depends on there being a sufficiently low probability of identity (27). An underestimate of this probability can be caused by unrecognized population substructure or linkage disequilibrium among loci. Additional analyses addressing these issues, as well as checks with other data (such as the sex of recaptures, as in this study), must be performed to ensure the validity of the overall probability of identity. Similarly, a genetic tag consists of data from multiple loci, each of which is prone to handling errors in the laboratory. With proper laboratory procedures, such errors can be rendered minimal (here estimated at 0.0011 per locus), and so do not represent a serious obstacle to the detection of recaptures. Recaptures with an erroneous genotype will almost certainly be among the samples that match at all but one locus. If a sufficient number of loci has been analysed, none or few samples are expected to match at all but one locus, and so re-analysis of the discrepant locus presents only a minor additional effort.

Methods

Expected number of samples with identical genotypes. The number of samples from different individuals with identical genotypes (across all loci) arising by chance was estimated from the probability of identity (I) (27). The estimate did not differ whether I was estimated from all samples or from unique genotypes only. The expected number of samples from different individuals with identical genotypes arising by chance was first estimated separately within each sampling area, and subsequently between sampling areas (after removal of duplicate genotypes within each sampling area). The expected total number of such matches was estimated to be 0.32 and 0.27 within and between areas, respectively. No significant degree of linkage disequilibrium (which could cause an underestimate of I), nor any significant deviations from the expected Hardy-Weinberg proportions of genotypes, was observed after the removal of duplicate genotypes.
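As a rough illustration of this kind of calculation, the sketch below computes a per-locus probability of identity under Hardy-Weinberg proportions, multiplies it across loci (assuming the loci are independent), and converts the result into an expected number of chance matches among n sampled individuals. The per-locus formula used here, the sum of squared expected genotype frequencies, is a commonly used form and is an assumption; the text does not spell out the expression taken from ref. 27. The allele frequencies and the sample size are hypothetical placeholders, not the study's data.

```python
from itertools import combinations


def locus_identity(freqs):
    """Probability that two unrelated individuals share a genotype at one locus.

    Assumes Hardy-Weinberg proportions: sum of squared genotype frequencies.
    """
    homozygotes = sum(p ** 4 for p in freqs)
    heterozygotes = sum((2 * p * q) ** 2 for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


def multilocus_identity(loci):
    """Product of per-locus identities (assumes no linkage disequilibrium)."""
    prob = 1.0
    for freqs in loci:
        prob *= locus_identity(freqs)
    return prob


def expected_chance_matches(n_individuals, identity):
    """Expected number of pairs of distinct individuals with identical genotypes."""
    pairs = n_individuals * (n_individuals - 1) / 2
    return pairs * identity


# Hypothetical example: six loci, each with four equally frequent alleles.
hypothetical_loci = [[0.25, 0.25, 0.25, 0.25]] * 6
identity = multilocus_identity(hypothetical_loci)
print(identity, expected_chance_matches(500, identity))
```

Real microsatellite loci typically have more alleles at uneven frequencies, so the per-locus values would come from the observed allele counts in each sampling area.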

Expected number of matches between the Gulf of St Lawrence and Newfoundland/Labrador. The probability of observing six recaptures between years in the Gulf of St Lawrence was assessed by Monte Carlo simulations, under the assumption that the Gulf of St Lawrence and Newfoundland/Labrador constitute one intermixing feeding aggregation. Several tests, each of 1,000 simulations, were conducted over a range of the most likely abundance estimates derived from the data. Each simulation was conditioned on the number of recaptures observed in the sample and the order of sampling in the two areas. The probability of observing six or more individuals sampled more than once in the Gulf of St Lawrence (if part of the same feeding ground as Newfoundland/Labrador) was estimated at less than 0.0001.
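The logic of such a test can be sketched in simplified form. The code below draws each year's Gulf sample at random from a single mixed aggregation and counts how often at least the observed number of individuals is sampled more than once. It is a minimal sketch only: it does not reproduce the conditioning on observed recaptures and sampling order described above, and the population size and yearly sample sizes are hypothetical values chosen for illustration.

```python
import random


def simulate_gulf_resightings(pop_size, gulf_samples, rng):
    """Count individuals drawn in more than one Gulf sampling year.

    Each year's sample is drawn without replacement from the whole mixed pool.
    """
    times_sampled = {}
    for n in gulf_samples:
        for whale in rng.sample(range(pop_size), n):
            times_sampled[whale] = times_sampled.get(whale, 0) + 1
    return sum(1 for count in times_sampled.values() if count > 1)


def tail_probability(pop_size, gulf_samples, observed, n_sims, seed=1):
    """Fraction of simulations with at least `observed` resampled individuals."""
    rng = random.Random(seed)
    hits = sum(
        simulate_gulf_resightings(pop_size, gulf_samples, rng) >= observed
        for _ in range(n_sims)
    )
    return hits / n_sims


# Hypothetical inputs: 2,000 whales in the mixed pool, three Gulf sampling years.
p = tail_probability(pop_size=2000, gulf_samples=[20, 25, 30], observed=6, n_sims=1000)
print(p)
```

Repeating the call over a range of plausible `pop_size` values mirrors the idea of running several tests across the most likely abundance estimates.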

Estimation of abundance and log-likelihood ratio test of equal numbers of males and females on the breeding range. Estimates of abundance and 95% confidence intervals were calculated as suggested previously (17). In 1992, 382 males and 231 females were sampled on the breeding range; the corresponding numbers for 1993 were 408 and 265. Between the two years we observed 31 and 21 recaptures of males and females, respectively. The statistical significance of the difference in the estimated number of males and females was assessed with a likelihood-ratio test of the null hypothesis that the number of males was equal to the number of females, assuming that the number of recaptures is hypergeometrically distributed. Asymptotically the test statistic (-2 ln(Λ), where Λ is the likelihood ratio) is chi-squared distributed with one degree of freedom, which implies that the probability of -2 ln(Λ) exceeding 4.14 is 0.042. Monte Carlo simulations (10,000 replicates) confirmed this probability.
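The arithmetic behind these figures can be sketched as follows. The text says only that the estimates were calculated "as suggested previously (17)", so the use of Chapman's bias-corrected form of the Petersen two-sample estimator here is an assumption; with the sample sizes reported above it gives values that round to 4,894 males and 2,804 females. The likelihood-ratio part is likewise a minimal sketch: it maximises a hypergeometric likelihood over integer population sizes by grid search, and the search limit `pop_max` is an arbitrary choice, not a value from the study.

```python
from math import erfc, lgamma, sqrt


def chapman_estimate(n1, n2, m):
    """Chapman's bias-corrected Petersen estimator (assumed, see lead-in)."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log_lik(pop, n1, n2, m):
    """Hypergeometric log-probability of m recaptures for population size pop."""
    return log_choose(n1, m) + log_choose(pop - n1, n2 - m) - log_choose(pop, n2)


def max_log_lik(groups, pop_max=20000):
    """Maximise the summed log-likelihood over one population size shared by all groups."""
    pop_min = max(n1 + n2 - m for n1, n2, m in groups)
    return max(
        sum(log_lik(pop, n1, n2, m) for n1, n2, m in groups)
        for pop in range(pop_min, pop_max + 1)
    )


def chi2_tail_1df(x):
    """Upper-tail probability of a chi-squared variable with one degree of freedom."""
    return erfc(sqrt(x / 2))


# (sampled in 1992, sampled in 1993, recaptured between years)
males = (382, 408, 31)
females = (231, 265, 21)

male_estimate = chapman_estimate(*males)      # rounds to 4894
female_estimate = chapman_estimate(*females)  # rounds to 2804

# Separate population sizes versus one size shared by both sexes (null hypothesis).
ll_separate = max_log_lik([males]) + max_log_lik([females])
ll_equal = max_log_lik([males, females])
lr_statistic = 2 * (ll_separate - ll_equal)

print(round(male_estimate), round(female_estimate))
print(lr_statistic, chi2_tail_1df(lr_statistic))
print(chi2_tail_1df(4.14))  # tail probability for the statistic reported in the text
```

The computed statistic can be compared with the reported value of 4.14; small differences would be expected if the original analysis parameterised the likelihood differently.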

Estimation of error rate. The error rate was estimated from the samples with identical genotypes at five but not six loci. From 2,368 genotypes just 64 such pairs (in all 117 unique genotypes) were detected, in 19 of which the individual from which the sample was collected had been identified by its natural markings (28). Of these 19 instances, just 3 were from the same individual. Hence the 117 samples were estimated to include 9 samples with an incorrect genotype, which equals an error rate of 0.0011 per locus after inclusion of the 2 times 692 samples which match on all loci (presumably determined correctly). The upper limit of the 95% confidence interval was estimated at 0.0027 from Monte Carlo simulations, assuming a binomial distribution of errors and a hypergeometric distribution of photographed whales.

Acknowledgements. Most samples were collected during the international collaborative project Year of the North Atlantic Humpback Whale (YONAH). We thank T.H. Andersen, P. Arctander, C. Berchok, L. Bonnelly, M. Fredholm, J. Jensen, K.B. Pedensen, P. Raahauge, J. Robbins, O. Vasquez and E. Widen for their support and assistance. Funds were obtained from the Commission for Scientific Research in Greenland, the Greenland Home Rule, the EU Biotechnology Program, the Danish and Norwegian Research Councils, the US National Marine Fisheries Service, the US National Fish and Wildlife Foundation, the Department of Fisheries and Oceans, the International Whaling Commission, the US State Department, the Aage V. Jensen Charity Foundation, the Dorr Foundation, the American-Scandinavian Foundation, the Exxon Corporation, and Feodor Pitcairn.