Genetic diversity in cocoa (Theobroma cacao L.) has been assessed based on morphological and molecular markers for germplasm management and breeding purposes. Pedigree data are available in cocoa but have not been used for assessing genetic relatedness. The genetic diversity of 30 clonal cocoa accessions of the CEPEC series, resistant to witches' broom disease, was studied on the basis of RAPD data and pedigree information. Twenty of these accessions descend from the TSA-644 clone, which originated from a cross between the Upper Amazon accession Scavina-6, the main source of resistance to witches' broom disease, and IMC-67. The ten remaining clones come from different sources, including Amazon and Trinitario germplasm. RAPD data were collected using 16 primers, and pedigree information was obtained from the International Cocoa Germplasm Database. Genetic similarities, genetic distances and coefficients of parentage were calculated using available software. Relatively low genetic diversity was observed in this germplasm set, probably because of the close genetic relatedness among the accessions studied and the poor representation of the germplasm. The TSA-644 descendants were more diverse than the other accessions used in the study. This might be due to the origin of the TSA clone, which was derived from highly divergent genotypes. The association between genetic similarities based on RAPD data and the coefficient of parentage based on pedigree data was very low, probably due to the homogeneity of the breeding stocks and poor pedigree information. These findings are useful to cocoa breeders in planning crosses for the development of hybrid and clonal cultivars.

Cocoa (Theobroma cacao L.) genotypes resistant to witches' broom (WB) disease, caused by the fungus Crinipellis perniciosa, have been identified in many cocoa-producing countries in South and Central America (LAKER et al., 1988; PIRES et al., 1999). The Scavina accessions, collected in the Upper Amazon in the 1930s, are the main source of resistance (SORIA, 1977; KENNEDY et al., 1987). These accessions, especially Scavina-6 and Scavina-12, transmit to their descendants not only resistance but also high hybrid vigor and good yield performance, and they are widely used in producing countries affected by the disease. Many other resistant accessions were identified in Brazil, at the Centro de Pesquisas do Cacau (CEPEC) in the state of Bahia and at the Estação de Recursos Genéticos José Haroldo (ERJOH) in the state of Pará, after field evaluation of a germplasm collection of more than 1000 accessions at each location (MARITA et al., 2001; FALEIRO et al., 2004c; PIRES, 2003). Scavina-6, Scavina-12, and other accessions identified in Brazil as resistant are being crossed in different breeding schemes to combine alleles from different sources of resistance in order to develop new resistant clonal and hybrid varieties.

Accurate estimates of genetic variability levels among germplasm accessions can increase the breeding efficiency of crop species (BARRET et al., 1998). Measurements of genetic variability among and within cocoa populations have been derived from phenotypic and molecular data (BEKELE and BEKELE, 1996; MARITA et al., 2000; FALEIRO et al., 2004a). Phenotypic data have been instrumental in the classification of the major germplasm groups in cocoa, but they are less useful for separating accessions within groups (CHARTERS and WILKINSON, 2000). Pedigree data are used to calculate the coefficient of parentage (KEMPTHORNE, 1969), an indirect measure of genetic diversity among accessions that estimates the probability that alleles of two individuals at a locus are identical by descent (BARRET et al., 1998). Pedigree data are available for some cocoa germplasm and cultivars but have not been commonly used for genetic diversity estimates.
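The identity-by-descent definition above can be illustrated with a minimal sketch of the standard recursive kinship calculation. The pedigree below is hypothetical apart from the Sca-6 x IMC-67 cross that gave TSA-644; the names SIC-X, CLONE-A and CLONE-B are invented for illustration. The sketch also assumes unrelated, non-inbred founders with both parents either known or unknown. Other conventions exist (for example, treating lines as fully homozygous gives a self-coefficient of 1), so this is not necessarily the calculation performed in the study.

```python
from functools import lru_cache

# Hypothetical pedigree: accession -> (parent 1, parent 2).
# Founders have (None, None). Only Sca-6 x IMC-67 -> TSA-644 comes from the text;
# SIC-X, CLONE-A and CLONE-B are made-up names for illustration.
PEDIGREE = {
    "Sca-6": (None, None),
    "IMC-67": (None, None),
    "SIC-X": (None, None),
    "TSA-644": ("Sca-6", "IMC-67"),
    "CLONE-A": ("TSA-644", "SIC-X"),
    "CLONE-B": ("TSA-644", "SIC-X"),
}


def generation(name):
    """Length of the longest path from a founder down to this accession."""
    p1, p2 = PEDIGREE[name]
    if p1 is None:
        return 0
    return 1 + max(generation(p1), generation(p2))


@lru_cache(maxsize=None)
def cop(x, y):
    """Probability that random alleles drawn from x and y are identical by descent."""
    if x == y:
        p1, p2 = PEDIGREE[x]
        if p1 is None:
            return 0.5  # assumption: founders are not inbred
        return 0.5 * (1.0 + cop(p1, p2))
    # Recurse through the younger accession so we never step past a common ancestor.
    if generation(x) < generation(y):
        x, y = y, x
    p1, p2 = PEDIGREE[x]
    if p1 is None:
        return 0.0  # assumption: distinct founders are unrelated
    return 0.5 * (cop(p1, y) + cop(p2, y))


print(cop("TSA-644", "Sca-6"))    # parent and offspring
print(cop("CLONE-A", "CLONE-B"))  # full sibs
```

Under these assumptions, a parent and its offspring, and two full sibs, both receive a coefficient of 0.25, while unrelated founders receive 0.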

Molecular markers such as Random Amplified Polymorphic DNA (RAPD) and microsatellites are currently the most widely used markers for assessing genetic diversity in cocoa (MARITA et al., 2001; FALEIRO et al., 2002b; FALEIRO et al., 2004a; 2004b). Using RAPD, MARITA et al. (2001) studied the genetic diversity of a representative sample composed of 280 accessions, providing an overall portrait of the relatedness of the major cocoa germplasm groups available in the world. RAPD has also been used in genetic diversity studies of clones selected for WB resistance (FALEIRO et al., 2004a), in the identification of mislabeling and accession duplication (FALEIRO et al., 2002c), and in the development of genetic linkage maps (FALEIRO et al., 2004b; QUEIRÓZ et al., 2003).

The objectives of this study were: (i) to estimate the genetic similarity (GS) of 30 cocoa accessions of the CEPEC series based on RAPD markers; (ii) to estimate the coefficient of parentage (COP) of these accessions; (iii) to examine the agreement between the RAPD-based GS and COP; and (iv) to determine the genetic similarities between the TSA-644 derived accessions and non-derived ones.

2. MATERIAL AND METHODS

Genetic material

Out of 270 cocoa clonal accessions of the CEPEC series, 30 were selected for this study (Table 1). These accessions are very important for CEPEC's breeding program because of their resistance to witches' broom. Twenty of these accessions (DS group) have in their pedigree the TSA-644 (Trinidad Selection Amazon) clone, which originated from the cross between Sca-6 (Scavina) and IMC-67 (Iquitos Mixed Calabacillo), both from the Upper Amazon or Forastero group. The ten remaining accessions (ND group) come from different sources, including: IMC and Pa (Parinary) in the Upper Amazon; EEG (Estação Experimental de Goitacazes), SIC (Seleção Instituto do Cacau), SIAL (Seleção Instituto Agronômico do Leste), and CAS (Campo Agrícola de Santarém) in the Brazilian Amazon River Basin; EET (Estación Experimental Tropical) in Ecuador, all of them belonging to the Forastero group; and UF (United Fruit) and ICS (Imperial College Selection) in Central America, belonging to the Trinitario group, which originated from crosses between Forasteros and Criollos in Central America. These 30 accessions are held in CEPEC's germplasm collection, in rows of 10 trees, and have been evaluated for resistance to witches' broom in the field for more than 10 years.

DNA extraction and RAPD reactions

Young individual leaves of adult plants were collected in plastic bags, tagged and stored at 4ºC for DNA extraction following a modified CTAB protocol (DOYLE and DOYLE, 1990) optimized by FALEIRO et al. (2002a). DNA quality and integrity were determined by running a small aliquot in a 0.8% agarose gel, stained with ethidium bromide and visualized under ultraviolet (UV) light. Each RAPD reaction was done using 30 ng of DNA in a final volume of 25 µL, according to FALEIRO et al. (2002b). PCR reactions were carried out in a Primus 96 plus (MWG AG BIOTECH) thermocycler.

Data collection and analysis

The RAPD markers generated by each primer over the 30 CEPEC accessions were scored using 1 to indicate presence and 0 to indicate absence of bands. Genetic distance was calculated for each pairwise combination of the 30 accessions using the RAPD marker data, excluding monomorphic bands from the analysis. The pedigree of each accession was traced back to its ancestors using cocoa breeders' information and data from the International Cocoa Germplasm Database.
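The 0/1 scoring and the exclusion of monomorphic bands can be sketched as follows, using the complement of the Jaccard coefficient as the pairwise distance. The three band profiles are invented for illustration and are not data from this study. Returning 0 for two profiles that both lack every band is an assumption of this sketch.

```python
def drop_monomorphic(profiles):
    """Keep only band positions whose score differs among accessions."""
    n_bands = len(next(iter(profiles.values())))
    keep = [j for j in range(n_bands)
            if len({p[j] for p in profiles.values()}) > 1]
    return {name: [p[j] for j in keep] for name, p in profiles.items()}


def jaccard_dissimilarity(a, b):
    """1 - (bands present in both) / (bands present in at least one)."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    if either == 0:
        return 0.0  # assumption: two empty profiles are treated as identical
    return 1.0 - shared / either


# Invented profiles: 1 = band present, 0 = band absent.
profiles = {
    "acc1": [1, 1, 0, 1, 0, 1],
    "acc2": [1, 0, 0, 1, 1, 1],
    "acc3": [1, 1, 1, 0, 0, 1],
}

polymorphic = drop_monomorphic(profiles)  # first and last bands are dropped
names = sorted(polymorphic)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(a, b, round(jaccard_dissimilarity(polymorphic[a], polymorphic[b]), 3))
```

Shared absences do not enter the Jaccard coefficient, which is a common reason for preferring it with dominant markers such as RAPD, where an absent band is less informative than a present one.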

The coefficient of parentage (COP) was estimated using SAS (SAS INSTITUTE, 1988). The genetic dissimilarities based on RAPD data were estimated as the arithmetical complement of the Jaccard similarity coefficient in the Genes software (CRUZ, 1997). NTSYSpc software (ROHLF, 1993) was used to calculate correlations between distance matrices with the MXCOMP function, and the significance of the correlation was tested with a Mantel test of 1000 permutations, according to DeHAAN et al. (2003). A three-dimensional MDS plot was constructed from the similarity estimates using principal coordinates, calculated with the SAS program (SAS INSTITUTE, 1988). RAPD marker and pedigree analyses were used to produce frequency distributions of the genetic diversity estimates; the coefficient of variation of the genetic similarities was determined and plotted against sample size to obtain the number of bands required for reliable and accurate analysis of genetic diversity (LERCETEAU et al., 1997).
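A matrix comparison with a permutation test of the kind described above can be sketched generically as follows. This is not the NTSYSpc MXCOMP implementation: it assumes symmetric matrices, a Pearson correlation on the upper triangles, and a one-sided p-value. The 4 x 4 matrix is invented for illustration.

```python
import random
from math import sqrt


def upper_triangle(m, order=None):
    """Flatten the upper triangle, optionally after relabelling rows and columns."""
    n = len(m)
    idx = order if order is not None else list(range(n))
    return [m[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(m1, m2, permutations=1000, seed=0):
    """Matrix correlation and one-sided permutation p-value (symmetric inputs assumed)."""
    rng = random.Random(seed)
    x = upper_triangle(m1)
    observed = pearson(x, upper_triangle(m2))
    n = len(m2)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)  # permute the accession labels of the second matrix
        if pearson(x, upper_triangle(m2, order)) >= observed:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (hits + 1) / (permutations + 1)


# Invented symmetric distance matrix, compared with itself.
toy = [
    [0.0, 0.2, 0.5, 0.7],
    [0.2, 0.0, 0.4, 0.6],
    [0.5, 0.4, 0.0, 0.3],
    [0.7, 0.6, 0.3, 0.0],
]
r, p = mantel(toy, toy)
print(round(r, 3), p)
```

Rows and columns are permuted together because the pairwise values in a distance matrix are not independent observations, so the usual significance test for a correlation coefficient does not apply directly.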

3. RESULTS

Of the 59 primers screened, 16 were polymorphic and were used to collect the RAPD data. These primers produced 91 distinct fragments, an average of 5.7 fragments per primer (Table 2). The overall average number of fragments per primer obtained in this study was higher than that reported by FIGUEIRA et al. (1994) and MARITA et al. (2001) but lower than that obtained by YAMADA et al. (2002) with another set of accessions of the CEPEC series.

Histograms of the COP estimates based on pedigree data (Figure 1A) and of the genetic similarity estimates based on RAPD data (Figure 1B), generated from the 435 pairwise combinations of the 30 cocoa accessions, revealed differences between RAPD and pedigree data. The genetic similarity estimates based on RAPD markers ranged from 0.18 to 0.76, with an average of 0.45, a standard deviation of 0.10 and a coefficient of variation of 0.22. The COP estimates ranged from 0.03 to 1.00, with an average of 0.44, a standard deviation of 0.41 and a coefficient of variation of 0.96.
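As a quick arithmetic check of the figures above, the number of pairwise combinations and the coefficients of variation can be recomputed from the rounded summary values given in the text. Recomputing from rounded values gives about 0.93 for COP rather than the reported 0.96; this small difference may reflect rounding of the published mean and standard deviation, which is an assumption, since the raw estimates are not available here.

```python
from math import comb

n_accessions = 30
n_pairs = comb(n_accessions, 2)  # n * (n - 1) / 2 unordered pairs


def coefficient_of_variation(mean, sd):
    """Standard deviation expressed as a fraction of the mean."""
    return sd / mean


cv_gs = coefficient_of_variation(mean=0.45, sd=0.10)   # RAPD-based similarity
cv_cop = coefficient_of_variation(mean=0.44, sd=0.41)  # coefficient of parentage

print(n_pairs, round(cv_gs, 2), round(cv_cop, 2))
```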

Results of the Spearman's correlation analysis showed a positive correlation between GS and COP estimates (rS = 0.21, P < 0.001), indicating an association between these two coefficients of relatedness for this set of germplasm.
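For reference, a Spearman correlation is a Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below uses invented GS and COP values for five hypothetical pairs of accessions, not the 435 pairs analysed here.

```python
from math import sqrt


def average_ranks(values):
    """Rank from 1 upward; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


# Invented values for five hypothetical accession pairs.
gs_values = [0.30, 0.45, 0.52, 0.60, 0.41]
cop_values = [0.03, 0.25, 0.25, 0.50, 0.125]
rs = spearman(gs_values, cop_values)
print(round(rs, 3))
```

Because COP values derived from a shallow pedigree take only a few distinct levels, ties are frequent, which is why the average-rank treatment matters in this setting.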

A separate analysis of genetic diversity (GD) was performed for the 20 TSA descendants and for the 10 non-descendants. For the TSA descendants, the GD ranged from 0.159 to 0.605, with an average of 0.384 and a coefficient of variation of 22.7%. For the non-descendants, the GD ranged from 0.050 to 0.533, with an average of 0.232 and a higher coefficient of variation (29.1%).

Associations among the 30 cocoa accessions based on a multidimensional scaling (MDS) analysis are presented in Figure 2. Dimensions 1, 2 and 3 (DIM1, DIM2 and DIM3) gave a badness-of-fit criterion of 0.1834, indicating that the original multidimensional data were distorted by about 18% when plotted in three dimensions.
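The idea of projecting a distance matrix onto a few axes and measuring the resulting distortion can be sketched with a small principal coordinates routine. This is not the SAS procedure used in the study. The sketch assumes a symmetric distance matrix whose leading eigenvalues are positive, uses power iteration with deflation to avoid external libraries, and reports a Kruskal-type stress, which is one common badness-of-fit definition and may differ from the criterion reported above. The four points are invented.

```python
import random
from math import sqrt


def pcoa(dist, n_dims=3, iterations=500, seed=0):
    """Principal coordinates from a symmetric distance matrix."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    grand = sum(row) / n
    # Double-centred matrix B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (sq[i][j] - row[i] - row[j] + grand) for j in range(n)]
         for i in range(n)]
    rng = random.Random(seed)
    coords = [[] for _ in range(n)]
    for _ in range(n_dims):
        v = [rng.random() - 0.5 for _ in range(n)]
        norm = sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):  # power iteration for the leading eigenvector
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * b[i][j] * v[j] for i in range(n) for j in range(n))
        if lam <= 1e-9:  # no positive variation left: pad this axis with zeros
            for i in range(n):
                coords[i].append(0.0)
            continue
        for i in range(n):
            coords[i].append(v[i] * sqrt(lam))
        # Deflation: remove the axis just extracted.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords


def stress(dist, coords):
    """Kruskal-type stress: relative misfit between original and plotted distances."""
    n = len(dist)
    num = den = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            fitted = sqrt(sum((p - q) ** 2 for p, q in zip(coords[i], coords[j])))
            num += (dist[i][j] - fitted) ** 2
            den += dist[i][j] ** 2
    return sqrt(num / den)


# Invented example: corners of a 4 x 2 rectangle.
points = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)]
dist = [[sqrt((a[0] - c[0]) ** 2 + (a[1] - c[1]) ** 2) for c in points] for a in points]
stress_2d = stress(dist, pcoa(dist, n_dims=2))
stress_1d = stress(dist, pcoa(dist, n_dims=1))
print(round(stress_2d, 4), round(stress_1d, 4))
```

In this toy case two axes reproduce the distances almost exactly, while a single axis leaves a clearly non-zero stress, which is the kind of distortion a badness-of-fit value summarizes.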

DIM1 separated all but one of the TSA-644 descendant accessions from the non-descendant ones. TSA-644 descendants were plotted in quadrants B and C, and non-descendants in quadrant A of the graph. The TSA descendants come from crosses of the TSA-644 clone (originated from Upper Amazon germplasm) with the Lower Amazon accessions SIC and SIAL and with unknown parents (Table 1). Upper and Lower Amazon germplasm differ in several traits, especially resistance to witches' broom, the Upper Amazon set used in this study being more resistant than the Lower Amazon one.

The ten non-descendant accessions originated from crosses between Upper Amazon and Lower Amazon germplasm and from crosses between Upper Amazon germplasm and Trinitarios. DIM2 scattered the accessions within the TSA descendants and non-descendants, but no clear groupings based on pedigree or germplasm origin were observed. DIM3 separated the non-descendant accessions, favoring differentiation of Upper versus Lower Amazon from Trinitarios versus Amazon descendants. The Upper Amazon clone CEPEC-1008 (pedigree EET-399 x EET-392) was plotted in isolation in quadrant A3 and was clearly distinct from the remaining non-descendant accessions.

4. DISCUSSION

Genetic diversity in cocoa has been examined more frequently among the main germplasm groups, composed of Forasteros, Trinitarios and Criollos, and seldom within groups or germplasm series. In general, accessions from the Upper Amazon germplasm show higher GD than those from other regions (MARITA et al., 2001; RONNING et al., 1995; N'GORAN et al., 1994; LERCETEAU et al., 1997; WHITKUS et al., 1998). The clone set studied here consisted of Upper and Lower Amazon and Trinitario germplasm, comprising 20 TSA-644 descendants and 10 non-descendants. The results showed less GD than reported in previous studies, probably because most of the accessions had TSA-644 (Upper Amazon), SIC and SIAL (Lower Amazon) germplasm in their pedigree. Moreover, the set had a relatively low number of accessions and some were genetically related. Previous studies showed low genetic diversity for the SIC and SIAL germplasm (CASCARDO et al., 1993), which originated from selections made in the 1950s and 1960s in a landrace called "Cacau Comum da Bahia" and is highly adapted to southern Bahia, corroborating our findings of low GD.

When the TSA descendants and non-descendants were analyzed separately, the GD level was higher in the TSA descendants than in the non-descendant accessions. This may be due to the genetic diversity of the other parents involved and to the origin of TSA from a cross between Upper Amazon accessions with high GD. The lower level of GD in the non-descendant accessions may be due to the smaller number of parents involved and to the lower genetic diversity of the Trinitario and EET-392 germplasm in this group. These findings agree with the phenotypic evaluation of resistance, which indicated variability in this trait among accessions and groups according to their origin (PIRES, 2003).

These findings can be used to help breeders direct their crosses. For example, the TSA descendants could be divided into six sub-groups according to their relatedness, based on the accessions present in quadrants B1, B2, B3, C1, C2 and C3 of the graph (Figure 2). Within these sub-groups, agronomically superior genotypes that are resistant to witches' broom, carry other important traits and are less related according to their GD values could be selected for future crosses. The same could be done with the non-descendant genotypes. This approach may help breeders reduce the number of crosses to test, saving time and resources. This selection procedure assumes that more closely related accessions, based on the RAPD data, share alleles for agronomic traits. If this is true, less genetic gain is expected from crosses involving more closely related accessions.

Cocoa breeders usually keep pedigree records of the genetic material in use and the relatedness in the main breeding stocks is known. However, these breeding stocks and populations are very heterogeneous and almost no population breeding has been developed, making it difficult to measure relatedness based on COP. We believe this is the main reason why COP is a poor measure of genetic diversity for the accessions studied here. Coefficient of parentage values obtained do not appear to be reliable enough to indicate distinct genetic similarities among genotypes, in agreement with SOUZA et al. (1998). Considering that COP calculation can lead to wrong conclusions about genetic relationships in studied materials, it follows that molecular markers in cocoa are more reliable and, if collected appropriately, may be very helpful for breeding and germplasm conservation.

The level of GD found in the TSA-644 descendants can be considered low when compared with previous studies of large sets of cocoa germplasm (FALEIRO et al., 2004a) and of farmers' selections in southern Bahia (YAMADA, 2001; FALEIRO et al., 2004c; LEAL et al., 2004). However, the data revealed that the Scavina-6 x IMC-67 descendants maintain a certain level of diversity. There is great concern among cocoa breeders about relying on only one main source of resistance, which narrows the genetic base, given the great variability of the pathogen and the large number of diseases that threaten cocoa crops.