Abstract

Chinese cymbidiums (Cymbidium sp.) are important ornamental plants because of their foliage, flower shape, and fragrance. Well-known Chinese cymbidiums mainly include Cymbidium goeringii, Cymbidium faberi, Cymbidium ensifolium, Cymbidium kanran, and Cymbidium sinense. The population genetics of Chinese cymbidiums can be efficiently analyzed using small-scale marker panels with high discriminatory power. In this study, we tested several genic simple sequence repeats (SSRs) and built six genic SSR panels. The panels included several robust markers, which can rapidly assign Chinese cymbidium accessions to their source species. Fifty-three accessions of Chinese cymbidiums were analyzed using 25 markers, which exhibited polymorphism among five species. These markers were ranked according to their discriminatory scores (D scores). The program selected six markers to build an “overall” panel for all Cymbidium classifications and yielded 95.16% population assignment accuracy. Considering one species as the “critical” population and the four other species as one population, we built five genic SSR panels: C. ensifolium panel (four markers, 98.05% accuracy), C. faberi panel (six markers, 95.90% accuracy), C. goeringii panel (six markers, 95.15% accuracy), C. sinense panel (six markers, 96.35% accuracy), and C. kanran panel (five markers, 96.10% accuracy). Genetic distance matrices calculated using the “overall” panels and those derived with the 25 markers were compared. Results showed a high correlation (R = 0.807) with statistical significance (P = 0.042). Moreover, “all panels” revealed higher genetic variations among populations than “all markers.” Hence, the developed panels are suitable for efficient population classification of Chinese cymbidiums.

Using numerous marker loci in population genetic analysis is expensive and inefficient when genotyping large germplasm collections. Researchers have therefore developed informative panels composed of a minimum number of marker loci. Such a panel can be regarded as a DNA bar code and used for research on taxonomy and population genetics. The minimum combination of highly ranked loci for population assignment can be determined through two basic procedures. First, simulated test data are generated from the observed allele frequencies at each locus, the number of times each test genotype is correctly assigned to its source population is counted, and the markers are ranked by comparing their individual scores. Second, another round of iterations adds the ranked loci one at a time until the assignment score reaches or exceeds the accuracy criterion. This strategy has been applied for locus selection in population assignment of numerous species, including chinook salmon [Oncorhynchus tshawytscha (Greig et al., 2003)], Asian seabass [Lates calcarifer (Yue et al., 2012)], trematomids [Trematomus sp. (Van de Putte et al., 2009)], humans [Homo sapiens (Rosenberg et al., 2003)], switchgrass [Panicum virgatum (Okada et al., 2011)], and rice [Oryza sativa (Agrama et al., 2012)].
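The two-step procedure described above can be sketched in Python as follows. This is a minimal illustration rather than the implementation used by any published program: the function names rank_loci and build_panel, the accuracy_of callback (assumed to return the fraction of simulated genotypes correctly assigned for a given list of loci), and the toy scores are all assumptions introduced here.

```python
from typing import Callable, Dict, List, Sequence


def rank_loci(single_locus_score: Dict[str, float]) -> List[str]:
    """Step 1: order loci from the highest to the lowest single-locus score."""
    return sorted(single_locus_score, key=single_locus_score.get, reverse=True)


def build_panel(
    ranked: Sequence[str],
    accuracy_of: Callable[[List[str]], float],
    threshold: float = 0.95,
) -> List[str]:
    """Step 2: add one ranked locus per run until accuracy meets the threshold."""
    panel: List[str] = []
    for locus in ranked:
        panel.append(locus)
        if accuracy_of(panel) >= threshold:
            break
    return panel  # holds every locus if the threshold is never reached


# Toy usage with made-up scores; real scores would come from simulation.
toy_scores = {"locusA": 0.40, "locusB": 0.70, "locusC": 0.55}


def toy_accuracy(panel: List[str]) -> float:
    """Hypothetical accuracy that grows by 0.25 with each added locus."""
    return min(1.0, 0.5 + 0.25 * len(panel))


ranked = rank_loci(toy_scores)
panel = build_panel(ranked, toy_accuracy, threshold=0.95)
```

Stopping at the first panel that meets the criterion keeps the panel as small as the ranking allows, which is the point of the approach; it does not guarantee that no smaller combination of lower-ranked loci exists.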

In the present study, we aim to develop informative panels by using a minimum number of genic SSRs. The panels can be used to assign Chinese cymbidiums to five populations with high accuracy. The proposed strategy could be an efficient tool for population classification.

Materials and Methods

Plant samples

Fifty-three accessions of C. goeringii, C. faberi, C. ensifolium, C. kanran, and C. sinense were used for testing genetic markers (Table 1). These accessions were selected from 105 accessions (Supplemental Fig. 1) based on their ancestries inferred with STRUCTURE (Pritchard et al., 2000); only accessions with a maximum ancestry index higher than 0.6 were retained. The plants were grown and maintained in a greenhouse under natural light. Fresh leaves were collected from two or three seedlings of each accession for genomic DNA extraction.
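The ancestry-based selection rule can be illustrated with a short Python sketch. The accession names and ancestry coefficients below are hypothetical, and select_by_ancestry is a helper introduced here for illustration; only the 0.6 cutoff comes from the text.

```python
from typing import Dict, List


def select_by_ancestry(
    ancestry: Dict[str, List[float]], cutoff: float = 0.6
) -> Dict[str, int]:
    """Keep accessions whose largest ancestry coefficient exceeds the cutoff.

    Returns a mapping from accession name to the index of its main cluster.
    """
    kept: Dict[str, int] = {}
    for name, q in ancestry.items():
        top = max(q)
        if top > cutoff:
            kept[name] = q.index(top)
    return kept


# Hypothetical ancestry coefficients for five clusters (each row sums to 1).
example_q = {
    "acc01": [0.92, 0.02, 0.02, 0.02, 0.02],  # clear ancestry, kept
    "acc02": [0.45, 0.40, 0.05, 0.05, 0.05],  # admixed, dropped
    "acc03": [0.10, 0.05, 0.70, 0.10, 0.05],  # clear ancestry, kept
}
selected = select_by_ancestry(example_q)
```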

Genotyping

Genomic DNA was extracted from leaf samples as previously described (Li et al., 2007). Twenty-five genic SSRs were selected from the genic SSR collection (Li et al., 2014a) for panel development. Polymerase chain reaction (PCR) primers were synthesized by Life Technologies (ABI and Invitrogen, Shanghai, China). PCR was conducted, and the products were separated by polyacrylamide gel electrophoresis (Li et al., 2007). The gel was silver stained according to the procedure reported by Li et al. (2001), and the results were recorded using a scanner. Genotypes were determined by analyzing band patterns (Li et al., 2014a). The accessions were scored as 1, 2, 3, and so on, according to the number and position of bands, and the data were then converted into a binary matrix using the “compute frequency” procedure in PowerMarker version 2.7 (Liu and Muse, 2005).
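The conversion from band scores to a binary matrix can be pictured with the following Python sketch. It is a generic presence/absence encoding under assumed inputs, not a reproduction of the PowerMarker procedure; the function to_binary_matrix and the example scores are hypothetical.

```python
from typing import Dict, List, Tuple


def to_binary_matrix(
    scores: Dict[str, Tuple[int, ...]]
) -> Tuple[List[int], Dict[str, List[int]]]:
    """Convert per-accession band scores at one locus to presence/absence rows."""
    # Columns are the distinct band classes observed at this locus.
    alleles = sorted({a for bands in scores.values() for a in bands})
    rows = {
        name: [1 if a in bands else 0 for a in alleles]
        for name, bands in scores.items()
    }
    return alleles, rows


# Hypothetical scores at one locus: each tuple lists the bands seen in an accession.
locus_scores = {"acc01": (1, 3), "acc02": (2, 2), "acc03": (3,)}
alleles, matrix = to_binary_matrix(locus_scores)
```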

Statistical analysis

Polymorphism of markers.

Genetic distances were calculated using Nei’s distance (Nei and Takezaki, 1983). Phylogenetic reconstruction was based on the neighbor-joining (NJ) method (with 1000 bootstrap replicates) as implemented in PowerMarker version 2.7. The phylogenetic tree was visualized in MEGA version 4 (Tamura et al., 2007). PowerMarker was also used to calculate the average number of alleles per marker and the polymorphism information content (PIC).
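As an illustration of a frequency-based distance, the sketch below computes the D_A distance of Nei and colleagues (1983), defined as one minus the mean over loci of the summed square roots of the products of matching allele frequencies. That this is the measure meant by “Nei’s distance” here is an assumption, and the function name nei_da and the toy frequencies are hypothetical.

```python
import math
from typing import Dict, List

# One {allele: frequency} dictionary per locus, in the same locus order.
Freqs = List[Dict[str, float]]


def nei_da(pop_x: Freqs, pop_y: Freqs) -> float:
    """D_A = 1 - (1/r) * sum over loci of sum over alleles of sqrt(x * y)."""
    total = 0.0
    for fx, fy in zip(pop_x, pop_y):
        # Alleles absent from either population contribute sqrt(0) = 0.
        shared = set(fx) & set(fy)
        total += sum(math.sqrt(fx[a] * fy[a]) for a in shared)
    return 1.0 - total / len(pop_x)


# Toy data: locus 1 is identical in both populations, locus 2 shares no allele.
pop1 = [{"a": 0.5, "b": 0.5}, {"c": 1.0}]
pop2 = [{"a": 0.5, "b": 0.5}, {"d": 1.0}]
distance = nei_da(pop1, pop2)
```

The distance is 0 for identical allele frequencies and 1 when no allele is shared at any locus.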

Analysis of population structure.

Population structure was initially analyzed using the software STRUCTURE as described in previous studies (Li et al., 2010, 2011, 2012, 2014b). Genetic variations within and among populations were then calculated by analysis of molecular variance (AMOVA) in Arlequin V2.000 (Schneider and Excoffier, 1999). Genetic distances among species were calculated using Nei’s distance (Nei and Takezaki, 1983), and a phylogenetic tree was constructed by the NJ method. Both calculations were performed in PowerMarker version 2.7. The phylogenetic tree was visualized using MEGA version 4 (Tamura et al., 2007). The correlation between the genetic distances based on all markers and those based on the panels was assessed using the Mantel test in PowerMarker.
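The logic of a Mantel test can be sketched in Python as follows. This is a generic permutation version written for illustration and may differ in detail from the PowerMarker implementation; it assumes symmetric distance matrices, uses the Pearson correlation of the off-diagonal entries, and reports a one-sided P value. The function names and the toy matrices are hypothetical.

```python
import random
from itertools import combinations
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def _pearson(x: List[float], y: List[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(
    a: Matrix, b: Matrix, permutations: int = 999, seed: int = 0
) -> Tuple[float, float]:
    """Return (r, one-sided P) for two symmetric distance matrices."""
    n = len(a)
    pairs = list(combinations(range(n), 2))
    x = [a[i][j] for i, j in pairs]
    observed = _pearson(x, [b[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        # Permute rows and columns of b together, then recompute r.
        rng.shuffle(order)
        y = [b[order[i]][order[j]] for i, j in pairs]
        if _pearson(x, y) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy matrices: b is a linear transform of a, so r should be 1.
toy_a = [[0, 1, 2, 3], [1, 0, 4, 5], [2, 4, 0, 6], [3, 5, 6, 0]]
toy_b = [[0 if i == j else 2 * v + 1 for j, v in enumerate(row)]
         for i, row in enumerate(toy_a)]
r, p = mantel(toy_a, toy_b)
```

With only five populations there are just 120 distinct relabelings, so the smallest P value such a test can return is limited by the number of populations rather than by the number of permutations requested.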

Discriminatory power of markers.

Twenty-five genic SSRs were employed for WHICHLOCI analysis. Accessions with clear origins were used to test discriminatory power. Using the WHICHLOCI v.1.0 software package (Banks et al., 2003), we ranked each of the 25 markers according to its D score, which indicates discriminatory power in population analysis. Discriminatory power was assessed by conducting 10,000 simulations (population size N = 2000) based on the allele frequencies of each population. The threshold of assignment accuracy was set to 95%, and the logarithm of odds (LOD) stringency was set to 3.0 for all simulations. For the “critical” population analysis, we determined which loci are necessary to identify a specific species. This method was applied to each species in turn, with all species not belonging to the “critical” population treated as one population. Finally, the reliability of the panel markers was examined by comparing the results derived from the panels with those obtained using all markers.
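The idea of scoring loci by simulated assignment can be sketched in Python as below. This is a simplified stand-in, not the WHICHLOCI algorithm: it assumes diploid genotypes drawn independently at each locus, assigns each simulated genotype to the population with the highest likelihood, and omits the LOD stringency. The function names, the frequency floor of 1e-6, and the toy frequencies are assumptions made for illustration.

```python
import math
import random
from typing import Dict, List, Sequence

# freqs[population][locus] -> {allele: frequency}
Freqs = Dict[str, Dict[str, Dict[str, float]]]


def _draw_genotype(rng: random.Random, locus_freqs: Dict[str, float]) -> List[str]:
    """Draw two alleles at random according to the population's frequencies."""
    alleles = list(locus_freqs)
    weights = [locus_freqs[a] for a in alleles]
    return rng.choices(alleles, weights=weights, k=2)


def _log_likelihood(genotype, pop_freqs, loci, floor=1e-6) -> float:
    """Log-likelihood of a multilocus genotype; unseen alleles get a small floor."""
    ll = 0.0
    for locus in loci:
        for allele in genotype[locus]:
            ll += math.log(max(pop_freqs[locus].get(allele, 0.0), floor))
    return ll


def assignment_accuracy(
    freqs: Freqs, loci: Sequence[str], n_sim: int = 1000, seed: int = 0
) -> float:
    """Fraction of simulated genotypes assigned back to their source population."""
    rng = random.Random(seed)
    pops = list(freqs)
    correct = 0
    for source in pops:
        for _ in range(n_sim):
            genotype = {l: _draw_genotype(rng, freqs[source][l]) for l in loci}
            best = max(pops, key=lambda p: _log_likelihood(genotype, freqs[p], loci))
            correct += best == source
    return correct / (n_sim * len(pops))


# Toy frequencies: "diag" is fully diagnostic, "flat" carries no information.
toy = {
    "popA": {"diag": {"a": 1.0}, "flat": {"a": 0.5, "b": 0.5}},
    "popB": {"diag": {"b": 1.0}, "flat": {"a": 0.5, "b": 0.5}},
}
acc_diag = assignment_accuracy(toy, ["diag"], n_sim=200)
acc_flat = assignment_accuracy(toy, ["flat"], n_sim=200)
```

A diagnostic locus assigns every simulated genotype correctly, whereas an uninformative locus does no better than chance; loci between these extremes would be ranked by their accuracy.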

Results

Population structure and genic SSR profile.

Fifty-three of 105 accessions were selected based on their ancestries derived from STRUCTURE (Table 1). Accessions with severely mixed ancestries (maximum ancestry index lower than 0.6) were excluded. A total of 12, 11, 10, 10, and 10 accessions from C. goeringii, C. kanran, C. faberi, C. ensifolium, and C. sinense, respectively, were used for further analysis. The AMOVA results revealed 19.98% genetic variation among species and 80.02% genetic variation within species. Phylogenetic analysis also indicated that the largest genetic distance was between C. ensifolium and C. sinense (0.2734) and the smallest was between C. goeringii and C. kanran (0.2008) (Fig. 1; Table 2).

Neighbor-joining tree of five Chinese cymbidiums based on Nei’s distance. These species included Cymbidium goeringii, Cymbidium faberi, Cymbidium ensifolium, Cymbidium kanran, and Cymbidium sinense. Bootstrap value (out of 1000) is indicated at the branch point.

Comparison of pairwise genetic distance above the diagonal based on the “overall” panel comprising six markers and below the diagonal based on all 25 markers among five Chinese Cymbidium species.

In the genic SSR collection, 25 genic SSRs exhibited polymorphism within the five species. These SSRs were selected to test assignment ability because WHICHLOCI requires equal numbers of effective (polymorphic) markers within each population. The polymorphisms of the markers are shown in Table 3. The PIC values varied from 0.204 for SSR68 to 0.855 for SSR73, with an average of 0.528. The number of alleles per marker ranged from two for SSR79 to 14 for SSR08, with an average of 6.88.
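For readers unfamiliar with PIC, the following Python sketch computes it from allele frequencies using the commonly cited definition, PIC = 1 - sum(p_i^2) - sum over i < j of 2 * p_i^2 * p_j^2. That PowerMarker applies exactly this definition is assumed here, and the function name pic and the example frequencies are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def pic(freqs: Sequence[float]) -> float:
    """Polymorphism information content of one locus from its allele frequencies."""
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Two equally frequent alleles: 1 - 0.5 - 2 * 0.25 * 0.25 = 0.375.
pic_two_alleles = pic([0.5, 0.5])
# A monomorphic locus carries no information.
pic_monomorphic = pic([1.0])
# More alleles at even frequencies give a higher PIC.
pic_four_alleles = pic([0.25, 0.25, 0.25, 0.25])
```

Because PIC rises with both the number of alleles and the evenness of their frequencies, markers with many alleles tend to have high PIC values.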

Evaluation of marker discriminatory power and construction of panels.

D scores were used to measure the ability of markers to correctly assign accessions to their original species. Among the 25 markers, SSR73 showed the highest D score (4.941), whereas SSR68 displayed the lowest (0.941) (Fig. 2; Table 4). The correlation coefficient between D scores and allele numbers was 0.795, and that between D scores and PICs was 0.924. As more markers were employed, a higher proportion of simulated individuals was accurately assigned (Fig. 2). The percentage of correct assignments increased markedly over the first six markers and then increased gradually upon successive addition of the seventh to the 10th markers, reaching 99.31%. The “overall” panel was generated by WHICHLOCI without defining a critical population. The panel included six markers: SSR73, SSR21, SSR08, SSR64, SSR53, and SSR03. In this panel, the D scores of the markers varied from 3.57 for SSR03 to 4.94 for SSR73, with an average value of 4.15 (Table 4). About 95.16% of the simulated accessions were accurately assigned to their original species by using these six markers, which were selected based on D scores at an LOD threshold of 3.0 (Table 4).

Percentage of correct assignments and corresponding number of selected simple sequence repeats (SSRs). SSRs were ranked according to their discriminatory scores (D scores) from high to low, as calculated by WHICHLOCI (Banks et al., 2003). D scores were assessed by conducting 10,000 simulations (population size N = 2000) based on the allele frequencies of each species. The threshold of assignment accuracy was set to 95%, and the logarithm of odds (LOD) stringency was set to 3.0 for all simulations.

Estimation of correct assignments of the five Cymbidium species and “critical” species by using “overall” or “critical” panels.

In the critical population method, all populations other than the critical one were treated as a single population. When C. ensifolium was selected as the critical population, the resulting panel generated 98.05% correct assignments. The panel included four markers with D scores ranging from 2.07 to 2.67 (Table 4). The C. faberi panel consisted of six markers, with D scores ranging from 1.71 to 2.77, and resulted in 95.90% correct assignments. The C. goeringii panel comprised six markers, with D scores ranging from 1.52 to 2.67, and resulted in 95.15% correct assignments. The C. sinense panel was composed of six markers, with D scores ranging from 1.79 to 2.57, and resulted in 96.35% correct assignments. The C. kanran panel included five markers, with D scores ranging from 1.94 to 2.79, and resulted in 96.10% correct assignments.
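The pooling step of the critical population method can be pictured with the Python sketch below, which merges the allele counts of all non-critical populations at one locus before frequencies are computed. The helper pool_noncritical, the population labels, and the allele counts are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter
from typing import Dict


def _to_freq(counts: Counter) -> Dict[str, float]:
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def pool_noncritical(
    counts: Dict[str, Counter], critical: str
) -> Dict[str, Dict[str, float]]:
    """Return allele frequencies for the critical population and the pooled rest."""
    pooled: Counter = Counter()
    for pop, pop_counts in counts.items():
        if pop != critical:
            pooled.update(pop_counts)  # adds counts allele by allele
    return {critical: _to_freq(counts[critical]), "others": _to_freq(pooled)}


# Hypothetical allele counts at one locus in three populations.
locus_counts = {
    "sp1": Counter({"a": 8, "b": 2}),
    "sp2": Counter({"a": 2, "b": 8}),
    "sp3": Counter({"b": 5, "c": 5}),
}
two_pops = pool_noncritical(locus_counts, critical="sp1")
```

After pooling, the assignment problem has only two classes, so a locus with an allele that is common in the critical species and rare elsewhere can rank highly even if it separates the remaining species poorly.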

Application of marker panels in population structure analysis.

The discriminatory power of the marker panels in population assignment was assessed. Genetic variation among the five populations derived using the “overall” panel was 29.92%, whereas that obtained using all markers was 19.98%. In the critical population analysis for each Chinese cymbidium species, genetic variation between C. ensifolium and the four other populations derived using the C. ensifolium panel was 30.46%, which was higher than the 14.00% variation obtained using all markers. Similar results were observed in the analysis of the four other species (Table 5). In addition, pairwise genetic distances between species derived from the “overall” panel and from all markers were compared through Mantel analysis. The results showed a significant correlation, with a coefficient of 0.807 (P = 0.042), between the two marker sets (Table 2). This high correlation suggests that the panels developed can be effectively used to reveal population differentiation patterns.

Table 5.

Genetic variation among Cymbidium species explained by panels and all 25 markers (genic simple sequence repeats).

Discussion

Genic SSRs as markers for analysis of population differentiation.

Genic SSRs show higher transferability across related species than anonymous SSR markers (Scott et al., 2000). Genic SSRs can be used to directly compare different taxa and prevent the risk of locus-specific differences (genetic diversity among accessions), which may mask true species-level differences (Ellis and Burke, 2007). As such, genic SSRs are suitable for discriminating species. In the present study, genic SSRs were used to create highly efficient panels for Chinese cymbidiums. As the complex relationships between accessions can influence the discriminating power of DNA markers (Hayes et al., 2005), accessions with ambiguous ancestries were not included in marker selection. Finally, 53 accessions with clear sources were analyzed in WHICHLOCI. To avoid bias associated with the number of markers used within each species, we selected 25 markers that showed polymorphisms within the five species for further analysis.

Factors influencing the discriminatory power of markers.

The informative value of markers is attributed to their allelic number and PIC (Agrama et al., 2012; Yue et al., 2012). The discriminatory power of markers is determined by their efficiency in correct population assignments and is inversely correlated with their propensity for causing false assignments (Banks et al., 2003). Successive assignment trials were performed in simulated populations based on the allele frequencies of markers (Banks et al., 2003). Thus, the D score of a marker is expected to be positively associated with the corresponding genetic diversity, PIC, and number of alleles. In the “overall” panel analysis, the six markers with high D scores presented high PICs and high allele numbers. Similarly, SSR68, SSR54, and SSR16, which showed low D scores, corresponded to low polymorphism and diversity. As reported in the Results, the correlation coefficients between D score and number of alleles and between D score and PIC were 0.795 and 0.924, respectively. These results are consistent with those reported in a previous study (Agrama et al., 2012).

The order sorted by D scores and PICs was not exactly the same. This difference may be due to allelic distribution among and within populations. In critical population analysis for each species, specific markers containing a minimum of four or five alleles would be ranked as high for particular species, such as in the case of SSR01 in C. goeringii panel, which amplified C. goeringii-specific alleles. Similar results were also reported previously (Banks et al., 2003). C. goeringii was the least differentiated species in this study (Table 5) and required six markers for accurate assignment. This high requirement could be due to the polyphyletic quality of C. goeringii accessions. For example, numerous C. goeringii accessions are clustered with C. faberi or C. sinense, as reported in previous studies (Li et al., 2014a, 2014b).

Evaluation and improvement of panels for population classification.

To evaluate the capability of the panels for population identification, we compared the results obtained using the panels with those derived from all markers. Both results showed similar population structure patterns, as indicated by the high correlation coefficient (R = 0.8068) between the genetic distance matrices. Nevertheless, the panels are considered more appropriate for population assignment because they can present clear genetic differentiation with few markers. This ability could be attributed to the allelic distribution of the panel markers, which is highly consistent with the population structure. Among the “critical” panels, the C. ensifolium panel was found to be the most efficient because it explained the highest variation among populations with the fewest markers (four). The C. goeringii panel explained relatively low variation and used the most markers (six). Considering the possibility of misassignment and the uncertainty of the genetic background, we propose an approach for correcting the population structure: several reference accessions should be included when the panels are used for population analysis.

Genic SSR panels with the strongest discriminatory power were used to characterize Chinese cymbidium germplasms and explore their population structure. In practice, these panels can facilitate rapid screening of large germplasm banks and species classification at a relatively low cost.

Li, X., Jin, F., Jin, L., Jackson, A., Huang, C., Li, K., and Shu, X. 2014a. Development of Cymbidium ensifolium genic-SSR markers and their utility in genetic diversity and population structure analysis in cymbidiums. BMC Genet. 15:124.

Li, X., Yan, W., Agrama, H., Hu, B., Jia, L., Jia, M., Jackson, A., Moldenhauer, K., McClung, A., and Wu, D. 2010. Genotypic and phenotypic characterization of genetic differentiation and diversity in the USDA rice mini-core collection. Genetica 138:1221–1230.

Schneider, S. and Excoffier, L. 1999. Estimation of past demographic parameters from the distribution of pairwise differences when the mutation rates vary among sites: Application to human mitochondrial DNA. Genetics 152:1079–1089.

Polyacrylamide gel electrophoresis profile of SSR08 tested among 105 Chinese cymbidium accessions. Fifty-three of the 105 accessions were selected for this study based on their ancestry. The marker lane is labeled M; Cymbidium goeringii accessions are shown as 1–25; Cymbidium ensifolium accessions are shown as 26–47; Cymbidium faberi accessions are shown as 48–66; Cymbidium sinense accessions are shown as 67–85; and Cymbidium kanran accessions are shown as 86–105.
