Marker-assisted selection: an approach for precision plant breeding in the twenty-first century

Abstract

DNA markers have enormous potential to improve the efficiency and precision of conventional plant breeding via marker-assisted selection (MAS). The large number of quantitative trait loci (QTLs) mapping studies for diverse crop species have provided an abundance of DNA marker–trait associations. In this review, we present an overview of the advantages of MAS and its most widely used applications in plant breeding, providing examples from cereal crops. We also consider reasons why MAS has had only a small impact on plant breeding so far and suggest ways in which the potential of MAS can be realized. Finally, we discuss reasons why the greater adoption of MAS in the future is inevitable, although the extent of its use will depend on available resources, especially for orphan crops, and may be delayed in less-developed countries. Achieving a substantial impact on crop improvement by MAS represents the great challenge for agricultural scientists in the next few decades.

1. Introduction

Plant breeding—in combination with developments in agricultural technology such as agrochemicals—has made remarkable progress in increasing crop yields for over a century. However, plant breeders must constantly respond to many changes. First, agricultural practices change, which creates the need for developing genotypes with specific agronomic characteristics. Second, target environments and the organisms within them are constantly changing. For example, fungal and insect pests continually evolve and overcome host–plant resistance. New land areas are regularly being used for farming, exposing plants to altered growing conditions. Finally, consumer preferences and requirements change. Plant breeders therefore face the endless task of continually developing new crop varieties (Evans 1997).

The outlook for global crop production in the twenty-first century has been analysed by many researchers and does not look bright (Pinstrup-Andersen et al. 1999). A rising global population will require increased crop production and some research suggests that the rate of increase in crop yields is currently declining (Pingali & Heisey 1999). This required increase in crop production will need to occur in the context of mounting water scarcity, decreasing area and environmental degradation of arable land (partly caused by agriculture), increasing pollution, inevitable emergence of new races and biotypes of pathogens and pests, and possible adverse effects of climate change. Thus, the task of increasing crop yields represents an unprecedented challenge for plant breeders and agricultural scientists.

Plant breeding will play a key role in this coordinated effort for increased food production. Given the context of current yield trends, predicted population growth and pressure on the environment, traits relating to yield stability and sustainability should be a major focus of plant breeding efforts. These traits include durable disease resistance, abiotic stress tolerance and nutrient- and water-use efficiency (Mackill et al. 1999; Slafer et al. 2005; Trethowan et al. 2005). Furthermore, there is a need to develop varieties for cultivation in marginal land areas, especially in developing countries, and give greater emphasis to improving minor or ‘orphan’ crops (Naylor et al. 2004).

Despite optimism about continued yield improvement from conventional breeding, new technologies such as biotechnology will be needed to maximize the probability of success (Ortiz 1998; Ruttan 1999; Huang et al. 2002). One area of biotechnology, DNA marker technology, derived from research in molecular genetics and genomics, offers great promise for plant breeding. Owing to genetic linkage, DNA markers can be used to detect the presence of allelic variation in the genes underlying these traits. By using DNA markers to assist in plant breeding, efficiency and precision could be greatly increased. The use of DNA markers in plant breeding is called marker-assisted selection (MAS) and is a component of the new discipline of ‘molecular breeding’.

(a) Features of cereal breeding

The fundamental basis of plant breeding is the selection of specific plants with desirable traits. Selection typically involves evaluating a breeding population for one or more traits in field or glasshouse trials (e.g. agronomic traits, disease resistance or stress tolerance), or with chemical tests (e.g. grain quality). The goal of plant breeding is to assemble more desirable combinations of genes in new varieties.

Standard breeding techniques for inbreeding cereal crops have been outlined in various textbooks (e.g. Allard 1999). In the commonly used pedigree breeding method, selecting desirable plants begins in early generations for traits of higher heritability. However, for traits of low heritability, selection is often postponed until the lines become more homozygous in later generations (F5 or F6). Selection of superior plants involves visual assessment for agronomic traits or resistance to stresses, as well as laboratory tests for quality or other traits. When the breeding lines become homozygous (F5 or later), they can be harvested in bulk and evaluated in replicated field trials. The entire process involves considerable time (5–10 years for elite lines to be identified) and expense.

The size and composition of a plant population are important considerations for a breeding programme. The larger the number of genes segregating in a population, the larger the population size required in order to identify specific gene combinations. Typical breeding programmes usually grow hundreds or even thousands of populations, and many thousands or millions of individual plants (Witcombe & Virk 2001). Given the extent and complexity of selection required in breeding programmes, and the number and size of populations, one can easily appreciate the usefulness of new tools that may assist breeders in plant selection. The scale of breeding programmes also underlines the challenges of incorporating a relatively expensive technology such as MAS.

(b) Main types of DNA markers used in MAS

There are five main considerations for the use of DNA markers in MAS: reliability; quantity and quality of DNA required; technical procedure for marker assay; level of polymorphism; and cost (Mackill & Ni 2000; Mohler & Singrun 2004).

Reliability. Markers should be tightly linked to target loci, preferably less than 5 cM genetic distance. The use of flanking markers or intragenic markers will greatly increase the reliability of the markers to predict phenotype (figure 1).

Figure 1. Reliability of selection using single and flanking markers (adapted from Tanksley (1983), assuming no crossover interference). The recombination frequency between the target locus and marker A is approximately 5% (5 cM). Therefore, recombination may occur between the target locus and marker A in approximately 5% of the progeny. The recombination frequency between the target locus and marker B is approximately 4% (4 cM). The chance of recombination occurring on both sides of the target locus, i.e. between the target locus and marker A and between the target locus and marker B (double crossover), is much lower than for single markers (approx. 0.4%). Therefore, the reliability of selection is much greater when flanking markers are used. Adapted from formulae in Liu (1998, p. 310).

DNA quantity and quality. Some marker techniques require large amounts and high quality of DNA, which may sometimes be difficult to obtain in practice, and this adds to the cost of the procedures.

Technical procedure. The level of simplicity and the time required for the technique are critical considerations. High-throughput simple and quick methods are highly desirable.

Level of polymorphism. Ideally, the marker should be highly polymorphic in breeding material (i.e. it should discriminate between different genotypes), especially in core breeding material.

Cost. The marker assay must be cost-effective in order for MAS to be feasible.
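The reliability argument for flanking markers (figure 1) can be illustrated with a short calculation. The sketch below is not taken from the cited sources: it assumes no crossover interference, works per gamete, and uses hypothetical function names. Its values therefore need not match the approximate figure quoted in the caption, which follows the cited formulae; it only illustrates why the error rate with two flanking markers is far smaller than with either marker alone.

```python
def double_crossover(r_a, r_b):
    """Chance of a crossover in both flanking intervals of one gamete,
    assuming the two intervals recombine independently (no interference)."""
    return r_a * r_b


def flanking_marker_error(r_a, r_b):
    """Chance that a gamete carrying the donor allele at BOTH flanking
    markers has nevertheless lost the donor allele at the target locus."""
    both_recombined = r_a * r_b            # target lost, both markers retained
    none_recombined = (1 - r_a) * (1 - r_b)  # target and both markers retained
    return both_recombined / (both_recombined + none_recombined)


# Recombination frequencies from figure 1 (marker A: 5 cM, marker B: 4 cM).
r_a, r_b = 0.05, 0.04

print(f"error using marker A alone: {r_a:.2%}")
print(f"error using marker B alone: {r_b:.2%}")
print(f"double crossover per gamete: {double_crossover(r_a, r_b):.2%}")
print(f"error using both flanking markers: {flanking_marker_error(r_a, r_b):.2%}")
```

Under these assumptions the flanking-marker error is more than an order of magnitude smaller than the single-marker error, which is the point made in the text.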

The most widely used markers in major cereals are called simple sequence repeats (SSRs) or microsatellites (Gupta et al. 1999; Gupta & Varshney 2000). They are highly reliable (i.e. reproducible), co-dominant in inheritance, relatively simple and cheap to use and generally highly polymorphic. The main disadvantages of SSRs are that they typically require polyacrylamide gel electrophoresis and generally give information only about a single locus per assay. These problems have been overcome in many cases by selecting SSR markers that have large enough size differences for detection in agarose gels, and by multiplexing several markers in a single reaction. SSR markers also require a substantial investment of time and money to develop, and adequate numbers for high-density mapping are not available in some orphan crop species. Sequence tagged site (STS), sequence characterized amplified region (SCAR) or single nucleotide polymorphism (SNP) markers that are derived from specific DNA sequences of markers (e.g. restriction fragment length polymorphisms: RFLPs) that are linked to a gene or quantitative trait locus (QTL) are also extremely useful for MAS (Shan et al. 1999; Sanchez et al. 2000; Sharp et al. 2001).

(c) QTL mapping and MAS

The detection of genes or QTLs controlling traits is possible due to genetic linkage analysis, which is based on the principle of genetic recombination during meiosis (Tanksley 1993). This permits the construction of linkage maps composed of genetic markers for a specific population. Segregating populations such as F2, F3 or backcross (BC) populations are frequently used. However, populations that can be maintained and produced permanently, such as recombinant inbreds and doubled haploids, are preferable because they allow replicated and repeated experiments. These types of populations may not be applicable to outbreeding cereals where inbreeding depression can cause non-random changes in gene frequency and loss of vigour of the lines. Using statistical methods such as single-marker analysis or interval mapping to detect associations between DNA markers and phenotypic data, genes or QTLs can be detected in relation to a linkage map (Kearsey 1998). The identification of QTLs using DNA markers was a major breakthrough in the characterization of quantitative traits (Paterson et al. 1988).
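As an illustration of the simplest of these methods, single-marker analysis can be sketched as a comparison of mean phenotype between the two marker genotype classes of a homozygous population (e.g. doubled haploids). The function name, the two-class coding and the toy values below are invented for illustration and are not real data; the statistic shown is an ordinary unequal-variance t statistic, not the specific procedure of any cited study.

```python
from math import sqrt
from statistics import mean, variance


def single_marker_t(phenotypes, genotypes, class_a="A", class_b="B"):
    """t statistic comparing mean phenotype of lines carrying marker
    class A with lines carrying marker class B (unequal variances)."""
    group_a = [p for p, g in zip(phenotypes, genotypes) if g == class_a]
    group_b = [p for p, g in zip(phenotypes, genotypes) if g == class_b]
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Invented toy values for eight lines (illustration only, not real data).
genotypes = ["A", "A", "A", "A", "B", "B", "B", "B"]
phenotypes = [10.0, 12.0, 11.0, 13.0, 7.0, 8.0, 6.0, 9.0]

t_value = single_marker_t(phenotypes, genotypes)
print(f"t statistic for this marker: {t_value:.2f}")
```

A large absolute value of the statistic at a marker suggests that a gene or QTL affecting the trait may be linked to it. In practice the test would be repeated for every marker on the linkage map, with an appropriate significance threshold.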

Numerous reports describe DNA markers linked to genes or QTLs (Mohan et al. 1997; Francia et al. 2005). An overview of marker development is presented in figure 2. Previously, it was assumed that most markers associated with QTLs from preliminary mapping studies were directly useful in MAS. However, in recent years it has become widely accepted that QTL confirmation, QTL validation and/or fine (or high resolution) mapping may be required (Langridge et al. 2001). Although there are examples of highly accurate preliminary QTL mapping data as determined by subsequent QTL mapping research (Price 2006), ideally a confirmation step is preferable because QTL positions and effects can be inaccurate due to factors such as sampling bias (Melchinger et al. 1998). QTL validation generally refers to the verification that a QTL is effective in different genetic backgrounds (Langridge et al. 2001). Additional marker-testing steps may involve identifying a ‘toolbox’ or ‘suite’ of markers within a 10 cM ‘window’ spanning and flanking a QTL (due to a limited polymorphism of individual markers in different genotypes) and converting markers into a form that requires simpler methods of detection.

Once tightly linked markers that reliably predict a trait phenotype have been identified, they may be used for MAS. The fundamental advantages of MAS over conventional phenotypic selection are as follows.

It may be simpler than phenotypic screening, which can save time, resources and effort. Classical examples of traits that are difficult and laborious to measure are cereal cyst nematode and root lesion nematode resistance in wheat (Eastwood et al. 1991; Eagles et al. 2001; Zwart et al. 2004). Other examples are quality traits which generally require expensive screening procedures.

Selection can be carried out at the seedling stage. This may be useful for many traits, but especially for traits that are expressed at later developmental stages. Therefore, undesirable plant genotypes can be quickly eliminated. This may have tremendous benefits in rice breeding because typical rice production practices involve sowing pre-germinated seeds and transplanting seedlings into rice paddies, making it easy to transplant only selected seedlings to the main field.

Single plants can be selected. Using conventional screening methods for many traits, plant families or plots are grown because single-plant selection is unreliable due to environmental factors. With MAS, individual plants can be selected based on their genotype. For most traits, homozygous and heterozygous plants cannot be distinguished by conventional phenotypic screening.

These advantages can be exploited by breeders to accelerate the breeding process (Ribaut & Hoisington 1998; Morris et al. 2003). Target genotypes can be more effectively selected, which may enable certain traits to be ‘fast-tracked’, resulting in quicker line development and variety release. Markers can also be used as a replacement for phenotyping, which allows selection in off-season nurseries making it more cost-effective to grow more generations per year (Ribaut & Hoisington 1998). Another benefit from using MAS is that the total number of lines that need to be tested can be reduced. Since many lines can be discarded after MAS early in a breeding scheme, this permits more efficient use of glasshouse and/or field space—which is often limited—because only important breeding material is maintained.

Considering the potential advantages of MAS over conventional breeding, one rarely discussed point is that markers will not necessarily be useful or more effective for every trait, despite the substantial investment in time, money and resources required for their development. For many traits, effective phenotypic screening methods already exist and these will often be less expensive for selection in large populations. However, when whole-genome scans are being used, even these traits can be selected for if the genetic control is understood.

3. Applications of MAS in plant breeding

The advantages described above may have a profound impact on plant breeding in the future and may alter the plant breeding paradigm (Koebner & Summers 2003). In this section, we describe the main uses of DNA markers in plant breeding, with an emphasis on important MAS schemes. We have classified these schemes into five broad areas: marker-assisted evaluation of breeding material; marker-assisted backcrossing; pyramiding; early generation selection; and combined MAS, although there may be overlap between these categories. Generally, for line development, DNA markers have been integrated in conventional schemes or used to substitute for conventional phenotypic selection.

(a) Marker-assisted evaluation of breeding material

Prior to crossing (hybridization) and line development, there are several applications in which DNA marker data may be useful for breeding, such as cultivar identity, assessment of genetic diversity and parent selection, and confirmation of hybrids. Traditionally, these tasks have been done based on visual selection and analysing data based on morphological characteristics.

(i) Cultivar identity/assessment of ‘purity’

In practice, seed of different strains is often mixed due to the difficulties of handling large numbers of seed samples used within and between crop breeding programmes. Markers can be used to confirm the true identity of individual plants. The maintenance of high levels of genetic purity is essential in cereal hybrid production in order to exploit heterosis. In hybrid rice, SSR and STS markers were used to confirm purity, which was considerably simpler than the standard ‘grow-out tests’ that involve growing the plant to maturity and assessing morphological and floral characteristics (Yashitola et al. 2002).

(ii) Assessment of genetic diversity and parental selection

Breeding programmes depend on a high level of genetic diversity for achieving progress from selection. Broadening the genetic base of core breeding material requires the identification of diverse strains for hybridization with elite cultivars (Xu et al. 2004; Reif et al. 2005). Numerous studies investigating the assessment of genetic diversity within breeding material for practically all crops have been reported. DNA markers have been an indispensable tool for characterizing genetic resources and providing breeders with more detailed information to assist in selecting parents. In some cases, information regarding a specific locus (e.g. a specific resistance gene or QTL) within breeding material is highly desirable. For example, the comparison of marker haplotypes has enabled different sources of resistance to Fusarium head blight, which is a major disease of wheat worldwide, to be predicted (Liu & Anderson 2003; McCartney et al. 2004).

(iii) Study of heterosis

For hybrid crop production, especially in maize and sorghum, DNA markers have been used to define heterotic groups that can be used to exploit heterosis (hybrid vigour). The development of inbred lines for use in producing superior hybrids is a very time-consuming and expensive procedure. Unfortunately, it is not yet possible to predict the exact level of heterosis based on DNA marker data although there have been reports of assigning parental lines to the proper heterotic groups (Lee et al. 1989; Reif et al. 2003). The potential of using smaller subsets of DNA marker data in combination with phenotypic data to select heterotic hybrids has also been proposed (Jordan et al. 2003).

(iv) Identification of genomic regions under selection

The identification of shifts in allele frequencies within the genome can be important information for breeders since it alerts them to monitor specific alleles or haplotypes and can be used to design appropriate breeding strategies (Steele et al. 2004). Other applications of the identification of genomic regions under selection are for QTL mapping: the regions under selection can be targeted for QTL analysis or used to validate previously detected marker–trait associations (Jordan et al. 2004). Ultimately, data on genomic regions under selection can be used for the development of new varieties with specific allele combinations using MAS schemes such as marker-assisted backcrossing or early generation selection (described below; Ribaut et al. 2001; Steele et al. 2004).

(b) Marker-assisted backcrossing

Backcrossing has been a widely used technique in plant breeding for almost a century. Backcrossing is a plant breeding method most commonly used to incorporate one or a few genes into an adapted or elite variety. In most cases, the parent used for backcrossing has a large number of desirable attributes but is deficient in only a few characteristics (Allard 1999). The method was first described in 1922 and was widely used between the 1930s and 1960s (Stoskopf et al. 1993).

The use of DNA markers in backcrossing greatly increases the efficiency of selection. Three general levels of marker-assisted backcrossing (MAB) can be described (Holland 2004; figure 3). In the first level, markers can be used in combination with or to replace screening for the target gene or QTL. This is referred to as ‘foreground selection’ (Hospital & Charcosset 1997). This may be particularly useful for traits that have laborious or time-consuming phenotypic screening procedures. It can also be used to select for reproductive-stage traits in the seedling stage, allowing the best plants to be identified for backcrossing. Furthermore, recessive alleles can be selected, which is difficult to do using conventional methods.

The second level involves selecting BC progeny with the target gene and recombination events between the target locus and linked flanking markers—we refer to this as ‘recombinant selection’. The purpose of recombinant selection is to reduce the size of the donor chromosome segment containing the target locus (i.e. size of the introgression). This is important because the rate of decrease of this donor fragment is slower than for unlinked regions and many undesirable genes that negatively affect crop performance may be linked to the target gene from the donor parent—this is referred to as ‘linkage drag’ (Hospital 2005). Using conventional breeding methods, the donor segment can remain very large even with many BC generations (e.g. more than 10; Ribaut & Hoisington 1998; Salina et al. 2003). By using markers that flank a target gene (e.g. less than 5 cM on either side), linkage drag can be minimized. Since double recombination events occurring on both sides of a target locus are extremely rare, recombinant selection is usually performed using at least two BC generations (Frisch et al. 1999b).

The third level of MAB involves selecting BC progeny with the greatest proportion of recurrent parent (RP) genome, using markers that are unlinked to the target locus; we refer to this as ‘background selection’. In the literature, background selection refers to the use of tightly linked flanking markers for recombinant selection and unlinked markers to select for the RP (Hospital & Charcosset 1997; Frisch et al. 1999b). Background markers are markers that are unlinked to the target gene/QTL on all other chromosomes, in other words, markers that can be used to select against the donor genome. This is extremely useful because the RP recovery can be greatly accelerated. With conventional backcrossing, it takes a minimum of six BC generations to recover the RP and there may still be several donor chromosome fragments unlinked to the target gene. Using markers, it can be achieved by BC4, BC3 or even BC2 (Visscher et al. 1996; Hospital & Charcosset 1997; Frisch et al. 1999a,b), thus saving two to four BC generations. The use of background selection during MAB to accelerate the development of an RP with one (or a few) additional genes has been referred to as ‘complete line conversion’ (Ribaut et al. 2002).
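The figure of six BC generations for conventional backcrossing can be related to the standard expectation for RP genome recovery without marker selection. The sketch below assumes that each backcross halves the remaining donor contribution on average, so that the expected RP proportion after n backcrosses is 1 − (1/2)^(n+1). It ignores linkage drag around the target locus and the variation among individual plants that background selection exploits, and the function name is hypothetical.

```python
def expected_rp_fraction(n_backcrosses):
    """Expected proportion of recurrent parent genome in BCn progeny
    when no marker-based background selection is applied."""
    return 1 - 0.5 ** (n_backcrosses + 1)


for n in range(1, 7):
    print(f"BC{n}: {expected_rp_fraction(n):.1%} recurrent parent genome expected")
```

On this expectation, BC1 progeny average 75% RP genome and BC6 progeny about 99.2%. Background selection aims to identify, in earlier generations, the individual plants that already exceed these averages.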

Some examples of MAB in cereals are presented in table 1. MAB will probably become an increasingly popular approach, largely for the same reasons that conventional backcrossing has been widely used (Mackill 2006). For practical reasons, farmers in developed and developing countries generally prefer to grow their ‘tried and tested’ varieties. Farmers have already determined the optimum sowing rates and date, fertilizer application rates and number and timing of irrigations for these varieties (Borlaug 1957). There may also be reluctance from millers or the marketing industry to dramatically change a variety since they have established protocols for testing flour characteristics. Furthermore, even with the latest developments in genetic engineering technology and plant tissue culture, some specific genotypes are still more amenable to transformation than others. Therefore, MAB must be used in order to trace the introgression of the transgene into elite cultivars during backcrossing.

(c) Marker-assisted pyramiding

Pyramiding is the process of combining several genes together into a single genotype. Pyramiding may be possible through conventional breeding but it is usually not easy to identify the plants containing more than one gene. Using conventional phenotypic selection, individual plants must be evaluated for all traits tested. Therefore, it may be very difficult to assess plants from certain population types (e.g. F2) or for traits with destructive bioassays. DNA markers can greatly facilitate selection because DNA marker assays are non-destructive and markers for multiple specific genes can be tested using a single DNA sample without phenotyping.

The most widespread application for pyramiding has been for combining multiple disease resistance genes (i.e. combining qualitative resistance genes together into a single genotype). The motive for this has been the development of ‘durable’ or stable disease resistance since pathogens frequently overcome single-gene host resistance over time due to the emergence of new plant pathogen races. Some evidence suggests that the combination of multiple genes (effective against specific races of a pathogen) can provide durable (broad spectrum) resistance (Kloppers & Pretorius 1997; Shanti et al. 2001; Singh et al. 2001). The ability of a pathogen to overcome two or more effective genes by mutation is considered much lower compared with the ‘conquering’ of resistance controlled by a single gene. In the past, it has been difficult to pyramid multiple resistance genes because they generally show the same phenotype, necessitating a progeny test to determine which plants possess more than one gene. With linked DNA markers, the number of resistance genes in any plant can be easily determined. The incorporation of quantitative resistance controlled by QTLs offers another promising strategy to develop durable disease resistance. Castro et al. (2003) referred to quantitative resistance as an insurance policy in case of the breakdown of qualitative resistance. A notable example of the combination of quantitative resistance was the pyramiding of a single stripe rust gene and two QTLs (Castro et al. 2003).

Pyramiding may involve combining genes from more than two parents. For example, Hittalmani et al. (2000) and Castro et al. (2003) combined genes originating from three parents for rice blast and stripe rust in barley, respectively. MAS pyramiding was also proposed as an effective approach to produce three-way F1 cereal hybrids with durable resistance (Witcombe & Hash 2000). Strategies for MAS pyramiding of linked target genes have also been evaluated (Servin et al. 2004). For many linked target loci, pyramiding over successive generations is preferable in terms of minimizing marker genotyping.

In theory, MAS could be used to pyramid genes from multiple parents (i.e. populations derived from multiple crosses). Some examples of MAS pyramiding in cereals are presented in table 2. In the future, MAS pyramiding could also facilitate the combination of QTLs for abiotic stress tolerances, especially QTLs effective at different growth stages. Another use could be to combine single QTLs that interact with other QTLs (i.e. epistatic QTLs). This was experimentally validated for two interacting resistance QTLs for rice yellow mottle virus (Ahmadi et al. 2001).

(d) Early generation marker-assisted selection

Although markers can be used at any stage during a typical plant breeding programme, MAS is a great advantage in early generations because plants with undesirable gene combinations can be eliminated. This allows breeders to focus attention on a smaller number of high-priority lines in subsequent generations. When the linkage between the marker and the selected QTL is not very tight, the greatest efficiency of MAS is in early generations due to the increasing probability of recombination between the marker and QTL. The major disadvantage of applying MAS at early generations is the cost of genotyping a larger number of plants.

One strategy proposed by Ribaut & Betran (1999) involving MAS at an early generation was called single large-scale MAS (SLS–MAS). The authors proposed that a single MAS step could be performed on F2 or F3 populations derived from elite parents. This approach used flanking markers (less than 5 cM, on both sides of a target locus) for up to three QTLs in a single MAS step. Ideally, these QTLs should account for the largest proportion of phenotypic variance and be stable in different environments.

The population sizes may soon become quite small due to the high selection pressure, thus providing an opportunity for genetic drift to occur at non-target loci, so it is recommended that large population sizes be used (Ribaut & Betran 1999). This problem can also be minimized by using F3 rather than F2 populations, because the selected proportion of an F3 population is larger than that of an F2 population (i.e. for a single target locus, 38% of the F3 population will be selected compared with 25% of the F2). Ribaut & Betran (1999) also proposed that, theoretically, linkage drag could be minimized by using additional flanking markers surrounding the target QTLs, much in the same way as in MAB.
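The 25% and 38% figures follow from the expected genotype frequencies at a single locus under selfing. The sketch below assumes a population derived from a heterozygous F1, advanced by selfing without selection, in which only plants homozygous for the target allele are retained; the function name is hypothetical.

```python
def target_homozygote_fraction(generation):
    """Expected fraction of plants homozygous for the target allele at one
    locus in generation F<generation>, after selfing from a heterozygous F1."""
    heterozygous = 0.5 ** (generation - 1)  # heterozygosity halves each generation
    return (1 - heterozygous) / 2           # the two homozygous classes are equal


for generation in range(2, 7):
    fraction = target_homozygote_fraction(generation)
    print(f"F{generation}: {fraction:.1%} homozygous for the target allele")
```

The expected fractions are 25% in the F2 and 37.5% in the F3 (rounded to 38% in the text), approaching 50% in later generations.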

For self-pollinated crops, an important aim may be to fix alleles in their homozygous state as early as possible. For example, in bulk and single-seed descent breeding methods, screening is often performed at the F5 or F6 generations when most loci are homozygous. Using co-dominant DNA markers, it is possible to fix specific alleles in their homozygous state as early as the F2 generation. However, this may require large population sizes; thus, in practical terms, a small number of loci may be fixed at each generation (Koebner & Summers 2003). An alternative strategy is to ‘enrich’ rather than fix alleles—by selecting homozygotes and heterozygotes for a target locus—within a population in order to reduce the size of the breeding populations required (Bonnett et al. 2005).
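The effect of the number of target loci on the required F2 population size, and the saving offered by enrichment, can be sketched as follows. The calculation assumes unlinked loci with Mendelian segregation, so that a fraction (1/4)^k of F2 plants is homozygous for the favourable allele at all k loci, whereas (3/4)^k carry at least one favourable allele at every locus. The 95% probability of recovering at least one target plant is an illustrative assumption, and the function names are hypothetical.

```python
import math


def f2_target_fraction(n_loci, enrich=False):
    """Expected fraction of F2 plants retained at n unlinked loci.
    enrich=False: homozygous favourable at every locus (fixation).
    enrich=True: homozygous or heterozygous at every locus (enrichment)."""
    per_locus = 0.75 if enrich else 0.25
    return per_locus ** n_loci


def min_population(fraction, probability=0.95):
    """Smallest population size giving at least the stated probability
    of containing one or more plants of the target genotype."""
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


for n_loci in range(1, 5):
    fix = min_population(f2_target_fraction(n_loci))
    enrich = min_population(f2_target_fraction(n_loci, enrich=True))
    print(f"{n_loci} loci: fix needs >= {fix} plants, enrich needs >= {enrich} plants")
```

Under these assumptions the minimum population for fixation grows roughly fourfold with each additional locus, whereas enrichment keeps it small. A breeder would in practice need many target plants rather than one, so real populations would be correspondingly larger.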

(e) Combined marker-assisted selection

There are several instances when phenotypic screening can be strategically combined with MAS. In the first instance, ‘combined MAS’ (coined by Moreau et al. 2004) may have advantages over phenotypic screening or MAS alone in order to maximize genetic gain (Lande & Thompson 1990). This approach could be adopted when additional QTLs controlling a trait remain unidentified or when a large number of QTLs need to be manipulated. Simulation studies indicate that this approach is more efficient than phenotypic screening alone, especially when large population sizes are used and trait heritability is low (Hospital et al. 1997). Bohn et al. (2001) investigated the prospect of MAS for improving insect resistance in tropical maize and found that MAS alone was less efficient than conventional phenotypic selection. However, there was a slight increase in relative efficiency when MAS and phenotypic screening were combined. In an example in wheat, MAS combined with phenotypic screening was more effective than phenotypic screening alone for a major QTL on chromosome 3BS for Fusarium head blight resistance (Zhou et al. 2003b). In practice, all MAS schemes will be used in the context of the overall breeding programme, and this will involve phenotypic selection at various stages. This will be necessary to confirm the results of MAS as well as select for traits or genes for which the map location is unknown.

In some (possibly many) situations, there is a low level of recombination between a marker and QTL, unless markers flanking the QTL are used (Sanchez et al. 2000; Sharp et al. 2001). In other words, a marker assay may not predict phenotype with 100% reliability. However, such markers may still be useful to breeders for selecting a subset of plants, thereby reducing the number of plants that need to be phenotypically evaluated. This may be particularly advantageous when marker genotyping is cheaper than phenotypic screening, such as for quality traits (Han et al. 1997). This was referred to as ‘tandem selection’ by Han et al. (1997) and ‘stepwise selection’ by Langridge & Chalmers (2005).

In addition to complementing conventional breeding methods, mapping QTLs for important traits may have an indirect benefit in a conventional breeding programme. In many cases, this occurs when traits that were thought to be under complex genetic control are found to be under the influence of one or a few major QTLs. For example, in pearl millet, downy mildew resistance was found to be under the control of genes of major effect (Jones et al. 1995). Likewise, submergence tolerance of rice was found to be under the control of the major QTL Sub1, which helped simplify the breeding for this trait (Mackill et al. 2006).

4. Reasons to explain the low impact of marker-assisted selection

(a) Still at the early stages of DNA marker technology development

Although DNA markers were first developed in the late 1980s, more user-friendly PCR-based markers such as SSRs were not developed until the mid- to late 1990s. Although currently large numbers of SSRs are publicly available for major cereals, this number was initially very low. It is only during the last 5–7 years that these markers could have been widely used, and tangible results may not yet have been produced. Inspection of the publication dates for the examples in tables 1 and 2 supports this. If this is the case, there should be a notable increase in the number of published papers describing MAS in the next 10 years and beyond.

(b) Marker-assisted selection results may not be published

Although QTL mapping has many potential practical outcomes, it is considered to be a basic research process, and results are typically published in scientific journals. However, for plant breeding, the final ‘product’ is a new variety. Although these varieties are registered, explicit details regarding the use of DNA markers during breeding may not be provided. Another reason for the limited number of published reports could be that private seed companies typically do not disclose details of methodology due to competition with other seed companies. In general, the problem of publishing also extends to QTL validation and QTL mapping. New QTLs are frequently reported in scientific journals, but reconfirmation of these QTLs in other germplasm and identification of more useful markers are usually not considered novel enough to warrant new publications. This is unfortunate because it is exactly this type of information that is needed for MAS. Some of this information can be found in symposia abstracts or on web sites, but these sources often provide little detail. An excellent example of successful MAS is the development of an improved version of the pearl millet hybrid HHB 67 with resistance to downy mildew, described at http://www.dfid-psp.org/AtAGlance/HotTopic.html.

The accuracy of the QTL mapping study is critical to the success of MAS. This is particularly important when QTL mapping is undertaken for more complex traits, such as yield, that are controlled by many QTLs with small effects compared with simple traits. Many factors may affect the accuracy of a QTL mapping study such as the level of replication used to generate phenotypic data and population size (Kearsey & Farquhar 1998; Young 1999). Simulation and experimental studies have indicated that the power of QTL detection is low with the typical populations (less than 200) that are used (Beavis 1998; Kearsey & Farquhar 1998). As a result, confidence intervals for regions containing QTLs may be large, even for QTLs with large effects. Furthermore, sampling bias can lead to a large bias in estimates of QTL effects, especially in relatively small population sizes (Melchinger et al. 1998). These factors have important implications for MAS, since the basis for selecting markers depends on the accurate determination of the position and effect of a QTL.
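The combination of low detection power and inflated effect estimates in small populations can be reproduced with a small Monte Carlo sketch. The example below is a deliberately simplified assumption: a single marker located exactly at the QTL, a backcross-like 1:1 segregation, unit residual variance, and a fixed t-statistic threshold. It is not the design used in the studies cited above, and the function name and default values are hypothetical.

```python
import random
import statistics


def simulate_qtl_detection(true_effect, n_plants, n_reps=1000, t_threshold=3.0, seed=1):
    """Return (detection power, mean estimated effect among significant replicates)."""
    rng = random.Random(seed)
    significant_estimates = []
    for _ in range(n_reps):
        carriers, non_carriers = [], []
        for _ in range(n_plants):
            has_allele = rng.random() < 0.5  # 1:1 segregation at the marker
            phenotype = rng.gauss(0.0, 1.0) + (true_effect if has_allele else 0.0)
            (carriers if has_allele else non_carriers).append(phenotype)
        if len(carriers) < 2 or len(non_carriers) < 2:
            continue  # cannot estimate a variance for this replicate
        estimate = statistics.fmean(carriers) - statistics.fmean(non_carriers)
        std_error = (statistics.variance(carriers) / len(carriers)
                     + statistics.variance(non_carriers) / len(non_carriers)) ** 0.5
        if abs(estimate / std_error) > t_threshold:
            significant_estimates.append(estimate)
    power = len(significant_estimates) / n_reps
    mean_estimate = (statistics.fmean(significant_estimates)
                     if significant_estimates else float("nan"))
    return power, mean_estimate
```

Calling `simulate_qtl_detection(0.3, 100)` and `simulate_qtl_detection(0.3, 400)` compares a small and a larger population for a QTL whose true effect is 0.3 phenotypic standard deviations. In this sketch the smaller population detects the QTL less often, and the replicates that do reach significance overestimate the effect more strongly, because only unusually large estimates pass the threshold.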

In some cases, recombination occurs between the marker and gene/QTL due to loose linkage (Sharp et al. 2001; Thomas 2003). This may occur even if genetic distances from a preliminary QTL mapping study indicated tight linkage, because data from a single QTL mapping experiment may not be accurate (Sharp et al. 2001). The process of marker validation is required to determine the reliability of a marker to predict phenotype and this points out the advantages of using flanking markers.
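The advantage of flanking markers can be quantified with a short calculation. The sketch below assumes selection on a gamete from a parent heterozygous at the markers and the QTL (as in a backcross), and assumes no crossover interference. The recombination fractions and function names are illustrative assumptions.

```python
def single_marker_reliability(r: float) -> float:
    """Probability that the target QTL allele is present given the linked marker allele."""
    return 1.0 - r


def flanking_marker_reliability(r1: float, r2: float) -> float:
    """Probability that the target QTL allele is present given both flanking marker alleles.

    The QTL allele is lost only by a double crossover (one in each interval);
    independence of crossovers in the two intervals is assumed.
    """
    no_crossover = (1.0 - r1) * (1.0 - r2)
    double_crossover = r1 * r2
    return no_crossover / (no_crossover + double_crossover)


if __name__ == "__main__":
    # Hypothetical markers, each with a recombination fraction of 0.05 from the QTL.
    print(round(single_marker_reliability(0.05), 4))
    print(round(flanking_marker_reliability(0.05, 0.05), 4))
```

With the assumed recombination fraction of 0.05, a single marker predicts the QTL allele with a probability of 0.95, whereas a pair of flanking markers raises this to about 0.9972.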

Ideally, markers should be ‘diagnostic’ for traits in a wide range of breeding material. In other words, markers should clearly discriminate between varieties that do and do not express the trait. Unfortunately, in practice, DNA markers are not always diagnostic. For example, a wheat SSR marker was diagnostic for the Sr2 gene (controlling stem rust resistance) for all except four susceptible Australian cultivars, in which the same marker allele was detected as for the source of resistance (Spielmeyer et al. 2003). This would preclude the use of this SSR marker for the introgression of resistance in the four susceptible cultivars, requiring that additional markers be developed. Even with the large numbers of available markers in some crops, there can be specific chromosome regions containing an important gene or QTL for which it is difficult to find polymorphic markers.

(f) Effects of genetic background

It has been observed that QTLs identified in a particular mapping population may not be effective in different backgrounds (Liao et al. 2001). For example, Steele et al. (2006) found that only one of four root-length QTLs was effective when transferred by backcrossing into a new rice variety. In some cases, this is due to the small effect of an allele transferred into elite varieties (Charcosset & Moreau 2004). Often for QTL mapping experiments, parents that represent the extreme ends of a trait phenotype are selected. This increases the chance of detecting QTLs because QTL mapping is based on statistically different means of marker groups. The main disadvantage with this approach is that one (or even both) parent(s) may possess QTL alleles that are similar or even identical to those in the elite germplasm used in breeding programmes. In this case, the effect of a QTL may be insignificant when used for introgression into elite varieties. In other cases, the effect of a QTL may differ in different genetic backgrounds due to interactions with other loci or epistasis (Holland 2001; Li 2000).

(g) Quantitative trait loci×environment effects

While the effects of many QTLs appear to be consistent across environments, the magnitude of effect and even direction of QTLs may vary depending on environmental conditions due to QTL×environment interactions (Hayes et al. 1993; Romagosa et al. 1999; Bouchez et al. 2002; Li et al. 2003). This often occurs for QTLs with smaller effects. The extent of QTL×environment interactions is often unknown because the QTL mapping studies have been limited to only a few years (replications) or locations. The existence of QTL×environment interactions must be carefully considered in order to develop an effective MAS scheme.

(h) High cost of marker-assisted selection

The cost of using MAS compared with conventional phenotypic selection may vary considerably, although only a relatively small number of studies have addressed this topic. Landmark papers by Dreher et al. (2003) and Morris et al. (2003) showed that the cost–benefit ratio of MAS will depend on several factors, such as the inheritance of the trait, the method of phenotypic evaluation, the cost of field and glasshouse trials and labour costs. It is also worth noting that large initial capital investments are required for the purchase of equipment, and regular expenses will be incurred for maintenance. Intellectual property rights, for example, licensing costs due to patents, may also affect the cost of MAS (Jorasch 2004; Brennan et al. 2005). One approach to this problem is to contract the marker work out to larger laboratories that can benefit from economies of scale and high-throughput equipment.

(i) ‘Application gap’ between research laboratories and plant breeding institutes

In many cases, QTL mapping research is undertaken at universities whereas breeding is generally undertaken at different locations such as research stations or private companies. Consequently, there may be difficulties in the transfer of markers and relevant information to breeders in situations where the two groups do not work closely together. More importantly, Van Sanford et al. (2001) also pointed out that transfer problems may be related to the culture of the scientific community. Given the emphasis on conducting innovative research, and on the publication of research results within academic institutions, scientists do not have much motivation to ensure that markers are developed into breeder-friendly ones and that they are actually applied in breeding programmes. This applies even more to activities in the private sector, where publication of results is generally discouraged.

DNA marker technology, QTL theory and statistical methodology for QTL analysis have undergone rapid developments in the past two decades. These concepts and the jargon used by molecular biologists may not be clearly understood by plant breeders and other plant scientists (Collard et al. 2005). In addition to this, many highly specialized pieces of equipment are based on sophisticated techniques used for molecular genotyping. Similarly, fundamental concepts in plant breeding may not be well understood by molecular biologists. This restricts the level of integration between conventional plant and molecular breeding and ultimately affects the development of new breeding lines.

5. Plant breeding in the future: the dawn of marker-assisted selection?

Despite the relatively small impact that MAS has had on variety development to date, there has been a ‘cautious optimism’ for the future (Young 1999). We predict that six main factors will give rise to a much greater level of adoption of MAS in plant breeding in the early part of the twenty-first century in many breeding programmes.

First, the extent to which DNA marker technology has already spread to plant breeding institutes coupled with the enormous amount of data from previous QTL mapping and MAS studies should lead to the greater adoption of MAS. Many such institutes now possess the essential equipment and expertise required for marker genotyping. Of course, the frequency of use will depend on available funding.

Second, since the landmark concept of ‘advanced BC QTL analysis’ directly integrated QTL mapping with plant breeding by combining QTL mapping with simultaneous variety development (Tanksley & Nelson 1996), there have been several encouraging examples of an efficient merging of plant and molecular breeding. Some of these excellent examples are Toojinda et al. (1998) and Castro et al. (2003) in which QTL mapping and MAS breeding were combined. There have also been encouraging reports of the combination of QTL validation and line development (Flint-Garcia et al. 2003b). The use of backcrossing and the development of near-isogenic lines (NILs) may be particularly advantageous in this context (Stuber et al. 1999; van Berloo et al. 2001). Ideally, QTL mapping and marker-assisted line development should now always be conceived together, in a holistic scheme.

Third, the increasing use of genetic transformation technology means that MAS can be used to directly select for progeny that possess transgenes via target gene selection. As discussed earlier, specific genotypes often with poor agronomic characteristics are routinely used for transformation. Therefore, MAS can be used to track the transgenes during elite line development.

Fourth, a rapid growth in genomics research has taken place within the last decade. Data generated from functional genomics studies have led to the identification of many candidate genes for numerous traits. SNPs within candidate genes could be extremely useful for ‘association mapping’ and ultimately MAS (Rafalski 2002; Flint-Garcia et al. 2003a; Gupta et al. 2005; Breseghello & Sorrells 2006). This approach also circumvents the requirement for constructing linkage maps and performing QTL analysis for new genotypes that have not been previously mapped, although genotyping and phenotyping of segregating populations (e.g. F2 or F3) is recommended for marker validation (Breseghello & Sorrells 2006). Furthermore, genome sequencing projects in rice and other crop species will provide considerable data that could be used for QTL mapping and marker development in other cereals (Gale & Devos 1998; Yuan et al. 2001; Varshney et al. 2005). However, the costs associated with genomics research may be considerable. This could be detrimental to breeding programmes if funding is diverted away from actual breeding efforts (Brummer 2004).

Fifth, many new high-throughput methods for DNA extraction and especially new high-throughput marker genotyping platforms have been developed (Syvanen 2001, 2005). A current trend in some crops is the adoption of high-throughput genotyping equipment for SSR and SNP markers, although the cost of these new platforms may be higher than for standard genotyping methods (Brennan et al. 2005). Some of these genotyping platforms use fluorescently labelled primers that permit high levels of multiplexing (Coburn et al. 2002). Some authors have predicted that SNP markers, due to their widespread abundance and potentially high levels of polymorphism, and the development of SNP genotyping platforms will have a great impact on MAS in the future (Rafalski 2002; Koebner & Summers 2003). Numerous SNP genotyping platforms have been recently developed, usually for medical applications; however, at present no superior platform has been universally adopted (Syvanen 2001). Array-based methods such as Diversity Array Technology (DArT; Jaccoud et al. 2001) and single feature polymorphism (SFP) detection (Hazen & Kay 2003; Rostoks et al. 2005) offer prospects for lower-cost marker technology that can be used for whole-genome scans.

Finally, the availability of large numbers of publicly available markers and the parallel development of user-friendly databases for the storage of marker and QTL data will undoubtedly encourage the more widespread use of MAS. In cereals, two of the most extensive and useful databases are ‘Gramene’ and ‘GrainGenes’ (Ware et al. 2002a,b; Matthews et al. 2003). The development and curation of these and other databases to keep pace with the continually growing amount of data generated will be critical for the efficient use of markers in the future (Lehmensiek et al. 2005).

Although we believe that these factors will lead to the greater adoption of MAS in many instances (especially for major cereals), there will clearly be situations in which the incorporation of MAS in plant breeding programmes will still be very slow or even non-existent, for example in orphan crop species and in developing countries (Naylor et al. 2004). In both of these situations, funding of research and breeding programmes is extremely limited. The improvement of orphan crop species, especially in developing countries—using any method—represents another great challenge for agricultural scientists.

Generally, the cost of MAS will continue to be a major obstacle for its application. Some cost estimates for consumables and labour associated with MAS are listed in table 3 in order to provide information for breeding programmes. It should be noted that MAS cost estimates may change depending on the number of samples and/or number of marker assays. The study by Dreher et al. (2003) indicated that costs may decrease as the number of samples and/or marker assays increases due to economies of scale and lack of divisibility for many components of MAS. One current trend is the establishment of marker genotyping companies, which will enable marker genotyping to be outsourced. Assuming that outsourced genotyping is cheaper, and that logistical problems are not created or are minimal, this may provide breeding programmes with more opportunities for MAS. Furthermore, some new SNP high-throughput genotyping methods may also be comparable with or even cheaper than current methods, although a large initial investment is required for the purchase of equipment (Chen & Sullivan 2003).
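The effect of economies of scale and indivisible components on the cost per data point can be sketched as follows. The example assumes that each marker is run on whole plates of a fixed size, so that a partly filled plate costs as much to run as a full one. The cost figures, the plate size and the function name are hypothetical and are not taken from table 3 or from Dreher et al. (2003).

```python
import math


def cost_per_data_point(n_samples: int, n_markers: int, reagent_cost_per_assay: float,
                        plate_run_cost: float, plate_size: int = 96) -> float:
    """Average cost of one marker data point when plate runs are indivisible."""
    plates = n_markers * math.ceil(n_samples / plate_size)  # whole plates only
    data_points = n_samples * n_markers
    total = data_points * reagent_cost_per_assay + plates * plate_run_cost
    return total / data_points


if __name__ == "__main__":
    # Hypothetical costs: 0.30 per assay in reagents, 20.0 per plate run.
    for n_samples in (10, 96, 97, 960):
        print(n_samples, round(cost_per_data_point(n_samples, 1, 0.30, 20.0), 2))
```

With these assumed values the cost per data point falls from 2.30 for 10 samples to about 0.51 for a full plate of 96, and rises again to about 0.71 for 97 samples because a second plate must be run. This is one simple way in which a lack of divisibility affects MAS costs.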

Table 3. Estimates of costs (consumables and labour) per data point for marker genotyping during MAS.

6. Realizing the potential of marker-assisted selection for crop improvement

Considering the enormous potential of MAS in plant breeding, achieving a tangible impact on crop improvement represents the great challenge of molecular breeding in the early part of the twenty-first century. Solutions to the above-mentioned obstacles of MAS need to be developed in order to achieve a greater impact. In the short term, the most important factors that should enable the impact of MAS to be realized include:

a greater level of integration among conventional breeding, QTL mapping/validation and MAS,

careful planning and execution of QTL mapping studies (especially for complex quantitative traits) and an emphasis on validating results prior to MAS,

optimization of methods used in MAS such as DNA extraction and marker genotyping, especially in terms of cost reduction and efficiency, and

For MAS to reach its full potential for crop improvement, the advantages of MAS over conventional breeding need to be fully exploited. This may depend on ex ante studies evaluating alternative schemes prior to experimentation. Computer simulations may indicate the most effective breeding schemes in order to maximize genetic gain and minimize costs (Kuchel et al. 2005). Based on the schemes of MAS reviewed in this paper, the most important areas to target include:

use of markers for the selection of parents in breeding programmes,

continued use of MAS for high-priority traits that are difficult, time consuming or expensive to measure,

using markers to minimize linkage drag via recombinant selection,

screening of multiple traits per line (i.e. per unit of DNA), especially populations derived from multiple F1s for pyramiding,

exploiting the ability to rapidly eliminate unsuitable lines after early generation selection or tandem selection in breeding programmes, thus allowing breeders to concentrate on the most promising materials, and

exploiting the time savings for line development (especially using background selection) for accelerated variety release.
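For the pyramiding item in the list above, the population size needed to recover the desired genotype grows quickly with the number of loci. The sketch below assumes unlinked loci segregating in an F2, so that a plant is homozygous for the favourable allele at any one locus with probability 1/4. It returns the smallest population expected to contain at least one plant with the full combination at a chosen probability. The function name and defaults are illustrative assumptions.

```python
import math


def min_population_size(n_loci: int, confidence: float = 0.95,
                        target_freq_per_locus: float = 0.25) -> int:
    """Smallest population containing at least one target genotype with the given probability."""
    target_freq = target_freq_per_locus ** n_loci  # unlinked loci assumed independent
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - target_freq))


if __name__ == "__main__":
    for n_loci in (1, 2, 3):
        print(n_loci, min_population_size(n_loci))
```

Under these assumptions, 11 plants suffice for one locus, 47 for two loci and 191 for three loci at 95% probability. This is why markers that allow many plants to be screened cheaply are attractive for pyramiding.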

For MAS in orphan crops and breeding programmes in developing countries, emphasis should be given to careful prioritization of traits for marker development as well as simplifying and optimizing methods to reduce marker genotyping costs. Currently at IRRI, we are investigating ways in which marker genotyping costs can be further reduced. Preliminary cost analysis indicates potential for cost reduction of standard genotyping methods, which was also reported to be the case at CIMMYT (Dreher et al. 2003). An effective strategy to increase the arsenal of DNA markers in orphan crops could be to conduct data mining of genomics databases. An excellent example of the use of publicly available DNA sequence data to develop new markers for an orphan crop was the development of single-strand conformational polymorphism (SSCP)–SNP markers in pearl millet (Bertin et al. 2005). Similarly, information on rice markers has been useful for genetics of American wild rice, Zizania palustris (Phillips et al. 2006).

Generally, innovation—big and small—may play an important role in obtaining tangible benefits from MAS. Dekkers & Hospital (2002) stated that there is considerable scope for innovative plant/molecular breeding schemes that are tailor-made for using DNA markers; such schemes could lead to a completely new plant breeding paradigm.

Advances in functional genomics will lead to the rapid identification of gene functions in the major cereal crops. This strategy usually relies on fine mapping using molecular markers, as well as other methods such as gene-expression studies (microarray), mutants and gene knockouts, RNAi and association genetics. The identification of gene function will allow the development of allele-specific markers that will be more efficient than using linked DNA markers. In addition, the identified genes can be used for transformation studies as well as mining of gene banks to find more useful alleles. Even though we can expect far-reaching advances in the area of gene function identification, the complex genetic interactions that produce different phenotypes may remain unexplained for the most part. However, even in these cases, we may identify chromosome fragments that are conducive to improved phenotype.

A breeding application resulting from the development of high-throughput genotyping equipment is the use of ‘whole-genome scans’ for determining allelic variation at many agronomically important loci in the genome (Langridge & Chalmers 2005; Langridge 2006). One recent approach called ‘breeding by design’ could enable breeders to exploit known allelic variation to design superior genotypes by combining multiple favourable alleles (Peleman & van der Voort 2003). This also means that plants with the desired combinations of genes can be pre-selected before extensive and expensive field testing. In many cases, the objective would be just to avoid advanced testing of a number of lines with similar genotypic constitutions. Current limitations to the application of breeding by design or similar approaches include the prohibitive cost, since thousands of marker loci need to be scored in breeding material, and, perhaps more importantly, our limited knowledge of the function of the majority of agronomically important genes and of their allelic interactions with respect to phenotype, much of which remains unknown. Therefore, at least in the short term, such approaches will probably not have a great impact on crop improvement.
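The pre-selection step described above can be sketched as a simple filter over genotype data. The example below keeps only lines that carry the desired allele at every target locus and retains one representative of each distinct genotype, so that lines with identical genotypic constitutions are not all advanced to field testing. The locus names, allele codes and function name are hypothetical.

```python
from typing import Dict, List

Genotype = Dict[str, str]  # locus name -> allele call


def preselect_lines(lines: Dict[str, Genotype], ideotype: Genotype) -> List[str]:
    """Return lines matching the ideotype at all target loci, one per distinct genotype."""
    seen = set()
    selected = []
    for name, genotype in lines.items():
        if any(genotype.get(locus) != allele for locus, allele in ideotype.items()):
            continue  # lacks a desired allele at one or more target loci
        signature = tuple(sorted(genotype.items()))
        if signature in seen:
            continue  # same genotypic constitution as a line already selected
        seen.add(signature)
        selected.append(name)
    return selected


if __name__ == "__main__":
    # Hypothetical genotype calls at three loci for four breeding lines.
    lines = {
        "line1": {"locus_a": "AA", "locus_b": "BB", "locus_c": "cc"},
        "line2": {"locus_a": "AA", "locus_b": "bb", "locus_c": "cc"},
        "line3": {"locus_a": "AA", "locus_b": "BB", "locus_c": "cc"},
        "line4": {"locus_a": "AA", "locus_b": "BB", "locus_c": "CC"},
    }
    print(preselect_lines(lines, {"locus_a": "AA", "locus_b": "BB"}))
```

In this sketch `line2` is dropped because it lacks the desired allele at `locus_b`, and `line3` is dropped as a duplicate of `line1`. A practical scheme would also need to handle missing data, heterozygotes and near-identical genotypes, which this example does not attempt.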

7. Conclusions

Plant breeding has made remarkable progress in crop improvement and it is critical that this continue. It seems clear that current breeding programmes continue to make progress through commonly used breeding approaches. MAS could greatly assist plant breeders in reaching this goal although, to date, the impact on variety development has been minimal. For the potential of MAS to be realized, it is imperative that there should be a greater integration with breeding programmes and that current barriers be well understood and appropriate solutions developed. The exploitation of the advantages of MAS relative to conventional breeding could have a great impact on crop improvement. The high cost of MAS will continue to be a major obstacle for its adoption for some crop species and plant breeding in developing countries in the near future. Specific MAS strategies may need to be tailored to specific crops, traits and available budgets. New marker technology can potentially reduce the cost of MAS considerably. If the effectiveness of the new methods is validated and the equipment can be easily obtained, this should allow MAS to become more widely applicable for crop breeding programmes.

Borlaug, N. E. 1957 The development and use of composite varieties based upon the mechanical mixing of phenotypically similar lines developed through backcrossing. Report of the Third International Wheat Conference, pp. 12–18.

2004 Comparison of identity by descent and identity by state for detecting genetic regions under selection in a sorghum pedigree breeding program. Mol. Breed. 14, 441–454. doi:10.1007/s11032-005-0901-y.

1998 Quantitative trait locus (QTL) mapping using different testers and independent population samples in maize reveals low power of QTL detection and large bias in estimates of QTL effects. Genetics 149, 383–403.

1998 Critical role of plant biotechnology for the genetic improvement of food crops: perspectives for the next millennium. Electron. J. Biotechnol. 1(3), [cited 15 August], doi:10.2225/vol1-issue3-fulltext-7.