Abstract

Two strategies to control mosquito-borne diseases, such as malaria and dengue fever, are reducing mosquito population sizes or replacing populations with disease-refractory varieties. We propose a genetic system, Semele, which may be used for both. Semele consists of two components: a toxin expressed in transgenic males that either kills or renders infertile wild-type female recipients and an antidote expressed in females that protects them from the effects of the toxin. An all-male release results in population suppression because wild-type females that mate with transgenic males produce no offspring. A release that includes transgenic females results in gene drive since females carrying the allele are favored at high population frequencies. We use simple population genetic models to explore the utility of the Semele system. We find that Semele can spread under a wide range of conditions, all of which require a high introduction frequency. This feature is desirable since transgenic insects released accidentally are unlikely to persist, transgenic insects released intentionally can be spatially confined, and the element can be removed from a population through sustained release of wild-type insects. We examine potential barriers to Semele gene drive and suggest molecular tools that could be used to build the Semele system.

MOSQUITO-BORNE diseases such as malaria and dengue fever continue to pose a major health problem through much of the world. The Roll Back Malaria Initiative set out to halve malaria deaths by 2010 but did not succeed even in reducing them (Shiff 2000; World Health Organization 2009), and a treatment for dengue fever remains elusive. The failure of existing methods to control these diseases has renewed interest in approaches to disease prevention that involve the use of genetically modified mosquitoes (Braig and Yan 2001; Alphey et al. 2002; Sinkins and Gould 2006; Marshall and Taylor 2009).

There are two main strategies being considered to control vector-borne diseases using transgenic vectors. The first involves the release of genetically modified males that will mate with wild females and produce unviable offspring (Whitten and Foster 1975; Alphey et al. 2002; Dyck et al. 2005; Catteruccia et al. 2009). This is a genetic version of the sterile insect technique and is intended to dramatically reduce the vector population size and consequently reduce disease transmission. The technology for this strategy has already been developed for Aedes aegypti, the main vector of dengue fever, and preparations are currently being made for an environmental release (Vasan 2009).

The second strategy for disease prevention is to replace entire populations of mosquitoes with varieties that are refractory to disease transmission. A variety of genes conferring disease refractoriness have been identified in nature and engineered in the laboratory. For example, with respect to malaria, Ito et al. (2002) engineered a gene that saturates the receptor sites that the malaria parasite requires to pass through the mosquito gut following ingestion; de Lara Capurro et al. (2000) developed antibodies that kill malaria parasites; Riehle et al. (2006) discovered genes that govern refractoriness in natural populations; and Corby-Harris et al. (2010) activated a signaling pathway that dramatically reduces both parasite development and mosquito longevity. Expression of RNAs that induce RNA interference targeting dengue virus has also been shown to reduce dengue transmission (Franz et al. 2006). Mosquitoes carrying genes that mediate disease refractoriness are not expected to experience a fitness benefit in both the presence and the absence of infection (Lambrechts et al. 2008) and may well experience a cost (Schmid-Hempel 2005). Given that a very high fraction of mosquitoes must be disease refractory to achieve significant levels of disease protection (Boete and Koella 2002, 2003), it is generally thought that population replacement will require that genes mediating disease refractoriness be linked to a genetic system capable of driving them into the population (Braig and Yan 2001; James 2005; Sinkins and Gould 2006).

A number of gene drive systems have been proposed, including naturally occurring selfish genetic elements such as transposons, B chromosomes, meiotic drive, Medea elements, homing endonuclease genes, and the intracellular bacterium Wolbachia. Another set of approaches to bringing about population replacement involves creating insects in which genes of interest are linked to engineered chromosomes: compound chromosomes or translocations (Curtis 1968; Foster et al. 1972; Gould and Schliekelman 2004) or pairs of unlinked lethal genes, each of which is associated with a repressor of the lethality induced by expression of the other lethal gene, a system known as engineered underdominance (Davis et al. 2001; Magori and Gould 2006).

A synthetic version of the Medea drive system was recently created and observed to spread rapidly through laboratory populations of Drosophila melanogaster (Chen et al. 2007). The ability of Medea to spread and the rate at which it spreads are a function of its fitness cost and introduction frequency. Medea elements with large fitness costs are expected to require high introduction frequencies to spread; but elements with small fitness costs are expected to spread from very low frequencies, particularly in real populations where population structure and stochastic effects become relevant. These features make Medea an interesting system for large-scale population replacement with a fully characterized arsenal of antipathogen genes.

In the early stages of testing, particularly in the field, it would be desirable to have gene drive systems that are either self-limiting or unlikely to spread following an accidental release (Benedict and Robinson 2003; Benedict et al. 2008; Marshall 2009). Candidates include a system known as killer rescue, which is designed to spread a linked transgene locally for a limited period of time before falling out of the population (Gould et al. 2008), and engineered underdominance, which requires a population frequency of >67% for a single-locus system and 27% for a two-locus system to spread (Davis et al. 2001). Neither of these systems has been implemented to date. Gene drive systems with high release thresholds are desirable since they may be confined to single populations or nearby populations exchanging large numbers of migrants with each other. This is an important property during open field trials, which must ultimately occur to test the efficacy of these systems. A high release threshold also creates a mechanism for removing the drive system and antipathogen genes from the population through large-scale release of wild-type insects, thus diluting the drive system to subthreshold frequencies.

Here, we describe a genetic system, Semele, which may be used for both suppression and replacement of disease vector populations. The system consists of two components—a toxin expressed in the reproductive system of transgenic males and an antidote expressed in females that protects them from being killed or rendered infertile following mating with a transgenic male (Figure 1). We name this system after the mortal female in Greek mythology, Semele (pronounced “Sem-uh-lee”), who was impregnated by Zeus but later died after witnessing his godliness because she was not herself a god. The name also stands for semen-based lethality.

Figure 1. Schematic diagram of a Semele element. (A) The element is composed of two genes: either a toxin expressed in the male accessory glands and an antidote expressed in female somatic tissues, or a toxin expressed in the male germ line and an antidote expressed in the female germ line for deposition in the egg. (B) Crosses between transgenic males and wild-type females result in loss of progeny or adult female death because wild-type females do not express the antidote to the male's toxic ejaculate. If the antidote is recessive (requiring inheritance of two copies of the Semele element to function), then only homozygous females or their progeny are protected from the toxin.

A number of approaches may be taken for engineering the Semele system, none of which have been realized to date. The toxin could be expressed in the premeiotic germ line, with the product being shared by all haploid products of meiosis. This approach has been proposed in the context of utilizing genes that mediate cytoplasmic incompatibility (CI) in the intracellular bacterium Wolbachia (Sinkins et al. 1997; Turelli and Hoffmann 1999). Alternatively, the toxin could be expressed in somatic cells that synthesize components of semen transferred to the female upon mating (Gould 2007). Upon mating with a wild-type female, the ejaculate either renders the female infertile through the action of a toxin released by sperm that kills fertilized eggs or kills her through the action of a semen-associated toxin on, for example, her nervous system. The second component of this system is an antidote expressed in females that protects them from being rendered infertile or killed following mating with a transgenic male (Figure 1). A release of purely transgenic males results in population suppression, since wild females who mate with transgenic males produce no offspring. A release that includes transgenic females results in gene drive under permissive conditions since females having the Semele allele are favored due to their immunity to the toxic ejaculate of transgenic males.

Importantly, the Semele drive system has a release threshold. At low population frequencies, transgenic males are disadvantaged when they mate with wild-type females because these crosses produce no offspring. However, this disadvantage diminishes at high population frequencies, when element-bearing females are common. In short, when the advantage conferred on transgenic females outweighs the disadvantage conferred on transgenic males, Semele is driven into the population.

Here, we present simple one- and two-locus genetic models that describe the population dynamics of the Semele drive system. We explore a range of parameters over which Semele can function as a gene drive system, including element-associated fitness costs and efficiency of toxin action. We also explore two system variants: X-linked alleles and recessive antidotes. Finally, we explore the severity of potential barriers to spread, such as assortative mating and the prior existence in the population of an allele conferring toxin resistance. In conclusion, we discuss ways in which the Semele drive mechanism could be engineered.

MODEL DEVELOPMENT

We use discrete-generation difference equations to model the population dynamics of the Semele system and its variants. We consider the Semele element as a single allele, which we denote as “T”. This allele carries a minimum of two genes: one that encodes the toxin expressed in males and a second that encodes an antidote expressed in females. By placing the toxin gene within an intron of the antidote gene, these genes may be considered inextricably linked, since any product of breakage and rejoining will lack a functional antidote and will therefore be rendered unviable (Chen et al. 2007). We refer to the corresponding position on the wild-type chromosome as “t”. In the simplest form of our model, we consider the case of a dominant toxin and a dominant antidote. In this case, only one copy of the toxin gene is sufficient to kill susceptible females and one copy of the antidote gene is sufficient to neutralize the toxin in a transgenic female, independent of whether her mating partner has one or two copies of the toxin gene.

A one-locus model can then be used for the discrete generation dynamics. For the deterministic version of this model, we assume random mating and an infinite population size. We account for a fitness cost due to having one or two copies of the allele and allow this cost to differ in males and females. This gender specificity is incorporated to account for the fact that a toxin expressed in males (which is designed to kill) is likely to be more costly than an antidote (whose only function is to inhibit the toxin) expressed in females. We also account for the possibility that toxin efficiency is <100% and allow for an unequal gender ratio at the time of release.

The mathematical formulation of this model is as follows: the proportions of the kth generation that are males of genotypes tt, Tt, and TT are denoted by , , and , respectively. The corresponding proportions for females are , , and . By considering all possible mating pairs, the genotypes of embryos in the next generation are described by the ratio , where(1)(2)(3)Here, denotes toxin efficiency, which is equal to the probability that, when a transgenic male mates with a wild-type female, the female will either die or be rendered infertile due to the toxin. In our basic model, we assume that toxin efficiency is the same in heterozygous and homozygous transgenic males and that antidote efficiency is 100% in transgenic females. Later, we relax both of these assumptions. Genotype frequencies in the next generation are then given by(4)(5)(6)(7)(8)(9)Here, and represent the fitness costs for males that are homozygous and heterozygous for the Semele allele, respectively; and and represent the equivalent fitness costs for females. The normalizing term, , is given by(10)We consider a release of TT individuals in which both the release size and the gender ratio may be varied. Considering a release at generation 0, the initial condition for the difference equations is given by(11)(12)(13)Here, the released individuals represent a proportion, , of the total population, and a fraction, r, of the released individuals are male. Using this initial condition and the difference equations described above, the equilibria, thresholds, and time-series dynamics of the Semele system can be calculated.
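The verbal description above can be turned into a small numerical sketch. The code below is a reconstruction from the prose only, not a transcription of Equations 1–13: all function and parameter names (`next_generation`, `release`, `e`, `s_f`, `s_m`, `h`, `x`, `r`) are illustrative choices, genotype frequencies are tracked within each sex rather than as proportions of the whole population, and heterozygotes are assumed to pay a fraction `h` of the homozygous cost, with `h = 0.5` as an arbitrary default.

```python
from itertools import product

GENOTYPES = ("tt", "Tt", "TT")
# Probability that a parent of each genotype transmits the T allele.
P_TRANSMIT_T = {"tt": 0.0, "Tt": 0.5, "TT": 1.0}


def offspring(mother, father):
    """Mendelian genotype probabilities among offspring of one mating pair."""
    pm, pf = P_TRANSMIT_T[mother], P_TRANSMIT_T[father]
    return {"tt": (1 - pm) * (1 - pf),
            "Tt": pm * (1 - pf) + (1 - pm) * pf,
            "TT": pm * pf}


def next_generation(females, males, e=1.0, s_f=0.0, s_m=0.0, h=0.5):
    """One generation of the reconstructed recursion.

    females, males: dicts of within-sex genotype frequencies.
    e: toxin efficiency; s_f, s_m: homozygous fitness costs by sex.
    h: share of the cost paid by heterozygotes (assumed, not from the text).
    """
    embryos = dict.fromkeys(GENOTYPES, 0.0)
    for mom, dad in product(GENOTYPES, repeat=2):
        pair = females[mom] * males[dad]          # random mating
        if mom == "tt" and dad != "tt":
            pair *= 1.0 - e                       # toxin kills or sterilizes her
        for genotype, prob in offspring(mom, dad).items():
            embryos[genotype] += pair * prob
    if sum(embryos.values()) == 0.0:
        raise ValueError("no viable offspring in this generation")

    def adults(s):
        # Apply sex-specific fitness costs, then renormalize within the sex.
        fitness = {"tt": 1.0, "Tt": 1.0 - h * s, "TT": 1.0 - s}
        raw = {g: embryos[g] * fitness[g] for g in GENOTYPES}
        total = sum(raw.values())
        return {g: raw[g] / total for g in GENOTYPES}

    return adults(s_f), adults(s_m)


def release(x, r=0.5):
    """Generation-0 frequencies after releasing TT insects.

    x: released fraction of the total population (0 <= x < 1).
    r: fraction of the released insects that are male.
    """
    wild = (1.0 - x) / 2.0                        # wild insects of each sex
    rel_m, rel_f = x * r, x * (1.0 - r)
    males = {"tt": wild / (wild + rel_m), "Tt": 0.0,
             "TT": rel_m / (wild + rel_m)}
    females = {"tt": wild / (wild + rel_f), "Tt": 0.0,
               "TT": rel_f / (wild + rel_f)}
    return females, males


def iterate(x, generations, **params):
    """Release TT insects at frequency x, then iterate the recursion."""
    females, males = release(x)
    for _ in range(generations):
        females, males = next_generation(females, males, **params)
    return females, males
```

Calling `iterate(0.6, 200)` and `iterate(0.2, 200)` and comparing the `"tt"` entries is one way to see the threshold behavior: under these assumptions a large release pushes wild types toward zero, while a small one returns the population to wild type.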

All of the models in this article are based on this simple one-locus model. Using this model, we calculate the optimal gender ratio for a release, fitness cost effects, and the degree of toxin inefficiency that can be tolerated for the system to still drive. We develop a stochastic version of the model to estimate loss probabilities and expected times to loss or fixation in a finite population. Additionally, we adapt the model to account for a continuous release. This variant of the model has implications for spread of the allele into secondary populations. Two variants of the Semele system are also considered. First, we consider the case of a dominant toxin and a recessive antidote. In this case, two copies of the antidote gene are required to neutralize the toxin in a transgenic female. Second, we consider the case of a Semele element located on the X chromosome rather than on an autosome.

Finally, we explore some of the barriers that may prevent the Semele system from working. First, we develop a two-locus model for the prior existence of toxin resistance in the population. We consider a natural toxin-resistance allele, denoted by “R”, which is unlinked to the Semele allele and has a prior equilibrium frequency in the population. Second, we develop a two-locus model to account for assortative mating and its implications for gene drive. In this model, an unlinked allele causes laboratory-reared mosquitoes to be less appealing to wild mosquitoes. Such an allele, which we denote as “A”, could be considered the product of laboratory inbreeding.

All of these models apply to the gene drive application of the Semele system. We focused on gene drive because the genetic version of the sterile insect technique has already been modeled (Phuc et al. 2007) and the dynamics of a male-only release of Semele are expected to be very similar. That said, we use the continuous release model to investigate the effect a few stray transgenic females would have on an all-male transgenic release intended for population suppression.

RESULTS

No fitness costs:

We begin by considering an autosomal Semele element with a dominant toxin produced by males and a dominant antidote produced by mature females. First, we consider the case where there is no fitness cost associated with the Semele allele. All mating pairs produce equal numbers of male and female offspring and so, even if the gender ratio is initially unequal, the genotype distribution among males and females will be identical from the second generation on. The proportions of the kth generation that are individuals of genotypes tt, Tt, and TT may then be denoted by , , and . The following simplified form of Equations 1–10 then applies,(14)(15)(16)where is the normalizing term defined in Equation 10. This system has three biologically feasible equilibrium points:(17)The first of these points represents allele fixation, the second represents absence or loss of the Semele allele, and the third represents coexistence of wild, heterozygous, and homozygous individuals in the population. We calculate the stabilities of these points in supporting information, File S1. Our analysis shows that both fixation and loss of the element are represented by stable equilibrium points and the intermediate point is unstable.

The location and stability of the three equilibrium points suggest the existence of a threshold, above which the element becomes fixed and below which the element is lost. Mapping genotype trajectories onto a De Finetti diagram, we see that there is a family of points that act as a threshold between loss and fixation (Figure 2A). This family of points includes the third equilibrium point in Equation 17 and is referred to as a separatrix. The fact that an unstable equilibrium point exists for all values of suggests that, in the absence of fitness costs, even a minimally efficient toxin can facilitate gene drive (Figure 2B). The case where corresponds to Hardy–Weinberg equilibrium.

Figure 2. Population dynamics of a Semele element with no fitness cost. (A) De Finetti diagram for the case of 100% toxin efficiency. A family of threshold points (separatrix) exists, above which the construct is fixed and below which the construct is lost. (B) De Finetti diagram showing separatrices for a variety of toxin efficiencies. (C) Time-series dynamics for the element in B incorporating additive and nonadditive toxin efficiencies. Reducing toxicity leads to much slower rates of allele spread. (D) Time-series dynamics for the element in B incorporating additive and nonadditive antidote efficiencies. For the nonadditive case, reducing antidote efficiency increases the release threshold and leads to slower rates of allele spread.

Figure 2C depicts time-series dynamics for Semele alleles having no fitness costs and a variety of toxin efficiencies. Transgenic males that express a less-efficient toxin have more offspring and consequently require a smaller release frequency to spread. However, elements with reduced toxicity cause allele frequencies to change less quickly, leading to much weaker drive. For example, a release of TT males and females at a population frequency of 40% is expected to result in fixation of the T allele regardless of toxin efficiency. For a 100%-efficient toxin, wild-type individuals are predicted to fall below a population frequency of 1% within 22 generations; however, the same reduction is expected to take 230 generations if the toxin is 10% efficient.
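The trade-off between toxin efficiency and speed of spread can be explored with a short script. This is a sketch built on a recursion reconstructed from the verbal model (no fitness costs, nonadditive toxin efficiency); the function names and the convention for counting generations are assumptions, so the counts it prints need not match the figures quoted above exactly.

```python
def step(u, v, w, e):
    """One generation without fitness costs; (u, v, w) = (tt, Tt, TT)."""
    tt = u * u + 0.5 * (1 - e) * u * v + 0.5 * u * v + 0.25 * v * v
    Tt = (0.5 * (1 - e) * u * v + (1 - e) * u * w
          + 0.5 * u * v + 0.5 * v * v + v * w + u * w)
    TT = 0.25 * v * v + v * w + w * w
    norm = 1.0 - e * u * (v + w)
    return tt / norm, Tt / norm, TT / norm


def generations_until_wild_type_rare(release, e, cutoff=0.01,
                                     max_generations=10_000):
    """Generations until tt individuals drop below `cutoff`.

    Starts from a release of TT males and females at frequency `release`.
    Returns None if the cutoff is not reached within `max_generations`.
    """
    u, v, w = 1.0 - release, 0.0, release
    for generation in range(1, max_generations + 1):
        u, v, w = step(u, v, w, e)
        if u < cutoff:
            return generation
    return None


if __name__ == "__main__":
    for efficiency in (1.0, 0.5, 0.1):
        gens = generations_until_wild_type_rare(0.4, efficiency)
        print(f"toxin efficiency {efficiency:.0%}: {gens} generations")
```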

In the above analysis, we assumed that toxin efficiency is the same in heterozygous and homozygous transgenic males; however, this assumption can be relaxed by replacing Equation 15 above with the following modified equation:(18)Here, represents toxin efficiency in heterozygous males; while toxin efficiency in homozygous transgenic males is either 100% or , whichever is smaller. Figure 2C also depicts the time-series dynamics for Semele alleles having additive toxin efficiencies. The dynamics are very similar to those for nonadditive toxin efficiencies; however, additive toxin efficiencies lead to slightly faster gene drive. For a 40% release and 10% toxin efficiency, wild-type individuals are predicted to fall below a population frequency of 1% within 160 generations, as opposed to 230 generations for the nonadditive case.

Figure 2D depicts time-series dynamics for Semele alleles having a variety of antidote efficiencies. For Semele alleles having additive antidote efficiencies, the following modified form of Equations 14–16 applies:(19)(20)(21)Here, represents antidote efficiency in heterozygous females, which is equal to the probability that, when a transgenic male mates with a heterozygous female, the female will survive and retain fertility. Antidote efficiency in homozygous transgenic females is either 100% or , whichever is smaller. For the case of nonadditive antidote efficiencies, this latter quantity is simply equal to . For the additive case, changing antidote efficiency between 50 and 100% has very little effect on the spread of the Semele allele—in all cases, nearly all individuals are transgenic within ∼15 generations. For the nonadditive case, decreasing antidote efficiency leads to higher release thresholds and slower gene drive. For 75% antidote efficiency, wild-type individuals fall below 1% within 50 generations of a 50% release, as opposed to 20 generations for the case of 100% antidote efficiency. For 50% antidote efficiency, a 50% release is insufficient to achieve gene drive.

Equal fitness costs:

Next, we consider the case where there is an equal fitness cost associated with the element in both females and males (). As in the previous case, all mating pairs produce equal numbers of male and female offspring and so the genotype distribution is identical among males and females from the second generation on. The following simplified form of Equations 1–10 applies:(22)(23)(24)Here, s and hs represent the fitness costs for individuals that are homozygous and heterozygous for the Semele allele, respectively. For nonzero fitness costs, this system has four biologically feasible equilibrium points. Two of these correspond to fixation and loss of the Semele allele. The other two represent the coexistence of wild, heterozygous, and homozygous individuals and have expressions too complex to be useful, even if simplifications are made such as 100% toxin efficiency (). We calculate the stabilities of fixation and loss of the Semele allele in File S1. Our analysis shows that fixation is unstable for a fitness cost that is recessive or shows any degree of heterozygosity, but is stable for a completely dominant fitness cost. Loss is stable under all scenarios.

Observation of time-series data together with the location and stability of the four equilibrium points suggests the existence of a separatrix—or family of threshold points—above which the element reaches a nontrivial equilibrium frequency in the population and below which the element is lost (Figure 3, A and B). Fixation is not a realistic equilibrium since any perturbation will bring the Semele allele back to the nontrivial equilibrium frequency and any real population is subject to perturbations.

Figure 3. Population dynamics of a Semele element with equal fitness costs in males and females. (A) De Finetti diagram for the case of a 10% fitness cost. Above the release threshold, the allele approaches a stable equilibrium consisting mostly of heterozygotes and homozygotes. (B) De Finetti diagram showing separatrices and stable equilibria for a variety of fitness costs. (C) Region of drive as a function of fitness cost. The region of drive is the set of population frequencies above the release threshold and below the stable transgenic equilibrium between which the Semele element will increase in frequency up to the stable equilibrium. (D) Region of drive as a function of toxin efficiency for a variety of fitness costs.

As the fitness cost on the element increases, the threshold required for spread increases and the nontrivial equilibrium reached decreases, reducing the impact of drive from both sides (Figure 3C). To illustrate this, for an element having a 10% fitness cost in homozygotes, TT males and females must be released at a frequency >41.7% for transgenic individuals to spread to a population frequency of 98.9%. For a 20% fitness cost, the release threshold increases to 48.8%, above which transgenic individuals spread to a population frequency of 94.4%. The maximum tolerable fitness cost is 27%, at which point the release threshold and the nontrivial equilibrium are identical and drive does not occur. For fitness costs >27%, the Semele allele is lost for all cases other than initial fixation.
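Release thresholds of this kind can be located by bisection on the release frequency. The sketch below does so for a recursion reconstructed from the verbal model; it assumes heterozygotes pay half the homozygous cost (`h = 0.5`), which the text above does not state, so the thresholds it returns are illustrative and may differ from the values quoted. All names are hypothetical.

```python
def step(u, v, w, e=1.0, s=0.0, h=0.5):
    """One generation with equal fitness costs in both sexes.

    (u, v, w) = (tt, Tt, TT); s: homozygous cost; h * s: heterozygous cost.
    """
    tt = u * u + 0.5 * (1 - e) * u * v + 0.5 * u * v + 0.25 * v * v
    Tt = (0.5 * (1 - e) * u * v + (1 - e) * u * w
          + 0.5 * u * v + 0.5 * v * v + v * w + u * w) * (1 - h * s)
    TT = (0.25 * v * v + v * w + w * w) * (1 - s)
    total = tt + Tt + TT
    return tt / total, Tt / total, TT / total


def spreads(x, generations=3000, **params):
    """True if a release of TT insects at frequency x is not lost."""
    u, v, w = 1.0 - x, 0.0, x
    for _ in range(generations):
        u, v, w = step(u, v, w, **params)
    return (v + w) > 1e-3


def release_threshold(tol=1e-4, **params):
    """Smallest release frequency that persists, found by bisection.

    Returns None when even a 99.9% release is eventually lost.
    """
    lo, hi = 0.0, 0.999
    if not spreads(hi, **params):
        return None
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if spreads(mid, **params):
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    for cost in (0.0, 0.1, 0.2):
        print(f"fitness cost {cost:.0%}: threshold {release_threshold(s=cost)}")
```

Bisection is used here because persistence appears to switch from loss to spread once as the release frequency grows; that monotonicity is an assumption of the sketch rather than a proven property.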

In the presence of a fitness cost, decreasing toxin efficiency also has the effect of reducing the impact of drive (Figure 3D). The major effect is that a less efficient toxin causes the allele to reach a lower equilibrium frequency in the population because the driving force is weakened relative to the fitness cost. The same effect causes the release threshold to increase as the toxin becomes less efficient; however, at the same time, more transgenic offspring survive at lower toxin efficiencies, causing the release threshold to decline. The result is a release threshold that declines as the toxin begins to become less efficient and then rises. The minimum toxin efficiency required for drive to occur depends on the fitness cost: for a 10% fitness cost, toxin efficiency must exceed 36.9%; and for a 20% fitness cost, toxin efficiency must exceed 74%. Below these efficiencies, the Semele allele is lost for all cases other than initial fixation. As before, reducing toxin efficiency greatly reduces the speed of spread.

Putting these results into perspective requires some idea of the real-world fitness costs that we could expect; however, since the Semele system has not yet been engineered, we are limited to estimates of fitness costs due to refractoriness and hypothetical considerations of the cost of a male-expressed toxin and a female-expressed antidote. Mounting an immune response is generally thought to be associated with an evolutionary cost in insects (Kraaijeveld and Godfray 1997; Schmid-Hempel 2005). For example, Ahmed et al. (2002) measured egg production to be reduced by 18.6% in Anopheles gambiae mosquitoes whose immune system was artificially stimulated with lipopolysaccharides. However, transgenic mosquitoes have also been engineered that have no noticeable fitness cost when fed on Plasmodium-free blood (Moreira et al. 2004) and have a 35–50% fitness advantage when fed on Plasmodium-infected blood (Marelli et al. 2007). This fitness advantage would be much smaller in a real population in which only a fraction of mosquitoes are infected with malaria parasites. These observations give some idea of the large range of fitness costs that must be explored. As discussed earlier, a toxin expressed in the male accessory glands is likely to be much more costly for males than an antidote expressed specifically in females. For this reason, we explore the scenario of male-specific fitness costs in the following section.

Male-specific fitness costs:

To model male-specific fitness costs, we use the basic model in Equations 1–13, with the one simplification that the Semele allele confers no fitness cost on females (). Symbolic analysis of these equations is too complex to be useful; however, the system can be easily numerically iterated (Figure 4).

Figure 4. Population dynamics of a Semele element with male-specific fitness costs. (A) De Finetti diagram for the case of a 20% male-specific fitness cost. (B) Region of drive as a function of male-specific fitness cost. (C) Time-series dynamics for constructs with 10% and 20% sex-independent and male-specific fitness costs. A construct with a 20% male-specific fitness cost behaves similarly to one with a 10% fitness cost expressed in males and females. (D) Release threshold as a function of release gender ratio. The threshold is minimized for a female-biased release; however, a release consisting of equal numbers of males and females is acceptable.

Observation of time-series data reveals dynamics very similar to the case of equal fitness costs for males and females: a family of threshold points above which the allele reaches a nontrivial equilibrium frequency in the population and below which the allele is lost (Figure 4A). As the fitness cost on the element increases, the threshold required for spread increases and the nontrivial equilibrium reached decreases. The maximum tolerable fitness cost is 41%, at which point the release threshold and nontrivial equilibrium are identical (Figure 4B). An element with a 20% fitness cost on males spreads to a transgenic equilibrium frequency of 98.3% within 40 generations, which is very similar to an element conferring a 10% fitness cost on males and females (Figure 4C). An element with a 10% fitness cost on males spreads to a transgenic equilibrium frequency of 99.6% within 30 generations.

Interestingly, male-specific fitness costs have little effect on the optimal gender ratio of a release. Gender ratio is clearly relevant since an all-male release cannot result in spread while, in the absence of fitness costs, a release of equal numbers of males and females has a threshold of 36.4%. This threshold is minimized when the release is 33% male. For a 10% fitness cost on males and females, the threshold is minimized for a 38% male release: the same as the optimal gender ratio for a 20% fitness cost on males only (Figure 4D). We generally consider a release of equal numbers of males and females since this is biologically convenient and is close to the optimal gender ratio in terms of release threshold.
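The dependence of the release threshold on the gender ratio can be explored with a two-sex version of the recursion. The sketch below is reconstructed from the verbal model, tracks within-sex genotype frequencies as (tt, Tt, TT) tuples, and assumes heterozygotes pay half the homozygous cost; the names `release`, `spreads`, and `release_threshold` are hypothetical.

```python
def step(f, m, e=1.0, s_f=0.0, s_m=0.0, h=0.5):
    """One generation; f and m are (tt, Tt, TT) frequencies within each sex."""
    fu, fv, fw = f
    mu, mv, mw = m
    # tt females mated to Tt or TT males reproduce with probability 1 - e.
    tt = fu * mu + 0.5 * (1 - e) * fu * mv + 0.5 * fv * mu + 0.25 * fv * mv
    Tt = (0.5 * (1 - e) * fu * mv + (1 - e) * fu * mw + 0.5 * fv * mu
          + 0.5 * fv * mv + 0.5 * fv * mw + fw * mu + 0.5 * fw * mv)
    TT = 0.25 * fv * mv + 0.5 * fv * mw + 0.5 * fw * mv + fw * mw

    def adults(s):
        raw = (tt, Tt * (1 - h * s), TT * (1 - s))
        total = sum(raw)
        return tuple(x / total for x in raw)

    return adults(s_f), adults(s_m)


def release(x, r):
    """Within-sex frequencies after releasing TT insects (0 <= x < 1).

    x: released fraction of the total population; r: male fraction released.
    """
    wild = (1.0 - x) / 2.0
    rel_m, rel_f = x * r, x * (1.0 - r)
    females = (wild / (wild + rel_f), 0.0, rel_f / (wild + rel_f))
    males = (wild / (wild + rel_m), 0.0, rel_m / (wild + rel_m))
    return females, males


def spreads(x, r, generations=3000, **params):
    """True if the element is still present among females at the end."""
    f, m = release(x, r)
    for _ in range(generations):
        f, m = step(f, m, **params)
    return (1.0 - f[0]) > 1e-3


def release_threshold(r, tol=1e-4, **params):
    """Bisection for the smallest persisting release; None if none exists."""
    lo, hi = 0.0, 0.999
    if not spreads(hi, r, **params):
        return None
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if spreads(mid, r, **params):
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    for male_fraction in (0.2, 0.33, 0.5, 0.7, 0.9, 1.0):
        print(male_fraction, release_threshold(male_fraction))
```

Passing `s_m=0.2` to `release_threshold` gives the male-specific cost scenario; an all-male release (`r = 1.0`) returns `None` in this sketch because no transgenic females ever enter the population.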

Stochastic formulation:

Real populations have a finite number of individuals and are subject to a multitude of chance events. For this reason, we consider a stochastic version of Semele dynamics. For simplicity, we consider the equal fitness cost model with 100% toxin efficiency. At generation k, the number of individuals with genotypes tt and TT is denoted by and , respectively, half being male and half being female. The total population size is denoted by N. Following from Equations 22–24, the genotypes of individuals in the next generation are described by the expected proportions(25)(26)The normalizing term, , is analogous to that in Equation 10 and is given by(27)At each generation, these expected proportions are reduced to a population of N adults consisting of wild types and homozygotes by sampling from the multinomial distribution,(28)We consider a release of homozygotes in a population of wild types at generation 0. We calculate the loss probability by iterating until the Semele allele is either fixed or lost and calculating the proportion of trials that reach the state . The distribution of extinction times can be calculated by recording the generation that this state is reached. The distribution of fixation times can be calculated by recording the generation in which the state is reached.
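A minimal sketch of such a simulation is given below. It is a reconstruction rather than the authors' code: it tracks all three genotypes (an assumption), takes the expected proportions from a recursion rebuilt from the verbal model with 100% toxin efficiency, assumes heterozygotes pay half the homozygous cost, and draws each generation with Python's `random.choices` as a simple multinomial sampler. Function names are hypothetical.

```python
import random


def expected_proportions(u, v, w, s=0.0, h=0.5):
    """Expected (tt, Tt, TT) proportions next generation, toxin 100% efficient."""
    tt = u * u + 0.5 * u * v + 0.25 * v * v
    Tt = (0.5 * u * v + 0.5 * v * v + v * w + u * w) * (1 - h * s)
    TT = (0.25 * v * v + v * w + w * w) * (1 - s)
    total = tt + Tt + TT
    return tt / total, Tt / total, TT / total


def simulate(n, release, rng, s=0.0, h=0.5, max_generations=100_000):
    """Run one replicate until the element is lost or fixed.

    n: population size; release: fraction of the n adults that are TT
    at generation 0. Returns (outcome, generation).
    """
    n_released = round(release * n)
    counts = [n - n_released, 0, n_released]          # tt, Tt, TT
    for generation in range(1, max_generations + 1):
        probs = expected_proportions(*(c / n for c in counts), s=s, h=h)
        # Multinomial sample of n adults from the expected proportions.
        draws = rng.choices((0, 1, 2), weights=probs, k=n)
        counts = [draws.count(g) for g in (0, 1, 2)]
        if counts[0] == n:
            return "lost", generation
        if counts[2] == n:
            return "fixed", generation
    return "unresolved", max_generations


def loss_probability(n, release, trials=100, seed=0, **params):
    """Fraction of replicates in which the element is lost."""
    rng = random.Random(seed)
    outcomes = [simulate(n, release, rng, **params)[0] for _ in range(trials)]
    return outcomes.count("lost") / trials


if __name__ == "__main__":
    for x in (0.30, 0.35, 0.40):
        print(x, loss_probability(1000, x, trials=50))
```

Recording the second element returned by `simulate` for each replicate gives the distributions of extinction and fixation times.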

Results are shown in Figure 5 for population sizes of 100, 1000, and 10,000. The mosquito populations of interest for disease control are on the order of 1000–10,000 individuals per village. Malaria-transmitting A. gambiae populations show large seasonal variation in size, but have been estimated to behave like randomly mating populations of several thousand individuals (Taylor and Manoukis 2003). Dengue-transmitting A. aegypti mosquitoes are similarly numerous in villages, but tend to have more structured populations, suggesting smaller effective population sizes (Reiter et al. 1995; Trips et al. 1995; Harrington et al. 2005; Jeffery et al. 2009).

Figure 5. Stochastic spread of Semele for population sizes of 100, 1000, and 10,000. (A) Loss probability as a function of release proportion for an element with no fitness cost. Release thresholds are not strict cutoff frequencies as suggested by the deterministic analysis. (B) Loss probability as a function of release proportion for an element with a 10% fitness cost.

The implication of these results is that the release thresholds mentioned earlier are not strict cutoff frequencies, above which the allele spreads and below which it does not. The deterministic thresholds are in fact the frequencies at which the allele is equally likely to spread or to go extinct. For a Semele allele with no fitness cost, the deterministic threshold is 36.4%; however, in a population of 1000, the allele has a 10% chance of spreading for a 34.6% release and a 1% chance of spreading for a 33.1% release (Figure 5A). A similar pattern is seen in the presence of a fitness cost (Figure 5B).

The results of the stochastic simulations also suggest that, in the event of an accidental release, a Semele allele is very unlikely to persist in the wild. The allele will almost certainly become extinct for a release frequency <30% in a population of ≥1000 (Figure 5). Accidental releases are expected to be much smaller than this and hence even more likely to be self-limiting.

Continuous release:

For an intentional release, population replacement is most easily achieved by a sustained release of transgenic individuals. If the transgenic individuals are released at a high enough rate, they will accumulate in the population and eventually exceed the release threshold, at which point they are capable of spreading on their own. To model a sustained release, we add a proportion, μ, of Semele homozygotes to the mating pool at each generation. This proportion is measured relative to the total population prior to introduction. A fraction, r, of the released individuals are male. The relative proportions of homozygous males and homozygous females in the mating pool are increased accordingly. Modifying Equations 1–10, only Equations 2 and 3 are affected, giving Equations 29 and 30. Since the release is incorporated into the difference equations, the population is entirely wild type at generation 0. Using this initial condition and the difference equations described above, the continuous release thresholds, equilibria, and time-series dynamics can be calculated.
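A sustained release of this kind can be sketched as a deterministic iteration. The following Python is illustrative only and rests on assumptions not stated in the text: each sex makes up half of the resident population, each female mates with a male drawn at random from the male mating pool, and heterozygotes and homozygotes have fitness 1 − hs and 1 − s, respectively. The function names are hypothetical.

```python
# Offspring genotype fractions (tt, Tt, TT), indexed MENDEL[male][female],
# with genotypes coded 0 = tt, 1 = Tt, 2 = TT.
MENDEL = [
    [(1.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.0, 1.0, 0.0)],
    [(0.5, 0.5, 0.0), (0.25, 0.5, 0.25), (0.0, 0.5, 0.5)],
    [(0.0, 1.0, 0.0), (0.0, 0.5, 0.5), (0.0, 0.0, 1.0)],
]


def add_release(freqs, amount):
    """Add `amount` TT individuals (relative to one sex) and renormalize."""
    total = 1.0 + amount
    return [freqs[0] / total, freqs[1] / total, (freqs[2] + amount) / total]


def next_generation(males, females, s=0.0, h=0.5):
    """Offspring (tt, Tt, TT) frequencies with 100% toxin efficiency."""
    fitness = (1.0, 1.0 - h * s, 1.0 - s)  # assumed form of the fitness cost
    nxt = [0.0, 0.0, 0.0]
    for m, pm in enumerate(males):
        for f, pf in enumerate(females):
            if m > 0 and f == 0:
                continue  # T-bearing male x tt female produces no offspring
            for g in range(3):
                nxt[g] += pm * pf * MENDEL[m][f][g] * fitness[g]
    total = sum(nxt)
    return [x / total for x in nxt]


def continuous_release(mu, r=0.5, s=0.0, generations=200):
    """Genotype frequencies after releasing mu homozygotes per generation.

    mu is measured relative to the total population before each release and
    r is the fraction of released individuals that are male.
    """
    freqs = [1.0, 0.0, 0.0]  # entirely wild type at generation 0
    for _ in range(generations):
        # Each sex is half the resident population, hence the factor of 2.
        males = add_release(freqs, 2.0 * mu * r)
        females = add_release(freqs, 2.0 * mu * (1.0 - r))
        freqs = next_generation(males, females, s)
    return freqs


if __name__ == "__main__":
    for mu in (0.01, 0.05, 0.10, 0.20):
        tt, _, _ = continuous_release(mu)
        print(f"release rate {mu:.2f}: transgenic frequency {1.0 - tt:.3f}")
```

Scanning `mu` and looking for the jump in transgenic frequency gives a rough estimate of the continuous release threshold under these assumptions. With `r=1.0` the sketch never produces transgenic offspring, which corresponds to the all-male suppression scenario.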

Figure 6A depicts the time-series dynamics for a sustained release of Semele homozygotes with a 20% fitness cost. For a continuous release of 5% homozygotes per generation, the allele persists at a low frequency and transgenic individuals persist at <12% in the population; however, for release rates of ≥10% per generation, the allele either fixes or spreads to a very high frequency. Figure 6B depicts the equilibria reached for a variety of release rates. These confirm the existence of a continuous release threshold that depends on the element fitness cost: in the absence of a fitness cost, the threshold is 4.4% per generation; while for a 20% fitness cost, the threshold is 6.5% per generation. For fitness costs <10%, the allele will fix in the population if this threshold is exceeded (Figure 6C). The population will return to the stable equilibria described earlier if continuous releases are terminated and occasional wild-type individuals enter the population.

Continuous release of a Semele allele. (A) Time-series dynamics for the case of a 20% fitness cost. Above the continuous release threshold, the allele either fixes or reaches a high equilibrium consisting mostly of transgenic individuals. (B) Transgenic equilibrium frequency as a function of continuous release proportion for a variety of fitness costs. This clearly shows the existence of a continuous release threshold. (C) Continuous release threshold as a function of fitness cost. Also shown is the continuous release proportion required for fixation to occur. (D) Continuous release threshold as a function of release gender ratio for release gender ratios between 99 and 100% male. Even a small number of transgenic females in an all-male transgenic release intended for population suppression can lead to gene drive occurring instead.

These results have several implications for population replacement. First, if a transgenic release on the order of 37–50% homozygotes is considered unfeasible, then a sustained release of 5–7% homozygotes per generation would provide an achievable solution. Second, it is very likely that a release of mosquitoes with Semele alleles will be confined to the release population, a very desirable feature in the early stages of testing. If the allele fixes in the release population, this population will act as a source for neighboring populations; however, mosquito migration rates between villages in Africa tend to be <1% per generation (Taylor et al. 2001), suggesting that the allele will persist only at low levels in neighboring populations.

Finally, the results have implications for an all-male release intended for population suppression. If the toxin is 100% efficient, then transgenic males will have no offspring with wild females, leading to suppression; but uncertainty arises when a small number of transgenic females are included in the release. Figure 6D depicts the continuous release thresholds that must be exceeded for population replacement to occur when the transgenic male-to-female ratio is ≥99:1. For a 20% fitness cost, a 99:1 gender ratio will lead to population replacement for release rates >19.5% per generation. For a 999:1 gender ratio, the threshold is 23.5% per generation. These release rates are much lower than those planned for population suppression (Phuc et al. 2007), suggesting that, if sexing is not perfect, a separate element lacking the antidote gene should be used for population suppression.

Recessive antidote:

We consider two simple variants of the Semele system. First, we consider the case in which two copies of the antidote gene are required to neutralize the toxin instead of one. The dynamics of this system are the same as those for a dominant antidote with the exception that crosses between transgenic males and heterozygous females are also unviable. For the case of equal fitness costs in males and females, a modified form of Equations 1–10 applies (Equations 31–33), with the normalizing term defined as in Equation 10. Assuming 100% toxin efficiency, this system has three biologically feasible equilibrium points (Equation 34). The first of these points represents fixation, the second represents loss, and the third represents a completely heterozygous population. We calculate the stabilities of these points in File S1. In conjunction with time-series data (Figure 7, A and B), we see that fixation and loss are stable equilibria, while the case of an all-heterozygote population lies on a separatrix above which the allele is fixed and below which the allele is lost. These results are interesting because they imply that, above the release threshold, a Semele allele with a recessive antidote will be driven to fixation in a population regardless of its fitness cost. A fitness cost merely has the effect of increasing the release threshold.
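The recessive-antidote rule changes only which crosses are viable, which a short sketch makes concrete. The Python below is a hedged illustration rather than the authors' model: it assumes equal genotype frequencies in both sexes, 100% toxin efficiency, and a fitness of 1 − hs for heterozygotes and 1 − s for homozygotes. The function names are hypothetical.

```python
# Offspring genotype fractions (tt, Tt, TT), indexed MENDEL[male][female],
# with genotypes coded 0 = tt, 1 = Tt, 2 = TT.
MENDEL = [
    [(1.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.0, 1.0, 0.0)],
    [(0.5, 0.5, 0.0), (0.25, 0.5, 0.25), (0.0, 0.5, 0.5)],
    [(0.0, 1.0, 0.0), (0.0, 0.5, 0.5), (0.0, 0.0, 1.0)],
]


def step_recessive_antidote(freqs, s=0.0, h=0.5):
    """One generation of the recessive-antidote variant."""
    fitness = (1.0, 1.0 - h * s, 1.0 - s)  # assumed form of the fitness cost
    nxt = [0.0, 0.0, 0.0]
    for m, pm in enumerate(freqs):
        for f, pf in enumerate(freqs):
            # Only TT females carry two antidote copies, so a T-bearing male
            # has no offspring with tt or Tt females.
            if m > 0 and f < 2:
                continue
            for g in range(3):
                nxt[g] += pm * pf * MENDEL[m][f][g] * fitness[g]
    total = sum(nxt)
    return [x / total for x in nxt]


def allele_frequency(freqs):
    """Frequency of the Semele allele T given (tt, Tt, TT) frequencies."""
    return freqs[2] + 0.5 * freqs[1]


def release_recessive_antidote(release, s=0.0, generations=100):
    """Iterate from a one-off release of TT males and females."""
    freqs = [1.0 - release, 0.0, release]
    for _ in range(generations):
        freqs = step_recessive_antidote(freqs, s)
    return freqs


if __name__ == "__main__":
    for release in (0.3, 0.45, 0.55, 0.7):
        q = allele_frequency(release_recessive_antidote(release))
        print(f"release {release:.2f}: final allele frequency {q:.4f}")
```

Under these assumptions, releases well above one-half move toward fixation and releases well below it move toward loss, in line with the separatrix behavior described in the text.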

Population dynamics of a Semele element with a recessive antidote. (A) De Finetti diagram for the case of a 10% fitness cost. Above the release threshold, the allele is completely fixed. (B) De Finetti diagram showing separatrices for a variety of fitness costs and toxin efficiencies. (C) Release threshold as a function of toxin efficiency for a variety of fitness costs. Above these thresholds, the element is always fixed. (D) Continuous release threshold as a function of toxin efficiency for a variety of fitness costs.

The case of imperfect toxin efficiency in conjunction with a recessive antidote does not lend itself to analytic treatment; however, observation of time-series data suggests that the release threshold increases as the toxin becomes less efficient (Figure 7B). The minimum toxin efficiency required for drive to occur depends on the fitness cost: for a 10% fitness cost, toxin efficiency must exceed 9.6%; and for a 20% fitness cost, toxin efficiency must exceed 21.6% (Figure 7C). Above these efficiencies, a super-threshold release will result in element fixation regardless of its inefficiency and fitness cost. Below these efficiencies, the element will be lost for all cases other than initial fixation.

The release thresholds for Semele with a recessive antidote are relatively high: in the absence of fitness costs, TT males and females must be released at a frequency >50% to spread; and for a 20% fitness cost, the threshold increases to 57.3% (Figure 7C). The same result can be achieved by a sustained release: in the absence of fitness costs, a continuous release rate >10.3% homozygotes per generation will lead to fixation; and for a 20% fitness cost, the threshold is 13.4% per generation (Figure 7D). These results are encouraging because they suggest that a Semele allele with a recessive antidote is even more confinable to a single population than an element with a dominant antidote.

X-linked Semele:

Second, we consider the case in which the Semele allele is inserted at a location on the X chromosome. The dynamics of this system differ due to the fact that females carry two copies of the X chromosome, while males carry only one. The only unviable cross is between XTY males and XtXt females. The proportions of the kth generation that are males of genotypes XtY and XTY are tracked as separate variables. The corresponding proportions for females are unchanged. For the case of equal fitness costs in males and females, a modified form of Equations 1–10 applies (Equations 35–39), with the normalizing term defined as in Equation 10. We consider a release of XTY males and XTXT females at generation 0 (Equations 40–42), specified by the release proportion and by r, the fraction of released individuals that are male. Using this initial condition and the difference equations described above, the dynamics of the system can be calculated.
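The X-linked inheritance pattern can be illustrated with a small Python sketch. It is not a reconstruction of Equations 35–42: it assumes 100% toxin efficiency, a 1:1 adult sex ratio in every generation, and fitness values of 1 − hs for XTY males and XTXt females and 1 − s for XTXT females, all of which are assumptions of this example. The names `release_state`, `x_linked_step`, and `x_allele_frequency` are hypothetical.

```python
def release_state(rho, r=0.5):
    """Mating pool after releasing XTY males and XTXT females.

    rho is the release proportion of the total population (0 <= rho < 1)
    and r is the fraction of released individuals that are male.
    """
    resident = 0.5 * (1.0 - rho)  # residents of each sex, all wild type
    rel_m, rel_f = rho * r, rho * (1.0 - r)
    males = [resident / (resident + rel_m), rel_m / (resident + rel_m)]
    females = [resident / (resident + rel_f), 0.0, rel_f / (resident + rel_f)]
    return males, females


def x_linked_step(males, females, s=0.0, h=0.5):
    """One generation; males = (XtY, XTY), females = (XtXt, XTXt, XTXT)."""
    p_maternal_T = (0.0, 0.5, 1.0)  # chance each female genotype passes on XT
    sons = [0.0, 0.0]
    daughters = [0.0, 0.0, 0.0]
    for m, pm in enumerate(males):
        for f, pf in enumerate(females):
            if m == 1 and f == 0:
                continue  # XTY male x XtXt female: no offspring
            w, p_t = pm * pf, p_maternal_T[f]
            # Sons inherit their X chromosome from the mother only.
            sons[0] += w * (1.0 - p_t)
            sons[1] += w * p_t
            # Daughters inherit the paternal X plus one maternal X.
            daughters[m] += w * (1.0 - p_t)
            daughters[m + 1] += w * p_t
    sons[1] *= 1.0 - h * s       # assumed cost for one copy
    daughters[1] *= 1.0 - h * s  # assumed cost for one copy
    daughters[2] *= 1.0 - s      # assumed cost for two copies
    total_s, total_d = sum(sons), sum(daughters)
    return [x / total_s for x in sons], [x / total_d for x in daughters]


def x_allele_frequency(males, females):
    """Frequency of XT among all X chromosomes (one per male, two per female)."""
    return (males[1] + females[1] + 2.0 * females[2]) / 3.0


def x_linked_release(rho, r=0.5, s=0.0, generations=100):
    males, females = release_state(rho, r)
    for _ in range(generations):
        males, females = x_linked_step(males, females, s)
    return males, females
```

Comparing `x_allele_frequency(*x_linked_release(rho, r))` across values of `r` is one way to explore how the release gender ratio affects spread in this sketch.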

Figure 8A depicts the time-series dynamics of an X-linked Semele allele with a 10% fitness cost. The dynamics are very similar to those of an autosomal allele: there is a release threshold of 38.5%, above which the allele spreads to an equilibrium frequency of 92%, and below which the allele is lost. Although the Semele allele reaches only a frequency of 92%, >96% of the population is transgenic at this allele frequency.

Population dynamics of a Semele element located on the X chromosome. (A) Time-series dynamics for an element with a 10% fitness cost. Above the release threshold, the allele approaches a stable equilibrium consisting mostly of transgenic individuals. (B) Region of drive as a function of toxin efficiency for a variety of fitness costs. (C) Continuous release threshold as a function of toxin efficiency for a variety of fitness costs. (D) Release threshold as a function of release gender ratio. The threshold is minimized for a female-biased release.

Figure 8B depicts the release thresholds and equilibria reached for a variety of fitness costs and toxin efficiencies. For 100% toxin efficiency, an X-linked allele with no fitness cost will fix in the population for releases >33.3%; an allele with a 5% fitness cost will spread to a transgenic frequency of 98% for a release >35.7%; and an allele with a 20% fitness cost will spread to a transgenic frequency of 89.2% for a release >45%. As for an autosomal allele (Figure 8D), the region of drive decreases from both sides as the fitness cost increases and the toxin becomes less efficient. X-linked alleles have a larger region of drive for high toxin efficiencies; but allele location makes less difference at low efficiencies.

Similarly, for the case of 100% toxin efficiency, an X-linked allele has a slightly lower continuous release threshold than an autosomal allele (Figure 8C). In the absence of fitness costs, the threshold is 3.7% per generation (compared to 4.4% per generation for an autosomal allele), and for a 20% fitness cost, the threshold is 5.7% per generation (compared to 6.5% per generation for an autosomal allele). Although X-linked alleles are less confinable than autosomal alleles at these toxin efficiencies, the amount of migration required to colonize a secondary population is still greater than that observed between typical African villages (Taylor et al. 2001).

The gender ratio of a release is particularly relevant for X-linked alleles because released females have two copies of the allele while males have only one. In the absence of fitness costs, majority-female releases are favored for this reason; however, more equal ratios are favored in the presence of high fitness costs because released females have more copies of the allele, making them more vulnerable to fitness costs (Figure 8D).

Barriers to spread:

Simple models can predict gene drive systems to spread in abstract populations; however, real populations are far more complex and may have complicating features that prevent spread. In this section, we explore two potential complicating features of real populations: first, the prior existence of toxin resistance in nature; and second, the tendency for wild females to be less attracted to transgenic males.

Prior toxin resistance:

In both cases, we use a two-locus model to study the discrete generation dynamics. For prior toxin resistance in nature, we consider a toxin-resistance allele, denoted by R, which is unlinked to the Semele allele, T. We then use a series of 81 dihybrid crosses to keep track of the proportions of each generation that are males and females of genotypes TTRR, TTRr, TTrr, TtRR, TtRr, Ttrr, ttRR, ttRr, and ttrr. For simplicity, we assume equal fitness costs in both sexes due to the Semele allele and 100% toxin and antidote efficiency. Females that have either the Semele allele or the natural toxin-resistance allele are protected against the Semele toxin. The only unviable crosses are therefore between males having the Semele allele and ttrr females. The Matlab code for this model is available from the authors upon request.

We assume that none of the released mosquitoes have the natural toxin-resistance allele and that, prior to a release, this allele exists in the population at Hardy–Weinberg equilibrium at a given frequency. Considering a release of TTrr males and females at generation 0, the initial condition for the difference equations is given by Equations 43–46. Here, the released individuals represent a fixed proportion of the total population. Using this initial condition and the model described above, the dynamics of the T and R alleles can be calculated.
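The initial condition described above can be written out explicitly. The sketch below is one reading of that description, assuming the wild population is entirely tt with the R allele at Hardy–Weinberg proportions and that the same genotype frequencies apply to both sexes. The function name and argument names are hypothetical.

```python
def initial_genotypes(release, resistance_freq):
    """Two-locus genotype frequencies at generation 0.

    release         -- proportion of the total population that is released TTrr
    resistance_freq -- prior frequency of the R allele in the wild population
    Genotypes not listed in the result start at frequency zero.
    """
    p = resistance_freq
    wild = 1.0 - release
    return {
        "TTrr": release,
        "ttRR": wild * p * p,                 # Hardy-Weinberg: p^2
        "ttRr": wild * 2.0 * p * (1.0 - p),   # Hardy-Weinberg: 2p(1 - p)
        "ttrr": wild * (1.0 - p) ** 2,        # Hardy-Weinberg: (1 - p)^2
    }


if __name__ == "__main__":
    for genotype, freq in initial_genotypes(0.5, 0.1).items():
        print(f"{genotype}: {freq:.4f}")
```

Only ttrr females are susceptible to the toxin in this model, so the `ttrr` entry indicates how much of the wild population the Semele allele can act on immediately after release.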

A Semele allele without fitness costs is relatively unaffected by the prior existence of toxin resistance in a population. Figure 9A depicts the time-series dynamics of a 50% transgenic release without fitness costs. Here, even if the R allele is initially at 50% in the population, the T allele will spread to fixation within 100 generations. Both the T and the R alleles are selected for following the release; however, the T allele is generally released at such a high frequency that it still reaches fixation. For a 50% transgenic release, it is only when the R allele is initially at a frequency >83% that it prevents the T allele from being fixed. In this case, the R allele is fixed first and the T allele plateaus at an equilibrium frequency >85%, corresponding to a frequency of transgenic individuals >98% (Figure 9B).

Prior existence of a toxin-resistance allele, R, in a natural population. (A) Time-series dynamics for a Semele allele with no fitness cost released at 50% into a population having various prior frequencies of the R allele. (B) Release thresholds and stable equilibria following a 50% release as a function of prior R-allele population frequency. (C) Time-series dynamics for a construct with a 10% fitness cost released at 50% into a population having various prior frequencies of the R allele. (D) Release thresholds and stable equilibria following a 50% release as a function of prior R-allele population frequency.

Prior toxin resistance is potentially debilitating when the Semele allele confers a fitness cost. Figure 9C depicts the time-series dynamics of a 50% transgenic release for an element having a 10% fitness cost. If the R allele has an initial population frequency <10%, the T allele will spread to an equilibrium frequency of ∼90%, corresponding to a frequency of transgenic individuals of ∼98.5%, within 100 generations (Figure 9D). However, if the R allele has an initial frequency >10%, the T allele will be lost from the population. This suggests that, for moderate to high fitness costs, only low prior levels of toxin resistance in the population can be tolerated. Experiments would be advised to test for toxin resistance in the environment and determine its frequency prior to a release.

Assortative mating:

We model the impact of assortative mating on the spread of a Semele allele in File S1. Our model is based on the assumption that there is nothing in the biology of Semele, in and of itself, that would lead to an assortative mating phenotype; however, mosquitoes released with the Semele allele will have been raised in an industrial setting and will differ from wild mosquitoes due to laboratory selection, inbreeding, and strain differences. We model these differences in the form of a single unlinked allele, A, which is responsible for some degree of unattractiveness of transgenic males to females. Our analysis shows that, if only wild females are less attracted to AA males, then assortative mating has very minor effects on the spread of the Semele allele. If both transgenic and wild females are less attracted to AA males, then strong assortative mating tendencies may require increased introduction frequencies; however, provided that these frequencies are exceeded, the Semele allele will spread to the same equilibrium frequency in the population.

DISCUSSION

The failure of existing technology to control mosquito-borne diseases has renewed interest in the development of transgenic mosquitoes as a component of an integrated strategy for controlling insect disease vectors (Braig and Yan 2001; Alphey et al. 2002; Sinkins and Gould 2006; Marshall and Taylor 2009). Here, we have described a genetic system that may be used for both suppression of mosquito population sizes and replacement of mosquito populations with disease-refractory varieties. The system has several features that make it attractive in the early stages of testing and development, when it is essential that the spread of transgenes be limited in space and time.

As a population suppression system, Semele has the potential to control disease locally without persisting over time and spreading from one population to another. As a population replacement system, Semele is highly unlikely to spread following an accidental release and can be reasonably confined to a single population following an intentional release. Semele alleles can also be removed from a population through a combination of mosquito control measures and the introduction of large numbers of wild-type mosquitoes. Finally, as a chromosomally located toxin–antidote system, the original Semele allele can be bumped out of the population in favor of a new allele consisting of the old antidote in combination with a new toxin–antidote pair, provided that the new element is located at the same genomic location (Chen et al. 2007). The Semele system therefore satisfies many of the safety criteria required for release of transgenic mosquitoes.

An all-male release of mosquitoes with Semele results in population suppression because wild females that mate with transgenic males produce no offspring. Our modeling suggests, however, that a separate Semele allele lacking the antidote gene should be used for population suppression because even the tiniest contamination of transgenic females in an all-male release will result in gene drive rather than suppression. It is therefore safer to create a sterile variant than to rely on perfect sexing prior to an all-male release.

For gene drive to occur, the element must include all components of the Semele system: the toxin, the antidote, and two promoters. One promoter must allow expression in the male germ line or accessory glands, and the other must allow expression in the female germ line or somatic tissues exposed to seminal fluid. The toxins used to build such an element are likely to be nearly 100% efficient; however, they may also confer a significant fitness cost on the males expressing them. To be conservative, let us consider a high fitness cost on males, approximated by a 10–20% fitness cost on both males and females. Our modeling then suggests that a 50% release should result in gene drive; however, the release could also be spread out over multiple generations by releasing 7% transgenic mosquitoes each generation. A release of equal numbers of males and females is adequate. We predict that such an element will spread to a transgenic frequency of 95% within 15 generations and to a transgenic frequency of 98% within 30 generations. The Semele system is therefore able to drive into a population quickly and efficiently.

Counterintuitively, a Semele allele with a recessive antidote is expected to drive into a population more quickly and efficiently (while having a higher release threshold) than an element with a dominant antidote. This is because, with a recessive antidote, the Semele allele distorts the offspring ratio even when there are no wild-type individuals in the population. This has the effect of driving the Semele allele to fixation, provided that the initial release exceeds a certain threshold. For comparable fitness costs, modeling suggests that a 55% release should result in gene drive and that the Semele allele will be fixed within 20 generations. Interestingly, the allele is expected to spread to fixation regardless of the fitness cost. The release threshold is slightly higher; but the flipside of this is that the element requires higher migration rates to spread into adjacent populations, suggesting it is more strongly confined to a single population.

These results inspire interest in the ability to engineer traits that are expressed only when an element is present in two copies (in this case at a common position on both homologs), but not one. Such a cellular counting system has not yet been engineered by humans; but it is a task that has been achieved by nature, in the context of sex determination in Drosophila (Sanchez 2008) and during X chromosome inactivation and allelic exclusion in mammals (Keverne 2009; Zakharova et al. 2009). One possible method for achieving this takes advantage of pairing-sensitive silencing, a phenomenon in which the presence of specific sequences near genes located at the same site on homologous chromosomes results in strong silencing of these genes in homozygotes, but much weaker silencing in heterozygotes (Kassis 2002). Perhaps a recessive antidote could be engineered by using pairing-sensitive silencing to repress a repressor of antidote activity in homozygotes, but not in heterozygotes.

The spread of a Semele allele is relatively immune to several potential complicating features of real populations, such as assortative mating and the prior existence of resistance alleles. This is largely due to the fact that the strategy is directed at single populations, involves releases at high proportions, and reaches an equilibrium frequency within a small number of generations. Assortative mating is a small hindrance that may lead to increased release thresholds. Prior toxin resistance in the population is a problem only when the Semele allele confers a moderate to high fitness cost and toxin resistance is present at a population frequency of approximately 10% or more. Mutational inactivation of the antidote gene is selected against and inconsequential. Mutational inactivation of the toxin and refractory genes is consequential, but is a feature of any toxin–antidote-based drive system, and can be forestalled to some extent through gene multimerization. There are likely other complicating factors that we have not considered.

As for any mathematical model, simplifications were made that may compromise the quality of the predictions. In using difference equations for the majority of our models, we considered an infinite, randomly mating population with discrete, nonoverlapping generations. We ignored the population structure of mosquito populations, for example, spatial structure, age structure, and mating structure. We also ignored density dependence and behavior effects. Despite this, discrete-generation difference equations have been successfully used to gain insight into several other gene drive systems (Wade and Beeman 1994; Davis et al. 2001; Deredec et al. 2008; Gould et al. 2008), including those generated using nuclear-encoded CI-causing factors (Turelli and Hoffmann 1999), which display analogous dynamics to the Semele system. We believe there is a mandate to use these models and that they capture the main features of Semele dynamics.

Engineering the Semele system:

Finally, there is reason to believe that the Semele drive system can be constructed using existing reagents or reagents that could be created using existing technologies. Synthesis could be achieved in three ways: first, by manipulating gene expression in male and female somatic tissues; second, by manipulating gene expression in the germ line; and third, by isolating the genes that mediate CI and inserting these onto a nuclear chromosome under the control of promoters that recapitulate their patterns of expression in an infected insect. In the case where genes are expressed in somatic tissues, peptides or proteins can be expressed under the control of promoters that drive expression in male accessory glands (Sirot et al. 2009). If these proteins enter the female along with other seminal fluid components during mating, they may be able to disrupt essential functions, perhaps in the nervous system. Candidates might include insect-specific peptide neurotoxins (Nicholson 2007). The activity of these toxins in female recipients could be blocked through the female-specific expression and secretion into the hemolymph of neutralizing antibodies or through female-specific expression of variants of the receptor targeted by the toxin that are insensitive to toxin function, perhaps in conjunction with the use of RNAi to silence expression of the transcript encoding the endogenous receptor.

In the case where genes are expressed in the germ line, sperm-based toxins could be used in conjunction with oocyte- or egg-based antidotes. Candidate toxins include DNases that cleave zygotic DNA following fertilization, but do not cleave haploid sperm DNA. In this approach, strategies are necessary to prevent toxin expression until late stages of spermatogenesis, by which time spermatid chromatin is in a highly condensed form, hopefully resistant to cleavage. UTR sequences that mediate translational repression during earlier stages of spermatogenesis provide one approach to achieving this goal (Schafer et al. 1995; Blumer et al. 2002). Sperm-specific expression of microRNAs that are designed to translationally silence the toxin-encoding transcript provides another approach. To protect against a sperm-based toxin, the antidote must be presynthesized in the newly fertilized egg, ready for immediate action following sperm entry and chromatin decondensation. Candidate antidotes might include a protease that cleaves a target site engineered into the toxin or a maternally expressed intrabody (a cytoplasmic version of an antibody; Lo et al. 2008) that binds to the nuclease, neutralizing its toxic function.

A first step toward a sperm-based toxin has been taken with the demonstration that expression of the homing endonuclease I-PpoI under the control of a male germ-line-specific promoter results in zygote lethality in A. gambiae. Lethality is caused by cleavage of I-PpoI target sites in zygote DNA, found in multiple copies in the X chromosome-linked 28S ribosomal RNA gene cluster (Windbichler et al. 2008). I-PpoI expression in the male germ line also results in bias toward Y-bearing spermatozoa, probably due to I-PpoI-dependent cleavage of the ribosomal gene cluster found in X-bearing sperm. This damage presumably leads to their loss during spermatogenesis. This last fact highlights the necessity of being able to silence nuclease expression until stages of spermatogenesis in which nuclear DNA is hidden from such an enzyme.

Semele could also be created using molecules (presumably proteins) that mediate Wolbachia-induced CI. Unidirectional CI is seen in crosses between infected and uninfected individuals: matings between infected males and uninfected females result in death of some or all progeny, while matings between infected or uninfected males and infected females produce viable, infected progeny. As a result, infected females gain a reproductive advantage in the presence of Wolbachia, and since Wolbachia is transmitted through the female germ line, it benefits as well (Werren et al. 2008). Unidirectional CI behaves as though sperm produce a toxin, which is counteracted in the zygote by a maternally provided antidote. The idea then is to link genes that mediate toxin and rescue CI activities and express them from the nuclear genome in patterns that facilitate their CI-inducing function (Sinkins et al. 1997; Turelli and Hoffmann 1999). This approach to Semele generation is attractive because CI can be quite robust. In addition, Wolbachia strains exist that display CI but do not rescue each other, implying the existence of multiple, independent toxin–antidote functions. These positive points notwithstanding, the genes mediating CI remain to be identified. In addition, it is unclear whether the site and timing of expression of these proteins can be recapitulated from the nuclear genome. It is also important to note that Semele elements generated using components of the CI system could be used only in populations uninfected by Wolbachia bacteria expressing the same CI proteins.

Finally, it is interesting to consider the possibility that Semele elements could evolve in nature as a consequence of male–female conflict over reproduction. Male mating often results in a cost to females as a consequence of male traits designed to increase their paternity. These costs are sometimes mediated by seminal fluid components or modifications of sperm. In response, females sometimes evolve counteradaptations that decrease these costs (Chapman 2006; Parker 2006; Wolfner 2009). Typically, these effects are imagined to be the results of actions at unlinked loci. Here we note that linkage between a gene that mediates a male-derived cost to females and a locus expressed in females that counters this cost would create a Semele-like element. While Semele requires a threshold frequency to be exceeded for spread to occur, even when it carries no fitness cost, highly structured populations may provide an environment in which such elements could gain a foothold. Over evolutionary time, it is possible that such elements could sweep through a population, with the only hint that such an element existed being tight linkage between genes regulating fitness in response to mating in a reciprocal manner.

Acknowledgments

The authors thank Fred Gould for helpful insight into the Semele system, Catherine Ward for helpful discussions on model design, and two anonymous reviewers whose constructive comments improved the manuscript. John M. Marshall was supported by grant DP1 OD003878 to Bruce A. Hay from the National Institutes of Health.
