Abstract

Allopolyploid hybridization serves as a major pathway for plant evolution, but in its early stages it is associated with phenotypic and genomic instabilities that are poorly understood. We have investigated allopolyploidization between Arabidopsis thaliana (2n = 2x = 10; n, gametic chromosome number; x, haploid chromosome number) and Cardaminopsis arenosa (2n = 4x = 32). The variable phenotype of the allotetraploids could not be explained by cytological abnormalities. However, we found suppression of 20 of the 700 genes examined by amplified fragment length polymorphism of cDNA. Independent reverse transcription–polymerase chain reaction analyses of 10 of these 20 genes confirmed silencing in three of them, suggesting that ∼0.4% of the genes in the allotetraploids are silenced. These three silenced genes were characterized. One, called K7, is repeated and similar to transposons. Another is RAP2.1, a member of the large APETALA2 (AP2) gene family, and has a repeated element upstream of its 5′ end. The last, L6, is an unknown gene close to ALCOHOL DEHYDROGENASE on chromosome 1. CNG DNA methylation of K7 was less in the allotetraploids than in the parents, and the element varied in copy number. That K7 could be reactivated suggests epigenetic regulation. L6 was methylated in the C. arenosa genome. The present evidence that gene silencing accompanies allopolyploidization opens new avenues to this area of research.

INTRODUCTION

Allopolyploids are hybrids whose genomes contain a complete diploid set of chromosomes from each parental species. Both parental genomes are maintained with little change through successive generations by limiting meiotic pairing to homologous chromosomes (i.e., from the same genome, in contrast to homeologous, from different genomes). Although many wild and cultivated allopolyploids that originated in prehistoric times are fertile, well adapted, and genetically stable, allopolyploids of more recent origin commonly display genomic and phenotypic instability (Pope and Love, 1952; Allard, 1960; Gupta and Reddy, 1991). Genomic instability has been shown to involve changes in chromosome structure, including the appearance (Burns and Gerstel, 1967) or disappearance (May and Appels, 1980) of heterochromatic blocks, the loss of nucleolar organizing regions (Vaughan et al., 1993), rearrangements of repeated DNA (Kamm et al., 1995; Zhao et al., 1998), and frequent changes in restriction fragment length polymorphism patterns (Song et al., 1995). Phenotypic variability of synthetic allotetraploids has been shown to involve numerous abnormalities, including sterility (Leitch and Bennett, 1997); novel homeotic phenotypes in Digitalis (Schwanitz, 1957), Gilia (Grant, 1956), and cotton (Meyer, 1970); flower variegation in Nicotiana (Gerstel and Burns, 1966); and the global dominance of one parental phenotype (Heslop-Harrison, 1990).

The causes of these extreme phenotypes are largely unknown. McClintock (1984) invoked the concept of genomic shock, which she defined as a preprogrammed response to an unusual challenge that led to extensive restructuring of the genome. A possible contributor to the “unusual challenge” is epigenetic gene silencing, which is triggered by homologous gene–gene interactions (Meyer and Saedler, 1996; Matzke and Matzke, 1998). The sudden union of redundant and diverged homeologous sets of genes in allopolyploids could trigger widespread gene silencing (Leitch and Bennett, 1997; Henikoff and Comai, 1998b; Rieseberg and Noyes, 1998) with accompanying changes in chromatin structure and DNA methylation (Henikoff and Matzke, 1997). Whatever its cause, the greater instability of synthetic allopolyploids than of established allopolyploids suggests that the latter have evolved mechanisms to gain fertility and stabilize their phenotypes and genomes (Sears, 1976; Eckenwalder and Brown, 1986; Gupta and Reddy, 1991). The importance of allopolyploid hybridization to basic and applied botany makes this area worthy of further investigation.

The model plant Arabidopsis thaliana should prove valuable for a detailed analysis of allopolyploidy. This species is the maternal parent of the allotetraploid A. suecica, a selfing species native to northern Europe. The parental origin of the A. suecica chromosomes was demonstrated by DNA sequence analyses (Price et al., 1994; Kamm et al., 1995; O’Kane et al., 1996) and by in situ hybridization to the repeat sequences of either A. thaliana or Cardaminopsis arenosa, which mark 10 and 16 centromeres, respectively (Kamm et al., 1995). The 26 chromosomes of A. suecica (2n = 26, set A′A′C′C′; n, gametic chromosome number; x, haploid chromosome number) make up the two sets depicted in Figure 1A: 10 from the diploid selfing species A. thaliana (2n = 2x = 10, set AA) and 16 from the tetraploid outcrossing species C. arenosa (2n = 4x = 32, set CCCC). Although very closely related in some aspects, these two taxa nonetheless exhibit 5 to 8% divergence of nucleotide sequence in protein-coding genes (Hanfstingl et al., 1994; Henikoff and Comai, 1998a) and 30 to 40% divergence in the 180-bp centromeric repeats (Martinez-Zapater et al., 1986; Vongs et al., 1993; Round et al., 1997).

The availability of allopolyploids of Arabidopsis could help us address genetic and evolutionary aspects of allopolyploidy. However, the utility of A. suecica for this purpose is limited by the probable genotypic divergence of any contemporary stocks from plants that were parental to the allopolyploidization. Nonetheless, synthetic allotetraploids generated from our current stocks should approximate the ancestral hybridization and permit precise comparison of parental and allotetraploid phenotypes as well as provide a perspective on what changes may occur early in allotetraploid evolution.

Our characterization of synthetic allotetraploids between A. thaliana and C. arenosa indicates that they are indeed phenotypically unstable and less fit than the parents and that this instability cannot be explained by the meiotic behavior of the allotetraploids. Interestingly, analysis of the expression of random genes demonstrated gene silencing in the allotetraploids. The characterization of these silenced genes revealed a role for epigenetic regulation and repeated sequences in silencing.

RESULTS

Hybridization and Seed Development

To reconstruct an Arabidopsis allotetraploid similar to A. suecica, we performed interspecific crosses in both directions between diploid or tetraploid A. thaliana Landsberg erecta (Ler) and C. arenosa Care-1 (all C. arenosa isolates are tetraploid). The schematic karyotypes are represented in Figure 1. On cross-pollination of C. arenosa, most siliques failed to develop. The seeds of cross-pollinated diploid A. thaliana plants enlarged at the normal rate for 10 days, but the embryos within these seeds became arrested at the globular stage; subsequently, the seeds turned brown and collapsed (Table 1). However, when we cross-pollinated a tetraploid A. thaliana, the seeds contained embryos at various more advanced stages of development, and although the majority of the seeds turned brown and collapsed, a small fraction matured. The development of the embryos in these crosses was scored at silique maturity and assigned the following stages: 70% globular, 7% heart to torpedo, 6% walking stick, 10% mature (including normal and malformed), and 7% viviparous. Approximately half of the matured seeds produced hybrid plants. The other half of the matured seeds, as well as a few rare seeds obtained by cross-pollination of C. arenosa, produced plants that resembled the maternal parent and lacked the paternal genome, as determined by random amplified polymorphic DNA (RAPD) analysis (data not shown). Clearly, hybridization between these species is severely challenged but can succeed when parental genomic ratios are balanced.

F1 Phenotype

A. thaliana and C. arenosa differ in overall size (small versus large), leaf shape (smooth versus serrated), mating habit (selfing versus outcrossing), and flower size, shape, and petal color. Various phenotypic features of A. thaliana are compatible with a selfing habit, whereas the outcrossing habit of C. arenosa depends on both morphological features and self-incompatibility. The rosette and cauline leaves of the hybrid plants resembled those of C. arenosa and A. suecica in serration and size (Table 2). Although several of the hybrids died for unknown reasons before flowering, four hybrids (of ∼300 original interspecific zygotes) flowered and set seed, establishing hybrid lines. Replicated crosses have demonstrated the reproducibility of the above method for generating allotetraploid lines (data not shown).

Figure 2 illustrates the hybrid phenotype: three of the four hybrids (605A, 605B, and 49-2B) had flowers that were intermediate in size and color to those of the parents (Figure 2H). In addition, the flowers were scented, had well-developed nectaries, and were slightly zygomorphic. Although pollen production was poor, manual pollination resulted in strong carpel elongation. Microscopic examination of the ovules within these carpels revealed developing embryos and endosperm, indicating self-compatibility and good fertilization. Nevertheless, many embryos then arrested at various developmental stages, and only ∼20% of the seeds were viable (Figure 2D). In most cases, as shown in Table 3, cross-pollination with pollen from A. thaliana (either diploid or tetraploid), C. arenosa, or A. suecica led to better seed set than did selfing. The hybrids resembled C. arenosa and A. suecica in their perennial growth habit. Whereas A. thaliana undergoes senescence and dies after a majority of siliques have matured, C. arenosa, A. suecica, and these three hybrids displayed abundant new basal growth and produced many flowers.

The fourth hybrid, 49-2A, behaved differently. Although it formed a rosette resembling those of the other hybrids, it grew more slowly and flowered later. Its inflorescence resembled that of A. thaliana, the cauline leaves were smooth-margined instead of serrated, and the flowers were smaller and had shorter petals. Although these flowers were efficient at self-pollination, the seeds produced were mostly shrunken and inviable. Molecular marker analysis, in addition to the progeny phenotype and karyotype (see below), confirmed the hybrid nature of this plant, which died after forming <30 flowers.

Molecular Markers

We confirmed the genomic constitution of the hybrids, their parents, and A. suecica Sue-2 by DNA fingerprinting with RAPD (Williams et al., 1990) and microsatellite (Bell and Ecker, 1994) markers (http://faculty.washington.edu/comai/molmarks.htm). These assays suggested considerable heterozygosity in the C. arenosa genome. For example, analysis of the inheritance of the chromosome 1 microsatellite nga111 in the F2 plants derived from the dimorphic F1 hybrid 49-2B and the trimorphic hybrid 605B showed that all F2’s had the A. thaliana allele (162 bp) but differed in the alleles inherited from C. arenosa. The 49-2B progeny had a single 116-bp form, whereas the 605B progeny showed segregation of the 116- and 132-bp forms (data not shown here, but see http://faculty.washington.edu/comai/nga111.htm). Of 44 individuals scored, 21 were homozygous for the 116-bp allele, four were homozygous for the 132-bp allele, and 19 were heterozygous. A χ2 test of the null hypothesis of a 1:2:1 ratio gives P = 0.0009; therefore, the F2 scoring indicates strong segregation distortion. Segregation distortion in hybrids has been attributed to the interaction of incompatible genic systems such as those resulting in gametocidal action (Endo, 1990).
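The reported probability can be checked directly from the genotype counts. The sketch below is a minimal illustration, assuming a standard goodness-of-fit χ2 test with 2 degrees of freedom (the text does not state the degrees of freedom or the software used); the function name `chi_square_1_2_1` is illustrative and not part of the original analysis.

```python
import math


def chi_square_1_2_1(hom_a, het, hom_b):
    """Goodness-of-fit test of genotype counts against a 1:2:1 ratio.

    Assumes 2 degrees of freedom (three classes, no estimated parameters).
    """
    total = hom_a + het + hom_b
    observed = (hom_a, het, hom_b)
    expected = (total / 4, total / 2, total / 4)
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 2 degrees of freedom, the chi-square upper-tail probability
    # has the closed form exp(-x / 2).
    p_value = math.exp(-statistic / 2)
    return statistic, p_value


# Counts from the text: 21 homozygous (116 bp), 19 heterozygous,
# 4 homozygous (132 bp).
statistic, p_value = chi_square_1_2_1(21, 19, 4)
print(f"chi-square = {statistic:.2f}, P = {p_value:.4f}")
```

With these counts the statistic is about 13.95 and the probability rounds to 0.0009, in agreement with the value given in the text.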

The pot size in (A) and (B) is 10 cm. The flower in (H) is 4 mm in diameter, and all flowers are to the same scale.

Cytology

We examined the karyotype of pollen mother cells from hybrid and parental plants to establish their expected chromosome number and to assess whether they displayed normal meiosis. Figure 3 provides examples of these analyses for hybrid 605B, and Figure 4 provides examples of parental karyotypes and other hybrid karyotypes. We anticipated that the hybrids would have 26 chromosomes—the 10 chromosomes from A. thaliana and 16 from C. arenosa. Indeed, we observed this number in all three of the F1 hybrids that were available for analysis, as well as in the F2’s of hybrid 49-2A, whose F1 parent had died prematurely. Thirteen bivalents were visible at metaphase I (e.g., Figure 3A), and 13 chromosomes (chromatid pairs) could be seen in the metaphase II plates (Figures 3F and 3G). Thus, the allotetraploids had the expected chromosome numbers.
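The expected chromosome number follows from simple gamete arithmetic. The sketch below is only an illustration: it assumes reduced gametes carrying half the somatic number and strictly homologous pairing at meiosis I, and the somatic number of 20 for tetraploid A. thaliana is inferred here by doubling the diploid number rather than taken from the text. All names are hypothetical.

```python
# Somatic (2n) chromosome numbers. The value 32 for C. arenosa is given in
# the text; 20 for tetraploid A. thaliana is an inference (2 x 10).
SOMATIC_NUMBER = {
    "A. thaliana (tetraploid)": 20,
    "C. arenosa (tetraploid)": 32,
}


def gamete_number(parent):
    """Chromosome number of a reduced gamete (half the somatic number)."""
    return SOMATIC_NUMBER[parent] // 2


def hybrid_number(maternal, paternal):
    """Somatic chromosome number of the F1 hybrid zygote."""
    return gamete_number(maternal) + gamete_number(paternal)


allotetraploid = hybrid_number(
    "A. thaliana (tetraploid)", "C. arenosa (tetraploid)"
)
# With strictly homologous pairing, every chromosome has one partner.
bivalents = allotetraploid // 2
print(allotetraploid, bivalents)
```

Under these assumptions the hybrid carries 10 + 16 = 26 chromosomes, which pair as 13 bivalents, matching the counts observed at metaphase I and II.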

Homeologous pairing and recombination in the hybrids between A. thaliana and C. arenosa might have been expected to result in the formation of multivalents and univalents during meiosis I and, if chromosomal inversions differentiated the A. thaliana and C. arenosa genomes, in the formation of anaphase bridges. Indeed, some of the pollen mother cells displayed meiotic abnormalities such as laggard chromosomes and chromosomal bridges. Although the small chromosome size made it difficult to count paired chromosomes in many cells, all genotypes (except diploid A. thaliana; Figures 4A to 4D) clearly formed some multivalents, which were evident as figures of greater complexity (Figures 3B, 3C, 4G, and 4I) than were the normal rod and ring bivalents (Figures 3A, 4A to 4C, and 4J). In addition, the abnormal segregations seen as bridges and lagging chromosomes were observed in anaphase cells of both C. arenosa (Figure 4L) and A. suecica (Figures 4N and 4O), as well as in those of the synthetic hybrids. The incidence of meiotic abnormalities per observed meiosis was 5 to 15% for the synthetic hybrids, 10% for one C. arenosa plant and 0% for a second one, and 2 to 25% for A. suecica (see Methods for details). In conclusion, meiotic abnormalities were evident in the hybrid but were not demonstrably more extreme than those seen in the C. arenosa parent or in the natural allotetraploid.

Phenotypic Variation and Instability among Allotetraploid F2 Progeny

The allotetraploid F2’s displayed considerable variation in morphology, flowering time, and fertility—in contrast to the uniform phenotypes of Sue-1, Sue-2, and several A. thaliana ecotypes grown in the same environment. Leaf and flower phenotypes are illustrated in Figure 2, and the variation in several characters is indicated in Table 4. These striking patterns of variation could be caused by segregation of alleles within the C. arenosa genome, which is highly heterozygous.

In addition to phenotypic variation, phenotypic instability was observed in several instances: First, the phenotype of all 12 of the 49-2A F2 plants grown was distinctly allotetraploid (having large pink flowers and serrated leaves), even though the plants were derived from seeds produced by flowers similar to those of A. thaliana. In the F3 and F4 generations, the allotetraploid phenotype was inherited exclusively. The phenotype of the F1 and its progeny cannot be readily explained by Mendelian genetics. Second, reversion to an A. thaliana–like phenotype was again observed in the F2 of 605B, as exemplified in Figure 5. The switch to the Arabidopsis parental phenotype affected the first few flowers of basal lateral inflorescences. The subsequent flowers produced by this individual reverted progressively to the allotetraploid phenotype (Figures 5A to 5D). Eight of 13 F2 individuals in this family displayed dimorphic flowers: normal allotetraploid flowers and flowers with shorter petals. Third, variegated epidermal characters were observed, such as the stem anthocyanin shown in Figure 5E. Fourth, in >50% of the F2’s, the first one to three flower pedicels were aberrant; some formed a bifurcation (Figures 5F and 5G), whereas others were sharply basipetal rather than orthogonal like their parents (Figures 5I and 5J). We hypothesize that these structures resulted from partial arrest of the shoot apical meristem, followed by switching the main growth axis to a lateral shoot and converting the original shoot apical meristem to a lateral floral shoot.

Meiotic figures in pollen mother cells of the F1 hybrid 605B. Those in (A), (E) right, (F), (G), and (H) left are apparently normal; those in (B) to (D), (E) left, (H) right, and (I) are abnormal. Thirteen chromosomes are visible (some overlapping) within the metaphase II plates ([F] to [H]). A three-pointed multivalent chromosomal association is visible in the center of the cell in (B). The meiotic phase is indicated in the upper right corner of each figure (A, anaphase; M, metaphase; I and II, meiotic division numbers; MI/AI, transition between phases). The arrows indicate laggards, and the arrowhead indicates a chromosomal bridge. The bar in (A) = 5 μm for (A) to (I).

Gene Silencing and Reactivation in the Allotetraploids

In the absence of any evidence that the allotetraploids were unusually unstable in their karyotypes, we sought an explanation for their phenotypic instability in gene silencing or ectopic gene expression. To examine this possibility, we compared gene expression in the allotetraploids with that in the parents. Because changes in ploidy have been associated with changes in gene expression (Guo et al., 1996; Mittelsten Scheid et al., 1996), we undertook these comparisons in lines of the same ploidy, the parents being autotetraploid and the progeny allotetraploid. Using these strains, we first screened for differences in gene expression by means of amplified fragment length polymorphism (AFLP) analysis (Bachem et al., 1996), a polymerase chain reaction (PCR)–based method that displays random restriction enzyme fragments of cDNA (hereafter referred to as “cDNAs”) on denaturing polyacrylamide gels. Analysis of ∼700 cDNAs derived from leaf or flower mRNA (see Methods) revealed frequent differences between F2 allotetraploids and the parents, as shown in Figure 6. Specific cDNAs varied in intensity between genotypes and, interestingly, were undetectable from some genotypes (Figure 6A). In several cases, cDNAs that could be detected in the parents were absent in certain allotetraploids. On the other hand, in two cases, cDNAs were present only in the allotetraploids. Because of their relative abundance, we focused on those cDNAs that were absent in the allotetraploids. We identified nine Arabidopsis cDNAs that were absent in all tested allotetraploids and eight that were absent in a subset of the allotetraploids. Similarly, eight Cardaminopsis cDNAs were absent in all tested allotetraploids, whereas ∼10% of Cardaminopsis cDNAs were absent in some but not all of the allotetraploids. Because heterozygosity is common within the C. arenosa genome owing to its outcrossing habit (see “Molecular Markers”), absence from the allotetraploids of cDNAs specific to C. arenosa might reflect the fact that a putatively polymorphic allele was not inherited. Therefore, we chose as candidates for silenced alleles only those eight cDNAs that were absent in all tested allotetraploids.

To verify the reliability of the AFLP-cDNA analysis and establish whether a silenced gene was present, we cloned DNAs from 10 differential gel bands, sequenced them, and designed primers (18 to 22 nucleotides long) specific for each gene. Using these primers to conduct reverse transcription (RT)–PCR, we tested the expression of each candidate gene in new RNA preparations from the allotetraploids that were used in the original AFLP-cDNA analysis (Figure 6C, individuals 1 and 12), from the 10 F2 siblings of each allotetraploid, and from selected F3 progeny (Figures 6B and 6C). We confirmed that three of the candidate genes, designated K7, K9, and L6, were silenced; at the same time, we confirmed the presence of the relevant genes by PCR of genomic DNA. Expression of gene K7 was detected in C. arenosa but was absent in the two allotetraploids (Figure 6). It was also absent in 10 of 10 siblings of individual 12 and in eight of 10 siblings of individual 1. It reappeared in one of eight F3 progeny. This positive F3 individual was produced by individual 3, a nonexpressor (Figures 6B and 6C), thus indicating reactivation of this element.

We also confirmed the silencing of two other genes among those that had provisionally been identified by AFLP-cDNA analysis as candidates for silencing. Both genes were silenced in a single allotetraploid individual but remained active in its siblings (Figure 6A; data not shown). Several other cloned cDNAs that seemed likely to represent silenced genes on the basis of the AFLP-cDNA analysis were shown by RT-PCR to be positive for expression. The differential behavior of these cDNAs during the AFLP-cDNA analysis could have had any of several causes. For example, these cDNAs could be subject to undetermined PCR artifacts, or if these cDNAs were derived from C. arenosa, they could represent alleles that were polymorphic in one of the two restriction sites defining each AFLP-cDNA product. Alternatively, they could represent minor background cDNAs that were coelectrophoresed and coeluted from the gel band with a cDNA that showed differential activity. Finally, they could represent genes that were only partially silenced but retained sufficient activity to score as positive under our RT-PCR conditions. Although seven of these isolates failed to pass the RT-PCR test for silencing, three of the original 10 cDNAs were confirmed to represent bona fide silenced genes in the allotetraploids. Given that these three were identified from ∼700 that were sampled, we estimate that at least ∼0.4% of the genes may be silenced in allotetraploids.
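The frequency estimate is a simple proportion of confirmed silenced genes among the cDNAs screened. The sketch below only restates that arithmetic; the function name is illustrative, and the result is treated as a minimum, as in the text.

```python
def silenced_fraction_percent(confirmed_silenced, cdnas_screened):
    """Percentage of screened cDNAs confirmed as silenced by RT-PCR."""
    return 100.0 * confirmed_silenced / cdnas_screened


# Three confirmed silenced genes among approximately 700 cDNAs screened.
estimate = silenced_fraction_percent(3, 700)
print(f"{estimate:.1f}%")
```

The proportion is about 0.43%, which rounds to the ∼0.4% cited.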

Characterization of the Silenced Loci

The cloned differential AFLP-cDNAs were derived from genes of either C. arenosa (K7 and L6) or A. thaliana (K9). The Arabidopsis genome should contain the K7 and L6 homeologous genes (expected to have 92 to 95% identity) and the K9 gene. We investigated the nature of the loci encoding these transcripts by a combined approach that included similarity searches of DNA databases, cloning of the DNA flanking the AFLP-cDNAs, and gel blot analysis. Our results are summarized in Figure 7.

Apparently normal ([A] to [D], [F], [H], [J], [K], and [M]) and abnormal ([E], [G], [I], [L], and [N] to [Q]) meiotic figures of pollen mother cells from diploid A. thaliana ([A] to [D]) and C. arenosa ([F], [G], [K] to [M]), from tetraploid A. thaliana ([H] and [I]), from F1 hybrids 49-2B ([E], [P], and [Q]) and 605A (J), and from A. suecica ([N] and [O]). The meiotic phase is indicated in the upper right corner of each figure (A, anaphase; M, metaphase; T, telophase; I and II, meiotic division numbers). The arrows indicate laggards or a micronucleus formed around laggards (E), and the arrowheads indicate chromosomal bridges. The bars in (A) to (Q) = 5 μm.

We searched the database of sequenced A. thaliana DNA by BLASTN (Altschul et al., 1990) analysis with the K7 sequence but detected no match of the expected homeology, indicating that this DNA element is absent from the sequenced genome database. We found a match of low but significant similarity between K7 and the long terminal repeat (LTR) of a copia-like retrotransposon found in bacterial artificial chromosome F28H18 (nucleotides 49,801 to 56,100, E = 4.5E–4), which we named Brewmeister1. Brewmeister1 is probably inactive, having several lesions in its coding region. K7 was even more similar to “solo” copies of this LTR (found at loci F14P22 and T3E15) that occur within genes putatively encoding products related to those of DNA transposons. Thus, K7 could be expressed from a gene related to either a retroelement or a DNA transposon. To distinguish between these and other possibilities, we isolated DNA flanking the K7 sequence from the genomes of A. thaliana and C. arenosa by inverse PCR (IPCR) (Ochman et al., 1988). Regardless of whether the template genomic DNA was undigested (and unligated) or had been digested with any of several restriction enzymes before ligation, the IPCR products were of similar size (2.2 kb), suggesting that circularization of a restriction fragment carrying K7 was not necessary for successful PCR with diverging IPCR primers. We assumed, therefore, that tandem repeats containing K7 were present in both parental genomes. The K7-flanking sequences isolated from both species were similar and defined several repetitive DNA elements. Their ensemble is defined here as the K7 repeat. To determine whether the K7 repeat is a gene, we subjected its sequence to BLASTX and gene prediction analysis (see Methods). The results indicate that a gene or pseudogene (Figure 7, G7) spans the original K7 AFLP-cDNA. This putative gene encodes a protein related to the transposon En/Spm hypothetical protein 1 (pirS29329; E = 0.083). 
Similar proteins are encoded by the F14P22 and T3E15 loci. Additional repetitive elements are found 5′ and 3′ of the putative gene (Figure 7, U7, D7, and Gnomo) but do not occur in loci F14P22 and T3E15. Instead, they are found separately and individually in the A. thaliana genome (e.g., U7 matched LERJF13TF; Gnomo matched 77 elements), and the element called Gnomo is also found in the genomes of other taxa such as mulberry and Brassica juncea.

Figure 8 shows gel blot analysis of genomic DNA probed with the K7 cDNA (Figure 8A) or with the K7 repeat (Figures 8C and 8D; see legend). Sequences showing high similarity to the K7 repeat are present in two to five copies in the A. thaliana genome but in higher copy numbers (20 to 50) in the C. arenosa genome. This indicates that many copies have been gained by C. arenosa or eliminated by A. thaliana, given that these two species diverged from a common ancestor. Surprisingly, K7 elements are present in variable copy numbers in the allotetraploids. For example, two F3 allotetraploids displayed a higher copy number than the C. arenosa parent (Figures 8A and 8B). Most copies were refractory to HpaII digestion but were partially digested by MspI, indicating heavy CG methylation and partial CNG methylation. In the allotetraploids, the K7-related elements were more susceptible to MspI digestion than were those in C. arenosa, implying less extensive CNG methylation of the K7 elements. Taken together, these data indicate that the K7 gene is related to transposable elements and that it may be heterochromatic in both parental genomes.
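The logic behind the HpaII/MspI comparison can be laid out explicitly. The sketch below is a simplified model that assumes the commonly cited sensitivities of these isoschizomers at CCGG sites: MspI is blocked only by methylation of the outer cytosine (a CNG context), whereas HpaII is blocked by methylation of either cytosine. It ignores hemimethylation and partial digestion, and the function and variable names are hypothetical.

```python
def is_cut(enzyme, outer_c_methylated, inner_c_methylated):
    """Return True if a CCGG site is cut under this simplified model."""
    if enzyme == "MspI":
        # MspI is blocked by methylation of the outer cytosine (CNG context).
        return not outer_c_methylated
    if enzyme == "HpaII":
        # HpaII is blocked by methylation of either cytosine.
        return not (outer_c_methylated or inner_c_methylated)
    raise ValueError(f"unknown enzyme: {enzyme}")


def interpret(hpaii_cuts, mspi_cuts):
    """Infer the methylation state of a site from the two digests."""
    if hpaii_cuts and mspi_cuts:
        return "unmethylated"
    if not hpaii_cuts and mspi_cuts:
        return "CG methylation only"
    if not hpaii_cuts and not mspi_cuts:
        return "CNG methylation (CG status unresolved)"
    return "not expected under this model"


# Digestion pattern (HpaII cuts, MspI cuts) for each methylation state,
# keyed by (outer C methylated, inner C methylated).
PATTERNS = {
    (outer, inner): (is_cut("HpaII", outer, inner), is_cut("MspI", outer, inner))
    for outer in (False, True)
    for inner in (False, True)
}

for state, (hpaii, mspi) in PATTERNS.items():
    print(state, "->", interpret(hpaii, mspi))
```

Under this model, copies that resist HpaII but are cut by MspI carry CG methylation without CNG methylation, and copies that resist both enzymes carry CNG methylation. Greater susceptibility to MspI in the allotetraploids therefore corresponds to fewer sites with CNG methylation.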

K9, one of the other two AFLP-cDNAs, is transcribed from the A. thaliana genome and matches a partial cDNA encoding the RAP2.1 protein of the APETALA2 (AP2) family (Okamuro et al., 1997). Approximately 10 genes in the Arabidopsis DNA database shared 50 to 75% identity with some portion of the RAP2.1 cDNA. The genomic regions flanking RAP2.1, which we isolated by IPCR as a 5-kb fragment, were not represented in the database. An ATG codon is present a few codons upstream of the available cDNA entry. Further upstream and downstream of the RAP2.1 cDNA sequence are regions that were assigned low coding probability by the gene prediction program NetPlantGene2 (Hebsgaard et al., 1996). Approximately 0.7 kb upstream of the putative RAP2.1 start codon, an open reading frame diverges from RAP2.1 and may represent a different gene, here called G9. G9 is similar to ∼100 A. thaliana sequences (65 to 90% identity) and is therefore repeated (some G9 repeats have 8-bp inverted repeats, suggesting that they are mobile elements; B. Belknap, personal communication). To determine whether mutations in the promoter region could have caused the silencing, we sequenced 800 bp of the RAP2.1 5′ region, which presumably includes the promoter, from both the parent A. thaliana and the silenced allotetraploid and found no changes. Hybridization of the sequences of RAP2.1 isolated by IPCR to A. thaliana DNA confirmed the existence of closely related repeated sequences in the A. thaliana genome (Figure 8G) that were at least partially methylated at CG sites (data not shown). The K9 flanking regions did not hybridize to the C. arenosa DNA (Figure 8G), which suggests that the regions were absent or that considerable divergence had occurred.

The last AFLP-cDNA, L6, is from C. arenosa and displays high similarity to a sequenced region of the A. thaliana genome adjacent to ALCOHOL DEHYDROGENASE 1 on chromosome 1. This relationship was confirmed by analysis of the flanking DNA isolated by IPCR from both A. thaliana and C. arenosa, which verified that L6 is the transcription product of the C. arenosa homeolog. A small gene, F22K20.6, is predicted around nucleotide 39,000. Both F22K20.6 and L6 are moderately well represented in the database of expressed sequence tags and could be part of the same transcriptional unit. The L6/F22K20.6 region has limited similarity to three genomic DNAs in the Arabidopsis database. When several randomly chosen A. thaliana genes were tested by BLASTN analysis, they showed database matches of comparable similarity and frequency (L. Comai, unpublished observations). Therefore, unlike K7 and K9, the L6 locus does not show an unusual association with repeated DNA. Gel blot analysis of genomic DNAs probed with L6 revealed an unexpected HindIII digestion pattern (Figure 8F): a 0.9-kb digestion product of A. thaliana DNA was absent in the allotetraploid and in C. arenosa DNA lanes, which instead showed an additional 2-kb band. Asymmetric methylation of the HindIII site (Nelson et al., 1993) might have prevented complete digestion of the allotetraploid and of the C. arenosa DNA. Methylation of the C. arenosa homeolog was detected in the DNA of at least one tested allotetraploid and in one C. arenosa individual.

(A) to (D) Switch in flower morphology. The main inflorescence produced typical hybrid flowers (A), but flowers on basal coflorescences varied from a small-petaled form (C) to a large-petaled one (B). All flowers are shown at anthesis. The position of the different flower types on the plant body is illustrated in (D).

(D) and (E) Variegation. Anthocyanin-free regions were present along an inflorescence stem (in the position indicated by the rectangle).

(F) to (J) Irregular transition. Irregular transition between vegetative and floral stages resulted in bifurcation of the growth axis. The normal A. thaliana phenotype is shown in (H). A related alteration led to formation of basipetal flower pedicels ([I] and [J]).

DISCUSSION

The causes of the phenotypic and genomic instabilities occurring in synthetic allopolyploids have long been mysterious. In this article, we show that gene silencing can take place in the F2 generation after allopolyploid hybridization of A. thaliana and C. arenosa, affecting both putative euchromatic genes and a repeated gene related to transposons. This discovery was made possible by the analysis of synthetic allotetraploids, which were generated through sexual crosses. Consistent with the instability of gene expression, these allotetraploids displayed phenotypic instability, although their rate of meiotic dysfunctions did not vary appreciably from the C. arenosa parental rate.

Synthesis and Characteristics of the Allotetraploid Hybrids

Synthesis of these allotetraploid hybrids was simple and reproducible, despite several barriers to hybridization. On average, pollination of just two to three flowers (∼75 zygotes) of tetraploid A. thaliana with C. arenosa (a natural tetraploid) was sufficient to produce each viable allotetraploid. These findings are relevant to the manner of hybridization that generated A. suecica, which has been the subject of debate (Redei, 1974). Our data suggest that this natural allotetraploid originated from pollination of a diploid egg of A. thaliana by C. arenosa pollen. Diploid A. thaliana fails to hybridize successfully with C. arenosa, because the triploid hybrid embryos (ACC genomes) arising from this cross arrest at the globular stage, even though triploid A. thaliana embryos (AAA genomes) develop relatively normally (Scott et al., 1998). Consistent with the A. thaliana–like cytoplasm of A. suecica (Price et al., 1994), hybridization between A. thaliana and C. arenosa was successful only if A. thaliana was the female parent. Failure of the reciprocal cross suggests that cytoplasmic–nuclear incompatibilities or parent-specific DNA imprinting (Scott et al., 1998) may affect the outcome.

Barriers to hybridization between these species were evident not only from the poor seed set and inviability but also from the phenotypic variability and instability of those hybrids that germinated. Although our cytological analysis of meiosis revealed some of the karyotypic abnormalities that are expected to cause inviability and instability (such as anaphase bridges and laggard chromosomes), similar defects were also seen at comparable frequencies in the C. arenosa parent as well as in the natural allotetraploid A. suecica. Furthermore, we found no evidence for any striking chromosomal instability that might explain stochastic changes in phenotype. More compellingly, the observed phenotypic instability often involved reversion between two states in a manner that would be inconsistent with any irreversible changes of chromosomal complement. More probably, the phenotypic instabilities seen in these allotetraploids result from regulatory dysfunctions such as epigenetic gene silencing (Meyer and Saedler, 1996; Matzke and Matzke, 1998).

Comparison of gene expression in the parents, tetraploid A. thaliana (At) and tetraploid C. arenosa (Ca), and in synthetic F2 and F3 allotetraploids (1 to 24).

(A) Portions of an AFLP-cDNA gel displaying random leaf RT-PCR products from parental and filial genotypes (1 and 12) and an artificial reconstruction of the expected allotetraploid expression pattern (AC) made by mixing the cDNAs. The band corresponding to K7 is indicated in the top section. Examples of AFLP-cDNA patterns are shown in the sections below.

(B) RT-PCR analysis of gene expression performed with mRNA preparations from allotetraploids and parents. Amplification of K9, K7, and L6 mRNAs is compared with that of the control ROC1 (CYC, cyclophilin) mRNA (Lippuner et al., 1994) or that of the control actin ACT2 mRNA. The unmarked lanes contain molecular size standards (a 25-bp ladder, in which the strongest band is 125 bp, or a 100-bp ladder). Lanes labeled DNA display control amplification products from genomic DNAs. Allotetraploids 2 to 11 and 13 to 22 were positive for the K7 gene (data not shown). Bl, “no template” control; P, K7 PCR product from a plasmid clone. The low CYC signal in the DNA control lane marked Ca (second gel from the top) is probably attributable to a low input of DNA, which would affect the repetitive K7 signal to a lesser degree.

(C) Genealogical tree showing three generations of the tested hybrids. Individuals (rounded squares) are numbered corresponding to the gel lanes in (A) and (B). The RT-PCR results for unnumbered individuals without the (?) mark are not shown.

Gene Silencing

A well-established case of gene silencing in interspecific hybrids—nucleolar dominance—is triggered in some manner by interactions between homeologous sets of ribosomal genes (Navashin, 1934; Durica and Krider, 1977). Our parallel investigation of nucleolar dominance in the hybrids used in the present study revealed differences in rRNA gene expression among the F1 hybrids, thus showing variation between codominance and dominance of the C. arenosa rRNA genes (Chen et al., 1998). This instability was resolved in the F2 plants, which consistently showed silencing of the A. thaliana genes. This dominance relationship could be reversed, however, by changing the parental genomic ratio from 1:1 (AACC) to 3:1 (AAAC). In this article, we extend the analysis of phenotypic instability in these allotetraploids to establish whether the genes that are transcribed by RNA polymerase II to generate mRNAs are also subject to silencing.

The schematic drawing represents the structure of the three silenced loci, showing the elements identified by BLAST analysis, algorithm prediction (see Methods), or transcription analysis. Gnomo, G7, K9, and G9 are highly to moderately repetitive elements. The putative repeated unit of K7 is indicated. The GenBank accession numbers are given in the lower right box. chr., chromosome; cM, centimorgan; Seq., sequence.

The present comparison of AFLP-cDNA products in the parents and the allotetraploids suggested the occurrence of gene silencing but also, though less frequently, gene activation in the allotetraploids. Sequencing candidate cDNAs followed by RT-PCR provided definitive evidence for gene silencing in three instances. The silenced K7 gene (the K7 repeat) is related to transposable elements; it appears to consist of a solo LTR embedded inside a putative transposon gene. In addition, the K7 locus is methylated, and its copy number differs between the parental genomes, being low in A. thaliana and higher in C. arenosa. Its chromosomal location in the A. thaliana genome is unknown. A very closely related element (T3E15) resides in a region of chromosome 4 that has been cytologically identified as paracentromeric heterochromatin and is densely populated by Athila, MuDR-like sequences, and retroelements (Fransz et al., 2000). Although silenced in most allotetraploids tested, the K7 gene was active in a few of them. One of these active individuals was the progeny of an F2 allotetraploid silenced for K7. In this lineage, therefore, the K7 gene had undergone reactivation, indicating that its initial silencing must not have been caused by a permanent genetic lesion. The two other silenced loci have no obvious association with transposons, although the K9 locus contains repeated DNA. K9, identified as RAP2.1, is a member of a large gene family that encodes proteins related to AP2; its chromosomal location is unknown. L6 is located at 114 centimorgans (from the top) on chromosome 1 close to ALCOHOL DEHYDROGENASE, in a region presumed to be euchromatic, although its homeolog in C. arenosa, unlike that in A. thaliana, was methylated at HpaII sites. L6 may be part of gene F22K20.6, which is annotated as encoding a protein of unknown function.

These results reveal the occurrence of rapid silencing of polymerase II–transcribed genes in interspecific hybridization. Such silencing has important implications for the phenotypes of hybrid organisms. The conjunction of two diverged genomes into a hybrid individual may reveal incompatibilities between biochemical, regulatory, or developmental pathways that are controlled by genes segregating in the usual Mendelian manner (Dobzhansky, 1937). If this hybridization is also accompanied by gene silencing, then the range of phenotypes exhibited by the hybrid could vary even more widely. On the one hand, homology-dependent silencing may completely prevent the expression of an important genetic function that otherwise would have been conferred by homeologous genes from both parental genomes; in that case, this silencing would strengthen the barrier to hybridization. On the other hand, gene silencing might improve the fitness of the hybrid individual by preventing the simultaneous expression of homeologous functions for which coexpression is incompatible with normal development. Given the instability of silencing interactions, this latter advantageous result might benefit the hybrid only transiently and would have to be replaced in subsequent generations by stable genetic changes conferred by mutation or gene conversion. Thus, gene silencing might either impede hybridization or provide a plasticity of gene expression that could foster the vitality of allotetraploids.

Gene silencing appears to have an especially crucial relationship, both as a cause and an effect, to the genomic instability that often accompanies allopolyploid hybridization (Song et al., 1995; Feldman et al., 1997; Liu et al., 1998). Instability of this sort might derive from the breakdown of genomic surveillance systems, as would occur if the silenced genes encode functions in DNA repair or in chromatin structure. In marsupial hybrids, for example, chromosome remodeling resulting from runaway transposition of retroelements is associated with hypomethylation of the hybrid genome (O’Neill et al., 1998). Similarly, reactivation of dormant transposons may be responsible for the frequent changes in repeated DNA elements that are often observed in allopolyploid genomes (Burns and Gerstel, 1967; Zhao et al., 1998). Although gene silencing in the Arabidopsis allotetraploids we examined affected genes that may be euchromatic, such as K9 and L6, it was more frequently directed against the K7 gene, a repeated element related to transposons, the copy number of which changed in some of the allotetraploid genomes. Silencing of the K7 gene may be related to a defense response against transposons (Comai, 2000). The hypothesis that K7-related elements become active in allopolyploidization, thereby triggering a genomic defense response, could be explored in greater detail by analyzing the expression, copy number, and chromosomal distribution of these elements in F1 and successive allotetraploid generations. On the other hand, silencing of K9 (RAP2.1) and L6 (perhaps F22K20.6) may well depend on paramutagenic interactions between homeologous genes. For example, given the methylated state of L6 in C. arenosa, this locus is possibly paramutable and perhaps paramutagenic.

(A) K7 hybridization to HpaII (H)-resistant DNA revealed heavy cytosine DNA methylation at CG sites. Partial resistance of hybridizing DNA to MspI (M) revealed reduced but still substantial methylation of CNG sites. These sites of the C. arenosa K7 sequences were less methylated in the hybrids. The copy number of K7 varied in different hybrids: the F2 hybrids of the 605B family displayed strong hybridization even though comparatively less DNA was loaded, as shown in (B). The probe used was a 200-bp PCR product corresponding to the K7 AFLP-cDNA amplified from the genomic DNA of C. arenosa. This probe distinguished the C. arenosa K7 locus from the A. thaliana homeolog.

(D) K7 hybridized to HindIII-digested DNA. The probe used was the K7 IPCR product (2.3 kb) isolated from C. arenosa.

(E) and (F) L6 hybridized to DNA digested with HpaII (E) and HindIII (F); the DNA used was prepared from different individuals. Asterisks show partial HpaII digestion products protected by methylation. The probe was the L6 1-kb IPCR product obtained from A. thaliana DNA.

(G) K9 hybridized to EcoRI-digested DNA. The lack of hybridization to the C. arenosa DNA is explained by the use of a probe (K9/RAP2.1 IPCR; see Methods) representing the flanking sequences and covering only a few codons of the RAP2.1 transcribed region.

Gel lanes were loaded with equal amounts of DNA. Therefore, the lanes with hybrid DNA contain approximately half the amount of each parental DNA. In (C) to (G), the sizes of the molecular size standards, in kilobases, are shown at the right of each gel. At, A. thaliana; Ca, C. arenosa; Hy, hybrid or allotetraploid.

In conclusion, the discovery of rapid gene silencing in synthetic allotetraploids provides a new avenue of investigation into the molecular events that shape the outcome of allopolyploidy. Uncovering the causes of silencing will further our understanding of how related genomes, having diverged since their species arose from a common ancestor, interact when they are reunited. This task will be facilitated by the powerful tools available in Arabidopsis for genetic and genomic analyses.

Embryo Development

To monitor the fate of each cross, we followed seed development by scoring specific stereotypical, well-characterized events (Bowman, 1994). At several different times after fertilization, immature siliques were dissected, and the developing seeds were placed in a drop of water on a slide. An average of 150 seeds of each available genotype was examined microscopically, either by dissecting the seeds individually with a teasing needle or by expressing their contents with a cover slip. The measurements performed during development were used to estimate the embryonic fate in mature siliques because collapsed and dried seeds in mature siliques could no longer be dissected. Inviable (lethal) embryos (brown, red, collapsed, or very small seeds) were scored by dissecting 10 green siliques approaching maturity.

DNA Analysis

DNA was prepared using the LPG prep. The extraction buffer consisted of 0.1 M Tris, pH 8.0, 50 mM EDTA, 0.5 M NaCl, and 0.7% (w/v) SDS; to this, proteinase K was added to 50 μg/mL (final concentration) just before extraction. Fifty to 100 mg of tissue was ground for 10 sec in a plastic 1.5-mL tube with a disposable plastic pestle (Kontes, Vineland, NJ), 150 μL of extraction buffer was added, and the sample was ground for another 20 sec. Finally, 700 μL of extraction buffer was added, and if tissue fragments were still visible, the sample was further dispersed with the pestle. The lysate was incubated at 55°C for 1 to 5 hr, mixed with 520 μL of a saturated NaCl solution, and centrifuged at 12,000g for 20 min. The nucleic acids in the supernatant were precipitated by adding 1.7 mL of 85% isopropanol and collected by centrifuging at 10,000g for 10 min. The pellet was washed with 70% ethanol and resuspended in 200 μL of 10 mM Tris, pH 8, containing 1 mM EDTA (TE buffer). The RNA was precipitated by adding 133 μL of 5 M LiCl, incubating at 4°C for 5 hr, and centrifuging at 12,000g for 10 min; the pellet was removed. DNA was precipitated from the supernatant by addition of 2 volumes of ethanol, followed immediately by centrifugation. The DNA so isolated was resuspended in 100 μL of 10 mM Tris, pH 8.0, and stored at 4°C. Random amplified polymorphic DNA (RAPD) analysis and microsatellite DNA analysis were performed according to Williams et al. (1990) and Bell and Ecker (1994), respectively. We monitored the inheritance of 24 RAPD markers and 11 microsatellite markers. The χ2 test for independence was performed using Microsoft Excel 5.0. For DNA gel blots, 1 μg of DNA was digested with 10 units of the appropriate restriction enzyme per microgram of DNA, electrophoresed on an 0.8% agarose gel, blotted on Biodyne-B (Pall, Ann Arbor, MI) nylon membrane, and hybridized at 60°C in 2 × SSC (1 × SSC is 0.15 M NaCl and 0.015 M sodium citrate). Washes were in 0.2 × SSC at 60°C. 
Radiolabeled probe was prepared by the random priming polymerase reaction with the Prime-it kit from Amersham.
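The χ2 test for independence was run in Microsoft Excel 5.0. As a minimal sketch of an equivalent calculation, the Python code below computes the Pearson χ2 statistic for a contingency table of marker scores. The function names and the counts are hypothetical illustrations, not data from this study, and the p-value helper covers only the one-degree-of-freedom (2 × 2) case.

```python
import math

def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column classifications are independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def p_value_one_dof(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical counts: rows = marker 1 present/absent, columns = marker 2 present/absent.
example_counts = [[10, 20], [20, 10]]
example_stat, example_dof = chi_square_independence(example_counts)
example_p = p_value_one_dof(example_stat)
print(f"chi2 = {example_stat:.3f}, dof = {example_dof}, p = {example_p:.4f}")
```

A small p-value in such a table would argue against independent inheritance of the two markers; tables with more than one degree of freedom would need a general χ2 distribution function, which is omitted here.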

Analysis of DNA similarity was performed with the WU-BLAST2 program (on the AATDB server, which has since been discontinued, or on the server at EMBL, http://dove.embl-heidelberg.de/cgi/blast2). In certain cases, the TAIR NCBI-BLAST 2 server was used with the query parameters modified as follows: nucleic mismatch, –4; nucleic match, 5; gap opening, 10; and gap extension, 10. The use of WU-BLAST or of the modified NCBI-BLAST was required to recognize more distant relationships between DNA sequences. For the K7 and K9 loci, similarity searches were performed using the DNA sequences deposited in GenBank. For the L6 locus, a sequence from nucleotides 37,000 to 40,000 of bacterial artificial chromosome F22K20 was used. Protein similarity searches were performed using the gapped NCBI BLAST 2.0 server at NCBI. Gene predictions were performed on the NetGene2 server (http://www.cbs.dtu.dk/services/NetGene2/) (Hebsgaard et al., 1996). Pairwise BLAST comparisons were performed on the DIYB BLAST server (http://www.proweb.org/Tools/new-blast.html).
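To make the modified scoring parameters concrete, the sketch below scores an already-aligned pair of nucleotide strings with match 5, mismatch –4, gap opening 10, and gap extension 10. It is an illustration only: the function name and example sequences are hypothetical, and it assumes that a gap of length k costs the opening penalty plus k times the extension penalty, which may differ from the exact convention of the servers used here.

```python
def score_alignment(top, bottom, match=5, mismatch=-4, gap_open=10, gap_extend=10):
    """Score two gapped, equal-length aligned strings ('-' marks a gap).

    Assumed gap convention: a run of k gap characters in one row costs
    gap_open + k * gap_extend.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned strings must have the same length")
    score = 0
    open_gap_row = None  # which row currently carries a gap run, if any
    for a, b in zip(top, bottom):
        if a == "-" and b == "-":
            raise ValueError("a column cannot contain two gap characters")
        if a == "-" or b == "-":
            row = "top" if a == "-" else "bottom"
            if open_gap_row != row:
                score -= gap_open  # charged once per gap run
                open_gap_row = row
            score -= gap_extend    # charged for every gapped column
        else:
            open_gap_row = None
            score += match if a == b else mismatch
    return score

# Hypothetical aligned fragments: 7 matches, 1 mismatch, one single-base gap.
example_score = score_alignment("ACGT-ACGT", "ACGTTACGA")
print(example_score)
```

With these values a single mismatch cancels less than one match, so alignments can tolerate more substitutions before their score drops to zero, which is in line with the stated aim of detecting more distant relationships.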

To detect the occurrence of possible mutations, we amplified the putative promoter of the RAP2.1 gene as an 800-bp fragment by the polymerase chain reaction (PCR) with a proofreading thermostable polymerase mix (Klentaq-LA; Clontech, Palo Alto, CA) from the allotetraploid in which the RAP2.1 gene was silenced and from the Arabidopsis parent. Two PCR products per genotype, each derived from an independent PCR reaction, were sequenced in both orientations by using the amplification primers.

Cytology

Flower buds were fixed for 24 hr in 1:3 (v/v) glacial acetic acid/100% ethanol and stored in 70% ethanol at 4°C. Anthers were dissected, placed in a drop of acetocarmine (2% carmine in 50% acetic acid) or 4′,6-diamidino-2-phenylindole (0.3 μg/mL) on a glass slide, and squashed with an iron needle. A glass cover slip was applied, the slide was heated gently over a flame, and pressure was applied to burst and spread the pollen mother cells. A Nikon Microphot was used to view and photograph the chromosomes. Images were digitized and adjusted for contrast in Photoshop (Adobe, Menlo Park, CA). Meiotic abnormalities were scored in blind tests by observing the following number of meioses per genotype (metaphase I and anaphase I, respectively): Sue-1, 6, 33; Sue-2, 51, 9; 612, 10, 12; Care-1, 45, 8; 49-2B, 34, 12; 605A, 39, 18; and 605B, 30, 19. In nonblinded tests, we further scored 30 metaphase I and 30 anaphase I of A. thaliana 612. Pollen viability was estimated after staining with 1% acetocarmine and counting at least 100 cells per genotype.
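The per-genotype counts of meioses scored in the blind tests can be tabulated as follows. This is only a bookkeeping sketch in Python; the variable names are illustrative, and the numbers are those listed above as (metaphase I, anaphase I).

```python
# Meioses scored in blind tests: genotype -> (metaphase I, anaphase I).
meioses_scored = {
    "Sue-1": (6, 33),
    "Sue-2": (51, 9),
    "612": (10, 12),
    "Care-1": (45, 8),
    "49-2B": (34, 12),
    "605A": (39, 18),
    "605B": (30, 19),
}

# Total number of meioses observed for each genotype.
totals_by_genotype = {g: m1 + a1 for g, (m1, a1) in meioses_scored.items()}

# Totals per meiotic stage across all genotypes.
metaphase_total = sum(m1 for m1, _ in meioses_scored.values())
anaphase_total = sum(a1 for _, a1 in meioses_scored.values())

for genotype, total in totals_by_genotype.items():
    print(f"{genotype}: {total} meioses scored")
print(f"metaphase I: {metaphase_total}, anaphase I: {anaphase_total}")
```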

Gene Expression Analysis

Amplified fragment length polymorphism (AFLP)–cDNA was performed with mRNA purified from comparable organs from the parental and F2 allotetraploid plants of comparable developmental age (from rosette leaves, before production of a visible inflorescence, or from flower buds). Because the F1 plants grew at different rates, they were not included in the expression analyses. RNA was purified from 100 mg of tissue (mature rosette leaves and inflorescence top, including buds and flowers) by using Trizol reagent according to the manufacturer’s instructions (Gibco Life Technologies) and subjected to oligo(dT) affinity purification by using the magnetic separation kit of Promega. The mRNA was subjected to AFLP-cDNA analysis (Bachem et al., 1996). In a first round of experiments, leaf and flower AFLP-cDNAs were made from diploid Ler, tetraploid Ler (612), two F2 allotetraploids, C. arenosa, and, as control, from a 1:1 mix of tetraploid Ler and C. arenosa cDNAs. In a second round of experiments, leaf and flower AFLP-cDNAs were prepared from duplicate mRNA preparations from the genotypes used above but using four F3 allotetraploids. In this second analysis, as a control, we also examined the AFLP-cDNA profiles of two different tetraploid A. thaliana individuals. The expression patterns were identical, suggesting that gene silencing is rare within inbred A. thaliana. Differential products were eluted from the gel by boiling in water for 2 min, reamplified, and cloned using the Invitrogen TA-cloning kit. Several clones were sequenced for each product by using the dye-terminator kit of Applied Biosystems (Palo Alto, CA). The three silenced genes reported here were isolated from the first round of AFLP-cDNA experiments. Often, the AFLP gel band yielded different cloned cDNAs. 
To determine which sequence represented the silenced cDNA, sequence-specific primers were designed from the Primer3 Web site (http://www-genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi) and used in reverse transcription (RT)–PCR analysis. For RT-PCR, new mRNA preparations were made both from the plants originally used for the AFLP-cDNA analysis and from other allotetraploid individuals. RT was performed on 10 to 25% of each mRNA preparation and used random hexamers as primers. Of the RT reaction product, 5% was used for RT-PCR. The RT-PCR primers were L6-L, 5′-TCTCCAGCAAATGATGAACAA-3′; L6-R, 5′-CCGGAGAGTCTCAATTTGGT-3′; K7-A, 5′-TGGAGAGGCTTATGGACG-3′; K7-B, 5′-GCTCTTCTTAATGTGTTG-3′; K9-L, 5′-TTGTTTTTGTTTTTGATAAGAACTCTG-3′; and K9-R, 5′-CACTCGCTAGCTTCTCATGG-3′. The K7 primers were specific for C. arenosa DNA, the K9 primers were specific for A. thaliana DNA, and the L6 primers preferentially amplified the C. arenosa DNA.
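As a quick sanity check on the RT-PCR primers listed above, the following Python sketch reports the length, GC content, and a rough melting temperature for each primer. The melting temperature uses the Wallace rule of thumb, 2 °C per A or T plus 4 °C per G or C, which is an assumption introduced here for illustration; it is a coarse estimate for short oligonucleotides and is not the model used by Primer3, with which the primers were designed.

```python
# RT-PCR primer sequences as given in the text (5' to 3').
RT_PCR_PRIMERS = {
    "L6-L": "TCTCCAGCAAATGATGAACAA",
    "L6-R": "CCGGAGAGTCTCAATTTGGT",
    "K7-A": "TGGAGAGGCTTATGGACG",
    "K7-B": "GCTCTTCTTAATGTGTTG",
    "K9-L": "TTGTTTTTGTTTTTGATAAGAACTCTG",
    "K9-R": "CACTCGCTAGCTTCTCATGG",
}

def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq):
    """Rule-of-thumb melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

for name, seq in RT_PCR_PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.0%}, "
          f"Tm (Wallace) ~{wallace_tm(seq)} C")
```

Large differences in estimated melting temperature within a primer pair would be one reason to re-examine annealing conditions, although the actual reaction conditions used here are those reported in the text.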

IPCR Isolation of DNA Flanking the AFLP-cDNA

To isolate the DNA flanking the AFLP-cDNA, we used the AFLP-cDNA sequence of K7 and L6 and the available cDNA sequence of RAP2.1 (K9). We designed outward-facing primer pairs for inverse PCR (IPCR) (Ochman et al., 1988), using the Web-based Primer3 program. The primer sequences were as follows: for K7, K7-AL (5′-TCTCCTTTGCCTATTTAAAGGCTGT-3′) and K7-AR (5′-TCCAACGTCCATAAGCCTCTCC-3′); for K9, K9-AL (5′-CCGGTAATAAAACCCGACTTGAATC-3′) and K9-AR (5′-TGGTTAGGCACTTCTTCTTGAGGTG-3′); and for L6, L6-AL (5′-GGAGGACCGCGGCAATAAG-3′) and L6-AR (5′-TTCATCATTTGCTGGAGAATGAAA-3′). Genomic DNAs from A. thaliana and C. arenosa were digested in separate reactions with six restriction enzymes (AseI, SphI, HindIII, EcoRI, SpeI, and XbaI). Each digested DNA was ligated (2 ng/10 μL reaction volume), and 200 pg was used in long-distance PCR (20-μL reaction volumes) with an annealing step at 65°C (Henikoff and Comai, 1998a). Reactions that produced PCR products of desirable size were used for cloning with the Invitrogen Topo-TA kit. In the case of K7, control reaction with native genomic DNA showed that ligation of a circular fragment is not a requirement for amplification (see Results). The correct flanking fragments were identified by matching the known sequences adjacent to the IPCR primers. The sequences of K7, K9, L6, and their respective IPCR products have been deposited in GenBank (see Figure 7 for the accession numbers). The putative promoter of RAP2.1 was amplified from the DNA of A. thaliana 612 and from the silenced hybrid 623-K1 (Figure 6, individual 11) by using primers K9-AR (see above) and primer K9-PL (5′-TGGGTATTCAGCCCATTTTAAACC-3′). Two independently amplified PCR products per genotype were sequenced in both directions with the amplification primers.
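The outward-facing orientation of the IPCR primers can be illustrated from the sequences themselves. The Python sketch below takes the reverse complement of an IPCR primer and looks for the longest run it shares with the corresponding RT-PCR primer given in the Gene Expression Analysis section. The helper names are hypothetical, and the interpretation is only a consistency check on the published sequences: a long shared run means that the IPCR primer anneals to the opposite strand at the same site as the RT-PCR primer and therefore points away from the RT-PCR amplicon.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def longest_shared_run(a, b):
    """Longest common substring of a and b (simple dynamic programming)."""
    best = ""
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > len(best):
                    best = a[i - cur[j]:i]
        prev = cur
    return best

# Primer sequences as given in the text (5' to 3').
K7_A = "TGGAGAGGCTTATGGACG"            # RT-PCR primer
K7_AR = "TCCAACGTCCATAAGCCTCTCC"       # IPCR primer
L6_L = "TCTCCAGCAAATGATGAACAA"         # RT-PCR primer
L6_AR = "TTCATCATTTGCTGGAGAATGAAA"     # IPCR primer

k7_shared = longest_shared_run(K7_A, reverse_complement(K7_AR))
l6_shared = longest_shared_run(L6_L, reverse_complement(L6_AR))
print(f"K7-A vs revcomp(K7-AR): {len(k7_shared)} nt shared ({k7_shared})")
print(f"L6-L vs revcomp(L6-AR): {len(l6_shared)} nt shared ({l6_shared})")
```

The same comparison is not made for K9 because its IPCR primers were designed from the RAP2.1 cDNA sequence rather than from the AFLP-cDNA fragment.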

Acknowledgments

We acknowledge the assistance of Jiang Aimin with the molecular typing of the allotetraploids, Chi-Min Fu with the embryo development studies, and Jorja Henikoff with BLAST search strategy. This work was supported in part by U.S. Department of Agriculture–National Research Initiative Competitive Grants Program Grant No. 97-35301-4429 to L.C. and by a National Institutes of Health grant to B.B.

Footnotes

1 Current address: Department of Biology, University of the South Pacific, P.O. Box 1168, Suva, Fiji.