We have isolated and sequenced all 23 members of the 22-kD alpha zein (z1C) gene family of maize. This is one of the largest plant gene families that has been sequenced from a single genetic background and includes the largest contiguous genomic DNA from maize with 346,292 bp to date. Twenty-two of the z1C members are found in a roughly tandem array on chromosome 4S forming a dense gene cluster 168,489-bp long. The twenty-third copy of the gene family is also located on chromosome 4S at a site approximately 20 cM closer to the centromere and appears to be the wild-type allele of the floury-2 (fl2) mutation. On the basis of an analysis of maize cDNA databases, only seven of these genes appear to be expressed including the fl2 allele. The expressed genes in the cluster are interspersed with nonexpressed genes. Interestingly, some of the expressed genes differ in their transcriptional regulation. Gene amplification appears to be in blocks of genes explaining the rapid and compact expansion of the cluster during the evolution of maize.

The 78,101-base-pair sequence of a cluster of 22-kDa alpha zein genes in the maize inbred BSSS53 was determined. Each zein gene is contained within a repeat unit that varies in length. If such a repeat, or amplicon, is aligned along the entire sequence, a 10.5-fold sequence amplification is delineated. Because of insertions and deletions in intergenic regions, many of the zein genes are spaced over different distances. Only three out of 10 zein-related sequences have an intact open reading frame, indicating an unusually large number of genes unable to contribute to the accumulation of normal-size 22-kDa zein proteins. It is proposed that the seven remaining zein-related sequences be considered gene reserves because of their potential to be restored by gene conversion. Intergenic insertions in the cluster range from 1098 to 14,896 base pairs. Although they are composed of transposable element sequences, they also contain additional open reading frames, two of them showing homology to rice cDNA sequences. The average amplicon is 4423 base pairs long, with the sequence surrounding each zein gene more than 90% conserved. Coincidentally, the size of the amplicon is equivalent to the average gene density (one gene within 4640 bp) in the Arabidopsis thaliana genome, one of the smallest in plants. Successive steps of amplification and insertion of DNA might explain to a certain degree how genome size variation has been generated in plants.
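
The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope illustration. It assumes that the 10.5-fold amplification refers to copies of the average 4423-bp amplicon, and it treats whatever remains of the 78,101 bp as non-amplicon DNA (for example, the intergenic insertions). Neither assumption is stated in the abstract, and all variable names are illustrative.

```python
# Back-of-the-envelope check of the cluster figures quoted in the abstract.
CLUSTER_BP = 78_101    # sequenced cluster length (from the abstract)
AMPLICON_BP = 4_423    # average amplicon length (from the abstract)
FOLD = 10.5            # sequence amplification (from the abstract)
ZEIN_SEQUENCES = 10    # zein-related sequences in the cluster
INTACT_ORFS = 3        # sequences with an intact open reading frame

# Assumption: amplified DNA = fold amplification x average amplicon length.
amplified_bp = FOLD * AMPLICON_BP
# Assumption: the remainder is non-amplicon DNA such as intergenic insertions.
remainder_bp = CLUSTER_BP - amplified_bp

# Sequences without an intact ORF, called "gene reserves" in the abstract.
gene_reserves = ZEIN_SEQUENCES - INTACT_ORFS

print(f"amplified DNA: {amplified_bp:,.1f} bp")
print(f"remainder: {remainder_bp:,.1f} bp "
      f"({remainder_bp / CLUSTER_BP:.0%} of the cluster)")
print(f"gene reserves: {gene_reserves}")
```

Under these assumptions roughly two fifths of the cluster would lie outside the amplicon copies, which is at least compatible with the reported insertion sizes of 1098 to 14,896 base pairs.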

Progress in agricultural and environmental technologies is hampered by a slower rate of gene discovery in plants than in animals. The vast pool of genes in plants, however, will be an important resource for insertion of genes, via biotechnological procedures, into an array of plants, generating unique germ plasms not achievable by conventional breeding. It has recently become clear that genomes of grasses have evolved in a manner analogous to Lego blocks. Large chromosome segments have been reshuffled and stuffer pieces added between genes. Although some genomes have become very large, the genome with the fewest stuffer pieces, the rice genome, is the Rosetta Stone of all the bigger grass genomes. This means that sequencing the rice genome as anchor genome of the grasses will provide instantaneous access to the same genes in the same relative physical position in other grasses (e.g., corn and wheat), without the need to sequence each of these genomes independently. (i) The sequencing of the entire genome of rice as anchor genome for the grasses will accelerate plant gene discovery in many important crops (e.g., corn, wheat, and rice) by several orders of magnitude and reduce research and development costs for government and industry at a faster pace. (ii) Costs for sequencing entire genomes have come down significantly. The rice genome is only 12% of the size of the human or the corn genome, and technology improvements from the human genome project are completely transferable, translating into another 50% reduction of the costs. (iii) The physical mapping of the rice genome by a group of Japanese researchers provides a jump start for sequencing the genome and forming an international consortium. Otherwise, other countries would do it alone and own proprietary positions.

Geminiviruses (Geminiviridae) are a diverse group of plant viruses differing from other known plant viruses in possessing circular, single-stranded DNA. Current classification divides the family into three subgroups, defined in part by genome organization, insect vector, and plant host range. Previous phylogenetic assessments of geminiviruses have used DNA and/or amino acid sequences from the replication-associated and coat protein genes and have relied predominantly on distance analyses. We used amino acid and DNA sequence data from the replication-associated and coat protein genes from 22 geminivirus types in distance and parsimony analyses. Although the results of our analyses largely agree with those reported previously, we could not always predict viral relationships based on genome organization, plant host, or insect vector. Loss of correlation of these traits with phylogeny is likely due to improved sampling of geminivirus types. Unrooted parsimony trees suggest multiple independent origins for the monopartite genome. Genome organization is therefore a dynamic character. Estimates of nonsynonymous and synonymous nucleotide substitutions for extant and inferred ancestral sequences were used to evaluate hypotheses that the replication-associated and coat protein sequences evolve to accommodate plant host and insect vector specificities, respectively. Results suggest that plant host specificity does not solely direct replication-associated protein evolution but that coat protein sequence does evolve in response to insect vector specificity. Genome organization and, possibly, plant host specificity are not reliable taxonomic characters.
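
To illustrate what is meant by synonymous and nonsynonymous substitutions, the sketch below classifies codon differences between two aligned, in-frame coding sequences using the standard genetic code. It is a deliberately crude count and not the estimator used in the study, which the abstract does not name. Published estimators also normalize by the number of synonymous and nonsynonymous sites and correct for multiple substitutions. The function name and the two toy sequences are made up for illustration.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Both sequences must be ungapped, aligned, in frame, and of equal length.
    """
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, equal length")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1      # same amino acid, different codon
        else:
            nonsynonymous += 1   # amino acid replaced
    return synonymous, nonsynonymous


# Toy example (made-up sequences): Met-Ala-Lys versus Met-Ala-Arg.
print(count_codon_changes("ATGGCTAAA", "ATGGCCAGA"))
```

In the toy pair, GCT to GCC leaves alanine unchanged (synonymous), whereas AAA to AGA replaces lysine with arginine (nonsynonymous).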

We have investigated the methylation status of the alpha-tubulin genes, and the degree of accumulation of their mRNAs in endosperm, embryo and seedling tissues of Zea mays L. We have found that many of the alpha-tubulin genes are differentially demethylated in the endosperm relative to the embryo and seedling. However, only for tub alpha 2 and tub alpha 4 could a correlation between DNA demethylation and increased RNA accumulation be detected. By analyzing the inbred lines W64A and A69Y and their reciprocal crosses, we have also identified in the endosperm two alpha-tubulin genes, tub alpha 3 and tub alpha 4, that are differentially demethylated if transmitted by the maternal germline, but that remain hypermethylated when transmitted by the paternal germline.

Parental imprinting describes the phenomenon of unequivalent gene function based on transmission from the female or male parent. We have discovered parental imprinting of an allele of the dzr1 locus that posttranscriptionally regulates the accumulation of 10-kDa zein in the maize endosperm. The imprinted allele of MO17 inbred origin, dzr1 + MO17, conditions low accumulation of the 10-kDa zein and is dominant when transmitted through the female but recessive when transmitted through the male. Analyzing endosperms with equal parental contributions of dzr1 + MO17 ruled out the possibility that the unequivalent phenotype of dzr1 + MO17 was due to parental dosage imbalance in the triploid endosperm. Second-generation studies show that the dominant or recessive phenotype of dzr1 + MO17 is determined at every generation based on immediate parental origin with no grandparental effect.

Two instances of genetic transmission of spontaneous epimutation of the maize P-rr gene were identified. Transmission gave rise to two similar, moderately stable alleles, designated P-pr-1 and P-pr-2, that exhibited Mendelian behavior. Both isolates of P-pr conditioned a variable and variegated phenotype, unlike the uniform pigmentation conditioned by P-rr. Extensive genomic analysis failed to reveal insertions, deletions or restriction site polymorphisms between the new allele and its progenitor. However, methylation of the P gene was increased in P-pr relative to P-rr, and was greatly reduced (though not lost) in a revertant to uniform pigmentation. Variability in pigmentation conditioned by P-pr correlated with variability in transcript levels of the P gene, and both correlated inversely with variability in its methylation. Part of the variability in methylation could be accounted for by a developmental decrease in methylation in all tissues of plants carrying P-pr. We hypothesize that the variegated phenotype results from a general epigenetic pathway which causes a progressive decrease in methylation and increase in expression potential of the P gene as a function of cell divisions in each meristem of the plant. This renders all tissues chimeric for a functional gene; chimerism is visualized as variegation only in pericarp due to the tissue specificity of P gene expression. Therefore, this allele that originates from epimutation may exemplify an epigenetic mechanism for variegation in maize.

By utilizing a homologous transient expression system, we have demonstrated that the Opaque-2 (O2) gene product O2 confers positive trans-regulation on a 22-kD zein promoter. This trans-acting function of the O2 protein is mediated by its sequence-specific binding to a cis element (the O2 target site) present in the 22-kD zein promoter. A multimer of a 32-bp promoter fragment containing this O2 target site confers transactivation by O2. A single nucleotide substitution in the O2 target sequence not only abolishes O2 binding in vitro, but also its response to transactivation by O2 in vivo. We have also demonstrated that an amino acid domain including the contiguous basic region and the heptameric leucine repeat is essential for the trans-acting function of the O2 protein. Similar but not identical O2 target sequence motifs can be found in the promoters of zein genes of different molecular weight classes. Conversion of such a motif in the 27-kD zein promoter to an exact O2 target sequence by site-directed mutagenesis was sufficient to increase the binding affinity of the O2 protein in vitro and to confer transactivation by O2 in vivo.

We have determined the nucleotide sequences of zein cDNA clones ZG14, ZG15, and ZG35. The three clones have 95 to 98% homology to the previously published sequence of clone A20, and 84% homology to sequences of the zein subfamily A30. Comparison of all sequences of the A30 and A20 subfamilies highlights the following features: the 5' nontranslated regions are 68 and 57 nucleotides in length for the A20- and A30-like mRNAs, respectively, and contain at least three repeats of the consensus sequence ACGAACAAta/gG; the majority of these genes are highly clustered as judged from pulsed-field gel electrophoresis of high molecular weight maize DNA. Furthermore, we discuss a model for the evolution of the multigene family which stresses the special importance of unequal crossing-over and gene conversion in this system.

Message levels for a methionine-rich 10 kDa zein were determined in three inbred lines of maize and their reciprocal crosses at various stages during endosperm development. Inbred line BSSS-53, which overexpresses the 10 kDa protein in mature kernels, was shown to have higher mRNA levels in developing endosperm, as compared to inbred lines W23 and W64A. Differences in mRNA levels could not be explained by differences in transcription rate of the 10 kDa zein gene, indicating differential post-transcriptional regulation of this storage protein in the different inbred lines analyzed. Among progeny segregating for the BSSS-53 allele of the 10 kDa zein structural gene Zps10/(22), mRNA levels are independent of Zps10/(22) segregation, indicating that post-transcriptional regulation of mRNA levels takes place via a trans-acting mechanism. In the same progeny, mRNA levels are also independent of allelic segregation of the regulatory locus Zpr10/(22). Thus, the trans-acting factor encoded by Zpr10/(22) determines accumulation of 10 kDa zein at a translational or post-translational step. Multiple trans-acting factors are therefore involved in post-transcriptional regulation of the methionine-rich 10 kDa zein.

A methionine-rich 10 kDa zein storage protein from maize was isolated and the sequence of the N-terminal 30 amino acids was determined. Based on the amino acid sequence, two mixed oligonucleotides were synthesized and used to probe a maize endosperm cDNA library. A full-length cDNA clone encoding the 10 kDa zein was isolated by this procedure. The nucleotide sequence of the cDNA clone predicts a polypeptide of 129 amino acids, preceded by a signal peptide of 21 amino acids. The predicted polypeptide is unique in its extremely high content of methionine (22.5%). The maize inbred line BSSS-53, which has increased seed methionine due to overproduction of this protein, was compared to W23, a standard inbred line. Northern blot analysis showed that the relative RNA levels for the 10 kDa zein were enhanced in developing seeds of BSSS-53, providing a molecular basis for the overproduction of the protein. Southern blot analysis indicated that there are one or two 10 kDa zein genes in the maize genome.
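
The methionine figure can be turned into a residue count with simple arithmetic. The sketch below assumes that the 22.5% refers to the 129-residue predicted polypeptide without the signal peptide, which is how the abstract reads but is not stated explicitly. The helper `residue_fraction` and the toy peptide are hypothetical and are not the zein sequence.

```python
def residue_fraction(protein, residue="M"):
    """Fraction of positions in a protein sequence occupied by one residue."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    return protein.count(residue.upper()) / len(protein)


# Figures from the abstract.
POLYPEPTIDE_LENGTH = 129   # predicted polypeptide, amino acids
MET_FRACTION = 0.225       # reported methionine content

# Assumption: the percentage applies to the 129-residue polypeptide.
implied_met_residues = round(POLYPEPTIDE_LENGTH * MET_FRACTION)
print(f"implied methionine residues: {implied_met_residues}")

# Toy peptide (made up, not the zein sequence) to show the helper in use.
toy_peptide = "MAPMQLMSPM"
print(f"toy peptide Met fraction: {residue_fraction(toy_peptide):.1%}")
```

Applied to the real predicted sequence, `residue_fraction` would be expected to return a value close to the reported 22.5%.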

We have isolated the gene encoding a methionine-rich 10-kDa zein protein from a lambda EMBL3 maize genomic 'mini' library of the inbred line BSSS-53 and determined its nucleotide sequence. The sequence matches perfectly with a cDNA clone from the inbred line W22 (which has the same restriction fragment length polymorphism as many inbred lines tested) indicating that we have isolated a functional storage protein gene that is very conserved in maize. This comparison also excludes any splicing of any precursor mRNA and therefore any presence of introns. A number of potential regulatory sequences have been located in the flanking regions. The 10-kDa-zein gene represents the last size class in the zein multigene family to be characterized. Its structure allows us now to re-examine the relationship of all the zein proteins and also to compare the structure of a new class of storage proteins that are rich in methionine, an essential amino acid in livestock fodder.

The restriction endonuclease cleavage sites for SphI and KpnI have been added to the lac cloning region of the phage vectors M13mp10 and M13mp11, using oligodeoxynucleotide-directed in vitro mutagenesis. Complementary deoxy 16-, 21- or 18-mers with the desired base changes were annealed to the M13mp DNA strand and extended with the Klenow fragment of DNA polymerase I. In adding these sites we have shown that this technique can be used as a general method for inserting sequences of DNA as well as introducing deletions and base pair changes.

A set of programs is presented for the reconstruction of a DNA sequence from data generated by the M13 shotgun sequencing technique. Once the sequence has been established and stored, other programs are used for its analysis. The programs have been written for the Apple II microcomputer. A minimum investment is required for the hardware, and the software is easily interchangeable among the growing number of interested researchers. Copies are available in ready-to-use form.
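
The idea of reconstructing a sequence from shotgun reads can be illustrated with a toy greedy overlap merger. This is not the algorithm of the Apple II programs, which the abstract does not describe. It is a minimal sketch that assumes error-free reads from a single strand, and all function names and the example reads are made up.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the best overlap is shorter than min_len.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads that are fully contained in another read.
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left; remaining reads are separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Toy reads (made up) sampled from a single 15-base sequence.
contigs = greedy_assemble(["ATGGCGTA", "GCGTACGTT", "ACGTTAGC"])
print(contigs)
```

Real shotgun data also require handling of both strands, sequencing errors, and repeated sequences, none of which this sketch attempts.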

The nucleotide sequence of a genomic clone (termed Z4) of the zein multigene family was compared to the nucleotide sequence of related cDNA clones of zein mRNAs. A tandem duplication of a 96-bp sequence is found in the genomic clone that is not present in the related cDNA clones. When the duplication is disregarded, the nucleotide sequence homology between Z4 and its related cDNAs was approximately 97%. The nucleotide sequence is also compared to other isolated cDNAs. No introns in the coding region of the zein gene are detected. The first nucleotide of a putative TATA box, TATAAATA, was located 88 nucleotides upstream of the first nucleotide of the first ATG codon which initiated the open reading frame. The first nucleotide of a putative CCAAT box, CAAAAT, appeared 45 nucleotides upstream of the first nucleotide of the [...] zein cDNA clones in the 3' non-coding region also appeared in the genomic sequence at the same locations. The amino acid composition of the polypeptide specified by the Z4 nucleotide sequence is similar to the known composition of zein proteins.

A series of plasmid vectors containing the multiple cloning site (MCS7) of M13mp7 has been constructed. In one of these vectors a kanamycin-resistance marker has been inserted into the center of the symmetrical MCS7 to yield a restriction-site-mobilizing element (RSM). The drug-resistance marker can be cleaved out of this vector with any of the restriction enzymes that recognize a site of the flanking sequences of the RSM to generate an RSM with either various sticky ends or blunt ends. These fragments can be used for insertion mutagenesis of any target molecule with compatible restriction sites. Insertion mutants are selected by their resistance to kanamycin. When the drug-resistance marker is removed with PstI, a small in-frame insertion can be generated. In addition, two new MCSs having single restriction sites have been formed by altering the symmetrical structure of MCS7. The resulting plasmids pUC8 and pUC9 allow one to clone doubly digested restriction fragments separately in both orientations with respect to the lac promoter. The terminal sequences of any DNA cloned in these plasmids can be characterized using the universal M13 primers.

The nucleotide sequences of two zein cDNAs in hybrid plasmids A20 and B49 have been determined. The insert in A20 is 921 bp long including a 5' non-coding region of 60 nucleotides, preceded by what is believed to be an artifactual sequence of 41 nucleotides, and a 3' non-coding region of 87 nucleotides. The B49 insert is 467 bp long and includes approximately one-half the protein coding sequence as well as a 3' non-coding region of 97 nucleotides. These sequences have been compared with the previously published sequence of another zein clone, A30. A20 and A30, both encoding 19 000 mol. wt. zeins, have approximately 85% homology at the nucleotide level. The B49 sequence, corresponding to a 22 000 mol. wt. zein, has approximately 65% homology to either A20 or A30. All three zeins share common features including nearly identical amino acid compositions. In addition, the tandem repeats of 20 amino acids first seen in A30 are also present in A20 and B49.
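
Percentages such as the 85% and 65% quoted here are typically obtained by counting matching positions in an alignment. The sketch below shows one common convention, in which positions with a gap in either sequence are skipped. The abstract does not say how gaps were treated, so this is an assumption, and the function name and toy alignment are made up.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical positions between two pre-aligned sequences.

    Assumption: columns with a gap in either sequence are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (made up): 9 ungapped columns, 8 of them identical.
identity = percent_identity("ATGGCTAAAC", "ATGGCCAA-C")
print(f"{identity:.1f}% identity")
```

Other conventions count gapped columns as mismatches or divide by the length of the shorter sequence, which would give somewhat lower values for the same alignment.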

We have determined the complete primary structure (8031 base pairs) of an infectious clone of cauliflower mosaic virus strain CM1841. The sequence was obtained using the strategy of cloning shotgun restriction fragments in the sequencing vector M13mp7. Comparison of the CM1841 sequence with that published for another CaMV strain (Strasbourg) reveals 4.4% changes, mostly nucleotide substitutions with a few small insertions and deletions. The six open reading frames in the sequence of the Strasbourg isolate are also present in CM1841.

We have used the newly engineered transposable element Dsg to tag a gene that gives rise to a defective kernel (dek) phenotype. Dsg requires the autonomous element Ac for transposition. Upon excision, it leaves a short DNA footprint that can create in-frame and frameshift insertions in coding sequences. Therefore, we could create alleles of the tagged gene that confirmed causation of the dek phenotype by the Dsg insertion. The mutation, designated dek38-Dsg, is embryonic lethal, has a defective basal endosperm transfer layer (BETL), and results in a smaller seed with highly underdeveloped endosperm. The maize dek38 gene encodes a TTI2 (Tel2-interacting protein 2) molecular cochaperone. In yeast and mammals, TTI2 associates with two other cochaperones, TEL2 (Telomere maintenance 2) and TTI1 (Tel2-interacting protein 1), to form the triple T complex that regulates DNA damage response. Therefore, we cloned the maize Tel2 and Tti1 homologs and showed that TEL2 can interact with both TTI1 and TTI2 in yeast two-hybrid assays. The three proteins regulate the cellular levels of phosphatidylinositol 3-kinase-related kinases (PIKKs) and localize to the cytoplasm and the nucleus, consistent with known subcellular locations of PIKKs. dek38-Dsg displays reduced pollen transmission, indicating TTI2's importance in male reproductive cell development.

Maize kernels do not contain enough of the essential sulphur-amino acid methionine (Met) to serve as a complete diet for animals, even though maize has the genetic capacity to store Met in kernels. Prior studies indicated that the availability of the sulphur (S)-amino acids may limit their incorporation into seed storage proteins. Serine acetyltransferase (SAT) is a key control point for S-assimilation leading to Cys and Met biosynthesis, and SAT overexpression is known to enhance S-assimilation without negative impact on plant growth. Therefore, we overexpressed Arabidopsis thaliana AtSAT1 in maize under control of the leaf bundle sheath cell-specific rbcS1 promoter to determine the impact on seed storage protein expression. The transgenic events exhibited up to 12-fold higher SAT activity without negative impact on growth. S-assimilation was increased in the leaves of SAT overexpressing plants, followed by higher levels of storage protein mRNA and storage proteins, particularly the 10-kDa δ-zein, during endosperm development. This zein is known to impact the level of Met stored in kernels. The elite event with the highest expression of AtSAT1 showed 1.40-fold increase in kernel Met. When fed to chickens, transgenic AtSAT1 kernels significantly increased growth rate compared with the parent maize line. The result demonstrates the efficacy of increasing maize nutritional value by SAT overexpression without apparent yield loss. Maternal overexpression of SAT in vegetative tissues was necessary for high-Met zein accumulation. Moreover, SAT overcomes the shortage of S-amino acids that limits the expression and accumulation of high-Met zeins during kernel development.