This article has a correction. Please see:

ABSTRACT

UTF1 is a transcriptional coactivator which has recently been isolated and found to be expressed mainly in pluripotent embryonic stem (ES) cells (A. Okuda, A. Fukushima, M. Nishimoto, et al., EMBO J. 17:2019–2032, 1998). To gain insight into the regulatory network of gene expression in ES cells, we have characterized the regulatory elements governing UTF1 gene expression. The results indicate that the UTF1 gene is one of the target genes of an embryonic octamer binding transcription factor, Oct-3/4. UTF1 expression is, like the FGF-4 gene, regulated by the synergistic action of Oct-3/4 and another embryonic factor, Sox-2, implying that the requirement for Sox-2 by Oct-3/4 is not limited to the FGF-4 enhancer but is rather a general mechanism of activation for Oct-3/4. Our biochemical analyses, however, also reveal one distinct difference between these two regulatory elements: unlike the FGF-4 enhancer, the UTF1 regulatory element can, by its one-base difference from the canonical octamer-binding sequence, selectively recruit the complex comprising Oct-3/4 and Sox-2 and preclude the binding of the transcriptionally inactive complex containing Oct-1 or Oct-6. Furthermore, our analyses reveal that these properties are dictated by the unique ability of the Oct-3/4 POU-homeodomain that recognizes a variant of the Octamer motif in the UTF1 regulatory element.

Understanding cell-type commitment and differentiation at the molecular level is a major subject of developmental biology. The first overt differentiation step occurs at about 3.5 days postcoitum (d.p.c.) in murine development. At this time, a blastocyst is formed which consists of two distinct cell lineages, i.e., the trophectoderm and the inner cell mass (ICM). The former generates a major part of the placenta and the extraembryonic yolk sacs, whereas the latter can differentiate into essentially any type of cell, including embryonic and extraembryonic tissues. Subsequently, an epithelial layer (primitive endoderm) appears on the surface of the ICM at ca. 4.0 d.p.c., and the remaining compacted core of ICM becomes a layer known as primitive ectoderm (for details, see reference 13). Up until the later stages of gastrulation, primitive ectoderm is defined, by a number of criteria, as a pluripotent embryonic tissue from which both somatic and germ line tissues are derived. For example, Gardner and Rossant (12) demonstrated that, when a single primitive ectodermal cell was isolated from a 4.5-d.p.c. embryo and injected into a 3.5-d.p.c. embryo of a different genotype, the injected cell underwent normal embryonic development and was found in all kinds of somatic tissues and germ cells of the adult chimera mouse. Thus, understanding the pluripotent properties of the primitive ectoderm at a molecular level is one of the fundamental issues of early mammalian embryogenesis. Embryonic carcinoma (EC) cell lines, such as F9 and P19 cells, are widely used as a model system for studying gene regulation during early developmental stages of mammals, since EC cells share a number of properties with early embryonic cells.
Most significantly, these EC cells can be induced to differentiate in vitro into a variety of cell types by using retinoic acids and/or by aggregate formation, allowing systematic analyses of gene regulation which governs and maintains the pluripotent state of embryonic cells (29).

It is known that a number of transcription factors, including the HOX gene family, are upregulated during differentiation of EC cells (7, 23, 30). However, to understand the molecular basis of maintenance of the pluripotent state of embryonic cells, it is important to identify factors whose expression is restricted to undifferentiated pluripotent EC cells and downregulated during differentiation of these cells. So far, only five transcription factors, i.e., Oct-3/4 (33, 39, 41), Oct-6 (43, 49), Sox-2 (48), PEA3 (46), and REX-1 (17, 38), have been shown to display such an expression profile, and their expression is significantly decreased or even extinguished when EC cells are induced to differentiate. In addition to these factors, we have recently cloned a transcriptional coactivator termed UTF1, whose expression is also largely restricted to embryonic stem cells (34). Thus, these factors, including UTF1, may well be involved in maintaining the pluripotent properties of early embryonic cells and, for this reason, it is important to systematically analyze their biochemical and biological properties. At the same time, it is equally important to know how the expression of the genes encoding these embryonic transcription factors or cofactors is controlled, as such information may lead to the identification of novel embryonic factors and/or uncover the regulatory hierarchy among these embryonic stem cell factors. Ultimately, these studies may help to unravel the broader aspects of a regulatory network operating in pluripotent embryonic stem cells.

In this context, we have characterized the regulatory elements of the UTF1 gene and found that this gene is under the control of Oct-3/4. Interestingly, as in the case of the enhancer element of the FGF-4 gene (1, 8, 48), UTF1 expression is regulated by a synergistic action of Oct-3/4 and another embryonic factor, Sox-2, implying that the requirement for Sox-2 to exert transcriptional stimulating activity of Oct-3/4 is not limited to the FGF-4 enhancer but is rather a widespread mode for Oct-3/4 to support target gene activation. Moreover, we also demonstrate the unique property of the UTF1 regulatory element to selectively recruit the Sox-2/Oct-3/4 complex.

MATERIALS AND METHODS

DNA manipulations. The sequence of the UTF1 genomic DNA from a BALB/c mouse has been deposited in the GenBank data library under accession number AB017360. The organization and restriction map of the gene are shown in Fig. 1A. To construct the ΔUTF1 reporter gene, a 4.9-kb genomic BamHI DNA fragment carrying the UTF1 gene was recovered and subcloned into pBluescript KS II(+). Subsequently, single-stranded DNA was recovered with the aid of the helper phage, and two KpnI sites were created at the positions of +158 and +359 (the adenine nucleotide of the initiating methionine codon is set to +1). To construct the 5′ deletion mutants shown in Fig. 3B, BamHI sites were created by site-directed mutagenesis according to the manufacturer (Amersham) at the indicated positions, and BamHI fragments bearing the ΔUTF1 gene were recovered and subcloned into pBluescript KS II(+). To construct 3′ deletion mutants, single-stranded DNA was recovered by using 5′ del-4 mutant as described above, and BglII sites were created at the indicated positions. Subsequently, appropriate DNA fragments were removed, and the remaining portions, including the vector portion, were self-ligated. However, to construct the 3′ del-1 mutant, the BamHI/BglII fragment carrying the ΔUTF1 gene was isolated and subcloned into pBluescript KS II(+). The mutant reporter genes shown in Fig. 4B were created by site-directed mutagenesis with single-stranded DNA derived from the 5′ del-4 mutant. Mutated sequences were as shown in the figure legend. The reporter plasmid used in Fig. 4D and Fig. 9B bearing the consensus Octamer sequence (5′-ATTAGCAT-3′) was also generated by the mutagenesis reaction. The expression vectors of Sox-2 and Oct-3/4 subcloned into pCEP4 (Invitrogen) have been described by Yuan et al. (48). The entire Oct-1 and Oct-6 cDNAs were also subcloned into the pCEP4 expression vector.
To construct Flag-tagged Oct-3/4 and Sox-2 expression vectors, the entire coding sequences of these cDNAs were recovered by PCR as KpnI/BamHI fragments and subcloned into pCEP4 vector together with the double-stranded oligonucleotides bearing initiating methionine codon followed by the Flag-tagged sequence. All chimeric protein expression vectors were made by creating restriction sites at the desired junction points and subsequently joining the relevant restriction fragments. Although, in certain cases, creation of each restriction enzyme site resulted in one or two amino acid changes in the protein, we confirmed that these mutations per se do not affect the intrinsic DNA binding properties of the original proteins in terms of interacting with consensus Octamer sequence and the wild-type UTF1 regulatory element (data not shown).

Cell culture and in vitro differentiation of ES cells. P19 EC cells and HeLa cells were cultured as described by Okamoto et al. (33) and Fukushima et al. (11), respectively. E14 ES cells (15) were cultured in the presence of leukemia inhibitory factor as described by Williams et al. (45). Differentiation of embryonic stem (ES) cells was induced by the formation of embryoid bodies: the cells were cultured in bacterial dishes in the absence of leukemia inhibitory factor for 4 days, and the embryoid bodies were then replated onto tissue culture-grade plates to form monolayers of differentiated cells (for details, see reference 37).

RNase mapping analyses. To prepare the riboprobe for the detection of transcripts from the ΔUTF1 gene and of endogenous UTF1 RNA, a 282-bp PCR fragment starting at +62 was recovered by using the ΔUTF1 gene as a template and subcloned into pBluescript KS II(+). Radiolabeled antisense UTF1 RNA was produced by the standard method according to the manufacturer (Promega) and hybridized with RNAs as described previously (34). Since the riboprobe does not carry the portion of exon 1 that is deleted in the ΔUTF1 gene, it cannot hybridize contiguously with endogenous UTF1 RNA. Thus, RNase treatment of the duplex formed between the probe and endogenous RNA generates two short bands (186 and 96 bases) in the gel. For the detection of the transcripts of the internal neomycin-resistant gene, radiolabeled antisense RNA covering nucleotides 190 to 304 of the gene was used. For the detection of various mRNA species shown in Fig. 2, the following portions of the cDNAs, which appear to be unique in sequence, were used: Sox-2 (48), nucleotides 480 to 905; Oct-3/4 (39), nucleotides 1 to 402; Oct-6 (43), nucleotides 889 to 1359; GATA4 (2), nucleotides 1187 to 1560; and MyoD (36), nucleotides 1102 to 1425. The numbers represent the position when the adenine nucleotide of the initiating methionine codon is set to +1.
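The fragment sizes quoted above follow from simple interval arithmetic, which the sketch below reproduces. The segment coordinates are assumptions chosen to agree with the stated numbers (a 282-nucleotide probe starting at +62 and a deletion beginning at the KpnI site created at +158); the exact deletion boundaries are not given in the text, and the helper name `protected_fragments` is hypothetical.

```python
# Sketch: sizes of RNase-protected fragments for a probe that lacks an internal
# segment present in the target RNA. All coordinates are ASSUMED for illustration.

def protected_fragments(probe_segments):
    """Return the length of each contiguous stretch of target RNA covered by the probe.

    probe_segments: list of (start, end) positions, inclusive, in target coordinates.
    Directly adjacent segments are merged because they are protected as one fragment.
    """
    merged = []
    for start, end in sorted(probe_segments):
        if merged and start == merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return [end - start + 1 for start, end in merged]

# Endogenous UTF1 RNA: the probe skips the exon 1 portion deleted in the reporter,
# so two separate stretches are protected (assumed coordinates).
endogenous = protected_fragments([(62, 157), (359, 544)])

# Reporter RNA: in reporter coordinates the same two stretches are contiguous.
reporter = protected_fragments([(1, 96), (97, 282)])

print(endogenous, reporter)
```

Under these assumed coordinates, the endogenous transcript yields two protected fragments whose lengths sum to the full probe length, whereas the reporter transcript protects the probe as a single fragment.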

Gel shift analysis. HeLa cells were transfected with the indicated expression vectors as described previously (11), and whole-cell extracts were prepared 48 h after transfection according to the standard protocol (for details, see reference 9). The gel shift analyses were performed at room temperature for 20 min with 2 fmol of either one of the probes shown in Fig. 5A or one carrying the consensus Octamer sequence. The binding buffer contained 20 mM HEPES-KOH (pH 7.6), 1 mM EDTA, 1 mM dithiothreitol, 2 mM MgCl2, 20% glycerol, and 50 mM NaCl. The reaction mixture (20 μl) contained 1 to 4 μg of whole-cell extract and 2 μg of sonicated salmon sperm DNA. However, 5 μg of whole-cell extract was used for the analyses shown in Fig. 6C. In Fig. 7, 4 μg of nuclear extracts from undifferentiated and differentiated P19 EC cells were used in the analyses. For the analysis shown in Fig. 6B, 0.1 μg of the specific antibody was also added. The reaction mixture was loaded onto a 4% gel in 20 mM Tris-borate (pH 8.0)–0.5 mM EDTA, which had been prerun at 16 V/cm for 1 h. The gels were electrophoresed at the same voltage until the bromophenol blue dye migrated to the bottom of the gel.
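For orientation, the amounts listed above can be converted into concentrations in the 20-μl reaction. The following is a minimal sketch of that arithmetic only; the function names are hypothetical, and the extract values simply take the two ends of the stated 1 to 4 μg range.

```python
# Sketch: convert the per-reaction amounts given in the text into concentrations.

def probe_concentration_nM(amount_fmol, volume_ul):
    # 1 fmol/µl = 1e-15 mol / 1e-6 l = 1e-9 M, i.e. 1 nM
    return amount_fmol / volume_ul

def mass_concentration_ng_per_ul(amount_ug, volume_ul):
    # 1 µg = 1000 ng
    return amount_ug * 1000.0 / volume_ul

REACTION_VOLUME_UL = 20.0

probe_nM = probe_concentration_nM(2.0, REACTION_VOLUME_UL)               # labeled probe
competitor = mass_concentration_ng_per_ul(2.0, REACTION_VOLUME_UL)       # salmon sperm DNA
extract_low = mass_concentration_ng_per_ul(1.0, REACTION_VOLUME_UL)      # lower end of extract range
extract_high = mass_concentration_ng_per_ul(4.0, REACTION_VOLUME_UL)     # upper end of extract range

print(probe_nM, competitor, extract_low, extract_high)
```

The labeled probe is therefore present at a far lower mass concentration than the nonspecific competitor DNA, as is usual for this type of assay.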

RESULTS

Regulatory region governing UTF1 gene expression in pluripotent embryonic cells. We have recently cloned the cDNA for a transcriptional cofactor termed UTF1 by using genetic selection in yeast cells. Biochemical analyses reveal that UTF1 has many of the hallmark characteristics expected for a coactivator (34). One of the interesting features of UTF1 is its unique expression profile. UTF1 has been shown to be expressed in ES cells, as well as in EC cells, and this expression is extinguished when these cells are induced to differentiate. Furthermore, we have found that such an expression profile is also observed in human teratocarcinoma cell lines (11). To identify the regulatory element(s) governing this UTF1 expression pattern, we first isolated a chromosomal DNA fragment carrying the UTF1 gene (32a; a physical map of part of the obtained DNA is shown in Fig. 1A) and examined whether any putative regulatory element(s) of the UTF1 gene are located in close proximity to the structural gene region. Namely, we stably integrated a genomic DNA fragment, designated ΔUTF1 (Fig. 1A), into P19 EC cells and analyzed the levels of expression from the introduced DNA by RNase mapping analysis. ΔUTF1 bears certain portions of 5′ (2.9 kb) and 3′ (0.8 kb) flanking regions in addition to the coding region of the UTF1 gene, but it lacks a portion of exon 1. Therefore, the transcripts from ΔUTF1 can be distinguished from endogenously expressed UTF1 RNA, allowing us to monitor the relative levels of these two transcripts with a single probe (see Materials and Methods for more details). As shown in Fig. 1B, lane 1, when P19 cells were maintained in pluripotent states, transcripts from the ΔUTF1 gene and the endogenous UTF1 gene were detected.
More importantly, both types of RNA became undetectable when P19 cells were induced to differentiate with the treatment of retinoic acid, although RNA was produced from an internal control neomycin-resistant gene which is under the control of the β-actin promoter (14) at comparable levels irrespective of the undifferentiated or differentiated state of the P19 cells (lane 2). Thus, these results indicate that the regulatory element(s) supporting specific expression of UTF1 in undifferentiated P19 cells are located within the introduced DNA fragment.

FIG. 1. The ΔUTF1 gene carries the regulatory region(s) which are responsible for the pluripotent-state-specific expression of the UTF1 gene. (A) Structures of the UTF1 genomic gene and the reporter gene ΔUTF1. The gene consists of two exons interrupted by a short intron (97 bp). The exons are indicated as black boxes, and a portion of the first exon is deleted in the reporter gene to distinguish between exogenous and endogenous UTF1 RNAs. The shaded box indicates the region used for generating radiolabeled antisense RNA. The restriction enzyme sites are abbreviated as follows: B, BamHI; E, EcoRI; H, HindIII; and S, SalI. (B) The ΔUTF1 reporter gene supports the expression in undifferentiated P19 EC cells. P19 EC cells were transfected with the ΔUTF1 gene, together with the neomycin-resistant gene which is under the control of the β-actin promoter (14). After selection with G418, ca. 1,000 of the drug-resistant colonies were pooled and expanded. Subsequently, the cells were incubated with fresh medium containing charcoal-treated serum supplemented with or without 1 μM retinoic acid. After 48 h, RNA was recovered from such treated cells, and the levels of transcripts from ΔUTF1 and neomycin-resistant genes, as well as endogenous UTF1 RNA, were determined by the RNase mapping analyses as described in Materials and Methods.

To confirm that ΔUTF1 also shows a similar expression profile in other embryonic cells, we introduced the ΔUTF1 gene into ES cells. ES cells in which the transfected DNA was stably integrated were subjected to in vitro differentiation by forming simple embryoid bodies as described by Robertson (37). As shown in Fig. 2, this process was accompanied by the extinction of transcripts from exogenously introduced DNA, as well as endogenous UTF1 transcripts, confirming that the ΔUTF1 gene displays an expression profile equivalent to that of the endogenous gene in ES cells. We also examined the expression of other genes and confirmed that the Oct-3/4, Oct-6, and Sox-2 genes were downregulated, whereas GATA4 and MyoD genes were markedly upregulated during this process, verifying that the differentiation signal is properly transduced in the ES cells. We also noted that levels of UTF1 transcripts decreased faster than those of Oct-3/4 and Sox-2, and we present these data in more detail in the Discussion.

FIG. 2. The regulatory region(s) in the ΔUTF1 gene also work in ES cells. The ΔUTF1 gene was introduced into E14 ES cells by electroporation, together with the neomycin-resistant gene. After the selection of the cells in which the ΔUTF1 gene had been stably integrated, these ES cells were subjected to in vitro differentiation as described in Materials and Methods. During the course of this procedure, RNA was recovered at the indicated days (0 to 20), and the levels of specific RNA were determined by the RNase mapping analysis as described in Materials and Methods.

To characterize the undifferentiated EC-cell-specific regulatory elements in more detail, we made a series of deletion mutants of the ΔUTF1 DNA fragment (Fig. 3B). These deletion mutants were transiently transfected into P19 cells, and the levels of transcripts from these mutants were quantitated by RNase mapping analysis as described above. Four GC boxes are present in the canonical promoter region of the gene as shown in the sequence around and upstream of the initiating ATG codon (Fig. 3A), and the results shown in Fig. 3B are consistent with the notion that these elements are important for driving UTF1 expression. We did not obtain any evidence for the presence of additional regulatory elements in the 5′ flanking region that act either positively or negatively to support the unique expression profile of UTF1. We also noted that the 5′ del-11 mutant (bp −65 to +2035), bearing no GC boxes, still produced a low level of transcripts. When 3′ deletion mutants were characterized, we found that one particular mutant, 3′ del-2, produced a significantly lower level of transcripts than ΔUTF1, indicating that the region deleted in this mutant contains important element(s) for sustaining UTF1 expression in P19 EC cells. Interestingly, the activity level supported by 3′ del-2 was even lower than that obtained with 5′ del-11, supporting the notion that the regulatory element(s) in the 3′ flanking region are involved in elevating transcription even from the 5′ del-11 reporter gene.

FIG. 3. Localization of regulatory regions governing UTF1 expression. (A) Nucleotide sequence around the initiating ATG codon. Negative numbers indicate the nucleotide positions upstream of the adenine nucleotide of the initiating methionine codon. GC boxes are underlined. The oligo-capped method (28) revealed at least 15 transcription start sites downstream of the position of −56 (32b). (B) The UTF1 regulatory region is located in the 3′ flanking region of the gene. A series of 5′ and 3′ deletion mutants of the ΔUTF1 gene was constructed as described in Materials and Methods. The number of GC boxes present in each construct is shown. These mutants were individually transfected into P19 EC cells, together with the internal control neomycin gene, which is subcloned into the pHβApr-1 vector (14), and RNA was recovered at 48 h posttransfection. Subsequently, the levels of transcripts were determined by RNase mapping analysis as described in Materials and Methods. The intensity of the bands derived from the transcripts from ΔUTF1 or its derivatives and from the internal control gene was determined by the Fuji-BAS2000 bioimaging analyzer (Fuji Film). The ratio of these two different kinds of transcripts was calculated. The average value obtained with ΔUTF1 was arbitrarily set to 100%. The data were obtained from five independent experiments with comparable results.
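The quantitation described in this legend (reporter signal divided by internal control signal, expressed relative to the ΔUTF1 average) can be sketched as follows. This is only an illustration of the normalization: the band intensities are made-up numbers, not data from the study, and the function names are hypothetical.

```python
# Sketch: normalize reporter band intensity to an internal control and express
# the result as a percentage of the reference construct (set to 100%).
from statistics import mean

def signal_ratios(pairs):
    """pairs: list of (reporter_intensity, control_intensity), one per experiment."""
    return [reporter / control for reporter, control in pairs]

def percent_of_reference(construct_pairs, reference_pairs):
    """Express each construct ratio relative to the average reference ratio."""
    reference_average = mean(signal_ratios(reference_pairs))
    return [100.0 * ratio / reference_average for ratio in signal_ratios(construct_pairs)]

# HYPOTHETICAL intensities (arbitrary units) from two experiments.
reference = [(200.0, 100.0), (100.0, 50.0)]    # reference construct
deletion = [(50.0, 100.0), (25.0, 50.0)]       # a deletion mutant

values = percent_of_reference(deletion, reference)
print(values, mean(values))
```

Dividing by the internal control first corrects for differences in transfection efficiency and RNA recovery between samples before constructs are compared.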

UTF1 expression is regulated by two cis-active motifs: the Octamer and the Sox binding elements. Figure 4A shows the sequence of the 3′ region that is implicated in regulating UTF1 expression. We noted that there are four potential cis-active elements, i.e., PEA3, Octamer, Sox, and En-like binding sequences, although the Octamer and En-like sequences are different from their consensus sequences. Interestingly, each of these elements has been shown to bind embryonic stem cell transcription factors. For example, a large number of Sry-related, Sox motif binding factors have been identified (35) and one of these, termed Sox-2, is expressed in EC cells (48). Likewise, specific Octamer binding factors, Oct-3/4 and Oct-6, have been shown to be expressed in embryonic stem cells and to play an important role during early mammalian development (32, 33, 39, 41, 43, 49). Therefore, we mutated each element and examined whether the reporter gene expression was affected. As shown in Fig. 4C, little or no effect was evident after mutation of the En-like or PEA3 motif. In contrast, mutation of either the Octamer or Sox motif resulted in a drastic decrease in the level of expression, indicating that both the Octamer and Sox binding motifs are required for supporting UTF1 expression in P19 EC cells. Since the UTF1 regulatory element has an atypical Octamer sequence, we examined whether conversion of this variant sequence to the canonical Octamer sequence affected the level of reporter gene expression. As shown in Fig. 4D, we found that this nucleotide change did not affect the reporter gene expression. Thus, these results indicate that the Octamer-like sequence in the UTF1 regulatory element is able to exert its function in a way equivalent to that of the consensus sequence, although we do not have any evidence that both the consensus and variant Octamer elements exert their functions by interacting with the same transcription factors.

FIG. 4. The UTF1 regulatory element is composed of Sox and Octamer motifs. (A) Sequence of the UTF1 gene implicated in supporting its expression in P19 EC cells. The potential transcription factor binding sites are indicated as boldface letters with underlines. (B) Schematic representation of four mutants carrying a mutation in one of four potential transcription factor binding sites. In each mutant, the sequence is mutated as follows: mtPEA (5′-AGGAAG-3′ to 5′-ATGCAT-3′), mtOct (5′-ACTAGCAT-3′ to 5′-ACGATCCT-3′), mtSOX (5′-AACAATG-3′ to 5′-ACCCATT-3′), and mtEn (5′-AATTTACAAATTCT-3′ to 5′-AATGTCCACATTCT-3′). (C) Both Octamer and Sox-binding sites are required for UTF1 expression in P19 EC cells. The mutants depicted in panel B were transiently introduced into P19 EC cells, and the effect of these mutations was evaluated by analyzing the levels of transcripts driven from these mutants by RNase mapping analysis as described in Materials and Methods. The 3′ del-2 mutant shown in Fig. 3B was also included to show the level of transcription in the absence of the regulatory element. (D) Conversion of the Octamer-like sequence to the canonical site does not change the level of reporter gene expression. The 5′ del-4 reporter gene (WT) bearing the wild-type UTF1 regulatory sequence and that bearing the consensus Octamer sequence (Consensus) were transiently introduced into P19 EC cells, and the levels of expression from these two reporter genes were compared by RNase mapping analyses as described in Materials and Methods.
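The substitutions listed in panel B, and the single base that separates the UTF1 Octamer-like sequence from the consensus given in Materials and Methods, can be located by a position-by-position comparison. The sketch below uses only sequences quoted in the text; the helper name `mismatch_positions` is hypothetical.

```python
# Sketch: list the 1-based positions at which two equal-length sites differ.

def mismatch_positions(wild_type, mutant):
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(wild_type, mutant)) if a != b]

# Wild-type and mutated motifs as quoted in the figure legend.
MUTANTS = {
    "mtPEA": ("AGGAAG", "ATGCAT"),
    "mtOct": ("ACTAGCAT", "ACGATCCT"),
    "mtSOX": ("AACAATG", "ACCCATT"),
    "mtEn": ("AATTTACAAATTCT", "AATGTCCACATTCT"),
}

changes = {name: mismatch_positions(wt, mt) for name, (wt, mt) in MUTANTS.items()}

# UTF1 Octamer-like sequence versus the consensus Octamer sequence.
variant_vs_consensus = mismatch_positions("ACTAGCAT", "ATTAGCAT")

print(changes)
print(variant_vs_consensus)
```

Each mutant carries three substitutions, whereas the natural UTF1 element differs from the consensus at a single position in its 5′ half.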

Oct-3/4 displays a DNA binding specificity distinct from that of Oct-1 or Oct-6 on the UTF1 regulatory element. As the Octamer factor-binding sequence found in the UTF1 regulatory region differs from the consensus sequence, we examined whether the Octamer factors are indeed able to bind to this divergent sequence. To this end, three different Octamer factor expression vectors were individually introduced into HeLa cells, and whole-cell extracts were prepared from the transfected cells. Subsequently, the extracts were examined by using gel shift analyses. As shown in Fig. 5B, exogenously expressed Octamer factors produced specific bands in the gel with the probe bearing the wild-type UTF1 sequence but not with the mtOct probe, indicating that all three factors can bind to the probe through the Octamer-like sequence. It is known that the Octamer sequence is essentially composed of two parts in which the first four nucleotides (5′-ATTA/T-3′) are recognized by the POU homeodomain of the Octamer factor and the remaining four nucleotides (5′-GCAT-3′) are recognized by the POU-specific domain (16). To confirm that both units of the Octamer motif are used for the binding of these Octamer factors, we extended the gel shift analyses using mtOctH and mtOctS probes in which the 5′ and 3′ halves of the Octamer sequence are mutated, respectively (Fig. 5A). Although the mtOctS probe did not serve as the binding site for any of the Octamer factors, all three factors bound to the mtOctH probe as efficiently as to the wild-type UTF1 gene sequence. Thus, these results indicate that these Octamer factors do not use the 5′ half of the Octamer-like sequence of the UTF1 regulatory sequence for their binding. It has been shown that the relative positioning of the POU-specific domain and the POU homeodomain is rather flexible, allowing them to recognize fairly diverse sequences and, in some cases, to adopt different orientations or positions on the DNA (16).
As one base mismatch resides in the 5′ half of the original UTF1 Octamer sequence and the mtOctH mutation did not affect the binding of these Octamer factors to the probe, we considered the possibility that the POU homeodomain does not utilize this part of the sequence but instead binds through an adenine nucleotide-rich Sox binding sequence located downstream of the Octamer-like sequence. To examine this possibility, we performed a gel shift analysis with the mtSox probe in which the Sox-binding motif is mutated. As shown in Fig. 5B, this mutation impaired the binding of Oct-1 and Oct-6 to the probe. However, binding of Oct-3/4 to this mutant DNA was not significantly affected. From these results, it is assumed that all of the Octamer factors, including Oct-3/4, bind to the regulatory region, as predicted above, by utilizing the Sox site plus part of the Octamer-like sequence, i.e., 5′-GCAT-3′. However, it appears that Oct-3/4 is also able to bind to the probe through the Octamer-like sequence of the UTF1 gene without utilizing the Sox site. Next, we performed the gel shift analyses with Sox-2 protein and found that, as expected, the protein was able to bind to the mtOct probe, as well as to the wild-type UTF1 sequence, but the binding to the mtSox probe was abolished (Fig. 5C, left panel). We also performed the gel shift analyses with Oct-3/4 protein by using mtOctH and mtSox probes as in Fig. 5B and ran the samples side by side on the gel to compare the mobilities of the protein-DNA complexes. These analyses revealed that there is a slight difference in mobility between the complexes generated with the mtOctH and mtSox probes (Fig. 5C, right panel). Thus, these results indicate that there are indeed two modes of Oct-3/4 binding to the UTF1 regulatory element and support the conclusion obtained with the analyses shown in Fig. 5A.
Next, we determined the DNA sequence requirements of the Oct-6 POU homeodomain for the Sox site by gel shift analyses with mutant probes in which each one of the nucleotides in the Sox site is mutated in turn, as shown in Fig. 5D. These analyses revealed that the fourth and fifth adenine nucleotides are crucial for Oct-6 binding, while most of the other nucleotides, except for the seventh guanine nucleotide, also appear to be involved in the binding. Interestingly, mutation of the third cytosine residue to an adenine residue decreased the efficiency of the binding of Oct-6 to the probe. Thus, these results indicate that, although the richness of AT nucleotides appears to be an important factor, this is not the sole determinant. That is, it appears that a certain specific sequence in the Sox site is also required for the POU homeodomain to bind to the element.
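The two-part reading of the Octamer motif used in this section (a 5′ half of the form ATTA/T and a 3′ half GCAT) can be made explicit with a small check on the sequences quoted in the text. This is an illustrative sketch of sequence matching only; it says nothing about binding affinity, and the function name `octamer_halves` is hypothetical.

```python
# Sketch: split an 8-nucleotide Octamer-like site into its two halves and test
# each half against the consensus pattern described in the text.
import re

HOMEODOMAIN_HALF = re.compile(r"ATT[AT]")   # 5' half: 5'-ATTA/T-3'
POU_SPECIFIC_HALF = "GCAT"                  # 3' half: 5'-GCAT-3'

def octamer_halves(site):
    if len(site) != 8:
        raise ValueError("an Octamer site has eight nucleotides")
    five_prime, three_prime = site[:4], site[4:]
    return {
        "five_prime_matches": bool(HOMEODOMAIN_HALF.fullmatch(five_prime)),
        "three_prime_matches": three_prime == POU_SPECIFIC_HALF,
    }

utf1_variant = octamer_halves("ACTAGCAT")   # Octamer-like sequence of UTF1
consensus = octamer_halves("ATTAGCAT")      # consensus used in the reporter
mt_oct = octamer_halves("ACGATCCT")         # mtOct mutant

print(utf1_variant, consensus, mt_oct)
```

By this criterion the UTF1 element retains an intact 3′ half but a divergent 5′ half, which is the part of the site that the mtOctH probe alters further.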

FIG. 5. Oct-3/4 displays a DNA binding specificity distinct from Oct-1 and Oct-6. (A) Sequences used for the gel shift analyses. “X” indicates the nucleotide which is different from that in the wild-type UTF1 sequence. (B) Oct-3/4, but not Oct-1 or Oct-6, can bind through the Octamer-like sequence of the UTF1 regulatory sequence. The Oct-1, Oct-3/4, and Oct-6 expression vectors were individually introduced into HeLa cells, and gel shift analyses were performed by using whole-cell extracts prepared from such transfected cells as described in Materials and Methods with either one of the probes shown in panel A. CONT, extract prepared from HeLa cells which have been transfected with the empty vector. (C) Binding modes of Sox-2 and Oct-3/4 to the UTF1 regulatory element. The whole-cell extracts were prepared from HeLa cells which had been transfected with either Sox-2 or Oct-3/4 expression vector, and gel shift analyses were done as described in Materials and Methods with the indicated probes. (D) DNA sequence requirements of the Sox site for interaction with the POU homeodomain of Oct-6. The Oct-6 expression vector was introduced into HeLa cells, and whole-cell extracts were prepared from such transfected cells. Subsequently, gel shift analyses were performed by using a series of mutant probes in which each one of the nucleotides in the Sox site is changed to the noncomplementary one (A↔C, G↔T). The efficiency of the complex formation of these probes with Oct-6 was determined by using the Fuji-BAS2000 bioimaging analyzer. The value obtained with the wild-type sequence was arbitrarily set to 100%, and the relative values obtained with different probes were calculated.
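The probe series in panel D follows a simple rule: each position of the Sox site is changed in turn to its noncomplementary base (A↔C, G↔T). A minimal sketch that enumerates such a series for the Sox motif quoted in the text (5′-AACAATG-3′) is given below; the function name is hypothetical, and the flanking probe sequence is omitted because it is not specified here.

```python
# Sketch: generate single-nucleotide scanning mutants using the
# noncomplementary substitution rule A<->C, G<->T.

NONCOMPLEMENTARY = {"A": "C", "C": "A", "G": "T", "T": "G"}

def scanning_mutants(site):
    """Return one mutant per position, in 5'-to-3' order."""
    return [
        site[:i] + NONCOMPLEMENTARY[base] + site[i + 1:]
        for i, base in enumerate(site)
    ]

sox_site = "AACAATG"
mutants = scanning_mutants(sox_site)

for position, mutant in enumerate(mutants, start=1):
    print(position, mutant)
```

Note that the mutant at position 3 replaces cytosine with adenine, which raises the A/T content of the site; this is the substitution the text singles out as reducing Oct-6 binding.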

As it appears that Oct-1 and Oct-6 absolutely require the Sox-binding sequence to bind to the probe, we reasoned that these Octamer factors might not be able to bind to the probe together with Sox binding factors such as Sox-2. To examine this possibility, each of the Octamer factors was expressed together with Sox-2, and gel shift analyses were performed by using extracts prepared from the transfected cells. As shown in Fig. 6A, Sox-2 or Oct-3/4 alone each generated specific bands in the gel (lanes 2 and 3). However, when these proteins were coexpressed, a novel band was generated in addition to those observed when each protein was used alone (lane 4). It is probable that this is due to the simultaneous binding of Oct-3/4 and Sox-2 to the probe, since the mobility of this band is slower than those of the bands generated by either Oct-3/4 (lane 3) or Sox-2 (lane 2) alone. As predicted, Sox-2 was not able to bind the probe together with Oct-6, since no slow-migrating band was generated in spite of the presence of both Oct-6 and Sox-2 proteins (lane 6). Analysis with extracts prepared from cells transfected with the Oct-1 expression vector produced a faint slowly migrating band, indicated by a star symbol, in the presence of Sox-2 (lane 8). Further experiments revealed that this is indeed due to the simultaneous binding of Oct-1 and Sox-2 to the probe (data not shown), indicating that Oct-1 may become able to recognize the Octamer-like sequence in the presence of Sox-2, albeit very inefficiently. Thus, these data suggest that the ability of Oct-3/4 to form a ternary complex with Sox-2 is due to its ability to recognize the Octamer-like sequence. To further confirm this notion, we also performed the gel shift analyses with Oct-3/4 and Sox-2 by using the mtOctH probe. As Oct-3/4 binds to this probe by utilizing the Sox site, we reasoned that even Oct-3/4 may not be able to bind to this mutant probe together with Sox-2. 
Consistent with this idea, Oct-3/4 and Sox-2 gave rise to two distinct bands corresponding to independent binding of either one of these two proteins to the mtOctH probe, and no evidence of the simultaneous binding of Oct-3/4 and Sox-2 on this probe was obtained (lane 12). Next, we performed experiments to confirm that the slowly migrating band reflected the simultaneous binding of Oct-3/4 and Sox-2 to the probe. To this end, we expressed the wild-type Oct-3/4 and a Flag-tagged Sox-2 protein in HeLa cells (see Materials and Methods), and the binding reaction, including the whole-cell extract prepared from the transfected cells, was incubated with anti-Flag antibody before electrophoresis. As shown in Fig. 6B, left panel, the specific antibody, but not the unrelated anti-polyhistidine antibody, impaired the generation of the slow-migrating band, indicating that Sox-2 is indeed involved in its formation. Next, we expressed Oct-3/4 as a Flag-tagged protein and found that incubation of the extract with anti-Flag antibody impaired the production of both fast- and slow-migrating bands (Fig. 6B, right panel). These results are consistent with the notion that the slow-migrating band reflects the simultaneous binding of Oct-3/4 and Sox-2 to the probe. In Fig. 6A, we also demonstrated that Sox-2 cannot bind simultaneously with Oct-6, and we speculate that this is because Oct-6 and Sox-2 compete with each other for the same site (the Sox element) on the UTF1 regulatory element. To test this model, we extended the gel shift analyses as follows. Since Oct-6 and Sox-2 give specific bands which comigrate in the gel, we used chimera 1, which is depicted in Fig. 8A, instead of the Oct-6 protein. This chimera has a complete POU DNA binding domain of Oct-6, but it is much smaller than Sox-2, so that the complex containing chimera 1 and the complex containing Sox-2 can be resolved in the gel. 
We performed gel shift analysis with a constant amount of chimera 1 and found that increasing the amount of Sox-2 protein in the reaction mixture decreased the formation of the probe/chimera 1 complex (Fig. 6C), indicating that these two proteins indeed bind to the UTF1 regulatory element in a mutually exclusive manner. We also performed gel shift analyses by using a probe in which the Octamer-like sequence was converted to the consensus sequence. As shown in Fig. 6D, all three Octamer factors appear to be able to bind efficiently to this probe together with Sox-2, indicating that the different properties of Oct-3/4, Oct-1, and Oct-6 only become evident when the nonconsensus Octamer element of the UTF1 gene is used. Figure 6E shows a model illustrating how the UTF1 regulatory element can specifically bind the Sox-2/Oct-3/4 complex and preclude the simultaneous binding of Oct-6 (also true for Oct-1) with Sox-2.

Gel shift analyses with Octamer factors and Sox-2 protein. (A) Oct-3/4, but not Oct-1 or Oct-6, can bind efficiently to the UTF1 regulatory element together with Sox-2. Three Octamer factors were individually expressed with or without Sox-2, and gel shift analyses were performed with whole-cell extracts prepared from the transfected cells as described in Materials and Methods by using the wild-type (WT, left panel) or mtOctH (right panel) probe. The faint band observed in lane 8, found to be generated by the simultaneous binding of Oct-1 and Sox-2, is indicated by a star. (B) Evidence of the involvement of Oct-3/4 and Sox-2 in the formation of a slow-migrating band. A HeLa whole-cell extract bearing both Oct-3/4 and Sox-2 proteins, in which either one of them is Flag-tagged, was treated with anti-Flag or anti-polyhistidine antibody, and a gel shift analysis was performed with the antibody-treated extracts as described in Materials and Methods. The arrowhead indicates the band generated by the binding of endogenous Oct-1 to the probe. (C) Sox-2 and chimera 1, which bears the POU DNA binding domain of Oct-6, bind to the UTF1 regulatory element in a mutually exclusive manner. HeLa cells were transfected with the chimeric protein 1 (see Fig. 8A for the structure of the protein) or Sox-2 expression vector, and whole-cell extracts were prepared from the transfected cells. A constant amount (1 μg of protein) of extract containing chimera 1 protein was mixed with variable amounts (0 to 4 μg) of Sox-2-containing extract, and gel shift analyses were performed as described in Materials and Methods. The total amount of extract in the reaction mixture was adjusted to 5 μg of protein by using extract from HeLa cells which had been transfected with the empty vector. (D) Both Oct-1 and Oct-6 became able to bind to the UTF1 regulatory element together with Sox-2 when the Octamer-like sequence was converted to the consensus sequence. 
The same set of whole-cell extracts used in panel A was used for the gel shift analyses by using the probe in which an Octamer-like sequence is changed to the consensus sequence. (E) Model showing how the UTF1 regulatory element can selectively bind the complex comprising Oct-3/4 and Sox-2 and preclude the binding of Oct-6 together with Sox-2.
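
The extract balancing described for panel C is simple arithmetic, and a minimal sketch may make it easier to follow. The function name `titration_mix` and the individual titration steps are illustrative assumptions; the legend states only the constant 1 μg of chimera 1 extract, the 0 to 4 μg range of Sox-2 extract, and the 5 μg total.

```python
def titration_mix(sox2_ug, chimera_ug=1.0, total_ug=5.0):
    """Return extract amounts (ug of protein) for one binding reaction.

    The empty-vector extract fills whatever is left so that every
    reaction contains the same total amount of protein.
    """
    filler_ug = total_ug - chimera_ug - sox2_ug
    if sox2_ug < 0 or filler_ug < 0:
        raise ValueError("Sox-2 extract must be between 0 and total minus chimera 1")
    return {"chimera1": chimera_ug, "sox2": sox2_ug, "empty_vector": filler_ug}


# Illustrative titration steps (assumed); the legend gives only the range.
for sox2 in (0.0, 1.0, 2.0, 3.0, 4.0):
    print(titration_mix(sox2))
```

Holding the total protein constant in this way means that any loss of the probe/chimera 1 complex can be attributed to Sox-2 rather than to a change in the overall amount of extract.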

As all of the binding studies were done with HeLa cell extracts containing overexpressed Oct and Sox proteins, we next examined whether protein-DNA complexes similar to those observed in Fig. 6A are also generated with extracts from undifferentiated P19 EC cells. As shown in Fig. 7A, lane 1, four distinct bands were obtained with P19 EC cell extract. Next, we induced P19 EC cells to differentiate with retinoic acid, prepared extract from the differentiated cells, and performed gel shift analyses. It is known that the expression of Oct-3/4 and Sox-2 is significantly decreased or even extinguished during this differentiation process (33, 39, 48). We also found that the level of Oct-6 expression is significantly attenuated during this process (32a). Therefore, it is anticipated that most of the complexes which are supposed to contain at least one of these three proteins will not be detected with extracts from differentiated P19 cells. The results of the analyses are shown in Fig. 7A, lane 2. Consistent with this idea, only one strong band, which is supposed to correspond to the probe complexed with Oct-1, was detected, while signals of the other three bands had disappeared or were significantly decreased. Thus, these results support the band assignments made above on the basis of the mobility of the DNA-protein complexes. We also performed gel shift analyses with P19 EC cell extracts by using a variety of mutant probes. From the data shown in Fig. 5B and Fig. 6A, it is anticipated that the mtOct probe only binds to Sox-2, while the mtSox probe only binds to Oct-3/4. However, the mtOctH probe is supposed to be able to bind to all three Octamer factors (Oct-1, Oct-3/4, and Oct-6), as well as to Sox-2, although this probe cannot form the ternary complex composed of Oct-3/4 and Sox-2. The results are shown in Fig. 7B. It was found that each probe gave exactly the expected pattern in the lane. 
Thus, these analyses with mutant probes further support the idea that factors which were used in the analyses shown in Fig. 6A are major factors present in P19 EC cells which bind to the UTF1 regulatory element.

Detection of factor(s) bound to the UTF1 regulatory element in a P19 EC cell extract. (A) Gel shift analyses with undifferentiated and differentiated P19 EC cell extracts. P19 EC cells were either kept in a pluripotent state or induced to differentiate with 1 μM retinoic acid for 48 h, and nuclear extracts were prepared from the treated cells. Subsequently, gel shift analyses were performed as described in Materials and Methods. Each symbol represents a specific DNA-protein complex. (B) Gel shift analyses with P19 EC cell extract and a series of mutant probes. Gel shift analyses were performed as described in Materials and Methods with nuclear extract from undifferentiated P19 EC cells with either the wild-type, mtOct, mtOctH, or mtSox probe. From their mobilities, we assume that the fastest-migrating band (●) corresponds to probe complexed with Oct-3/4, while the second one (○) corresponds to the complex containing Oct-6 or Sox-2. Likewise, the third (∗) and fourth (⧫) bands represent the ternary complex with Oct-3/4 and Sox-2 bound to the probe and the Oct-1/probe complex, respectively.

The POU homeodomain of Oct-3/4 is crucial for its ability to bind with Sox-2 on the UTF1 regulatory element. To determine which portion(s) of Oct-3/4 are critical for its ability to bind to the UTF1 regulatory element together with Sox-2, we made a series of chimeric proteins between Oct-3/4 and Oct-6 (Fig. 8A) and compared their abilities to form the ternary complex with Sox-2 on the UTF1 DNA probe. As shown in Fig. 8B, left panel, extracts containing chimera 1 or chimera 2 plus Sox-2 gave rise to two bands corresponding to independent binding of either the Oct chimera or the Sox-2 protein to the probe, but no band indicating simultaneous binding of these two proteins was observed. On the other hand, chimera 3, which contains the Oct-3/4 POU homeodomain, produced a clear third band containing both chimeric protein 3 and Sox-2. These results indicate that the POU-specific domain of Oct-3/4 can be replaced by that of Oct-6 but that the POU homeodomain is essential for the unique DNA binding specificity of Oct-3/4. Analyses of additional chimeric proteins (chimeras 4 to 6) further demonstrated that the POU homeodomain of Oct-3/4 is sufficient for conferring its unique DNA binding activity on the chimeras (Fig. 8B, right panel). Experiments with specific antibody revealed that Sox-2 is indeed involved in the formation of the slow-migrating band (data not shown). We also found that all of the chimeric proteins bound, together with Sox-2, to the probe bearing the consensus Octamer sequence (data not shown).

The POU homeodomain is responsible for the different DNA binding specificity between Oct-3/4 and Oct-6. (A) Schematic representation of chimeric proteins between Oct-3/4 and Oct-6. The open and shaded boxes represent the Oct-3/4 and Oct-6 portions, respectively. The functional domains of the Octamer factor are abbreviated as follows: AD, activation domain; S, POU-specific domain; H, POU homeodomain; and L, linker portion. The expression vectors of these chimeric proteins were constructed as described in Materials and Methods. Chimeras 2, 3, 5, and 6 carry a linker portion derived from that of Oct-6. We also constructed a similar set of chimeras whose linker portions are derived from that of Oct-3/4 (data not shown). For an unknown reason, those chimeric proteins did not show any DNA binding activity even on the consensus Octamer motif and, therefore, could not be used for the subsequent analyses. Nevertheless, analyses with the six different chimeras depicted allowed us to conclude that the unique DNA binding specificity of Oct-3/4 is defined by its POU homeodomain (see below). (B) Gel shift analyses with the chimeric proteins and Sox-2. Whole-cell extracts were prepared from HeLa cells into which chimeric protein expression vectors or the corresponding empty vector had been individually introduced with or without the Sox-2 expression vector. Whole-cell extracts were prepared at 48 h posttransfection, and a gel shift analysis was then performed with the extracts and the wild-type UTF1 element. The arrow indicates the band generated by the simultaneous binding of chimeric protein and Sox-2 on the probe. The complex generated by Sox-2 alone is indicated by a single asterisk, while that generated by the chimeric protein is indicated by two asterisks. In the right panel, however, the probe complexed with Sox-2 and the probe complexed with one of the chimeric proteins (4 to 6) could not be resolved in the gel due to the similarity in the sizes of these proteins.

Functional interaction of the UTF1 regulatory element with the Sox-2/Oct-3/4 complex. In light of the above observations, we next investigated whether the specific interaction of the UTF1 regulatory element with the Sox-2/Oct-3/4 complex is also observed in vivo. For this purpose, we introduced a reporter gene bearing wild-type UTF1 genomic sequence into HeLa cells together with Octamer factor and Sox-2 expression vectors by transient transfection. We also used certain chimeras in this analysis to see whether these proteins also show the expected DNA binding specificity in vivo. As shown in Fig. 9A, when the reporter gene (5′ del-4) bearing wild-type UTF1 sequence was used, an increasing amount of the Sox-2 expression vector elevated the level of transcription from the reporter gene significantly only when a constant amount of the expression vector for Oct-3/4 or for chimera 3, which bears the same DNA binding specificity as Oct-3/4, was cotransfected. However, no significant activation was observed with Oct-6 or the chimera 2 protein. Thus, these results indicate that Oct-3/4 and Sox-2 indeed bind to the UTF1 regulatory element simultaneously in vivo and that this ternary complex can activate transcription of the reporter gene through this element. Interestingly, chimera 5, which bears the same DNA binding specificity as Oct-3/4, failed to activate reporter gene expression, indicating that the activation domains of Oct-3/4 also play an important role in the synergistic activation of the reporter gene with Sox-2, which is consistent with the previous conclusion of Ambrosetti et al. (1). To further substantiate these notions, we extended these analyses by using a reporter gene whose Octamer-like sequence had been converted to the consensus Octamer sequence. These analyses revealed that chimeric protein 2, like the wild-type Oct-3/4 and chimera 3, boosted the levels of transcription in concert with Sox-2 (Fig. 9B). 
These results indicate that chimeric protein 2 is intrinsically able to cooperate with Sox-2 to stimulate transcription, and the failure of this protein to activate transcription through the wild-type UTF1 sequence (Fig. 9A) is due to its inability to bind to the DNA together with Sox-2. These results also demonstrate that Oct-6 and chimera 5 fail to activate transcription even when they are able to bind the DNA together with Sox-2, further supporting the conclusion given above that the activation domains of Oct-3/4 are crucial in activating reporter gene expression.

Cooperative function between Oct-3/4 and Sox-2 on the UTF1 regulatory element. (A) Sox-2 is able to cooperate with Oct-3/4 on the regulatory element of the UTF1 gene. HeLa cells were transfected with a constant amount (4.0 μg) of the expression vector for Oct-3/4, Oct-6, or one of chimeric proteins 2, 3, and 5, together with increasing amounts of the Sox-2 expression vector as indicated. In addition, the 5′ del-4 reporter gene (1.0 μg) and an internal control neomycin resistance gene (0.5 μg) were also introduced. RNA was recovered at 48 h posttransfection, and the transcripts from the reporter gene and those from the internal control gene were detected by RNase mapping analyses as described in Materials and Methods. Subsequently, the intensities of the bands corresponding to these two kinds of transcripts were determined as in Fig. 3B, and the ratio of the reporter band to the internal control band was calculated and presented. The data represent the average from five independent experiments with comparable results. (B) Transient-transfection analyses with the reporter gene bearing a canonical Octamer sequence. The cotransfection experiments were done as described above with the reporter gene in which the Octamer-like sequence had been converted to the consensus sequence. The data were obtained as in panel A.
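
The normalization described in this legend (reporter band intensity divided by internal control band intensity, averaged over independent experiments) can be sketched as follows. This is only an illustration of the calculation: the function names and the band intensities are hypothetical, and the actual quantification followed the procedure referred to in Fig. 3B.

```python
from statistics import mean


def normalized_activity(reporter_intensity, control_intensity):
    """Ratio of the reporter transcript band to the internal control band."""
    if control_intensity <= 0:
        raise ValueError("internal control intensity must be positive")
    return reporter_intensity / control_intensity


def mean_activity(experiments):
    """Average the per-experiment ratios over independent experiments.

    `experiments` is a sequence of (reporter, control) intensity pairs.
    """
    return mean(normalized_activity(r, c) for r, c in experiments)


# Hypothetical band intensities (arbitrary units) from five experiments.
example = [(120.0, 60.0), (90.0, 45.0), (200.0, 100.0), (50.0, 25.0), (80.0, 40.0)]
print(mean_activity(example))
```

Dividing by the cotransfected neomycin resistance transcript corrects each sample for differences in transfection efficiency and RNA recovery before the experiments are averaged.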

DISCUSSION

It is known that transcription is uniquely regulated during early mammalian embryogenesis (3, 10, 18, 22, 26, 27, 31). In this study, we have characterized the regulatory region governing the expression of the gene for the ES cell coactivator UTF1 in P19 EC cells. The UTF1 gene regulatory region is composed of essentially two transcription factor binding motifs, i.e., Octamer and Sox binding sites. Furthermore, our data demonstrate that a specific Octamer factor, Oct-3/4, is involved in supporting UTF1 expression in ES cells, even though a number of other Octamer binding factors are present in P19 cells and are coexpressed with UTF1 in early embryonic cells (43, 49). It is also known that a number of proteins of the Sox factor family can bind to the Sox motif (35). However, since only one such factor, Sox-2, is assumed to be expressed in early embryonic cells (48), this factor is the best candidate for the Sox factor involved in the regulation of UTF1 expression in P19 cells. Consistent with this idea, our data demonstrate that Sox-2 is involved in stimulation of transcription of the UTF1 gene by cooperating with Oct-3/4. Both Oct-3/4 and Sox-2 are expressed in a highly tissue-restricted manner, and the specific expression of Oct-3/4 during early embryonic stages as reported by Rosner et al. (39) is indeed quite similar to that of UTF1, which we have recently reported (34). Moreover, we have recently found that UTF1 and Oct-3/4 are also coexpressed in primordial germ cells (32a, 43, 47). Thus, these data are also consistent with the notion that UTF1 acts downstream of Oct-3/4 in a regulatory cascade. However, we noted that during the differentiation of ES cells, the disappearance of UTF1 expression occurred earlier than that of Oct-3/4 or Sox-2 (Fig. 2). That is, by day 6 after the induction of differentiation, UTF1 transcripts were almost undetectable, whereas those for Oct-3/4 and Sox-2 were still present. 
Therefore, one may consider that this result is not consistent with the idea that these two transcription factors play a vital role in regulating UTF1 expression. Although we do not have any definitive answer to this question at present, it is possible that the UTF1 regulatory element requires rather high levels of Oct-3/4 and Sox-2 to exert its activity and, therefore, the element is sensitive to the decrease in both Oct-3/4 and Sox-2. Alternatively, downregulation of UTF1 expression may result not only from the absence of Oct-3/4 and Sox-2 but also from an additional gene-specific repression mechanism that becomes operative upon EC cell differentiation.

There is a precedent for a regulatory element consisting of these Octamer and Sox-binding motifs. Yuan et al. (48) have demonstrated that both Oct-3/4 and Sox-2 are crucial for the expression of the FGF-4 gene in EC cells. Therefore, it is anticipated that a rather large number of downstream genes of Oct-3/4 are regulated in a similar manner, since the fact that two embryonically expressed genes contain similarly organized enhancers is probably not coincidental. It is also noteworthy that a number of other proteins bearing a high-mobility-group domain have been shown to interact with POU domain proteins (1, 24, 50). Most significantly, it has recently been shown that, in the brain, Sox-11, a distinct member of the Sox family, synergizes with the POU domain-containing Brn-1 through the binding of both proteins to adjacent DNA elements (21). Thus, these results indicate that the combinatorial action between POU- and Sox-domain-containing proteins occurs in numerous cell lineages and that Oct-3/4 and Sox-2 play such a role in early embryonic cells. In one case, Sox-2 has also been shown to negatively regulate Oct-3/4 activity on the regulatory region of the osteopontin gene (5).

As discussed above, our analyses demonstrate a similarity in the organization of the regulatory elements of the UTF1 and FGF-4 genes. However, our data also reveal one significant difference between these regulatory regions in the way in which they recruit the Sox-2/Oct-3/4 complex. It has been shown that the FGF-4 regulatory element can bind not only Oct-3/4 but also Oct-1 together with Sox-2 (48). In contrast, only Oct-3/4 can efficiently form the ternary complex with Sox-2 on the UTF1 regulatory element. Our experiments further reveal that this intrinsic difference is dictated by a one-base mismatch in the 5′ half of the Octamer-like sequence of the UTF1 gene. This portion of the consensus Octamer motif is recognized by the POU homeodomain of the Octamer factors (16). However, our data demonstrate that the homeodomains of Oct-1 and Oct-6 are unable to recognize the divergent UTF1 gene Octamer site because of this base change and instead utilize the adenine-rich Sox-binding motif for their binding. Accordingly, as depicted in Fig. 6E, binding by these factors covers the surface of nucleotides important for the interaction of Sox-2 with the UTF1 regulatory element. In contrast, the POU homeodomain of Oct-3/4 is able to recognize the nonconsensus Octamer sequence and to allow Sox-2 binding to the adjacent Sox motif. It therefore appears that Oct-3/4 differs from the other Octamer binding proteins in its ability to bind to DNA together with Sox-2 on the UTF1 element. It still remains to be determined how the POU homeodomain of Oct-3/4 shows such an exquisite DNA binding specificity. The crystal structure of Oct-1 bound to DNA has shown that critical contact is made between a specific amino acid (Asn-429) of Oct-1 and the adenine nucleotide which base pairs with the second thymine of the Octamer motif (5′-ATTA/TGCAT-3′) (19). 
Thus, it is possible that the absence of this contact is responsible for the failure of Oct-1 to recognize the Octamer-like sequence of the UTF1 gene. Therefore, the fact that Oct-3/4 is able to bind to the altered Octamer sequence may indicate that it recognizes the motif in a way distinct from Oct-1. A number of factors bearing the POU DNA binding domain have been isolated and classified into six different groups on the basis of sequence homologies (44). Interestingly, the Octamer factors used in this study (Oct-1, Oct-3/4, and Oct-6) belong to distinct classes. Indeed, the amino acid sequences of the POU homeodomains and POU-specific domains of these three Octamer factors are rather significantly divergent and, therefore, it is possible that nonconserved amino acids of the Oct-3/4 POU homeodomain are responsible for its unique DNA binding specificity. In any event, it appears that the regulatory region of UTF1 has evolved in such a way as to ensure pluripotent ES cell expression through the ability to distinguish the transcriptionally active complex composed of Sox-2 and Oct-3/4 from transcriptionally inactive complexes containing Oct-1 or Oct-6.
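
To make the notion of a one-base deviation from the consensus concrete, the comparison can be sketched as a position-by-position check against the motif quoted above, 5′-ATT(A/T)GCAT-3′, written here with the IUPAC code W for A or T. The function name and the test sequences are illustrative assumptions only; in particular, the variant shown is an invented example and is not the UTF1 sequence, which is not reproduced in this section.

```python
# IUPAC codes needed for the consensus as quoted in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "AT"}

CONSENSUS = "ATTWGCAT"  # 5'-ATT(A/T)GCAT-3'


def mismatch_positions(site, consensus=CONSENSUS):
    """Return 1-based positions at which `site` deviates from `consensus`."""
    site = site.upper()
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return [
        i + 1
        for i, (base, code) in enumerate(zip(site, consensus))
        if base not in IUPAC[code]
    ]


# Both consensus forms match; the third site is a made-up single-base variant.
print(mismatch_positions("ATTAGCAT"))
print(mismatch_positions("ATTTGCAT"))
print(mismatch_positions("ATCAGCAT"))
```

A site reporting exactly one position in the 5′ half would correspond to the kind of single-base difference discussed here, although sequence identity alone says nothing about which factor can actually bind.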

The molecular basis of the pluripotent properties of ES cells is largely unknown at present. However, one recent approach toward this understanding appears promising. Gene-targeting analyses have clearly shown the importance of the Oct-3/4 gene and its target genes for sustaining early embryogenesis (32). For example, one of these target genes, FGF-4, plays an important role at least in promoting proliferation of trophectoderm precursors, and genetic knockout of this gene results in an early embryonic lethal phenotype. So far, several potential target genes of Oct-3/4, including the UTF1 gene, have been identified (4, 5, 20, 25, 40, 48). In this context it is noteworthy that two of them (FGF-4 and UTF1) are regulated in a similar manner. It is, therefore, tempting to speculate that a substantial number of other genes may also be subject to the synergistic action of Oct-3/4 and Sox-2.

ACKNOWLEDGMENTS

We are indebted to Lisa Dailey, Hiroshi Hamada, and Winship Herr for providing the Oct-3/4 plus Sox-2, the Oct-6, and the Oct-1 expression vectors, respectively. We also thank Hitoshi Niwa for critical comments on this manuscript and Namiko Hihara for her excellent technical assistance throughout this study.

This work was supported in part by the Ministry of Education, Science, Sports, and Culture, Japan. A.O. was also supported by the Ministry of Health and Welfare, Japan, and The Sankyo Foundation of Life Science. A.F. is the recipient of a postdoctoral fellowship from the Japan Health Science Foundation.

(1998) Rex-1, a gene encoding a transcription factor expressed in the early embryo, is regulated via Oct-3/4 and Oct-6 binding to an octamer site and a novel protein, Rox-1, binding to an adjacent site. Mol. Cell. Biol. 18:1866–1878.