Division of Human Biology, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America.

Abstract

Each unit of the D4Z4 macrosatellite repeat contains a retrotransposed gene encoding the DUX4 double-homeobox transcription factor. Facioscapulohumeral dystrophy (FSHD) is caused by deletion of a subset of the D4Z4 units in the subtelomeric region of chromosome 4. Although it has been reported that the deletion of D4Z4 units induces the pathological expression of DUX4 mRNA, the association of DUX4 mRNA expression with FSHD has not been rigorously investigated, nor has any human tissue been identified that normally expresses DUX4 mRNA or protein. We show that FSHD muscle expresses a different splice form of DUX4 mRNA compared to control muscle. Control muscle produces low amounts of a splice form of DUX4 encoding only the amino-terminal portion of DUX4. FSHD muscle produces low amounts of a DUX4 mRNA that encodes the full-length DUX4 protein. The low abundance of full-length DUX4 mRNA in FSHD muscle cells reflects a small subset of nuclei producing a relatively high abundance of DUX4 mRNA and protein. In contrast to control skeletal muscle and most other somatic tissues, full-length DUX4 transcript and protein are expressed at relatively abundant levels in human testis, most likely in the germ-line cells. Induced pluripotent stem (iPS) cells also express full-length DUX4, and differentiation of control iPS cells to embryoid bodies suppresses expression of full-length DUX4, whereas expression of full-length DUX4 persists in differentiated FSHD iPS cells. Together, these findings indicate that full-length DUX4 is normally expressed at specific developmental stages and is suppressed in most somatic tissues. The contraction of the D4Z4 repeat in FSHD results in a less efficient suppression of the full-length DUX4 mRNA in skeletal muscle cells. Therefore, FSHD represents the first human disease to be associated with the incomplete developmental silencing of a retrogene array normally expressed early in development.

(A) Diagram of the D4Z4 repeat array with the two most telomeric full units (large triangles), the last partial repeat, and the adjacent pLAM sequence that contains exon 3. Exons are shown as shaded rectangles, with exons 1 and 2 in the D4Z4 units and exon 3 in the pLAM region. (B) The open rectangle represents the region of D4Z4 and pLAM containing the DUX4 retrogene, and the solid and dashed lines represent the regions of exons and introns, respectively, in the short splice form (DUX4-s) and the transcript with the full-length DUX4 ORF (DUX4-fl), which has two isoforms with alternative splicing in the 3-prime untranslated region. First round PCR for DUX4-fl and DUX4-s was performed with primer sets 1 and 2 and second round PCR with nested primers 3 and 4. Nesting was used to ensure specificity and because of the very low abundance of both DUX4-fl and DUX4-s transcripts. MAL, location of the initial amino-acid codons; *, stop codons; P, polyadenylation site. (C) Composite image of representative PCR products from FSHD and control muscle biopsies for DUX4-fl, DUX4-s, and DUX4-fl3′. DUX4-fl and DUX4-s are indicated. Variation in size reflects alternative intron usage and the faint intermediate bands represent background non-DUX4 PCR amplicons frequently associated with repetitive sequence. (D) Representative PCR products of cultured FSHD and control muscle cells under differentiation conditions. ntc, no template control.

A small number of FSHD muscle cells express a relatively large amount of DUX4.

(A) RT-PCR for full-length DUX4 (DUX4-fl3′) was performed in duplicate on polyadenylated RNA isolated from ten pools of 600 cultured FSHD muscle cells (lanes 1–10) and a single pool of 10,000 cells (lane 11). The presence of DUX4 mRNA in one-half of the pools indicates that approximately one cell per 1000 is expressing DUX4 mRNA. NTC, no template control. (B) Cultured FSHD muscle cells were differentiated and immunostained with monoclonal antibodies to DUX4. Cells were co-stained with the E5-5 rabbit monoclonal antibody to the DUX4 C-terminal region (panels a, c, e) and the P2G4 mouse monoclonal antibody to the N-terminal region (panels b, d, f), or co-stained with the P4H2 mouse monoclonal antibody to the C-terminal region and the E14-3 rabbit monoclonal antibody to the N-terminal region. Approximately 1 cell per 1000 showed nuclear staining, and the co-localization of both the N-terminal and C-terminal regions indicates that these cells are expressing the full-length DUX4 protein. No positive nuclei were apparent in the control muscle cultures. (C) Control human muscle cultures infected with a lentivirus expressing full-length DUX4 (a–d) or DUX4-s (e), co-stained with P4H2 (panels a, c) and E14-3 (panels b, d, e). In the cells expressing DUX4-fl, at 24 hrs there is a relatively homogeneous nuclear distribution of DUX4 (a, b), whereas at 48 hrs, as the cells are undergoing apoptosis, DUX4 staining becomes more focal and punctate (c, d), similar to the punctate DUX4 staining in the FSHD muscle cells in panel B. Expression of DUX4-s does not induce apoptosis (data not shown) and the DUX4-s protein maintains a relatively homogeneous nuclear distribution at 48 hrs (e).
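The frequency estimate in panel A can be checked with a short limiting-dilution calculation. The sketch below is illustrative only: it assumes that expressing cells are distributed among pools according to Poisson statistics and that a pool scores positive whenever it contains at least one expressing cell. The legend does not state how the estimate was derived, so the function name and the count of 5 positive pools (taken from "one-half of the pools") are assumptions.

```python
import math

def expressing_fraction(pool_size, positive_pools, total_pools):
    """Estimate the fraction of expressing cells from pooled RT-PCR results.

    Hypothetical helper: assumes Poisson sampling, so that
    P(pool is negative) = exp(-fraction * pool_size).
    """
    negative_fraction = 1 - positive_pools / total_pools
    if not 0 < negative_fraction <= 1:
        raise ValueError("need at least one negative pool to estimate a frequency")
    # Solve exp(-fraction * pool_size) = negative_fraction for fraction.
    return -math.log(negative_fraction) / pool_size

# Ten pools of 600 cells, half of them positive (assumed to be 5 of 10).
fraction = expressing_fraction(pool_size=600, positive_pools=5, total_pools=10)
cells_per_expresser = 1 / fraction
print(round(cells_per_expresser))  # 866
```

Under these assumptions the estimate is about one expressing cell per 870 cells, which agrees with the stated figure of approximately one cell per 1000.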

(A) RT-PCR of RNA from human tissues showing DUX4-fl in the testis sample (Testis-1) and DUX4-s in ovary, heart, and liver. Note that each sample is from an unidentified individual whose genotype is not known. (B) RT-PCR of three additional testis samples and matched skeletal muscle RNA from the Testis-1 and Testis-4 donors. (C) RT-PCR of the full-length DUX4 ORF (nested with 36 total cycles) in muscle (M) and testes (T) RNA from the same individuals, showing expression in testes and not in muscle. Numbers indicate Testis-1, 4, 7, 6, 5. (D) Quantitative RT-PCR of DUX4-fl3′ showing relative abundance in muscle cells, muscle biopsies, human testis and other indicated cells. The sample with the highest expression was set at 100% and the number of copies per cell was roughly estimated based on titration of input DNA. H9-ES, human embryonic stem cell line H9; M83-9 iPS, iPS line made from control fibroblasts; 43-1 series are the fibroblast, iPS, and EB cells from an FSHD fibroblast line; MB183 and MB148, FSHD muscle cultures under growth (MB) and differentiation (MT) conditions; F-2331 and F-2316, FSHD muscle biopsy samples; hTestis, human testis. (E) Western detection of DUX4 protein in whole cell extracts from tissues and cell lines using a rabbit monoclonal antibody (E14-3) raised to the amino-terminus of the human DUX4 protein. Lanes: 1-control muscle culture; 2-HCT116 cell line; 3-mouse testis; 4-human testis; 5-C2C12 cells transfected with human DUX4 expression vector; 6-C2C12 cells. Specificity of the antibody is indicated by selective detection of DUX4 protein in C2C12 cells transfected with a DUX4 expression vector, pCS2-DUX4 (note that this vector contains two ATG codons resulting in the standard DUX4 protein and an in-frame slightly larger protein accounting for the two bands on the western), compared to untransfected C2C12. Human testis has a single reactive band that migrates marginally slower than the transfected standard DUX4 species in C2C12 cells.
This DUX4-reactive band is not present in mouse testis extract (note that mice do not have a highly conserved DUX4), HCT116 human colon cancer cells, or unaffected myotubes (MB196-36hr). Similar results were obtained with an additional DUX4 antibody (E5-5) raised to the carboxy-terminal region of DUX4 (data not shown) and on protein extracts from two additional testis samples (data not shown). (F) Immunoprecipitation of the indicated protein extracts with the E14-3 rabbit monoclonal to the N-terminal region of DUX4, followed by western with the P4H2 mouse monoclonal to the C-terminal region of DUX4, demonstrating that the protein recognized by the rabbit anti-DUX4 is also recognized by an independent mouse monoclonal to DUX4. Lanes: 1-HCT116 cell line lysate; 2-Testis protein lysate; 3-C2C12 cells transfected with DUX4 expression vector.

Frozen sections of human testis stained with anti-DUX4 monoclonal antibodies P2B1 and E5-5 together (A), or P2B1 (B), or E5-5 (C) individually, compared to an isotype control (D). Arrowheads indicate a few of the many nuclei that show a brown antibody-dependent precipitate superimposed on the blue hematoxylin-stained nuclei. (E) Antigen retrieval reveals a larger number of more differentiated germ-line cells stained with the P2B1 antibody, compared to an isotype control (F). Because of the range of nuclear morphologies and relatively large numbers of positive cells, the DUX4-expressing cells appear to be in the germ-line lineage at a range of differentiation stages.

Alternative exon and polyadenylation site usage in germ-line and somatic tissues.

(A) Schematic of the last D4Z4 unit, last partial repeat, and distal exons. Exon 7 is approximately 6.5 kb from the polyadenylation site in exon 3. DUX4-fl from FSHD muscle contains exons 1-2-3. DUX4-s uses a non-consensus splice donor in the middle of exon 1 to create a short exon 1: 1s-2-3. Both are derived exclusively from chromosome 4A in muscle and other sampled somatic tissues. Germ-line tissue expresses the 4A transcript with exons 1-2-3 but also expresses both 4A and 10A transcripts with exons 1-2-6-7. We have also identified a transcript in testis that shows a 1-2-4-5-6-7 splice usage. (B–E) Exon sequences (upper case letters) with flanking genomic sequence (lower case letters). B, Exon 4. C, Exon 5. D, Exon 6. E, Exon 7. Exon 6 and 7 sequences are from a cDNA assigned to chromosome 10; exon 4 and 5 sequences are from a cDNA assigned to 4A161.

Expression of DUX4-fl and DUX4-s in pluripotent stem cells and differentiated tissues.

(A) Top panel shows induced pluripotent stem (iPS) cells from unaffected and FSHD-affected individuals. Human iPS cell lines were generated from skin fibroblasts cultured from an unaffected individual (M83-9) and two FSHD-affected individuals (FSHD83-6 and FSHD43-1). Human ES cells (HESC) are also shown. Each iPS cell line was generated by transduction of fibroblasts with murine retrovirus vectors encoding human SOX2, OCT4, and KLF4. Colonies developed approximately 20 days after infection and had the characteristic growth morphology of an iPS cell with flat, well organized colonies, sharply defined colony and cell borders, high nuclear to cytoplasmic ratio, and prominent single nucleoli. Cells contained tissue non-specific alkaline phosphatase activity (AP), had normal karyotypes, and were immunoreactive (green) for Stage Specific Embryonic Antigen 4 (SSEA4), NANOG, OCT4, and TRA-1-60. 4′,6-Diamidino-2-phenylindole (DAPI) staining (blue) indicates total cell content per image. Bottom panel shows hematoxylin and eosin stained tissue sections of teratomas that developed in SCID-Beige mice after intramuscular injection of iPS cells generated from skin fibroblasts of a normal individual (M83-9) or two different FSHD-affected individuals (FSHD83-6 and FSHD43-1). Endoderm-derived tissue is identified by a gut-like structure surrounded by smooth muscle, parenchymal tissue, and lined with a columnar endothelium. Mesoderm-derived tissue is identified by bone (M83-9) or by the presence of cartilage containing chondrocytes (FSHD83-6 and FSHD43-1), and ectoderm-derived tissue is identified by the presence of pigmented neural epithelium (M83-9 and FSHD43-1) or neural rosettes (FSHD83-6). (B) Total cellular RNA was purified from human dermal fibroblasts (HDF), iPS cells used in this study (M83-9, FSHD83-6, FSHD43-1), and human ES cells (HESC).
The presence of RNA transcripts from the genes indicated was detected by priming reverse transcription reactions with oligo dT and PCR amplification of cDNA with oligonucleotides complementary to the sequence of the genes listed (28 cycles). Priming oligonucleotides used for OCT4, SOX2, and KLF4 amplification were specific for non-vector encoded transcripts. As a positive control, RNA from human embryonic stem cells (HESC) was processed in parallel. Water instead of RNA was used as a negative control. Reverse transcriptase was left out of the cDNA synthesis step (-RT) to demonstrate RNA purity, and RNA transcripts from glyceraldehyde phosphate dehydrogenase (GAPDH) were amplified to demonstrate RNA integrity. (C) RT-PCR of DUX4 mRNA in iPS cells derived from control or FSHD fibroblasts. M8 (M83-9), control fibroblast line; F4 (FSHD43-1) and F8 (FSHD83-6), FSHD fibroblast lines. (D) Chromatin immunoprecipitation (ChIP) analysis of H3K9me3 at the 5′-region of DUX4 in control and FSHD fibroblasts, induced pluripotent stem (iPS) cells, and embryoid bodies (EB) differentiated from the iPS cells. Bars represent relative enrichment by real-time PCR with primers previously described and confirmed as specific to D4Z4. The H3K9me3 IP signals were normalized to control IgG IP and to input, and are presented as mean ± standard deviation.
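The normalization described for panel D can be illustrated with a small calculation. The legend does not give the exact formula, so the sketch below uses one common convention as an assumption: each IP is expressed as a percentage of input, and the H3K9me3 value is then divided by the IgG value. The function names, the 1% input fraction, the assumption of perfect PCR doubling per cycle, and all Ct values are hypothetical and are not data from this study.

```python
import math
import statistics

def percent_input(ct_ip, ct_input, input_fraction):
    """Express an IP signal as a percentage of input chromatin.

    Assumes 100% PCR efficiency (signal doubles each cycle).
    input_fraction is the share of chromatin saved as input, e.g. 0.01.
    """
    # Shift the input Ct to what it would be for 100% of the chromatin.
    adjusted_input_ct = ct_input - math.log2(1 / input_fraction)
    return 100 * 2 ** (adjusted_input_ct - ct_ip)

def relative_enrichment(ct_h3k9me3, ct_igg, ct_input, input_fraction=0.01):
    """H3K9me3 percent input divided by IgG percent input (assumed convention)."""
    specific = percent_input(ct_h3k9me3, ct_input, input_fraction)
    background = percent_input(ct_igg, ct_input, input_fraction)
    return specific / background

# Made-up Ct values for three replicate IPs: (H3K9me3, IgG, input).
replicates = [(24.0, 29.0, 25.0), (24.2, 29.1, 25.0), (23.9, 29.0, 25.1)]
enrichments = [relative_enrichment(h3, igg, inp) for h3, igg, inp in replicates]

mean_enrichment = statistics.mean(enrichments)
stdev_enrichment = statistics.stdev(enrichments)  # sample standard deviation
print(f"{mean_enrichment:.1f} ± {stdev_enrichment:.1f}")
```

Because both IPs are scaled to the same input, the input term cancels in the ratio; it is kept here so that the percent-input values can also be inspected on their own.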