Abstract

The transcriptional program that drives human preimplantation development is largely unknown. Here, by using single-cell RNA sequencing of 348 oocytes, zygotes and single blastomeres from 2- to 3-day-old embryos, we provide a detailed analysis of the human preimplantation transcriptome. By quantifying transcript far 5'-ends (TFEs), we include in our analysis transcripts that derive from alternative promoters. We show that 32 and 129 genes are transcribed during the transition from oocyte to four-cell stage and from four- to eight-cell stage, respectively. A number of the identified transcripts originate from previously unannotated genes that include the PRD-like homeobox genes ARGFX, CPHX1, CPHX2, DPRX, DUXA, DUXB and LEUTX. Employing de novo promoter motif extraction on sequences surrounding TFEs, we identify significantly enriched gene regulatory motifs that often overlap with Alu elements. Our high-resolution analysis of the human transcriptome during preimplantation development may have important implications for future studies of human pluripotent stem cells and cell reprogramming.

Figures

Figure 1. Overview of the study and changes in total cellular RNA content.

(a) Microphotographs and number of cells used from each developmental stage studied. A total of 348 cells were collected at oocyte and zygote stage, day 2 (4–6 blastomeres) and day 3 (7–10 blastomeres) after fertilization. Scale bar, 20 μm. (b) Heterogeneity of gene expression among oocytes, day 2 and day 3 embryo blastomeres. The heatmap shows the Spearman correlation coefficient (in complete pairs of observations) between the samples in library L185. The correlation matrix is sorted by hierarchical clustering with Euclidean distance and the complete linkage method. (c) Spearman correlation coefficient between the samples in library L186 showing biological variation in day 2 and day 3 blastomeres. (d) Comparison of relative poly(A)-tailed RNA content between the developmental stages from plates L233, L185 including exactly staged 4-cell embryos (n=23), and L186 including exactly staged 4-cell (n=11) and 8-cell (n=22) embryos. Relative poly(A)-tailed RNA content per cell is calculated as the ratio of total reads mapped to genome references to reads mapped to spike-in references, with centering by the average of the ratios in the earlier stage. Red line indicates average level, and P value is the assessment of the mean rank difference by Wilcoxon rank sum test. (e) A comparison of the overall proportion of TFEs according to genome annotation. The comparisons are oocyte to zygote, oocyte to 4-cell, and 4- to 8-cell. The annotated groups marked with dots or crosses and separated by black lines are plotted separately in f. (f) Proportion of annotated reads for functional transcripts from groups ‘upstream and 5′UTR of coding’, ‘upstream and 1st exon of noncoding’, ‘intron’ and ‘unannotated’ in the transitions from oocyte to 4-cell and 4- to 8-cell, and in the comparison of 8-cell stage blastomeres to human brain RNA. Purple lines indicate average ratio, and P value is the assessment of the mean rank difference by Wilcoxon rank sum test. The variation among 8-cell stage blastomeres exceeded the purely technical variation represented by the human brain samples, suggesting biological variation. 4-cell, four-cell; 8-cell, eight-cell.
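The spike-in normalization described for panel d can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name, the read counts and the choice to centre by dividing by the earlier-stage mean (so that the earlier stage averages 1) are all assumptions made for the example.

```python
from statistics import mean

def relative_polya_content(genome_reads, spike_reads, stages, earlier_stage):
    """Per-cell relative poly(A)+ RNA content.

    For each cell, take the ratio of genome-mapped reads to spike-in-mapped
    reads, then divide by the mean ratio of the earlier stage.
    Centring by division is an assumption; the caption only says 'centering'.
    """
    ratios = [g / s for g, s in zip(genome_reads, spike_reads)]
    baseline = mean(r for r, st in zip(ratios, stages) if st == earlier_stage)
    return [r / baseline for r in ratios]

# Hypothetical counts for two oocytes and two 4-cell blastomeres.
genome = [8000, 12000, 2000, 3000]
spike = [1000, 1000, 1000, 1000]
stages = ["oocyte", "oocyte", "4-cell", "4-cell"]

relative = relative_polya_content(genome, spike, stages, "oocyte")
for stage, value in zip(stages, relative):
    print(stage, round(value, 3))
```

Because every cell receives the same amount of spike-in RNA, a lower ratio indicates less endogenous poly(A)-tailed RNA in that cell. The resulting per-stage values are what a rank-based test such as the Wilcoxon rank sum test would then compare.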

Figure 2. Analysis of single-cell gene expression levels during EGA.

To exclude library bias effects, we performed the comparisons between cells within the same STRT libraries. (a) Analysis of early EGA (oocyte-to-4-cell transition). Pvclust hierarchical clustering of the cells on plate L185 shows that cells from different cell stages are clearly separated from each other, whereas cells from the same cell stage cluster together. The cells within the red boxes were included in downstream analyses. (b) Upregulated genes (red dots), downregulated genes (green dots) and genes showing no significant change (grey dots) between the two stages. The dotted line marks the cell division effect on cellular RNA content (two cell divisions from oocyte to 4-cell stage and one cell division from 4-cell to 8-cell stage). (c) A comparison of overall proportions of TFEs according to genome annotation for significantly upregulated and downregulated TFEs during early EGA. The 32 upregulated TFEs in the early EGA included known start sites for coding genes (red bar), but also unannotated transcription sites (light grey). Downregulated TFEs, in contrast, consisted of >50% internal exons or 3′UTR of coding genes (pink), consistent with partially degraded transcripts. (d) GO molecular function classification of the 32 upregulated coding genes during early EGA. (e) Analysis of major EGA (4-to-8-cell transition). Pvclust hierarchical clustering of the cells on plate L186. The cells within the red boxes were included in downstream analyses, while the outlier cell was excluded from further analyses. (f) Upregulated genes (red dots), downregulated genes (green dots) and genes showing no significant change (grey dots). The dotted line marks the cell division effect on cellular RNA content as described above. (g) A comparison of overall proportions of TFEs according to genome annotation for significantly upregulated and downregulated TFEs during major EGA. More than 70% of the upregulated TFEs mapped to annotated coding gene start sites, compared to <25% of the downregulated transcripts. (h) GO molecular function classification of the 129 upregulated coding genes during major EGA. 4-cell, four-cell; 8-cell, eight-cell.
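The cell division effect marked by the dotted lines in panels b and f can be illustrated with a short calculation. This is a sketch under simplifying assumptions that are not stated in the caption: RNA is split evenly between daughter cells, with no new transcription and no degradation. The function names are hypothetical.

```python
import math

def per_cell_fraction(n_divisions):
    """Fraction of the starting RNA content left in each cell after
    n even divisions (assumes no synthesis and no degradation)."""
    return 0.5 ** n_divisions

def log2_shift(n_divisions):
    """The same dilution expressed as a log2 fold change per cell."""
    return math.log2(per_cell_fraction(n_divisions))

# Oocyte to 4-cell stage: two divisions. 4-cell to 8-cell stage: one division.
for label, n in [("oocyte to 4-cell", 2), ("4-cell to 8-cell", 1)]:
    print(label, per_cell_fraction(n), log2_shift(n))
```

Under these assumptions a transcript that is neither made nor degraded falls to one quarter of its per-cell level between oocyte and 4-cell stage, and to one half between 4-cell and 8-cell stage. A transcript sitting above that line is therefore a candidate for new transcription, even if its per-cell level has not risen.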

(a) Conserved motifs in the promoter regions (−2,000 to +500 bp relative to the TFE cluster peak) present in 27 of the 32 TFEs upregulated in the transition from oocyte to 4-cell embryo. The red dots indicate the positions of the de novo 36-bp motifs, while the white dots indicate the positions of sequences similar to the de novo 36-bp motif. There are 50 Alus in the 27 de novo motif-containing promoters in the oocyte to 4-cell transition (grey bars). (b) Known TF-binding sequences for PRD-like homeobox, T-box and bZIP and the first predicted de novo 36-bp DNA motif that includes these elements. Error bars indicate the confidence of a motif based on the number of sites. (c) Conserved motifs present in the promoters of 91 out of 129 TFEs upregulated at the major EGA. The promoters of upregulated TFEs show significant enrichment of Alu elements, with 208 Alus (grey bars) in the 91 de novo motif-containing promoters. (d) T-box and PRD-like homeobox elements show similarities to a de novo 35-bp DNA motif sequence highly similar to the 36-bp motif and present in 39 promoters of the upregulated TFEs in the major EGA. Error bars indicate the confidence of a motif based on the number of sites.

Figure 4. PRD-like homeobox genes expression during EGA.

(a) Comparison of conserved homeodomains present in novel PRD-like homeobox genes cloned from early human development (blue shading) and in previously annotated genes (no shading). The highest degree of conservation is found for the amino-acid residues responsible for forming the tertiary structure of the homeodomain. Drosophila melanogaster ANTP (grey shading) is given as a reference for homeodomain proteins. The amino-acid sequences are aligned using ClustalW. (b) Early developmental expression pattern of the novel PRD-like homeobox genes (blue shading) and previously annotated genes. The expression profiles of the so-called Yamanaka factors (iPS induction factors) MYC, SOX2, KLF4 and POU5F1/OCT4 are shown for comparison. Heterogeneous expression between individual blastomeres at 8-cell stage (day 3 embryo) is seen for LEUTX and CPHX1 in particular. In addition, a reciprocal patterning of CPHX1 and CPHX2 expression is detected in developmental stages up until 8-cell stage, with CPHX1 being present in the oocyte, whereas CPHX2 is upregulated in a few cells of 4- and 8-cell stage embryos. The box at the far right describes the presence (indicated by black squares) of the novel PRD-like homeobox genes and the previously annotated genes in the different evolutionary branches. Background colour in the box visualizes the two main clusters from the heatmap (clusters shown at far left).