Abstract

The convergence of developmental biology and modern genomics tools brings the potential for a comprehensive understanding
of developmental systems. This is especially true for the Caenorhabditis elegans embryo because its small size, invariant developmental lineage, and powerful genetic and genomic tools provide the prospect
of a cellular resolution understanding of messenger RNA (mRNA) expression and regulation across the organism. We describe
here how a systems biology framework might allow large‐scale determination of the embryonic regulatory relationships encoded
in the C. elegans genome. This framework consists of two broad steps: (a) defining the “parts list”—all genes expressed in all cells at each
time during development and (b) iterative steps of computational modeling and refinement of these models by experimental perturbation.
Substantial progress has been made towards defining the parts list through imaging methods such as large‐scale green fluorescent
protein (GFP) reporter analysis. Imaging results are now being augmented by high‐resolution transcriptome methods such as
single‐cell RNA sequencing, and it is likely the complete expression patterns of all genes across the embryo will be known
within the next few years. In contrast, the modeling and perturbation experiments performed so far have focused largely on
individual cell types or genes, and improved methods will be needed to expand them to the full genome and organism. This emerging
comprehensive map of embryonic expression and regulatory function will provide a powerful resource for developmental biologists,
and will also allow scientists to ask questions that are not accessible without a comprehensive picture.

Images

A systems biology perspective on embryonic development. (a) Each C. elegans individual (here, with chromatin visualized with fluorescently‐labeled histones) develops from the fertilized egg (left) to an L1 larval stage worm (right) through an identical pattern of 670 cell divisions that gives rise to 558 terminal cells and 113 programmed cell deaths (center). (b) A systems biology framework for understanding development: first, the “parts list” (genes expressed in each cell across time) is identified (left), and then the parts list is integrated with other data to generate computational models of regulatory networks. These models make predictions that can be tested experimentally, and the results are then used to iteratively refine the models.
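
The cell counts in panel (a) can be checked with simple bookkeeping: each division replaces one cell with two daughters, so 670 divisions starting from a single fertilized egg yield 671 cells, which matches 558 terminal cells plus 113 programmed cell deaths. A minimal sketch in Python (the function name is illustrative and not from the article):

```python
# Bookkeeping check for the lineage numbers quoted in the caption.
# Assumption: every division is binary, so each one adds exactly one cell.

def cells_after_divisions(n_divisions, founders=1):
    """Number of cells ever produced as lineage leaves after n binary divisions."""
    return founders + n_divisions

DIVISIONS = 670        # from the caption
TERMINAL_CELLS = 558   # from the caption
CELL_DEATHS = 113      # from the caption

total = cells_after_divisions(DIVISIONS)
# Every leaf of the lineage either survives as a terminal cell or dies.
assert total == TERMINAL_CELLS + CELL_DEATHS
```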

Sequencing approaches to defining the embryonic parts list. (a) Individual embryos can be selected manually at specific stages and assayed by RNA‐seq to obtain temporal (but not spatial) expression dynamics genome‐wide. Comparing such time courses across mutants in which specific lineage identities or cell fates are lost allows the inference of lineage‐specific or cell fate‐specific expression. (b) Single‐cell RNA‐seq of all cells from individual embryos gives full transcriptome information at single‐cell resolution, and cell ID assignment is simplified by knowing the embryonic stage and which cells are present at that stage. (c) A “Whole Organism Shotgun” approach starts from mixed‐stage embryos and uses massively parallelized single‐cell RNA‐seq approaches to assay expression from tens of thousands of cells. Current methods allow dense measurement across all cells and times, but the cell ID and temporal ID for each cell must be inferred solely from the expression measurements.
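
To illustrate what inferring cell identity "solely from the expression measurements" in panel (c) can look like, the sketch below assigns a cell to the reference profile it most resembles by cosine similarity. This is a generic nearest-profile classifier chosen for illustration, not the method used in the studies the article reviews; the profile names and all numbers are invented.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length expression vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero profile carries no identity information
    return dot / (norm_u * norm_v)

def assign_identity(profile, references):
    """Return the name of the reference profile most similar to `profile`."""
    return max(references, key=lambda name: cosine(profile, references[name]))

# Hypothetical reference profiles over three marker genes (invented values).
references = {
    "typeA": [10.0, 0.0, 1.0],
    "typeB": [0.0, 8.0, 2.0],
}
# Hypothetical single-cell measurement over the same three genes.
cell = [7.0, 1.0, 0.0]

best = assign_identity(cell, references)
```

In practice such assignments would also need to account for developmental time, dropout, and batch effects, none of which this sketch models.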

Imaging approaches to defining the embryonic parts list. (a) Live imaging of fluorescent reporters for gene expression (red) along with a ubiquitous marker for cell tracking (green) allows single‐cell resolution expression profiling for each gene tested. Because each embryo has the same lineage, the fluorescence measurements for multiple genes imaged separately can be mapped onto the same reference lineage and directly compared (right). (b) Single‐molecule RNA FISH methods allow visualization of individual transcripts as diffraction‐limited spots (left). The same embryos can be imaged multiple times, with different combinations of transcripts labeled in each experiment, allowing assays of hundreds or thousands of different mRNA molecules. Once expression is measured for each gene across many embryos of different stages, new methods will be needed to segment the images, assign each mRNA molecule to the correct cell, and determine each cell's identity relative to the lineage tree, ultimately producing a single reference map with mRNA counts for each cell across time.
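
The mapping step in panel (a) works because the invariant lineage gives every cell the same name in every embryo, so measurements from separately imaged reporters can be joined on cell name. A minimal sketch under that assumption follows; the gene names, the fluorescence values, and the threshold are all invented for illustration, and the function names are hypothetical.

```python
def merge_onto_reference(reference_cells, per_gene_measurements):
    """Build {cell: {gene: value}} over a shared reference list of cell names.

    Cells that were not measured for a given gene get None, so that missing
    data is not confused with zero expression.
    """
    genes = sorted(per_gene_measurements)
    return {
        cell: {gene: per_gene_measurements[gene].get(cell) for gene in genes}
        for cell in reference_cells
    }

def coexpressing_cells(merged, gene_a, gene_b, threshold):
    """Cells in which both genes were measured at or above `threshold`."""
    hits = []
    for cell, values in merged.items():
        a, b = values.get(gene_a), values.get(gene_b)
        if a is not None and b is not None and a >= threshold and b >= threshold:
            hits.append(cell)
    return hits

# Four cells of the early embryo serve as a toy reference lineage.
reference_cells = ["ABa", "ABp", "EMS", "P2"]

# Invented reporter intensities, each from a separately imaged embryo.
per_gene_measurements = {
    "geneA": {"ABa": 120.0, "ABp": 15.0, "EMS": 0.0},
    "geneB": {"ABa": 90.0, "EMS": 200.0, "P2": 5.0},
}

merged = merge_onto_reference(reference_cells, per_gene_measurements)
shared = coexpressing_cells(merged, "geneA", "geneB", threshold=50.0)
```

Keeping unmeasured cells as None rather than 0 is a deliberate choice here: it prevents a gap in one reporter's data from being read as absence of expression when genes are compared.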
