William Greenleaf

Associate Professor of Genetics and, by courtesy, of Applied Physics

Bio

William Greenleaf is an Assistant Professor in the Genetics Department at Stanford University School of Medicine, with a courtesy appointment in the Applied Physics Department. He is a member of Bio-X, the Biophysics Program, the Biomedical Informatics Program, and the Cancer Center. He received an A.B. in physics from Harvard University (summa cum laude) in 2002, and received a Gates Fellowship to study computer science for one year at Trinity College, Cambridge, UK (with distinction). After this experience abroad, he returned to Stanford to carry out his Ph.D. in Applied Physics in the laboratory of Steven Block, where he investigated, at the single-molecule level, the chemo-mechanics of RNA polymerase and the folding of RNA transcripts. He conducted postdoctoral work in the laboratory of X. Sunney Xie in the Chemistry and Chemical Biology Department at Harvard University, where he was awarded a Damon Runyon Cancer Research Foundation Fellowship, and developed new fluorescence-based high-throughput sequencing methodologies. He moved to Stanford as an Assistant Professor in November 2011. Since beginning his lab, he has been named a Rita Allen Foundation Young Scholar, an Ellison Foundation Young Scholar in Aging (declined), a Baxter Foundation Scholar, and a Chan-Zuckerberg Investigator. His highly interdisciplinary research links molecular biology, computer science, bioengineering, and genomics to understand how the physical state of the human genome controls gene regulation and biological state. Efforts in his lab are split between building new tools to leverage the power of high-throughput sequencing and cutting-edge microscopies, and bringing these new technologies to bear on basic biological questions of genomic and epigenomic variation. His long-term goal is to unlock an understanding of the physical “regulome” (i.e., the factors that control how genetic information is read into biological instructions), which would profoundly impact our understanding of how cells maintain, or fail to maintain, their state in health and disease.

Research & Scholarship

Current Research and Scholarly Interests

Our lab focuses on developing methods to probe both the structure and function of molecules encoded by the genome, as well as the physical compaction and folding of the genome itself. Our efforts are split between building new tools to leverage the power of high-throughput sequencing technologies and cutting-edge optical microscopies, and bringing these technologies to bear against basic biological questions by linking DNA sequence, structure, and function.

Abstract

We developed an allele-specific assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq) to genotype and profile active regulatory DNA across the genome. Using a mouse hybrid F1 system, we found that monoallelic DNA accessibility across autosomes was pervasive, developmentally programmed and composed of several patterns. Genetically determined accessibility was enriched at distal enhancers, but random monoallelically accessible (RAMA) elements were enriched at promoters and may act as gatekeepers of monoallelic mRNA expression. Allelic choice at RAMA elements was stable across cell generations and bookmarked through mitosis. RAMA elements in neural progenitor cells were biallelically accessible in embryonic stem cells but premarked with bivalent histone modifications; one allele was silenced during differentiation. Quantitative analysis indicated that allelic choice at the majority of RAMA elements is consistent with a stochastic process; however, up to 30% of RAMA elements may deviate from the expected pattern, suggesting a regulated or counting mechanism.

Abstract

Aptamers are a promising class of affinity reagents because they are chemically synthesized, thus making them highly reproducible and distributable as sequence information rather than a physical entity. Although many high-quality aptamers have been previously reported, it is difficult to routinely generate aptamers that possess both high affinity and specificity. One of the reasons is that conventional aptamer selection can only be performed either for affinity (positive selection) or for specificity (negative selection), but not both simultaneously. In this work, we harness the capacity of fluorescence activated cell sorting (FACS) for multicolor sorting to simultaneously screen for affinity and specificity at a throughput of 10^7 aptamers per hour. As a proof of principle, we generated DNA aptamers that exhibit picomolar to low nanomolar affinity in human serum for three diverse proteins, and show that these aptamers are capable of outperforming high-quality monoclonal antibodies in a standard ELISA detection assay.

Abstract

Chromatin structure at the length scale encompassing local nucleosome-nucleosome interactions is thought to play a crucial role in regulating transcription and access to DNA. However, this secondary structure of chromatin remains poorly understood compared with the primary structure of single nucleosomes or the tertiary structure of long-range looping interactions. Here we report the first genome-wide map of chromatin conformation in human cells at the 1-3 nucleosome (50-500 bp) scale, obtained using ionizing radiation-induced spatially correlated cleavage of DNA with sequencing (RICC-seq) to identify DNA-DNA contacts that are spatially proximal. Unbiased analysis of RICC-seq signal reveals regional enrichment of DNA fragments characteristic of alternating rather than adjacent nucleosome interactions in tri-nucleosome units, particularly in H3K9me3-marked heterochromatin. We infer differences in the likelihood of nucleosome-nucleosome contacts among open chromatin, H3K27me3-marked, and H3K9me3-marked repressed chromatin regions. After calibrating RICC-seq signal to three-dimensional distances, we show that compact two-start helical fibre structures with stacked alternating nucleosomes are consistent with RICC-seq fragmentation patterns from H3K9me3-marked chromatin, while non-compact structures and solenoid structures are consistent with open chromatin. Our data support a model of chromatin architecture in intact interphase nuclei consistent with variable longitudinal compaction of two-start helical fibres.

Abstract

Genome conformation is central to gene control but challenging to interrogate. Here we present HiChIP, a protein-centric chromatin conformation method. HiChIP improves the yield of conformation-informative reads by over 10-fold and lowers the input requirement over 100-fold relative to that of ChIA-PET. HiChIP of cohesin reveals multiscale genome architecture with greater signal-to-background ratios than those of in situ Hi-C.

Abstract

Spatial organization of the genome plays a central role in gene expression, DNA replication, and repair. But current epigenomic approaches largely map DNA regulatory elements outside of the native context of the nucleus. Here we report assay of transposase-accessible chromatin with visualization (ATAC-see), a transposase-mediated imaging technology that employs direct imaging of the accessible genome in situ, cell sorting, and deep sequencing to reveal the identity of the imaged elements. ATAC-see revealed the cell-type-specific spatial organization of the accessible genome and the coordinated process of neutrophil chromatin extrusion, termed NETosis. Integration of ATAC-see with flow cytometry enables automated quantitation and prospective cell isolation as a function of chromatin accessibility, and it reveals a cell-cycle dependence of chromatin accessibility that is especially dynamic in G1 phase. The integration of imaging and epigenomics provides a general and scalable approach for deciphering the spatiotemporal architecture of gene control.

Abstract

Metastases are the main cause of cancer deaths, but the mechanisms underlying metastatic progression remain poorly understood. We isolated pure populations of cancer cells from primary tumors and metastases from a genetically engineered mouse model of human small cell lung cancer (SCLC) to investigate the mechanisms that drive the metastatic spread of this lethal cancer. Genome-wide characterization of chromatin accessibility revealed the opening of large numbers of distal regulatory elements across the genome during metastatic progression. These changes correlate with copy number amplification of the Nfib locus, and differentially accessible sites were highly enriched for Nfib transcription factor binding sites. Nfib is necessary and sufficient to increase chromatin accessibility at a large subset of the intergenic regions. Nfib promotes pro-metastatic neuronal gene expression programs and drives the metastatic ability of SCLC cells. The identification of widespread chromatin changes during SCLC progression reveals an unexpected global reprogramming during metastatic progression.

Abstract

Cancer sequencing studies have primarily identified cancer driver genes by the accumulation of protein-altering mutations. An improved method would be annotation independent, sensitive to unknown distributions of functions within proteins and inclusive of noncoding drivers. We employed density-based clustering methods in 21 tumor types to detect variably sized significantly mutated regions (SMRs). SMRs reveal recurrent alterations across a spectrum of coding and noncoding elements, including transcription factor binding sites and untranslated regions mutated in up to ∼15% of specific tumor types. SMRs demonstrate spatial clustering of alterations in molecular domains and at interfaces, often with associated changes in signaling. Mutation frequencies in SMRs demonstrate that distinct protein regions are differentially mutated across tumor types, as exemplified by a linker region of PIK3CA in which biophysical simulations suggest that mutations affect regulatory interactions. The functional diversity of SMRs underscores both the varied mechanisms of oncogenic misregulation and the advantage of functionally agnostic driver identification.

Abstract

A decade of rapid method development has begun to yield exciting insights into the 3D architecture of the metazoan genome and the roles it may play in regulating transcription. Here we review core methods and new tools in the modern genomicist's toolbox at three length scales, ranging from single base pairs to megabase-scale chromosomal domains, and discuss the emerging picture of the 3D genome that these tools have revealed. Blind spots remain, especially at intermediate length scales spanning a few nucleosomes, but thanks in part to new technologies that permit targeted alteration of chromatin states and time-resolved studies, the next decade holds great promise for hypothesis-driven research into the mechanisms that drive genome architecture and transcriptional regulation.

Abstract

Somatic cells can be transdifferentiated to other cell types without passing through a pluripotent state by ectopic expression of appropriate transcription factors. Recent reports have proposed an alternative transdifferentiation method in which fibroblasts are directly converted to various mature somatic cell types by brief expression of the induced pluripotent stem cell (iPSC) reprogramming factors Oct4, Sox2, Klf4 and c-Myc (OSKM) followed by cell expansion in media that promote lineage differentiation. Here we test this method using genetic lineage tracing for expression of endogenous Nanog and Oct4 and for X chromosome reactivation, as these events mark acquisition of pluripotency. We show that the vast majority of reprogrammed cardiomyocytes or neural stem cells obtained from mouse fibroblasts by OSKM-induced 'transdifferentiation' pass through a transient pluripotent state, and that their derivation is molecularly coupled to iPSC formation mechanisms. Our findings underscore the importance of defining trajectories during cell reprogramming by various methods.

Abstract

Spectacular advances in the throughput of DNA sequencing have allowed genome-wide analysis of epigenetic features such as methylation, nucleosome position and post-translational modification, chromatin accessibility and connectivity, and transcription factor binding. However, for rare or precious biological samples, input requirements of many of these methods limit their application. In this review we discuss recent advances for low-input genome-wide analysis of chromatin immunoprecipitation, methylation, DNA accessibility, and chromatin conformation.

Abstract

Transcription by RNA polymerase (RNAP) is interrupted by pauses that play diverse regulatory roles. Although individual pauses have been studied in vitro, the determinants of pauses in vivo and their distribution throughout the bacterial genome remain unknown. Using nascent transcript sequencing, we identified a 16-nucleotide consensus pause sequence in Escherichia coli that accounts for known regulatory pause sites as well as ~20,000 new in vivo pause sites. In vitro single-molecule and ensemble analyses demonstrate that these pauses result from RNAP-nucleic acid interactions that inhibit next-nucleotide addition. The consensus sequence also leads to pausing by RNAPs from diverse lineages and is enriched at translation start sites in both E. coli and Bacillus subtilis. Our results thus reveal a conserved mechanism unifying known and newly identified pause events.

Abstract

Limb-girdle muscular dystrophy primarily affects the muscles of the hips and shoulders (the "limb-girdle" muscles), although it is a heterogeneous disorder that can present with varying symptoms. There is currently no cure. We sought to identify the genetic basis of limb-girdle muscular dystrophy type 1 in an American family of Northern European descent using exome sequencing. Exome sequencing was performed on DNA samples from two affected siblings and one unaffected sibling and resulted in the identification of eleven candidate mutations that co-segregated with the disease. Notably, this list included a previously reported mutation in DNAJB6, p.Phe89Ile, which was recently identified as a cause of limb-girdle muscular dystrophy type 1D. Additional family members were Sanger sequenced and the mutation in DNAJB6 was only found in affected individuals. Subsequent haplotype analysis indicated that this DNAJB6 p.Phe89Ile mutation likely arose independently of the previously reported mutation. Since other published mutations are located close by in the G/F domain of DNAJB6, this suggests that the area may represent a mutational hotspot. Exome sequencing provided an unbiased and effective method for identifying the genetic etiology of limb-girdle muscular dystrophy type 1 in a previously genetically uncharacterized family. This work further confirms the causative role of DNAJB6 mutations in limb-girdle muscular dystrophy type 1D.

Abstract

We describe an assay for transposase-accessible chromatin using sequencing (ATAC-seq), based on direct in vitro transposition of sequencing adaptors into native chromatin, as a rapid and sensitive method for integrative epigenomic analysis. ATAC-seq captures open chromatin sites using a simple two-step protocol with 500-50,000 cells and reveals the interplay between genomic locations of open chromatin, DNA-binding proteins, individual nucleosomes and chromatin compaction at nucleotide resolution. We discovered classes of DNA-binding factors that strictly avoided, could tolerate or tended to overlap with nucleosomes. Using ATAC-seq maps of human CD4+ T cells from a proband obtained on consecutive days, we demonstrated the feasibility of analyzing an individual's epigenome on a timescale compatible with clinical decision-making.

Abstract

We developed a simple, compact microfluidic device to perform high dynamic-range digital polymerase chain reaction (dPCR) in an array of isolated 36-femtoliter microreactors. The density of the microreactors exceeded 20,000/mm^2. This device, made from polydimethylsiloxane (PDMS), allows the samples to be loaded into all microreactors simultaneously. The microreactors are completely sealed through the deformation of a PDMS membrane. The small volume of the microreactors ensures a compact device with high reaction efficiency and low reagent and sample consumption. Future potential applications of this platform include multicolor dPCR and massively parallel dPCR for next generation sequencing library preparation.

Abstract

We developed a multiplex sequencing-by-synthesis method combining terminal phosphate-labeled fluorogenic nucleotides (TPLFNs) and resealable polydimethylsiloxane (PDMS) microreactors. In the presence of phosphatase, primer extension by DNA polymerase using nonfluorescent TPLFNs generates fluorophores, which are confined in the microreactors and detected. We immobilized primed DNA templates in the microreactors, then sequentially introduced one of the four identically labeled TPLFNs, sealed the microreactors and recorded a fluorescence image after template-directed primer extension. With cycle times of <10 min, we demonstrate 30 base reads with ∼99% raw accuracy. Our 'fluorogenic pyrosequencing' offers benefits of pyrosequencing, such as rapid turnaround, one-color detection and generation of native DNA, along with high detection sensitivity and simplicity of parallelization because simultaneous real-time monitoring of all microreactors is not required.

Abstract

We present details of the design, construction, and testing of a single-beam optical tweezers apparatus capable of measuring and exerting torque, as well as force, on microfabricated, optically anisotropic particles (an "optical torque wrench"). The control of angular orientation is achieved by rotating the linear polarization of a trapping laser with an electro-optic modulator (EOM), which affords improved performance over previous designs. The torque imparted to the trapped particle is assessed by measuring the difference between left- and right-circular components of the transmitted light, and constant torque is maintained by feeding this difference signal back into a custom-designed electronic servo loop. The limited angular range of the EOM (±180°) is extended by rapidly reversing the polarization once a threshold angle is reached, enabling the torque clamp to function over unlimited, continuous rotations at high bandwidth. In addition, we developed particles suitable for rotation in this apparatus using microfabrication techniques. Altogether, the system allows for the simultaneous application of forces (approximately 0.1-100 pN) and torques (approximately 1-10,000 pN nm) in the study of biomolecules. As a proof of principle, we demonstrate how our instrument can be used to study the supercoiling of single DNA molecules.

Abstract

Transcription termination by bacterial RNA polymerase (RNAP) occurs at sequences coding for a GC-rich RNA hairpin followed by a U-rich tract. We used single-molecule techniques to investigate the mechanism by which three representative terminators (his, t500, and tR2) destabilize the elongation complex (EC). For his and tR2 terminators, loads exerted to bias translocation did not affect termination efficiency (TE). However, the force-dependent kinetics of release and the force-dependent TE of a mutant imply a forward translocation mechanism for the t500 terminator. Tension on isolated U-tracts induced transcript release in a manner consistent with RNA:DNA hybrid shearing. We deduce that different mechanisms, involving hypertranslocation or shearing, operate at terminators with different U-tracts. Tension applied to RNA at terminators suggests that closure of the final 2-3 hairpin bases destabilizes the hybrid and that competing RNA structures modulate TE. We propose a quantitative, energetic model that predicts the behavior for these terminators and mutant variants.

Abstract

Riboswitches regulate genes through structural changes in ligand-binding RNA aptamers. With the use of an optical-trapping assay based on in situ transcription by a molecule of RNA polymerase, single nascent RNAs containing pbuE adenine riboswitch aptamers were unfolded and refolded. Multiple folding states were characterized by means of both force-extension curves and folding trajectories under constant force by measuring the molecular contour length, kinetics, and energetics with and without adenine. Distinct folding steps correlated with the formation of key secondary or tertiary structures and with ligand binding. Adenine-induced stabilization of the weakest helix in the aptamer, the mechanical switch underlying regulatory action, was observed directly. These results provide an integrated view of hierarchical folding in an aptamer, demonstrating how complex folding can be resolved into constituent parts, and supply further insights into tertiary structure formation.

Abstract

Single-molecule techniques have advanced our understanding of transcription by RNA polymerase (RNAP). A new arsenal of approaches, including single-molecule fluorescence, atomic-force microscopy, magnetic tweezers, and optical traps (OTs), has been employed to probe the many facets of the transcription cycle. These approaches supply fresh insights into the means by which RNAP identifies a promoter, initiates transcription, translocates and pauses along the DNA template, proofreads errors, and ultimately terminates transcription. Results from single-molecule experiments complement the knowledge gained from biochemical and genetic assays by facilitating the observation of states that are otherwise obscured by ensemble averaging, such as those resulting from heterogeneity in molecular structure, elongation rate, or pause propensity. Most studies to date have been performed with bacterial RNAP, but work is also being carried out with eukaryotic polymerase (Pol II) and single-subunit polymerases from bacteriophages. We discuss recent progress achieved by single-molecule studies, highlighting some of the unresolved questions and ongoing debates.

Abstract

Interdisciplinary work in the life sciences at the boundaries of biology, chemistry and physics is making enormous strides. This progress was showcased at the recent Single Molecule Biophysics conference.

Abstract

Many biologically important macromolecules undergo motions that are essential to their function. Biophysical techniques can now resolve the motions of single molecules down to the nanometer scale or even below, providing new insights into the mechanisms that drive molecular movements. This review outlines the principal approaches that have been used for high-resolution measurements of single-molecule motion, including centroid tracking, fluorescence resonance energy transfer, magnetic tweezers, atomic force microscopy, and optical traps. For each technique, the principles of operation are outlined, the capabilities and typical applications are examined, and various practical issues for implementation are considered. Extensions to these methods are also discussed, with an eye toward future application to outstanding biological problems.

Abstract

We present a method for sequencing DNA that relies on the motion of single RNA polymerase molecules. When a given nucleotide species limits the rate of transcription, polymerase molecules pause at positions corresponding to the rare base. An ultrastable optical trapping apparatus capable of base pair resolution was used to monitor transcription under limiting amounts of each of the four nucleotide species. From the aligned patterns of pauses recorded from as few as four molecules, we determined the DNA sequence. This proof of principle demonstrates that the motion of a processive nucleic acid enzyme may be used to extract sequence information directly from DNA.
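
The readout idea in the abstract above can be illustrated with a toy sketch. It is not the published analysis, which aligns noisy single-molecule records; it assumes an idealized case in which each limiting-nucleotide experiment yields a clean set of pause positions along the transcript. The function name `call_transcript` and the input layout are hypothetical.

```python
def call_transcript(pauses_by_base, length):
    """Assign a base to each transcript position from idealized pause data.

    pauses_by_base maps each limiting nucleotide ('A', 'C', 'G', 'U') to the
    set of 0-based positions where pauses were seen when that nucleotide was
    scarce. A position is called only if exactly one experiment paused there;
    anything else is reported as 'N' (uncalled).
    """
    transcript = []
    for position in range(length):
        hits = [base for base, positions in pauses_by_base.items()
                if position in positions]
        transcript.append(hits[0] if len(hits) == 1 else "N")
    return "".join(transcript)


# Hypothetical, made-up pause positions for a 6-nucleotide stretch.
example_pauses = {
    "A": {0, 3},
    "C": {1},
    "G": {2, 5},
    "U": {5},      # conflicts with G at position 5, so that site is uncalled
}
print(call_transcript(example_pauses, 6))
```

Position 4 has no pause in any experiment and position 5 has conflicting pauses, so both are left uncalled in this sketch. The template DNA strand would then be inferred as the complement of the called transcript.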

Abstract

During transcription, RNA polymerase (RNAP) moves processively along a DNA template, creating a complementary RNA. Here we present the development of an ultra-stable optical trapping system with ångström-level resolution, which we used to monitor transcriptional elongation by single molecules of Escherichia coli RNAP. Records showed discrete steps averaging 3.7 ± 0.6 Å, a distance equivalent to the mean rise per base found in B-DNA. By combining our results with quantitative gel analysis, we conclude that RNAP advances along DNA by a single base pair per nucleotide addition to the nascent RNA. We also determined the force-velocity relationship for transcription at both saturating and sub-saturating nucleotide concentrations; fits to these data returned a characteristic distance parameter equivalent to one base pair. Global fits were inconsistent with a model for movement incorporating a power stroke tightly coupled to pyrophosphate release, but consistent with a Brownian ratchet model incorporating a secondary NTP binding site.

Abstract

Optical traps are useful for studying the effects of forces on single molecules. Feedback-based force clamps are often used to maintain a constant load, but the response time of the feedback limits bandwidth and can introduce instability. We developed a novel force clamp that operates without feedback, taking advantage of the anharmonic region of the trapping potential where the differential stiffness vanishes. We demonstrate the utility of such a force clamp by measuring the unfolding of DNA hairpins and the effect of trap stiffness on opening distance and transition rates.
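
As a rough illustration of why a zero-stiffness region gives a constant force, the sketch below assumes a Gaussian trapping potential, U(x) = -depth * exp(-x^2 / (2 * sigma^2)). That form is an assumption made here for illustration, not the potential measured in the paper, and all function names and parameters are hypothetical. For this model the restoring force peaks, and the differential stiffness vanishes, at a displacement of one sigma.

```python
import math


def restoring_force(x, depth=1.0, sigma=1.0):
    """Restoring-force magnitude for the assumed Gaussian potential."""
    return depth * x / sigma**2 * math.exp(-x**2 / (2 * sigma**2))


def differential_stiffness(x, depth=1.0, sigma=1.0):
    """Derivative of the restoring force with respect to displacement."""
    return (depth / sigma**2) * (1 - x**2 / sigma**2) * math.exp(-x**2 / (2 * sigma**2))


def zero_stiffness_position(depth=1.0, sigma=1.0, tol=1e-10):
    """Bisect for the displacement at which the stiffness changes sign."""
    low, high = 0.1 * sigma, 3.0 * sigma
    while high - low > tol:
        mid = 0.5 * (low + high)
        if differential_stiffness(mid, depth, sigma) > 0:
            low = mid      # still on the rising (harmonic-like) side
        else:
            high = mid     # past the force maximum
    return 0.5 * (low + high)


x0 = zero_stiffness_position(sigma=2.0)
print(x0, restoring_force(x0, sigma=2.0), restoring_force(1.01 * x0, sigma=2.0))
```

Near `x0` the force changes only to second order in displacement, so a bead held there feels a nearly constant load as the attached molecule changes length, without any feedback loop.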