The wisdom of cellular crowds

Once again, an interesting Theory Lunch talk has inspired me to write a blog post. Last Friday’s talk was from Mike White, who described (among other things) his lab’s efforts to understand the transcriptional behavior of the prolactin gene. This gene is primarily expressed in the pituitary, and controls the production of milk in breastfeeding mothers. On a cell-by-cell level, its expression is very variable in pituitary tissue; neighboring cells express the gene to very different extents. And yet the random expression patterns in individual cells together add up to a coordinated response at the tissue level. If we hope to build from an understanding of how cells behave to an understanding of how organisms behave, we need to know what underlies this kind of “wisdom of crowds” effect. And so White and colleagues set out to determine why this gene shows variable expression (Harper et al. 2011. Dynamic analysis of stochastic transcription cycles. PLoS Biology. doi:10.1371/journal.pbio.1000607) and how this expression might be coordinated on a population level.

Cell-to-cell variability in the levels of proteins and mRNAs has been much studied in bacteria, where at least two factors are likely to be important: first, key regulatory molecules may be present in the cell at very low numbers, leading to randomness in gene activation; and second, unequal partitioning of components at cell division may create additional variation that is pretty much indistinguishable from the fluctuations caused by sporadic gene activation. There have been fewer studies in eukaryotes, so far, but people are already speculating about additional sources of differences: perhaps genes are moved in and out of “transcription factories” at different times in different cells, or perhaps the differences are caused by chromatin remodeling.

Whatever the reason behind the stochasticity, the first priority in studying anything is to find a way to see and measure it. To get accurate measurements of protein production, Harper et al. used luciferase reporter genes under the prolactin promoter. In the pre-GFP days, luciferase was more widely used than it is now, and in his Theory Lunch talk White spent a few moments discussing why he thinks luciferase is due for a comeback. The reason relates to the difference in the way the light you see is generated. What you’re measuring when you look at GFP is excited fluorescence: you put in light at one wavelength (395 nm for traditional GFP) and get out light at a different wavelength (509 nm). The transformation of one wavelength into the other is accomplished by a fluorophore that the protein cleverly assembles out of amino acids. In contrast, luciferase is an enzyme that produces light by catalyzing the oxidation of luciferin, in an ATP-dependent reaction. The big advantage of luciferase, White pointed out, is that cells don’t glow of their own accord, though they do fluoresce. This means that the signal-to-noise ratio in luciferase assays is extraordinary, many-fold better than that in GFP-based assays. On the other hand, he conceded, you need to be very careful about stray sources of light in order to take advantage of the extremely low background luciferase makes possible. There are also a number of other technical issues you need to worry about, including getting the luciferin substrate into the cell. Still, if you’re looking for accurate measurements of potentially rare events, which many of you may be, the advantages of luciferase are worth considering.

Using luciferase, then, Harper et al. set out to measure cycles of transcription driven by the prolactin promoter in individual cells of a rat pituitary cell line. Live cell imaging showed dramatic oscillations in protein levels that were not synchronized in any obvious way from cell to cell. To find out what was driving these pulses, they added a second reporter gene (GFP this time), again under the prolactin promoter. The idea was that the degree of correlation between the pulses of two different reporter genes in the same cell, under the same promoter, would provide some insight into whether the pulses resulted from pulses in upstream signaling events (as they do, for example, in the p53 pulses Galit Lahav studies) or in the availability of some regulatory molecule, or from changes in the ability of the gene to be transcribed.

To compare the outputs of the two reporter genes, the authors developed a statistical model relating what they could observe (light production by proteins) to the events they wanted to analyze (production of mRNAs). The model includes terms for production and degradation of mRNA, production and degradation of protein, and activation of the enzyme (in the case of luciferase) or maturation of the fluorophore (in the case of GFP). Using this, they could look at whether there’s any correlation between the activation of the two different reporter genes in the same cell. And there isn’t. The behavior of the two genes is completely uncorrelated. This ruled out a whole raft of potential explanations, including pulses in upstream signaling or other differences in cell state, and focused the authors on the possibility that the pulses might be due to an intrinsic property of the transcription process itself.
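
For readers who like to see this sort of model written down, here is a minimal sketch of the chain of events it has to account for: the gene switches on, mRNA is made and degraded, protein is made and degraded, and the protein has to mature before it produces light. To be clear, this is not the authors' statistical model. It is a simple deterministic version of my own, and the function name, the rate constants, and the assumption that immature and mature protein decay at the same rate are all made up for illustration.

```python
def simulate_reporter(promoter_on, t_end=100.0, dt=0.01,
                      k_tx=1.0, d_m=0.5, k_tl=1.0, d_p=0.2, k_mat=1.0):
    """Euler-integrate mRNA -> immature protein -> light-producing protein.

    promoter_on: function of time (hours) returning True while the gene is on.
    All rate constants (per hour) are hypothetical illustrative values,
    not numbers taken from Harper et al.
    """
    m = p = r = 0.0  # mRNA, immature protein, mature (visible) reporter
    trace = []
    for i in range(int(round(t_end / dt))):
        t = i * dt
        on = 1.0 if promoter_on(t) else 0.0
        dm = k_tx * on - d_m * m            # transcription minus mRNA decay
        dp = k_tl * m - (d_p + k_mat) * p   # translation minus decay and maturation
        dr = k_mat * p - d_p * r            # maturation minus decay
        m, p, r = m + dm * dt, p + dp * dt, r + dr * dt
        trace.append((t + dt, m, p, r))
    return trace


# A single 4-hour "on" pulse: the visible signal keeps rising after the gene
# has switched off, which is why a model is needed to work backwards from
# light to transcription.
pulse = simulate_reporter(lambda t: t < 4.0, t_end=12.0)
peak_time = max(pulse, key=lambda row: row[3])[0]
print("visible reporter peaks at about", round(peak_time, 2), "hours")
```

The point of the sketch is the lag: what you see in the microscope is a smoothed, delayed version of what the promoter is doing, and the delay differs between luciferase and GFP. Any comparison of the two reporters has to correct for that first.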

An interesting clue to the underlying mechanism came from measurements of the length of time each gene stays on, once transcription has been activated, and the length of time it stays off, once it’s been turned off. The “on” state is unremarkable: it lasts an average of 4 hours, similar to other estimates of transcriptional burst sizes. The “off” state, however, almost always lasts at least 3 hours, with an average of 6.5 hours. What sets this minimum “off” duration? To test the possibility that chromatin remodeling is involved, Harper et al. treated the cells with trichostatin A, a drug that inhibits histone deacetylases. This led to a shorter delay in turning on for the first pulse, and gave good synchronization between the two reporter genes in the first pulse. It also increased the length of the “on” state; and the number of cells that showed transcriptional oscillations dropped dramatically. This seems like strong evidence that both the delay period and the lack of synchronization between the two reporter genes result from something to do with chromatin remodeling.

The information they gathered on the dynamics of on/off switching provided another insight. To write about this one, I need to steal part of a figure from the paper. What you see when you look at the duration of the “on” state (shown in A) is exponential decay. This is diagnostic of a random process; however long the gene has been on, it has the same chance of switching to the off state. When you look at the duration of the “off” state, however (B), it fits the black exponential curve very badly. But if you ignore the first three hours (E), then you get an excellent fit to an exponential. In other words, the process of switching to “on” is random if and only if the off state has already lasted 3 hours. This refractory period is actually what creates the appearance of oscillations in gene expression: random process | defined pause | random process looks more like an oscillation than random process | random process.
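
If you want to convince yourself of what "same chance of switching, however long it has been on" means, here is a small sketch that draws made-up dwell times and checks them. It assumes the "on" time is exponential with a mean of 4 hours, and that the "off" time is a fixed 3-hour pause followed by an exponential wait. Splitting the 6.5-hour average into 3 hours plus an exponential tail averaging 3.5 hours is my own simplification, not something taken from the paper, and the function names are invented for this example.

```python
import random


def sample_dwell_times(n, mean_on=4.0, refractory=3.0, mean_off_tail=3.5, seed=0):
    """Draw n hypothetical 'on' and 'off' durations, in hours."""
    rng = random.Random(seed)
    on = [rng.expovariate(1.0 / mean_on) for _ in range(n)]
    # Off time = fixed pause + memoryless wait (an assumed decomposition).
    off = [refractory + rng.expovariate(1.0 / mean_off_tail) for _ in range(n)]
    return on, off


def survival(samples, t):
    """Fraction of durations that last longer than t hours."""
    return sum(1 for x in samples if x > t) / len(samples)


def conditional_survival(samples, t, s):
    """Among durations that have already lasted t hours, the fraction
    that go on to last at least s hours more."""
    survivors = [x for x in samples if x > t]
    return sum(1 for x in survivors if x > t + s) / len(survivors)


on, off = sample_dwell_times(20000)
off_after_pause = [x - 3.0 for x in off]

# Memoryless: having already waited 2 hours makes no difference.
print("on :", survival(on, 2.0), conditional_survival(on, 2.0, 2.0))
# Not memoryless: the raw off times "remember" the 3-hour pause.
print("off:", survival(off, 2.0), conditional_survival(off, 2.0, 2.0))
# Memoryless again once the first 3 hours are ignored.
print("off minus 3 h:", survival(off_after_pause, 2.0),
      conditional_survival(off_after_pause, 2.0, 2.0))
```

In each printed pair, the first number is the chance of lasting 2 hours from a fresh start and the second is the chance of lasting 2 more hours after already lasting 2. For a memoryless process the two should agree, up to sampling noise. They should agree for the "on" times and for the "off" times with the pause removed, but not for the raw "off" times.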

How does the gene “know” whether it’s been off for long enough to be allowed to switch on again? We don’t know yet. The picture that comes to my mind, personally, is that of a very complicated bit of origami. This analogy will probably turn out to be entirely wrong, so don’t take it too seriously, but here it is in case it helps. Imagine the off state as a folded crane, and the on state as a flat sheet of paper. Switching between them in either direction takes many steps, and each step takes time. Perhaps it takes 1.5 hours to unfold the “off” state all the way to the “on” state, and 1.5 hours to fold it all back up. Total, 3 hours. The key point is that whatever molecular machinery does the unfolding can’t start until the crane is completely folded up again, so you can’t switch directions in the middle. The fact that trichostatin A reduced the number of cells showing oscillations is consistent with this picture: if you can’t complete the folding process and get all the way to the crane, you can’t start the second pulse.

What does all this have to do with the “wisdom of crowds” effect we set out to understand? We now know that the genes in any given population of cells can be in three different states: on but capable of turning off, off and refractory, and off but capable of turning on. Harper et al. argue that this mosaic of states provides the perfect substrate for a graded response. A brief stimulus will activate some subset of the genes that are off but capable of turning on, resulting in a modest increase in gene expression. A stimulus that goes on longer will catch some of the genes that are exiting the “off and refractory” state, and give a larger increase in gene expression. The average level of expression across the whole population is determined by the probability that a gene will turn on or off, which is modified by the strength and duration of the stimulus. So a response that is 30% of maximum is made up of 30% of the population expressing at maximum level, not 100% of the population expressing at 30% level. To me this is rather reminiscent of the Sorger lab’s finding that graded responses in apoptosis are also caused by a variable chance of cell death, not a variable level of death (after all, you can’t be 30% dead). Now that we know how to look for such situations, my bet is that we’re going to find a lot more of them.
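
To make the "30% of the population at maximum" idea concrete, here is a toy simulation of a population of independent on/off genes. Everything about it is an assumption of mine rather than something from Harper et al.: the function name is invented, each cell is treated as simply on or off, and the only effect of a "stimulus" is to shorten the random part of the off period (the part after the 3-hour pause). In a model like this, the long-run fraction of cells that are on should approach mean on time divided by mean on time plus mean off time, which for the numbers above is 4 / (4 + 6.5), or about 38%.

```python
import random


def fraction_on(n_cells, t_obs, mean_on=4.0, refractory=3.0,
                mean_off_tail=3.5, seed=0):
    """Fraction of cells whose gene is on at time t_obs (hours).

    Each cell alternates between an exponential 'on' period and an 'off'
    period made of a fixed pause plus an exponential wait. Shortening
    mean_off_tail is a hypothetical stand-in for a stronger stimulus.
    """
    rng = random.Random(seed)
    on_count = 0
    for _ in range(n_cells):
        t = 0.0
        state_on = rng.random() < 0.5  # arbitrary starting state
        while True:
            if state_on:
                dwell = rng.expovariate(1.0 / mean_on)
            else:
                dwell = refractory + rng.expovariate(1.0 / mean_off_tail)
            if t + dwell > t_obs:
                break  # the cell is still in this state when we look
            t += dwell
            state_on = not state_on
        if state_on:
            on_count += 1
    return on_count / n_cells


# Weak, intermediate, and strong hypothetical stimuli.
for tail in (10.0, 3.5, 0.5):
    print("mean random off time", tail, "h -> fraction on",
          fraction_on(5000, 200.0, mean_off_tail=tail))
```

Every individual cell in this toy is either fully on or fully off, yet the population average slides smoothly up and down as the switching probability changes. Notice also that, with a 3-hour pause that the stimulus cannot shorten, the fraction on can never reach 100% in this sketch.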