NSF funds RNA-Seq of “precocious cells”

MADISON — Anthony Gitter remembers the mental spark when listening to a recent talk about the discovery of so-called “precocious cells” — a tiny group of cells that lead a furious advance charge against infection.

“What really struck me was this small number of cells seem to randomly jump out and race through this response to an infection,” says Gitter, a Morgridge Institute for Research investigator and University of Wisconsin-Madison assistant professor of biostatistics and medical informatics. “Then they signal back to all the other cells and encourage them to respond the same way. An entire immune response depends somehow on these random cells racing ahead.”

Gitter used that inspiration as the basis for a 2016 Faculty Early Career Development Award from the National Science Foundation (NSF). Known as the CAREER award, the program supports junior faculty who exemplify the role of teacher-scholars through outstanding research, education and the integration of both endeavors. Gitter will receive approximately $900,000 over five years.

Gitter will use the award to advance a central research challenge about the dynamic nature of biology. Cellular and genetic processes unfold over time, and piecing together that cause-and-effect timeline is one of the biggest challenges in understanding complex biology.

Gitter will work to create novel time-series algorithms that improve the prediction of gene and protein interactions between cells, which is currently a daunting task. He contrasts biology data with the temporal economic data provided by the stock market. The market offers daily outputs of rich performance information recorded over nearly two centuries.

“Unlike a daily stock price, biology isn’t always uniform, and the sampling is not constant and consistent,” Gitter says. “With single cell gene expression, we have thousands of cells giving us data in an unsynchronized way, and you can sample too much from one temporal window and miss others entirely.”
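The uneven sampling Gitter describes can be illustrated with a small sketch. The cell times, window size and time range below are invented for illustration and are not data from the research; the point is only that counting cells per time window exposes both oversampled and empty windows.

```python
from collections import Counter

# Hypothetical estimated times (hours after infection) for 12 cells.
# These values are invented for illustration only.
cell_times = [0.2, 0.4, 0.5, 0.7, 0.9, 1.1, 1.3, 1.4, 1.8, 4.2, 4.6, 5.9]

window_hours = 1.0  # assumed width of each temporal window
n_windows = 6       # assumed range: 0 to 6 hours

# Count how many cells fall into each one-hour window.
counts = Counter(int(t // window_hours) for t in cell_times)
coverage = [counts.get(w, 0) for w in range(n_windows)]

# Windows with no cells are the parts of the timeline that were missed.
empty_windows = [w for w, c in enumerate(coverage) if c == 0]

print("cells per window:", coverage)
print("windows with no cells:", empty_windows)
```

In this toy data set, most cells land in the first two hours while hours two through four are not sampled at all, which is the kind of gap a time-series method for single cell data would have to tolerate.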

However, with the massive influx of data coming from technology such as single cell RNA sequencing, scientists have some powerful new tools to address the time-series challenge.

The “precocious cells” research, conducted by scientists at Harvard and the Broad Institute, used single cell RNA sequencing data to identify this surprising early-stage reaction to infection. More often, Gitter says, these powerful high-throughput experiments can obscure the question of what genetic signaling is happening when, and which processes trigger others.

“With traditional RNA seq, you often average out the responses of thousands of different cells, and you have basically washed away these fascinating and important outliers,” he says.
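A minimal sketch of the averaging effect, using made-up numbers: assume 995 cells express a gene at a baseline level and 5 “precocious” cells express it 50 times higher. The expression values and the outlier cutoff are assumptions chosen for illustration, not values or criteria from the study.

```python
from statistics import median

# Hypothetical expression of one gene across 1,000 cells (arbitrary units).
# Assumption: 995 baseline cells at 1.0 and 5 "precocious" cells at 50.0.
cells = [1.0] * 995 + [50.0] * 5

# Bulk-style measurement: a single averaged value for the whole population.
bulk_average = sum(cells) / len(cells)

# Single-cell view: flag cells far above the population median.
# The 10x cutoff is an illustrative choice, not a published criterion.
threshold = 10 * median(cells)
outliers = [i for i, value in enumerate(cells) if value > threshold]

print(f"bulk average: {bulk_average:.3f}")
print(f"cells above threshold: {len(outliers)}")
```

The averaged value stays close to the baseline, so a bulk measurement gives little hint that a handful of cells behave very differently, while the per-cell view picks them out directly.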

Gitter says he hopes to produce analysis techniques that can make better sense of temporal and dynamic information coming from massive data sets. The idea is to map out a “Point A” and “Point B” in a biological process, such as immune response, and to be able to plot when and why important actions take place over time.

Collaborations are under way with the Morgridge regenerative biology team, which is working on methods to track oscillating genes whose functions turn on and off like clockwork. The team is also actively involved in single cell RNA-seq and may provide applications for Gitter’s work.

Gitter cautions that this process has limitations, and it is very difficult to impose a narrative timeline upon information that is inherently scattershot. Each cell’s gene expression state can be thought of as a single frame in a video. “But now that we have ways to approximate what this broken movie looks like,” he asks, “how are we going to learn something from it?”

Another part of Gitter’s CAREER award focuses on creating new training programs that will reduce barriers for graduate-level and postdoctoral biologists to venture into bioinformatics, allowing them to “take ownership” of all the information their lab work is generating. Unlike many current programs, which help biologists “plug and play” through accessible software, Gitter will emphasize the concepts of “computational thinking,” and how different abstractions, modeling assumptions and methods are considered to get the best results.

Gitter says the general public also might underestimate its ability to engage in “computational thinking.” Gitter intends to work with the Morgridge Institute Discovery Outreach team to develop fun, accessible ways to simulate a bioinformatics challenge. People who enjoy games like crossword puzzles and Sudoku already have a head start.

“Sudoku solving was one of my first forays as an undergrad into artificial intelligence,” he says. “You have these chain reactions where knowing one thing rules out future possibilities, and you work your way toward some simple consistent answer. That’s what I actually do in my research when building models of cellular signaling.”

What is RNA-Seq?

Long RNAs are first converted into a library of cDNA fragments through either RNA fragmentation or DNA fragmentation. Sequencing adaptors are subsequently added to each cDNA fragment, and a short sequence is obtained from each cDNA using high-throughput sequencing technology. The resulting sequence reads are aligned with the reference genome or transcriptome and classified into three types: exonic reads, junction reads and poly(A) end-reads. These three types are used to generate a base-resolution expression profile for each gene. Nat Rev Genet 10(1):57-63 (2009)