Minding the Gaps in the Genome: An Interview with Mitch Guttman

Mitchell Guttman is a new assistant professor of biology on campus. He just arrived last month, having recently completed a fellowship at the Broad Institute of MIT and Harvard. Originally from Brooklyn, New York, Guttman received both his BS and MS degrees in 2006 from the University of Pennsylvania and completed his PhD at MIT in 2012. Since then, he has received an NIH Early Independence Award and has been included on Forbes magazine's 30 Under 30: Science and Healthcare list.

While still a graduate student at the Broad Institute, Guttman led the team that first described a special class of genes called lncRNAs (large noncoding RNAs, pronounced "link RNAs"). These pieces of genetic material fall between the genes that code for proteins and had therefore been largely overlooked. However, researchers are now finding that lncRNAs are important players in genome regulation and cellular organization. Guttman's lab at Caltech will continue to study lncRNAs: how they work, why they are needed, and what makes them special. A recent paper in Science Express reports the latest findings.

Guttman recently took a break from setting up his lab to answer a few questions.

Do you remember how you first became interested in science?

I've always kind of been interested in science, ever since high school. I had a really great chemistry teacher who recognized my love of chemistry and biology and introduced me to some researchers at Mount Sinai in New York. I started doing research there at the end of my sophomore year of high school and worked there through my senior year. It was cancer research—mostly looking at breast cancer, and migration and adhesion patterns. I was doing very basic molecular biology, and I learned a ton.

When I was an undergraduate at Penn, the person I had worked with back in high school introduced me to one of his colleagues—a pathologist at Penn who was starting to do a lot of work on cancer genomics, which I knew nothing about but which sounded very fascinating. I started working with her my freshman year. During that time, it became very clear to me that to understand this work, I had to delve into the quantitative and computational aspects. I eventually helped develop computational methods to look at cancer mutation patterns and identify the "driver" mutations in the cancer genome versus the passengers—things that just come along for the ride but don't really have a direct effect in causing cancer. At the time, there weren't any methods to do that.

When I started graduate school, I wanted to work on cancer. That's how I met Eric Lander, the director of the Broad Institute, who was my graduate advisor.

How did you end up working on lncRNAs?

At the time I joined Eric's lab, there was kind of a revolution going on in genomics. Next-generation sequencing—ultrahigh-throughput, massively parallel sequencing—was starting to come online. There were very few institutions in the world that had instruments to do this—the Broad was one, Caltech was one. These were the first instruments that allowed us to sequence DNA at unparalleled depth. Eric's lab was using these instruments to look at chromatin modifications—how DNA wraps around different proteins, or histones, in the nucleus—and they had all of this new data. It hadn't been published. So Eric said, "I bet there's something here to be found. You're a computational guy; why don't you play around with it?"

The first thing I did, as a good computational guy, was to try to figure out a good algorithm to make sense of it. Once I did, it kind of hit me in the face.

What did you find?

Until then, we hadn't been able to look at anything but genes. But when we were able to look at the whole genome, we saw all of these regions of intergenic space—things that were between genes—that looked like genes. They had chromatin modifications with patterns that looked identical to genes. That suggested that there were thousands of unannotated genes. What became clear immediately was that although they had the same patterns as protein-coding genes, they didn't code for proteins. They did not have evolutionary signatures that looked like proteins. They were very different. We called them lncRNAs.

That finding basically led me on what has now been a seven-year stretch of trying to figure out what they are, what they do, and how they work.

Had no one previously looked at histone modifications?

They had, but they were mostly looking at very specialized regions—they were looking at promoters, which are regions that control transcription, or they were looking at proteins, or they were looking at the genes themselves. But only 1 percent of the genome encodes proteins, so 99 percent is really a no-man's-land, if you will. There had been no methods to pick out and classify these patterns across the entire genome, because there had been no data.

We wrote a computer program to search for these regions. We never named the original program, but its successor was called Scripture.

Why was all of this so exciting?

It made me realize that there were in fact thousands of these large noncoding RNAs that looked like proteins but didn't act like proteins—they did something else. What that something else was, I didn't know, but it was new and unexplored and was clearly important. The potential was huge.

And as a scientist, you can't just turn away when you find something like this. You've got to figure it out. The idea was exciting: there were all these thousands of genes that had previously been missed and unappreciated that could play really important roles in ways that we didn't understand. I wanted to know how they work. What are they doing? We're still figuring it out. Every time we find something, it is more exciting than I would have anticipated. That's what I love about this: it's never been obvious; it's never been dull.

Why did you choose to come to Caltech?

Caltech's an amazing place. I love the faculty. I love the small size. I love how interactive and not overlapping but collaborative it is. No other place that I had been to was like this—this seamless—and in no place did I feel as comfortable talking with chemists and engineers as I did with biologists. The breadth of the institution and the vision and the interactions were pretty unique and exciting.