RICHLAND, Wash. –
Researchers sequencing the DNA of blue-green algae found a linear chromosome harboring genes important for producing biofuels. Simultaneously analyzing the complement of proteins revealed more genes on the linear and the typical circular chromosomes than they'd have found with DNA sequencing alone.

The team reported the cyanobacterium Cyanothece 51142's genome the week of September 15 in the Proceedings of the National Academy of Sciences Early Edition. Overlaying protein data let the researchers pinpoint about 16 percent more genes than by DNA sequencing alone. The collaboration included a proteomics team from the Department of Energy's Pacific Northwest National Laboratory, a gene sequencing team from the Washington University Genome Sequencing Center, and researchers from Washington University, Saint Louis University, and Purdue University.

"This is the first time anything like this has been found in photosynthetic bacteria. It’s extremely rare for bacteria to have a linear chromosome," said team leader Himadri Pakrasi from WUSTL. "Nearly 100 percent of them do not."

Cyanobacteria are unique among bacteria because they seem part plant-like and part microbe-like. They use the sun's energy to make sugar via photosynthesis like plants do. And like some other bacteria, Cyanothece 51142 performs key life-sustaining functions, such as converting atmospheric nitrogen into a form that other species can use. This so-called nitrogen fixation is performed by a handful of bacterial species in water and soil. Cyanothece also makes ethanol and hydrogen, activities that drew the attention of the DOE and others looking for new ways to make fuel.

But unlike most bacteria, Cyanothece has a day-night schedule for performing work. It makes sugar in the daylight, but then spends its nights breaking down that sugar to fix nitrogen and to produce different compounds. And bacteria generally store their DNA in circular chromosomes. Linear chromosomes are usually found in more complex creatures such as plants and animals.

Photosynthesis and nitrogen fixation are incompatible, leading this microbe to separate the activities both physically within the cell and temporally, via night and day. Although their work is not incompatible, scientists who sequence DNA and those who identify proteins often work in separate groups as well.

Proteomics analysis examines almost the whole complement of proteins in a cell, but requires a gene sequence with which to pair up protein shards for identification. On the other hand, DNA sequencing can't always identify potential genes or unmask which of those really function, and could benefit from knowing which proteins the cell actually makes.

Instead of waiting on one analysis to do the other, the collaborators simultaneously sequenced the bacterium's DNA and identified the proteins that the microbe produced at different times of its life cycle. They then compared the information to determine which of the DNA sequences that looked like genes actually made proteins. In this way, they could better determine where genes lie along the length of the genome, as well as find ones that might otherwise be missed.
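To make the comparison step concrete, here is a minimal sketch in Python, not the team's actual pipeline. It assumes each predicted gene has already been translated into an amino-acid sequence and that the proteomics run yields a list of peptide sequences. Real workflows match mass spectra against a database of predicted proteins and control for false identifications; the exact substring matching, the two-peptide threshold, and all names and sequences below are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class PredictedGene:
    """A gene call from DNA sequencing (hypothetical structure for illustration)."""
    gene_id: str
    protein_seq: str         # translated amino-acid sequence of the predicted gene
    has_known_homolog: bool  # True if it resembles a gene in another organism
    peptides: set = field(default_factory=set)  # observed peptides that map to it


def assign_peptides(genes, observed_peptides):
    """Attach each observed peptide to every predicted gene whose sequence contains it."""
    for peptide in observed_peptides:
        for gene in genes:
            if peptide in gene.protein_seq:
                gene.peptides.add(peptide)


def confirmed_genes(genes, min_peptides=2):
    """Return IDs of genes backed by at least `min_peptides` distinct peptides.

    The threshold of 2 is an assumption made for this sketch.
    """
    return [g.gene_id for g in genes if len(g.peptides) >= min_peptides]


# Toy data: made-up sequences, not taken from the Cyanothece 51142 genome.
genes = [
    PredictedGene("orf_A", "MKTLLVAGGSRPEWQK", has_known_homolog=True),
    PredictedGene("orf_B", "MSDNRTVYLLPAKGHE", has_known_homolog=False),
    PredictedGene("orf_C", "MQQWERTYIPASDFGH", has_known_homolog=False),
]
observed = ["LLVAGGSR", "PEWQK", "TVYLLPAK", "MSDNR", "IPASD"]

assign_peptides(genes, observed)
confirmed = confirmed_genes(genes)

# Would-be genes (no known homolog) that protein evidence promotes to functioning genes.
promoted = [g.gene_id for g in genes
            if not g.has_known_homolog and g.gene_id in confirmed]

print("confirmed:", confirmed)
print("promoted hypotheticals:", promoted)
```

In this toy run, orf_B has no recognizable relative in other organisms but is still supported by two peptides, which is the kind of evidence that would let a would-be gene be reclassified as functioning. orf_C, with a single matching peptide, stays unconfirmed under the assumed threshold.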

"This was an excellent example of using proteomics to guide initial genomic annotation," said protein chemist Jon Jacobs of PNNL. "We're helping to set a precedent if we can do the proteomics work while they're doing the genomics work."

Overall, Cyanothece 51142 carries one large circular chromosome, four small chromosomes called plasmids, and the linear chromosome. On these, the team found 2,735 genes that looked like genes in other organisms, suggesting they encode actual proteins. One important finding was that the unexpected linear chromosome was more than just a pretty face. It contained the only copy of the gene for lactate dehydrogenase, a key protein that lets the bugs produce lactate during fermentation.

DNA sequencing revealed the linear chromosome to be 430 kilobases long and to contain a cluster of nine genes that code for other enzymes involved in pyruvate metabolism. These allow Cyanothece 51142 to make ethanol, hydrogen, acetate, and other compounds. Oddly, the linear chromosome was missing some features that linear chromosomes in complex organisms display. Without obvious protective caps called telomeres, for example, Cyanothece must use an unidentified way to preserve the integrity of its linear chromosome when it reproduces.

In addition to the 2,700-plus real genes, the DNA sequence contained more than 2,500 would-be genes. These had architectural features common to genes but didn't look like recognized genes from other organisms. The team found that about 500 of these produced proteins, so the researchers reclassified them as functioning genes. Lastly, the scientists also found proteins for 38 of another 12,000 sequences that were gene longshots.

"Using proteomics, we always suspected we'd be able to detect genes not called out in the genome, but it was surprising how many hypothetical genes actually produced proteins," said Jacobs.

For the next round, additional DOE resources will enable the sequencing and analysis of the genomes of six other Cyanothece strains in a quest to find the best one to produce hydrogen.

"The goal is to find the hydrogen-producing workhorse of these seven," Pakrasi said. "Work is ongoing, and I expect in a year or so we will learn a lot more."

This work was supported by the Department of Energy's Basic Energy Sciences, part of the Office of Science, and the Danforth Foundation at Washington University. This work is also part of the Membrane Biology Scientific Grand Challenge project at EMSL.

EMSL, the Environmental Molecular Sciences Laboratory, is a national scientific user facility sponsored by the Department of Energy's Office of Science. Located at Pacific Northwest National Laboratory in Richland, Wash., EMSL offers an open, collaborative environment for scientific discovery to researchers around the world. Its integrated computational and experimental resources enable researchers to realize important scientific insights and create new technologies.

Interdisciplinary teams at Pacific Northwest National Laboratory address many of America's most pressing issues in energy, the environment and national security through advances in basic and applied science. Founded in 1965, PNNL employs 4,300 staff and has an annual budget of more than $1 billion. It is managed by Battelle for the U.S. Department of Energy's Office of Science. As the single largest supporter of basic research in the physical sciences in the United States, the Office of Science is working to address some of the most pressing challenges of our time.