The reproduction rates of the bacteria in one's gut may be a good indicator of health or disease

It is increasingly clear that the thousands of different bacteria living in our intestinal tract - our microbiome - have a major impact on our health. But the details of the microbiome's effects are still fairly murky. A Weizmann Institute study that recently appeared in Science suggests approaching this topic from a new angle: Assess how fast the various bacteria grow. This approach is already revealing intriguing links between bacterial growth rates and such conditions as type II diabetes and inflammatory bowel disease. The new computational method can illuminate a dynamic process such as growth from a static "snapshot" of a single sample, and thus it may have implications for both diagnostics and new avenues of research.

Tal Korem and David Zeevi, research students in the lab of Prof. Eran Segal of the Computer Science and Applied Mathematics Department, led this research and collaborated with Jotham Suez, a research student in the lab of Dr. Eran Elinav in the Immunology Department, and Dr. Adina Weinberger, a research associate in Segal's lab. The study began with the advanced genomic sequencing techniques used in many current microbiome studies, which sequence all of the bacterial DNA in a sample. From the short sequences, they construct a picture of the types of bacteria and their relative abundance. But the Weizmann Institute team realized that this sequencing technique held another type of information.

"The sample's bacteria are doing what bacteria do best: making copies of their genomes so they can divide," says Segal. "So most of the bacterial cells contain more than one genome - a genome and a half, for example, or a genome and three quarters." Since most bacterial strains have pre-programmed "start" and "finish" codes, the team was able to identify the "start" point as the short sequence that was most prevalent in the sample. The least prevalent, at the other end of the genome, was the DNA that gets copied last. The researchers found that analyzing the relative amounts of starting DNA and ending DNA could be translated into the growth rate for each strain of bacteria.

The group tested this formulation experimentally, first in single-strain cultures for which the growth rate could be controlled and observed, then in multiple animal model systems, and finally in the DNA sequences of human microbiomes, in their full complexity.

Their method worked even better than expected: The estimated bacterial growth rates turned out to be nearly identical to observed growth rates. "Now we can finally say something about how the dynamics of our microbiome are associated with a propensity to disease. Microbial growth rate reveals things about our health that cannot be seen with any other analysis method," says Elinav.

In their examination of human microbiome data, for example, the group found that particular changes in bacterial growth rates are uniquely associated with type II diabetes; others are tied to inflammatory bowel disease. These associations were not observed in the static microbiome "population" studies. Thus the method could be used in the future as a diagnostic tool to detect disease or pathogen infection early on, or to determine the effects of probiotic or antibiotic treatment. In addition, the scientists hope this new understanding of the microbiome will spur further research into the connections between the complex, dynamic ecosystem inside of us and our health.

Time-lapse imaging can make complicated processes easier to grasp--think of a stitched-together sequence of photos that chronicles the construction of a building. Now, scientists from the Department of Energy's Lawrence Berkeley National Laboratory (Berkeley Lab) are using a similar approach to study how cells repair DNA damage.

They developed a computerized way to measure DNA repair in thousands of human mammary epithelial cells before and after they're exposed to ionizing radiation. Microscopy images are acquired about every thirty minutes over a span of up to two days, and the resulting sequence of images shows ever-changing hotspots inside cells where DNA is under repair. The approach even tracks individual cells as they move about a petri dish, a leap in automation that has been difficult to achieve.

Their new time-lapse technique is already yielding insights into how cells repair DNA strand breaks, which is key to understanding how people respond to ionizing radiation. Scientists study DNA damage for a number of reasons, from learning how to protect astronauts from long-term exposure to cosmic rays to refining radiotherapy protocols that are designed to kill tumors.

Before this approach, researchers could track DNA repair in only about ten cells simultaneously over time. Another method tracks DNA repair in thousands of cells, but it requires removing and studying subsets of cells at different time intervals. It can't track DNA repair in the same cells over time.

"Our approach combines the best of both worlds," says Sylvain Costes of Berkeley Lab's Life Sciences Division. "We're analyzing the same cells over many hours, and we're studying thousands of them, which allows us to arrive at statistically significant findings."

At the heart of their technique are algorithms that lock onto and track individual cells as they move about a cell culture. The algorithms can also follow daughter cells that are created when cells divide. Another component of their approach is the use of human mammary epithelial cells that are modified so that a DNA repair protein, called 53BP1, is fluorescently labeled. This modification, when combined with algorithms that can analyze thousands of cells simultaneously, enables the technique to scan multiple cells and classify areas inside each one where 53BP1 proteins cluster at DNA damage sites. These clusters are called "radiation induced foci."
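The tracking step can be pictured as linking detected cell centroids between consecutive microscopy frames. The following is a toy greedy nearest-neighbour linker, not the Berkeley Lab code; the function name, the distance threshold, and the example coordinates are illustrative:

```python
import math

def link_cells(prev, curr, max_dist=20.0):
    """Greedily match each cell centroid in frame `prev` to its nearest
    unclaimed centroid in frame `curr`, within `max_dist` pixels.
    Returns a dict mapping indices in `prev` to indices in `curr`."""
    links = {}
    taken = set()
    for i, p in enumerate(prev):
        best, best_d = None, max_dist
        for j, c in enumerate(curr):
            if j in taken:
                continue
            d = math.dist(p, c)  # Euclidean distance between centroids
            if d < best_d:
                best, best_d = j, d
        if best is not None:     # unmatched cells may have divided or left
            links[i] = best
            taken.add(best)
    return links

frame1 = [(10.0, 10.0), (50.0, 50.0)]
frame2 = [(52.0, 49.0), (11.0, 12.0)]  # the same cells, slightly moved
print(link_cells(frame1, frame2))  # → {0: 1, 1: 0}
```

Real trackers must additionally handle divisions (one-to-two links), cells entering or leaving the field of view, and segmentation errors, which is why this degree of automation was hard to achieve.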

The Berkeley Lab scientists have used their cell-tracking and DNA-damage-classification algorithms to analyze human mammary epithelial cells beginning 24 hours before exposure to high and low doses of radiation, and continuing until 24 hours after exposure.

Among their findings is a newly discovered phenomenon: Although DNA damage occurs in random areas throughout a cell, the DNA repair process, as evidenced by the clustering of 53BP1 proteins, is localized in very specific regions of the cell nucleus.

"This could lead to problems," says Costes. "If the repair process is constrained to specific domains, then there is more of a chance that some of the breaks will meet and get merged together. This would increase the risk of chromosomal rearrangements, such as translocation, where pieces of chromosomes get mixed together, which is considered a precursor to cancer."

The scientists also found a big difference in how cells respond to DNA damage relative to radiation dose. Some of the differences are well known, such as the fact that high doses of radiation cause cells to stop dividing, whereas low doses don't arrest cell division. But they also found new processes. At high doses, for example, they discovered that small clusters of 53BP1 proteins merge into larger clusters. This further confirms the risk of chromosomal rearrangement at high doses.

A paper describing their imaging method, with movies of their work, was recently published in the journal PLOS ONE.

For more than a decade, gene sequencers have been improving more rapidly than the computers required to make sense of their outputs. Searching for DNA sequences in existing genomic databases can already take hours, and the problem is likely to get worse.

Recently, Bonnie Berger's group at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) has been investigating techniques to make biological and chemical data easier to analyze by, in some sense, compressing it.

In the latest issue of the journal Cell Systems, Berger and colleagues present a theoretical analysis that demonstrates why their previous compression schemes have been so successful. They identify properties of data sets that make them amenable to compression and present an algorithm for determining whether a given data set has those properties. They also show that several existing databases of chemical compounds and biological molecules do indeed exhibit them.

Given measurements for those properties, the researchers can also calculate the improvements in search efficiency that their compression techniques afford. For the data sets they analyze, search time scales sublinearly, meaning that it grows more slowly than the size of the data set.

"This paper provides a framework for how we can apply compressive algorithms to large-scale biological data," says Berger, a professor of applied mathematics at MIT. "We also have proofs for how much efficiency we can get."

The key to the researchers' compression scheme is that evolution is stingy with good designs. There tends to be a lot of redundancy in the genomes of closely related -- or even distantly related -- organisms.

That means that of all the possible sequences of the four DNA letters -- A, T, C, and G -- only a very small subset is represented by the genomes of real organisms. Moreover, within the space of possible genomes, those of real organisms are not distributed randomly. Instead, they trace out continuous patterns, which represent the relatively slow rate at which species diverge.

Birds of a feather

To make searching more efficient, the Berger group's compression algorithms cluster together similar genomic sequences -- those that diverge by only a few DNA letters -- then choose one sequence as representative of the cluster. A search can concentrate only on the likeliest clusters; most of the data never has to be examined.
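A minimal sketch of this cluster-and-search idea follows, with Hamming distance standing in for sequence divergence; the function names and parameters are illustrative, and the actual CSAIL tools are far more sophisticated:

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def cluster(seqs, radius):
    """Greedily group sequences lying within `radius` of a representative.
    Returns a list of (representative, members) pairs."""
    clusters = []
    for s in seqs:
        for rep, members in clusters:
            if hamming(rep, s) <= radius:
                members.append(s)
                break
        else:  # no nearby cluster: s becomes a new representative
            clusters.append((s, [s]))
    return clusters

def search(query, clusters, radius, max_mismatch):
    """Open only clusters whose representative is close enough.
    By the triangle inequality, a cluster whose representative is
    farther than radius + max_mismatch cannot contain a hit."""
    hits = []
    for rep, members in clusters:
        if hamming(rep, query) <= radius + max_mismatch:
            hits += [m for m in members if hamming(m, query) <= max_mismatch]
    return hits

db = ["ACGTACGT", "ACGTACGA", "TTTTGGGG", "TTTTGGGC"]
cl = cluster(db, radius=2)
print(search("ACGTACGT", cl, radius=2, max_mismatch=1))
# → ['ACGTACGT', 'ACGTACGA']
```

In this example the second cluster is skipped without examining its members, which is where the sublinear behavior comes from: the denser the clusters, the larger the fraction of the database that never has to be touched.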

If genomic data is envisioned as tracing a continuous path through a much larger space of possibilities, then the clusters can be envisioned as spheres superimposed on the data. Data points that fall within a single sphere are closely related.

Berger and her colleagues -- first authors Noah Daniels, a postdoc in her group, and William Yu, a graduate student in applied mathematics, along with David Danko, an undergraduate majoring in computational biology -- show that data sets are amenable to their compressive search techniques if they meet two criteria. The first they refer to as metric entropy. This means that the data inhabits only a small part of the larger space of possibilities.

The second is low fractal dimension. That means that the density of the data points doesn't vary greatly as you move through the data. If your search requires you to explore three spheres rather than one, it takes only three times as long -- not 10 times, or 100 times.
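Both properties can be estimated crudely on toy data. In the sketch below (illustrative names; Hamming distance again stands in for sequence divergence), the number of radius-r balls needed to cover the data plays the role of metric entropy at scale r, and comparing counts at two scales gives a rough fractal dimension:

```python
import math

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def covering_number(points, r):
    """Greedy count of radius-r balls needed to cover the data.
    The log of this count is (roughly) the metric entropy at scale r."""
    reps = []
    for p in points:
        if all(hamming(p, q) > r for q in reps):
            reps.append(p)  # p starts a new ball
    return len(reps)

def fractal_dimension(points, r1, r2):
    """Crude two-scale estimate (r1 > r2): how fast the covering
    number grows as the ball radius shrinks. Low values mean data
    density varies little as you move through the set."""
    n1 = covering_number(points, r1)
    n2 = covering_number(points, r2)
    return math.log(n2 / n1) / math.log(r1 / r2)

data = ["AAAA", "AAAT", "TTTT", "TTTA", "GGGG"]
print(covering_number(data, 1))  # → 3 (three well-separated clumps)
print(covering_number(data, 0))  # → 5 (every point is its own ball)
```

Searching then takes time proportional to the number of balls that must be opened rather than to the raw number of points, which is why these two properties are exactly what the compressive framework needs.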

In their paper, the MIT researchers analyze three data sets. Two describe proteins -- one according to their sequences of amino acids, the other according to their shape -- and the third describes organic molecules. In a separate paper, now under submission, the researchers apply the same types of analysis to DNA segments between 32 and 63 letters in length.

Time's arrow

The efficiency of their search algorithm scales sublinearly, not with the number of data points, but with the metric entropy of the data set, which is a formal measure of the continuity of the data and their sparseness, relative to the space of possibilities. Because evolution is conservative, the metric entropy of genomic data should increase as new genomes are sequenced. That is, the addition of new genomes will not, in all likelihood, add new branches to the pattern traced out in the space of possibilities; rather, it will fill in gaps in the existing pattern, increasing the metric entropy.

Many other large data sets, however, could prove to be conservative in the same way. The range of behaviors exhibited by Web users, for instance, may, relative to the entire space of possibilities, be constrained by biology, by cultural history, or both. The MIT researchers' compression techniques could thus be applicable to a wide range of data outside biology.

An international research team led by University of Otago scientists has documented prehistoric "sanctuary" regions where New Zealand seabirds survived early human hunting.

The researchers used ancient-DNA analysis, radiocarbon dating and computer modeling to reconstruct population histories for prehistoric seabirds around coastal New Zealand.

Dr Nic Rawlence, who carried out the genetic study, says the team found a very distinctive pattern, where shag/mapua (Leucocarbo chalconotus) populations from the Stewart Island region were little affected by human hunting, but mainland populations were rapidly decimated.

"There was a loss of more than 99% of their population size within 100 years of human arrival. These once heavily-hunted mainland populations now occupy only a fraction of their prehistoric range, having never really recovered," Dr Rawlence says.

The study suggests that the mainland populations survived on just a few rocky islands off the South Island's east coast.

"Interestingly, recent archaeological studies suggest that human numbers declined in the Stewart Island region around 1500 AD, a factor which seems to explain why wildlife persisted in this region," he says.

Project leader Professor Jon Waters says that scientists have long argued about the causes of prehistoric wildlife declines and extinctions--some pointing the finger at humans, and others attributing the shifts to climate change.

"By showing drastically different wildlife histories--between regions that are climatically similar--we can start to understand the major impact of prehistoric human hunting, which differed across space and time," Professor Waters says.

Supercomputer models of developing cancers reveal how tiny movements of cells can quickly transform the makeup of an entire tumour.

The models reinforce laboratory studies of how tumours evolve and spread, and why patients can respond well to therapy, only to relapse later.

Researchers used mathematical algorithms to create three-dimensional simulations of cancers developing over time. They studied how tumours begin with one rogue cell which multiplies to become a malignant mass containing many billions of cells.

Their models took into account changes that occur in cancerous cells as they move within the landscape of a tumour, and as they replicate or die. They also considered genetic variation, which makes some cells more suited to the environment of a tumour than others.

They found that movement and turnover of cells in a tumour allows those that are well suited to the environment to flourish. Any one of these can take over an existing tumour, replacing the original mass with new cells quickly - often within several months.

This helps explain why tumours are composed mostly of one type of cell, whereas healthy tissue tends to be made up of a mixture of cell types.

However, this mechanism does not entirely mix the cells inside the tumour, the team say. This can lead to parts of the tumour becoming resistant to certain drugs, enabling those cells to survive chemotherapy treatment. Cells that are not killed off by treatment can quickly flourish and repopulate the tumour as it regrows. Researchers say treatments that target small movements of cancerous cells could help to slow the progress of the disease.

The study, a collaboration between the University of Edinburgh, Harvard University and Johns Hopkins University, is published in the journal Nature. The research was supported by the Leverhulme Trust and The Royal Society of Edinburgh.

Dr Bartlomiej Waclaw, of the University of Edinburgh's School of Physics and Astronomy, who is the lead author of the study, said: "Computer modelling of cancer enables us to gain valuable insight into how this complex disease develops over time and in three dimensions."