Here is the whole group in the lab of Computational conSequences during the Spring/Summer of 2015. I’d say that this is the best group ever.

Gustavo leaves today, going back to Michoacán, Mexico, after spending his sabbatical here. Julie left a few weeks ago, also back to Michoacán. She might come back for the Spring/Summer of 2016.

The only locals are Brigitte, Kissa, Thomas, César and me. Brigitte and Kissa are honorary members: they have been in the lab for collaborative reasons, but are working towards their M.Sc. degrees with other faculty members at Laurier (Michael Suits and Geoff Horsman, respectively).

The earlier article dealt with several experimentally confirmed functional interactions determined in Escherichia coli: genes in operons, genes whose products physically interact, genes regulated by the same transcription factor (regulons), and genes coding for transcription factors and their regulated genes. In that study we found that the associations involving transcription factors tend to be much less conserved than any of the other associations studied. Our work is not the first to suggest this lack of conservation, but it is the first to compare conservation across different kinds of associations, and thus to show that those mediated by transcriptional regulation are the least conserved.

The most recent article was an expansion of the work on the association between genes coding for transcription factors and other genes. The idea was to extend the study to as many other prokaryotes as possible. But how could we determine conservation between genes coding for transcription factors and other genes without experimentally determined interactions? We knew that at least some transcription factors could be predicted from their possessing a DNA-binding domain. But what about their associations? Our prior experience has been that target genes are hard to predict even when there’s information on some characterized binding sites (sites that we like calling operators for tradition’s sake). So what to do if we have only the transcription factors? Well, to answer that we should first explain how we measured relative evolutionary conservation.

To measure evolutionary conservation we used a measure of co-occurrence called mutual information. For any two genes, the higher the mutual information, the less the observed co-occurrence looks random. Since we obtained mutual information scores for all gene pairs in the genomes we analyzed, we decided that instead of attempting something as hard as predicting operators and matching them to predicted transcription factors, we could take the top-scoring gene pair for each predicted transcription factor as representing its most conserved interaction with anything else. This allowed us to compare the most conserved interactions involving transcription factors against the conservation of other interactions. Our findings suggest that interactions involving transcription factors evolve quickly in most, if not all, of the genomes analyzed.
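To make the idea concrete, here is a minimal sketch of mutual information between two presence/absence profiles, plus a helper that picks the top-scoring partner for one gene. This is an illustration only, not the code used for the article: the function names, the gene labels, and the toy profiles are all made up, and the article's actual scoring may differ in its details.

```python
from collections import Counter
from math import log2


def mutual_information(profile_a, profile_b):
    """Mutual information (in bits) between two presence/absence profiles.

    Each profile is a sequence of 0/1 values, one per genome, in the same
    genome order for both genes.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    n = len(profile_a)
    joint = Counter(zip(profile_a, profile_b))
    count_a = Counter(profile_a)
    count_b = Counter(profile_b)
    mi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * log2(n_ab * n / (count_a[a] * count_b[b]))
    return mi


def best_partner(gene, profiles):
    """Return (partner, score) for the gene with the highest MI against `gene`."""
    scores = {
        other: mutual_information(profiles[gene], profile)
        for other, profile in profiles.items()
        if other != gene
    }
    partner = max(scores, key=scores.get)
    return partner, scores[partner]


# Toy profiles (hypothetical): one column per genome, 1 = present, 0 = absent.
toy_profiles = {
    "tf1":   (1, 1, 0, 0, 1, 0),
    "geneA": (1, 1, 0, 0, 1, 0),  # always found together with tf1
    "geneB": (1, 0, 1, 0, 1, 0),  # only partly tracks tf1
}

partner, score = best_partner("tf1", toy_profiles)
print(partner, round(score, 3))
```

In this toy set, geneA mirrors tf1 in every genome, so it comes out as the top-scoring partner. With real data the profiles would span hundreds of genomes, and the top score for each predicted transcription factor would stand in for its most conserved interaction.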

The article twists the normal use of phylogenetic profiles, which is that of predicting functional interactions. The idea behind phylogenetic profiles is that if we observe that two genes co-occur, their products might work together. What does this mean? Well, to co-occur means for both genes to appear in the same genomes, and for both to be absent whenever either one is absent. A most excellent idea. A most difficult one to use for actual predictions. OK then, hard to use for predictions? Why? Not sure, but, for starters, we can see that genes that work together in one organism do not co-occur that much across organisms. So I thought, maybe functional interactions are not well conserved. Maybe partners in functional crime are exchanged with ease. How would we know? Well, maybe if we looked at the phylogenetic profiles of collections of genes whose products functionally interact we could see something of a rate of exchange. Maybe the rates would be difficult to estimate, so what about comparing against the whole background of co-occurrence? What about finding some “gold standards”? … and that was like a eureka moment. What about comparing different kinds of interactions in terms of their conservation? So, I tried a few, and lo and behold, interactions via co-regulation (regulons) looked worse than a “gold negative,” namely transcription unit boundaries (adjacent genes on the same strand, but in different transcription units).
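For anyone who has not seen phylogenetic profiles before, the definition of co-occurrence above can be sketched in a few lines. Everything here is hypothetical: the genome names, the gene labels, and the two helper functions are invented for illustration, and the simple agreement fraction is a stand-in for intuition, not the score used in the article.

```python
def build_profiles(genomes):
    """Map each gene to a presence/absence tuple across genomes.

    `genomes` maps a genome name to the set of genes found in it. Genomes
    are ordered alphabetically so that all profiles share the same columns.
    """
    names = sorted(genomes)
    all_genes = set().union(*genomes.values())
    return {
        gene: tuple(int(gene in genomes[name]) for name in names)
        for gene in all_genes
    }


def cooccurrence_fraction(profile_a, profile_b):
    """Fraction of genomes where both genes are present or both are absent."""
    matches = sum(a == b for a, b in zip(profile_a, profile_b))
    return matches / len(profile_a)


# Toy data (hypothetical): four genomes and the genes each one carries.
toy_genomes = {
    "genome1": {"geneA", "geneB"},
    "genome2": {"geneA", "geneB", "geneC"},
    "genome3": {"geneC"},
    "genome4": set(),
}

profiles = build_profiles(toy_genomes)
print(profiles["geneA"], profiles["geneB"], profiles["geneC"])
print(cooccurrence_fraction(profiles["geneA"], profiles["geneB"]))
print(cooccurrence_fraction(profiles["geneA"], profiles["geneC"]))
```

Here geneA and geneB show up and vanish together in every genome, which is the pattern a phylogenetic profile prediction hopes for, while geneA and geneC agree in only half of the genomes.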

So there you go. The most surprising result was the low levels of conservation for interactions mediated via regulons. The best part was that the most conserved interactions were those among genes found in the same transcription unit (in operons). Why best? Because a lot of my research has been about using operon predictions for predicting networks of functional interactions. Since these interactions are the most conserved, we might expect them to be the most useful to infer functional interactions. Right? Well, maybe. Still lots of research needed. I hope you enjoy the article.