The postdoc gets back from his Christmas break tomorrow, and I plan to have a revised version of his uptake-bias manuscript waiting for him. I've been working through the last part of the Results, and it's looking pretty good. Not finished of course, as there are still two analyses to add. One is the experimental test of the positional interactions predicted by his analysis, comparing uptake of DNA fragments containing single or double mismatches from the consensus USS. I think he's going to do this experiment as soon as he gets back. The other is analysis of out-of-alignment uptake sequences in fragments that were taken up despite lacking a good in-alignment uptake sequence. This issue arises because some fragments contained small insertions or deletions that caused their uptake sequence to be misaligned in the original analysis, and some other fragments may have substitutions that created an uptake sequence at a new location in the fragment. There aren't very many of these out-of-alignment uptake sequences but sorting them out helps clarify some other issues. He's already done the analysis (at least most of it) but we still need to incorporate it into the Results.

The bigger problem is what to say in the Discussion section. Right now it's a shambles, with lots of interesting points and good sentences and well-written paragraphs all jumbled together. I need to get a better perspective on this - to think about what actually should be discussed.

One place to start is the Introduction. Issues we raised there should be addressed and ideally resolved in the Discussion. Unfortunately, our lovely work doesn't really resolve these.

Well, so much for working on the Discussion... I'm now back to asking myself (and, in absentia, the postdoc) questions about how we interpret his analysis of interaction effects: Do interaction effects explain the discrepancy between the genomic motif and the uptake motif? Do interaction effects support the hypothesis that uptake bias is intrinsic to the mechanism of uptake, and not the effect of a single dedicated recognition protein? There are more questions, which I've added to his lovely figure so I remember to ask him about them tomorrow.

And now, I've got two manuscript reviews (both overdue) and a book review to do.

Just before he left for a brief Christmas vacation the postdoc did a detailed analysis of the genomic uptake sequences identified by (i) the genomic USS motif identified by the Gibbs Motif Sampler and (ii) the DNA uptake motif identified by his sequencing experiment. The two motifs look quite different, and if we applied them both to the same long random DNA sequences we would expect them to pick out different sub-sequences that more-or-less correspond to each motif. But what will they identify in the H. influenzae genome?

We expect the sequences picked out by the genomic USS motif to resemble the search motif (because that's how the motif was identified in the first place). But what will the uptake motif find? It's much simpler, so will it find mainly sequences that just have the four-base inner core GCGG motif?

The analysis is done by sliding the motif across the genome, at each position using the motif to calculate a score for the 32 bases lined up with the motif. This is done with each strand of the 1,830,138 bp genome, so a total of 3,660,276 scores are generated with each motif. The postdoc then plotted a histogram of the scores for each motif.
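Here is a minimal Python sketch of this kind of sliding-window scan. It is not the postdoc's actual script: the log-odds scoring, the pseudocount, the uniform background and the wrap-around at the end of the sequence are all assumptions made for illustration. The wrap-around is just one way to get exactly one score per position per strand, as in the totals above. The function names are made up, and the toy motif and sequence are not real data.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def log_odds_matrix(prob_matrix, background=0.25, pseudo=1e-3):
    """Convert per-position base probabilities into log2-odds scores.

    The pseudocount (an assumption) keeps zero probabilities finite.
    """
    return [
        {b: math.log2((col[b] + pseudo) / (1 + 4 * pseudo) / background)
         for b in BASES}
        for col in prob_matrix
    ]

def score_strand(seq, scores):
    """Score every start position of one strand, treating it as circular."""
    width = len(scores)
    extended = seq + seq[:width - 1]  # wrap so every position gets a full window
    return [
        sum(scores[i][extended[start + i]] for i in range(width))
        for start in range(len(seq))
    ]

def score_genome(seq, prob_matrix):
    """Return (forward-strand scores, reverse-strand scores)."""
    scores = log_odds_matrix(prob_matrix)
    return score_strand(seq, scores), score_strand(reverse_complement(seq), scores)

def column(preferred):
    """Toy motif column that strongly prefers one base (hypothetical values)."""
    return {b: (0.97 if b == preferred else 0.01) for b in BASES}

# Toy 4-position motif preferring GCGG, scanned over a toy 12-base sequence.
toy_motif = [column(b) for b in "GCGG"]
toy_genome = "AAGCGGAATTTT"
forward, reverse = score_genome(toy_genome, toy_motif)
best_forward_start = forward.index(max(forward))
```

With a real 32-position motif and the full genome sequence, the two returned lists together would hold one score for every position on each strand, and a histogram of them would be the plot described here.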

At this resolution it's no different than you would get for a random DNA sequence. But if we zoom in on the bottom right corner of each graph, we see little blips of about 2000 high-scoring positions. As expected, the sequences of the 1793 positions in the genomic-motif scoring blip give a motif that looks just like the genomic motif we searched with. Unexpectedly, the sequences of the 1892 positions found with the much simpler uptake motif also give a motif a lot like the genomic motif, much more complex than the uptake motif. In fact, the two searches found mostly the same positions; 1689 of the positions in each blip were also present in the other blip.
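To put numbers on "mostly the same positions", here is a small Python sketch that computes the overlap between two hit lists. The position lists are stand-ins constructed only to reproduce the three counts given above (1793, 1892 and 1689 shared); they are not the real genome coordinates, and the function name is hypothetical. In a real comparison each hit would need to be recorded the same way in both lists (for example as a strand plus a start coordinate).

```python
def overlap_summary(hits_a, hits_b):
    """Summarize how much two collections of hit positions overlap."""
    a, b = set(hits_a), set(hits_b)
    shared = len(a & b)
    return {
        "shared": shared,
        "fraction_of_a": shared / len(a),
        "fraction_of_b": shared / len(b),
        "jaccard": shared / len(a | b),
    }

# Stand-in positions chosen only so the counts match the post.
genomic_hits = range(0, 1793)           # 1793 positions
uptake_hits = range(104, 104 + 1892)    # 1892 positions, 1689 of them shared

summary = overlap_summary(genomic_hits, uptake_hits)
for name, value in summary.items():
    print(name, round(value, 3))
```

By these counts about 94% of the genomic-motif hits were also found by the uptake motif, and about 89% of the uptake-motif hits were also found by the genomic motif.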

One of the explanations we were considering for the differences between the two motifs is that the Gibbs Motif Sampler might have unrecognized biases that caused the sequences it identified to not be properly representative of the sequences in the genome. (The most likely candidate is the way we specified the search frame for the Gibbs analysis.) We were going to test this possibility by simulating the evolution of some genomes using each of the motifs in turn, and then test whether Gibbs searches of these evolved genomes gave the original motifs.

But this new result tells us that this possibility is not the explanation for the discrepancy between the two motifs. The uptake sequences in the genome really do look like the full genomic motif, even though the bias of the uptake machinery only cares strongly about the four inner-core bases. I confess that I like this result partly because it saves me from having to run a bunch of USS-evolution simulations to generate sequences for Gibbs analysis.

We suggest three other explanations. First, the steps leading from uptake to recombination might have sequence biases, so that only sequences with the complex motif efficiently recombine. Second, there might be functional constraints on the sequences after they've recombined, so that the complex ones are more likely to become fixed in the population. Although it's certainly likely that some sequence biases and functional constraints do exist, to me it seems very unlikely that they would generate such a complex motif. Thus I prefer our final possibility, that the uptake motif produced by the data is incomplete because it neglects the effects of interactions between the different positions that contributed to uptake (that is, because it incorrectly assumes that each base in the motif acts independently of the others).
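One way to make the independence assumption concrete: if positions act independently, the relative uptake of a fragment with two mismatches should be the product of the relative uptakes of the two single-mismatch fragments. The Python sketch below measures the deviation from that expectation on a log2 scale. This is only an illustration of the idea, not necessarily how the postdoc's interaction analysis is calculated, and the uptake values are invented.

```python
import math

def interaction_log2(single_a, single_b, double):
    """log2 deviation of a double mismatch from the independent expectation.

    All inputs are uptake relative to the consensus fragment (consensus = 1.0).
    Zero means the two positions behave independently; a negative value means
    the double mismatch is taken up worse than independence predicts.
    """
    expected = single_a * single_b
    return math.log2(double / expected)

# Invented example: each single mismatch halves uptake.
independent = interaction_log2(0.5, 0.5, 0.25)    # double matches the product
synergistic = interaction_log2(0.5, 0.5, 0.0625)  # double is 4-fold worse
```

A motif built from single-position frequencies alone cannot represent the second case, which is the sense in which it would be incomplete.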

We then go on to describe the interaction analysis we've done and the tests we've made (well, the postdoc's about to make) using defined sequences.

The postdoc and I are back at it yet again, working on his paper about the sequence specificity of DNA uptake. I'm beginning to think there's something pathologically wrong, either with us or with this piece of research, because we never seem to get closer to finishing it. Instead, we just keep discovering more analyses that need to be done. (The part that's done gets better and better, but we seem to be no closer to submission.)

This time it's that we need a more rigorous comparison of the uptake-specificity motif his data has produced with the old 'genomic' motif we derived by analyzing the genome with the Gibbs Motif Sampler. Both motifs consist of numbers representing the probability of finding each of the four bases (A, G, C, T) at each position in a 32 bp segment. We've been saying and writing that, although these motifs have the same consensus, they are very different in the importances they ascribe to different positions. We have a list of four possible explanations for the differences, but before we discuss these we need to test whether the motifs actually pick out different subsets of the genome. Maybe all of the ~2500 sequences that would be found by searching for the genomic motif would also be found by searching for the less-constraining uptake motif. If so, we might then focus on what other sequences the uptake motif found, or, if it didn't find any, why not.
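One common way to quantify the 'importance' a motif gives to each position is its information content in bits, which is what the letter heights in a sequence logo show. The Python sketch below assumes a uniform background of 0.25 per base (a real analysis might use the genome's own base composition instead) and uses invented columns, not values from either of our motifs.

```python
import math

def information_content(column):
    """Information content in bits of one motif column (uniform background).

    `column` maps each base to its probability at that position.
    0 bits means any base is equally acceptable; 2 bits means one base is required.
    """
    entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
    return 2.0 - entropy

# Invented columns for illustration.
no_preference = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
fixed_base = {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0}
two_bases = {"A": 0.5, "C": 0.0, "G": 0.0, "T": 0.5}

profile = [information_content(c) for c in (no_preference, fixed_base, two_bases)]
```

Two motifs with the same consensus can have very different profiles across their 32 positions, which is the kind of difference described here.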

Yesterday I made 8 preps of competent cells, all to further our phenotyping of our new competence-gene mutants. Four of them needed to have transformation assays done, three were to be frozen for later DNA-uptake assays by the postdoc*, and one was a knockout of the competence-regulator sxy, to be used as a negative control in the uptake assays.

I didn't include a wild-type positive control strain for my transformation assays, because I've done this lots of times before. But I did assay the sxy mutant, just to confirm that I had the right strain. The assay is simple: mix 1 ml of competent cells (~10^9 cells) with 1 µg of NovR chromosomal DNA, incubate for 15 min, add 10 µg DNase I, incubate for 5 min, dilute and plate on plain sBHI agar and on sBHI agar with 2.5 µg novobiocin/ml.

Most of the strains I assayed had been tested before, but some with not-very-consistent results, and I was expecting to see a wide range of transformation frequencies (from a maximum of about 5 x 10^-3 down to less than 10^-8, the detection limit). Because I wasn't sure what I would find, I made a point of plating 100 µl of undiluted culture from each assay. BUT, there were absolutely no NovR colonies on any of the plates.
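For anyone unfamiliar with how these frequencies are calculated, here is a minimal Python sketch. The transformation frequency is NovR colony-forming units per ml divided by total colony-forming units per ml. The colony counts and dilutions below are hypothetical, and the function names are made up. Reporting a plate with no colonies as an upper bound, calculated as if one colony had been seen, is one common convention and is assumed here.

```python
def cfu_per_ml(colonies, fold_dilution, volume_plated_ml=0.1):
    """Colony-forming units per ml of the undiluted culture."""
    return colonies * fold_dilution / volume_plated_ml

def transformation_frequency(novr_colonies, novr_fold_dilution,
                             total_colonies, total_fold_dilution,
                             volume_plated_ml=0.1):
    """Return (frequency, is_upper_bound).

    With zero NovR colonies the frequency is an upper bound,
    calculated as if a single colony had been seen.
    """
    total = cfu_per_ml(total_colonies, total_fold_dilution, volume_plated_ml)
    is_upper_bound = novr_colonies == 0
    counted = 1 if is_upper_bound else novr_colonies
    novr = cfu_per_ml(counted, novr_fold_dilution, volume_plated_ml)
    return novr / total, is_upper_bound

# Hypothetical plate counts: 100 colonies on plain sBHI from a 10^6-fold
# dilution, and no colonies on novobiocin from 100 µl of undiluted culture.
limit, limit_is_bound = transformation_frequency(0, 1, 100, 1e6)

# Hypothetical transforming strain: 50 NovR colonies from a 100-fold dilution.
freq, freq_is_bound = transformation_frequency(50, 100, 100, 1e6)
```

With about 10^9 cells per ml, plating 100 µl of undiluted culture puts about 10^8 cells on the plate, which is where the 10^-8 detection limit comes from.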

So I'm pretty sure I screwed something up. But what? I used the same DNA stock tube I've used many times before, and I definitely remember putting 3 µl of DNA into each assay tube. I made fresh sBHI + novobiocin plates using pre-made BHI agar, and I definitely remember adding the hemin (4 ml), NAD (80 µl) and novobiocin (40 µl) to the melted agar before I poured the plates. The DNase I should be fine; I've used this tube before. And the cells aren't dead, as the plain sBHI plates had the expected numbers of colonies. Oh how I wish I'd included the positive control! Luckily I froze one tube of competent culture of each of the strains I transformed, 'just in case', so I can redo the transformations without having to make new competent preps.

To check if I somehow screwed up the agar plates despite my 'definite' memory, I've streaked a test known NovR strain on them, with and without more NAD or hemin. Before I go home I should set up some overnight cultures of the strains I'm going to test tomorrow... Wait, will I have time to do this tomorrow? It takes time and planning to get the cells into the right growth stage for the competence treatment, and then a couple of hours for competence development and the transformation assays. I have a meeting in the middle of the day, and then we're going to finalize the grades for the big genetics course... Yes, sure, I can always work late.

* The postdoc is getting interesting results from these assays. First, all of the mutants he's tested, even those lacking genes thought to be essential for DNA uptake, bind/take up at least fourfold more DNA than the noncompetent log-phase cells he's using as a negative control. The competent sxy- cells I've just made won't induce any competence genes in this medium, so the amount of DNA they associate with will tell us whether this binding is just a property of cells that have been incubated in the starvation medium, or represents induction of some competence genes.

For a couple of the mutants, he's found much higher DNA association levels than we expected. One mutant should lack the secretin pore through which the pseudopilus contacts the DNA; the other lacks PilF2, an 'accessory' pilin that's absolutely required for transformation, presumably because it's absolutely required for DNA uptake. The next step is to repeat the DNA uptake assays, this time comparing treatments with and without DNase I in the wash step. If the mutant cells are binding to DNA but not taking it up, the DNase I should remove the DNA.

UPDATE: My novobiocin plates had no NovR colonies because I had forgotten to add the required hemin supplement to the agar! How embarrassing - I haven't made that mistake in years.

ABSTRACT: Life requires a key set of chemical elements to sustain growth. Yet, a growing body of literature suggests that microbes can alter their nutritional requirements based on the availability of these chemical elements. Under limiting conditions for one element microbes have been shown to utilize a variety of other elements to serve similar functions often (but not always) in similar molecular structures. Well-characterized elemental exchanges include manganese for iron, tungsten for molybdenum and sulfur for phosphorus or oxygen. These exchanges can be found in a wide variety of biomolecules ranging from protein to lipids and DNA. Recent evidence suggested that arsenic, as arsenate or As(V), was taken up and incorporated into the cellular material of the bacterium GFAJ-1. The evidence was interpreted to support As(V) acting in an analogous role to phosphate. We will therefore discuss our ongoing efforts to characterize intracellular arsenate and how it may partition among the cellular fractions of the microbial isolate GFAJ-1 when exposed to As(V) in the presence of various levels of phosphate. Under high As(V) conditions, cells express a dramatically different proteome than when grown given only phosphate. Ongoing studies on the diversity and potential role of proteins and metabolites produced in the presence of As(V) will be reported. These investigations promise to inform the role and additional metabolic potential for As in biology. Arsenic assimilation into biomolecules contributes to the expanding set of chemical elements utilized by microbes in unusual environmental niches.

The work it describes is new, as it was done in John Tainer's lab at Lawrence Berkeley. Unfortunately there's not much meat. That's not surprising, since poster abstracts typically have to be submitted months in advance and the deadline for AGU seems to have been August 4. I can't find any tweets or other information about this poster - did anyone see it?

My collaborators have taken pity on me and sent me some of their control analysis data. This is mass-spectrometry analysis of a control DNA sample I sent several months ago.

The GFAJ-1 cells this DNA was purified from were grown in medium without arsenic, so we don't expect to find any arsenic in the DNA. This DNA was just used to test the methods they will use, but it also provides some measure of the purity of the DNA I sent them. That's because this DNA hasn't been put through the CsCl-density-gradient purification step that they'll use for the new DNA samples I sent a few weeks ago.

I'm going to have to put in a bit of work before I have any idea of what I'm looking at. But I'm not complaining - this is exactly the part of science I like best.

I've just gotten an update from my collaborators. They're still polishing up their purification methods, and waiting for the mass spectrometer machine-time that will open up when other researchers take time off over Christmas.

So with luck we'll have a preliminary answer in a few weeks. Then we'll have to get busy generating the final data and writing our paper!