3. GTA particles come in two sizes. Small particles contain 4 kb DNA fragments. The hypothetical large particles contain fragments that must be at least 14 kb (the size of the GTA gene cluster) but could be as big as 50 kb.

4. The number of GTA particles a cell produces does not depend on the proportion of small and large particles.

5. DNA packaging by GTA is random; all parts of the cell’s genome are equally represented. But in this model we only consider the particles containing the full-length GTA cluster.

6. This is the killer: If the cell’s chromosome is 5 Mb and the large-particle capacity is 15 kb, only 2x10^-4 of large particles will contain complete GTA gene clusters (will be G+ particles). If we change the large-particle capacity to 20 kb, then about 1x10^-3 of large particles will contain a complete cluster. A 50 kb capacity and a 3 Mb chromosome would probably get it up to about 10^-2. (And this ignores the recombination machinery’s need for homologous DNA flanking the GTA cluster to promote recombination.)

Transduction:

7. GTA- cells completely lack the main GTA gene cluster. They can only be converted to GTA+ by homologous recombination with GTA-containing DNA from G+ particles.

8. GTA particles cannot tell the difference between GTA+ and GTA- recipients. Particles capable of transducing GTA- cells to GTA+ can also ‘transduce’ GTA+ cells to GTA+.

9. All GTA particles produced in one cycle are taken up by and transduce cells in that cycle. (The efficiency of infection and recombination is 1.)

10. The model ignores large and small GTA particles that don’t transduce GTA+.

11. Each cell takes up only one G+ particle (or none). This is reasonable, since the number of G+ particles is always going to be much smaller than the number of cells.

b: Number of GTA particles produced by each burst. Default value is 100. (We have no actual measurements.)

µ: Fraction of GTA particles that are large. (We expect this fraction to be small, since large particles have not been observed.)

T: Fraction of large GTA particles that are G+ particles (able to transduce GTA). (This is limited by genome size, GTA gene cluster size, and the DNA capacity of these hypothetical particles. Plausible values are between 10^-2 and 10^-4.)

G = µ * T: Fraction of GTA particles that contain complete GTA genes.

What happens in one generation:

GTA production and cell lysis:

N: Proportion of GTA particles to cells remaining in the medium after GTA+ cells have burst.

N = (Fcb)/(1 - Fc) (Note: Fcb is the GTA production per original cell. 1 - Fc normalizes this to the number of cells remaining after lysis.)

N+: Proportion of GTA particles, per remaining cell, that carry the complete GTA gene cluster (are ‘G+’ particles able to transduce the GTA-production genotype to GTA- cells).
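The particle bookkeeping above can be sketched in a few lines of Python. This is only an illustration: F and c are not defined in this part of the description, so the sketch assumes F is the frequency of GTA+ cells and c is the fraction of GTA+ cells that produce GTA and burst. It also assumes N+ = N * G, which is a reading of the definitions rather than a formula given here. The function names and the example values are hypothetical.

```python
def particles_per_remaining_cell(F, c, b=100):
    """N = (F*c*b) / (1 - F*c): GTA particles per cell remaining after lysis.

    Assumed meanings: F = frequency of GTA+ cells, c = fraction of GTA+
    cells that burst, b = particles released per burst.
    """
    lysed = F * c  # fraction of all cells that burst this generation
    if not 0 <= lysed < 1:
        raise ValueError("F*c must be in [0, 1)")
    return lysed * b / (1 - lysed)


def g_plus_per_remaining_cell(F, c, mu, T, b=100):
    """N+ = N * G with G = mu * T (assumed form, see the text above)."""
    G = mu * T  # fraction of all particles carrying the complete cluster
    return particles_per_remaining_cell(F, c, b) * G


# Illustrative values only; none of these are measurements.
F, c, mu, T = 0.5, 0.1, 0.01, 1e-3
print("N  =", particles_per_remaining_cell(F, c))
print("N+ =", g_plus_per_remaining_cell(F, c, mu, T))
```

Even with a generous µ, N+ comes out far below one G+ particle per cell, which is the regime the later assumptions rely on.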

If the culture were more dilute or poorly mixed, some GTA particles would not find new cells to attach to. This would reduce the amount of transduction (effectively reducing W).

GTA production:

3. GTA particles come in two sizes. Small particles contain 4 kb DNA fragments. The hypothetical large particles contain fragments that must be at least 14 kb (the size of the GTA gene cluster) but could be as big as 50 kb.

This is the central assumption of the model. The size of the small particles is known. The hypothesized large particles could be as small as 15 kb (which allows a bit of homologous sequence on each side of the cluster to promote recombination). Phage capsids can in principle be very large, but it’s parsimonious to assume a modest size.

4. The number of GTA particles a cell produces does not depend on the proportion of small and large particles.

Large capsids will require more
capsid protein molecules.

5. DNA packaging by GTA is random; all parts of the cell’s genome are equally represented. But in this model we only consider the particles containing the full-length GTA cluster.

Experimental results show slightly less packaging of GTA sequences. If this applies to the hypothetical large particles it would reduce production of G+ particles. If particles preferentially package GTA, GTA would be a phage.

6. This is the killer: If the cell’s chromosome is 5 Mb and the large-particle capacity is 15 kb, only 2x10^-4 of large particles will contain complete GTA gene clusters (will be G+ particles). If we change the large-particle capacity to 20 kb, then about 1x10^-3 of large particles will contain a complete cluster. A 50 kb capacity and a 3 Mb chromosome would probably get it up to about 10^-2. (And this ignores the recombination machinery’s need for homologous DNA flanking the GTA cluster to promote recombination.)

See point 3 above.
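The fractions quoted in point 6 follow from simple geometry, which the sketch below reproduces. It assumes, as a simplification, that every large particle packages a fragment of exactly the capacity length, starting at a uniformly random position on a circular chromosome. A fragment then contains the whole 14 kb cluster only if it starts within the (capacity - cluster) kb window just upstream of the cluster. The function name and arguments are illustrative.

```python
def complete_cluster_fraction(genome_kb, capacity_kb, cluster_kb=14.0):
    """Fraction of random fragments of length capacity_kb that contain the whole cluster.

    Assumes fragment start positions are uniform on a circular chromosome.
    """
    if capacity_kb < cluster_kb:
        return 0.0  # fragment is too short to hold the cluster at all
    return (capacity_kb - cluster_kb) / genome_kb


# The three scenarios from point 6: (genome size in kb, particle capacity in kb)
for genome_kb, capacity_kb in [(5000, 15), (5000, 20), (3000, 50)]:
    frac = complete_cluster_fraction(genome_kb, capacity_kb)
    print(f"{genome_kb / 1000:.0f} Mb genome, {capacity_kb} kb capacity: {frac:.1e}")
```

Requiring homologous flanking DNA on each side would amount to passing a larger cluster_kb, which shrinks the window further.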

Transduction:

7. GTA- cells completely lack the main GTA gene cluster. They can only be converted to GTA+ by G+ particles.

Transduction depends on homologous recombination. Small GTA particles can transduce functional alleles of individual GTA genes, replacing versions that became mutated or even deleted in an ancestor of the recipient cell. But they cannot introduce GTA genes into cells that completely lack the GTA cluster, because there will be no homologous sequences to recombine with.

8. GTA particles cannot tell the difference between GTA+ and GTA- recipients. Particles capable of transducing GTA- cells to GTA+ can also ‘transduce’ GTA+ cells to GTA+.

I think some phages and conjugative plasmids may be able to detect whether potential hosts/recipients already have the element, but we have no evidence that transduction frequencies differ between GTA+ and GTA- recipients. Wall et al. (1975) surveyed 33 strains and found wide variation in both GTA production and transduction, but no correlation between these abilities.

9. All GTA particles produced in one cycle are taken up by and transduce cells in that cycle. (The efficiency of infection and recombination is 1.)

This is unlikely to be true, but assuming
this increases the chance that each G+ particle successfully transduces a GTA-
cell to GTA+.

If we were to relax this assumption the
model would need to include an explicit uptake process and to specify what
happens to particles that are not taken up.

10. The model ignores large and small
GTA particles that don’t transduce GTA+.

This should be OK, since these should not interfere with transduction by G+ particles, especially because their total number per cell will be small. Removing this assumption would make GTA+ spread less likely.

11. Each cell takes up only one G+
particle (or none).

This is a reasonable assumption, since the number of G+ particles is always going to be much smaller than the number of cells. If the number of G+ particles were high, sometimes two G+ particles might inject their DNAs into the same cell, which would reduce the efficiency of transduction.
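How safe assumption 11 is can be checked with a quick calculation. The sketch below assumes that G+ particles attach to cells independently and at random, so the number of G+ particles per cell is Poisson-distributed with a mean equal to the number of G+ particles per cell. The Poisson assumption and the function names are additions for illustration, not part of the model as described.

```python
import math


def prob_multiple_hits(mean_per_cell):
    """P(a cell receives 2 or more G+ particles) under Poisson attachment."""
    m = mean_per_cell
    p_at_least_one = -math.expm1(-m)          # 1 - e^-m, accurate for small m
    return p_at_least_one - m * math.exp(-m)  # subtract P(exactly one)


def multiple_hit_share(mean_per_cell):
    """Among cells hit by any G+ particle, the share hit by more than one."""
    m = mean_per_cell
    if m <= 0:
        return 0.0
    return prob_multiple_hits(m) / -math.expm1(-m)


# Illustrative means (G+ particles per cell), not measured values.
for m in (1e-6, 1e-3, 0.1, 1.0):
    print(f"mean {m:g}: share of hit cells with multiple hits = {multiple_hit_share(m):.3g}")
```

For small means the share is roughly half the mean, so double hits only start to matter when G+ particles approach one per cell.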

I'm at Dartmouth for three months, working with Olga Zhaxybayeva's group to improve our evolutionary understanding of Gene Transfer Agent. I'm writing an R-script simulation of the genetic exchange it causes (finally learning R), but my control runs with epistasis don't give the expected results. So I'm writing this post and creating a PowerPoint deck to clarify my thinking.

First, what's Gene Transfer Agent? A number of different kinds of bacteria produce 'transducing particles' called Gene Transfer Agents. These look like small phage capsids but they don't usually contain phage DNA; instead they contain random fragments of chromosomal DNA. In the best-characterized GTA ('RcGTA'), these are all 4.4 kb in length, which appears to be the DNA capacity of the tiny GTA heads. Like phage, GTA particles inject their DNA into recipient cells (usually of the same species), where it often recombines with the chromosome and can change the cell's genotype.

GTA particles aren't infectious like phages are, both because they don't preferentially package the DNA that encodes them and because their heads are too small to contain this DNA. The RcGTA head and tail proteins are encoded by a 14 kb gene cluster. The sequences and organization of these genes strongly resemble those of homologous phage genes, so the known GTA systems are generally thought to have descended from what were integrated prophages.

In lab cultures of cells with the RcGTA genes (Rhodobacter capsulatus cells), GTA is produced mainly after exponential growth has ceased, and only produced by a small subset of cells. Like release of phage particles from infected cells, release of GTA requires lysis of the cell, and the genes for the holin and endolysin proteins are encoded separately from the main RcGTA cluster.

There are good reasons to think that GTAs are not simply defective prophages that still can package small DNA fragments:

The main RcGTA gene cluster has been somewhat stably inherited over a very long time, maybe more than a billion years. Some descendants have lost all the genes, but about 25% of the 225 alpha-proteobacterial genomes examined have retained versions of a single large cluster, typically containing 14-17 co-transcribed genes, most of which encode capsid head and tail proteins.

Expression of this gene cluster is at least partly controlled by cellular regulatory mechanisms.

Other genes, at other chromosomal locations, are also needed for efficient RcGTA production.

I just crunched some numbers from a detailed phylogenetic tree for the alpha-proteobacteria showing which taxa have GTA. The large GTA cluster is only found in a subclade (148 taxa, 109 distinct species names); the authors estimate that this subclade is 1.0 - 1.4 billion years old. 57% of the taxa in this subclade have the large GTA gene cluster.

My goal for these three months is to generate models of GTA evolution (probably computer simulations) that evaluate the following candidate explanations for its persistence:

Flawed model for Explanation 1: Nobody has seen the large heads postulated by Explanation 1, but nobody has explicitly looked for them. The Zhaxybayeva lab already has an unpublished mathematical model that addresses this explanation, created by a mathematically-inclined former post-doc. It asks how frequent such heads would need to be in order to maintain GTA-producing cells in a mixed population of GTA+ cells and GTA- cells lacking the gene cluster. The model assumes that large heads are produced at frequency µ, and that these inject the GTA gene cluster into GTA- cells, converting them into GTA+ cells. Only a small fraction of GTA+ cells are activated to produce GTA in any one generation, and these lyse after GTA production.

The conclusion from this model is that GTA+ cells can persist at high frequency even if they only make one large particle for every 10^5 normal small particles. Because the model assumed a reasonable 'burst size' of 100 GTA particles per producer cell, this means that GTA+ can persist if only one cell in a thousand produces a single large particle.

But I didn't think this result could be correct. Since each cell lysis destroys a GTA+ cell and only one in a thousand creates a new GTA+ cell from a GTA- cell, the GTA+ population should be continually decreasing. Production of new GTA+ cells only compensates for 0.1% of the loss of GTA+ cells.
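The objection is just arithmetic, spelled out below as a minimal sketch. It uses the numbers quoted above (burst size 100, one large particle per 10^5 small ones) and deliberately takes the best case in which every large particle converts a GTA- cell to GTA+. The function name and the conversion_prob parameter are illustrative additions.

```python
def net_gta_plus_change_per_lysis(burst_size=100, large_fraction=1e-5, conversion_prob=1.0):
    """Net change in GTA+ cell count for each producer cell that lyses.

    Each lysis removes one GTA+ cell and releases burst_size particles,
    of which a fraction large_fraction are large. conversion_prob is the
    chance that a large particle turns a GTA- cell into a GTA+ cell
    (1.0 here is the most favourable case, not a realistic value).
    """
    gained = burst_size * large_fraction * conversion_prob
    return gained - 1.0  # minus the producer cell that died


gained = 100 * 1e-5
print(f"new GTA+ cells per lysis (best case): {gained:.3g}")
print(f"net change per lysis: {net_gta_plus_change_per_lysis():.3f}")
```

With these inputs the gain replaces about 0.1% of the loss, so transduction alone cannot hold the GTA+ frequency steady.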

I initially had a hard time fully understanding the mathematics of this model. It included expressions for logistic growth, which complicated the math without adding anything to its utility. So I created my own version of this model, which gave a very different answer.

New model for Explanation 1: I'm going to put the description of this model into another post, because here I want to get on to my beneficial recombination model. Bottom line: the model's result is that transduction of the GTA gene cluster by large-head GTA particles can't come close to maintaining GTA+ cells in a mixed population even if every cell produces a large-head particle. This is because:

All cells that produce GTA die;

Only a small fraction of large-head particles will contain a complete gene cluster (maybe 0.1 to 1%);

Except when GTA+ cells are rare, many particles will attach to GTA+ cells rather than to GTA- cells;

In a natural environment many GTA particles will fail to find recipients. (This issue isn't part of the model.)

To overcome these obstacles each GTA-producing cell would need to produce more than 1000 (10,000? 100,000?) large-head particles.
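A rough break-even estimate shows where numbers of this size come from. This is a back-of-envelope sketch, not the model itself (which is described in another post). It assumes each lysis costs one GTA+ cell, and that a large-head particle creates a new GTA+ cell only if it carries a complete cluster (probability T) and lands on a GTA- cell (probability 1 - F, where F is the GTA+ frequency). The efficiency parameter for particles finding a recipient is a hypothetical addition.

```python
def break_even_large_particles(T, F, efficiency=1.0):
    """Large-head particles per lysed cell needed to replace that cell.

    T: fraction of large-head particles carrying a complete cluster.
    F: frequency of GTA+ cells (particles landing on them are wasted).
    efficiency: chance a particle finds and transduces a recipient (assumed).
    """
    p_convert = T * (1.0 - F) * efficiency
    if p_convert <= 0:
        return float("inf")  # no particle can ever create a new GTA+ cell
    return 1.0 / p_convert


# T spans the 0.1% to 1% range mentioned above; F values are illustrative.
for T in (1e-3, 1e-2):
    for F in (0.1, 0.5, 0.9):
        n = break_even_large_particles(T, F)
        print(f"T={T:g}, F={F:g}: about {n:,.0f} large-head particles per lysis")
```

Lowering efficiency below 1, as in a natural environment, pushes the requirement up in direct proportion.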

Finding the flaw in the lab's model: Assuming that I understand the lab's model correctly, the main error is that it 'corrects' for the probability that a GTA particle will attach to a GTA+ cell rather than a GTA- cell by multiplying by the number of GTA- cells rather than by their frequency. Since the model assumes populations of 10^7 to 10^9 cells, this overestimates the amount of transduction by orders of magnitude, leading to a comparable underestimate of the frequency of large heads needed to maintain GTA+.
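The size of this error is easy to see in a toy calculation. The sketch below is not the lab's code, and it assumes the flaw is as described above. It compares weighting the G+ particles by the frequency of GTA- cells with weighting them by the raw count of GTA- cells. All names and values are illustrative.

```python
def expected_transductants(g_plus_particles, n_gta_minus, n_total, weight_by="frequency"):
    """Expected number of GTA- cells converted to GTA+ by G+ particles.

    weight_by="frequency" scales by the chance a particle lands on a GTA- cell.
    weight_by="count" reproduces the suspected error of scaling by cell number.
    """
    if weight_by == "frequency":
        return g_plus_particles * n_gta_minus / n_total
    if weight_by == "count":
        return g_plus_particles * n_gta_minus
    raise ValueError("weight_by must be 'frequency' or 'count'")


# Half the population is GTA-; ten G+ particles are available (made-up values).
for n_total in (10**7, 10**8, 10**9):
    n_minus = n_total // 2
    right = expected_transductants(10, n_minus, n_total, "frequency")
    wrong = expected_transductants(10, n_minus, n_total, "count")
    print(f"population {n_total:.0e}: overestimate factor = {wrong / right:.0e}")
```

The overestimate factor equals the population size, which is why the error spans seven to nine orders of magnitude.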

Model for Explanation 2: I modified the basic structure of my Explanation 1 model to consider a related hypothesis. Defective alleles of GTA genes are expected to arise by random mutation. At least some of these will also prevent the cell from lysing when GTA production is induced. These cells can still receive functional alleles of their defective genes from GTA particles produced by 'wildtype' cells, but they can't transmit their defective alleles to the wildtype cells because they can't produce GTA. This asymmetry favours spread of functional alleles, and might be able to maintain GTA, although it wouldn't allow GTA+ to spread to cells that completely lack the GTA genes.

Like the model for Explanation 1, the result is a strong NO. Because the models are very similar, it's not surprising (in retrospect) that spread of functional alleles faces the same obstacles:

All cells that produce GTA die;

Only a small fraction (about 0.1%) of particles will contain whatever GTA gene is mutated in a recipient cell;

Except when GTA+ cells are rare, many particles will attach to cells with the functional allele rather than to those with mutated allele;

In a natural environment many GTA particles will fail to find recipients. (This issue isn't part of the model.)

To overcome these obstacles each GTA-producing cell would need to produce more than 1000 (10,000? 100,000?) large-head particles.
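The 'about 0.1%' figure can be sanity-checked with a small calculation. The sketch assumes 4.4 kb fragments packaged from uniformly random positions on a circular chromosome, and asks what fraction contain one whole gene. The 1 kb gene length is a hypothetical round number, and the 3 to 5 Mb genome sizes are the illustrative values used in the large-particle calculation, not measurements.

```python
def gene_carrying_fraction(genome_kb, gene_kb, fragment_kb=4.4):
    """Fraction of random fragments that contain one complete gene.

    Assumes uniform fragment start positions on a circular chromosome.
    """
    if fragment_kb < gene_kb:
        return 0.0  # the gene cannot fit in a single fragment
    return (fragment_kb - gene_kb) / genome_kb


# Hypothetical 1 kb gene on chromosomes of 3, 4, and 5 Mb.
for genome_kb in (3000, 4000, 5000):
    frac = gene_carrying_fraction(genome_kb, gene_kb=1.0)
    print(f"{genome_kb / 1000:.0f} Mb genome: {frac:.2%} of particles carry the gene")
```

Longer genes, or a requirement for flanking homology, lower the fraction further.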

Models for Explanation 3: Most microbiologists assume that GTAs are maintained in their genomes by selection for presumed benefits of chromosomal recombination. They implicitly assume that randomizing the combinations of chromosomal alleles in a population creates a benefit strong enough to overcome the cost of the cell death associated with GTA production. They don't explicitly assume this, because they're not used to thinking rigorously about evolutionary processes. Instead their explanation usually relies on GTA-mediated recombination creating some specific beneficial new combination, and ignores the selective costs associated with other combinations.

In fact, many very smart people have spent many years looking for conditions where random chromosomal recombination creates benefits strong enough to maintain the genes that cause it. These 'evolution of sex' models have identified some conditions, but usually these benefits are small and occur only under special circumstances. Most of the time recombination appears to be a waste of time at best.

Recombination Model 1: Way back when I was a new post-doc spending a year in Dick Lewontin's lab, I developed a computer-simulation model of recombination by natural transformation (Redfield 1988, Evolution of bacterial transformation: Is sex with dead cells ever better than no sex at all?). In this model I applied a relatively simple model of the evolution of sex to a population of naturally competent bacteria. My first goal for addressing Explanation 3 is to adapt this model so it applies to recombination caused by GTA rather than by natural transformation. I'll describe my progress (and current deadlock) in the next post.

Recombination Model 2: Model 1 is 'deterministic'; it ignores random ('stochastic') events, effectively assuming that the population is infinitely large. But the strongest benefits of recombination are now thought to arise from precisely the stochastic effects Model 1 ignores. So I also want to make a stochastic model that tracks individual cells, or at least a model that takes stochastic processes into account. I haven't started writing this model yet, but I might pattern it on the transformation model described by Takeuchi et al, 2014.