Overview of in vitro selection

In vitro selection refers to the pursuit of a biopolymer (protein or nucleic acid) with a new, desired function. Whether beginning from scratch or starting with an existing sequence, the first step is to create a diverse library of variants, each with a different "fitness" (with respect to the desired function). The next step is to enrich the library for variants with higher fitness, thus increasing the overall fitness of the pool. Often the most difficult facet of a selection is the confinement of function, such that a fit variant increases its own representation without non-specifically helping its neighbors. The last step of a "round" of selection is the amplification of the recovered product and the regeneration of starting material. This allows for further cycles of enrichment and increased fitness. After several rounds of selection, individual variants are assayed for the desired functionality.

Library Generation

In order for in vitro selection to occur, there must be diversity in the starting material.

Randomized Oligodeoxynucleotides

Randomized oligodeoxynucleotides are generated using traditional solid-phase phosphoramidite chemistry. At the "N" positions, an equimolar mixture of the four bases is used, giving a roughly equal probability of each base at that position on a given nascent oligodeoxynucleotide.

If there is a known, functional sequence that can serve as a starting point, then a pool can be mildly randomized (yielding a "doped" pool). In a doped pool, a mixture of all four bases is used, but the molarity is skewed to favor the wild-type base at each given position.

Randomized oligodeoxynucleotides can then be assembled into the final template, most often by PCR.
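The difference between a fully randomized pool and a doped pool can be illustrated with a small simulation. This is only a sketch: the function names, the example sequences, and the 10% doping level are assumptions chosen for illustration, and the model ignores real-world synthesis biases such as unequal coupling efficiencies.

```python
import random

BASES = "ACGT"

def random_pool(template, n_molecules, rng):
    """Replace every 'N' in the template with a uniformly drawn base.

    Constant positions (anything other than 'N') are copied unchanged.
    """
    return [
        "".join(rng.choice(BASES) if b == "N" else b for b in template)
        for _ in range(n_molecules)
    ]

def doped_pool(wild_type, n_molecules, doping, rng):
    """Keep the wild-type base with probability 1 - doping.

    Otherwise substitute one of the three other bases, each equally likely.
    """
    pool = []
    for _ in range(n_molecules):
        seq = []
        for b in wild_type:
            if rng.random() < doping:
                seq.append(rng.choice([x for x in BASES if x != b]))
            else:
                seq.append(b)
        pool.append("".join(seq))
    return pool

rng = random.Random(0)
# Hypothetical constant flanks around a 10-nucleotide random region.
library = random_pool("GGAC" + "N" * 10 + "TTCG", 5, rng)
# Hypothetical wild-type sequence doped at an assumed 10% per position.
doped = doped_pool("ACGTACGTAC", 5, 0.1, rng)
```

With a low doping level most molecules stay close to the wild type, whereas the fully random pool samples sequence space uniformly within the "N" region.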

Mutagenic PCR

A wild-type starting sequence can be concomitantly mutated and amplified using mutagenic PCR. Infidelity can be achieved by using an error-prone DNA polymerase, increasing the magnesium concentration, including manganese, using non-canonical dNTPs (e.g. 8-oxo-dGTP), or altering the ratio of the dNTPs.

Gene Shuffling

Multiple related genes can be randomly recombined to form novel genes. Genes are randomly fragmented with DNase I. Fragments from multiple genes are mixed together and thermocycled (with PCR reagents). Fragments from different parents anneal at regions of homology and are extended by a DNA polymerase. Full-length, shuffled constructs are then amplified by normal PCR.
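The recombination outcome of shuffling can be sketched in a few lines. This is a deliberately simplified model, not a simulation of the chemistry: it assumes the parents are already aligned and of equal length, and it lets the growing child switch parents only where the two parents share an identical window of `min_homology` bases. The function name and all parameters are hypothetical.

```python
import random

def shuffle_genes(parents, n_children, min_homology, switch_prob, rng):
    """Build chimeric sequences by switching parents at shared windows.

    parents      -- aligned parent sequences of equal length
    min_homology -- identical bases required before a template switch
    switch_prob  -- chance of attempting a switch at each position
    """
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("this sketch assumes aligned, equal-length parents")
    children = []
    for _ in range(n_children):
        current = rng.randrange(len(parents))
        child = []
        for i in range(length):
            child.append(parents[current][i])
            start = i - min_homology + 1
            if start >= 0 and rng.random() < switch_prob:
                window = parents[current][start:i + 1]
                # Parents that are identical to the current one over the window.
                candidates = [
                    j for j, p in enumerate(parents)
                    if j != current and p[start:i + 1] == window
                ]
                if candidates:
                    current = rng.choice(candidates)
        children.append("".join(child))
    return children

# Hypothetical parents that share a central homologous block.
parents = ["AAAACCCCGGGG", "AAAACCCCTTTT", "TTTTCCCCGGGG"]
chimeras = shuffle_genes(parents, 5, 3, 0.5, random.Random(0))
```

Because switches occur only inside shared windows, every position of a chimera carries a base found at that position in one of the parents.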

Mutator Strains

A plasmid can be propagated in a mutator strain of bacteria, which usually lacks some DNA proofreading machinery. Alternatively, dNTP analogs can be added to the medium to increase the mutation rate.

This method of library generation has the drawback that the entire plasmid is subject to mutation (i.e. mutation is not confined to the gene under selection).

Increased Representation

Once diversity is created, the selection must allow functional variants to become a larger percentage of the pool. This step is often the most difficult to design.

Affinity

The easiest function to enrich for is affinity for a ligand. To select for binders (aptamers for nucleic acids; antibodies and others for proteins), one exposes the pool of potential binders to a fixed ligand. The best binders affix to the ligand while weaker binders are washed away. Those that remain can be amplified for further rounds of selection.

For nucleic acids, the binder itself can be subject to amplification, as it is both the information-carrying and function-carrying molecule. In Ellington and Szostak[1], a DNA library was created such that a T7 promoter drives expression of an N100 RNA pool flanked by constant regions. The pool is passed through a column with a bound ligand. Species that bind the ligand are retained on the column and weaker binders are washed away. Binders are then eluted from the column and amplified to begin a new round.
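How quickly binders take over the pool can be estimated with a simple two-class model. This is a sketch under stated assumptions: the pool is split into "binders" and "non-binders", each class has a fixed probability of surviving the wash, and amplification preserves the ratio of survivors. The function name and the example numbers are illustrative only.

```python
def enrich(fraction_binders, retain_binder, retain_nonbinder, rounds):
    """Return the binder fraction before selection and after each round.

    fraction_binders -- starting fraction of the pool that binds the ligand
    retain_binder    -- probability that a binder survives the wash
    retain_nonbinder -- probability that a non-binder survives the wash
    """
    history = [fraction_binders]
    f = fraction_binders
    for _ in range(rounds):
        kept_binders = f * retain_binder
        kept_others = (1 - f) * retain_nonbinder
        # Amplification is assumed unbiased, so only the ratio matters.
        f = kept_binders / (kept_binders + kept_others)
        history.append(f)
    return history

# Assumed numbers: one binder per million, 100-fold better retention.
trajectory = enrich(1e-6, 0.5, 0.005, 5)
```

In this model the odds of drawing a binder are multiplied by `retain_binder / retain_nonbinder` each round, which is why several rounds are needed when functional sequences are rare in the starting library.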

For protein binders, the scheme must include linking of the information-carrying nucleic acid to the function-carrying protein. Some examples of this linking are phage display, cell-surface display, and ribosome display.

Selective Amplification

Bartel and Szostak, 1993. Science

The successful selection of aptamers from random sequence and the existence of ribozymes led Bartel and Szostak[2] to ask whether ribozymes could be selected from random sequence as well. A large random RNA pool was created with a constant semi-hairpin region at the 5' end and a constant primer-binding region at the 3' end. The pool was incubated with a substrate oligonucleotide that could complete the semi-hairpin and also serve as a 5' primer-binding region. Only RNA molecules capable of covalently linking the substrate to themselves would have both primer-binding sites and thus be available as a PCR template. Thus, active sequences are selectively amplified at the expense of inactive ones. The result was several ribozyme ligases capable of forming 5'-3' or 5'-2' phosphodiester bonds.
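The logic of this selection, in which only ligated molecules carry both primer-binding sites, can be expressed as a simple filter. The sequences below are invented placeholders rather than the ones used in the original experiment, and catalytic activity is reduced to a true/false flag for illustration.

```python
# Hypothetical substrate oligo; once ligated it supplies the 5' primer site.
SUBSTRATE = "GGAACCUU"
# Hypothetical constant region at the 3' end of every pool molecule.
THREE_PRIME_SITE = "CCGGAAUU"

def react(pool_rna, is_active):
    """An active molecule ligates the substrate onto its own 5' end."""
    return SUBSTRATE + pool_rna if is_active else pool_rna

def is_pcr_template(molecule):
    """PCR requires a primer-binding site at both ends of the molecule."""
    return molecule.startswith(SUBSTRATE) and molecule.endswith(THREE_PRIME_SITE)

def selection_round(pool):
    """pool is a list of (rna_sequence, is_active) pairs."""
    products = [react(rna, active) for rna, active in pool]
    return [m for m in products if is_pcr_template(m)]

pool = [
    ("AAAA" + THREE_PRIME_SITE, True),   # active ligase
    ("CCCC" + THREE_PRIME_SITE, False),  # inactive sequence
]
survivors = selection_round(pool)
```

Only the first molecule gains the 5' site and passes the filter, mirroring how inactive sequences drop out because they cannot be amplified.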

Protection

The first in vitro evolved protein functions involved modification of the nucleic acid species that encoded them. Perhaps the first such function was protection of the DNA template. In Tawfik and Griffiths[3], the template encodes HaeIII methyltransferase. Upon transcription and translation in bacterial lysate, active methyltransferase methylates HaeIII recognition sequences in the gene. The methylated genes are then protected from digestion by HaeIII endonuclease. Undigested templates are then amplified before the next round. The key to this experiment is the use of in vitro compartmentalization (described below), which allows an active methyltransferase to methylate its parent template, but not other templates in the pool.

FACS

Griffiths and Tawfik, 2003. EMBO

For much of the history of in vitro selection of protein function, the selection was for affinity or DNA modification. Griffiths and Tawfik[4] used in vitro compartmentalization and FACS to select for phosphotriesterase activity. In an elegant scheme, biotinylated template and biotinylated anti-HA antibody were attached to streptavidin beads and emulsified with cell lysate. Upon transcription and translation, the HA-phosphotriesterase protein bound to the antibody. The bead-template-antibody-enzyme complex is stable upon emulsion breaking, and is re-emulsified with biotinylated substrate molecules. Active enzyme converts substrate to product, and the product-biotin complex binds to the bead. The bead-template-antibody-enzyme-product complexes are stable upon breaking the emulsion, and the complexes are incubated with anti-product antibodies. Thus, beads harboring genes encoding active enzymes are preferentially labeled by the anti-product antibody. The use of fluorescent antibodies and FACS allows enrichment of fluorescent beads and therefore active genes.

Host Growth

A plasmid expressing some gene that confers a growth advantage to its host will become over-represented relative to plasmids containing less functional versions of the gene.

Confinement of Function

Early in vitro selection experiments focused on binding (Ellington 1990) or cis-action (in which the species itself is modified; Bartel 1993). However, more complicated functions require the confinement of activity such that an active variant does not increase the proportion of other members of the library. Thus, selections often employ compartmentalization, such that an active species (usually a protein) and its template (usually DNA) are confined to a compartment and segregated from other members of the library.

Cellularization

Transformation of a plasmid library into bacteria usually results in one plasmid variant per surviving cell. Each bacterium can then serve as a compartment containing the DNA template and its protein product.

In vitro Compartmentalization

Mechanical mixing of water in oil and surfactant leads to an emulsion of stable water droplets in a continuous oil phase. The size of the droplets decreases as more energy is put into the system. The compartments prevent functional molecules from affecting the templates in other compartments.
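Compartments link genotype to phenotype only if most occupied droplets hold a single template. A rough way to reason about this is sketched below. The model is an assumption not stated above: templates are taken to distribute randomly among equally sized droplets, so the number per droplet follows a Poisson distribution. The function names are hypothetical.

```python
import math

def poisson(k, mean):
    """Probability that a droplet holds exactly k templates."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def single_template_fraction(mean):
    """Among droplets with at least one template, the fraction with exactly one."""
    return poisson(1, mean) / (1 - poisson(0, mean))

# Compare a dilute emulsion with a crowded one (mean templates per droplet).
dilute = single_template_fraction(0.1)
crowded = single_template_fraction(1.0)
```

Under this model, lowering the mean number of templates per droplet raises the share of occupied droplets that contain exactly one template, at the cost of leaving more droplets empty.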