As everyone who has studied biology in the last
30 years must know, proteins are made from mRNA which is made from
DNA, and this is performed by a simple coding mechanism: a three-base
segment of DNA, called a codon, is translated into a particular
amino acid. Since there are 4 bases there are 4 x 4 x 4 = 64 possible
codons. Three of these codons in humans code for "stop", the end of a
polypeptide chain. The other 61 each code for a particular amino acid.
Since there are only 20 genetically encoded amino acids, most amino
acids are coded for by multiple codons. Bacteria and mammals prefer
to use different codons, so mammalian genes frequently use codons
that are rarely employed in bacteria, and vice versa. As a result,
a mammalian gene may be expressed very poorly in bacteria, which is
a significant problem. One way to deal with this is to look at the
mammalian sequence, figure out which codons are optimal for bacterial
expression, and synthesize an appropriate DNA sequence designed
to express the mammalian gene efficiently in bacteria. To do this,
you find out which codon for each amino acid is the most widely used
in the species of interest, and synthesize a DNA sequence made up of these.
However, figuring this out by hand is time-consuming. An alternative
is to set up software on your computer to do this, some of which
is available for free on the internet. Or you could just use our
simple online program, which will not clog up your hard disk.
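
The codon arithmetic described above can be checked with a short Python sketch. It builds the standard genetic code from its usual compact 64-letter form and counts stop codons, coding codons, and amino acids. The variable names are chosen for illustration only, and this is not the code behind the online program.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code in compact form: codons are enumerated with the
# bases in the order T, C, A, G, and "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 4 x 4 x 4 = 64 codons to its amino acid (or "*").
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid they encode.
codons_for = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_for[aa].append(codon)

stop_codons = codons_for.pop("*")
sense_codon_count = sum(len(codons) for codons in codons_for.values())

print(len(CODON_TABLE), "codons in total")
print(len(stop_codons), "stop codons:", sorted(stop_codons))
print(sense_codon_count, "codons for", len(codons_for), "amino acids")
```

Methionine and tryptophan have a single codon each, while leucine, serine, and arginine have six, which is why codon choice matters for most residues.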

The program will ignore numbers, spaces,
and characters such as B or Z which do not correspond to one of the amino
acids. It can also deal with FASTA format sequences: it ignores
any line of text which starts with a ">" character.
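
The input handling described above might look roughly like the following Python sketch. The function name clean_protein and the example header line are hypothetical, and the real program may treat edge cases differently. The sketch also folds lowercase input to uppercase, since the page accepts either case.

```python
# One-letter codes of the 20 genetically encoded amino acids.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

def clean_protein(text):
    """Return only the amino acid letters from pasted text, in uppercase."""
    residues = []
    for line in text.splitlines():
        # FASTA header lines start with ">" and are skipped entirely.
        if line.startswith(">"):
            continue
        for ch in line.upper():
            # Digits, spaces, and letters such as B or Z are dropped.
            if ch in AMINO_ACIDS:
                residues.append(ch)
    return "".join(residues)

example = ">hypothetical_header\nmkt 10 ayi\nBZ akq\n"
print(clean_protein(example))  # MKTAYIAKQ
```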

Paste or type your protein sequence in the box below. It can be
upper or lower case; the program will read either, or a mixture of both.

How This Works: We
made use of the codon tables which can be downloaded from the excellent Codon
Usage Database, maintained by the Department of Plant Gene Research
in Kazusa, Japan. This database tabulates codon usage in a stunning variety
of species; we extracted the values for Homo
sapiens, Spodoptera frugiperda, Saccharomyces
cerevisiae and Escherichia coli strain CFT073. These values
were used for mammalian, insect, yeast and bacteria, respectively.
The mammalian values should be appropriate for expression in HeLa, HEK293,
COS-7 and other mammalian cells, the insect values for expression in Sf9 cells
(which were derived from S. frugiperda, a type of moth), the yeast
values for expression in S. cerevisiae, and the bacterial values
for expression in E. coli. The program does not take into account the possible negative effects of palindromic or other self-complementary sequences, which can cause the mRNA transcript to form hairpin structures that may affect expression.
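
The approach described above, choosing the most used codon for each amino acid in the chosen host, can be sketched in Python as follows. This is only an illustration under stated assumptions, not the program's actual code: the function names are hypothetical, and TOY_USAGE holds invented numbers for a handful of codons rather than values from the Codon Usage Database. The second function shows one simple check the program itself does not perform, namely scanning the result for short sites that equal their own reverse complement.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def back_translate(protein, usage):
    """For each residue, pick the codon with the highest usage value."""
    dna = []
    for residue in protein.upper():
        candidates = [c for c, aa in CODON_TABLE.items() if aa == residue]
        if not candidates:
            continue  # skip characters that are not amino acids
        # Codons missing from the usage table count as 0.0.
        dna.append(max(candidates, key=lambda c: usage.get(c, 0.0)))
    return "".join(dna)

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def find_palindromes(dna, length=6):
    """List (position, site) pairs where a site equals its reverse complement."""
    hits = []
    for i in range(len(dna) - length + 1):
        site = dna[i:i + length]
        if site == site.translate(COMPLEMENT)[::-1]:
            hits.append((i, site))
    return hits

# Invented usage values for illustration only (NOT real codon frequencies).
TOY_USAGE = {"AAA": 2.0, "AAG": 1.0, "CTG": 5.0, "TTA": 1.0}

print(back_translate("MKL", TOY_USAGE))   # ATGAAACTG
print(find_palindromes("AAGAATTCAA"))     # [(2, 'GAATTC')]
```

In practice the usage argument would be a full 64-codon table for the host of interest, one each for mammalian, insect, yeast, and bacterial expression.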

Disclaimer: This
program was constructed primarily to save time for busy researchers
around the world. We hope you find it useful, and we are confident
that it is accurate and reliable. However, we cannot be held responsible
for any problems which may arise as a result of the use of this program.