Bio-Molecular Computing

Published on Dec 02, 2015

Abstract

Molecular computing is an emerging field to which chemistry, biophysics, molecular biology, electronic engineering, solid state physics and computer science contribute to a large extent. It involves the encoding, manipulation and retrieval of information at a macromolecular level in contrast to the current techniques, which accomplish the above functions via IC miniaturization of bulk devices.

Biological systems have unique abilities such as pattern recognition, learning, self-assembly and self-reproduction, as well as high-speed, parallel information processing. The aim of this article is to show how these characteristics can be exploited to build computing systems that have many advantages over their inorganic (Si, Ge) counterparts.

DNA computing began in 1994 when Leonard Adleman demonstrated that it was possible by using a molecular computer to solve a real problem: an instance of the Hamiltonian Path Problem, a close relative of the Traveling Salesman Problem. In theoretical terms, some scientists say the actual beginnings of DNA computation should be attributed to Charles Bennett's work.

Adleman, now considered the father of DNA computing, is a professor at the University of Southern California and spawned the field with his paper, "Molecular Computation of Solutions of Combinatorial Problems." Since then, Adleman has demonstrated how the massive parallelism of a trillion DNA strands can simultaneously attack different aspects of a computation to crack even the toughest combinatorial problems.

Adleman's Traveling Salesman Problem:

The objective is to find a path from start to end that goes through every point exactly once. This problem is difficult for conventional computers because it is NP-complete, that is, among the hardest of the "non-deterministic polynomial time" problems: no known algorithm solves every instance in time that grows polynomially with the number of points. Large instances are therefore intractable for conventional computers, but might be attacked with massively parallel computers such as DNA computers. Adleman chose the Hamiltonian Path problem because it is a well-known problem of this kind.

The following algorithm solves the Hamiltonian Path problem:

1. Generate random paths through the graph.

2. Keep only those paths that begin with the start city (A) and conclude with the end city (G).

3. If the graph has n cities, keep only those paths with n cities. (n=7)

4. Keep only those paths that enter all cities at least once.

5. Any remaining paths are solutions.
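The five steps above can be sketched in ordinary code. The following is a minimal illustration, not Adleman's procedure: the seven-city graph, its edges, the sample count and the function names are all assumptions made for the example, and a seeded random walk stands in for the random joining of DNA strands.

```python
import random

# Hypothetical directed graph on seven cities A..G (edges are illustrative only).
EDGES = {
    "A": ["B", "D"],
    "B": ["C", "D"],
    "C": ["D", "E"],
    "D": ["E", "B"],
    "E": ["F", "C"],
    "F": ["G", "B"],
    "G": [],
}

def random_path(edges, rng, max_len):
    """Step 1: follow random edges from a random city until stuck or max_len is reached."""
    city = rng.choice(sorted(edges))
    path = [city]
    while len(path) < max_len and edges[city]:
        city = rng.choice(edges[city])
        path.append(city)
    return tuple(path)

def hamiltonian_paths(edges, start, end, samples=20000, seed=0):
    rng = random.Random(seed)
    n = len(edges)
    # Step 1: generate many random paths (the test tube full of strands).
    pool = {random_path(edges, rng, n) for _ in range(samples)}
    # Step 2: keep paths that begin at the start city and conclude at the end city.
    pool = {p for p in pool if p[0] == start and p[-1] == end}
    # Step 3: keep paths that contain exactly n cities.
    pool = {p for p in pool if len(p) == n}
    # Step 4: keep paths that enter every city at least once.
    pool = {p for p in pool if set(p) == set(edges)}
    # Step 5: whatever remains is a solution.
    return pool

solutions = hamiltonian_paths(EDGES, "A", "G")
for path in sorted(solutions):
    print("".join(path))
```

In a test tube all the candidate paths form at once; here the same pool is built one sample at a time, which is exactly the cost that the DNA approach tries to avoid.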

The key was using DNA to perform the five steps in the above algorithm. Adleman's first step was to synthesize DNA strands of known sequences, each strand 20 nucleotides long. He represented each vertex of the graph by a separate strand, and represented each edge between two connected vertices, such as 1 to 2, by a DNA strand consisting of the last 10 nucleotides of the strand representing vertex 1 followed by the first 10 nucleotides of the vertex 2 strand.
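The edge encoding described above can be sketched as a string operation. This is only an illustration: the vertex sequences are generated at random from a fixed seed rather than taken from the experiment, and the helper names are hypothetical.

```python
import random

BASES = "ACGT"

def random_strand(rng, length=20):
    """Make an arbitrary 20-nucleotide sequence to stand for one vertex."""
    return "".join(rng.choice(BASES) for _ in range(length))

def edge_strand(vertex_strands, src, dst):
    """Last 10 nucleotides of the source vertex followed by the first 10 of the destination."""
    return vertex_strands[src][-10:] + vertex_strands[dst][:10]

rng = random.Random(1)
# Illustrative sequences for three vertices; not the sequences used in the experiment.
vertex_strands = {v: random_strand(rng) for v in (1, 2, 3)}
edge_12 = edge_strand(vertex_strands, 1, 2)

print("vertex 1:", vertex_strands[1])
print("vertex 2:", vertex_strands[2])
print("edge 1->2:", edge_12)
```

Because each edge strand shares half of its sequence with each of the two vertices it connects, a chain of edge strands spells out a path through the graph.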

Then, through the sheer number of DNA molecules (3 x 10^13 copies of each edge strand in this experiment!) joining together in all possible combinations, many random paths were generated. Adleman then used well-established techniques of molecular biology to isolate the Hamiltonian path, the one that entered every vertex, starting at the first and ending at the last.

After generating the numerous random paths in the first step, he used the polymerase chain reaction (PCR) to amplify, and so keep, only the paths that began at the start vertex and ended at the end vertex. The next two steps kept only those strands that passed through exactly as many vertices as the graph contains and that entered each vertex at least once. At this point, any strands that remained coded for a Hamiltonian path, thus solving the problem.
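The selection steps can be mimicked on strands represented as text. The sketch below is a simplified analogy under stated assumptions: each vertex is a hypothetical 6-letter word rather than a 20-nucleotide strand, a path strand is just the concatenation of its vertex words, and the three filters stand in for the laboratory separations without modelling their chemistry.

```python
WORD_LEN = 6
# Hypothetical vertex words, chosen only for the example.
VERTEX = {1: "ACGTAC", 2: "TTGCAA", 3: "GGATCC", 4: "CATGAT"}

def encode(path):
    """Concatenate the vertex words along a path."""
    return "".join(VERTEX[v] for v in path)

def select_by_ends(pool, start, end):
    """Analogue of the PCR step: keep strands with the right first and last vertex."""
    return [s for s in pool if s.startswith(VERTEX[start]) and s.endswith(VERTEX[end])]

def select_by_length(pool, n):
    """Keep strands whose length corresponds to exactly n vertices."""
    return [s for s in pool if len(s) == n * WORD_LEN]

def select_all_vertices(pool):
    """Keep strands in which every vertex word occurs at least once."""
    def words(s):
        return {s[i:i + WORD_LEN] for i in range(0, len(s), WORD_LEN)}
    return [s for s in pool if words(s) == set(VERTEX.values())]

pool = [encode(p) for p in (
    [1, 2, 3, 4],        # visits every vertex once
    [1, 2, 3],           # wrong end vertex
    [1, 2, 3, 2, 3, 4],  # too long
    [1, 2, 2, 4],        # right length, but skips vertex 3
    [2, 1, 3, 4],        # wrong start vertex
)]

survivors = select_all_vertices(select_by_length(select_by_ends(pool, 1, 4), len(VERTEX)))
print(survivors)
```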

How do they work?

DNA is the major information storage molecule in living cells, and billions of years of evolution have tested and refined both this wonderful informational molecule and highly specific enzymes that can either duplicate the information in DNA molecules or transmit this information to other DNA molecules.

Instead of using electrical impulses to represent bits of information, the DNA computer uses the chemical properties of these molecules, and the result is read from the patterns in which the molecules or strands combine or grow. The operations are carried out by enzymes, biological catalysts that could be called the 'software' used to execute the desired calculation.

DNA computers use the four nucleotide bases of deoxyribonucleic acid, A (adenine), C (cytosine), G (guanine) and T (thymine), as the memory units, and recombinant DNA techniques already in existence carry out the fundamental operations. In a DNA computer, computation takes place in test tubes or on a glass slide coated in 24K gold. The input and output are both strands of DNA, whose genetic sequences encode certain information. A program on a DNA computer is executed as a series of biochemical operations, which have the effect of synthesizing, extracting, modifying and cloning the DNA strands. Their potential power underscores how nature could be capable of crunching numbers better and faster than the most advanced silicon chips.

One fundamental difference between conventional computers and DNA computers is the capacity of the memory units: electronic memory units have two positions (on or off), whereas each DNA position has four (C, G, A or T). The study of bacteria has shown that restriction enzymes can be employed to cut DNA at a specific word (W). Many restriction enzymes cut the two strands of double-stranded DNA at different positions, leaving overhangs of single-stranded DNA. These overhangs are referred to as 'sticky ends'. Two pieces of DNA may be rejoined if their terminal overhangs are complementary. Using these operations, fragments of DNA may be inserted into or deleted from the DNA.
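Cutting at a word and testing whether two overhangs can rejoin can be sketched as follows. The example uses the well-known enzyme EcoRI, which recognizes GAATTC and cuts after the G, leaving an AATT overhang; the input strand and the function names are assumptions, and only the top strand of the double helix is modelled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(strand):
    """Sequence of the strand that pairs with `strand`, read in its own 5'-to-3' direction."""
    return strand.translate(COMPLEMENT)[::-1]

def cut_top_strand(strand, site="GAATTC", offset=1):
    """Cut `offset` bases into every occurrence of `site` (defaults model EcoRI)."""
    fragments, start = [], 0
    pos = strand.find(site)
    while pos != -1:
        fragments.append(strand[start:pos + offset])
        start = pos + offset
        pos = strand.find(site, pos + 1)
    fragments.append(strand[start:])
    return fragments

def sticky_ends_match(overhang_a, overhang_b):
    """Two single-stranded overhangs can pair if one is the reverse complement of the other."""
    return overhang_a == reverse_complement(overhang_b)

fragments = cut_top_strand("TTAGGAATTCCGTA")  # hypothetical input strand
print(fragments)
print(sticky_ends_match("AATT", "AATT"))
print(sticky_ends_match("AATT", "GATC"))
```

The AATT overhang is its own reverse complement, which is why any two fragments cut by this enzyme can be rejoined with each other.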

As stated earlier, DNA represents information as a pattern of molecules on a strand. Each strand represents one possible answer. In each experiment, the DNA is tailored so that all conceivable answers to a particular problem are included. Researchers then subject all the molecules to precise chemical reactions that imitate the computational abilities of a traditional computer. Because the molecules that make up DNA bind together in predictable ways, this gives a powerful "search" function. If the experiment works, the DNA computer weeds out all the wrong answers, leaving one or more molecules with the right answer. All these molecules can work together at once, so you could theoretically have 10 trillion calculations going on at the same time in very little space.

DNA computing is a field that holds the promise of ultra-dense systems that pack megabytes of information into devices the size of a silicon transistor. Each molecule of DNA is roughly equivalent to a little computer chip. Conventional computers represent information in terms of 0's and 1's, physically expressed in terms of the flow of electrons through logical circuits, whereas DNA computers represent information in terms of the chemical units of DNA. Computing with an ordinary computer is done with a program that instructs electrons to travel on particular paths; with a DNA computer, computation requires synthesizing particular sequences of DNA and letting them react in a test tube or on a glass plate. In a scheme devised by Richard Lipton, the logical command "and" is performed by separating DNA strands according to their sequences, and the command "or" is done by pouring together (merging) DNA solutions containing specific sequences.
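The "and" and "or" operations of this scheme can be sketched with sets standing in for test tubes. This is an illustration under assumptions: each strand is reduced to a tuple of bits, the function names are hypothetical, and the example formula (x0 or x1) and (not x0 or not x1) is chosen only to keep the output small.

```python
from itertools import product

def extract(tube, index, value):
    """Separation: keep the strands whose bit at `index` equals `value`."""
    return {s for s in tube if s[index] == value}

def merge(*tubes):
    """Pouring together: combine the contents of several tubes."""
    return set().union(*tubes)

def satisfy_clause(tube, clause):
    """'or': merge the strands that satisfy at least one literal of the clause."""
    return merge(*(extract(tube, i, v) for i, v in clause))

def solve(clauses, n_vars):
    # Start with every possible assignment, one strand per candidate answer.
    tube = set(product((0, 1), repeat=n_vars))
    # 'and': apply the clauses one after another, each narrowing the tube.
    for clause in clauses:
        tube = satisfy_clause(tube, clause)
    return tube

# Each literal is (variable index, required value).
clauses = [[(0, 1), (1, 1)], [(0, 0), (1, 0)]]
answers = solve(clauses, n_vars=2)
print(sorted(answers))
```

Each separation acts on every strand in the tube at once, so the number of laboratory steps grows with the size of the formula rather than with the number of candidate answers.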

The answer is obtained by forcing the DNA molecules into different chemical states, through the combination of molecules into strands or the separation of strands, and then examining those states.

Most of the possible answers are incorrect, but one or a few may be correct, and the computer's task is to check each of them and remove the incorrect ones using restriction enzymes. The DNA computer does that by subjecting all of the strands simultaneously to a series of chemical reactions that mimic the mathematical computations an electronic computer would perform on each possible answer. When the chemical reactions are complete, researchers analyze the strands to find the answer, for instance by locating the longest or the shortest strand and decoding it to determine what answer it represents.

Computers based on molecules like DNA will not have a von Neumann architecture, but will instead function best in parallel processing applications. They are considered promising for problems in which many computations can go on at the same time. For instance, all branches of a search tree could be searched at once in a molecular system, while von Neumann systems must explore each possible path in some sequence.

Information is stored in DNA as CG or AT base pairs, with a maximum information density of 2 bits per DNA base location. Information on a solid surface is stored in a non-addressed array of DNA words (W) of a fixed length (16-mers). DNA words are linked together to form large combinatorial sets of molecules. DNA computers are massively parallel: where electronic computers would require additional hardware to do more work at once, DNA computers just need more DNA. This could make the DNA computer more efficient, as well as more easily programmable.
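The figure of 2 bits per base follows from having four possible bases at each position (log2 4 = 2), so a 16-mer word can hold 32 bits, or 4 bytes. A minimal sketch of such an encoding is given below; the particular assignment of bit pairs to bases and the function names are assumptions made for the example.

```python
# Assumed mapping of bit pairs to bases; any fixed one-to-one mapping would work.
BASE_FOR_BITS = {0b00: "A", 0b01: "C", 0b10: "G", 0b11: "T"}
BITS_FOR_BASE = {base: bits for bits, base in BASE_FOR_BITS.items()}

def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASE_FOR_BITS[(byte >> shift) & 0b11])
    return "".join(bases)

def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand whose length is a multiple of four bases."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BITS_FOR_BASE[base]
        out.append(byte)
    return bytes(out)

word = bytes_to_dna(b"DNA!")  # 4 bytes fill one 16-mer word
print(word, len(word))
print(dna_to_bytes(word))
```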