DNA II: The Structure of DNA

Did you know that the precise combinations of just four nitrogen bases form the billions of nucleotides that make up our own unique DNA molecules? The base sequence of a single DNA strand stores all of the genetic information in your body and gives each of us our individual genetic traits.

Summary

Exploration of the structure of DNA sheds light on fascinating properties of the molecule. This module, the second in a series, highlights major discoveries, from the parts of a nucleotide - the building blocks of DNA - to the double helix structure of the DNA molecule. The module describes scientific developments that led to an understanding of the mechanism by which DNA replicates itself.

Terms you should know

pair (noun) = a set of two; two similar things that form a unit; two similar things that are used together

pair (verb) = to arrange in a set of two; to become grouped together with one other similar thing

strand = a long, thin piece of something; a length of something thin like string

Look around you. Most objects you are familiar with will eventually fall into ruin if not constantly maintained: a car will eventually rust and fall to pieces; a house will spring leaks in the roof and fall to the ground; even mountain ranges are eroded by wind and rain. Yet, life on Earth continues to flourish. Your children are no weaker or more likely to fall to pieces than you are. This is because living things have a fascinating and somewhat unique ability to reproduce and make "copies" of themselves. To do this, they must first copy their genetic material, their DNA (see our DNA I module for more information). And it is the unique chemical properties of DNA that allow it to generate copies of itself. As we all know, living things do eventually age and deteriorate, much like the old house and rusty car, but by making copies of our DNA and passing it to our offspring, life continues.

The building blocks of DNA

Scientists first began to investigate the unique chemical properties of DNA long before the structure of the molecule was understood, and even before DNA was discovered to be the genetic material. In the late 1800s, J. Friedrich Miescher, a Swiss chemist working in Germany, was studying white blood cells (leukocytes). Because white blood cells are the principal component of pus, Miescher would go to the nearby hospital and collect pus from used bandages. He found that the nucleus of these cells was rich in a then-unknown substance that contained several elements, among them phosphorus and nitrogen. He called this substance "nuclein" because it was found in the nucleus of the cells. We now know that Miescher's "nuclein" (later renamed nucleic acid, for its acidic chemical properties) contained DNA.

In the early 1900s, the Lithuanian-American biochemist Phoebus Levene probed deeper into the chemical composition of nucleic acid and was able to further purify the material. Although Levene was not the first scientist to successfully purify DNA, he was uniquely qualified to correctly determine its composition – he had extensive expertise in the area of carbohydrate and sugar chemistry. When Levene analyzed the chemical properties of nucleic acid, he discovered that DNA was abundant in three things: five-carbon sugars (pentoses), phosphate (as Miescher had previously found), and nitrogen bases. Thus, Levene correctly deduced that the DNA molecule was made of smaller molecules linked together, and these smaller molecules, which he named nucleotides, were made of three parts – a five-carbon sugar, a phosphate group (PO4), and one of four possible nitrogen bases – adenine, cytosine, guanine, or thymine (often abbreviated A, C, G, and T).

Levene was correct in identifying the three parts of a nucleotide, and determining that nucleotides were linked together to make DNA; however, in 1928, he also incorrectly proposed that one of each of the four nucleotides was linked together in a small circular molecule and that these "tetranucleotides" were the basis of DNA (Levene and London, 1928) (Figure 1).

Because he thought DNA was a simple circular structure, Levene rejected the notion that it could be the genetic material and sided firmly with those who believed that proteins contained the genetic code of organisms. However, much later, in the 1940s, Austrian-American scientist Erwin Chargaff reported that DNA from various species of life forms had different amounts of the four nucleotides (Vischer and Chargaff, 1948). This strongly argued against Levene's hypothesis that DNA was simply a circular tetranucleotide, and scientists began to propose other possible structures of the DNA molecule. Despite what he got wrong, Levene's contributions to our understanding of the DNA molecule were substantial.
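The reasoning against the tetranucleotide hypothesis can be illustrated with a small sketch. If DNA were built only from units containing one of each nucleotide, every base would make up exactly one quarter of the total, so any sample with unequal amounts contradicts the idea. The function name `base_fractions` and both sequences below are invented for illustration; they are not real measurements.

```python
from collections import Counter

def base_fractions(sequence):
    """Return the fraction of each base (A, C, G, T) in a DNA sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[base] for base in "ACGT")
    return {base: counts[base] / total for base in "ACGT"}

# Hypothetical toy sequences, not real data.
tetranucleotide_like = "ACGT" * 5        # one of each base, repeated
uneven_example = "AATTAATTGCAATTAATTAT"  # invented sequence rich in A and T

print(base_fractions(tetranucleotide_like))
# {'A': 0.25, 'C': 0.25, 'G': 0.25, 'T': 0.25}
print(base_fractions(uneven_example))
# {'A': 0.45, 'C': 0.05, 'G': 0.05, 'T': 0.45}
```

A sample like the second one could not be made from repeating tetranucleotides, which is the kind of evidence Chargaff's measurements provided.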

Thanks to the work of Levene and several others, the chemical structure of the individual nucleotides was established by the early 1910s. Below are diagrams of the three parts of a nucleotide (Figure 2).

Figure 2: A nucleotide. The five-carbon sugar deoxyribose forms the center of the molecule. Attached to carbon #1 is the nitrogen base, and attached to carbon #5 is the phosphate group (there may be 1, 2, or 3 phosphates in a nucleotide).

The sugar deoxyribose gets its name because when it was discovered (by Levene), it was found to lack one oxygen atom when compared to another sugar he discovered called ribose (Figure 3).

Figure 3: Ribose vs. Deoxyribose. These two pentoses, or five-carbon sugars, differ only in the presence of an oxygen on ribose at the #2 carbon. At the #2 carbon of deoxyribose, an H exists in place of the -OH group on ribose; however, lone hydrogens are often omitted from drawings of organic molecules, as above.

The oxygen missing from deoxyribose is on carbon #2, thus the full name of the sugar is 2'-deoxyribose. (In biochemistry, the carbons in sugar groups are often numbered with the "prime" symbol (as in 2'), to clarify that the carbon referred to is in the sugar and not another part of the molecule.)

Levene correctly deduced the connections between the nucleotides, and the chemical name for these connections is "phosphodiester bonds." These bonds are often casually referred to as "5' to 3' connections" because a phosphate group (PO4) serves as the bridge between the 5' carbon of one nucleotide and the 3' carbon of the next (Figure 4).

Although Levene originally thought that four nucleotides were connected together in a circular molecule, we now know that the individual nucleotides are connected to form a very long linear structure (Figure 5).

The four nitrogen bases of DNA are grouped into two "families" based on their chemical structure: the purines, adenine and guanine, have a structure with two rings; and the pyrimidines, cytosine and thymine, have only one ring (Figure 6).

Figure 6: The nitrogen bases. Shown here are the four different nitrogen bases found in DNA nucleotides. Note that guanine and adenine, the purines, have two rings, while cytosine and thymine, the pyrimidines, have only one ring.

Thus, the strands of DNA inside our cells are polymers of repeating units of nucleotides. It is the precise order, or sequence, of the billions of nucleotides – As, Cs, Gs, and Ts – making up our own unique DNA molecules that gives us our individual genetic traits.
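The two families of bases described above can be written down as a small lookup, shown here as a minimal sketch. The names `PURINES`, `PYRIMIDINES`, `base_family`, and `count_families` are illustrative choices, and the example sequence is made up.

```python
PURINES = {"A", "G"}      # two-ring bases: adenine, guanine
PYRIMIDINES = {"C", "T"}  # one-ring bases: cytosine, thymine

def base_family(base):
    """Return 'purine' or 'pyrimidine' for a single DNA base letter."""
    base = base.upper()
    if base in PURINES:
        return "purine"
    if base in PYRIMIDINES:
        return "pyrimidine"
    raise ValueError(f"not a DNA base: {base!r}")

def count_families(sequence):
    """Count how many purines and pyrimidines a sequence contains."""
    counts = {"purine": 0, "pyrimidine": 0}
    for base in sequence:
        counts[base_family(base)] += 1
    return counts

print(count_families("ATGGCTAGCT"))  # {'purine': 5, 'pyrimidine': 5}
```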

The discovery of the double helix

Once the building blocks of DNA were fully understood, by the late 1940s and early 1950s, scientists began to study the larger structure of DNA by taking X-ray diffraction pictures of purified DNA molecules. However, the pictures they took were not consistent with a simple linear strand of nucleotides, as depicted in Figure 5. Instead, the pictures argued that DNA is even more complex and has a very regular and symmetrical shape.

A number of scientists began to propose possible structures for the DNA molecule based on this research. Because the pictures argued for a symmetrical shape and chemical evidence argued that DNA was a polymer of nucleotides, many scientists thought that multiple strands wrapped around each other, like a braid or a rope. In fact, Linus Pauling, a prominent American scientist, had envisioned that DNA might be a triple helix – three strands of nucleotides wrapping around each other. Pauling, who would later win a Nobel Prize for correctly deducing the "alpha-helix" structure of proteins, even published a paper proposing a triple helix model of DNA in 1953 (Pauling and Corey, 1953). Pauling's practice of building models of molecular structures caught on with many biochemists of the day, and this time period has been referred to as the era of model building.

Several variants of a helix-shaped DNA were proposed by other scientists. In 1951, the molecular biologists Francis Crick and James Watson, working in Cambridge, England, had built their own incorrect version of a triple helix model. However, the diffraction pictures at the time were all of relatively poor quality and resolution. As the technique was further refined, a brilliant chemist named Rosalind Franklin (Figure 7), working at King's College London, was able to take much higher-resolution X-ray diffraction pictures.

Franklin's high quality pictures confirmed that DNA is actually a double helix - two strands wrapped around each other. However, the first double-stranded molecule built by Watson and Crick had the sugar-phosphate backbones of two strands wrapped around each other and the nitrogen bases pointing outward. It was Rosalind Franklin who pointed out the error in this model. She reminded Watson and Crick that the nitrogen bases are not very soluble in water and thus they would not be pointed outward where they would be surrounded by nearby water molecules in the cell. Instead, she argued, the sugars and phosphates, which are soluble in water, would be pointed outwards, towards the water, and the nitrogen bases would likely be tucked into the interior of the molecule, away from the water molecules, and perhaps interacting with each other.

Chargaff's Law

This was a vital piece of advice for Watson and Crick, leading them to take their model apart and begin to build a new one. This time, they built the double helix with the sugar-phosphate backbones on the outside of the helix and the nitrogen bases facing inward. They realized that the nitrogen bases of the two strands would now be in proximity of one another and would likely interact. A crucial piece of evidence that helped them figure this out came from Erwin Chargaff's studies. In addition to demonstrating that different organisms had different amounts of the four nitrogen bases of DNA, in 1951, Chargaff also reported that the amount of adenine (A) always equals the amount of thymine (T) and the amount of cytosine (C) always equals the amount of guanine (G). This is now known as "Chargaff's law."
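Chargaff's law can be expressed as a simple check on measured base amounts. The sketch below is illustrative only: the function name `satisfies_chargaff`, the 5% tolerance (an assumption to allow for measurement error), and all of the counts are invented, not real data.

```python
def satisfies_chargaff(counts, tolerance=0.05):
    """Check whether A matches T and C matches G within a relative tolerance."""
    def close(x, y):
        return abs(x - y) <= tolerance * max(x, y)
    return close(counts["A"], counts["T"]) and close(counts["C"], counts["G"])

# Invented base counts for hypothetical samples (not real measurements).
sample_one = {"A": 303, "T": 297, "C": 199, "G": 201}
sample_two = {"A": 250, "T": 250, "C": 250, "G": 250}
skewed = {"A": 400, "T": 250, "C": 200, "G": 150}

print(satisfies_chargaff(sample_one))  # True
print(satisfies_chargaff(sample_two))  # True
print(satisfies_chargaff(skewed))      # False
```

Note that the first two samples differ in their overall composition, yet both satisfy the law: the amounts can vary between organisms while A still matches T and C still matches G.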

With Chargaff's law in mind, Watson and Crick had a revelation. They reasoned that if the molecule is double-stranded, perhaps every time that an A was on one strand of the molecule, a T appeared in the complementary position on the opposite strand (and vice versa); further, every time a C was on one side, a G would be on the other. This would explain why Chargaff's law held true. But there was one problem. The nitrogen bases did not "fit together" in this configuration. Franklin had taken very good pictures of the DNA molecule that demonstrated that it was a tightly packed, narrow structure. When large molecules interact tightly, the smaller constituent molecules that closely pack together must be "complementary" like two interlocking pieces of a puzzle. For example, a negative charge will be closely associated with a positive charge, etc. Watson and Crick knew that their model wasn't quite right, because the nitrogen bases were not fitting together very well.

Anti-parallel configuration of DNA strands

The final revelation that allowed Watson and Crick to complete their model came in a moment described as "a stroke of inspiration" when Watson realized that the nucleotides would fit together if one was "upside down" relative to the other. (According to Watson, he saw this possibility as he sat across a small table from Crick, both of them working with small models of nucleotides.) This upside down orientation would occur if the two strands that wrap around each other are not pointed in the same direction, but in opposite directions. Thus, these two strands are said to be anti-parallel, like the traffic on a two-lane highway (Figure 8).

Suddenly, everything made sense! With the two strands wrapping around each other in an anti-parallel configuration, Watson and Crick were able to fit the strands very close together, as Franklin's picture shows them to be, and the structure is regular and symmetrical. Most importantly, the nitrogen bases fit perfectly together through a type of chemical attraction called a hydrogen bond. Hydrogen bonds hold the two strands together stably, but not permanently. Specifically, an adenine–thymine "base pair" has two hydrogen bonds and a cytosine–guanine base pair has three hydrogen bonds. (See Figure 8 above.)
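Because each adenine–thymine pair contributes two hydrogen bonds and each cytosine–guanine pair contributes three, the total number of hydrogen bonds holding a stretch of double helix together can be tallied from one strand alone. The following is a minimal sketch; the names `HYDROGEN_BONDS` and `hydrogen_bonds` and the example sequences are illustrative.

```python
# Hydrogen bonds formed by the base pair that each base takes part in.
HYDROGEN_BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}

def hydrogen_bonds(strand):
    """Total hydrogen bonds between a strand and its complementary partner."""
    return sum(HYDROGEN_BONDS[base] for base in strand.upper())

print(hydrogen_bonds("ATGC"))  # 10  (2 + 2 + 3 + 3)
print(hydrogen_bonds("AAAA"))  # 8
print(hydrogen_bonds("GGGG"))  # 12
```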

Given this anti-parallel structure, to distinguish the two strands of DNA, scientists say that one strand is oriented "5' to 3'" and the other strand is "3' to 5'." This is in reference to the 5'-3' connections in the phosphate-sugar backbone. The machinery of the cell also uses this orientation to select which direction to read the genetic information contained in the nucleotide sequence. Imagine trying to read an English sentence going from right to left. This would make no sense because the proper direction of reading English is left to right. Similarly, the DNA code must be read in the correct direction, which is 5' to 3'.
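The effect of the anti-parallel arrangement on how sequences are written can be sketched in a few lines. Since the partner strand runs in the opposite direction, writing it in its own 5' to 3' direction means pairing each base (A with T, C with G) and then reversing the order. The names `COMPLEMENT` and `reverse_complement` are illustrative, and the sequence is made up.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand_5_to_3):
    """Return the partner strand, also written in the 5' to 3' direction."""
    paired = [COMPLEMENT[base] for base in strand_5_to_3.upper()]
    # The partner runs the opposite way, so reverse it to read 5' to 3'.
    return "".join(reversed(paired))

top = "ATGGC"
print(top)                      # ATGGC  (5' to 3')
print(reverse_complement(top))  # GCCAT  (partner strand, 5' to 3')
```

Applying the function twice returns the starting strand, which reflects the symmetry of the two strands in the double helix.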

The beauty of the double-stranded anti-parallel configuration is found in the complementary base pairing according to Chargaff's law. If we know the sequence of nucleotides on one strand, we can accurately predict the nucleotides on the other. An adenine on one side of the DNA molecule would be paired with a thymine on the other side, and so on. Thus, if the two strands are separated, we could look at either strand and know exactly what was on the complementary strand. In fact, this is precisely what happens during DNA replication: the DNA double helix is pried apart or "unzipped" and both of the single strands then serve as copy templates for synthesizing a new strand. The result is two new DNA double helices, both of which are identical to each other and to the original molecule (Figure 9).
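The "unzip and copy" idea described above can be modeled in a simplified way. In this sketch a double helix is represented as a pair of strings, each written 5' to 3'; the names `PAIR`, `template_copy`, and `replicate` are hypothetical, and the model ignores the enzymes and other details of real replication.

```python
PAIR = str.maketrans("ACGT", "TGCA")  # A<->T and C<->G pairing

def template_copy(strand):
    """Build the partner strand for a template; both are written 5' to 3'."""
    return strand.translate(PAIR)[::-1]

def replicate(duplex):
    """Unzip a duplex (strand, partner) and use each strand as a template."""
    first, second = duplex
    return (first, template_copy(first)), (template_copy(second), second)

original = ("ATGGCTAG", template_copy("ATGGCTAG"))
copy_one, copy_two = replicate(original)

print(original)                                    # ('ATGGCTAG', 'CTAGCCAT')
print(copy_one == original, copy_two == original)  # True True
```

Each copy keeps one strand from the original molecule and gains one newly built strand, yet both copies carry the same sequence as the original.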

Once Watson and Crick had built the correct model, all could see that the anti-parallel configuration and the hydrogen bond base-pairing allowed this simple and effective means of DNA self-replication. In fact, the final sentence of their 1953 research article announcing the structure of DNA was, "It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material." Watson and Crick published their model of DNA in the journal Nature in 1953, a model which earned them the Nobel Prize in 1962.

There has been much debate about whether Rosalind Franklin, as a rare female scientist in the 1950s, received enough credit for her crucial contributions to this important discovery. Unfortunately, she died from ovarian cancer just five years after the model was built and Nobel Prizes are not given posthumously. In the 1950s, scientists were not aware of the cancer risks involved with repeated X-ray exposure and did not properly protect themselves from the radiation given off by these instruments. Thus, it is conceivable that Franklin's premature death was a direct result of her dedication to scientific research and her pursuit of the structure of the DNA molecule.

But how does DNA store information?

With the discovery of the structure of DNA, a number of fascinating properties of the molecule were revealed. Not only can the molecule replicate itself, but the base sequence of a single DNA strand also stores all of the genetic information in your body. Think of the phone numbers stored in your cell phone. Each digit by itself means nothing. But when strung together in a precise sequence (e.g., 6-4-6-5-5-7-4-5-0-4), these numbers form a code for contacting another specific telephone. The same is true for DNA. The bases T, C, A, and G mean nothing by themselves. However, a long sequence such as ATGGCTAGCTCGATCGTACGT... can form the code for building an important molecule in your body. This molecule may then perform a function in your body that allows your heart to beat, your stomach to digest, your muscle to flex, or your brain to think. Thus, because these sequences of nucleotides provide the information for the cell to build proteins and other molecules, DNA is often called the "blueprint of life." How this blueprint is actually used by cells to build other molecules is explored in additional modules.
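The phone-number analogy can be made concrete by counting how many distinct codes a sequence of a given length allows. A ten-digit phone number has 10 choices per position, while a DNA sequence has 4 choices (A, C, G, or T) per position. The helper name `possible_codes` is illustrative; the two examples are the ones quoted in the paragraph above.

```python
def possible_codes(symbols, length):
    """Number of distinct sequences of a given length over an alphabet."""
    return symbols ** length

phone_number = "6465574504"
dna_fragment = "ATGGCTAGCTCGATCGTACGT"

print(len(phone_number), possible_codes(10, len(phone_number)))
# 10 10000000000
print(len(dna_fragment), possible_codes(4, len(dna_fragment)))
# 21 4398046511104
```

Even this short 21-base fragment is one of more than four trillion possible sequences, which gives a sense of how much information billions of bases can hold.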

Key Concepts

DNA consists of two strands of repeating units called nucleotides; each nucleotide is made up of a five-carbon sugar, a phosphate group, and a nitrogen base.

The specific sequence of the four different nucleotides that make up an organism's DNA gives that organism its own unique genetic traits.

The four nitrogen bases are complementary – adenine is complementary to thymine, cytosine is complementary to guanine – and the pairs form hydrogen bonds when the 5'/3' ends of their attached sugar-phosphate groups are oriented anti-parallel to one another.