In the second drawing, it looks like the alpha helix is copied or multiplied out several times. It's drawn as if the scientist drew his protein and copied the same image into the picture several times, creating a much thicker-looking protein. In the case of that second image, it looks like there are about 6 or 7 copies of the same protein, all wrapping around each other!

My question:
Is there a special name for this method of "copying" or doubling up the proteins in the image? And is it because, in reality, these proteins actually do wrap around each other several times, or something like that?

(Maybe I should have posted this message in the chemistry forum.)
Thanks,
John.

The structure for the AT-rich interaction domain (PDB ID 1IG6) was determined via NMR spectroscopy. With NMR, one determines an ensemble of structures, and they are typically represented in that manner. For regions where the protein is well-ordered - such as the alpha-helices in that graphic - it looks as if it's been just copied and pasted. For the disordered regions - such as a terminal end - it's all over the place.

Why would they just copy and paste the same protein on top of itself several times? In the case of that image, it's copied roughly 6 or 7 times. But why? And what does it mean?

Could it be that, in reality, there really are 6 or 7 of that same protein all wrapped around each other? Or could it be that the NMR detected 6 or 7 of the proteins, but they can't yet say how the overall protein folds together, so they just show the same protein multiple times on top of itself rather than speculate on the true position of the molecules?

And it's not an isolated example. I have seen hundreds of other protein images where the scientist has done the same thing.

In NMR spectroscopy, the data you collect defines a series of distance and angular relationships between atoms in the protein structure. For example, it may tell you that hydrogen X from residue #23 is close to hydrogen Y from residue #150, or it may say that the N-H bond from residue #30 is at a 10° angle relative to the N-H bond from residue #31.

Determining the structure of a molecule from these distance and angular restraints is not straightforward. Structural biologists employ Monte Carlo methods to perform a somewhat random search for protein structures that best fit these restraints. These random search methods do not always converge on the same structure, and the result often depends on the starting position of the search. Structure determination programs therefore perform many of these searches in parallel, generating many different models of the protein's structure. These models are then scored by how well they satisfy the restraints generated from the NMR data. Structural biologists will generally report the ensemble of best-scoring models from the data set (e.g. the top 20 hits).
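To make the "many random searches, keep the best-scoring models" idea concrete, here is a toy sketch in Python. It is not what real structure determination software does: the four "atoms" live in 2D, the restraint list is made up, and the function names (`score`, `random_search`, `determine_ensemble`) are purely illustrative.

```python
import math
import random

# Hypothetical distance restraints: (atom_i, atom_j, target_distance).
RESTRAINTS = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (0, 3, 1.5), (0, 2, 1.6)]
N_ATOMS = 4


def score(coords):
    """Sum of squared restraint violations; lower means a better fit."""
    total = 0.0
    for i, j, target in RESTRAINTS:
        total += (math.dist(coords[i], coords[j]) - target) ** 2
    return total


def random_search(rng, n_steps=2000, step_size=0.2, temperature=0.01):
    """One Monte Carlo run starting from a random 2D structure."""
    coords = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(N_ATOMS)]
    current = score(coords)
    for _ in range(n_steps):
        k = rng.randrange(N_ATOMS)
        trial = list(coords)
        trial[k] = (coords[k][0] + rng.gauss(0, step_size),
                    coords[k][1] + rng.gauss(0, step_size))
        trial_score = score(trial)
        # Metropolis rule: always accept improvements, occasionally accept worse moves.
        if trial_score <= current or rng.random() < math.exp((current - trial_score) / temperature):
            coords, current = trial, trial_score
    return current, coords


def determine_ensemble(n_runs=30, top_k=5, seed=0):
    """Run many independent searches and keep the best-scoring models."""
    rng = random.Random(seed)
    models = [random_search(rng) for _ in range(n_runs)]
    models.sort(key=lambda m: m[0])
    return models[:top_k]


ensemble = determine_ensemble()
for model_score, _ in ensemble:
    print(f"restraint violation score: {model_score:.4f}")
```

Each run starts somewhere different, so the final models are similar but not identical. That set of kept models is the toy analogue of the overlaid structures in the figure.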

Ideally, if enough distance restraints have been collected and the protein is well behaved, the top scoring hits from the structure determination will all converge on a similar structure. If regions of the protein appear to adopt very different folds in the final ensemble, this could be a sign that there was insufficient data to define the structure of that region or that the region does not fold into a defined structure.
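One simple way to put a number on "converged" versus "all over the place" is to measure, residue by residue, how far each model sits from the average position. The sketch below assumes the models have already been superimposed on a common frame, and the three-model ensemble is invented for illustration; `per_residue_spread` is a hypothetical helper, not a function from any NMR package.

```python
import math


def per_residue_spread(ensemble):
    """RMS deviation of each residue from its mean position across the models.

    Assumes every model lists the same residues in the same order and that
    the models are already superimposed.
    """
    n_models = len(ensemble)
    n_residues = len(ensemble[0])
    spreads = []
    for r in range(n_residues):
        points = [model[r] for model in ensemble]
        mean = tuple(sum(p[axis] for p in points) / n_models for axis in range(3))
        msd = sum(math.dist(p, mean) ** 2 for p in points) / n_models
        spreads.append(math.sqrt(msd))
    return spreads


# Made-up ensemble: residues 0 and 1 agree closely, residue 2 does not.
toy_ensemble = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (1.5, 0.1, 0.0), (3.0, 2.0, 0.0)],
    [(0.0, 0.1, 0.0), (1.4, 0.0, 0.0), (2.0, -2.0, 1.0)],
]

spreads = per_residue_spread(toy_ensemble)
for index, value in enumerate(spreads):
    print(f"residue {index}: spread = {value:.2f}")
```

A small spread corresponds to regions that look "copied and pasted" in the overlay, while a large spread corresponds to the regions that fan out.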

It's worth noting that many find NMR spectroscopy to be superior to X-ray crystallography in this respect, as it is generally incorrect to think of proteins as having a single defined structure. Proteins are very dynamic molecules whose parts move and "breathe" at a variety of timescales. Therefore, folded proteins do not exist as a single native state but as an ensemble of similar structures that interconvert.

Thank you Ygggdrasil for your detailed reply! That does explain a lot!

So from what you just said, after doing our tests, we might come back with, let's say, 6 or 7 closely matching protein structures when we analyse a protein. And even though the 6 or 7 structures are very similar, they might well have small differences.

Looking at the image, all 6 or 7 structures look almost identical. So I don't get it. If the 6 or 7 results are such a close match, why not average them out and just draw 1 image of the protein? I could understand if the difference between the structures in that image was big, but it's not. They are all almost identical.

As Ygggdrasil noted, proteins are dynamic entities - representing them in this manner reflects that they are not static and unchanging. Drawing a single averaged structure can mask the natural variation that is important for understanding enzyme function or protein-protein interactions, for example. In the 1IG6 structure, it appears that at least one structure's loop region (in purple, on the right side between the helices) has adopted a different conformation than the rest of them. It could very well be that this is a region of the protein that is critical for function, and knowing that this different conformation appeared in the structure could be very helpful.

So is it a fair statement to say that in the 1IG6 structure, there are NOT 6 or 7 proteins all wrapped around each other? That the scientist drew the 6 or 7 structures on top of each other to show that this protein can take several possible conformations?

And can I then apply that to other protein images that are drawn in the same way? Like the Gag knuckle, which appears to have about 3 of the same protein structures on top of each other. Again, the researcher did this to show that the Gag knuckle could bend into several different conformations. Is that correct?

That would be correct - it is not 6 or 7 proteins wrapped around each other. These sorts of diagrams are typically "reduced" representations - what you are looking at is the trace along the backbone of the polypeptide chain (following the path along the amide nitrogen, Cα carbon, and carbonyl carbon of each amino acid residue). There is no explicit information about the side chains or even an atom directly bound to the backbone, such as the amide proton or carbonyl oxygen.

So, yes, if you see structures presented in this manner, and they were determined by NMR spectroscopy, it is a safe bet that you are seeing the best 5, 6, or however many structures of the entire set that were generated. If they were not determined by NMR spectroscopy, you will almost certainly have to dig deeper as to why they are being presented in such a way.

Often NMR papers will show an averaged structure along with a figure showing the overlay of some of the best structures. One reason for showing an ensemble of structures is that seeing the individual data points gives more information than simply showing the average. As an example, consider the following sets of measurements of the distance between two points in a protein: {1,2,4,6,8,9} nm, {3,3,3,7,7,7} nm, and {5,5,5,5,5,5} nm. If one were to look only at the average, these three sets of measurements would look identical (each averages 5 nm), yet the three ensembles show very different behaviors. The first ensemble suggests that one or both of the points is on a flexible part of the protein structure that is free to move about and adopt a variety of different conformations, the second ensemble suggests that the protein may switch between two different conformations, and the third ensemble suggests that the distance between the two points is very rigid. Thus, to the trained structural biologist, seeing the full ensemble of models generated from the NMR data gives much more information than just seeing the averaged structure.
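If you want to check the arithmetic on those three sets yourself, a few lines of Python using only the standard library will do it. The labels ("flexible", "two-state", "rigid") are just my shorthand for the three interpretations above.

```python
from statistics import mean, pstdev

# The three example distance sets from the post, in nm.
ensembles = {
    "flexible": [1, 2, 4, 6, 8, 9],
    "two-state": [3, 3, 3, 7, 7, 7],
    "rigid": [5, 5, 5, 5, 5, 5],
}

# Mean and population standard deviation for each set.
summary = {name: (mean(data), pstdev(data)) for name, data in ensembles.items()}

for name, (average, spread) in summary.items():
    print(f"{name:10s} mean = {average} nm, spread = {spread:.2f} nm")
```

All three means come out to 5 nm, while the spreads are roughly 2.9, 2.0, and 0 nm. Even the spread does not tell the whole story: it takes the individual values to see that the second set clusters at two distinct distances rather than being smeared out.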

So even if the structures in the ensemble obtained after structure determination are nearly identical, it is good practice to show the ensemble to give others confidence that your structure determination converged on a single structure.

As an aside, the idea that the full probability distribution of a measurement contains much more information than the average of that distribution is very important to keep in mind in many areas of science. Researchers perform single-molecule and single-cell measurements for just this reason.

Ygggdrasil,
Now I understand exactly what you're saying. As a general rule, when I read papers about protein structure and look at these types of images, I always give the author some slack to allow for the fact that this particular science is new and the technology is still emerging. So any author is presenting the best scientific data that is available to him, even if the data is complex.