Proteomics, being relatively new as a scientific discipline, uses a wide variety of old and new methods to achieve its aims. The techniques used in proteomics focus on revealing structure and conformation, as well as measuring protein concentrations under varying conditions. Structural data can be used to infer the function of a protein by comparison with similar proteins of known function. This generally requires massive protein conformation and function databases, like PRIDE[1] and SwissProt[2]. These comparisons often require sophisticated software and algorithms of their own, simply because of the sheer volume of comparisons and possible matches. Programs like PROMPT[3] are used for this purpose. Protein concentrations can be compared across biological conditions, allowing researchers to find correlations between one or more proteins and a disease or biological state.

X-ray crystallography[4] is a method of protein structure determination that uses crystallized proteins and high-intensity X-ray exposure to visualize the molecular structure. The collected diffraction data are interpreted in terms of atomic and molecular arrangements and used to build a chemically relevant model of the protein's structure.[5] X-ray crystallography can be time-consuming because of the crystal-growing procedures, the volume of data needed and the computational demands of the analysis. A successful structure, however, is highly accurate and easy to read and apply.

The X-ray diffraction pattern of DNA. X-ray crystallography is used for many macromolecules, not just proteins.

Nuclear Magnetic Resonance spectroscopy[6], or NMR, is a technique that probes the properties of individual atomic nuclei within the molecular structure based on their chemical shift[7]. Magnetization transfer between nuclei is also detected and used together with the chemical shift to calculate the structure of the protein in question. The procedures can take a relatively long time, depending on the size of the sample and a variety of other factors related to magnetic fields.
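The chemical shift is conventionally reported in parts per million (ppm) relative to a reference signal. The short Python sketch below shows only that arithmetic. The function name and the example frequencies are assumptions chosen for illustration and do not come from a real experiment.

```python
def chemical_shift_ppm(sample_hz: float, reference_hz: float) -> float:
    """Chemical shift in ppm of a resonance relative to a reference signal.

    Both frequencies are in hertz. The reference frequency must be non-zero.
    """
    if reference_hz == 0:
        raise ValueError("reference frequency must be non-zero")
    # Difference from the reference, scaled to parts per million.
    return (sample_hz - reference_hz) / reference_hz * 1e6


if __name__ == "__main__":
    # Hypothetical values: a nucleus resonating 500 Hz above a 500 MHz reference.
    print(chemical_shift_ppm(500_000_500.0, 500_000_000.0))
```

Reporting the shift as a ratio makes it independent of the spectrometer's field strength, which is why values from different instruments can be compared directly.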

Mass spectrometry[8] and electrophoresis[9], as applied to proteomics, are both techniques used for determining the mass of proteins or protein fragments. Mass spectrometry ionizes the molecules and measures their mass-to-charge ratio to determine mass. Electrophoresis generally moves charged molecules through a gel under an electric field, producing a more visual map of the separation. Common electrophoresis types used in proteomics include isoelectric focusing (IEF), which separates proteins by isoelectric point; SDS-PAGE, which separates them by mass; 2D electrophoresis, which typically combines the two; and standard horizontal gel electrophoresis, to name a few. Directed methods for mass spectrometry use previous knowledge to either include or exclude targeted mass ranges, thus saving time when given a complex mixture of proteins such as an entire proteome.
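As a rough illustration of how a measured mass-to-charge ratio relates to mass, the Python sketch below converts between the two. It assumes positive ions formed by adding protons, which is only one of several possible ionization outcomes. The function names, the rounded proton mass and the example peptide mass are assumptions for illustration.

```python
PROTON_MASS_DA = 1.007276  # approximate proton mass in daltons (assumed rounding)


def mz_from_mass(neutral_mass_da: float, charge: int) -> float:
    """m/z of a positive ion formed by adding `charge` protons to a molecule."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


def mass_from_mz(mz: float, charge: int) -> float:
    """Neutral mass recovered from an observed m/z and an assumed charge state."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS_DA)


if __name__ == "__main__":
    peptide_mass = 1500.0  # hypothetical peptide mass in daltons
    for z in (1, 2, 3):
        print(z, round(mz_from_mass(peptide_mass, z), 4))
```

The same molecule appears at a different m/z for each charge state, so the charge has to be known or inferred before a mass can be assigned.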

Ab initio is a label for a set of techniques designed to predict protein structure based entirely on chemical and physical theory as it applies to the molecule and its parts. This set of techniques is considered a future goal of proteomics but has so far had limited success.[10]

Liquid chromatography[11] is a technique used for protein purification. Samples often come from large mixtures of various molecules, and the proteins of interest must be separated from them before even the most basic analysis can be done.

Microfluidic separation is a technique that combines the purification and detection of proteins using the principles of microfluidics[12].

Western blot[13] is a technique for tracking the relative amounts of proteins in a sample. Samples are prepared so as to prevent degradation and are then separated by mass (through SDS-PAGE). The proteins are transferred to a membrane that allows for detection using labelled antibodies.

Chemical microarrays[14] are, in relation to proteomics, pieces of glass that proteins can be affixed to in ordered microscopic arrays. They can be used for a variety of methods but in proteomics tend to be used for global observations of biochemical and proteomic activity.[15]

Other advances in proteomic technology include Free Flow Electrophoresis (a higher affinity size separation technique)[16] and ProteomeLab’s PF2D system (separation based on charge and hydrophobicity)[17]. Nanotechnology could benefit proteomics, and some scientists are already producing nanoparticles as labeling devices, such as SERRS[18] tags.

The challenges that proteomics faces come mostly from the sheer number of proteins in the human body at any given time. Measuring such a large data set is difficult and time-consuming. The problem shows itself especially in computation, where the time needed to analyze the data goes beyond reasonable thresholds.[19] Other issues arise from differing protein concentrations, which cause inaccuracies and detection problems. More specifically, certain proteins are found in very low concentrations in a cell, and the more concentrated proteins can mask their presence. This is often expressed as the dynamic range[20]. Current methods are not sensitive enough to detect these proteins below certain concentration thresholds.
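The masking problem can be made concrete with a small calculation. The Python sketch below expresses dynamic range as the number of orders of magnitude between the most and least abundant proteins, and lists the proteins that fall outside what an instrument can cover. The protein names, the concentrations and the `instrument_orders` parameter are hypothetical values used only for illustration.

```python
import math


def dynamic_range_orders(concentrations):
    """Orders of magnitude between the highest and lowest positive concentration."""
    positive = [c for c in concentrations if c > 0]
    if not positive:
        raise ValueError("at least one positive concentration is required")
    return math.log10(max(positive) / min(positive))


def masked_proteins(sample, instrument_orders):
    """Names of proteins too dilute to detect next to the most abundant one.

    `sample` maps protein name to concentration (arbitrary but consistent units).
    `instrument_orders` is the assumed dynamic range of the instrument.
    """
    threshold = max(sample.values()) / 10 ** instrument_orders
    return sorted(name for name, conc in sample.items() if conc < threshold)


if __name__ == "__main__":
    # Hypothetical sample: one dominant protein and two rarer ones.
    sample = {"albumin": 1e6, "enzyme": 10.0, "cytokine": 1e-3}
    print(dynamic_range_orders(sample.values()))
    print(masked_proteins(sample, instrument_orders=4))
```

In this toy sample the concentrations span about nine orders of magnitude, so an instrument covering four orders would miss both of the rarer proteins.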

Another challenge arises from the high complexity of many proteins and from the very large number of proteins in the body interacting in different ways. Post-translational modifications of proteins add even more complexity, not only to their structure but also to their function and interactions. This creates computational challenges that could keep a normal computer running for decades to solve problems for a single protein. While new methods in distributed and parallel computing can help to address this, the fact remains that the computational needs of proteomics are staggering. Another approach, called directed mass spectrometry, aims to reduce the amount of redundant information processed by mass spectrometry analyzers by targeting only proteins of interest, making it possible to ignore the most intense signals in a sample and examine only those defined by a researcher.

A source of many of the problems that proteomics faces is simply how new it is. Any newer field will lack a good groundwork of technology, application methods and trained researchers to handle its needs. Proteomics has particular problems with large, uncurated databases and requires better organized, more streamlined and more efficient replacements.[21] While new resources for storage and presentation are being developed, many of them still need substantial improvement, especially in distributing the results of protein investigations.[22]

This article focuses on inclusion list driven mass spectrometry (MS) as a technique for directed MS. The technique involves determining target proteins of interest using chromatography, selecting only those proteins in the first MS stage of tandem MS, and then analyzing them in the second MS stage. The result is an analysis of only the target proteins, regardless of intensity.

Mass spectrometry is used for sequencing proteins by charging molecules and passing them through electric or magnetic fields. Determining their mass-to-charge ratio allows the makeup of the proteins to be determined. It also allows for analysis of post-translational modifications and quantification of the proteome. Picking a protein from the entire proteome of an organism or cell can be very difficult. Non-directed methods involve stochastically selecting an ion for sequencing from a complex mixture. This paper reviews a directed method called inclusion list driven mass spectrometry.

Inclusion list driven mass spectrometry calls for a list of mass ranges to be included from the chromatography separation. These lists are generated from elution time standards and allow a protein of a specific size to be included in a mass spectrometry analysis. Target protein information is then passed to a tandem MS (MS/MS), where the first quadrupole selects the desired target and the second quadrupole analyzes that target. An opposite strategy, called exclusion list driven mass spectrometry, prevents a specific size from being included and allows a certain, previously determined protein to be ignored. The information gained from directed mass spectrometry can be used for comparative analysis of differentially expressed proteins. It allows a certain protein from a certain proteome to be quantified. This information can be used to differentiate disease-state cells from normal cells.
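The selection logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the acquisition software of any real instrument. It assumes that each list entry is a window in both mass-to-charge ratio and elution (retention) time, and that candidates left after filtering are ranked by intensity. The class names, the field names and the `top_n` parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """One list entry: an m/z range that applies during an elution-time range."""
    mz_low: float
    mz_high: float
    rt_start: float  # minutes (assumed unit)
    rt_end: float

    def contains(self, mz: float, rt: float) -> bool:
        return self.mz_low <= mz <= self.mz_high and self.rt_start <= rt <= self.rt_end


@dataclass(frozen=True)
class Precursor:
    """An ion observed in the first MS stage."""
    mz: float
    rt: float
    intensity: float


def select_precursors(precursors, inclusion=(), exclusion=(), top_n=None):
    """Choose which precursors to pass on to the second MS stage.

    Exclusion windows always remove matching ions. If any inclusion windows
    are given, only ions inside one of them are kept, regardless of intensity.
    """
    kept = [p for p in precursors
            if not any(w.contains(p.mz, p.rt) for w in exclusion)]
    if inclusion:
        kept = [p for p in kept
                if any(w.contains(p.mz, p.rt) for w in inclusion)]
    # Rank what remains by intensity (an assumption made for this sketch).
    kept.sort(key=lambda p: p.intensity, reverse=True)
    return kept if top_n is None else kept[:top_n]


if __name__ == "__main__":
    observed = [
        Precursor(500.2, 10.0, 1e6),   # very intense, not of interest
        Precursor(750.4, 12.0, 1e3),   # weak, but the target
        Precursor(900.1, 30.0, 5e5),
    ]
    targets = [Window(750.0, 751.0, 11.0, 13.0)]
    print(select_precursors(observed, inclusion=targets))
```

With the inclusion list applied, the weak target ion is selected even though two far more intense ions are present. Calling the function without any lists and with `top_n=1` returns the most intense ion instead.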

Directed mass spectrometry can also be used to determine co-translational and post-translational modifications. It is used to determine cross-linked peptide pairs, or regions of similar structure, which are used to determine related proteins. Similar strategies can be used to determine multiple phosphorylation state changes.

All of the work in mass spectrometry will continue to improve as more and more information is gained. The growing list of known sequences and databases of mass-to-charge ratios will allow for quick and easy analysis of future samples. It will also allow inclusion lists to become much smaller and exclusion lists to become much larger. This will enable analysts to quickly identify proteins that are missing from or additional to a cell in a given state, leading to quicker and more accurate determination of disease states. All of this work will hopefully lead to hypothesis-driven mass spectrometry, where a theory about the state of a cell can be formulated ahead of time and mass spectrometry simply becomes a means of testing that hypothesis.
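Once the proteins present in two cell states have been identified, finding the missing and additional ones is a simple set comparison. The Python sketch below shows that step under the assumption that each state is summarized as a collection of protein identifiers. The function name and the identifiers are hypothetical.

```python
def compare_states(reference, sample):
    """Proteins missing from, and additional in, `sample` relative to `reference`."""
    reference_set, sample_set = set(reference), set(sample)
    return {
        "missing": sorted(reference_set - sample_set),
        "additional": sorted(sample_set - reference_set),
    }


if __name__ == "__main__":
    # Hypothetical identifiers for a normal cell and a cell in another state.
    normal = ["P1", "P2", "P3"]
    other = ["P2", "P3", "P4"]
    print(compare_states(normal, other))
```

In practice the hard part is producing reliable identifications for each state, and this comparison is only the final bookkeeping step.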

This text is an in-depth description of current methods in mass spectrometry. It discusses the technology and processes used in labs to study proteins given an entire proteome. Information is provided on where these techniques originated, why the techniques need to be altered and what they are ultimately striving to achieve. This information is important to the field of proteomics because mass spectrometry is one of its most important tools, and the paper describes how the method is adapting as the field produces more and more data.