Friday, July 20, 2012

This week I was introduced to a free MS/MS database search engine out of Ohio State University called MassMatrix. The algorithm has evolved over the past several years and has been the subject of at least 7 publications. I will have to explore it further, but it is particularly interesting to me because of its extreme bias toward fragmentation data read at high resolution in an Orbitrap FT scan mode. We discovered this bias when we took two files generated on an Orbitrap Elite from the same sample under the same conditions and interpreted them with three separate database search algorithms. Both files were generated with Top25-based methods: the first was a Top25 CID method, and the second was a Top25 CID high-high method in which the FT MS/MS scans were acquired at 12,000 resolution. Both methods generated roughly the same number of total scans (~75,000).
Both Sequest (PD 1.3) and Mascot (searched through the mysterious-to-me Elucidator platform) generated roughly the same number of peptide and protein matches for the two files (within 3%).
MassMatrix, however, demonstrated an extreme bias toward the high-resolution fragment ions, with >38% more IDs from the high-high method.
This is just another example of the importance of using multiple search algorithms to interpret results. It would be interesting to see why this bias exists in this software, because I feel that you really should be seeing more positive IDs when employing higher resolution MS/MS scans.

Sunday, July 15, 2012

I've been reserving my opinion on the new MS/MS technique called SWATH for several weeks now. Sometimes I like to think about things for a long time before I commit to them, and I felt like I really needed to learn more about this procedure, particularly since my initial reaction was so overwhelmingly negative. After weeks of thinking about it, I'm convinced: I don't think SWATH is a good idea at all.

The SWATH technique is a data-independent fragmentation method. Every ion is fragmented: the good, the bad, the intense, the weak -- every ion. To narrow this down, small mass ranges (or swaths) are chosen, often in 25 amu windows. Every ion coming through with an m/z of 350-375 is fragmented and MS/MS spectra are collected; the instrument then moves on to fragmenting all ions from 375-400, and so on. To be clear, the MS1 mass is never recorded. You simply do not know the mass of your parent ion.
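For the curious, the window-stepping described above can be sketched in a few lines of Python. The precursor range, the 25 amu width, and the function name here are purely illustrative, not anything from an actual instrument method:

```python
# A minimal sketch of a SWATH-style acquisition cycle stepping through
# fixed isolation windows. Everything here is illustrative.

def swath_windows(start_mz, end_mz, width=25.0):
    """Yield (low, high) isolation windows covering the precursor range."""
    low = start_mz
    while low < end_mz:
        yield (low, min(low + width, end_mz))
        low += width

# One acquisition cycle: every ion in each window is co-fragmented in turn.
cycle = list(swath_windows(350.0, 1000.0))
print(cycle[0])    # (350.0, 375.0) -- all ions in this window fragmented together
print(len(cycle))  # 26 windows per cycle
```

The point the sketch makes is the one above: the instrument marches through the windows one after another, and nothing in the cycle records which precursor inside a window actually produced a given fragment.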

In order to get over this minor obstacle of not actually knowing what you fragmented (other than that its m/z fell somewhere within a 25 amu window!), the processing program compares the MS/MS spectra generated to a pre-generated spectral library.
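To give a feel for what a spectral-library comparison might look like under the hood, here is a toy Python sketch that scores an acquired spectrum against a library spectrum using cosine similarity over binned fragment masses. Real library-search tools use far more sophisticated scoring; the peak lists, bin width, and function names are all my own illustrative choices:

```python
import math

def binned(spectrum, bin_width=1.0):
    """Collapse (mz, intensity) peaks into coarse integer m/z bins."""
    bins = {}
    for mz, intensity in spectrum:
        b = int(mz / bin_width)
        bins[b] = bins.get(b, 0.0) + intensity
    return bins

def cosine_score(spec_a, spec_b):
    """Normalized dot product between two binned spectra (1.0 = identical)."""
    a, b = binned(spec_a), binned(spec_b)
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical acquired spectrum vs. a hypothetical library entry
acquired = [(175.1, 50.0), (288.2, 100.0), (401.3, 30.0)]
library_hit = [(175.1, 45.0), (288.2, 90.0), (401.3, 40.0)]
print(round(cosine_score(acquired, library_hit), 3))
```

Note what is missing from the score: the parent mass plays no role at all, which is exactly the concern raised above.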

Okay, so maybe I'm being old-fashioned, and I have been doing this a while. But not that long ago, we looked at the parent mass of every ion we fragmented and made a hypothesis about what it might be. Then we examined each ion generated in MS/MS individually and matched them to see if our hypothesis was correct. I know, mass spectrometry has been growing at an insane rate. It is almost impossible to keep up with the advances in hardware, software, and processing technologies. But at the heart of it, I personally believe that it still breaks down to: can we make a structural hypothesis based on the MS1 and support it with the MS2?

Besides these perceived shortcomings, there are some other concerns. Moving sequentially through mass ranges takes a long time. A method that requires such intense scan times cannot possibly be compatible with rapid chromatography systems. The real limitation of most proteomics labs is the amount of time it takes to get data from a sample, and this is most often limited by the time required to get a good chromatographic separation. Ideally, these would be getting shorter all the time. I know I was thrilled when I found a superior column manufacturer that could trim 15 minutes off of my gradient time -- and here we are looking at a slower method?

3000 views!!!

This is probably tooting my own horn, if you will, but this site has now been viewed more than 3,000 times since I moved it from its old domain in January of this year! Thank you so much for stopping by. Keep the questions coming, and I'll try to keep on writing!

Monday, July 2, 2012

This is a continuation of the experiments I began describing in April of this year.
Please see Part 1 and Part 2 for more information.

Since I finished Part 2 of this short experiment, 3 separate readers have written to me with the exact same question: how many unique peptides were identified in each experiment?

As (I hope) you can see in the grainy .JPEG from Excel above, the peptide OFFGEL separation provided the largest number of both unique protein groups and unique peptides of the 4 methods evaluated. Surprisingly, the SCX came in second, suggesting that (at least for this very simple study) methods that separated the digested plasma peptides were superior to separation of the plasma proteins at the undigested level.

Disclaimer: This study was not designed to discover large numbers of proteins or peptides. It was purely an evaluation of 4 different separation techniques for the identification of depleted plasma proteins. All 4 methods began with the same amount of protein, either pre- or post-digestion, with the assumption that losses during digestion, whether in-gel or in-solution, would be roughly equivalent. The MS method was a standard Top10 high/low with dynamic exclusion after 2 events and a mass width of 0.01 Da.
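For anyone curious about the dynamic exclusion setting, here is a toy Python sketch of that logic: once a precursor m/z has been selected twice (within the 0.01 Da mass width), it is excluded from further selection for a while. The class name and the 30-second exclusion duration are my own illustrative choices, not the actual method settings:

```python
# Toy model of "dynamic exclusion after 2 events" with a 0.01 Da mass
# width. The 30 s exclusion duration is an illustrative assumption.

class DynamicExclusion:
    def __init__(self, repeat_count=2, mass_width=0.01, duration_s=30.0):
        self.repeat_count = repeat_count
        self.mass_width = mass_width
        self.duration_s = duration_s
        self.events = []  # (mz, time) of past precursor selections

    def allow(self, mz, now):
        """Return True if this precursor may be selected for MS/MS now."""
        recent = [t for (m, t) in self.events
                  if abs(m - mz) <= self.mass_width
                  and now - t < self.duration_s]
        if len(recent) >= self.repeat_count:
            return False  # excluded: already fragmented twice recently
        self.events.append((mz, now))
        return True

dx = DynamicExclusion()
print(dx.allow(500.25, 0.0))   # True  -- first selection
print(dx.allow(500.25, 1.0))   # True  -- second selection
print(dx.allow(500.25, 2.0))   # False -- now on the exclusion list
print(dx.allow(500.25, 40.0))  # True  -- exclusion window has expired
```

The practical effect, as in the study above, is that the instrument stops re-fragmenting an abundant precursor and spends its Top10 slots on new ions.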

As always, please email me directly if you would like further information (and sorry for the delay!). Keep those questions coming!