We recently introduced a novel scheme combining electron-transfer and higher-energy collision dissociation (termed EThcD) for improved peptide ion fragmentation and identification. We reasoned that phosphosite localization, one of the major hurdles in high-throughput phosphoproteomics, could also benefit greatly from the generation of such EThcD spectra. Here, we systematically assessed the impact of EThcD on phosphosite localization in comparison to ETD and HCD, using a defined synthetic phosphopeptide mixture and a larger dataset of Ti4+-IMAC enriched phosphopeptides from a tryptic human cell line digest. In combination with a modified version of phosphoRS, we observed that in the majority of cases EThcD generated richer and more confidently identified spectra, resulting in superior phosphosite localization scores. Our data demonstrates the distinctive potential of EThcD for PTM localization, also beyond protein phosphorylation.

Reversible phosphorylation of proteins
is a key regulatory mechanism
in living cells.1 Protein phosphorylation
can modulate protein activity, turnover, subcellular localization,
complex formation, folding and degradation. Dynamic phosphorylation
plays a pivotal role in almost all biological processes including
cell division, differentiation, polarization and apoptosis.2 Moreover, it is an important switch in cellular
signal transduction.3 The importance of
this post-translational modification (PTM) for cell biology has driven
the development of novel mass spectrometric tools for sensitive and
global detection of phosphorylation.4,5 However, the
analysis of phosphorylated peptides by mass spectrometry is still
not as straightforward as for “regular”, unmodified
peptides. One of the major challenges in phosphoproteomics is to improve
MS level representation since phosphopeptides are usually present
at substoichiometric levels. Hence, an enrichment step is necessary
to enable deeper penetration of the phosphoproteome. Enrichment is
typically performed by chromatography,6 antibodies7 or metal-ion/metal oxide
affinity-based8,9 techniques. Two other main challenges
are the identification of phosphopeptides and confident localization
of the corresponding phosphosite.10 These
challenges arise from the higher lability of the phosphate group
compared to the amide bond. A number of strategies have been
proposed to circumvent poor fragmentation and improve sequence and
site diagnostic fragmentation, including the use of neutral loss-triggered
MS/MS/MS11 and multistage activation (MSA)12 in ion traps, the use of beam type CID fragmentation,13 and electron capture/transfer dissociation14 or a combination of some of these approaches.9,15

Once phosphopeptide identification is feasible through sufficient
peptide backbone fragments, it can still be challenging to pinpoint
the true phosphosite. This becomes more difficult as the number of
potential phosphorylation sites within the peptide sequence increases.
In principle, unambiguous phosphosite localization requires site-determining
fragment ions.16 Direct validation is feasible
through detection of a fragment ion that carries the phosphate group.
Neutral loss fragment ions can be used as well; however, since they
exhibit the same mass as a water loss from an unmodified residue they
do not directly confirm the correct site.17 Diagnostic phosphosite-specific fragments facilitate pinpointing
the correct phosphosite.18−20 Several algorithms and programs
have been developed to enable automatic phosphosite localization.3,16,21−26 These software tools are based on distinct but similar approaches
and they all aim to provide a metric that allows for assessment of
the confidence in phosphosite localization. Recently, Taus et al.
reported a new algorithm, termed phosphoRS,27 which is currently the only tool compatible with CID, HCD and
ETD fragmentation and was optimized for both low- and high-resolution
MS/MS spectra. phosphoRS provides individual localization probabilities
for all potential phosphosites in a given peptide.
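
To illustrate what makes a fragment ion site-determining, the following minimal Python sketch lists, for two candidate phosphosites in the same peptide, the N-terminal (b/c) and C-terminal (y/z) fragment indices whose mass depends on which site carries the phosphate. It is not taken from phosphoRS or any other published tool; the function name and the example sequence are hypothetical.

```python
def site_determining_ions(sequence, site_a, site_b):
    """Return (n_term, c_term) fragment indices that discriminate two candidate sites.

    site_a and site_b are 1-based residue positions (hypothetical interface).
    n_term holds indices i of b_i/c_i ions, c_term holds indices j of y_j/z_j ions.
    """
    n = len(sequence)
    lo, hi = sorted((site_a, site_b))
    # b_i/c_i contain residues 1..i: they include the lower site but not the
    # higher one when lo <= i < hi, so their mass shifts with the site choice.
    n_term = list(range(lo, hi))
    # y_j/z_j contain the last j residues (n-j+1..n): they include the higher
    # site but not the lower one when n-hi+1 <= j <= n-lo.
    c_term = list(range(n - hi + 1, n - lo + 1))
    return n_term, c_term


# Hypothetical 7-residue peptide with candidate serines at positions 3 and 5.
n_term, c_term = site_determining_ions("AASPSPK", 3, 5)
print("b/c indices:", n_term)
print("y/z indices:", c_term)
```

The closer two candidate residues are, the fewer backbone cleavages can separate them, which is one way to see why additional ion series improve the odds of observing at least one discriminating fragment.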

Generally,
all scoring tools depend on the quality of the MS/MS
spectra. The more site-determining ions are detected, the higher the
confidence in phosphosite localization. We have recently introduced
a novel fragmentation scheme combining electron-transfer and higher-energy
collision dissociation, termed EThcD.28 This method employs dual fragmentation to generate both b/y and
c/z ions, which leads to MS/MS spectra that are rich in fragment ions
and thus in data. Compared to HCD and ETD, we found a substantial increase
in peptide backbone fragmentation, which translated into a remarkable
average peptide sequence coverage of ∼94% for tryptic peptides.
We reasoned that localization of post-translational modifications
could also benefit greatly from EThcD spectra. Here, we systematically
assessed the impact on phosphosite localization using EThcD. In this
work we evaluate the performance of EThcD in comparison to ETD and
HCD using a defined synthetic phosphopeptide mixture and also on a
larger data set of Ti4+-IMAC enriched phosphopeptides,
all in combination with a modified version of phosphoRS.

Experimental Section

Materials

All chemicals were purchased from Sigma-Aldrich
(Steinheim, Germany) unless otherwise stated. Formic acid and ammonia
were obtained from Merck (Darmstadt, Germany). Acetonitrile was purchased
from Biosolve (Valkenswaard, The Netherlands).

Sample Preparation

Protein from HeLa cells was harvested
and digested with trypsin, as previously described.29 Ti4+-IMAC beads were prepared as reported elsewhere.30,31 Phosphopeptides were enriched as previously described.32 Briefly, gel-loader tips that were plugged with
C8 material (3M, Zoeterwoude, The Netherlands) were filled up to 1
cm with Ti4+-IMAC beads. Columns were equilibrated with
loading buffer (80% ACN, 6% TFA). Peptides were reconstituted in loading
buffer, loaded onto the columns and washed with washing buffer 1 (50%
ACN, 0.5% TFA, 200 mM NaCl) and subsequently washing buffer 2 (50%
ACN, 0.1% TFA). Phosphopeptides were eluted with elution buffer 1
(10% NH3 in H2O) followed by elution buffer
2 (80% ACN, 2% FA). The eluate was acidified and diluted with formic acid
to a final acetonitrile concentration of <5%, split into three
equal amounts and directly analyzed by single run LC–MS/MS
utilizing ETD, HCD and EThcD, respectively.

Mass Spectrometry

All data was acquired on an ETD enabled
Thermo Scientific LTQ Orbitrap Velos mass spectrometer (Thermo Fisher
Scientific, Bremen, Germany). A Thermo Scientific EASY-nLC 1000 (Thermo
Fisher Scientific, Odense, Denmark) was connected to the LTQ Orbitrap
Velos mass spectrometer. ETD, HCD and EThcD methods were set up as
previously described.28 Briefly, all spectra
were acquired in the Orbitrap at a resolution of 7500. For HCD the
normalized collision energy was set to 40%. The ETD reaction time
was set to 50 ms for ETD and EThcD. Supplemental activation was enabled
for ETD. HCD normalized collision energy was set to 30% for EThcD
(calculation based on precursor m/z and charge state). The anion AGC target was set to 4e5 for both
ETD and EThcD.

Data Analysis

Peak lists were generated using Thermo
Scientific Proteome Discoverer 1.3 software (Thermo Fisher Scientific,
Bremen, Germany). The nonfragment filter was used to simplify ETD
spectra with the following settings: the precursor peak was removed
within a 4 Da window, charge-reduced precursors were removed within
a 2 Da window, and neutral losses from charge-reduced precursors were
removed within a 2 Da window (the maximum neutral loss mass was set
to 120 Da). MS/MS spectra were searched against a database containing
the synthetic phosphopeptide sequences and the human Uniprot database
(version v2010–12), respectively, including a list of common
contaminants using SEQUEST or Mascot (Matrix Science, UK). The precursor
mass tolerance was set to 10 ppm, the fragment ion mass tolerance
was set to 0.02 Da. Enzyme specificity was set to trypsin with 2 missed
cleavages allowed. Data from the synthetic phosphopeptide mixture
was searched with no enzyme specificity. Oxidation of methionine and
phosphorylation (S, T, Y) were set as variable modifications and carbamidomethylation
of cysteines was set as a fixed modification. Percolator33 was used to filter the peptide-spectrum matches (PSMs) to a false discovery rate (FDR) of <1%.
Phosphorylation sites were localized by applying a custom version
of phosphoRS27 (v3.0 – EThcD enabled)
that has been expanded to allow analysis of EThcD data.28 Briefly, the algorithm considers both HCD- and
ETD-type fragment ions at the same time. While singly and doubly charged
b- and y-type fragment ions including neutral loss of phosphoric acid
(H3PO4) are considered for site localization,
only singly charged c-, z-radical and z-prime ions are scored.
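
As a rough illustration of the ion types listed above, the sketch below computes theoretical m/z values for a peptide with a single phosphosite: singly and doubly charged b- and y-ions (with H3PO4 neutral-loss variants where the fragment contains the phosphosite) and singly charged c-, z-radical and z-prime ions. This is not the phosphoRS implementation. The function name and ion labels are hypothetical, the masses are standard monoisotopic values, the z-radical and z-prime offsets follow the common convention (z + 1 H and z + 2 H), and other modifications such as carbamidomethylation are ignored for brevity.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON, HYDROGEN = 1.007276, 1.007825
WATER, AMMONIA = 18.010565, 17.026549
PHOSPHO, H3PO4 = 79.966331, 97.976896  # HPO3 addition, H3PO4 neutral loss


def fragment_mz(sequence, phosphosite):
    """Theoretical fragment m/z values; phosphosite is a 1-based position."""
    masses = [RESIDUE[aa] for aa in sequence]
    masses[phosphosite - 1] += PHOSPHO
    n = len(masses)
    ions = {}
    for i in range(1, n):
        n_sum = sum(masses[:i])          # residues 1..i (b_i, c_i)
        c_sum = sum(masses[n - i:])      # last i residues (y_i, z_i)
        n_has_p = phosphosite <= i
        c_has_p = phosphosite > n - i
        for charge in (1, 2):
            tag = f"({charge}+)"
            ions[f"b{i}{tag}"] = (n_sum + charge * PROTON) / charge
            ions[f"y{i}{tag}"] = (c_sum + WATER + charge * PROTON) / charge
            if n_has_p:
                ions[f"b{i}-H3PO4{tag}"] = (n_sum - H3PO4 + charge * PROTON) / charge
            if c_has_p:
                ions[f"y{i}-H3PO4{tag}"] = (c_sum + WATER - H3PO4 + charge * PROTON) / charge
        # ETD-type ions, singly charged only.
        ions[f"c{i}"] = n_sum + AMMONIA + PROTON
        ions[f"z.{i}"] = c_sum + WATER - AMMONIA + HYDROGEN + PROTON
        ions[f"z'{i}"] = c_sum + WATER - AMMONIA + 2 * HYDROGEN + PROTON
    return ions


# Hypothetical tripeptide with the phosphate on the serine at position 2.
for label, mz in fragment_mz("ASK", 2).items():
    print(f"{label:>14s}  {mz:10.4f}")
```

Matching such a theoretical list against the observed peaks within the fragment ion mass tolerance is the usual starting point for any localization score.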

Results and Discussion

Increasing the confidence in
phosphosite localization is a key
challenge in phosphoproteomics. Site-determining fragment ions are
required to unambiguously pinpoint the correct phosphosite. Observing
all possible peptide backbone cleavages in a single MS/MS spectrum
substantially simplifies phosphosite localization. Recently, we showed
that EThcD enables complete peptide sequencing through dual fragmentation.28 In EThcD, the peptide precursor is initially
subjected to an ion/ion reaction with fluoranthene anions in a linear
ion trap, which generates c- and z-ions. However, the unreacted precursor
and the charge-reduced precursor remain highly abundant after ETD.
In the second step, HCD all-ion fragmentation is applied to all ETD-derived
ions. This generates b- and y-ions from the unreacted precursor
and simultaneously increases the yield of c- and z-ions by fragmentation
of the charge-reduced precursor. Since the remaining unreacted precursor
population carries a higher charge than the ETD-derived fragment ions, one
can apply a level of energy that fragments the precursor but does
not induce secondary fragmentation of c- and z-ions. Here, we continue
to explore the benefits of this novel fragmentation mode for the analysis
of phosphopeptides.

Evaluation of Phosphosite Localization by EThcD using a Defined
Phosphopeptide Mixture

To evaluate the potential added value
of phosphopeptide analysis by EThcD we initially used a defined mixture
of well-characterized synthetic phosphopeptides. This mixture consists
of 30 phosphopeptides of varying length with up to four phosphorylated
residues (see Supplementary Table 1 for
a complete list, Supporting Information). We analyzed this mixture by LC–MS/MS employing ETD, HCD
and EThcD fragmentation, respectively. We used identical instrument
settings, the only exception being the parameters for peptide
dissociation, which were set to the values optimized for each method.
The data was searched with SEQUEST and the PSMs were manually validated
and filtered (7 ppm peptide mass tolerance, search engine rank 1,
absolute Xcorr threshold 0.4). Additionally, we considered only PSMs
for which the injection time did not reach its maximum (<500 ms), that is,
the target number of ions was reached. Note that this precaution was
taken to exclude the number of ions as a variable that might impair
the quality of fragmentation. We calculated the average precursor
ion purity (PIP)34 for each data set and
found similar values, which were approximately 95% for all three techniques.
Together, these stringent criteria ensure that the activation technique
is the only variable that controls the fragmentation behavior. A summary
of the data from this direct comparison is given in Table 1. Similar numbers of PSMs were identified for all
three fragmentation techniques. We found that EThcD provided 248 PSMs
while these numbers were 237 and 216 for HCD and ETD, respectively.
Out of the 30 unique synthetic phosphopeptides injected, ETD, HCD and
EThcD identified 21, 22 and 24, respectively. The average
SEQUEST Xcorr was highest for EThcD (2.5), followed by HCD (1.9)
and ETD (1.5), which is in line with our previous results for nonmodified
peptides.28 The SEQUEST algorithm correctly
annotated the known phosphosites in 79% of ETD and 78% of HCD data.
Notably, for EThcD this was over 95% (of all PSMs), which directly
reflects the higher spectral quality due to the generation of both
b/y and c/z ions. This initial data suggests that EThcD provides even
more extensive backbone fragmentation of phosphorylated peptides than
ETD or HCD alone, facilitating sensitive phosphosite localization
with very high confidence. It should be noted that the application
of a site localization algorithm would be prudent for real-life samples
since the true phosphorylation sites are unknown.
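
The automated part of the PSM filtering described above (7 ppm peptide mass tolerance, search engine rank 1, Xcorr threshold 0.4, injection time below 500 ms) can be sketched as follows. This is an illustrative reconstruction rather than the workflow used in Proteome Discoverer: the `PSM` record and its field names are hypothetical, treating the Xcorr threshold as inclusive is an assumption, and the manual validation step is not modeled.

```python
from dataclasses import dataclass


@dataclass
class PSM:
    """Hypothetical minimal record for one peptide-spectrum match."""
    observed_mass: float      # measured peptide mass (Da)
    theoretical_mass: float   # calculated peptide mass (Da)
    rank: int                 # search engine rank
    xcorr: float              # SEQUEST cross-correlation score
    injection_time_ms: float  # ion injection time of the MS/MS scan


def ppm_error(observed, theoretical):
    """Signed mass deviation in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def passes_filter(psm, max_ppm=7.0, min_xcorr=0.4, max_injection_ms=500.0):
    """Apply the four criteria; thresholds default to the values in the text."""
    return (
        abs(ppm_error(psm.observed_mass, psm.theoretical_mass)) <= max_ppm
        and psm.rank == 1
        and psm.xcorr >= min_xcorr             # assumption: inclusive threshold
        and psm.injection_time_ms < max_injection_ms
    )


psms = [
    PSM(1000.005, 1000.0, 1, 2.5, 120.0),   # about 5 ppm, kept
    PSM(1000.010, 1000.0, 1, 2.5, 120.0),   # about 10 ppm, rejected
    PSM(1000.001, 1000.0, 1, 2.5, 500.0),   # injection time at maximum, rejected
]
kept = [p for p in psms if passes_filter(p)]
print(len(kept), "of", len(psms), "PSMs pass")
```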

Recently, Taus et al. described phosphoRS, a novel
tool to improve
confident localization of phosphosites.27 The software is based on validated peptide identifications provided
by database search engines and calculates site probabilities for each
potential phosphosite in the peptide sequence. For this study we used
a modified version of phosphoRS that also enables assessment of individual
phosphosite probabilities for EThcD fragmentation. We analyzed each
data set using phosphoRS and found that it performs equally well for
all three fragmentation techniques. Of all true phosphosites, 96%
(ETD), 95% (HCD) and 97% (EThcD) were assigned a site probability >99%,
which corresponds to a very high confidence in site localization (Table 1). Together, these findings suggest that EThcD generates
MS/MS spectra that contain sufficient fragment ions for unambiguous
and sensitive phosphorylation site localization.

Next, we assessed the performance of EThcD for phosphosite
localization on a larger data set. We used Ti4+-IMAC material
for the enrichment of phosphopeptides from a tryptic digest of HeLa
cells and analyzed equal amounts (corresponding to enriched phosphopeptides
from 100 μg of protein) by LC–MS/MS with ETD, HCD and
EThcD, respectively (Supplementary Figure 1A, Supporting Information). All three methods generated a similar
number of MS/MS spectra. All spectra were searched with SEQUEST. The
ETD data was also searched with Mascot because we found SEQUEST to
perform poorly for doubly charged phosphopeptides. Note that other
search engines such as OMSSA or SpectrumMill might provide a larger
number of identifications for ETD data.35 However, these algorithms are currently not compatible with EThcD
data and phosphoRS analysis within the Proteome Discoverer software
environment. All identified PSMs were then filtered to <1% FDR
using Percolator to ensure consistency. In total, we identified 2217
(ETD), 4179 (HCD) and 3594 (EThcD) phospho-PSMs (Table 2). Our initial analysis of a defined synthetic phosphopeptide
mixture demonstrated that EThcD performs at least on the same level
as HCD in terms of peptide identification. However, the overall identification
success rate in the Ti4+-IMAC data set was slightly lower
for EThcD compared to HCD. This can be attributed to the rigid automatic
FDR filtering. The MS/MS spectra from the synthetic phosphopeptide
mixture were manually validated whereas the Ti4+-IMAC data
set was computationally filtered to <1% FDR. The application of
EThcD, in comparison to ETD or HCD alone, significantly increases
the number of fragment ions observed in the MS/MS scans. On the one
hand EThcD spectra contain more sequence information, which is beneficial
for inferring the peptide sequence and PTM localization. On the other
hand, these additional fragment ions may also match to random peptide
sequences, increasing their score and hampering the differentiation
between correct and incorrect matches. Consequently, the chance of
a high-scoring random match is elevated. The true hits, like the
decoy hits, are likely to score higher on average. Depending on
whether the distance between
the two score distributions increases or decreases, the identification
success rate will be higher or lower. Since the ID success rate is
slightly lower for EThcD than for HCD alone, the negative effect
of higher-scoring random matches might be more pronounced. Thus, higher
score cut-offs need to be applied in order to reach the desired FDR.
A standard target-decoy approach36 against
a reversed concatenated database revealed that the FDR for EThcD (2.6%)
was almost twice as high as that for HCD (1.4%), which provides
further evidence for this hypothesis.
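
The relationship between score cut-off and FDR in a target-decoy search can be made concrete with a short sketch. It uses one common estimator, the number of decoy hits divided by the number of target hits at or above a score threshold; the text does not state which estimator was applied here, so this choice, the function names and the toy scores are all assumptions.

```python
def fdr_at_threshold(psms, threshold):
    """Estimate FDR as decoys / targets among PSMs scoring >= threshold.

    psms is a list of (score, is_decoy) tuples (hypothetical input format).
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def threshold_for_fdr(psms, max_fdr=0.01):
    """Return the lowest observed score whose estimated FDR is <= max_fdr."""
    best = None
    for score in sorted({s for s, _ in psms}, reverse=True):
        if fdr_at_threshold(psms, score) <= max_fdr:
            best = score
    return best


# Toy data: three target hits and one decoy hit.
psms = [(5.0, False), (4.0, False), (3.0, True), (2.0, False)]
print(fdr_at_threshold(psms, 3.0))
print(threshold_for_fdr(psms, max_fdr=0.01))
```

If decoy hits shift toward higher scores, as argued above for fragment-rich spectra, the threshold returned for a fixed FDR rises and fewer target PSMs survive the cut.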

Next, we calculated the average peptide sequence coverage
for all PSMs. As expected, EThcD provided a substantial increase in sequence
coverage (92%) compared to HCD (81%) and ETD (83%). Obtaining near-complete
peptide sequence coverage tremendously simplifies phosphosite localization.
We used the extended phosphoRS algorithm to validate our assumption.
Remarkably, with EThcD 95% of all phosphosites were assigned a confident
site localization probability of >99%. In the HCD data set we found
that 89% of all phosphosites were assigned a confident site localization
probability of >99%, while this was only 81% for the ETD data set. We
recalculated these numbers for all peptides that contain >2 residues that can
be phosphorylated, because singly phosphorylated peptides with only one
potential phosphorylation site could bias the results toward HCD.
Of all phosphosites from this subset of peptides 97% (ETcaD), 93%
(EThcD) and 87% (HCD), respectively, were assigned a localization
probability >99%.

For multiply phosphorylated peptides site
localization becomes
more challenging. Figure 1 shows an MS/MS spectrum
of a doubly phosphorylated peptide upon EThcD fragmentation. The overall
sequence coverage is 89% taking b/y- and c/z-ions into account. Six
out of 18 amino acid bond cleavages are represented by c- and b-ions
(referred to as “golden pairs”37). Additionally, we observed 11 z/y-ion pairs, which strengthens
the argument that EThcD provides extensive sequence information that
facilitates pinpointing the correct phosphorylation site. More than
95% of the phosphosites from all doubly phosphorylated peptides were
assigned a site localization probability >99%, highlighting that
EThcD performs equally well with singly and doubly phosphorylated
peptides. A known limitation of ETD is its inability to cleave the
N–Cα bond N-terminal to proline.38,39 This can hamper phosphosite localization for proline-rich peptides.
Generation of dual ion series in EThcD can overcome this issue. Figure 2 shows the EThcD spectrum of a singly phosphorylated
peptide that contains four serine residues. The c- and z-ions derived
from the ETD step cover only the N-terminal part of the peptide and
the site probability is only 50%. The additional y-ions derived from
the subsequent HCD activation provide supporting sequence information
and also cover the two serine residues next to the prolines, which
enables unambiguous phosphosite localization.
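
Sequence coverage and ion pairs of the kind discussed for Figures 1 and 2 can be computed from a list of annotated fragment ions. The sketch below is an assumption-laden illustration, not the procedure used for the reported values: it treats b_i and c_i as covering the inter-residue position i, y_j and z_j as covering position n − j, and defines coverage as the fraction of the n − 1 positions with at least one ion. All function names are hypothetical.

```python
def backbone_positions(n_residues, ions):
    """Map each inter-residue position (1..n-1) to the ion types observed there.

    ions is an iterable of (ion_type, index) tuples, e.g. ("b", 3) or ("z", 5).
    """
    positions = {p: set() for p in range(1, n_residues)}
    for ion_type, index in ions:
        if ion_type in ("b", "c"):
            pos = index                 # N-terminal fragment of length index
        elif ion_type in ("y", "z"):
            pos = n_residues - index    # C-terminal fragment of length index
        else:
            raise ValueError(f"unsupported ion type: {ion_type}")
        if pos in positions:
            positions[pos].add(ion_type)
    return positions


def sequence_coverage(n_residues, ions):
    """Fraction of inter-residue positions covered by at least one ion."""
    positions = backbone_positions(n_residues, ions)
    covered = sum(1 for types in positions.values() if types)
    return covered / (n_residues - 1)


def paired_positions(n_residues, ions, pair=("b", "c")):
    """Positions where both ion types of a pair were observed."""
    positions = backbone_positions(n_residues, ions)
    return [p for p, types in positions.items() if set(pair) <= types]


# Toy annotation for a hypothetical 5-residue peptide.
ions = [("b", 1), ("c", 1), ("y", 1), ("z", 1), ("y", 2)]
print(sequence_coverage(5, ions))
print(paired_positions(5, ions, pair=("b", "c")))
print(paired_positions(5, ions, pair=("y", "z")))
```

Under this definition, a peptide with 18 inter-residue positions of which 16 are covered has a coverage of 16/18, or about 89%.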

Conclusions

Here we have evaluated the potential of EThcD in improving the
analysis of phosphopeptides. Our data highlights the benefit of dual
ion series as generated by EThcD fragmentation. For a
defined phosphopeptide mixture, we observed higher average SEQUEST Xcorr values,
higher peptide sequence coverage and more confident phosphosite localization
with EThcD compared to ETD and HCD. This finding was confirmed when
we analyzed a complex phosphopeptide sample resulting from a Ti4+-IMAC enrichment of peptides from a cellular lysate. This
is in line with recent reports that showed that confidence in phosphorylation
site localization increases when multiple separately acquired MS/MS
spectra (e.g., ETD/CID or MSA/ETD) are combined for scoring.25,26 For this larger data set, we observed that the identification success
rate was slightly lower for EThcD compared to HCD. This can be attributed
to the use of conventional database search engines that are not optimized
for spectra that contain dual ion series.40 However, the fact that both peptide sequence coverage and the percentage
of localized phosphosites are higher for EThcD than for HCD suggests
that once a peptide was identified, further analyses such as site
localization benefit from the more data-rich EThcD spectra. In EThcD,
c/b- and z/y-ion pairs are often observed, which increases the confidence
in a particular peptide backbone cleavage.41 We speculate that the identification success rate of EThcD for phosphopeptides
can be improved by novel or optimized data analysis tools. Finally,
we reason that EThcD can also be used to improve the
localization of other post-translational modifications such as ubiquitination,
glycosylation or acetylation.

Supporting Information Available

Additional information as noted
in the text. This material is available free of charge via the Internet
at http://pubs.acs.org.

Notes

The authors declare no competing financial interest.

Acknowledgments

We thank Mathias Madalinski for peptide synthesis of the primeXS
phosphopeptide mixture. This research was performed within the framework
of the PRIME-XS project, grant number 262067, funded by the European
Union 7th Framework Program. Additionally, The Netherlands
Proteomics Centre, a program embedded in The Netherlands Genomics
Initiative, is kindly acknowledged for financial support as well as
The Netherlands Organization for Scientific Research (NWO) with the
VIDI grant (700.10.429). Work in the Mechtler lab was supported by
the European Commission via the FP7 projects MeioSys and PRIME-XS
and by the Austrian Science Fund via the Special Research Program Chromosome
Dynamics (SFB-F3402).

EThcD spectrum of a proline-containing phosphopeptide.
This is the EThcD spectrum of a doubly charged peptide that contains four serine residues,
one of which is phosphorylated. ETD does not cleave the N–Cα bond N-terminal to proline, and the phosphorylation
site probability is only 50% based on c- and z-ions alone. Dual fragmentation
by EThcD generates complementary sequence information from c/z- and
b/y-ions (SEQUEST Xcorr 4.10). Here, the exact phosphosite is revealed
by y-ions that cover the corresponding phosphosite (phosphoRS site
probabilities: S(1): 0.0; S(3): 0.0; S(8): 99.5; S(10): 0.5).