Abstract

Protein tyrosine phosphorylation, which plays a vital role in a variety of human cellular processes, is coordinated by protein tyrosine kinases and protein tyrosine phosphatases (PTPs). Genomic studies provide compelling evidence that PTPs are frequently mutated in various human cancers, suggesting that they have important roles in tumor suppression. However, the cellular functions and regulatory machineries of most PTPs are still largely unknown. To gain a comprehensive understanding of the protein-protein interaction network of the human PTP family, we performed a global proteomic study. Using a Minkowski distance-based unified scoring environment (MUSE) for the data analysis, we identified 940 high confidence candidate-interacting proteins that comprise the interaction landscape of the human PTP family. Through a gene ontology analysis and functional validations, we connected the PTP family with several key signaling pathways or cellular functions whose associations were previously unclear, such as the RAS-RAF-MEK pathway, the Hippo-YAP pathway, and cytokinesis. Our study provides the first glimpse of a protein interaction network for the human PTP family, linking it to a number of crucial signaling events, and generating a useful resource for future studies of PTPs.

In response to extracellular stimuli, multiple signaling networks coordinate to determine physiological outcomes, although the complexities of the signal transductions mediated by various protein families have not been well elucidated. The well-known phosphorylation cascade that is controlled by kinases ensures a precise response to stimuli. These cascade events and the finely tuned regulation of kinases have been recognized as the core of signaling pathways. Specifically, phosphorylation of tyrosine residues in proteins has been intimately associated with the etiology of many human diseases, such as cancer (1–3).

Protein tyrosine phosphorylation is regulated in a coordinated manner by two enzyme families: protein tyrosine kinases and protein tyrosine phosphatases (PTPs), which control a broad spectrum of fundamental signaling pathways and physiological processes (1). Understanding how the elegant balance between these two enzyme families is altered in human diseases will benefit the development of therapeutic strategies to improve treatment outcomes for patients (2, 3).

The PTP family, which is characterized by the presence of the signature motif HC(X)5R in the protein sequence, can be further divided into two subfamilies: the classic phosphotyrosine specific PTPs and the dual-specificity phosphatases (DSPs) (1, 2, 4, 5). The classic PTPs are further categorized as transmembrane receptor-like PTPs (rPTP) or non-rPTPs (nrPTP). rPTPs have been implicated in the regulation of ligand-dependent protein tyrosine dephosphorylation on the cell membrane, whereas nrPTPs mainly govern cytoplasmic protein tyrosine dephosphorylation. The DSP subfamily shows relatively less conservation and is structurally more diverse than the classic PTPs (6). The catalytically active domains of DSPs can access not only phosphotyrosine residues but also phosphoserine or phosphothreonine residues in their substrate proteins. The DSP family can be further categorized into several small groups: atypical phosphatases, mitogen-activated protein (MAP) kinase phosphatase (MKP), myotubularin (MTM) phosphatases, phosphatase and tensin homologs deleted on chromosome 10 (PTEN), cell division cycle 14 phosphatases (CDC14s), slingshots, and phosphatases of regenerating liver (PRLs) (6). Because atypical DSPs and MKP DSPs share similar functions (6), we grouped them together as a single DSP subfamily (atypical-MKP DSPs) for the analysis. Moreover, MTM-DSPs and their related DSPs (MTMR-DSPs) possess the conserved PTP domains but lack the residues that are critical for catalysis. However, not all components of this family are categorized as pseudophosphatases, because some of them have lipid phosphatase activity (7). Because the classic PTPs (rPTP and nrPTP), the atypical-MKP DSPs, and the MTM/MTMR-DSPs are the three largest subfamilies of human PTPs, we focused mostly on these three subfamilies for the analysis presented here.

Although the comprehensive characterization of PTPs has lagged behind that of protein tyrosine kinases, several recent studies have shed light on the central roles of PTPs in controlling cellular signaling events and cancer development. Moreover, because the prevalent hypothesis is that active protein tyrosine kinases are oncogenes, it is anticipated that many PTPs serve as tumor suppressors (1). Indeed, genomic deletions or mutations identified in some PTPs, such as PTEN (8), PTPRF (9), and PTPN12 (10), have been shown to contribute to breast cancer development. Recent studies have also highlighted the roles of PTPN12 (11) and PTPN23 (12) as key tumor suppressor genes in triple-negative breast cancer development and metastasis. In addition, PTPN13 was identified as a suppressor of HER2-positive breast cancers through antagonism of HER2 activation (13). Loss of PTPRO (14) and PTPN13 (15) has been observed in hepatocellular carcinoma. DUSP6, a DSP that targets the RAS-MAPK pathway, is lost in pancreatic cancers (16). PTPRK (17), PTPN7 (18), and PTPN13 (19) are either mutated or down-regulated in lymphoma, and depletion of DUSP1 (20) has been detected in ovarian cancers. Intriguingly, some PTPs have the capacity to regulate cellular signaling events independently of their PTP activities. For example, PTPN14 can suppress YAP oncogenic functions by retaining YAP in the cytoplasm through a physical protein-protein interaction (21). PTPRM maintains tissue homeostasis through its physical interaction with E-cadherin (22) and protein kinase C (23). The results of these studies suggest that different PTPs use distinct mechanisms to control cellular signaling events. Although it is becoming apparent that PTPs play crucial roles in various cellular events and cancer development, the functions and regulation of many PTPs are still poorly understood.

The limited characterization of PTPs and their relevance to biological and pathological events prompted us to perform a global proteomic analysis of the human PTP family to uncover PTP-associated protein complexes and help us elucidate not only the cellular functions but also the regulatory mechanisms of this critical enzyme family. We isolated the protein complexes of 68 PTPs encoded by the human genome through tandem affinity purification followed by mass spectrometry analysis (TAP-MS). To identify high-confidence candidate interacting proteins (HCIPs), we used the unbiased Minkowski distance-based unified probabilistic scoring environment (MUSE) to eliminate nonspecific interactions, with 1,626 parallel TAP-MS experiments serving as the control group. To address the subcellular roles of PTPs, we used gene ontology (GO) analysis and the systematic topology clustering method to enrich for key signaling pathways and functions within three major PTP subfamilies (classic PTPs, atypical-MKP DSPs, and MTM/MTMR DSPs). This led to the discovery of multi-level crosstalk between the atypical-MKP DSP family and the RAS-MAPK pathway, key modules of the classic PTP subfamily in regulating the Hippo signaling pathway, and a previously uncharacterized role of MTM/MTMR subfamily members in cell cycle regulation. Together, the findings of this global proteomic study not only helped us define the protein-protein interaction landscape of this critical enzyme family but also propose functional annotations for some previously uncharacterized PTPs in the context of cellular signal transduction and regulation.

Constructs and Viruses

The plasmids encoding PTP family components were obtained from the Human ORFeome V5.1 library or purchased from Harvard Plasmids Resource (Boston, MA) and Open Biosystems (Lafayette, CO). All constructs were generated by polymerase chain reaction (PCR) and subcloned into a pDONOR201 vector using Gateway Technology (Invitrogen, Waltham, MA) as the entry clones. For the TAP analysis, all entry clones were subsequently recombined into a lentiviral Gateway-compatible destination vector for the expression of C-terminal SFB-tagged fusion proteins. Gateway-compatible destination vectors with the indicated SFB tag, HA tag, GFP tag, or glutathione S-transferase (GST) tag were used to express various fusion proteins for the PTPN21/WWC1/YAP and MTMR4/CEP55 studies. PCR-mediated site-directed mutagenesis was used to generate internal deletions or single-site mutations. Deletion mutants of PTPN21 lacking the FERM domain (residues 1∼325), the linker region (residues 326∼890), or the PTP domain (residues 891∼1174) were generated for this study. Two WW domains (WW1: residues 12∼34; WW2: residues 59∼88) located at the N terminus of WWC1 were deleted as single-deletion mutants (WWC1-dWW1 and dWW2) or as a double-deletion mutant (WWC1-dWW1 + 2). The two WW domains of YAP (WW1: residues 130∼152; WW2: residues 236∼258) were singly deleted as YAP-dWW1 and YAP-dWW2 mutants.

pGIPZ lentiviral shRNAs targeting PTPN14, PTPN21, or MTMR4 were obtained from the shRNA and ORFeome core facility at the University of Texas MD Anderson Cancer Center (Houston, TX). The shRNA sequences were as follows:

PTPN14 shRNA (V2LHS_248245): 5′-CAGAGGAACCCAAAGAATA-3′;

PTPN21 shRNA (V3LHS_637583): 5′-CCCACCGCAGTTGCACTAT-3′;

MTMR4 shRNA (V2LHS_41104): 5′-GGCCGCTCTCTGGACAGAT-3′;

DUSP1 shRNA-1# (V2LHS_160994): 5′-CTGATTATTTATGACCTGA-3′;

DUSP1 shRNA-2# (V2LHS_160991): 5′-GGCTGGTCCTTATTTATTT-3′;

SNX17 shRNA (V2LHS_72595): 5′-CCTTCGGAGTCAAGAGTAT-3′;

DUSP2 shRNA (V2LHS_111457): 5′-GAAACTTAGCACTTTATAT-3′;

Control shRNA: 5′-TCTCGCTTGGGCGAGAGTAAG-3′.

All lentiviral supernatants were generated by transiently transfecting 293T cells with the helper plasmids pSPAX2 and pMD2G (kindly provided by Dr. Zhou Songyang, Baylor College of Medicine, Houston, TX) and harvested 48 h later. Supernatants were passed through a 0.45-μm filter and used to infect MCF10A cells with the addition of 8 μg/ml polybrene.

To synchronize HeLa cells at cytokinesis, cells were treated with 100 ng/ml nocodazole (Sigma) for 16 h and washed twice with phosphate-buffered saline solution (PBS). Cells were then released in normal growth medium for another 1.5 h.

Tandem Affinity Purification of SFB-tagged Protein Complexes

HEK293T cells stably expressing SFB-fused PTP proteins were selected by being cultured in medium containing 2 μg/ml puromycin and confirmed by immunostaining and Western blotting. For affinity purification, 293T cells were subjected to lysis in NETN buffer (with protease inhibitors) at 4 °C for 20 min. Crude lysates were subjected to centrifugation at 4 °C and 14,000 rpm for 15 min. Supernatants were incubated with streptavidin-conjugated beads (Amersham Biosciences, Pittsburgh, PA) for 1 h at 4 °C. The beads were washed three times with NETN buffer, and bound proteins were eluted with NETN buffer containing 2 mg/ml biotin (Sigma) for 90 min at 4 °C. The eluates were incubated with S protein beads (Novagen, Billerica, MA) for 1 h. The beads were washed three times with NETN buffer and subjected to sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). Each pull-down sample was run into the separation gel; we excised the whole band as one sample and subjected it to in-gel trypsin digestion and MS analysis (performed by Taplin Mass Spectrometry Facility, Harvard Medical School, Boston, MA). Mass spectrometry data are available via ProteomeXchange with identifier PXD002462.

Mass Spectrometry Analysis

Excised gel bands were cut into ∼1 mm³ pieces. The gel pieces were then subjected to in-gel trypsin digestion (24) and dried. Samples were reconstituted in 5 μl of HPLC solvent A (2.5% acetonitrile, 0.1% formic acid). A nano-scale reverse-phase HPLC capillary column was created by packing 5 μm C18 spherical silica beads into a fused silica capillary (100 μm inner diameter × ∼20 cm length) with a flame-drawn tip. After the column had been equilibrated, each sample was loaded via a Famos autosampler (LC Packings, San Francisco, CA). A gradient was formed and peptides were eluted with increasing concentrations of solvent B (97.5% acetonitrile, 0.1% formic acid).

As peptides eluted, they were subjected to electrospray ionization and entered into an LTQ Velos ion-trap mass spectrometer (ThermoFisher, San Jose, CA). Peptides were detected, isolated, and fragmented to produce a tandem mass spectrum of specific fragment ions for each peptide. Peptide sequences (and hence protein identities) were determined by matching the acquired fragmentation patterns against protein databases using the software program SEQUEST (ver. 28) (ThermoFisher, San Jose, CA). Enzyme specificity was set to partially tryptic with two missed cleavages. Modifications included carboxyamidomethyl (cysteines, fixed) and oxidation (methionine, variable). Mass tolerance was set to 2.0 for precursor ions and 1.0 for fragment ions. The Human IPI databases, version 3.6, were searched because we used HEK293T cells. The number of entries in the database was 160,900, which included both the target (forward) and the decoy (reversed) human sequences. Spectral matches were filtered to a false discovery rate (FDR) of less than 1% at the peptide level on the basis of the target-decoy method (25). To generate relatively stringent results, we also analyzed our MS data with a more stringent cutoff (0.5 Da mass tolerance for both the precursor and fragment ions) by searching the Swiss-Prot human data set (released 2015–11, containing 20,193 reviewed human protein sequences) to generate HCIPs; a 1% protein FDR filter was applied to all the data. The protein inference was evaluated following the general rules reviewed in (26), with manual annotation based on experience when necessary. The same principle was used for isoforms when they were present in the database: the longest isoform was reported as the match.
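The target-decoy filtering step described above can be sketched as follows. This is a minimal illustration, not the search pipeline actually used: the function name `filter_by_fdr`, the input layout (a score plus a decoy flag per peptide-spectrum match), and the FDR estimate (decoy hits divided by target hits above the cutoff) are all assumptions made for the example.

```python
def filter_by_fdr(psms, max_fdr=0.01):
    """Keep target matches at the loosest score cutoff whose estimated
    FDR (decoys / targets at or above the cutoff) stays <= max_fdr.

    psms: list of (score, is_decoy) tuples; a higher score is a better match.
    (Hypothetical input format, chosen for illustration.)
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    keep = 0  # number of top-ranked matches to retain
    for i, (_score, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest rank at which the estimate is still acceptable.
        if targets > 0 and decoys / targets <= max_fdr:
            keep = i
    # Report only the target (forward) matches.
    return [p for p in ranked[:keep] if not p[1]]
```

In this sketch, a single high-scoring decoy does not end the scan; the cutoff is placed at the deepest rank where the running estimate is still within the limit.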

Experimental Design and Statistical Rationale

For the evaluation of potential protein-protein interactions, spectral counts from the MS analysis were assessed in a large matrix, supplemented with control TAP-MS results obtained using the same protocol. In total, 37,063 binary interactions were identified in 86 experiments, comprising 68 PTP family purifications and 18 duplicate purifications of all atypical-MKP DSP subfamily members. Because MUSE is a data-driven algorithm, its performance improves with the quantity of data in the reference database. As the control group, we used 1,626 TAP-MS results generated with the same protocol: 1,606 used random proteins as baits, and the other 20 used empty vectors as baits. The total number of TAP-MS experiments was 1,712. To determine the data reproducibility for our TAP study, we chose 18 atypical-MKP DSP family members to perform biological replicates (duplication) of these TAP-MS analyses.

To assign probabilistic scores to individual protein-protein interactions and eliminate nonspecific interactions, we applied a statistical model called the Minkowski distance-based unified probabilistic scoring environment (MUSE) (please also see the Supplemental Methods). For the 18 atypical-MKP DSP family members analyzed in biological duplicate, the total protein number, total spectra counts, unique prey numbers, and total precursor intensity were compared between the two replicates and plotted. The correlation R values were 0.95, 0.92, 0.96, and 0.84, respectively (Supplemental Methods). The ratios of reproducible data were estimated by dividing the number of reproducible interactions by the total number of interactions. We compared our data reproducibility with that of several recently published interactome studies and found that it was reliable (Supplemental Methods). To choose the HCIP cutoff, we plotted the receiver operating characteristic (ROC) curve of MUSE performance in the PTP family studies. We considered preys recovered in the 20 vector-only control experiments to be “true negative” and interactions reported in the BioGRID (http://thebiogrid.org/) (27) low-throughput data set to be “true positive”. True-positive and false-positive rates were computed using various MUSE score cutoffs. The area under the curve (AUC) was 0.9708, and PPIs with a MUSE score ≥ 0.9 generated 0.8% total false-positive results and covered 79.3% of the total interactions reported in the BioGRID low-throughput data set (Supplemental Methods). Interactions with a MUSE score ≥ 0.9 were kept for the follow-up analysis, which indicates that their specificities were among the top 10% of all 1,712 TAP-MS experiments. In total, 1,799 interactions passed this filtration and were designated as HCIPs.
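The ROC evaluation described above can be illustrated with a short sketch. It assumes two lists of MUSE-like scores, one for the “true positive” reference interactions and one for the “true negative” vector-only preys; the function names and the trapezoidal AUC are choices made for this example and are not taken from the Supplemental Methods.

```python
def roc_points(pos_scores, neg_scores, cutoffs):
    """Return one (false_positive_rate, true_positive_rate) pair per cutoff,
    calling an interaction positive when its score is >= the cutoff."""
    points = []
    for c in cutoffs:
        tpr = sum(s >= c for s in pos_scores) / len(pos_scores)
        fpr = sum(s >= c for s in neg_scores) / len(neg_scores)
        points.append((fpr, tpr))
    return points


def auc(points):
    """Trapezoidal area under the ROC curve, anchored at (0, 0) and (1, 1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area
```

Reading the curve at a single cutoff (for example 0.9) gives the false-positive rate and the fraction of reference interactions recovered at that threshold, which is how a cutoff of this kind can be justified.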

To evaluate the specificity of our results, we compared our HCIPs with the preys recovered from the 20 vector-only experiments and considered the overlapping interactions to be “false positives”. Thirty-six of 1,799 interactions overlapped, giving an overall false-positive rate of ∼2%. To evaluate the sensitivity of our PTP interaction data, we compared our HCIPs with the experimentally confirmed interactions in the BioGRID low-throughput data set. One hundred thirty-one of 455 (28.8%) previously reported interactions were covered by our data set, which is relatively high compared with most medium-to-large data sets.
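As a quick check of the arithmetic behind these two figures, the counts quoted above can be converted to percentages directly. The helper name `rate_percent` is introduced here only for illustration.

```python
def rate_percent(hits, total):
    """Express hits / total as a percentage."""
    return 100.0 * hits / total


# HCIPs that were also recovered in the vector-only controls.
false_positive_pct = rate_percent(36, 1799)
# Previously reported low-throughput interactions covered by the HCIP list.
coverage_pct = rate_percent(131, 455)
```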

We downloaded protein sequences and annotations from the UniProt Consortium (28). Whole PTP family and individual subfamily interactomes were generated using Cytoscape (29). Pathway annotations and disease correlations were generated using HCIPs identified in our studies, weighted by spectra counts, and searched in the Knowledge Base provided by Ingenuity pathway software, which contains findings and annotations from multiple sources, including the GO database, to estimate the significance of these correlations. We used the -log(p value) of individual functions to create bar graphs and disease annotation networks. The heatmap for hierarchical clustering was generated using Multi Experiment Viewer software, version 4.9.0. We used a two-color scheme and set the color lower limit to 1, the midpoint value to 10.0, and the upper limit to 20.0.

Each experiment was biologically repeated twice or more, unless otherwise noted. No samples were excluded from analysis. Differences between groups were analyzed using Student t test and Pearson chi-square analysis. A p value <0.05 was considered statistically significant. When calculating the p values for GO annotations, a Fisher's exact test was used.
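For readers unfamiliar with how a Fisher's exact test yields an enrichment p value for a GO term, the following sketch computes the one-sided (over-representation) form from a 2 × 2 contingency table. Whether the analysis software used a one-sided or two-sided test is not stated in the text, so the one-sided form is an assumption, and the function and argument names are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test for over-representation.

    k: HCIPs annotated with the term
    n: total number of HCIPs
    K: background proteins annotated with the term
    N: size of the background
    Returns P(X >= k) under the hypergeometric distribution.
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```

The negative logarithm of such a p value is the kind of quantity plotted in the bar graphs mentioned above.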

RNA Extraction, Reverse-transcription, and Real-time PCR

RNA samples were extracted with TRIZOL reagent (Invitrogen). The reverse-transcription assay was performed using the iScript™ Reverse Transcription Supermix Kit (Bio-Rad, Hercules, CA) according to the manufacturer's instructions. Real-time PCR was performed using Power SYBR Green PCR master mix (Applied Biosystems, Waltham, MA). To quantify gene expression, we used the 2^(−ΔΔCt) method. GAPDH expression was used for normalization. The sequence information for each primer used in the gene expression analysis was as follows:

GAPDH-Forward: 5′-ATGGGGAAGGTGAAGGTCG-3′;

GAPDH-Reverse: 5′-GGGGTCATTGATGGCAACAATA-3′;

CTGF-Forward: 5′-CCAATGACAACGCCTCCTG-3′;

CTGF-Reverse: 5′-GAGCTTTCTGGCTGCACCA-3′;

CYR61-Forward: 5′-AGCCTCGCATCCTATACAACC-3′;

CYR61-Reverse: 5′-GAGTGCCGCCTTGTGAAAGAA-3′;

PTPN14-Forward: 5′-ACTGGAATCAATTCCGTGGG-3′;

PTPN14-Reverse: 5′-GTGTGGCCCTCTGGAAGAT-3′;

PTPN21-Forward: 5′-CGAGTTTGTGGAGTTCACCC-3′;

PTPN21-Reverse: 5′-GCGCTGATTTTGCTTGTTGT-3′;

DUSP1-Forward: 5′-AGGACAACCACAAGGCAGAC-3′;

DUSP1-Reverse: 5′-CAGTGGACAAACACCCTTCC-3′;

DUSP2-Forward: 5′-AACAGGGGACAAAACCAGC-3′;

DUSP2-Reverse: 5′-CAGGTCTGACGAGTGACTGC-3′;

SNX17-Forward: 5′-GTGAATGGAGTCCTGCACTG-3′;

SNX17-Reverse: 5′-GAGAAAAGCTTCTTTGGGGG-3′.
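The relative quantification used for the real-time PCR data, with GAPDH as the reference gene, can be written out as a short sketch. The function name and the Ct values in the usage note are hypothetical and serve only to show the calculation.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method.

    dCt  = Ct(target gene) - Ct(reference gene), computed per condition
    ddCt = dCt(sample) - dCt(control)
    """
    d_sample = ct_target_sample - ct_ref_sample
    d_control = ct_target_control - ct_ref_control
    return 2 ** -(d_sample - d_control)
```

For example, with made-up Ct values of 22 (target) and 18 (GAPDH) in the treated sample and 24 and 18 in the control, ΔΔCt is −2 and the target is expressed at four times the control level.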

GST Pulldown Assay

GST-fused YAP was expressed and purified in Escherichia coli BL21 cells. GST-YAP protein (2 μg) was immobilized on GST-Sepharose 4B beads and incubated with various cell lysates for 2 h at 4 °C. Beads were washed three times. Proteins bound to beads were eluted and subjected to SDS-PAGE and Western blotting analysis.

Immunofluorescence Staining

Immunofluorescence staining was performed as described previously (30). In brief, cells cultured on coverslips were fixed with 4% paraformaldehyde for 10 min at room temperature and then extracted with 0.5% Triton X-100 solution for 5 min. After being blocked with TBST containing 1% bovine serum albumin, cells were incubated with the indicated primary antibodies for 1 h at room temperature. Cells were then washed and incubated with fluorescein isothiocyanate- or rhodamine-conjugated secondary antibodies for 1 h. Cells were counterstained with 100 ng/ml 4′,6-diamidino-2-phenylindole (DAPI) for 2 min to visualize nuclear DNA. The coverslips were mounted onto glass slides with anti-fade solution and visualized under a Nikon ECLIPSE E800 fluorescence microscope with a Nikon Plan Fluor 60× oil objective lens (NA 1.30).

RESULTS

Proteomic Analysis of the Human PTP Family

To achieve a comprehensive understanding of the protein-protein interaction landscape of the human PTP family, we established HEK293T cells that stably expressed each of the 68 members of the human PTP family fused with SFB triple tags (S protein tag-Flag tag-SBP tag) (Fig. 1A). After being validated by Western blotting and immunofluorescence staining, the stable PTP-293T cells were subjected to TAP, as described previously (32), and the associated proteins in the isolated complexes were identified by MS analysis and searched in the human IPI database (Fig. 1A). A complete list of the peptides and proteins identified is provided in supplemental Table S1. Raw data were also searched in the Swiss-Prot protein database with protein FDR < 0.1%, and a protein identification list is provided in supplemental Table S2. We compared the two lists, and 87.6% of the HCIPs generated by the two methods overlapped. The results are presented in supplemental Table S1: “Compare with Swiss-Prot search.”

Proteomic analysis of the human protein tyrosine phosphatase family. A, Schematic illustration of the major steps of the tandem affinity purification-mass spectrometry (TAP-MS) analysis of the human protein tyrosine phosphatase (PTP) family. Sixty-eight PTP proteins were cloned into a C-terminal SFB-tag fused lentiviral vector using Gateway technology. HEK293T cells stably expressing each bait protein were generated by lentiviral infection and puromycin selection. Through standard TAP steps, purified protein complexes were identified by MS analysis, and the final interacting proteins were generated by the MUSE statistical model. Three major PTP subfamilies are indicated: atypical-MAPK phosphatase (A/MKP) dual-specificity phosphatases (DSPs), classic PTPs (transmembrane receptor-like phosphatases [rPTPs] and nontransmembrane receptor phosphatases [nrPTPs]), and myotubularin/myotubularin-related (MTM/MTMR) DSPs. B, The workflow of the MUSE algorithm for the TAP-MS data analysis. The diagram depicts the major steps of the MUSE algorithm in the AP/MS data analysis. C–D, Visualization of the example 3-D spaces of TAP-MS and the method of estimating the Minkowski power parameter P. An example data set consisting of three independent experiments can be described by a three-dimensional space. In this space, the existence of one prey can be described as one point (C). Exp, experiment. The coefficient of variation (CV) of the human PTP data set was evaluated by randomly drawing and assigning raw spectra counts to random bait-prey combinations, choosing the P that caused minimal system disturbance. The MUSE algorithm simulates P from 0 to 1 at 0.01 intervals; it then calculates the CV for each prey and combines them to generate the CV for the whole data set (D).

To achieve the best use of our PTP proteomics results, we developed a Minkowski-distance-based unified scoring environment (MUSE) that allows us to assign quality-associated scores for protein-protein interactions in a wide range (Fig. 1B and Supplemental Methods). Large variances of total identifications of label-free quantifications, based on either spectral counting or MS1/MS2 peak areas and intensities, occur between different AP/MS experiments. In the MUSE algorithm, the raw spectra counts are first normalized to unified distances to effectively compare the data from different experiments and different preys to achieve more accurate and biologically relevant results. These normalization steps include normalizing the individual identifications with total identifications to generate quantitative results from different comparable experiments, and normalizing the individual identifications with prey length to effectively compare different preys (Fig. 1B and Supplemental Methods).
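The two normalization steps named above (by total identifications in an experiment and by prey length) can be sketched as follows. The exact formula used by MUSE is given in the Supplemental Methods and is not reproduced here; this example simply divides each count by the experiment total and then by the prey length, and the names and data layout are assumptions.

```python
def normalize_counts(experiment, prey_lengths):
    """Normalize raw spectral counts for one experiment.

    experiment:   {prey: raw spectral count}
    prey_lengths: {prey: protein length in residues}
    Returns {prey: (count / total count in the experiment) / prey length}.
    """
    total = sum(experiment.values())
    return {
        prey: (count / total) / prey_lengths[prey]
        for prey, count in experiment.items()
    }
```

Under this scheme, a long protein with many spectra and a short protein with proportionally fewer spectra receive the same normalized value, which is the purpose of the length correction.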

With proper normalizations, every binary interaction identified in a single experiment can be converted to a 1-D distance, and every single experiment will be considered as one dimension. The existence of a given prey in all the experiments can be described as a vector in this m-dimensional space. For example, a data set of three TAP-MS experiments can be converted to a 3-D space, in which a prey can be described as a vector (Fig. 1C and Supplemental Methods). Its projection on one axis represents its existence in this experiment, and its included angle with this axis reflects its specificity (i.e. θ1 represents the specificity of prey in TAP/MS Experiment 1; θ2 represents the specificity of the same prey in TAP/MS Experiment 2; and θ3 represents the specificity of the same prey in TAP/MS Experiment 3). Small included angles indicate higher specificity, whereas large angles indicate lower specificity. For example, in a data set containing three experiments (MTMR4, PTPN21, and PTPN14), the HSPA8 vector formed large angles with all three axes, indicating that its existence in all three experiments was not specific. The CEP55 vector lay on the MTMR4 axis and formed right angles with the PTPN14 and PTPN21 axes, indicating that its interaction with MTMR4 is specific, whereas it does not interact with PTPN14 or PTPN21. The WWC1 vector formed small angles with the PTPN14 and PTPN21 axes but was perpendicular to the MTMR4 axis, indicating that it is specific in these two experiments; however, this finding requires further justification.
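The geometric picture above can be made concrete with a small sketch that computes the angle between a prey vector and each experiment axis. For simplicity it uses the ordinary Euclidean norm, which is an assumption of this example rather than the MUSE definition, and the vectors in the usage note are invented to mimic the three cases described, not taken from the data set.

```python
import math


def axis_angles_deg(prey_vector):
    """Angle (in degrees) between a prey vector and each experiment axis.

    prey_vector: normalized abundance of one prey in each experiment.
    A small angle with an axis suggests specificity for that experiment.
    """
    norm = math.sqrt(sum(x * x for x in prey_vector))
    if norm == 0:
        raise ValueError("prey was not detected in any experiment")
    angles = []
    for x in prey_vector:
        cosine = max(-1.0, min(1.0, x / norm))  # guard against rounding
        angles.append(math.degrees(math.acos(cosine)))
    return angles
```

A prey seen in only the first of three experiments, such as `[5.0, 0.0, 0.0]`, lies on that axis and is perpendicular to the other two, whereas a prey seen equally in all three, such as `[1.0, 1.0, 1.0]`, forms the same large angle (about 54.7°) with every axis.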

With an increasing number of experiments, the angles of specific interactions quickly decreased and were easily distinguished from the nonspecific background. We then ranked the interactions by the angles between prey vectors and experiment axes, which could be converted by dividing the experimental length by the estimated length (Fig. 1B and Supplemental Methods). However, this space is non-Euclidean because all proteins exist in a network and associate with other proteins. When each experiment is well controlled and normalized (i.e. performed in the same cell line with the same protocol), the distance between two points can be simplified to the Minkowski distance. One Minkowski power parameter, P, can be used to describe the curvature of the whole space. The value of P describes the diversity and specificity of the experiments. Choosing P provides a balance between the protein abundance, identified in individual experiments, and the protein appearance, identified in all experiments. To estimate the best Minkowski power parameter P, we calculated the coefficient of variation of the whole data set by randomly drawing and assigning raw spectra counts to random bait-prey combinations; we chose the P that caused minimal system disturbance. This P is the best choice to describe the whole system because it generates the best system stability. In this particular experiment, we found that the best value of P was around 0.2 (Fig. 1D).
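To show how the Minkowski power parameter changes the balance between abundance in one experiment and appearance across experiments, the sketch below computes a Minkowski length and the ratio of one coordinate to that length. The ratio is only a stand-in for the "experimental length divided by estimated length" mentioned above; the actual MUSE score is defined in the Supplemental Methods, and both function names are hypothetical.

```python
def minkowski_length(vector, p):
    """Minkowski length (sum of |x|^p)^(1/p) of a prey vector.
    For p < 1 this is not a true norm, but it is still well defined."""
    if p <= 0:
        raise ValueError("p must be positive")
    return sum(abs(x) ** p for x in vector) ** (1.0 / p)


def axis_ratio(vector, index, p):
    """Share of the Minkowski length carried by one experiment axis."""
    length = minkowski_length(vector, p)
    return abs(vector[index]) / length if length else 0.0
```

With P = 0.2, a prey present in a single experiment keeps a ratio of 1, while a prey present equally in three experiments drops to 1/243, compared with about 0.58 under the Euclidean case P = 2. A small P therefore penalizes widespread preys much more strongly.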

To filter out nonspecific interacting proteins, 1,626 unrelated TAP-MS experiments were used as the control group and subjected to the MUSE statistical model, which is described in detail in the Supplemental Methods. Each identified prey was assigned a MUSE score, and any interaction with a MUSE score ≥0.9 and raw spectra counts >1 was considered an HCIP. To determine the data reproducibility for our TAP study, we chose atypical-MKP DSP family members (n = 18) to perform biological replicates of these TAP-MS analyses. The overall correlation R value and HCIP correlation R value were estimated to be 0.64 and 0.72, respectively, suggesting that this proteomic study had high reproducibility (Fig. 2A). We identified 940 HCIPs from the total 4,213 identified unique preys of the human PTP family (Fig. 2B and 2C, supplemental Table S3). The GO analysis of these HCIPs indicated that they were widely distributed in the cell with different subcellular localizations (Fig. 2D), and involved in multiple cellular functions (Fig. 2E and supplemental Tables S4–S6). Given that the 293T cells were grown in normal culture conditions when they were collected for this study, the total spectral counts and HCIPs for each PTP family member identified should be considered as the basal state interactome of the human PTP family (Fig. 2C).
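The HCIP selection rule stated above (MUSE score ≥ 0.9 and more than one raw spectral count) amounts to a simple filter. In this sketch the record layout, the placeholder bait and prey names, and the scores are all invented for illustration.

```python
def select_hcips(interactions, score_cutoff=0.9):
    """Keep interactions with MUSE score >= cutoff and raw spectral count > 1.

    interactions: iterable of dicts with the keys
    'bait', 'prey', 'muse_score', and 'spectral_count' (hypothetical layout).
    """
    return [
        i for i in interactions
        if i["muse_score"] >= score_cutoff and i["spectral_count"] > 1
    ]


# Placeholder records; these values are not from the study.
example = [
    {"bait": "BAIT1", "prey": "PREY_A", "muse_score": 0.97, "spectral_count": 12},
    {"bait": "BAIT1", "prey": "PREY_B", "muse_score": 0.95, "spectral_count": 1},
    {"bait": "BAIT1", "prey": "PREY_C", "muse_score": 0.42, "spectral_count": 30},
]
hcips = select_hcips(example)
```

Only the first record passes: the second has a high score but a single spectral count, and the third is abundant but scores below the cutoff.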

Summary of the human protein tyrosine phosphatase family proteomics study. A, Data reproducibility for the human PTP family proteomic study. Eighteen atypical-MKP DSP subfamily members were subjected to a biological repeat of the TAP-MS analysis. Overall identified preys and high-confidence candidate interacting proteins (HCIPs) were used to estimate data reproducibility. The bar graph represents raw data reproducibility. The correlation coefficients were calculated on the basis of raw data. The “cutoff peptide number” means that only proteins with a certain number of identified peptides were considered. B, The total peptide and protein numbers obtained from the MS analysis are listed. A MUSE score > 0.9 was used as the cutoff to identify HCIPs. C, The total spectral counts (TSCs) and corresponding number of HCIPs for each PTP bait protein are shown together. D, E, Gene ontology annotation for the identified PTP interactors. The cellular localization (D) and cellular functions (E) of the PTP family, based on a GO annotation of their HCIPs, are shown as pie graphs.

Placement of the Human PTP Family Within Key Signaling Pathways and Cellular Functions

Although it is widely accepted that the PTP family functions in various signaling pathways, to our knowledge, no systematic analysis of the signaling pathways or cellular functions regulated by the PTP family has been published. To address this gap, we conducted a GO analysis of signaling pathways for HCIPs of each PTP subfamily member (Fig. 3A and supplemental Table S4). Intriguingly, six key signaling pathways (PI3K-AKT, RAS-ERK, mTOR-S6K, Hippo-YAP, WNT, and TGFβ) were highly enriched for HCIPs of PTP subfamilies (Fig. 3A). Moreover, regulation of the cell cycle and vesicle trafficking, two major cellular functions, were also identified as generic cellular activities that involve the PTP family (Fig. 3A).

Gene ontology annotation of key signaling pathways and cellular functions of the protein tyrosine phosphatase family. A, B, Identification of key signaling pathways and cellular functions enriched in the protein tyrosine phosphatase (PTP) family. The percentage of identified pathways and functions for each subfamily is indicated (A). The relative enrichment of each pathway and function was compared within each PTP subfamily (A) and between each PTP subfamily (B). Only the statistically significant (p < 0.05) results are shown. C, The interactome of the atypical-MKP dual-specificity phosphatase (DSP) subfamily. D, The Cytoscape-generated merged interaction network for the atypical-MKP DSP subfamily and the RAS-ERK pathway. E, The confidence of association between the RAS-ERK pathway and atypical-MKP DSP subfamily was estimated by using the MUSE score. F, The Cytoscape-generated merged interaction network for the atypical-MKP DSP subfamily and the identified vesicle trafficking-related proteins.

The atypical-MKP DSP subfamily predominantly associated with the RAS-ERK pathway (Fig. 3A), which is consistent with the results of previous studies (6). Unexpectedly, the classic PTP and MTM/MTMR DSP subfamilies were relatively enriched in the Hippo-YAP pathway and cell cycle process, respectively (Fig. 3A). When a single signaling pathway or cellular function was compared among the different PTP subfamilies, the atypical-MKP DSP family was found to be primarily involved in RAS-ERK, PI3K-AKT, and vesicle trafficking pathways and function, whereas the classic PTP subfamily was still highly enriched in the Hippo-YAP pathway (Fig. 3B). These data suggest that the PTP subfamilies participate in distinct signaling pathways or cellular functions and that several key pathways or cellular functions are dominantly regulated by a specific PTP subfamily.

Involvement of the Atypical-MKP DSP Family in the RAS-ERK Pathway and Vesicle Trafficking

To confirm signaling pathway and function enrichment for each PTP subfamily, we first generated the protein interactome for the atypical-MKP DSP subfamily, the best characterized DSP subgroup (Fig. 3C). This subfamily has the ability to dephosphorylate the effector kinases from three major MAPK signaling pathways: ERK1/2, JNK and p38 kinases (6, 33). Because MAPKs play critical roles in cell growth, differentiation, and survival, these DSPs are key players in tissue homeostasis. Consistently, the RAS-ERK pathway was identified as a bona fide pathway that was enriched in HCIPs of the atypical-MKP DSP subfamily (Fig. 3A and 3B). Surprisingly, our HCIP lists not only uncovered a number of MAPK family components but also revealed many MAPK upstream components in the RAS-MAPK pathway including RAS GTPases (HRAS, KRAS, and NRAS), RAF family members (RAF1, BRAF, and ARAF), and MEK kinases (Fig. 3D). We also identified several MAPK family-related MAP3K and MAP4K kinases (i.e. MAP3K3, MAP3K5, MAP3K12, MAP4K4, and MAPKAPK3) (Fig. 3D). To profile atypical-MKP DSPs and their associations with the RAS-ERK (MAPK) pathway, we clustered the members of this subfamily for each component of the RAS-ERK (MAPK) pathway (Fig. 3E). The data suggest that the atypical-MKP DSP subfamily is extensively involved in the regulation of the RAS-RAF-MEK-MAPK (ERK) pathway, but is not restricted to MAPK kinases.

In the atypical-MKP DSP subfamily, DUSP2 was found to associate with several components of the RAS-ERK (MAPK) pathway (Fig. 3D), and these interactions were assigned relatively high MUSE scores (Fig. 3E), suggesting that DUSP2 is a putative regulator of upstream components in the RAS-ERK (MAPK) pathway. Indeed, DUSP2 interacted with the RAS small GTPase family member HRAS, the RAF kinases ARAF and RAF1, and the MEK kinase MAP2K3, all of which were identified in our proteomic study; MAPK14 and MAP4K2 served as positive and negative controls, respectively (supplemental Fig. S1A). Loss of DUSP2 (supplemental Fig. S1B) not only activated ERK1/2 kinases but also activated upstream RAF1 and MEK kinases in the RAS-ERK (MAPK) pathway (supplemental Fig. S1C). MAP2K3 MEK kinase was also activated in cells transduced by DUSP2 shRNA, which led to the activation of its downstream kinase p38MAPK and p38MAPK-downstream kinase MAPKAPK2 (supplemental Fig. S1C). On the other hand, overexpression of DUSP2, but not its catalytically inactive mutant (C257S), suppressed the serum-induced activation of RAF, MEK, ERK1/2, MAP2K3, p38MAPK, and MAPKAPK2 kinases in the RAS-ERK (MAPK) pathway (supplemental Fig. S1D), suggesting that phosphatase activity is required for DUSP2-mediated RAS-ERK (MAPK) pathway inhibition. To rule out feedback regulation in this process, we pre-treated starved cells with either the MEK inhibitor PD98059 or the p38MAPK inhibitor SB203580 and found that overexpression of DUSP2 still inactivated the RAF and MAP2K3 kinases, respectively, upon serum stimulation (supplemental Fig. S1E and S1F), further confirming the roles of DUSP2 in the regulation of upstream components of the RAS-ERK (MAPK) pathway. Together, these results indicate that DUSP2 may have extensive functions in the regulation of the RAS-ERK (MAPK) pathway rather than acting only on the MAPK kinases in this pathway.

The vesicle trafficking process was also more enriched in HCIPs of the atypical-MKP DSP subfamily than in HCIPs of other PTP subfamilies (Fig. 3B). As shown in Fig. 3F, the members of the atypical-MKP DSP subfamily were involved in several aspects of vesicle trafficking, including vesicle membrane sorting (SNX17 and SAMM50), the SNARE complex component (VTI1B), vacuolar protein sorting (VPS4A, VPS26A, and VPS29), Rab small GTPases (Rab family), and vesicle trafficking motors (DCTN family). The interactions between these identified vesicle trafficking components and atypical-MKP DSPs were confirmed (supplemental Fig. S2A and S2B). Interestingly, some atypical-MKP DSPs co-localized with vesicle trafficking components in the cell (supplemental Fig. S2C). We showed that the interaction between DUSP1 and SNX17 was specific (supplemental Fig. S2D).

Because loss of SNX17 (supplemental Fig. S2E) did not have a dramatic effect on DUSP1's activity (supplemental Fig. S2F), we investigated the role of DUSP1 in the regulation of SNX17. SNX17 was previously shown to mediate cargo retrieval away from the lysosome-dependent degradation pathway and stabilize some transmembrane receptors, such as integrins (34–36). Interestingly, loss of DUSP1 (supplemental Fig. S2G) resulted in the down-regulation of mature β1 integrin but not of its immature form (supplemental Fig. S2H). Lysosomal inhibition by bafilomycin prevented the degradation of β1 integrin in DUSP1 knockdown cells (supplemental Fig. S2H), suggesting that DUSP1 is involved in the regulation of lysosome-mediated β1 integrin degradation. Moreover, overexpression of DUSP1, but not of its phosphatase-inactive mutant (C258S), stabilized mature β1 integrin (supplemental Fig. S2I), indicating that the phosphatase activity of DUSP1 is required in this process. Intriguingly, loss of SNX17 reversed the DUSP1-mediated stabilization of mature β1 integrin (supplemental Fig. S2I), suggesting that DUSP1 regulates β1 integrin through SNX17. In addition, loss of DUSP1 attenuated integrin pathway activation (supplemental Fig. S2J) and suppressed cell migration (supplemental Fig. S2K). These data not only suggest that DUSP1 functions together with SNX17 to regulate vesicle-dependent β1 integrin turnover but also demonstrate that the vesicle trafficking process is regulated by the atypical-MKP DSP subfamily.

Overview of the Classic PTP Family Protein Interaction Landscape

Classic PTPs, which recognize and dephosphorylate phospho-tyrosine residues, can be further grouped as receptor PTPs (rPTPs; 21 members) and non-receptor PTPs (nrPTPs; 17 members) (2). The HCIPs identified through our proteomic analysis for the rPTP group and the nrPTP group were generally different (Fig. 4A and 4B), suggesting that these two PTP groups have distinct cellular functions. The prey-oriented analysis identified five major clusters in the classic PTP subfamily: three in the nrPTP group (clusters 2, 3, and 5), one in the rPTP group (cluster 4), and one that overlaps between the rPTP and nrPTP groups (cluster 1) (Fig. 4C). A key signaling analysis of each cluster showed that these five clusters are involved in multiple cellular signaling pathways and cellular functions (Fig. 4C). Consistently, the Hippo-YAP pathway, which was shown to be one of the major signaling pathways enriched in HCIPs of this subfamily (Fig. 3A and 3B), was also identified in three of the five clusters (Fig. 4C). These results suggest that members of the classic PTP subfamily show functional similarity within cells and that the Hippo-YAP pathway could be one node associated with the function of the classic PTP subfamily.

Crosstalk between the classic protein tyrosine phosphatase subfamily and the Hippo-YAP pathway. A, Protein interaction network for the classic protein tyrosine phosphatase (PTP) subfamily (transmembrane receptor-like phosphatases [rPTPs] and non-rPTPs [nrPTPs]). rPTP and nrPTP bait proteins were labeled in blue and red, respectively. B, Comparison of rPTP- and nrPTP-associated high-confidence candidate interacting proteins (HCIPs). Overlay of volcano plots of protein enrichments of HCIPs identified in the nrPTP subfamily over the rPTP subfamily, plotted against corresponding p values. The X axis indicates the protein enrichment in the log2 scale, and the Y axis indicates the significance of the changes in the -log10 (p value) scale. X<0 represents the prey enrichment for the rPTP subfamily, and X>0 represents the prey enrichment for the nrPTP subfamily. C, A heatmap was generated from the hierarchical clustering of HCIPs of the classic PTP subfamily. Five prominent HCIP clusters were manually selected, and their signaling pathway annotations are shown. The colors of squares in the heat map represent the number of identified HCIP peptides for each bait protein. D, The merged interaction network among PTPN14, PTPN21, and Hippo pathway components. Reciprocal identification between baits is indicated by a double-headed arrow, and unidirectional identification is indicated by a single-headed arrow. E, Schematic illustration of the domain structures of PTPN21, WWC1, and YAP. F, The association between PTPN21 and WWC1. A pulldown assay was performed with S protein beads, and the indicated proteins were detected by Western blotting. G, PTPN21 specifically interacted with WWC1 in the Hippo pathway. H, Two WW domains of WWC1 were required for its association with PTPN21. I, YAP directly interacted with PTPN14 and PTPN21. Bacterially purified GST-YAP was used for the pulldown experiment. The indicated proteins were detected by Western blot analysis. GST-YAP was shown by Coomassie blue staining. J, Two WW domains of YAP are required for its binding to PTPN21. K, The linker region of PTPN21 mediated its binding to YAP. L, PTPN21 translocated YAP from the nucleus into the cytoplasm. Immunofluorescence staining was performed to detect the localization of endogenous YAP in HeLa cells overexpressing SFB-PTPN5 or SFB-PTPN21. DAPI, nucleus; M, merged. M, Both PTPN14 and PTPN21 suppress YAP activity. Transcripts of YAP target genes were detected by quantitative PCR in HEK293A cells transduced by the indicated shRNAs. ** p < 0.01 and *** p < 0.001. N, A proposed model of Hippo pathway regulation by PTPN14 and PTPN21.
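
The axes described for the volcano plot in panel B can be illustrated with a minimal Python sketch. This is not the analysis code used in the study: the function name `volcano_point`, the pseudocount of 1, and the example numbers are assumptions made for illustration, and the p value is taken as an input because the statistical test is not specified here.

```python
import math

def volcano_point(nr_mean, r_mean, p_value, pseudocount=1.0):
    """Return (x, y) for one prey on a volcano plot.

    x = log2 enrichment of the prey in nrPTP over rPTP purifications
    y = -log10(p value)
    The pseudocount is an assumption (not from the study); it avoids
    division by zero when a prey is absent from one group.
    """
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p_value must be in (0, 1]")
    x = math.log2((nr_mean + pseudocount) / (r_mean + pseudocount))
    y = -math.log10(p_value)
    return x, y

# Hypothetical prey: mean of 7 peptides with nrPTP baits, 1 with rPTP baits.
x, y = volcano_point(7.0, 1.0, 0.01)
side = "nrPTP" if x > 0 else "rPTP" if x < 0 else "neither"
print(f"x={x:.2f}, y={y:.2f}, enriched for {side}")
```

With this convention, a prey at X>0 is enriched for the nrPTP group and a prey at X<0 for the rPTP group, matching the legend.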

Regulation of the Hippo-YAP Pathway by PTPN14 and PTPN21

The Hippo pathway plays fundamental roles in tissue homeostasis and organ size control by restricting the activity of its downstream effector, YAP. Genetic mutations or deletions of Hippo pathway components lead to tumor formation, suggesting that Hippo is a tumor suppressor pathway (37, 38). The discovery that the Hippo-YAP pathway is functionally enriched in the classic PTP subfamily suggests that this pathway is regulated by this PTP subfamily. Indeed, one classic PTP family member, PTPN14, has been shown to negatively regulate YAP by interacting with YAP and WWC1 in the Hippo pathway (21, 32, 39). To explore the relationship between the classic PTP subfamily and the Hippo-YAP pathway, we compared the classic PTP interactome with the human Hippo pathway interactome generated in our previous study (32). Another classic PTP family member, PTPN21, was also found to associate with YAP and WWC1 in the Hippo pathway (Fig. 4D). These findings suggest that PTPN21 is another regulator of the Hippo-YAP pathway.

PTPN21 is a FERM-containing PTP that has been shown to negatively regulate focal adhesion kinase and EGFR activation (40, 41). We confirmed the association between PTPN21 and WWC1 (Fig. 4F) and showed that their binding was specific among the YAP upstream components of the human Hippo pathway (Fig. 4G). Two WW domains, located at the N terminus of WWC1 (Fig. 4E), were required for its binding with PTPN21 (Fig. 4H). Moreover, PTPN14 and PTPN21 had a similar capacity for binding to YAP (Fig. 4I). Interestingly, the YAP paralog protein TAZ was not able to associate with PTPN21, indicating the presence of binding specificity between YAP and PTPN21 (Fig. 4J). Like the interaction between YAP and PTPN14, the binding of YAP and PTPN21 required the two WW domains of YAP and the linker region of PTPN21 (Fig. 4J and 4K). These results confirm our proteomic findings and demonstrate that PTPN21 is a bona fide binding partner of both WWC1 and YAP in the Hippo pathway.

Next, we determined whether PTPN21 regulated the Hippo-YAP pathway. As shown in Fig. 4L, PTPN21 mostly localized on the cell membrane, and overexpression of PTPN21 translocated YAP from the nucleus into the cytoplasm, whereas PTPN5, used as a negative control, had no such effect. YAP forms a complex with the transcription factor TEAD in the nucleus to promote downstream gene transcription. However, when YAP was translocated by PTPN21 into the cytoplasm (Fig. 4L), it lost its ability to act as a transcriptional co-activator. This finding indicates that PTPN21 is a negative regulator of YAP. Because PTPN14 and PTPN21 share the same binding partners (YAP and WWC1) in the Hippo pathway and have similar domain structures (21), we determined whether PTPN14 and PTPN21 behaved similarly in Hippo pathway regulation. Indeed, loss of either PTPN14 or PTPN21 increased the transcription of YAP downstream target genes, whereas double knockdown of PTPN14 and PTPN21 further increased the expression of these genes (Fig. 4M). These results demonstrate that PTPN21 and PTPN14 function similarly in the suppression of YAP and that they may act together in the regulation of the Hippo-YAP pathway (Fig. 4N).

Roles of the MTM/MTMR PTP Subfamily in Cell Cycle Regulation

The MTM/MTMR DSP subfamily comprises 15 members: MTM1 and MTMRs 1–14 (7). Nine members of this family have been shown to have catalytic activity as both lipid phosphatases and PTPs (7). A proteomic analysis of the MTM/MTMR DSP subfamily showed that individual family members had relatively divergent interactomes (Fig. 5A), which was consistent with the finding that this PTP subfamily exhibits relatively unique and nonoverlapping functions in the cell (7). The signaling pathway and cellular function analysis described above revealed that cell cycle regulation was one of the key functions of this PTP subfamily (Fig. 3A). To further investigate this function, we conducted a GO analysis of HCIPs of the MTM/MTMR DSP subfamily to determine their functions in different cell cycle phases (Fig. 5B). Several MTM/MTMR subfamily members associated with components in different cell cycle phases; MTMR8 and MTMR3 may regulate G1/S transition through CDK4, and MTMR4 may regulate mitosis through CEP55 (Fig. 5B). These results support the potential roles of MTM/MTMR DSP subfamily members in cell cycle regulation.

The roles of the myotubularin/myotubularin-related dual-specificity phosphatase subfamily in cell cycle regulation. A, Protein interaction network of the myotubularin/myotubularin-related (MTM/MTMR) DSP subfamily. B, Identification of the members of the MTM/MTMR DSP subfamily involved in each cell cycle phase. C, The merged protein interaction network between MTMR4 and CEP55. D, Schematic illustration of the domain structures of MTMR4 and CEP55. The GPP(X)3Y motif at the N terminus of MTMR4 is shown, which is conserved in different species. E, The interaction between MTMR4 and CEP55 was validated. Pulldown experiments were performed using S protein beads, and the indicated proteins were detected by Western blotting analysis. F, The N-terminal 12 amino acids of MTMR4 were required for its interaction with CEP55. G, The single site mutation Y11A in the GPP(X)3Y motif of MTMR4 disrupted the interaction between MTMR4 and CEP55. H–I, MTMR4 was down-regulated by shRNA in HeLa cells. MTMR4 protein levels (H) and mRNA levels (I) were detected in HeLa cells transduced with the indicated shRNA. J, Loss of MTMR4 induced a cytokinesis defect. Immunofluorescence staining was performed in shRNA-transduced HeLa cells, as indicated by GFP at different cell cycle phases. Ri, shRNA knockdown; Meta, metaphase; Telo, telophase. K, Loss of MTMR4 induced bi-nuclear cell formation. The percentage of bi-nuclear cells was analyzed for the indicated shRNA and plasmid-transduced cells. *** p < 0.001. L, The vesicle-like localization of MTMR4 in different cell cycle phases. The localizations of SFB-MTMR4 and HA-CEP55 were indicated by immunofluorescence staining. M, MTMR4 associated with endosome proteins, which are involved in membrane fusion at the cleavage stage. HeLa cells were transfected with the indicated plasmids, synchronized at cytokinesis, and subjected to pulldown assays.

MTMR4 Couples with CEP55 to Regulate Cytokinesis

We next performed functional validation to test the hypothesis that MTMR4 plays a major role in cell cycle regulation, especially during mitosis and cytokinesis, with CEP55 as its functional partner in this process. The results of the protein interactome analysis suggested that CEP55 and MTMR4 can form a protein complex, because they were identified in the reciprocal TAP-MS analyses (Fig. 5C). Notably, a proteomic analysis of the CEP55 complex did not identify any other MTMR4-associated protein, and vice versa (Fig. 5C), suggesting that they form a distinct complex in vivo.
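
The logic of reciprocal identification can be sketched in a few lines of Python. This is an illustrative sketch only, not the pipeline used in the study: the function names, the dictionary layout, and the placeholder prey names (`PREY_A`, `PREY_B`, `PREY_C`) are hypothetical, and the toy data simply mimic the situation described above, in which two baits recover each other but share no other prey.

```python
def reciprocal_pairs(interactome):
    """Return bait pairs that recovered each other as preys.

    `interactome` maps each bait name to the set of its prey names.
    """
    pairs = set()
    for bait, preys in interactome.items():
        for prey in preys:
            # A pair is reciprocal only if the prey was also used as a bait
            # and pulled down the original bait.
            if prey != bait and bait in interactome.get(prey, set()):
                pairs.add(frozenset((bait, prey)))
    return pairs

def shared_preys(interactome, a, b):
    """Preys common to baits a and b, excluding a and b themselves."""
    return (interactome[a] & interactome[b]) - {a, b}

# Toy data; prey names other than MTMR4 and CEP55 are placeholders.
toy = {
    "MTMR4": {"CEP55", "PREY_A", "PREY_B"},
    "CEP55": {"MTMR4", "PREY_C"},
}
print(reciprocal_pairs(toy))
print(shared_preys(toy, "MTMR4", "CEP55"))
```

In this toy example the two baits form one reciprocal pair and the set of shared preys is empty, which is the pattern interpreted in the text as a distinct complex.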

CEP55 is a centrosome- and midbody-associated protein that acts as a key regulator of cytokinesis (42). CEP55 functions together with the endosomal sorting complex required for transport machinery to complete the final abscission during cytokinesis, where CEP55 recruits TSG101 and ALIX to complete membrane fission events (43). Interestingly, one GPP(X)3Y motif, characterized within the CEP55-binding proteins (i.e. ALIX and TSG101) (43), was identified at the N terminus of MTMR4, and this motif in MTMR4 was highly conserved in different species (Fig. 5D). We confirmed the interaction between CEP55 and MTMR4 (Fig. 5E) and that their binding was mediated through the GPP(X)3Y motif of MTMR4, because either deletion of the N terminus of MTMR4 (Fig. 5F) or mutation of the Tyr residue within this motif (Fig. 5G) disrupted the association between CEP55 and MTMR4. These results revealed that CEP55 is a bona fide binding partner of MTMR4.

To determine the role of MTMR4 in mitosis, we performed shRNA-mediated loss-of-function studies of MTMR4 in HeLa cells (Fig. 5H and 5I). Loss of MTMR4 induced bi-nuclear cell formation (Fig. 5J and 5K), suggesting the presence of a cytokinesis defect in the absence of MTMR4. Indeed, MTMR4 knockdown cells passed through metaphase and telophase but were blocked at cytokinesis, as shown by the extended central spindle bridge between two daughter cells (Fig. 5J). The CEP55 association was required for this process, because unlike wild-type MTMR4, the CEP55 binding-deficient MTMR4 mutant did not rescue the formation of bi-nuclear cells (Fig. 5K). These results indicate that MTMR4 and its association with CEP55 are required for the completion of cytokinesis.

To explore the role of MTMR4 in cytokinesis, we examined the subcellular localization of MTMR4 in different cell cycle phases (Fig. 5L). In interphase cells, MTMR4 showed vesicle-like localization (Fig. 5L), which is consistent with previous reports that MTMR4 localizes in early and recycling endosomes (44, 45). This vesicle-like localization of MTMR4 persisted through mitosis and cytokinesis, and MTMR4 also localized on the central spindle bridge during cytokinesis (Fig. 5L). Indeed, endosomes have been found to be enriched at the site of abscission during cytokinesis to facilitate membrane fusion (46–49). Moreover, MTMR4 associated with several endosome proteins that have been reported to regulate the final cleavage during cytokinesis (46–50), especially in cells synchronized at late mitosis (Fig. 5M). These results indicate that MTMR4 is involved in vesicle transportation and membrane fusion during the abscission stage.

Interestingly, MTMR4 and CEP55 did not show identical localization in interphase and mitotic cells, but they co-localized at the site of abscission during late cytokinesis (Fig. 5L). Because CEP55 is a known midbody-associated protein, these data suggest that CEP55 serves as the docking site at the midbody to recruit MTMR4-containing endosomes to the abscission site and complete membrane fusion for the final cleavage during cytokinesis.

DISCUSSION

This proteomic study identified associated proteins for almost 70% of PTPs encoded in the human genome and therefore provides a glimpse into the protein-protein interaction network of the human PTP family. The interactomes generated for the human PTP family reveal extensive PTP-involved cellular functions and extend our understanding of the dynamic regulation of protein tyrosine phosphorylation. Moreover, according to the established annotation for the identified PTP interactors, the major PTP subfamilies are associated with several key signaling pathways or cellular functions, as shown by the findings that one or two of these pathways and functions were highly enriched for each subfamily.

Notably, the annotation of the atypical-MKP DSP subfamily not only confirmed the results of previous studies that showed their roles in MAPK kinase family regulation but also suggested more extensive regulatory functions for this subfamily that cover the whole RAS-RAF-MEK-MAPK pathway. As for the classic PTP subfamily, we further dissected the crosstalk between this subfamily and the Hippo-YAP pathway; this led to the discovery that PTPN21 functions similarly to PTPN14 to regulate the Hippo-YAP pathway. Finally, in line with the implications of the GO annotation, our examination of the potential roles of the MTM/MTMR DSP subfamily in cell cycle regulation revealed that MTMR4 has a role in cytokinesis through CEP55. As an endosome-associated protein, MTMR4 may form a complex with CEP55 during the abscission stage to facilitate membrane fusion and therefore complete cytokinesis. In this process, MTMR4 can serve as a bridge to dock endosome vesicles to the CEP55-marked midbody. This hypothesis was confirmed by a recent finding that the MTMR4 FYVE motif, which is required for the association between MTMR4 and the endosome vesicle, was also required to secure normal cytokinesis (51). Taken together, these functional validations further demonstrate the success of the human PTP family pathway annotation built on our proteomic study.

In affinity purification followed by mass spectrometry (AP/MS), one of the major challenges is to distinguish the bona fide interactors from the large number of nonspecific interactors (52). More importantly, it is difficult to systematically analyze and rank a list of hundreds, if not thousands, of putative associated proteins. Several post-AP/MS data analysis algorithms have been proposed to eliminate the common contaminants and generate a list of HCIPs (53–56). Unfortunately, we repeatedly observed a considerable number of false-positive and false-negative discoveries, as defined by experimental validations, in the results generated by the available algorithms. Another issue is that all the previous algorithms were designed to find the most confident interacting proteins, sometimes by combining several filters, such as NSAF and NSAF-contaminant extraction (55) or CompPASS-Z and CompPASS-D/WD (56, 57), and sometimes even by combining different algorithms (58, 59). Although these approaches ensure the quality of the HCIP lists, they may cause an even more significant loss of bona fide but weakly interacting proteins. These relatively weak interactors could be functionally important, especially in signal transduction pathways. For example, several Wnt pathway regulators were identified as weakly associated proteins in Wnt pathway AP/MS studies, but were confirmed later by a genome-wide siRNA screen (60).

We developed a MUSE scoring system for the current PTP proteomics data analysis. To demonstrate the advantage of the MUSE algorithm, we compared the high-confidence interactions generated by MUSE, SAINT-express (the upgraded version of SAINT), and CompPASS-WD scores (supplemental Fig. S11A). The three algorithms recognized similar numbers of interactions: MUSE, 1483 (MUSE score > 0.9); SAINT-express, 1525 (SAINT score > 0.99); CompPASS-WD, 1217 (5% threshold). However, only 311 interactions were recognized by all three algorithms, suggesting that different data analysis algorithms generate different results (supplemental Fig. S11A). We overlapped the HCIP lists generated by the three algorithms with knowledge databases to identify the potential true-positive hits (supplemental Fig. S11B). The three algorithms recognized 182 previously reported interactions in total. MUSE missed only 29 (15.9% potential false-negative rate), whereas CompPASS and SAINT missed 64 and 69, respectively (35.2% and 37.9% potential false-negative rates). Moreover, MUSE and CompPASS shared 22 reported interactions that were not recognized by SAINT; MUSE and SAINT shared 21 reported interactions that were not recognized by CompPASS; CompPASS and SAINT shared only 5 reported interactions that were not recognized by MUSE (supplemental Fig. S11B). These findings suggest that MUSE recognizes more true-positive interactions than do SAINT and CompPASS.
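
The potential false-negative rates quoted above follow directly from the reported counts, as the following short Python check illustrates. The function and variable names are ours, chosen for illustration; only the counts (182 reported interactions; 29, 64, and 69 missed) come from the text.

```python
def missed_rate(missed, total):
    """Percentage of reference interactions that an algorithm did not call."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * missed / total

TOTAL_REPORTED = 182  # previously reported interactions found by any algorithm
MISSED = {"MUSE": 29, "CompPASS": 64, "SAINT": 69}  # counts quoted in the text

rates = {name: round(missed_rate(n, TOTAL_REPORTED), 1) for name, n in MISSED.items()}
print(rates)  # {'MUSE': 15.9, 'CompPASS': 35.2, 'SAINT': 37.9}
```

Note that the denominator is the union of reported interactions recovered by the three algorithms, so these rates are relative to that pooled reference set rather than to all known interactions.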

We also compared the false-positive hits generated by MUSE, SAINT, and CompPASS by overlapping them with the CRAPome database, a collection of proteins frequently detected in AP/MS experiments. For MUSE, only 10.9% of hits were potential false positives (i.e. overlapped with CRAPome at a 20% frequency). However, this rate increased to 35.5% and 45.5% for the results generated by SAINT and CompPASS, respectively (supplemental Fig. S11C). These analyses suggest that MUSE recognizes fewer false-positive interactions than do SAINT and CompPASS.
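
A contaminant-overlap estimate of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: we assume that "at a 20% frequency" means a protein appears in at least 20% of control purifications, the function name `contaminant_fraction` is hypothetical, and the frequencies below are toy values rather than real CRAPome entries.

```python
def contaminant_fraction(hits, background_freq, threshold=0.20):
    """Fraction of hits seen in at least `threshold` of control purifications.

    `background_freq` maps protein name -> frequency (0..1) in a contaminant
    repository; proteins absent from the map are treated as frequency 0.
    """
    hits = set(hits)
    if not hits:
        return 0.0
    flagged = {p for p in hits if background_freq.get(p, 0.0) >= threshold}
    return len(flagged) / len(hits)

# Toy values only; these are not real CRAPome frequencies.
toy_freq = {"PROT_A": 0.65, "PROT_B": 0.05, "PROT_C": 0.20}
toy_hits = ["PROT_A", "PROT_B", "PROT_C", "PROT_D"]
print(contaminant_fraction(toy_hits, toy_freq))  # 0.5
```

Applying the same function to the hit list of each scoring algorithm would give directly comparable potential false-positive rates, provided the same threshold is used throughout.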

On the basis of our findings, together with our previous results (supplemental Figs. S8–S10), we concluded that our MUSE algorithm performed better than did the SAINT and CompPASS algorithms, at least for this particular PTP data set analysis.

Of note, we failed to establish HEK293T cells that expressed some of the PTPs, probably because of their functions as suppressors of cell proliferation. Therefore, this large-scale proteomic study of the human PTP family covered only ∼70% of PTPs, and our conclusions and follow-up analyses are based on this subset. Moreover, because our proteomic study of the PTP family was performed in HEK293T cells, the protein-protein interaction network and signaling enrichment analysis generated here may have some limitations because of cell line or cell type specificity. In addition, our PTP family proteomic analysis failed to identify some of the substrates of a small number of PTPs for which substrates are known. This is likely because of the transient nature of enzyme-substrate interactions. Catalytically inactive PTP mutants could be used as an alternative approach for the identification of PTP substrates (61). Moreover, specific stimuli or treatment with phosphatase inhibitors could also be used for the discovery of specific PTP-associated complexes. Nevertheless, our current proteomic study of human PTPs serves as a strong foundation for the further investigation of these interesting proteins under various cellular, physiological, and pathological conditions.

Acknowledgments

We thank our colleagues in Dr. Chen's laboratory for insightful discussion and technical assistance, especially Drs. Jingsong Yuan and Lin Feng. We thank Drs. Rudy Guerra and Benjamin White (Rice University) for bioinformatics discussion and assistance in developing the MUSE algorithm. We also thank Drs. Steven Gygi and Ross Tomaino (Taplin Mass Spectrometry Facility, Harvard Medical School) for the mass spectrometry analysis and raw data. We thank Kathryn Hale and Ann Sutton for proof-reading the manuscript. We thank Dr. Yutong Sun and the staff of the shRNA-ORFeome core facility at MD Anderson Cancer Center for the ORFs and shRNAs.

Footnotes

Author contributions: W.W. and J.C. conceived the experiments. W.W. performed all of the experiments, with assistance from X.L., M.K.T., K.E.A., A.V.S., and J.C. X.L. performed the MUSE analysis. X.L. and W.W. performed the bioinformatics analysis. J.C. supervised the study. W.W., X.L., and J.C. wrote the manuscript.

* This work was supported in part by the Breast Cancer Research Foundation-AACR Career Development Award for Translational Breast Cancer Research to W.W. (grant number 16-20-26-WANG) and a Department of Defense Era of Hope research scholar award to J.C. J.C. is also a member of M.D. Anderson Cancer Center (CA016672).

(2006) Epigenetic disruption of two proapoptotic genes MAPK10/JNK3 and PTPN13/FAP-1 in multiple lymphomas and carcinomas through hypermethylation of a common bidirectional promoter. Leukemia 20, 1173–1175