This web page was produced as an assignment for an undergraduate course at Davidson College.

Review of a Scientific Article on Physical and In Silico Approaches to Investigating Protein Interactomes

This webpage reviews "Physical and in silico approaches identify DNA-PK in a Tax DNA-damage response interactome," by Ramadan et al. (article here). The review is mostly a summary of the findings detailed in the paper, followed by some commentary at the end.

What was the goal of their study?

Tax is a protein that is believed to be involved in the initiation of Adult T-cell Leukemia (ATL), which is caused by Human T-cell Leukemia Virus type 1 (HTLV-1). As a result of its central role in the development of ATL, there has been widespread effort to figure out the functions of Tax. According to the authors of this paper, these efforts have led to the generation of large amounts of data on the interactions of Tax protein with other proteins.

The authors believe that this surge of data has created a new problem: validating all of the reported information on the interactome of Tax protein. In their study, the researchers demonstrate an effective method of investigating the interactome of a protein. They used both a physical method and an in silico method to find proteins that are involved in DNA damage repair and that interact with Tax protein. Their goal was both to demonstrate the effectiveness of their dual-approach method and to get an idea of how Tax-induced defects in DNA repair work.

The physical approach

The researchers created an S-Tax-GFP vector that expressed Tax protein with His6 and S-tags attached at the amino terminus and GFP at the carboxyl terminus. They then transfected this vector into 293T cells. This tagged version of Tax was functional: it behaved like wild-type Tax and localized to the same places in the cell. The GFP tag allowed the researchers to check whether the transfection was successful and to see where Tax protein was localized within a cell.

Next, the cells expressing the tagged Tax were lysed, and the lysate was mixed with S-agarose beads. Additional cell lysate was added to the mixture to increase the amount of protein available to bind Tax. Tagged Tax, and any protein complexes containing it, should bind to the beads through the S-tag. The Tax-containing protein complexes were then purified from the rest of the cell material by washing the beads with an appropriate buffer. To increase the yield of Tax-containing protein complexes, the researchers also pre-treated the additional cell lysate with S-agarose beads, so that proteins that happened to have an affinity for the beads themselves were removed beforehand. The isolated protein complexes were then analyzed using LC-MS/MS. The researchers repeated the experiment three times and conducted a control experiment using an S-GFP vector.

The researchers ranked the proteins detected by LC-MS/MS according to several criteria that reflected the amount of protein measured and the strength of binding. This experiment revealed an interaction between DNA-PK and Tax, which had not been reported before. In Table 2, DNA-dependent protein kinase (DNA-PK) is the top hit among the five highest-ranked proteins when the hits are sorted by the number of unique peptides detected for each protein.

Table 2. Tax binding proteins sorted by number of unique peptides.

To confirm binding between DNA-PK and Tax, the researchers introduced S-GFP and S-Tax into 293T cells and isolated protein by mixing the cell lysate with S-agarose beads. The isolated protein from each type of vector was separated by SDS-PAGE and then probed with antibodies against Tax, DNA-PKcs (the catalytic subunit of DNA-PK) and GFP. The results of the immunoblotting are shown in Figure 5. The first two lanes are for S-GFP, one before affinity purification and one after. Similarly, the last two lanes are for S-Tax, one before affinity purification and one after. As expected, the S-GFP lane after affinity purification shows no DNA-PKcs band, since GFP alone does not bind to DNA-PKcs. In the S-Tax lane after affinity purification, however, we can see a faint band, showing that DNA-PKcs does indeed bind to Tax.

Figure 5. Tax protein binds with DNA-PKcs. Immunoblotting was performed with antibodies for Tax, DNA-PKcs and GFP.

The in silico approach

The researchers examined research articles to find proteins that had been reported to bind with Tax protein and were known to have a potential function in DNA repair (Table 1). They hoped that examining the interactions between Tax protein and proteins associated with DNA repair would elucidate the mechanism by which Tax-induced DNA repair defects occur. The researchers eventually chose four proteins: Rad51, TOP1, Chk2 and 53BP1. Through their analyses, as described below, the researchers were able to show that DNA-PKcs is likely to have a significant role in the DNA repair mechanism that is affected by Tax protein.

Figure 1

For their first figure, the researchers graphed the four proteins listed above together with all the proteins that bind directly to them (first neighbor interactions), forming a sub-network that they named G1. They further analyzed this graph by extracting cores, which keep only the proteins that have at least a certain number of edges to other proteins in the same core. Using this method, they found several pieces of information suggesting that DNA-PK has an important role in the DNA repair mechanism that is affected by Tax protein.
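The idea of a core can be made concrete with a small sketch. The code below is only an illustration of k-core extraction on a toy graph, not the authors' actual pipeline; the function name `k_core` and the node names A through E are hypothetical and do not come from the paper.

```python
def k_core(adj, k):
    """Return the k-core of an undirected graph given as {node: set(neighbors)}."""
    # Work on a copy so the caller's graph is left untouched.
    core = {node: set(nbrs) for node, nbrs in adj.items()}
    changed = True
    while changed:
        changed = False
        for node in list(core):
            if len(core[node]) < k:
                # Drop the node and its edges; its neighbors may now fall below k,
                # which is why the outer loop repeats until nothing changes.
                for nbr in core[node]:
                    core[nbr].discard(node)
                del core[node]
                changed = True
    return core


# Toy network with hypothetical node names (not data from the paper).
toy = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C", "E"},
    "E": {"D"},
}
three_core = k_core(toy, 3)
print(sorted(three_core))
```

In this toy graph, E has only one edge and is pruned, while A, B, C and D each keep three edges to one another and so survive as the 3-core.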

First, they found that DNA-PKcs was a part of a highly interconnected 5-core (each protein has at least 5 edges to other proteins in the core) that consisted only of proteins involved in DNA repair mechanisms. Second, they found that six of the twelve proteins in this 5-core were detected in the LC-MS/MS analysis described above, which lends validity to the in silico method. They also note that the two regulatory subunits of DNA-PK were placed in the 5-core as well, which further supports the claim that DNA-PK plays an important role. Furthermore, they found that DNA-PKcs ranked eighth in degree (the number of edges a protein has), and in the top 30% in both betweenness (how often the protein lies on the shortest path between two other proteins in the network) and closeness (the reciprocal of the sum of the distances from a protein to all other proteins in the network).
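The three measures named above can be sketched in plain Python. This is a minimal illustration under stated assumptions: the graph is undirected and unweighted, closeness follows the definition given in this review (reciprocal of the summed distances), and betweenness is computed with Brandes' algorithm, which may differ from whatever software the authors used. All function names and the star-shaped toy graph are hypothetical.

```python
from collections import deque


def degree(adj):
    """Number of edges per node."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


def bfs_distances(adj, source):
    """Shortest-path distances (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def closeness(adj):
    """Reciprocal of the sum of distances; assumes a connected graph."""
    result = {}
    for node in adj:
        total = sum(bfs_distances(adj, node).values())
        result[node] = 1 / total if total else 0.0
    return result


def betweenness(adj):
    """Unnormalized betweenness via Brandes' algorithm (unweighted, undirected)."""
    score = {node: 0.0 for node in adj}
    for s in adj:
        dist = {s: 0}
        sigma = {node: 0 for node in adj}  # number of shortest paths from s
        sigma[s] = 1
        preds = {node: [] for node in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {node: 0.0 for node in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair of endpoints was counted once from each end.
    return {node: value / 2 for node, value in score.items()}


# Toy star graph with hypothetical node names: hub H and three leaves.
star = {"H": {"X", "Y", "Z"}, "X": {"H"}, "Y": {"H"}, "Z": {"H"}}
deg = degree(star)
clo = closeness(star)
btw = betweenness(star)
```

In the star, the hub has the highest degree, lies on every shortest path between two leaves, and is closest to all other nodes, so it ranks first on all three measures. A protein that ranks highly on several such measures at once is, in this sense, central to the network.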

Figure 1. The network (G1) formed from the first neighbors of the four chosen proteins, Rad51, TOP1, Chk2 and 53BP1. The four proteins used to produce this network are colored yellow. Each line represents a protein-protein interaction.

Figure 2

Next, they removed the four proteins, Rad51, TOP1, Chk2 and 53BP1, from G1. This step split the network into more than one sub-network, but the researchers show only the largest one, in Figure 2. The researchers determined that in this sub-network, DNA-PKcs was among the top six proteins in both degree and betweenness. The point of this figure was to show that the apparently important role of DNA-PKcs seen in Figure 1 did not depend on the presence of the four initial proteins, which were manually chosen.
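The step described above, deleting the starting proteins and keeping the largest piece that remains, can be sketched as follows. This is an illustrative toy example only; the helper names and the node names ("seed", A through D) are hypothetical and are not taken from the paper.

```python
def remove_nodes(adj, to_remove):
    """Return a copy of the graph without the given nodes or their edges."""
    to_remove = set(to_remove)
    return {
        node: nbrs - to_remove
        for node, nbrs in adj.items()
        if node not in to_remove
    }


def connected_components(adj):
    """List the connected components of an undirected graph as sets of nodes."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        component = {start}
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    component.add(nbr)
                    stack.append(nbr)
        components.append(component)
    return components


# Toy network: one hand-picked "seed" node connected to everything else.
toy = {
    "seed": {"A", "B", "C", "D"},
    "A": {"seed", "B"},
    "B": {"seed", "A", "C"},
    "C": {"seed", "B"},
    "D": {"seed"},
}
pruned = remove_nodes(toy, ["seed"])
largest = max(connected_components(pruned), key=len)
print(sorted(largest))
```

Here D was attached to the network only through the seed, so it becomes isolated once the seed is removed, while A, B and C stay connected through their own edges. Rankings computed on the largest remaining component therefore no longer owe anything to the hand-picked starting node.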

Figure 2. The largest sub-network produced after Rad51, TOP1, Chk2 and 53BP1 were removed from G1. DNA-PK (PRKDC) is colored yellow.

Figure 3

For Figure 3, the researchers restricted G1 by showing only the proteins that were known to be involved in DNA repair. In this network, DNA-PKcs was found to be fourth in degree and ninth in betweenness. Compared to the values calculated from Figure 1, the ranks of DNA-PKcs in both degree and betweenness have risen. Once again, this indicates that DNA-PKcs is heavily involved in DNA repair.
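Restricting a network to annotated proteins and re-ranking can be sketched as below. The example is a toy under loud assumptions: the nodes P1 through P5 and the set `repair_annotated` are hypothetical stand-ins for proteins and their DNA repair annotation, and the helper names are not from the paper.

```python
def induced_subgraph(adj, keep):
    """Keep only the chosen nodes and the edges that run between them."""
    keep = set(keep)
    return {node: nbrs & keep for node, nbrs in adj.items() if node in keep}


def rank_by_degree(adj):
    """Nodes sorted from highest to lowest degree (ties broken by name)."""
    return sorted(adj, key=lambda node: (-len(adj[node]), node))


# Toy network with hypothetical node names.
toy = {
    "P1": {"P2", "P3", "P4"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2", "P5"},
    "P4": {"P1"},
    "P5": {"P3"},
}
# Hypothetical annotation: pretend these nodes are the "DNA repair" proteins.
repair_annotated = {"P1", "P2", "P3", "P5"}

full_ranking = rank_by_degree(toy)
repair_ranking = rank_by_degree(induced_subgraph(toy, repair_annotated))
print(full_ranking, repair_ranking)
```

In the toy, P3 moves from second to first place once the unannotated node P4 is dropped, because all of P3's edges go to annotated nodes while P1 loses one. A rise in rank after such a restriction is the kind of signal the researchers read as evidence of involvement in DNA repair.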

Figure 3. Network produced after the proteins not associated with DNA repair were removed from G1.

Figure 4

Figure 4 was created to see what roles DNA-PKcs and other Tax-associated proteins play beyond affecting DNA repair. The researchers expanded G1 by also including second neighbor proteins, that is, proteins that bind to the first neighbors already in G1, creating the G2 network. Then, the researchers restricted the network by showing only the proteins known to be involved in DNA repair. Furthermore, they looked only at the 3-core of this sub-network, which keeps only the proteins that have 3 or more edges within it.
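Growing a network from first neighbors to second neighbors amounts to widening a search radius around the starting proteins. The sketch below illustrates this on a toy graph; the function names, the single starting node "S" and the other node names are hypothetical, and the real G1 and G2 networks were of course built from curated interaction data.

```python
def neighborhood(adj, seeds, radius):
    """Nodes within `radius` edges of any seed, with the seeds included."""
    reached = set(seeds)
    frontier = set(seeds)
    for _ in range(radius):
        # Step one edge outward, ignoring nodes that were already reached.
        frontier = {nbr for node in frontier for nbr in adj[node]} - reached
        reached |= frontier
    return reached


def induced_subgraph(adj, keep):
    """Keep only the chosen nodes and the edges that run between them."""
    keep = set(keep)
    return {node: nbrs & keep for node, nbrs in adj.items() if node in keep}


# Toy interactome with hypothetical node names; "S" plays the starting protein.
interactome = {
    "S": {"A", "D"},
    "A": {"S", "B"},
    "B": {"A", "C"},
    "C": {"B"},
    "D": {"S"},
}
g1_nodes = neighborhood(interactome, {"S"}, 1)  # first neighbors
g2_nodes = neighborhood(interactome, {"S"}, 2)  # first and second neighbors
g2 = induced_subgraph(interactome, g2_nodes)
print(sorted(g1_nodes), sorted(g2_nodes))
```

In the toy, B enters only at radius 2 because it binds A rather than S, and C stays out because it is three edges away from S.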

Within this 3-core, the researchers identified five clusters of proteins. Roughly speaking, the clusters are sets of proteins that share an unusually high number of edges among themselves. Between the five clusters, the researchers found three 'bridge' proteins: TP53, PCNA and DNA-PK. Figure 4 is a graph of the connections between the five clusters and the three 'bridge' proteins. The number on each line is the number of edges connecting that specific 'bridge' protein to that particular cluster.
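The numbers on the lines of such a figure are simple counts, which the following sketch illustrates. It assumes the clusters have already been identified by some other means; the function name, the node named "bridge" and the two toy clusters are hypothetical and are not the clusters from the paper.

```python
def bridge_edge_counts(adj, clusters, bridge):
    """Count the edges from one bridge node into each named cluster."""
    return {
        name: len(adj[bridge] & members)
        for name, members in clusters.items()
    }


# Toy network with hypothetical node names and two pre-assigned clusters.
toy = {
    "bridge": {"a1", "a2", "b1"},
    "a1": {"bridge", "a2"},
    "a2": {"bridge", "a1"},
    "b1": {"bridge", "b2"},
    "b2": {"b1"},
}
clusters = {"cluster_a": {"a1", "a2"}, "cluster_b": {"b1", "b2"}}

counts = bridge_edge_counts(toy, clusters, "bridge")
print(counts)
```

A node with edges into more than one cluster, like "bridge" here, is what connects otherwise separate groups of proteins.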

Four members of cluster 1 allow cells to overcome DNA damage, while genes in cluster 2 mediate cell cycle arrest when the cell is stressed. This finding is intriguing, as it suggests that beyond affecting DNA repair mechanisms, Tax protein interactions may affect some stress response pathways that could hinder HTLV-1 infection.

Figure 4. The connections among five clusters joined by 'bridge' proteins in the G2 network. The G2 network includes all proteins from the G1 network, in addition to second neighbors.

Commentary

The dual-approach method presented here appears to be effective, because each approach can be used to validate the data from the other. As stated in the paper, researchers studying protein-protein interactions have to deal with false positives, and this dual-approach method seems to offer a way to minimize them. In this research, the proteins that showed the best binding in the LC-MS/MS analysis were also the proteins that ranked the highest in the in silico analysis. The researchers used the data from the physical method as a proof of concept for the in silico method. Future researchers could conversely use the in silico method to validate their physical methods.

Also, the way that the researchers analyzed their data offers a model to other researchers as to how to handle large volumes of data. In this paper, the researchers produced small sub-networks from a large interactome by restricting it with various factors such as the number of edges each member of the interactome must have or by screening for proteins with specific functions.

Finally, this paper demonstrates that in silico analysis has the potential to provide target proteins for physical testing. From their in silico interactome analysis, the researchers were able to show that DNA-PK very likely has an important role in the Tax-mediated deficiency in DNA repair. They also showed that certain proteins that bind to Tax protein may bind to other proteins that may be involved in mediating cell stress responses. Conversely, they were also able to say that ATM and ATR, both of which are proteins that are highly involved in DNA repair, are not involved in the Tax-mediated deficiency in DNA repair.

As can be seen, in silico methods can provide a large amount of useful information. However, a map of protein-protein interactions says little about exactly how the proteins work. The real strength of in silico approaches is therefore their potential to guide researchers toward the right targets, so that they can conduct efficient physical research.