Mycobacterium tuberculosis (Mtb) survival during infection requires rapid integration of diverse signals to adapt to the host. In many organisms, molecular scaffolds organize signaling pathways and integrate signals, for example the KSR scaffold in MAPK signaling. Scaffolds can increase the local concentration of signaling components, insulate pathways from crosstalk, and provide a combinatorial “switchboard” function. A common theme in scaffold architecture is modularity. Rv3651 is essential for survival of Mtb in vivo, but not in vitro (Sassetti & Rubin, 2003; Sassetti et al., 2003), indicating a specific role in pathogenesis. The crystal structure of Rv3651 solved by the SSGCID (PDB ID 4Q6U) was unexpected: structural comparison with other PDB entries revealed that Rv3651 consists of two Per-ARNT-Sim (PAS) domains and one GAF domain that were not predicted by sequence analysis. Rv3651 forms a symmetrical dimer and thus presents four PAS and two GAF domains to the solvent. Both types of domains are commonly found in signaling proteins, particularly in prokaryotes, where they often function either as a sensor domain that directly recognizes ligands or as a protein-protein interaction domain (Heikaus et al., 2009; Henry & Crosson, 2011). Many GAF domains bind the second messengers cAMP and cGMP. With 15 adenylyl cyclases encoded in its genome, Mtb maintains a complex cAMP system; cAMP profoundly affects Mtb gene regulation and is required for virulence, and secretion of cAMP into the host disrupts the host's immune response to infection (Bai et al., 2011). These data suggest a plausible explanation for the in vivo essentiality of Rv3651. The N-terminal GAF domain of Rv3651 is structurally most similar to a sensor domain associated with a Vibrio cholerae histidine kinase. Domain 2 is structurally similar to the human transcription factor HIF2, and the C-terminal domain resembles another transcriptional regulator, PpsR, suggesting different functions for the three pairs of domains. The modular arrangement of six sensor and interaction domains, seemingly without any associated enzymatic activity, is highly unusual and suggests that Rv3651 has a molecular scaffold function. We hypothesize that these multiple domains serve as a scaffold that integrates signals essential during Mtb infection. To test this hypothesis and identify novel components required for Mtb signaling and survival during infection, we propose to characterize the protein interaction partners and small-molecule ligands of Rv3651, and to define the transcriptional networks affected by Rv3651.

Specific Aim 1: Identify protein binding partners of Rv3651.

Work to be performed in the Grundner lab at SCRI, carried out during Q1 and Q2.

PAS domains are often protein-protein interaction modules. To test for a role of Rv3651 in protein-protein interactions and to identify up- and downstream components of potential Rv3651 signaling, we will utilize immunoprecipitation to identify Rv3651 binding partners. We will overexpress FLAG-tagged Rv3651 in Mtb and pull down binding partners. We will identify specific interactors by shotgun proteomics using mass spectrometry. We will map the binding of these interactors to individual PAS and GAF domains by using single domain constructs for pull-downs.
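As an illustration of how specific interactors might be separated from background in such a pull-down, the sketch below compares spectral counts from the FLAG-Rv3651 sample against an untagged control. This is a minimal sketch under stated assumptions, not the analysis pipeline used in this project: the function name, the fold-change and count thresholds, the pseudocount, and the protein labels are all hypothetical.

```python
def specific_interactors(bait_counts, control_counts,
                         min_fold=5.0, min_count=3, pseudocount=1.0):
    """Return proteins enriched in the bait pull-down over the control.

    bait_counts / control_counts map protein IDs to spectral counts.
    Thresholds and pseudocount are illustrative assumptions only.
    """
    hits = {}
    for protein, bait in bait_counts.items():
        ctrl = control_counts.get(protein, 0)
        # The pseudocount avoids division by zero for proteins absent from the control.
        fold = (bait + pseudocount) / (ctrl + pseudocount)
        if bait >= min_count and fold >= min_fold:
            hits[protein] = fold
    # Sort by descending fold enrichment.
    return dict(sorted(hits.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical counts: P1 is enriched, P2 is background, P3 has too few spectra.
example_bait = {"P1": 20, "P2": 10, "P3": 2}
example_control = {"P1": 1, "P2": 9}
example_hits = specific_interactors(example_bait, example_control)
```

The same filter could be applied to pull-downs with single-domain constructs to assign each candidate to a PAS or GAF domain.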

Specific Aim 2: Define the regulatory footprint of Rv3651.

Work to be performed in the Grundner/Sherman labs at CISCRIR, carried out during Q3.

To determine the cellular pathways affected by Rv3651, we will determine the transcriptional response of Rv3651 overexpression and knockout by microarray analysis. We will group genes into pathways to identify likely targets of Rv3651 regulation.
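One common way to group differentially expressed genes into pathways is an over-representation test based on the hypergeometric distribution. The sketch below is illustrative only and assumes nothing about the pathway annotation actually used; the pathway names, gene identifiers, and genome size are hypothetical placeholders, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_enrichment_p(n_genome, n_pathway, n_changed, n_overlap):
    """P(X >= n_overlap) when n_changed genes are drawn without replacement
    from a genome of n_genome genes, n_pathway of which belong to the pathway."""
    upper = min(n_pathway, n_changed)
    total = comb(n_genome, n_changed)
    tail = sum(
        comb(n_pathway, k) * comb(n_genome - n_pathway, n_changed - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def rank_pathways(pathways, changed_genes, n_genome):
    """Rank pathways (name -> set of gene IDs) by enrichment p-value."""
    results = []
    for name, members in pathways.items():
        overlap = len(members & changed_genes)
        p = hypergeom_enrichment_p(n_genome, len(members),
                                   len(changed_genes), overlap)
        results.append((name, overlap, p))
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy data: all three changed genes fall in pathway "A".
toy_pathways = {"A": {"g1", "g2", "g3"}, "B": {"g4", "g5", "g6"}}
toy_ranked = rank_pathways(toy_pathways, {"g1", "g2", "g3"}, n_genome=20)
```

In practice the resulting p-values would need correction for the number of pathways tested before any pathway is treated as a likely target.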

Specific Aim 3: Identify Rv3651 ligands.

Work to be performed in the Grundner lab, carried out during Q4.

In a recent chemical biology screen (Ansong et al., 2013), we identified adenosine nucleotide-binding proteins in Mtb, including >80 hypothetical proteins, one of which was Rv3651. GAF domains are cAMP or cGMP binding domains. ATP binding of Rv3651 in our screen further suggests nucleotide binding activity of Rv3651, in particular cAMP binding. We will use isothermal titration calorimetry to determine cyclic nucleotide binding of Rv3651. Informed by our transcriptional data, we will test other small molecules that are known to bind PAS domains for binding to Rv3651, for example FAD, FMN, and several metabolites.
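For planning titrations, it can help to estimate how much protein would be ligand-bound at given concentrations. The sketch below uses the standard exact solution for 1:1 binding and assumes a single independent site; the function name and the example concentrations are hypothetical. It models saturation only and does not model the heat signal that isothermal titration calorimetry actually measures.

```python
from math import sqrt


def fraction_bound(protein_total, ligand_total, kd):
    """Fraction of protein in complex for simple 1:1 binding.

    All three arguments must use the same concentration unit.
    Solves [PL]^2 - (P + L + Kd)[PL] + P*L = 0 for the physical root.
    """
    if protein_total <= 0 or kd <= 0 or ligand_total < 0:
        raise ValueError("concentrations must be positive (ligand may be zero)")
    s = protein_total + ligand_total + kd
    complex_conc = (s - sqrt(s * s - 4.0 * protein_total * ligand_total)) / 2.0
    return complex_conc / protein_total


# Hypothetical example: equal protein, ligand, and Kd (arbitrary units).
example_fraction = fraction_bound(1.0, 1.0, 1.0)
```

If a weak dissociation constant is assumed, this kind of estimate shows quickly whether the achievable ligand concentrations would give measurable saturation.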

Together, these studies will probe the function of an unusual hypothetical Mtb protein, Rv3651. We hypothesize that Rv3651 functions as a signaling scaffold that coordinates the response of Mtb to infection. By defining protein interactors, transcriptional effects, and potential ligands of Rv3651, we will discover specific functions and general pathways that Rv3651 may coordinate to support infection.

* Not applicable; milestone not relevant in light of new enzymatic Rv3651 function

Summary: The primary goal of the functional study is to characterize the protein interaction partners of Rv3651 (MytuD.18669.a), a proposed molecular scaffold protein, as well as small-molecule ligands that bind Rv3651. At the completion of the award year, we achieved milestones M1, M2, M3, M5 and half of M6. M4 and the other half of M6 became irrelevant when the new enzymatic function of Rv3651 was determined. M8 (manuscript preparation) is in progress.

The discovery of function for proteins of unknown function is a major challenge in the post-genomic era. Identifying the biochemical activity of a protein can be exceedingly difficult, particularly for highly divergent or species-specific proteins. We have previously established activity-based protein profiling (ABPP) to assign biochemical function to hypothetical M. tuberculosis (Mtb) proteins in parallel. However, full characterization of a protein’s function cannot be achieved by ABPP alone, and the combination of structural studies and ABPP has in the past been highly successful. In this project, we combined ABPP data predicting biochemical activity as an ATPase with protein structure to identify the function of Rv3651, a hypothetical protein that is required for Mtb infection. The primary hypothesis of this project was that the hypothetical protein Rv3651 (MytuD.18669.a) is a molecular scaffold. Preliminary data from our activity-based proteomics screen, in combination with the unusual, modular domain organization revealed by the crystal structure generated by the SSGCID, suggested this possibility. Thus, the initial goal of this study was to test the scaffold hypothesis by identifying protein and small-molecule interaction partners of Rv3651, as well as determining the transcriptional footprint of Rv3651. While we achieved the initial milestones in the first half of the year, the data did not support the scaffold hypothesis, requiring a complete rethinking of the project. However, after we opened an additional line of experimentation that tested the function of Rv3651 more broadly, the project moved in a new and unexpected direction. We now have biochemical, metabolomic, and genetic evidence showing that Rv3651 has gluconate kinase activity.

Specific Aim 1: Identify protein binding partners of Rv3651.

Completed. The Grundner lab successfully generated the Rv3651 over-expression strain early in the first quarter of this project, thus completing milestone M1. The Grundner lab optimized pulldown conditions and identified several protein interactor candidates by mass spectrometry. A large-scale pulldown was analyzed by MS to increase depth and confirm initial hits (M3). Protein interactors were enriched for several transcription-related proteins, such as multiple subunits of RNA polymerase and DNA gyrase (M4). However, in light of new data from our lab (see Aim 2) and from our collaborator Kyu Rhee that Rv3651c is a gluconate kinase, these experiments are not being pursued further.

Specific Aim 2: Define the regulatory footprint of Rv3651.

Completed. As mentioned above, milestone M1 is complete. The Rv3651c knockout is now confirmed by DNA sequencing (M2). The RNA-seq analysis of the Rv3651 overexpression strain has been completed (M6). The changes in transcription upon overexpression were small, and analysis is now underway to characterize the changes in the context of known Mtb pathways and the initial pulldown results, to find potential overlap hinting at function. The next step is analysis of the Rv3651 knockout strain (also M6), which might produce stronger gene expression changes. However, in light of new data that Rv3651c is a gluconate kinase, transcriptional effects of Rv3651c appear less relevant and studies will instead focus on the gluconate kinase function.

Specific Aim 3: Identify Rv3651 ligands.

Completed. Using isothermal titration calorimetry, we could not detect binding of likely small-molecule candidates, including ATP, GTP, dATP, other nucleotides, and other small molecules (M5).

Other.

While the initial hypothesis for Rv3651 function as a molecular scaffold has not been borne out, we were successful in determining another, entirely unexpected function of Rv3651. Our data now strongly suggest that the unusual architecture of the three PAS domains, rather than serving as a ligand binding scaffold, might have evolved towards gluconate kinase activity. Gratifyingly, this function is also consistent with our initial data from the ABPP screen that showed ATPase activity of Rv3651. To conclusively prove gluconate kinase activity, we are now generating a co-crystal structure with ligands, and will confirm genetic and metabolomic data with additional mutants. In addition to the initial milestones, we:

Together, these data will conclusively test gluconate kinase function of Rv3651. The repurposing of the PAS domain for catalysis is almost unprecedented. We now have biochemical, metabolomic, and preliminary genetic data consistent with this activity. We think that this unusual protein is a unique and extreme example of convergent evolution, or repurposing, of a ligand binding protein scaffold for catalysis. This project will continue to be a rich source of insight into protein evolution and structure-function relationships, and will provide starting points for understanding gluconate metabolism and its role in Mtb pathogenesis.