Welcome to the MTB Network Portal help pages.

If you still have questions, please send us an email at sturkarslan@systemsbiology.org.
MTB Network Portal team

Genes Table

The Genes page displays a table of Mycobacterium tuberculosis H37Rv genes. Genome annotations are mainly derived from Tuberculist. Each locus name links to the individual gene page, which includes more detailed information. Various filters can be used to pinpoint a gene of interest.

Bound By: This column lists the number of TFs that were found to bind upstream of the given gene in ChIP-Seq experiments. Reference: Minch KJ, Rustad TR, Peterson EJR, et al. "The DNA-binding network of Mycobacterium tuberculosis." Nat Commun. 2015;6:5829

Binds To: If the gene is a TF, this column lists the number of target genes that it binds to, as identified in ChIP-Seq experiments. Reference: Minch KJ, Rustad TR, Peterson EJR, et al. "The DNA-binding network of Mycobacterium tuberculosis." Nat Commun. 2015;6:5829

Filters:

The Genes table can be further filtered by a number of criteria. Multiple filters can be combined to create more specific queries, and the sort order can be changed based on different parameters. Select the filters you want to apply, select a sorting method, and press "Apply".

Modules Table

Regulatory units (modules) are identified using the cMonkey algorithm. More specifically, a module or bicluster refers to a set of genes that are conditionally co-regulated under a subset of the conditions. Identification of modules integrates co-expression, de novo motif identification, and other functional associations such as operon information and protein-protein interactions.

Columns:

Title: Name of the regulatory module/bicluster identified by cMonkey, linked to the module details page.

Residual: A measure of bicluster quality. The mean bicluster residual is smaller when the expression profiles of the genes in the module are "tighter", so smaller residuals usually indicate better bicluster quality.
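The idea of a "tighter" module can be illustrated with a small sketch. It assumes a simple additive row-plus-column model and reports the mean absolute deviation from it. This is an illustration only: the exact residual formula used by cMonkey is not given on this page, and the function name and example matrices are hypothetical.

```python
def mean_residual(matrix):
    """Mean absolute residual of a genes x conditions expression submatrix.

    Assumption: an additive model (row mean + column mean - overall mean).
    This is an illustrative stand-in, not necessarily cMonkey's formula.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    row_means = [sum(row) / n_cols for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            # Deviation of each value from what the additive model predicts.
            total += abs(matrix[i][j] - row_means[i] - col_means[j] + overall)
    return total / (n_rows * n_cols)


# Hypothetical modules: rows are genes, columns are conditions.
tight = [[1.0, 2.0, 3.0], [1.5, 2.5, 3.5], [0.5, 1.5, 2.5]]   # parallel profiles
loose = [[1.0, 2.0, 3.0], [3.0, 0.5, 2.0], [0.0, 4.0, -1.0]]  # unrelated profiles

print(mean_residual(tight), mean_residual(loose))
```

Genes whose profiles move in parallel give a residual near zero, while unrelated profiles give a larger value.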

Score: Bicluster score.

Motif 1 and 2: cMonkey integrates powerful de novo motif detection to identify conditionally co-regulated sets of genes. For each module, two de novo predicted motifs are listed on the module page as motif logo images. Clicking an image shows a larger motif logo.

Motif 1 and 2 e-values: The motif e-value is an indicator of motif co-occurrence among the members of the module. Smaller e-values indicate more significant sequence motifs. In our experience, e-values smaller than 10 generally indicate significant motifs.

Filters: You can filter the rows in the table by residual and motif e-values using the sliders.
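For readers working with an exported copy of the table, the slider filters can be approximated with a short sketch. The field names (`residual`, `motif1_evalue`, `motif2_evalue`), the example rows, and the rule that both motif e-values must pass the cutoff are all assumptions made for illustration. They do not describe the portal's internal implementation.

```python
def filter_modules(modules, max_residual, max_evalue):
    """Keep modules whose residual and both motif e-values are under the cutoffs.

    Assumption: a module passes only if BOTH motif e-values are <= max_evalue.
    """
    kept = []
    for module in modules:
        if module["residual"] > max_residual:
            continue
        if max(module["motif1_evalue"], module["motif2_evalue"]) > max_evalue:
            continue
        kept.append(module)
    return kept


# Hypothetical rows, not real portal data.
modules = [
    {"title": "module_A", "residual": 0.35, "motif1_evalue": 0.002, "motif2_evalue": 4.0},
    {"title": "module_B", "residual": 0.80, "motif1_evalue": 0.5, "motif2_evalue": 1.0},
    {"title": "module_C", "residual": 0.40, "motif1_evalue": 120.0, "motif2_evalue": 3.0},
]

selected = filter_modules(modules, max_residual=0.5, max_evalue=10.0)
print([m["title"] for m in selected])
```

The e-value cutoff of 10 follows the rule of thumb stated above. The residual cutoff of 0.5 is arbitrary.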

Regulators Table

The Inferelator is an algorithm for inferring predictive regulatory networks from gene expression data. It does so by selecting the regulators (transcription factors or environmental factors) whose levels are most predictive of each gene's or bicluster's expression.

Columns:

Regulator Locus: Locus tag of the TF and its product, linked to the gene page.

Target Bicluster: Bicluster predicted to be regulated by the given TF. The link takes you to the module page.

Weight: Weight of the regulatory influence. It can be positive (up-regulates) or negative (down-regulates). A higher absolute value of the weight is associated with higher confidence in the influence prediction.

Regulator Taxonomy: Lists all other modules that are predicted to be regulated by the same TF.

Filters: The Regulators table can be further filtered by searching for a specific regulator, interaction, or weight.
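The sign and magnitude of a weight can be read as in the following sketch. The regulator and bicluster names and the weights are hypothetical placeholders, and ranking by absolute weight is only an illustration of the confidence rule described above.

```python
def describe_influence(weight):
    """Translate the sign of an influence weight into a direction label."""
    if weight > 0:
        return "up-regulates"
    if weight < 0:
        return "down-regulates"
    return "no influence"


def rank_by_confidence(influences):
    """Sort influences so that the largest absolute weights come first."""
    return sorted(influences, key=lambda inf: abs(inf["weight"]), reverse=True)


# Hypothetical rows, not real portal data.
influences = [
    {"regulator": "TF_A", "bicluster": "module_1", "weight": 0.12},
    {"regulator": "TF_B", "bicluster": "module_1", "weight": -0.45},
    {"regulator": "TF_C", "bicluster": "module_2", "weight": 0.30},
]

ranked = rank_by_confidence(influences)
for inf in ranked:
    print(inf["regulator"], describe_influence(inf["weight"]), inf["bicluster"])
```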

Motifs Table

Transcription factor binding motifs help to elucidate regulatory mechanisms. cMonkey integrates powerful de novo motif detection to identify conditionally co-regulated sets of genes. De novo predicted motifs for each module are listed on the module page as motif logo images along with associated prediction statistics (e-values).

Columns:

Title: Name of the regulatory motif identified by cMonkey, linked to the motif's details page.

Logo: A graphical representation of the sequence conservation of DNA.

e-value: The motif e-value is an indicator of motif co-occurrence among the members of the module. Smaller e-values indicate more significant sequence motifs. In our experience, e-values smaller than 10 generally indicate significant motifs.

Number of sites: Number of genes in a given module that include the identified motif in their upstream region.

Length: Number of nucleotide bases in the motif logo

Motif Bicluster: Regulatory module/bicluster that includes the given motif.

Icon: Clicking the icon displays detailed motif information in an overlay window.

ChIP-Seq Data

This page provides a gateway to all available ChIP-Seq data and associated analysis results. The ChIP-Seq data were described in Minch et al. 2015, Nature Communications. The data are summarized in three tables with appropriate links to gene/TF pages and ChIP-Seq Profiles.

ChIP-Seq Profiles

The ChIP-Seq Profiles table enables quick display of ChIP-Seq tracks in the UCSC Genome Browser. Results are paginated for easy browsing, with a link to all the results at the lower left corner.

ChIP-Seq Data Files

The ChIP-Seq Data Files table lists summary information and download links for each ChIP-Seq data file. As with the ChIP-Seq Profiles table, results are paginated for easy browsing, with a link to all the results at the lower left corner.

TFOE Data

To investigate the MTB transcriptional landscape in a systematic manner, we developed a high-throughput approach to identify the genes controlled by nearly all predicted MTB TFs. We individually cloned and conditionally overexpressed 206 MTB TFs to induce the regulatory signature of each one. Using this approach we identified the sets of genes affected by TF overexpression (TFOE) and assembled them into an easily searchable map of transcriptional regulation in MTB.

Accessing large datasets like the TFOE expression data can be difficult when the data spreads over thousands of genes and hundreds of regulators. To address the difficulties usually associated with accessing large data sets, we have designed a simple Excel spreadsheet for querying TFOE data to find regulators of specific genes or sets of genes.
(Rustad et al. Genome Biol. 2014)

Gene Details Page

The Gene Details page provides gene-specific information from various resources in different panels. The sidebar on the left displays quick links for easy access to other resources. The Structure and Domains block includes UniProt entry links, PDB and PM Portal entries if available, and subcellular localization information. The SSGCID block displays the entry for this gene in the Seattle Structural Genomics Center for Infectious Disease. If no entry is available, the user can submit a request through their website. DNA and amino acid sequences are also provided in this block.

Summary

The Summary panel shows basic genome annotation information for the gene, such as locus tag, symbol, protein product description, and genomic coordinates. If the gene is also a TF, this is indicated in the last column.

Overview

The Overview panel summarizes regulatory information predicted by the cMonkey algorithm by indicating the regulatory modules that contain this gene, along with associated motif and term enrichment information. Essentiality for in vitro growth on cholesterol is also indicated here.

Binds To (ChIP-Seq)

If the gene is a TF included in ChIP-Seq experiments, this panel displays the list of genes whose coordinates were found to be close to ChIP-Seq binding peak coordinates. In addition to the distance from the binding coordinate and the genomic feature type of the target gene, differential expression and associated significance scores identified in TFOE experiments are also shown here.

Bound By (ChIP-Seq)

TFs that were found to have a binding peak close to the genomic coordinates of this gene in ChIP-Seq experiments are listed here, together with differential expression and binding peak properties.

Regulatory Modules (cMonkey Network)

This panel displays regulatory modules identified by the cMonkey algorithm. For each module, the module name is linked to detailed information, and two de novo identified motifs are listed together with their motif logos and a link to the details page. The module residual, which is a measure of bicluster quality, is in the last column. The mean bicluster residual is smaller when the expression profiles of the genes in the module are "tighter", so smaller residuals usually indicate better bicluster quality.

Regulated by (Inferelator Network)

This panel displays regulatory influences of TFs as identified by the Inferelator algorithm. For each influence, the name of the regulator, the direction of the influence (up-regulates vs. down-regulates), and the influence weight are shown. Higher absolute values of the influence weight correspond to higher confidence in the given influence. The Other influences column provides a link to other predicted targets of the given regulator.

Regulates (Inferelator Network)

If the gene is a TF included in the Inferelator network predictions, its predicted target regulatory modules are displayed in this panel.

Tuberculist

Quick access to the Tuberculist entry for the given gene is provided. Locus info directs to the Tuberculist page, while Quickview opens the corresponding page in an overlay window to provide a glance at the Tuberculist page. Genome view displays genomic neighborhood information from Tuberculist.

KEGG Pathways

If the gene has KEGG Pathway information, this panel shows a link to the specific pathway, a description of the KEGG pathway, and the number of other genes within the same pathway along with links to them.

BioCyc Pathways

BioCyc is a collection of 5500 Pathway/Genome Databases (PGDBs), plus software tools for understanding their data. If the gene is included in BioCyc, a link to the details page is shown. Cellular Overview Map displays the placement of the gene in the Cellular Overview Map in an overlay window.

STRING Network

STRING is a database of known and predicted protein interactions. The interactions include direct (physical) and indirect (functional) associations.
This panel links the gene to its associated STRING entry, and the STRING Network column displays the network of functional associations for the gene from STRING in an overlay window.

RefSeq

RefSeq is a comprehensive, integrated, non-redundant, well-annotated set of reference sequences including genomic, transcript, and protein. The RefSeq-associated GI Number and Protein ID are displayed and linked to the respective RefSeq pages. The BLAST column enables submitting this gene sequence directly as a query for BLAST analysis. The Conserved Domains column enables a similar analysis against the Conserved Domains database.

GO Terms

Gene Ontology terms as provided by the genome annotation are displayed. For each term, the Term ID and Description are provided. The zoom icon displays more information about the GO terms.

Gene Expression

The Gene Expression panel links out to the expression profile page for the given gene in the TBDB database.

SSGCID Details

SSGCID’s (Seattle Structural Genomics Center For Infectious Disease) primary mission is to determine the structure of ca. 70 protein targets from NIAID Category A-C agents, as well as emerging and re-emerging infectious disease organisms, each year for a period of five years. If structural information for the given gene is available, it will be displayed with summary information.

ChIP-Seq Profiles

If the gene is a TF, ChIP-Seq profile information can be visualized as UCSC Genome Browser tracks in an overlay window. These binding profiles were collected from ChIP-Seq binding experiments performed by Minch et al. For more details, see Minch et al. 2015, Nature Communications.

Cholesterol Essentiality

Indicates whether the gene was found to be essential for in vitro growth on cholesterol. Whether the gene is essential or non-essential is displayed along with the t-test p-value and the Cholesterol/Glycerol ratio, as described below.
The relative representation of each mutant was determined by calculating the fold change (sequence reads/insertion in cholesterol divided by sequence reads/insertion in glycerol) for each gene. Statistical significance was determined by t-test. Each insertion site in each replicate sample was treated as a separate data point. The hyperbola used for defining genes specifically required for growth in cholesterol was defined by the formula, y = 3.8/x+0.7. Genes above this line are annotated as required for growth on cholesterol.
- Griffin JE, Gawronski JD, Dejesus MA, Ioerger TR, Akerley BJ, Sassetti CM, High-resolution phenotypic profiling defines genes essential for mycobacterial growth and cholesterol catabolism. PLoS Pathog (2011) 7(9).
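The fold-change calculation and the hyperbola cutoff described above can be sketched as follows. This sketch makes several assumptions that are not stated on this page: the formula is read as y = (3.8 / x) + 0.7, the quantities plotted on the x and y axes are left to the caller because they are not specified here, and all function names and input numbers are hypothetical. See Griffin et al. 2011 for the authoritative definition.

```python
def fold_change(chol_reads, chol_insertions, gly_reads, gly_insertions):
    """Reads per insertion in cholesterol divided by reads per insertion in glycerol."""
    return (chol_reads / chol_insertions) / (gly_reads / gly_insertions)


def hyperbola(x):
    """Cutoff curve, read here as y = (3.8 / x) + 0.7 (assumed operator grouping)."""
    return 3.8 / x + 0.7


def is_above_hyperbola(x, y):
    """True when the point (x, y) lies above the cutoff curve (x must be positive).

    The meaning of x and y is an assumption left to the caller; the help text
    does not say which quantities are plotted on each axis.
    """
    return x > 0 and y > hyperbola(x)


# Hypothetical counts for a single gene.
ratio = fold_change(chol_reads=50, chol_insertions=10,
                    gly_reads=200, gly_insertions=10)
print(ratio)
print(is_above_hyperbola(2.0, 3.0), is_above_hyperbola(2.0, 2.0))
```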

Module Detail Page

A regulatory module (bicluster) refers to a set of genes that are conditionally co-regulated under a subset of the conditions. Identification of modules integrates co-expression, de novo motif identification, and other functional associations such as operon information and protein-protein interactions. Modules are identified by the cMonkey algorithm, and regulatory influences on these modules are predicted by the Inferelator.

Expression Profiles:

The expression profiles plot shows the expression ratios (log10) of the module's genes over the subset of conditions included in the module. The X-axis represents conditions and the Y-axis represents log10 expression ratios. Each gene is plotted as a line in a different color. A color legend for the lines is presented under the plot.

Motifs:

De novo predicted motifs for each module are listed on the module page as motif logo images along with associated prediction statistics (e-values). The main module page also shows the location of these motifs within the upstream sequences of the module member genes.

Motif Locations:

The locations of the identified motifs in the upstream regions of the module's member genes are shown under the expression profiles plot. This diagram shows the upstream positions of the motifs, colored red and green for motifs 1 and 2, respectively. The intensity of the color is proportional to the significance of the occurrence of that motif at a given location. Motifs on the forward and reverse strands are drawn above and below the line, respectively.

Functions:

Biological networks contain sets of regulatory units called functional modules that together play a role in regulation of specific functional processes. Connections between different modules in the network can help identify regulatory relationships such as hierarchy and epistasis. In addition, associating functions with modules enables putative assignment of functions to hypothetical genes. It is therefore essential to identify functional enrichment of modules within the regulatory network.
We use hypergeometric p-values to identify significant overlaps between co-regulated module members and genes assigned to a particular Gene Ontology category. P-values are corrected for multiple comparisons by using Benjamini-Hochberg correction and filtered for p-values ≤ 0.05.
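The enrichment test described above can be sketched with the Python standard library. This is a minimal illustration of a one-sided hypergeometric test followed by Benjamini-Hochberg adjustment, not the portal's actual pipeline. The function names, the genome size of 4000, and the overlap counts are hypothetical.

```python
from math import comb


def hypergeom_pvalue(overlap, module_size, category_size, genome_size):
    """P(X >= overlap) for a module of module_size genes drawn from genome_size
    genes, of which category_size belong to the GO category."""
    total = comb(genome_size, module_size)
    upper = min(module_size, category_size)
    tail = sum(
        comb(category_size, k) * comb(genome_size - category_size, module_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical module of 20 genes tested against three GO categories,
# given as (overlap, category_size) pairs, in a genome of 4000 genes.
tests = [(6, 50), (2, 100), (1, 300)]
raw = [hypergeom_pvalue(k, 20, size, 4000) for k, size in tests]
adj = benjamini_hochberg(raw)
significant = [t for t, p in zip(tests, adj) if p <= 0.05]
print(raw, adj, significant)
```

In practice a statistics library would normally be used for both steps. The standard-library version is shown only to keep the arithmetic visible.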

UniProt Table

The UniProt page displays a table of Mycobacterium tuberculosis H37Rv UniProt entries. These entries are collected from UniProt proteome:up000001584. Each title links to the individual entry page, which includes more detailed information.

Functions Table

The Functions page displays functional annotations for Mycobacterium tuberculosis H37Rv from different resources. Currently, functional annotations from GO, KEGG, UniPathways, and InterPro are included. Each function links to the gene entries listed under it.