
Identifying “known unknowns” via suspect and non-target screening of environmental samples with the in silico fragmenter MetFrag (http://msbi.ipb-halle.de/MetFragBeta/) typically relies on the large compound databases ChemSpider and PubChem (see e.g. Ruttkies et al. 2016). The size of these databases (over 50 and 90 million structures, respectively) yields many false positive hits: structures that were never produced in sufficient amounts to be realistically found in the environment (e.g. McEachran et al. 2016). One motivation behind the US EPA’s CompTox Chemistry Dashboard is to provide access to compounds of environmental relevance (currently approx. 760,000 chemicals). While the web services needed to incorporate the Dashboard in MetFrag as a database like ChemSpider and PubChem are not yet available, a number of features in MetFragBeta enable users to use the CompTox Chemistry Dashboard to perform “known unknown” identification with MetFrag. This post highlights the Suspect Screening Functionality.

First we have our (charged) mass. Take m/z = 256.0153. This was measured in positive mode and we assume (correctly) that it’s [M+H]+. Make sure you set this correctly in MetFrag.
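To make the adduct arithmetic concrete, here is a minimal Python sketch (not part of MetFrag) that converts the measured m/z into a neutral monoisotopic mass for [M+H]+ and computes a 5 ppm search window like the one used when retrieving candidates. The function names are made up for illustration, and applying the ppm margin to the neutral mass is an assumption; the post does not say how MetFrag applies it internally. For m/z = 256.0153 the neutral mass comes out at roughly 255.0080 Da.

```python
# Mass added to a neutral molecule M when it forms [M+H]+ (one proton), in Da.
PROTON_MASS = 1.007276


def neutral_mass(mz, adduct_shift=PROTON_MASS):
    """Neutral monoisotopic mass for a singly charged [M+H]+ ion."""
    return mz - adduct_shift


def ppm_window(mass, ppm=5.0):
    """Lower and upper mass bounds for a relative error margin given in ppm."""
    delta = mass * ppm * 1e-6
    return mass - delta, mass + delta


mz = 256.0153                 # measured in positive mode, assumed [M+H]+
m = neutral_mass(mz)
low, high = ppm_window(m)     # assumption: margin applied to the neutral mass

print(f"neutral mass: {m:.4f} Da")
print(f"5 ppm window: {low:.4f} to {high:.4f} Da")
```

Setting the wrong adduct shifts the neutral mass by far more than 5 ppm, which is why the adduct setting matters before any candidates are retrieved.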

Then retrieve your candidates, e.g. using ChemSpider or PubChem and a 5 ppm error margin:

You could now process the candidates … but we have not done anything with the Dashboard yet! The Dashboard options are hidden in the middle, in the “Candidate Filter & Score Settings” tab:

You can use the Candidate Filter to process ONLY candidates that are in the CompTox Chemistry Dashboard, excluding all other candidates, by clicking on “Suspect Inclusion Lists” and selecting the “DSSTox” box (see screenshot), which retains (currently) 11 of the 156 ChemSpider candidates:

Once processing has finished, the plot in the “Statistics” tab should look something like this, depending on what additional scores you selected:

It is also possible to use one (or more!) suspect lists to SCORE the different candidates without excluding any matches from ChemSpider or PubChem, by selecting the same box under the “MetFrag Scoring Terms” part instead (see screenshot). Additional lists like the Swiss Pharma list shown below can be downloaded from the NORMAN Suspect Exchange (http://www.norman-network.com/?q=node/236) and also viewed under the lists tab in the CompTox Chemistry Dashboard (https://comptox.epa.gov/dashboard/chemical_lists). For the upload, MetFrag only needs a text file containing the InChIKeys of the substances, which can be obtained from the Dashboard or Suspect Exchange downloads.
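The difference between using a suspect list as an inclusion filter and as a scoring term can be sketched in a few lines of Python. This only illustrates the idea and is not MetFrag's implementation: the function names, the candidate records and the InChIKeys are made-up placeholders, and matching on the full InChIKey and counting one hit per list are assumptions.

```python
def load_inchikeys(lines):
    """Return the set of non-empty, upper-cased InChIKeys from text lines."""
    return {line.strip().upper() for line in lines if line.strip()}


def filter_by_suspects(candidates, suspects):
    """Inclusion filter: keep only candidates whose InChIKey is on the list."""
    return [c for c in candidates if c["inchikey"] in suspects]


def suspect_hits(candidate, suspect_lists):
    """Scoring term: number of suspect lists that contain the candidate."""
    return sum(candidate["inchikey"] in keys for keys in suspect_lists.values())


# Placeholder candidates; these InChIKeys are invented, not real compounds.
candidates = [
    {"name": "candidate_1", "inchikey": "AAAAAAAAAAAAAA-AAAAAAAAAA-N"},
    {"name": "candidate_2", "inchikey": "BBBBBBBBBBBBBB-BBBBBBBBBB-N"},
    {"name": "candidate_3", "inchikey": "CCCCCCCCCCCCCC-CCCCCCCCCC-N"},
]

# In practice these lines would be read from the downloaded text files.
dsstox = load_inchikeys(
    ["AAAAAAAAAAAAAA-AAAAAAAAAA-N\n", "\n", "CCCCCCCCCCCCCC-CCCCCCCCCC-N\n"]
)
swiss_pharma = load_inchikeys(["AAAAAAAAAAAAAA-AAAAAAAAAA-N\n"])

# Filtering drops candidate_2; scoring keeps everyone and ranks by hits.
kept = filter_by_suspects(candidates, dsstox)
lists = {"DSSTox": dsstox, "SwissPharma": swiss_pharma}
scores = {c["name"]: suspect_hits(c, lists) for c in candidates}

print([c["name"] for c in kept])
print(scores)
```

Filtering is the stricter choice, since a true match missing from the list is lost entirely, whereas scoring only moves listed candidates up the ranking.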

Using the Suspect Lists as a “Scoring term”, along with some other criteria and restrictions, will give you a results plot looking more like this: