Wrapper for processing of GC-MS data files

Description

Main function of the pipeline for GC-MS data processing. It includes
XCMS peak detection, definition of pseudospectra, and compound
identification by comparison to a database of standards. The function
also removes artefacts such as column bleeding and plasticizers, and
defines unknowns: unidentified patterns that are consistently present
across samples.

Usage

Arguments

files

input files, given as a vector of strings containing the
complete paths. All formats supported by XCMS can be used.

xset

alternatively, one can supply a list of xcmsSet objects
for which CAMERA grouping has already been done. In this case, only
the annotation step is performed. If both files and
xset are given, the former takes precedence.

settings

a nested list of settings, to be used at individual
steps of the pipeline.

rtrange

part of the chromatograms that is to be analysed. If
given, it should be a vector of two numbers indicating minimal and
maximal retention time (in minutes).
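
As a small illustration of the expected shape, the sketch below builds an rtrange value from limits recorded in seconds. The helper name make_rtrange is hypothetical and not part of the package; only the "two numbers, in minutes" convention comes from the description above.

```r
# Hypothetical helper (not part of the package): build an rtrange vector
# in minutes from limits given in seconds.
make_rtrange <- function(start_sec, end_sec) {
  stopifnot(is.numeric(start_sec), is.numeric(end_sec), start_sec < end_sec)
  c(start_sec, end_sec) / 60
}

rtrange <- make_rtrange(300, 2700)  # analyse from 5 to 45 minutes
```

The resulting vector can then be passed as the rtrange argument.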

DB

database containing the spectra of the pure standards. At
least the following fields should be present: Name,
std.rt, pspectrum and monoMW.
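
A quick check for the required fields might look like the sketch below. It assumes the database is a list with one element per standard, and that pspectrum is a two-column matrix of m/z values and intensities; both are assumptions for illustration, as are the toy values and the helper missing_fields.

```r
# Field names taken from the text above; everything else is illustrative.
required_fields <- c("Name", "std.rt", "pspectrum", "monoMW")

# A made-up entry: assumed here to be a list whose pspectrum is a
# two-column matrix of m/z values and intensities.
toy_entry <- list(
  Name = "toy compound",
  std.rt = 12.3,
  pspectrum = cbind(mz = c(73, 147, 221), intensity = c(100, 45, 12)),
  monoMW = 222.1
)
toy_DB <- list(toy_entry)

# Hypothetical helper: for each entry, the required fields it lacks.
missing_fields <- function(db) {
  lapply(db, function(entry) setdiff(required_fields, names(entry)))
}

missing_fields(toy_DB)
```

An entry that passes the check yields an empty character vector; any names returned are fields that still need to be filled in.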

removeArtefacts

logical, whether or not to remove patterns
identified as (e.g.) column bleeding. Only performed if a database
containing such patterns is available.

findUnknowns

logical, whether to find patterns without
identification that are present consistently in several samples. The
default is to use TRUE if the number of samples is larger than the
min.class.size setting in the 'betweenSamples' metaSetting.

returnXset

logical: should the XCMS output be returned? If yes,
this is a list of xcmsSet objects, one element for each
input file.

RIstandards

A two-column matrix containing for the standards
defining the RI scale both retention times and retention indices. If
not given, no RI values will be calculated and retention times will
be used instead.
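
To make the idea of an RI scale concrete, the sketch below converts retention times to retention indices by linear interpolation between neighbouring standards. This is one common approach and is an assumption here; the method actually used by the package may differ. The column names, the values and the helper rt_to_RI are all hypothetical.

```r
# Made-up standards: retention times (minutes) and retention indices.
RIstandards <- cbind(rt = c(5, 10, 15, 20), RI = c(1000, 1200, 1400, 1600))

# Hypothetical helper: linear interpolation between neighbouring standards.
# Retention times outside the range of the standards give NA.
rt_to_RI <- function(rt, standards) {
  approx(x = standards[, 1], y = standards[, 2], xout = rt)$y
}

rt_to_RI(c(7.5, 12, 25), RIstandards)
```

In this toy setup the last retention time lies beyond the final standard, so no index is assigned to it.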

nSlaves

Number of cores to be used in peak picking.

Value

A list with the following elements:

PeakTable

data.frame containing annotation information. Every
row is a feature, i.e. a pseudospectrum. The first columns give
information about these features, such as compound name, CAS and
Chemspider IDs. The last of these meta-information columns
is always the one giving the retention time: “rt”. The columns after
that correspond to input files, and give a measure of intensity for
every feature. If a feature is not detected in a sample,
this is indicated with “0” (zero).
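
Because “rt” is always the last meta-information column, the table can be split into annotation and intensities by position. The sketch below does this on a mock table; the column names other than “rt” and all values are invented for illustration.

```r
# A mock PeakTable with made-up values; only the "rt" convention comes
# from the description above.
PeakTable <- data.frame(
  Name = c("compound A", "Unknown 1"),
  rt = c(10.2, 15.7),
  sample1 = c(1500, 0),
  sample2 = c(1320, 880)
)

# Everything up to and including "rt" is meta-information;
# everything after it is one intensity column per input file.
rt_col <- which(colnames(PeakTable) == "rt")
meta <- PeakTable[, seq_len(rt_col), drop = FALSE]
intensities <- as.matrix(PeakTable[, -seq_len(rt_col), drop = FALSE])

# A zero means "not detected", so count detections per feature.
n_detected <- rowSums(intensities > 0)
```

Counting non-zero entries per row gives a simple view of how consistently each feature is found across samples.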

PseudoSpectra

A list of pseudospectra in msp format, in the
same order as the rows in the PeakTable.

xset

optionally, the xcmsSet object is returned, which can be
useful for more detailed inspection of the results. It can also be
used as an input for runGC, e.g., to test different annotation
settings independently of the xcms/CAMERA part.

sessionInfo

The output of sessionInfo(), to keep track of the software
versions used for the processing.