Proteomics data formats

Data formats

Raw data level

Thermo Xcalibur or Waters MassLynx .RAW files
Both vendors use the .RAW extension, but the formats are different and can be told apart: an Xcalibur .RAW is a single file, whereas a MassLynx .RAW is a directory. This is the original data file as written by Xcalibur or MassLynx, respectively, during data acquisition. These files can be accessed programmatically on the Microsoft Windows platform through an OLE DLL. Normally this only works from a C++ implementation; however, the PeakML library (due to be released as open source; currently available on request from r.a.scheltema@rug.nl) provides a 1-to-1 mapping for accessing the data from Java implementations. It is advisable to use these original formats, as they contain a large amount of information that is not mapped to an open file format like mzML.
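
The file-versus-directory rule above can be sketched in a few lines of Python. This is only an illustration of that one rule: the function name classify_raw and its return strings are made up for this example, and it does not open or validate the contents of the data in any way.

```python
from pathlib import Path


def classify_raw(path):
    """Guess the vendor of a .RAW entry from whether it is a file or a directory.

    Hypothetical helper: it applies only the rule of thumb
    Xcalibur = file, MassLynx = directory, and never reads the data.
    """
    p = Path(path)
    # Both vendors use the same extension; compare case-insensitively.
    if p.suffix.lower() != ".raw":
        return "not a .RAW entry"
    if p.is_file():
        return "Thermo Xcalibur"
    if p.is_dir():
        return "Waters MassLynx"
    # The path has the right extension but does not exist on disk.
    return "missing"
```

Calling classify_raw on each entry of an acquisition folder gives a quick first sort of mixed vendor data before the files are handed to a vendor-specific reader.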

Peptide identification level

PeakML
Developed by Richard Scheltema in response to the need to store intermediate data (extracted mass traces, matched sets of these, parameters, etc.), so that a modular pipeline can be set up.

NetCDF (obsolete)
NetCDF was developed as a general-purpose format and as such is a very poor fit for mass spec data: it cannot hold much of the useful information on your mass spec run. Do not use it.