mzIdentML 1.2.0 (Released March 2017 - current version of the standard)

Between 2013 and 2017, PSI-PI updated mzIdentML from version 1.1 to 1.2. The main update improves the representation of protein grouping relationships through the use of mandatory CV terms. Minor updates have also been proposed for capturing pre-fractionation of samples, de novo sequencing and the use of multiple search engines. Specifications have also been added for supporting proteogenomics and cross-linking MS.

mzIdentML 1.1.1: XML Schema, Documentation

Released in July 2015, as a minor update to version 1.1.0. This update should be viewed as a "bugfix" update only. The only change is to ensure that mass deltas encoded in the format are consistently encoded as doubles and not as floats. As of March 2017, both mzIdentML 1.1.1 and 1.2 (see above) will be generally supported for some years, although we strongly encourage new implementers to work with mzIdentML 1.2.

This has resulted in a change to the schema (XSD) and the specification document only. All other resources are unchanged from version 1.1.0.
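The difference between single and double precision that motivated the 1.1.1 fix can be illustrated with a short Python sketch. It is not taken from the mzIdentML schema or any PSI tool: the mass delta value and the `round_trip` helper are hypothetical, and the code only shows how much precision a 32-bit float loses compared with a 64-bit double.

```python
import struct


def round_trip(value: float, fmt: str) -> float:
    """Pack and unpack a value with a struct format ('<f' = 32-bit, '<d' = 64-bit)."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


# Hypothetical mass delta in daltons, chosen only for illustration.
mass_delta = 0.0012345678

as_float = round_trip(mass_delta, "<f")   # stored as a single-precision float
as_double = round_trip(mass_delta, "<d")  # stored as a double

print(f"float32 round-trip error: {abs(as_float - mass_delta):.3e}")
print(f"float64 round-trip error: {abs(as_double - mass_delta):.3e}")
```

The double survives the round trip unchanged, while the float comes back slightly altered, which is why mixing the two types for the same quantity gives inconsistent values.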

mzIdentML Tools and Implementations

mzIdentML 1.0.0 was the first version of the mzIdentML format, released in August 2009. It is NOW DEPRECATED: users should use mzIdentML 1.1.x or 1.2 instead.

mzIdentML was developed as an extension to the Functional Genomics Experiment (FuGE) object model. However, in a change agreed at the PSI Spring Meeting, 2008, the XML schema was developed directly rather than performing the design in UML and converting to XML. A cut-down version of the FuGE xsd has been developed to facilitate this. As a consequence, the UML class diagram in subversion is now out of date.

From 2005 to 2008 there existed two separate XML formats for encoding raw spectrometer output: mzData, developed by the PSI, and mzXML, developed at the Seattle Proteome Center at the Institute for Systems Biology. It was recognized that the existence of two separate formats for essentially the same thing generated confusion and required extra programming effort. Therefore the PSI, with full participation by ISB, developed a new format by taking the best aspects of each of the precursor formats to form a single one. It is intended to replace the previous two formats. This new format was originally given a working name of dataXML. The final name is mzML.

On 2008-06-01, mzML 1.0.0 was released.

In early 2009, several implementation efforts identified a few minor shortcomings in mzML 1.0.0. Since no vendors had yet released software supporting mzML 1.0, the working group decided to release an update in June 2009. It is expected that all software will support mzML 1.1 as the long-term-stable format instead of 1.0. Below are the available documents and initial implementations. We encourage the community to begin implementing mzML 1.1.0, to phase out use of mzData and mzXML, and to send feedback to psidev-ms-dev@lists.sourceforge.net.

On 2009-06-01, mzML 1.1.0 was released. There are no planned further changes as of early 2013.

mzML Release Schedule

(updated 2013-05-02)

2008-06-01 mzML 1.0.0 released

2009-06-01 mzML 1.1.0 released

2010-06-01 mzML index wrapper schema updated to 1.1.1

2013-05 Minor updates to CV still occur, but no new schema changes are planned at this time

mzML 1.1.0 Finished Specification

(updated 2010-07-13)

The information and documents in this subsection are related to mzML 1.1.0, revised after going through the PSI document process on May 19, 2009. Everyone is encouraged to update their implementation to mzML 1.1.0 and release software supporting that instead of mzML 1.0. It is sincerely hoped that mzML 1.1 will remain stable for a long time.

NOTE: On 2010-06-01, the mzML index schema was updated from 1.1.0 to 1.1.1. There was no functional change, but rather the addition of an enumeration constraint to an attribute to prevent creative, unintended values. This could cause some files that previously validated to no longer validate. However, any such files should never have successfully validated in the first place.

PSI-MS: Mass Spectrometry Standards Working Group

The PSI-MSS working group defines community data formats and controlled vocabulary terms facilitating data exchange and archiving in the field of proteomics mass spectrometry.

Current projects are:

The mzML format, which merges the mzData format (see below) and another similar format mzXML. mzML 1.1.0 was released on June 1, 2009 and has been stable since then. Everyone is encouraged to implement mzML 1.1.0 in their software. See the mzML information page for the full specification, other documentation and examples.

The TraML format has been developed as a standardized format for the exchange and transmission of transition lists for selected reaction monitoring (SRM) experiments. This specification has been accepted through the PSI document process and is complete. Please email the list psidev-ms-dev@lists.sourceforge.net with your questions, comments, and suggestions. See the TraML information page for the full specification, other documentation and examples.

Past achievements are:

The mzData standard, which captures mass spectrometry output data. mzData's aim was to unite the large number of existing formats (pkl's, dta's, mgf's, ...) into a single format. mzData has been released and is stable at version 1.05. It is now deprecated in favor of mzML.

As of 2006 there existed two separate XML formats for encoding raw spectrometer output: mzData, developed by the PSI, and mzXML, developed at the Seattle Proteome Center at the Institute for Systems Biology. It was recognized that the existence of two separate formats for essentially the same thing generated confusion and extra programming effort. Therefore the PSI, with full participation by ISB, developed a new format intended to replace the previous two formats, by merging the best ideas from each format. This new format is called mzML. See the information page for mzML, which includes current documentation, example files and other related materials. We encourage everyone to implement mzML in their software and workflows and cease using mzXML and mzData as soon as possible.

Controlled Vocabulary development: The PSI-MS CV

The PSI-MS Controlled Vocabulary is developed jointly with the PSI-Proteomics Informatics group. It consists of a large collection of structured terms covering the description and use of mass spectrometry instrumentation as well as protein identification and quantitation software. The terms come from multiple sources: vocabulary and definitions in chapter 12 of the IUPAC nomenclature book, instrument and software vendors and developers, and other user-submitted terms. Although its structure and use are linked to mzML, mzIdentML and mzQuantML, it is dynamically maintained in OBO format.
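As a rough illustration of what working with a vocabulary in OBO format involves, the following Python sketch reads `[Term]` stanzas from OBO-style text into dictionaries. It is a minimal sketch rather than a full OBO parser (it ignores escaping, trailing comments and other details of the format), and the `EX:` identifiers and term names in the sample are made up for the example; they are not real PSI-MS CV terms.

```python
def parse_obo_terms(text: str) -> list[dict[str, list[str]]]:
    """Collect the tag/value pairs of every [Term] stanza in OBO-style text."""
    terms = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line == "[Term]":
            current = {}          # start a new term stanza
            terms.append(current)
        elif line.startswith("["):
            current = None        # some other stanza type, e.g. [Typedef]
        elif current is not None and ": " in line:
            key, value = line.split(": ", 1)
            # A tag such as is_a may repeat, so values are kept in a list.
            current.setdefault(key, []).append(value)
    return terms


# Hypothetical sample; the ids and names are invented for illustration.
SAMPLE = """\
format-version: 1.2

[Term]
id: EX:0000001
name: example instrument
def: "A made-up term." []

[Term]
id: EX:0000002
name: example instrument subtype
is_a: EX:0000001 ! example instrument
"""

terms = parse_obo_terms(SAMPLE)
for term in terms:
    print(term["id"][0], term["name"][0])
```

Keeping every tag as a list is a deliberate simplification: it avoids special-casing tags that may legitimately appear more than once in a stanza.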

mzData in a nutshell

mzData is a data format capturing peak list information. Its aim is to unite the large number of current formats (pkl's, dta's, mgf's, ...) into one: mzData. mzData is NOT a substitute for the raw file formats of the instrument vendors. Some vendors, if not all, will provide software transforming their raw files to mzData. There are already a number of programs which can use mzData. To keep mzData file sizes manageable, m/z and intensity values are stored as base64-encoded binary.
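The idea of storing peak values as base64-encoded binary can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the mzData reference implementation: the helper names `encode_array` and `decode_array` are hypothetical, and the choice of little-endian 32-bit floats is an assumption made for the example.

```python
import base64
import struct


def encode_array(values, fmt="<f"):
    """Pack numbers as binary (default: little-endian 32-bit floats) and base64-encode them."""
    raw = b"".join(struct.pack(fmt, v) for v in values)
    return base64.b64encode(raw).decode("ascii")


def decode_array(text, fmt="<f"):
    """Reverse encode_array: base64-decode, then unpack fixed-size numbers."""
    raw = base64.b64decode(text)
    size = struct.calcsize(fmt)
    return [struct.unpack(fmt, raw[i:i + size])[0] for i in range(0, len(raw), size)]


# Hypothetical m/z values, chosen so that they are exact in 32-bit floats.
mz_values = [100.0, 200.5, 300.25]

encoded = encode_array(mz_values)
decoded = decode_array(encoded)

print(encoded)
print(decoded)
```

Three 32-bit values occupy 12 bytes, which base64 turns into 16 ASCII characters; writing the same numbers out as decimal text inside XML elements would generally take more space.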

The HUPO PSI Mass Spectrometry Standards Working Group (MSS WG) has developed a specification for a standardized format for the exchange and transmission of transition lists for selected reaction monitoring (SRM) experiments. This specification has passed rigorous review through the PSI document process and is complete. Please email the list psidev-ms-dev@lists.sourceforge.net with your questions, comments, and suggestions.

TraML Development Timeline

(updated 2013-04-22)

2010-03-31 TraML 0.9.4 draft posted

2010-07-01 TraML 0.9.4 submitted as 1.0.0RC1 to PSI document process

2011-08-23 TraML 0.9.5 draft posted

2011-12-12 TraML 1.0.0 released

New TraML 1.0.0 Finished Specification

(updated 2013-04-22)

The information and documents in this subsection are related to TraML 1.0.0, now complete in its development cycle. There are currently no open issues for a follow-on version. Everyone is encouraged to examine and implement the format as widely as possible.