RefMet Naming Conventions

The names used in RefMet are generally based on common, officially accepted terms and incorporate notations appropriate to the analytical technique used. In general, high-throughput untargeted MS experiments cannot determine stereochemistry, double bond position/geometry, or sn position (for glycerolipids/glycerophospholipids). In addition, the type of MS technique employed and the mass accuracy of the instrument produce identifications at different levels of detail. For example, MS/MS methods can identify acyl chain substituents in lipids (e.g. PC(16:0/20:4), Cer(d18:1/16:0)), whereas MS methods that use only precursor ion information might report these ions as "bulk" species (e.g. PC(36:4), Cer(d34:1)). RefMet covers both types of notation in an effort to enable data sharing and comparative analysis of metabolomics data, using an analytical chemistry-centric approach. Some notes pertaining to different metabolite classes are outlined below.
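
The distinction between "bulk" and chain-resolved names can be checked mechanically. The following is a minimal sketch, not part of RefMet itself: the function name annotation_level and the regular expression are assumptions made for illustration, and the check relies only on the separators ("/" or "_") that appear in the examples above.

```python
import re

# Hypothetical pattern for "CLASS(chains)" shorthand such as PC(36:4) or
# Cer(d18:1/16:0). One nested "(...)" is allowed for cases like (2OH).
_NAME = re.compile(r"^(?P<cls>[A-Za-z]+)\((?P<body>[^()]*(?:\([^()]*\))?)\)$")


def annotation_level(name: str) -> str:
    """Return "chain-resolved" if individual chains are listed, else "bulk".

    Assumption: chains are separated by "/" or "_" as in the examples in
    this document. Single-chain names such as MG(16:0) are reported as
    "bulk" because the two notations coincide for them.
    """
    m = _NAME.match(name.strip())
    if m is None:
        raise ValueError(f"not a recognised shorthand name: {name!r}")
    body = m.group("body")
    return "chain-resolved" if ("/" in body or "_" in body) else "bulk"


if __name__ == "__main__":
    for example in ("PC(16:0/20:4)", "PC(36:4)", "Cer(d18:1/16:0)", "Cer(d34:1)"):
        print(example, annotation_level(example))
```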

RefMet classification:

Amino acids:

The amino acids are listed as Glycine, Arginine, Tyrosine, etc. without an "L-" prefix and are linked to the structures of the predominant "L-" forms. D-amino acids, on the other hand, are explicitly listed as such (D-Arginine, D-Asparagine, etc.).

Dipeptides:

Dipeptides are listed by their 3-letter amino acid abbreviations, such as Lys-Arg and Asn-Leu, and are linked to the L-structures where applicable.

Sugars:

Monosaccharide sugars are generally listed without the "D/L-" prefix and linked to the structure of the most abundant enantiomer in nature. Thus, Glucose, Galactose and Fucose are linked to the structures of D-Glucose, D-Galactose and L-Fucose, respectively. Other enantiomers such as D-Fucose or L-Galactose are explicitly listed as such.

Sphingolipids:

In cases where N-acyl chain-containing sphingolipids such as ceramides and sphingomyelins are identified by MS precursor ion only, the "bulk" abbreviation is used, such as Cer(d34:1) and SM(d42:1), where the lower-case letter (m (mono), d (di), t (tri)) within the parentheses denotes the number of hydroxyl groups in the entire molecule, and the numbers denote the number of carbons:number of double bonds in the entire molecule.
Where MS/MS methods or other techniques were used to identify the nature of the N-acyl chain and/or sphingoid base, a nomenclature such as Cer(d18:1/16:0) and SM(d18:0/24:1) is used. In this case the lower-case letter (m, d, t) refers to the number of hydroxyl groups in the sphingoid base only. A hydroxyl group in the N-acyl chain is represented as Cer(d18:0/24:0(2OH)).
Cer: Ceramide
CerP: Ceramide-1-phosphate
SM: Sphingomyelin
GlcCer: Glucosyl ceramide
GalCer: Galactosyl ceramide
HexCer: Hexosyl ceramide (the nature of the hexose sugar could not be determined)
LacCer: Lactosyl ceramide
Sulfatide: 3-sulfo-galactosyl ceramide
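
As an illustration of how the two sphingolipid notations relate, the sketch below collapses a chain-resolved name into its "bulk" form by summing carbons and double bonds over the sphingoid base and the N-acyl chain. The function name sphingolipid_to_bulk is hypothetical. The handling of N-acyl hydroxylation is an assumption: a suffix such as "(2OH)" is treated as one additional hydroxyl group, which then raises the whole-molecule letter (d to t).

```python
import re

# Chain-resolved sphingolipid shorthand, e.g. Cer(d18:1/16:0) or SM(d18:0/24:1).
_SP = re.compile(
    r"^(?P<cls>[A-Za-z]+)\("
    r"(?P<oh>[mdt])(?P<bc>\d+):(?P<bd>\d+)"  # sphingoid base: letter, C, DB
    r"/(?P<ac>\d+):(?P<ad>\d+)"              # N-acyl chain: C, DB
    r"(?P<acyl_oh>\(\d*OH\))?\)$"            # optional N-acyl hydroxyl
)
_OH_LETTERS = "mdt"  # m = 1, d = 2, t = 3 hydroxyl groups


def sphingolipid_to_bulk(name: str) -> str:
    """Collapse e.g. Cer(d18:1/16:0) into the bulk form Cer(d34:1)."""
    m = _SP.match(name.strip())
    if m is None:
        raise ValueError(f"not a chain-resolved sphingolipid name: {name!r}")
    n_oh = _OH_LETTERS.index(m["oh"]) + 1
    if m["acyl_oh"]:
        # Assumption: "(2OH)" denotes a single hydroxyl on the N-acyl chain.
        n_oh += 1
    if n_oh > len(_OH_LETTERS):
        raise ValueError(f"no bulk letter for {n_oh} hydroxyl groups")
    carbons = int(m["bc"]) + int(m["ac"])
    double_bonds = int(m["bd"]) + int(m["ad"])
    return f"{m['cls']}({_OH_LETTERS[n_oh - 1]}{carbons}:{double_bonds})"


if __name__ == "__main__":
    print(sphingolipid_to_bulk("Cer(d18:1/16:0)"))  # Cer(d34:1)
    print(sphingolipid_to_bulk("SM(d18:0/24:1)"))   # SM(d42:1)
```

The reverse direction is not possible: a bulk name such as Cer(d34:1) is compatible with several base/N-acyl combinations.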

Glycerolipids:

Mono-, di- and tri-radylglycerols are designated by the MG, DG and TG prefixes, respectively. In cases where the nature of the chains has not been determined, "bulk" abbreviations such as TG(54:2) and DG(36:0) are used.
Species containing an alkyl ether chain in place of an acyl chain are designated with an "O-" prefix, e.g. DG(O-36:0).
Glycerolipids whose chain constituents have been identified by MS/MS methods or other techniques are designated as TG(16:0_18:1_20:4) or DG(18:1_18:2) where the underscore "_" indicates that the sn position on the glycerol backbone is unknown. In cases where the chains are the same (e.g. TG(16:0/16:0/16:0)) a forward slash "/" is used because there is no sn position ambiguity.
In the case of diradylglycerols, the sn location of the chains (1,2-, 1,3- or 2,3-) is not assumed unless explicitly specified. Similarly, for the monoradylglycerols, an abbreviation such as MG(16:0) represents a general species covering sn1, sn2 and sn3 substitution. Triradylglycerols containing one alkyl chain (MonoEther-DIacylGlycerols) are listed as MeDAG(54:2), etc.
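
The separator rule and the relation between chain-resolved and "bulk" glycerolipid names can be sketched as follows. Both function names are hypothetical, only plain acyl chains written as carbons:double bonds are handled (ether species and MeDAG names are out of scope), and format_chain_resolved covers only the case where sn positions are unknown.

```python
import re

_GL = re.compile(r"^(?P<cls>MG|DG|TG)\((?P<body>[^()]+)\)$")
_CHAIN = re.compile(r"^(?P<c>\d+):(?P<d>\d+)$")


def format_chain_resolved(cls: str, chains: list) -> str:
    """Join chains with "/" when all are identical, otherwise with "_".

    Assumption: sn positions are unknown, so "_" is used for mixed chains.
    """
    sep = "/" if len(set(chains)) == 1 else "_"
    return f"{cls}({sep.join(chains)})"


def glycerolipid_to_bulk(name: str) -> str:
    """Sum carbons and double bonds, e.g. TG(16:0_18:1_20:4) -> TG(54:5)."""
    m = _GL.match(name.strip())
    if m is None:
        raise ValueError(f"not an MG/DG/TG shorthand name: {name!r}")
    carbons = double_bonds = 0
    for chain in re.split(r"[/_]", m["body"]):
        cm = _CHAIN.match(chain)
        if cm is None:
            raise ValueError(f"unsupported chain notation: {chain!r}")
        carbons += int(cm["c"])
        double_bonds += int(cm["d"])
    return f"{m['cls']}({carbons}:{double_bonds})"


if __name__ == "__main__":
    print(format_chain_resolved("TG", ["16:0", "16:0", "16:0"]))  # TG(16:0/16:0/16:0)
    print(format_chain_resolved("DG", ["18:1", "18:2"]))          # DG(18:1_18:2)
    print(glycerolipid_to_bulk("TG(16:0_18:1_20:4)"))             # TG(54:5)
```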

Glycerophospholipids:

In cases where the nature of the chains has not been determined, "bulk" abbreviations such as PC(34:2) and PE(36:0) are used.
Species containing an alkyl ether chain in place of an acyl chain are designated with an "O-" prefix, and species containing a (1Z) vinyl ether chain (i.e. plasmalogens) are designated with a "P-" prefix. Since a phospholipid with a plasmenyl group (e.g. PC(P-32:0)) is isobaric with an alkyl ether species containing a double bond at a chain position other than C1 (e.g. PC(O-32:1)), MS methods generally cannot distinguish between these isomers, and they are listed as PC(P-32:0)/PC(O-32:1).
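
The P-/O- pairing follows a simple rule: the vinyl ether double bond implied by "P-" is counted explicitly in the "O-" form, so P-C:n pairs with O-C:(n+1). The sketch below builds the combined label from either member of the pair. The function name isobaric_ether_label is hypothetical, and returning a saturated O- species unchanged (because it has no P- counterpart) is an assumption about how such names would be handled.

```python
import re

# Bulk ether/plasmalogen shorthand, e.g. PC(P-32:0) or PC(O-32:1).
_ETHER = re.compile(r"^(?P<cls>[A-Z]+)\((?P<link>[OP])-(?P<c>\d+):(?P<d>\d+)\)$")


def isobaric_ether_label(name: str) -> str:
    """Return the combined label, e.g. PC(P-32:0)/PC(O-32:1)."""
    m = _ETHER.match(name.strip())
    if m is None:
        raise ValueError(f"not a bulk O-/P- shorthand name: {name!r}")
    cls, carbons, double_bonds = m["cls"], int(m["c"]), int(m["d"])
    if m["link"] == "P":
        p_db, o_db = double_bonds, double_bonds + 1
    elif double_bonds == 0:
        # Assumption: a saturated alkyl ether has no plasmalogen isomer.
        return f"{cls}(O-{carbons}:0)"
    else:
        p_db, o_db = double_bonds - 1, double_bonds
    return f"{cls}(P-{carbons}:{p_db})/{cls}(O-{carbons}:{o_db})"


if __name__ == "__main__":
    print(isobaric_ether_label("PC(P-32:0)"))  # PC(P-32:0)/PC(O-32:1)
    print(isobaric_ether_label("PC(O-32:1)"))  # PC(P-32:0)/PC(O-32:1)
```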
Glycerophospholipids whose chain constituents have been identified by MS/MS methods or other techniques are designated as PC(16:0_20:4) where the underscore "_" indicates that the sn position on the glycerol backbone is unknown. In cases where the chains are the same (e.g. PE(18:0/18:0)) a forward slash "/" is used because there is no sn position ambiguity.
Lysophospholipids are prefixed with an "L".
(L)PC: (Lyso)Glycerophosphocholines
(L)PE: (Lyso)Glycerophosphoethanolamines
(L)PS: (Lyso)Glycerophosphoserines
(L)PG: (Lyso)Glycerophosphoglycerols
(L)PI: (Lyso)Glycerophosphoinositols
(L)PA: (Lyso)Glycerophosphates
CL: Cardiolipins (Glycerophosphoglycerophosphoglycerols)