Introduction

Start with importing the yaml package.

>>> import yaml

Loading YAML

Warning: It is not safe to call yaml.load with any data received from an untrusted source!
yaml.load is as powerful as pickle.load and so may call any Python function.
For untrusted input, use the yaml.safe_load function instead.

yaml.load accepts a string, a Unicode string, an open file object, or
an open Unicode file object. A string or a file must be encoded with utf-8,
utf-16-be or utf-16-le encoding. yaml.load detects the encoding
by checking the BOM (byte order mark) sequence at the beginning of the
string/file. If no BOM is present, the utf-8 encoding is assumed.

If a string or a file contains several documents, you may load them all with the
yaml.load_all function.

>>> documents = """
... ---
... name: The Set of Gauntlets 'Pauraegen'
... description: >
...   A set of handgear with sparks that crackle
...   across its knuckleguards.
... ---
... name: The Set of Gauntlets 'Paurnen'
... description: >
...   A set of gauntlets that gives off a foul,
...   acrid odour yet remains untarnished.
... ---
... name: The Set of Gauntlets 'Paurnimmen'
... description: >
...   A set of handgear, freezing with unnatural cold.
... """

>>> for data in yaml.load_all(documents):
...     print data

{'description': 'A set of handgear with sparks that crackle across its knuckleguards.\n',
'name': "The Set of Gauntlets 'Pauraegen'"}
{'description': 'A set of gauntlets that gives off a foul, acrid odour yet remains untarnished.\n',
'name': "The Set of Gauntlets 'Paurnen'"}
{'description': 'A set of handgear, freezing with unnatural cold.\n',
'name': "The Set of Gauntlets 'Paurnimmen'"}

Note that the ability to construct an arbitrary Python object may be dangerous
if you receive a YAML document from an untrusted source such as the Internet.
The function yaml.safe_load limits this ability to simple Python objects
like integers or lists.
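
The difference can be sketched as follows. This is a minimal illustration, assuming Python 3 and a recent PyYAML release; the sample document and the variable names are made up for the example.

```python
import yaml

# safe_load builds only plain Python objects (dict, list, str, int, float, ...).
character = yaml.safe_load("""
name: Vorlin Laruknuzum
hp: [32, 71]
sp: 0.5
""")
print(character)

# A Python-specific tag that an unrestricted loader could execute
# is rejected by safe_load instead of being called.
untrusted = "!!python/object/apply:os.getcwd []"
try:
    yaml.safe_load(untrusted)
    rejected = False
except yaml.YAMLError as exc:
    rejected = True
    print("rejected:", type(exc).__name__)
```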

Dumping YAML

The yaml.dump function accepts a Python object and produces a YAML document.

yaml.dump accepts an optional second argument, which must be an open file.
In this case, yaml.dump will write the produced YAML document into the file.
Otherwise, yaml.dump returns the produced document.

>>> stream = file('document.yaml', 'w')
>>> yaml.dump(data, stream)    # Write a YAML representation of data to 'document.yaml'.
>>> print yaml.dump(data)      # Output the document to the screen.

If you need to dump several YAML documents to a single stream, use the function
yaml.dump_all. yaml.dump_all accepts a list or a generator producing
Python objects, each of which is serialized into a separate YAML document.
The optional second argument is an open file.
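
A short sketch of both calling conventions, assuming Python 3 and a recent PyYAML release. The make_documents generator is a hypothetical helper, and io.StringIO stands in for an open file.

```python
import io
import yaml

def make_documents():
    # Hypothetical generator producing one Python object per YAML document.
    yield {"name": "first", "value": 1}
    yield {"name": "second", "value": 2}

# Without a stream, dump_all returns the produced text.
text = yaml.dump_all(make_documents())
print(text)

# With a stream, dump_all writes to it and returns None.
stream = io.StringIO()
result = yaml.dump_all(list(make_documents()), stream)
```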

yaml.dump supports a number of keyword arguments that specify
formatting details for the emitter. For instance, you may set the
preferred indentation and width, use the canonical YAML format, or
force a preferred style for scalars and collections.
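
The following sketch shows a few of these keyword arguments (indent, default_flow_style, canonical). It assumes Python 3 and a recent PyYAML release; the data is made up for the example.

```python
import yaml

data = {"name": "Galain Ysseleg", "stats": {"hp": 16, "ac": [10, 12]}}

# Block style with a four-space indentation step.
block = yaml.dump(data, default_flow_style=False, indent=4)

# Flow style: collections are written inline with braces and brackets.
flow = yaml.dump(data, default_flow_style=True)

# Canonical form: every node carries an explicit tag.
canonical = yaml.dump(data, canonical=True)

for text in (block, flow, canonical):
    print(text)
```

All three strings describe the same data; only the presentation differs.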

yaml.YAMLObject uses metaclass magic to register a constructor, which
transforms a YAML node to a class instance, and a representer, which serializes
a class instance to a YAML node.
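
A minimal sketch of a yaml.YAMLObject subclass, assuming Python 3 and a recent PyYAML release. The Monster class and its fields are illustrative only. The unrestricted yaml.Loader is used here, so the input must be trusted.

```python
import yaml

class Monster(yaml.YAMLObject):
    # The tag under which instances are loaded and dumped.
    yaml_tag = "!Monster"

    def __init__(self, name, hp, ac):
        self.name = name
        self.hp = hp
        self.ac = ac

    def __repr__(self):
        return "Monster(name=%r, hp=%r, ac=%r)" % (self.name, self.hp, self.ac)

text = """
--- !Monster
name: Cave spider
hp: [2, 6]
ac: 16
"""

# The registered constructor turns the tagged mapping into a Monster.
monster = yaml.load(text, Loader=yaml.Loader)
print(monster)

# The registered representer turns a Monster back into a tagged mapping.
dumped = yaml.dump(Monster(name="Cave lizard", hp=[3, 6], ac=16))
print(dumped)
```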

If you don't want to use metaclasses, you may register your constructors
and representers using the functions yaml.add_constructor and
yaml.add_representer. For instance, you may want to add a constructor
and a representer for the following Dice class:
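
One possible version of such a Dice class with its representer and constructor, written as a sketch assuming Python 3 and a recent PyYAML release. The !dice tag, the "AdB" scalar form, and the function names are choices made for this example. The unrestricted yaml.Loader is used, so the input must be trusted.

```python
import yaml

class Dice(tuple):
    def __new__(cls, a, b):
        return tuple.__new__(cls, [a, b])

    def __repr__(self):
        return "Dice(%s,%s)" % self

def dice_representer(dumper, data):
    # Represent Dice(a, b) as a scalar such as "3d6" tagged with !dice.
    return dumper.represent_scalar("!dice", "%sd%s" % data)

def dice_constructor(loader, node):
    # Convert a !dice scalar such as "3d6" back into Dice(3, 6).
    value = loader.construct_scalar(node)
    a, b = map(int, value.split("d"))
    return Dice(a, b)

yaml.add_representer(Dice, dice_representer)
yaml.add_constructor("!dice", dice_constructor)

dumped = yaml.dump({"gold": Dice(10, 6)})
print(dumped)

loaded = yaml.load("initial hit points: !dice 8d4", Loader=yaml.Loader)
print(loaded)
```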

YAML syntax

You may also check the YAML cookbook. Note
that it is focused on a Ruby implementation and uses the old YAML 1.0 syntax.

Here we present the most common YAML constructs together with the corresponding Python objects.

Documents

A YAML stream is a collection of zero or more documents. An empty stream contains no documents.
Documents are separated with the --- marker and may optionally end with the ... marker.
A single document may or may not be marked with ---.

Each style has its own quirks. A plain scalar does not use indicators to denote its
start and end, therefore it's the most restricted style. Its natural applications are
names of attributes and parameters.

Using single-quoted scalars, you may express any value that does not contain special characters.
No escaping occurs for single-quoted scalars, except that a pair of adjacent quotes '' is replaced
with a single quote '.

Double-quoted is the most powerful style and the only style that can express any scalar value.
Double-quoted scalars allow escaping. Using the escape sequences \x** and \u****,
you may express any ASCII or Unicode character.

There are two kinds of block scalar styles: literal and folded. The literal style is
the most suitable style for large blocks of text such as source code. The folded style is similar
to the literal style, but two consecutive non-empty lines are joined into a single line separated
by a space character.
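
The scalar styles described above can be compared side by side. This is a small sketch assuming Python 3 and a recent PyYAML release; the keys and values are made up for the example. A raw Python string is used so that the backslash escapes reach the YAML parser unchanged.

```python
import yaml

text = r"""
plain: Hello world
single: 'It''s a trap'
double: "tab\there \u263A"
literal: |
  line one
  line two
folded: >
  line one
  line two
"""

data = yaml.safe_load(text)
for key, value in data.items():
    print("%s: %r" % (key, value))
```

The literal scalar keeps its line break between the two lines, while the folded scalar joins them with a space.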

Aliases

Note that PyYAML does not yet support recursive objects.

Using YAML you may represent objects of arbitrary graph-like structures. If you want to refer
to the same object from different parts of a document, you need to use anchors and aliases.

Anchors are denoted by the & indicator, while aliases are denoted by *.
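
A minimal sketch of anchors and aliases, assuming Python 3 and a recent PyYAML release. The sample document is illustrative: the anchor &A marks a mapping, and the alias *A refers to that same mapping a second time.

```python
import yaml

text = """
left hand: &A
  name: The Bastard Sword of Eowyn
  weight: 30
right hand: *A
"""

hero = yaml.safe_load(text)
# Both keys refer to one and the same Python object, not to two copies.
print(hero["left hand"] is hero["right hand"])

# When the same object appears twice in the data being dumped,
# the emitter writes an anchor once and an alias afterwards.
sword = {"name": "The Bastard Sword of Eowyn", "weight": 30}
dumped = yaml.safe_dump({"left hand": sword, "right hand": sword})
print(dumped)
```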

Plain scalars without an explicitly defined tag are subject to implicit tag
resolution. The scalar value is checked against a set of regular expressions
and if one of them matches, the corresponding tag is assigned to the scalar.
PyYAML allows an application to add custom implicit tag resolvers.

YAML tags and Python types

The following table describes how nodes with different tags are converted
to Python objects.

YAML tag                            Python type

Standard YAML tags

!!null                              None
!!bool                              bool
!!int                               int or long
!!float                             float
!!binary                            str
!!timestamp                         datetime.datetime
!!omap, !!pairs                     list of pairs
!!set                               set
!!str                               str or unicode
!!seq                               list
!!map                               dict

Python-specific tags

!!python/none                       None
!!python/bool                       bool
!!python/str                        str
!!python/unicode                    unicode
!!python/int                        int
!!python/long                       long
!!python/float                      float
!!python/complex                    complex
!!python/list                       list
!!python/tuple                      tuple
!!python/dict                       dict

Complex Python tags

!!python/name:module.name           module.name
!!python/module:package.module      package.module
!!python/object:module.cls          module.cls instance
!!python/object/new:module.cls      module.cls instance
!!python/object/apply:module.f      value of f(...)

String conversion

There are four tags that are converted to str and unicode values:
!!str, !!binary, !!python/str, and !!python/unicode.

!!str-tagged scalars are converted to str objects if their value is ASCII. Otherwise they are converted to unicode.
!!binary-tagged scalars are converted to str objects, with their value decoded using the base64 encoding.
!!python/str scalars are converted to str objects encoded with the utf-8 encoding.
!!python/unicode scalars are converted to unicode objects.

Conversely, a str object is converted to

  1. a !!str scalar if its value is ASCII.
  2. a !!python/str scalar if its value is a correct utf-8 sequence.
  3. a !!binary scalar otherwise.

A unicode object is converted to

  1. a !!python/unicode scalar if its value is ASCII.
  2. a !!str scalar otherwise.

Names and modules

In order to represent static Python objects like functions or classes, you need to use
a complex !!python/name tag. For instance, the function yaml.dump can be represented as

!!python/name:yaml.dump

Similarly, modules are represented using the tag !!python/module:

!!python/module:yaml

Objects

Any pickleable object can be serialized using the !!python/object tag:

!!python/object:module.Class { attribute: value, ... }

In order to support the pickle protocol, two additional forms of the !!python/object tag
are provided:

load(stream) parses the given stream and returns a Python object constructed from
the first document in the stream. If there are no documents in the stream, it returns None.

load_all(stream) parses the given stream and returns a sequence of Python objects
corresponding to the documents in the stream.

safe_load(stream) parses the given stream and returns a Python object constructed from
the first document in the stream. If there are no documents in the stream, it returns None.
safe_load recognizes only standard YAML tags and cannot construct an arbitrary Python object.

safe_load_all(stream) parses the given stream and returns a sequence of Python objects
corresponding to the documents in the stream. safe_load_all recognizes only standard YAML tags
and cannot construct an arbitrary Python object.
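
A brief sketch of the loading functions described above, assuming Python 3 and a recent PyYAML release; the inline documents are made up for the example.

```python
import yaml

# An empty stream contains no documents, so safe_load returns None.
empty = yaml.safe_load("")
print(empty)

# safe_load_all yields one Python object per document in the stream.
documents = yaml.safe_load_all("--- 1\n--- [a, b]\n--- {k: v}\n")
loaded = list(documents)
print(loaded)
```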

dump(data, stream=None) serializes the given Python object into the stream.
If stream is None, it returns the produced stream.

dump_all(data, stream=None) serializes the given sequence of Python objects
into the given stream. If stream is None, it returns the produced stream.
Each object is represented as a YAML document.

safe_dump(data, stream=None) serializes the given Python object into the stream.
If stream is None, it returns the produced stream. safe_dump produces only standard YAML
tags and cannot represent an arbitrary Python object.

safe_dump_all(data, stream=None) serializes the given sequence of Python objects
into the given stream. If stream is None, it returns the produced stream.
Each object is represented as a YAML document. safe_dump_all produces only standard YAML
tags and cannot represent an arbitrary Python object.
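
The contrast between dump and safe_dump can be sketched as follows, assuming Python 3 and a recent PyYAML release. The Hero class is a hypothetical example type.

```python
import yaml

class Hero:
    def __init__(self, name):
        self.name = name

# Plain data is handled by safe_dump.
plain_text = yaml.safe_dump({"name": "Galain", "hp": [16, 35]})
print(plain_text)

# dump represents an arbitrary instance with a Python-specific tag.
full_text = yaml.dump(Hero("Galain"))
print(full_text)

# safe_dump refuses the same instance.
try:
    yaml.safe_dump(Hero("Galain"))
    refused = False
except yaml.YAMLError as exc:
    refused = True
    print("refused:", type(exc).__name__)
```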

add_constructor(tag, constructor) specifies a constructor for the given tag.
A constructor is a function that converts a node of a YAML representation graph to a native Python object.
A constructor accepts an instance of Loader and a node and returns a Python object.

add_multi_constructor(tag_prefix, multi_constructor) specifies a multi_constructor
for the given tag_prefix. A multi-constructor is a function that converts a node of a YAML
representation graph to a native Python object. A multi-constructor accepts an instance of Loader,
the suffix of the node tag, and a node and returns a Python object.
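
A small sketch of a multi-constructor, assuming Python 3 and a recent PyYAML release. The "!unit:" tag prefix and the unit_constructor function are invented for this example. The optional Loader keyword argument, which is not covered in the text above, is used to register the function on SafeLoader so that safe_load can see it.

```python
import yaml

def unit_constructor(loader, tag_suffix, node):
    # tag_suffix is the part of the tag that follows the registered prefix,
    # for example "cm" for the tag "!unit:cm".
    value = float(loader.construct_scalar(node))
    return (value, tag_suffix)

yaml.add_multi_constructor("!unit:", unit_constructor, Loader=yaml.SafeLoader)

data = yaml.safe_load("length: !unit:cm 12.5\nweight: !unit:kg 3\n")
print(data)
```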

add_representer(data_type, representer) specifies a representer for Python objects
of the given data_type. A representer is a function that converts a native Python object to a node
of a YAML representation graph. A representer accepts an instance of Dumper and an object and returns a node.

add_multi_representer(base_data_type, multi_representer) specifies a multi_representer
for Python objects of the given base_data_type or any of its subclasses. A multi-representer is
a function that converts a native Python object to a node of a YAML representation graph.
A multi-representer accepts an instance of Dumper and an object and returns a node.
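
A sketch of a multi-representer that covers a base class and its subclasses, assuming Python 3 and a recent PyYAML release. The Shape hierarchy and the !shape tag are hypothetical. The optional Dumper keyword argument, which is not covered in the text above, is used to register the function on SafeDumper.

```python
import yaml

class Shape:
    def __init__(self, size):
        self.size = size

class Circle(Shape):
    pass

class Square(Shape):
    pass

def shape_representer(dumper, shape):
    # One function serves Shape and every subclass of Shape.
    fields = {"kind": type(shape).__name__, "size": shape.size}
    return dumper.represent_mapping("!shape", fields)

yaml.add_multi_representer(Shape, shape_representer, Dumper=yaml.SafeDumper)

text = yaml.safe_dump([Circle(2), Square(3)])
print(text)
```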

add_implicit_resolver(tag, regexp, first) adds an implicit tag resolver for plain scalars.
If the scalar value matches the given regexp, it is assigned the tag. first is a
list of possible initial characters or None.
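
A sketch of an implicit resolver that lets untagged scalars such as 5d10 resolve to a custom tag, assuming Python 3 and a recent PyYAML release. The !dice tag, the pattern, and the constructor are invented for this example. The optional Loader and Dumper keyword arguments, which are not covered in the text above, restrict the registration to the safe classes.

```python
import re
import yaml

def dice_constructor(loader, node):
    # Turn a scalar such as "5d10" into the pair (5, 10).
    a, b = loader.construct_scalar(node).split("d")
    return (int(a), int(b))

pattern = re.compile(r"^\d+d\d+$")

yaml.add_constructor("!dice", dice_constructor, Loader=yaml.SafeLoader)
yaml.add_implicit_resolver(
    "!dice",
    pattern,
    first=list("0123456789"),  # a matching scalar must start with a digit
    Loader=yaml.SafeLoader,
    Dumper=yaml.SafeDumper,
)

data = yaml.safe_load("damage: 5d10\nname: 5 dragons\n")
print(data)
```

Only the plain scalar that matches the pattern receives the tag; the other value stays an ordinary string.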

add_path_resolver(tag, path, kind) adds a path-based implicit tag resolver.
A path is a list of keys that form a path to a node in the representation graph.
Path elements can be string values, integers, or None. The kind of a node can
be str, list, dict, or None.

Mark

Mark(name, index, line, column, buffer, pointer)

An instance of Mark points to a certain position in the input stream. name is
the name of the stream, for instance it may be the filename if the input stream is a file.
line and column are the line and column of the position (starting from 0).
buffer, when it is not None, is a part of the input stream that contains the position,
and pointer refers to the position in the buffer.

YAMLError

YAMLError()

If the YAML parser encounters an error condition, it raises an exception which is an instance of
YAMLError or of one of its subclasses. An application may catch this exception and warn the user.
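
A sketch of catching a YAMLError and reporting where it occurred, assuming Python 3 and a recent PyYAML release. The broken document is made up for the example. The problem_mark attribute is present on errors that carry position information, so the code checks for it before use.

```python
import yaml

# The second line holds two mapping values on one line, which is invalid.
broken = "name: hero\nstats: hp: 10\n"

error_position = None
try:
    yaml.safe_load(broken)
except yaml.YAMLError as exc:
    print("Error while parsing YAML:", exc)
    if hasattr(exc, "problem_mark"):
        mark = exc.problem_mark
        # Mark counts lines and columns from 0; add 1 for display.
        error_position = (mark.line + 1, mark.column + 1)
        print("Error position: (%s:%s)" % error_position)
```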

Events

Events are used by the low-level Parser and Emitter interfaces, which are similar to the SAX API.
While the Parser parses a YAML stream and produces a sequence of events, the Emitter accepts
a sequence of events and emits a YAML stream.
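
The round trip through events can be sketched with the module-level functions yaml.parse and yaml.emit, assuming Python 3 and a recent PyYAML release; the input is made up for the example.

```python
import yaml

# yaml.parse produces the events for a stream.
events = list(yaml.parse("- a\n- b\n"))
names = [type(event).__name__ for event in events]
for name in names:
    print(name)

# yaml.emit turns a sequence of events back into YAML text.
text = yaml.emit(events)
print(text)
```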

The flow_style flag indicates if a collection is block or flow. The possible values are
None, True, and False. The style flag of a scalar event indicates the style of the scalar.
Possible values are None, '', '\'', '"', '|', '>'. The implicit flag of a collection
start event indicates if the tag may be omitted when the collection is emitted. The implicit flag
of a scalar event is a pair of boolean values that indicate if the tag may be omitted when the scalar
is emitted in a plain and a non-plain style, respectively.

Nodes

Nodes are entities in the YAML information model. There are three kinds of nodes:
scalar, sequence, and mapping. In PyYAML, nodes are produced by Composer
and can be serialized to a YAML stream by Serializer.

The style and flow_style flags have the same meaning as for events.
The value of a scalar node must be a unicode string. The value of a sequence node is
a list of nodes. The value of a mapping node is a list of pairs of key and value nodes.
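
A sketch of working with nodes through the module-level functions yaml.compose and yaml.serialize, assuming Python 3 and a recent PyYAML release. The sample document is made up; in such releases the value of a mapping node is iterated as (key node, value node) pairs.

```python
import yaml

# yaml.compose returns the root node of the representation graph.
node = yaml.compose("name: hero\nitems: [sword, shield]\n")
print(type(node).__name__, node.tag)

# Walk the (key node, value node) pairs of the mapping node.
keys = []
for key_node, value_node in node.value:
    keys.append(key_node.value)
    print(key_node.value, "->", type(value_node).__name__)

# yaml.serialize turns the node graph back into YAML text.
text = yaml.serialize(node)
print(text)
```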

Loader

Loader(stream)
SafeLoader(stream)
BaseLoader(stream)

# The following classes are available only if you build LibYAML bindings.
CLoader(stream)
CSafeLoader(stream)
CBaseLoader(stream)

Loader(stream) is the most common of the above classes and should be used in most cases.
stream is an input YAML stream. It can be a string, a Unicode string, an open file, or an open Unicode file.

Loader supports all predefined tags and may construct an arbitrary Python object. Therefore it is not safe to use
Loader to load a document received from an untrusted source. By default, the functions scan, parse,
compose, construct, and others use Loader.

SafeLoader(stream) supports only standard YAML tags; thus it does not construct class instances and
is probably safe to use with documents received from an untrusted source. The functions safe_load and
safe_load_all use SafeLoader to parse a stream.

BaseLoader(stream) does not resolve or support any tags and constructs only basic Python objects:
lists, dictionaries, and Unicode strings.

CLoader, CSafeLoader, CBaseLoader are versions of the above classes written in C
using the LibYAML library.
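
The difference between the loader classes can be sketched by passing them to yaml.load through its Loader keyword argument. This assumes Python 3 and a recent PyYAML release; the sample text and the FastSafeLoader alias are choices made for the example.

```python
import yaml

text = "count: 1\nflags: [true, 2.5]\n"

# SafeLoader resolves standard tags, so scalars become int, bool, float.
typed = yaml.load(text, Loader=yaml.SafeLoader)
print(typed)

# BaseLoader resolves no tags, so every scalar stays a string.
untyped = yaml.load(text, Loader=yaml.BaseLoader)
print(untyped)

# Prefer the C implementation when the LibYAML bindings are available,
# and fall back to the pure Python class otherwise.
try:
    from yaml import CSafeLoader as FastSafeLoader
except ImportError:
    from yaml import SafeLoader as FastSafeLoader

fast = yaml.load(text, Loader=FastSafeLoader)
```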

Loader.construct_scalar(node) checks that the given node is a scalar and returns its value.
This function is intended to be used in constructors.

Loader.construct_sequence(node) checks that the given node is a sequence and returns a list
of Python objects corresponding to the node items. This function is intended to be used in constructors.

Loader.construct_mapping(node) checks that the given node is a mapping and returns a dictionary
of Python objects corresponding to the node keys and values. This function is intended to be used in constructors.

Dumper(stream) is the most common of the above classes and should be used in most cases.
stream is an output YAML stream. It can be an open file or an open Unicode file.

Dumper supports all predefined tags and may represent an arbitrary Python object. Therefore
it may produce a document that cannot be loaded by other YAML processors. By default, the functions
emit, serialize, dump, and others use Dumper.

SafeDumper(stream) produces only standard YAML tags; thus it cannot represent class instances and
is probably more compatible with other YAML processors. The functions safe_dump and safe_dump_all
use SafeDumper to produce a YAML document.

BaseDumper(stream) does not support any tags and is useful only for subclassing.

CDumper, CSafeDumper, CBaseDumper are versions of the above classes written in C
using the LibYAML library.

Dumper.emit(event)

Dumper.emit(event) serializes the given event and writes it to the output stream.

Dumper.open()
Dumper.serialize(node)
Dumper.close()

Dumper.open() emits StreamStartEvent.

Dumper.serialize(node) serializes the given representation graph into the output stream.

Subclassing YAMLObject is an easy way to define tags, constructors, and representers
for your classes. You only need to override the yaml_tag attribute. If you want
to define your custom constructor and representer, redefine the from_yaml and to_yaml methods,
respectively.
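
A sketch of a YAMLObject subclass that redefines from_yaml and to_yaml, assuming Python 3 and a recent PyYAML release. The Version class, the !version tag, and the "major.minor" scalar form are invented for this example. The yaml_loader and yaml_dumper class attributes, which the text above does not describe, are set so that the class is registered with the safe loader and dumper.

```python
import yaml

class Version(yaml.YAMLObject):
    yaml_tag = "!version"
    yaml_loader = yaml.SafeLoader
    yaml_dumper = yaml.SafeDumper

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    @classmethod
    def from_yaml(cls, loader, node):
        # Build a Version from a scalar such as "3.12".
        major, minor = loader.construct_scalar(node).split(".")
        return cls(int(major), int(minor))

    @classmethod
    def to_yaml(cls, dumper, data):
        # Write a Version as a single tagged scalar instead of a mapping.
        text = "%d.%d" % (data.major, data.minor)
        return dumper.represent_scalar(cls.yaml_tag, text)

loaded = yaml.safe_load("release: !version 3.12\n")
print(loaded["release"].major, loaded["release"].minor)

dumped = yaml.safe_dump({"release": Version(1, 4)})
print(dumped)
```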

Deviations from the specification

need to update this section

Rules for tabs in YAML are confusing. We are close, but not there yet.
Perhaps both the spec and the parser should be fixed. Anyway, the best
rule for tabs in YAML is to not use them at all.

Byte order mark. The initial BOM is stripped, but BOMs inside the stream
are considered as parts of the content. It can be fixed, but it's not
really important now.

Empty plain scalars are not allowed if an alias or a tag is specified. This
is done to prevent anomalies like [ !tag, value ], which can be
interpreted both as [ !<!tag,> value ] and [ !<!tag> "", "value" ].
The spec should be fixed.

Indentation of flow collections. The spec requires them to be indented
more than their block parent node. Unfortunately this rule makes many intuitively
correct constructs invalid, for instance,

block: {
} # this is indentation violation according to the spec.

':' is not allowed for plain scalars in the flow mode. ~{1:2} is
interpreted as { 1 : 2 }.~