The state of YAML in PHP

December 21, 2009

My first exposure to YAML was in 2001, back in the days when I was mainly
working with Perl. Well, I was not using YAML per se at that time, but rather
Data::Denter, a
Perl library that provides data serialization/deserialization. I used this
library mainly for debugging purposes. From its documentation:

"It formats nested data structures in an indented fashion. It is optimized
for human readability/editability, safe deserialization, and (eventually)
speed."

At the end of 2002, the module was deprecated in favor of a new
serialization language, YAML, with the added bonus of
being programming-language independent. I promptly switched to the Perl
YAML module, and I
never looked back. I used YAML as a means to debug my Perl programs, but I also
started to use it more and more to store configuration data.

When I started to use PHP at the end of 2004, one of the first things that
bothered me was the poor support for YAML in the PHP world.

By the way, the heavy use of YAML in symfony has nothing to do with Ruby on
Rails ;) It just happens that Ruby also has some Perl heritage!

But first, what is YAML?

According to the official YAML website, YAML (YAML Ain't Markup Language) is
a human friendly data serialization standard for all programming
languages.

YAML can be used to describe both simple and complex data structures, and it
is easy to learn. Like PHP, it has a syntax for simple
types like strings, booleans, floats, integers, arrays, and even more complex
ones like objects.

Nowadays, YAML is a heavily used format for configuration files, mainly
because even non-programmers are able to understand and modify YAML files
easily.

To sum up the benefits of YAML, I often say that YAML files are as
expressive as XML files and as readable as INI files.

Since the creation of YAML, another lightweight data-interchange format has
come to life: JSON. JSON is quite similar to YAML (and as
a matter of fact, JSON is a subset of YAML); but even if it is easy for humans
to read and write, I think it is not as readable as YAML, and a bit too
verbose.

YAML

If you already know what YAML is and how to use it to describe your data
structures, just skip this section.

Beyond scalars such as strings, Booleans, and numbers, let's have a look at
one of the simplest configuration structures you can describe with YAML:

key: value
foo: bar

The above snippet is the simplest way to express key/value pairs in YAML. The
foo key has a bar value. The equivalent PHP code would be:
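As a minimal sketch, assuming a YAML mapping is loaded as a PHP associative array (the `$config` variable name is only illustrative):

```php
<?php
// Assumed PHP equivalent of the two-line YAML mapping above:
// each YAML key becomes an array key, each value a string.
$config = array(
    'key' => 'value',
    'foo' => 'bar',
);

echo $config['foo'], "\n"; // prints "bar"
```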

This section has barely scratched the surface of what you can express with
YAML. If you want to learn more, you will find plenty of
documentation
on the Internet.

YAML in PHP

YAML is human-friendly, but not so friendly to developers who want to
write a parser for it. The YAML specification is huge. If you
read it carefully, you can easily
imagine that writing a YAML parser is not an easy task. As I mainly use YAML
as a configuration format, like many other developers, I would rather have a
fast, incomplete but correct library than a fat, spec-compliant one.

Back in 2005, I was looking for such a YAML parser and dumper for PHP. Chris
Wanstrath, who would go on to create
GitHub some years later, wrote one such limited parser
and dumper, Spyc, specifically to be used as
a simple configuration library.

I used it for symfony 1.0. I fixed some bugs
from time to time, but as time passed, I found many limitations and became
more and more frustrated with it. Eventually, I decided to write a
more robust and stable YAML parser and dumper for symfony.

Since then, Alexey Zakhlestin has created a
PECL extension that wraps the Syck
library.

At the beginning of 2009, I decided to release the symfony YAML parser and
dumper as a standalone library, with no dependency whatsoever. It means that
you can start using it today.

The YAML Symfony Component

Released under the MIT license, the YAML Symfony Component can be used in any
application, even commercial ones.

When I created this YAML library for PHP, I had several goals in mind:

Ease of use: Installation should be easy and fast. Install it via PEAR,
download an archive, or check out the SVN or Git repository, and you are
ready to go. No configuration. Drop the files in a directory and start
using it right away.

Fast: One of the main goals of Symfony YAML was to find the right balance
between speed and features.

Unit tested: The library is unit-tested (with more than 400 unit tests
as of today).

"Real" Parser: To correctly handle a large subset of the YAML
specification, a dedicated parser has been written by hand. The
parser is robust, easy to understand, and simple enough to extend.

Clear error messages: Whenever you have a syntax problem with your YAML
files, the library should output helpful messages with the filename and
the line number where the problem occurred. It eases debugging a lot.

And of course, YAML being not so well-known in the PHP world, the YAML
component also comes with full
documentation.

The easiest way to
install the Symfony
YAML Component is probably to use the PEAR installer:

Using YAML in your Projects

The Symfony YAML library consists of two main classes: one to parse YAML
strings, and the other to dump a PHP variable to a YAML string. On top of
these two core classes, the main sfYaml class acts as a thin wrapper and
simplifies common uses:
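To illustrate that design (two core classes behind a thin static wrapper), here is a toy sketch. The `ToyYamlParser`, `ToyYamlDumper` and `ToyYaml` class names, the flat `key: value` grammar, and the wording of the error message are all assumptions made for this example; it is not the component's source code, and it handles only a tiny fraction of YAML.

```php
<?php
// Toy sketch only: hypothetical classes that handle just flat
// "key: value" mappings. This is NOT the Symfony YAML component's code.

class ToyYamlParser
{
    // Parse a flat YAML mapping into an associative array of strings.
    public function parse($yaml)
    {
        $data = array();
        foreach (explode("\n", $yaml) as $index => $line) {
            $line = rtrim($line, "\r");
            // Skip blank lines and comment lines.
            if (trim($line) === '' || strpos(ltrim($line), '#') === 0) {
                continue;
            }
            // Anything that is not "key: value" is reported with its line number.
            if (!preg_match('/^([^:\s][^:]*):\s+(.*)$/', $line, $match)) {
                throw new InvalidArgumentException(
                    sprintf('Unable to parse line %d (%s).', $index + 1, $line)
                );
            }
            $data[trim($match[1])] = trim($match[2]);
        }

        return $data;
    }
}

class ToyYamlDumper
{
    // Dump a flat associative array as "key: value" lines.
    public function dump(array $data)
    {
        $yaml = '';
        foreach ($data as $key => $value) {
            $yaml .= sprintf("%s: %s\n", $key, $value);
        }

        return $yaml;
    }
}

// Thin wrapper: one static call for each common use.
class ToyYaml
{
    public static function load($yaml)
    {
        $parser = new ToyYamlParser();

        return $parser->parse($yaml);
    }

    public static function dump(array $data)
    {
        $dumper = new ToyYamlDumper();

        return $dumper->dump($data);
    }
}

$config = ToyYaml::load("key: value\nfoo: bar\n");
echo ToyYaml::dump($config); // prints the same two lines back

try {
    ToyYaml::load("key: value\nbroken line\n");
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n"; // prints "Unable to parse line 2 (broken line)."
}
```

Keeping the parser and the dumper separate lets each be used on its own, while the wrapper covers the common case in a single call.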

YAML for PHP 5.3

The previous sections use the PHP 5.2 compatible version of the library. If
you have already switched to PHP 5.3, the good news is that the YAML
Component is already available for that version too. For now, it is only
available in the Symfony 2 Subversion repository:

The YAML Symfony Component is already used by and bundled with many popular
open-source PHP projects like symfony, Doctrine, and PHPUnit. Other projects,
like the upcoming Okapi2 framework and the
mootools plugins repository,
announced
some days ago, make heavy use of YAML and also use the YAML Symfony
Component.

Next time you look for a flexible means to store or share data, consider using
YAML!

Discussion

simo — December 21, 2009 09:39 #1

The Yaml component is great! But I really think the ability to convert from xml to yml and vice-versa (as the symfony framework does) is missing. IMHO, a native implementation would make it more powerful and useful.

Is there a plan to go through that feature? thanks for your reply

Florian Mueller — December 21, 2009 10:09 #2

Hi,

Wouldn't it be better to return the contents as an iterator (maybe php 5.3 iterators or your own) and then eg provide functionality to determine line number and column of entry:

@simo: You can have create a generic converter from XML to YAML or vice-versa, because the semantics are quite different. In symfony, we support both YAML and XML, but the conversion is hand-crafted for each feature.

romanb — December 21, 2009 11:14 #4

I prefer XML over YAML any day because I get automatic validation + intellisense + code completion against a DTD/XSD by any decent XML editor. Compared to that, working with YAML files that you're not yet familiar with can be a real pain.

Or did I miss something and there is a way to describe the valid structure of a YAML document (an equivalent to a DTD/XSD?) which can then be used by tools/IDEs to validate the document as you type and give inline help+intellisense etc.?

But whomever is driving the YAML spec seems to put those desirable features aside, and added a load more complexity.

Robin — December 21, 2009 17:56 #7

YAML can be great for config files and such, although I agree with Romanb that it can be a problem that there is no standard schema definition for YAML yet.
But YAML should not be used on unreliable data streams; it is possible to describe data without an end delimiter. Therefore the parser cannot be sure it has read all the data. XML, for example, uses an end element for that.
Anyway, great to have a choice :)

eswar — December 21, 2009 18:11 #8

hi

sapphirecat — December 21, 2009 19:49 #9

@Fabien: "You can have create a generic converter from XML to YAML..."

Perhaps that would be better said as "You cannot easily create a generic converter from XML to YAML..."

I tried it once, and the specific problem I ran into was that XML provides more dimensions of data than YAML, because XML tags _span_ document text. Elements and attributes can be fairly easily translated to YAML, but handling anything else seemed to require making the generated YAML look like the DOM API (list of children, node types, etc.). Otherwise, I couldn't losslessly preserve e.g. a paragraph with some emphasized words.

simo — December 21, 2009 23:15 #10

@Fabien > I thought the conversion was done by a generic and smart class! Pity, it's not that easy! it makes sense to switch from one to an other.

@sapphirecat > thanks for decoding ;-))

Fabian Spillner — December 22, 2009 22:42 #11

Thank you for sharing the sfYaml component! It's so powerful and useful, it works great, and I couldn't imagine a project without Yaml configuration files. Thank you again!

Jeff Dickey — December 23, 2009 08:22 #12

I've been using YAML for a while now (see, for example, this recent blog entry at http://archlever.blogspot.com/2009/11/reuse-renew-recycle-data-structures.html). It's now my default format for configuration files and structured persistence in general, after nearly ten years working with XML. XML has better tools, as people have noted, but the only tool you really need for YAML 1.1 is a text editor; if your YAML is getting too unwieldy, that's an indication that you may want to refactor or redesign your code that uses it.

And as far as comparisons with JSON go.... you can express JSON /in/ YAML; there's at least a couple of people out there who've written code to do that. To me, that's akin to replacing a flat screwdriver with a Phillips; don't use either on any "nails".

Olivier El Mekki — December 26, 2009 01:00 #13

I like YAML as well as JSON, but I never conceived them as similar.

Maybe their designs are alike, but I wouldn't try to use json for configuration files (not expressive enough), just as I wouldn't try to send YAML back to an ajax request (json is built into the javascript layer of many browsers). They each happen to have a different purpose.