JSON implementation for Ruby

Description

This is an implementation of the JSON specification according to RFC 4627
(www.ietf.org/rfc/rfc4627.txt). Starting from version 1.0.0 there will be
two variants available:

A pure Ruby variant that relies on the iconv and the stringscan
extensions, which are both part of the Ruby standard library.

The considerably faster C extension variant, which is partly implemented
in C and comes with its own Unicode conversion functions and a parser
generated by the ragel state machine compiler
(www.cs.queensu.ca/~thurston/ragel).

Both variants of the JSON generator generate UTF-8 character sequences by
default. If an :ascii_only option with a true value is given, they escape
all non-ASCII and control characters with \uXXXX escape sequences, and
use UTF-16 surrogate pairs in order to be able to generate the whole
range of Unicode code points.
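As a rough illustration of the :ascii_only option, the following sketch generates the same data twice. The sample strings are made up for this example, and the exact letter case of the hex digits in the escapes may differ between versions.

```ruby
require 'json'

# U+00E9 is a non-ASCII character inside the Basic Multilingual Plane;
# U+1D11E (musical G clef) lies outside it and needs a surrogate pair.
data = ["caf\u00e9", "\u{1D11E}"]

# Default: the generator emits the UTF-8 characters unchanged.
utf8_doc = JSON.generate(data)

# With :ascii_only, non-ASCII characters become \uXXXX escapes, and
# U+1D11E is written as two escapes (a UTF-16 surrogate pair).
ascii_doc = JSON.generate(data, :ascii_only => true)

puts utf8_doc
puts ascii_doc
```

Both documents parse back to the same Ruby data, so the option only changes how the document is written, not what it contains.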

All strings that are to be encoded as JSON strings should be UTF-8 byte
sequences on the Ruby side. To encode raw binary strings that aren't
UTF-8 encoded, please use the to_json_raw_object method of String (which
produces an object that contains a byte array) and decode the result on
the receiving endpoint.

The JSON parsers can parse UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, and
UTF-32LE JSON documents under Ruby 1.8. Under Ruby 1.9 they take advantage
of Ruby's M17n features and can parse all documents which have the
correct String#encoding set. If a document string has ASCII-8BIT as its
encoding, the parser attempts to figure out which of the UTF encodings
listed above it uses and tries to parse it.
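The sketch below illustrates the Ruby 1.9 and later case, where each string carries its own encoding. It assumes a Ruby with String#encode and uses a made-up sample document.

```ruby
require 'json'

# The same document as UTF-8 and transcoded to UTF-16LE. Each string is
# tagged with its encoding, which is what the parser relies on.
utf8_doc  = "[\"caf\u00e9\", 1]"
utf16_doc = utf8_doc.encode('UTF-16LE')

from_utf8  = JSON.parse(utf8_doc)
from_utf16 = JSON.parse(utf16_doc)

p utf16_doc.encoding
p from_utf16
```

Both calls should yield the same Ruby data structure, because the parser takes the source encoding from the string itself.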

Installation

It's recommended to use the extension variant of JSON, because it's
faster than the pure ruby variant. If you cannot build it on your system,
you can settle for the latter.

Just type into the command line as root:

# rake install

The above command will build the extensions and install them on your
system.

# rake install_pure

or

# ruby install.rb

will just install the pure ruby implementation of JSON.

If you use Rubygems you can type

# gem install json

instead, to install the newest JSON version.

There is also a variant of the gem that contains only the pure Ruby
implementation. It can be installed with:

# gem install json_pure

Compiling the extensions yourself

If you want to build the extensions yourself, you need rake.

You can get it from rubyforge:
http://rubyforge.org/projects/rake
or just type
# gem install rake
for the installation via rubygems.

If you want to create the parser.c file from its parser.rl file or draw
nice graphviz images of the state machines, you need ragel from: www.cs.queensu.ca/~thurston/ragel

Usage

To use JSON you can

require 'json'

to load the installed variant (either the extension 'json' or the
pure variant 'json_pure'). If you have installed the extension
variant, you can pick either the extension variant or the pure variant by
typing

require 'json/ext'

or

require 'json/pure'

Now you can parse a JSON document into a ruby data structure by calling

JSON.parse(document)

If you want to generate a JSON document from a ruby data structure call

JSON.generate(data)

You can also use the pretty_generate method (which formats the output more
verbosely and nicely) or fast_generate (which doesn't do any of the
security checks generate performs, e.g. nesting depth checks).
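To give an idea of what the nesting depth check does, here is a small sketch. It assumes that generate accepts a :max_nesting option and raises JSON::NestingError when the limit is exceeded. The limits 2 and 3 are chosen only for this example.

```ruby
require 'json'

# Three levels of nesting: array -> array -> array.
deep = [[[1]]]

# A limit of 3 is sufficient, so this call succeeds.
ok = JSON.generate(deep, :max_nesting => 3)

# A limit of 2 is exceeded, so generate raises an error instead of
# producing a document.
error = begin
  JSON.generate(deep, :max_nesting => 2)
  nil
rescue JSON::NestingError => e
  e
end

puts ok
puts error.class
```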

To create a valid JSON document you have to make sure that the output is
embedded in either a JSON array [] or a JSON object {}. The easiest way to
do this is to put your values in a Ruby Array or Hash instance.
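For instance, a single number can be wrapped like this. This is only a sketch, and the key name 'value' is made up for the example.

```ruby
require 'json'

value = 42

# Wrap the scalar so that the top-level element of the document is a
# JSON array or a JSON object.
as_array  = JSON.generate([value])
as_object = JSON.generate({ 'value' => value })

puts as_array   # [42]
puts as_object  # {"value":42}
```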

There are also the JSON and JSON[] methods which use parse on a String or
generate a JSON document from an array or hash:
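A short sketch of both forms, using a made-up one-element hash:

```ruby
require 'json'

# Given a Hash (or an Array), both forms generate a JSON document.
doc_a = JSON({ 'test' => 23 })
doc_b = JSON[{ 'test' => 23 }]

# Given a String, both forms parse it into a Ruby data structure.
data_a = JSON('{"test":23}')
data_b = JSON['{"test":23}']

puts doc_a  # {"test":23}
puts data_a['test']
```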

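To get back a Ruby data structure from a JSON document, call JSON.parse on
it. Here json is a document generated from an array that contains a range:

json = JSON.generate [1, 2, {"a"=>3.141}, false, true, nil, 4..10]
# => "[1,2,{\"a\":3.141},false,true,null,\"4..10\"]"
JSON.parse json
# => [1, 2, {"a"=>3.141}, false, true, nil, "4..10"]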

Note that the range from the original data structure is a simple string
now. The reason for this is that JSON doesn't support ranges or
arbitrary classes. In this case the json library falls back to calling
Object#to_json, which is the same as #to_s.to_json.

It's possible to add JSON serialization support to arbitrary classes by
simply implementing a more specialized version of the #to_json method. It
should return a JSON object (a hash converted to JSON with #to_json) like
this (don't forget the *a for all the arguments):
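A sketch of such a method for Range could look as follows. The 'data' key and its layout (first, last, exclude_end?) are assumptions made for this example. Only 'json_class' has a fixed meaning.

```ruby
require 'json'

class Range
  # Serialise a range as a JSON object. *a passes the generator's
  # arguments (such as its state) on to Hash#to_json.
  def to_json(*a)
    {
      'json_class' => self.class.name,            # 'Range'
      'data'       => [first, last, exclude_end?] # assumed layout
    }.to_json(*a)
  end
end

json = JSON.generate([1, 4..10])
puts json
```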

The hash key 'json_class' names the class that will be asked to
deserialise the JSON representation later. In this case it's
'Range', but any namespace of the form 'A::B' or
'::A::B' will do. All other keys are arbitrary and can be used to
store the data necessary to configure the object to be deserialised.

If the key 'json_class' is found in a JSON object, the JSON
parser checks if the given class responds to the json_create class method.
If so, it is called with the JSON object converted to a Ruby hash. So a
range can be deserialised by implementing Range.json_create like this:
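The following sketch assumes the document stores the range under a 'data' key as first, last, exclude_end?, which is an illustrative layout. It also passes :create_additions => true explicitly, on the assumption that the installed json version does not create such objects unless asked to.

```ruby
require 'json'

class Range
  # Rebuild a range from the hash the parser produced. The 'data' key
  # and its layout are assumptions made for this example.
  def self.json_create(o)
    new(*o['data'])
  end
end

document = '[1, {"json_class":"Range","data":[4,10,false]}]'
restored = JSON.parse(document, :create_additions => true)

p restored
```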

JSON.generate always creates the shortest possible string representation of
a Ruby data structure in one line. This is good for data storage or network
protocols, but not so good for humans to read. Fortunately there's also
JSON.pretty_generate, which creates a more readable output:
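As an illustration, the sketch below prints the same made-up data structure in both forms:

```ruby
require 'json'

data = { 'name' => 'json', 'tags' => ['ruby', 'parser'] }

compact = JSON.generate(data)         # everything on one line
pretty  = JSON.pretty_generate(data)  # indented, one element per line

puts compact
puts pretty
```

Both strings are valid JSON documents and parse back to the same data.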

There are also the methods Kernel#j for generate and Kernel#jj for
pretty_generate output to the console, which work analogously to core
Ruby's p and the pp library's pp methods.
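A brief sketch of both helpers, using a made-up array:

```ruby
require 'json'

data = [1, { 'a' => 2 }]

j data   # prints the compact document, much like p
jj data  # prints the pretty-printed document, much like pp
```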

The script tools/server.rb contains a small example if you want to test
how receiving a JSON object from a WEBrick server in your browser with the
JavaScript Prototype library (www.prototypejs.org) works.

Speed Comparisons

I have created some benchmark results (see the benchmarks/data-p4-3Ghz
subdir of the package) for the JSON-parser to estimate the speed up in the
C extension:

In the table above 1 is JSON::Ext::Parser, 2 is YAML.load with a YAML
compatible JSON document, 3 is JSON::Pure::Parser, and 4 is
ActiveSupport::JSON.decode. The ActiveSupport JSON decoder first converts
the input to YAML and then uses the YAML parser. The conversion seems to
slow it down so much that it is only as fast as the JSON::Pure::Parser!

If you look at the benchmark data you can see that this is mostly caused by
the frequent high outliers - the median of the Rails-parser runs is still
overall smaller than the median of the JSON::Pure::Parser runs:

In the table above 1-3 are JSON::Ext::Generator methods. 4, 6, and 7 are
JSON::Pure::Generator methods and 5 is the Rails JSON generator. It is now
a bit faster than the generator_safe and generator_pretty methods of the
pure variant but slower than the others.

To achieve the fastest JSON document output, you can use the fast_generate
method. Beware that this disables the check for circular Ruby data
structures, which may cause JSON to go into an infinite loop.
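For comparison, the following sketch shows what the regular generate method does with a circular structure. It assumes the installed version reports the problem as JSON::NestingError, and it deliberately avoids calling fast_generate.

```ruby
require 'json'

# An array that contains itself.
circular = []
circular << circular

# generate notices the runaway nesting and raises instead of looping.
result = begin
  JSON.generate(circular)
rescue JSON::NestingError => e
  e
end

puts result.class
```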