Introduce jsonb, a structured format for storing json.
The new format accepts exactly the same data as the json type. However, it is
stored in a format that does not require reparsing the original text in order
to process it, making it much more suitable for indexing and other operations.
Insignificant whitespace is discarded, and the order of object keys is not
preserved. Neither are duplicate object keys kept - the later value for a given
key is the only one stored.
The new type has all the functions and operators that the json type has,
with the exception of the json generation functions (to_json, json_agg etc.)
and with identical semantics. In addition, there are operator classes for
hash and btree indexing, and two classes for GIN indexing, that have no
equivalent in the json type.
This feature grew out of previous work by Oleg Bartunov and Teodor Sigaev, which
was intended to provide similar facilities to a nested hstore type, but which
in the end proved to have some significant compatibility issues.
Authors: Oleg Bartunov, Teodor Sigaev, Peter Geoghegan and Andrew Dunstan.
Review: Andres Freund

Once it was committed, it got covered pretty well elsewhere, but I decided to write about it too, with some examples.

First, let's see how it works.

I'll start with some test values:

{"a":"abc","d":"def","z":[1,2,3]}

{"a":"abc","d";"def","z":[1x2,3]}

{"a":"abc","d":"def","z":[1,2,3]}

and

{"a":"abc","d":"def","z":[1,2,3],"d":"overwritten"}

Now, let's see what happens when these are cast to json and jsonb:

In both cases the invalid value was correctly rejected with an error, but in the jsonb case the message said
“invalid input syntax for type json”. It's probably due to cast order, and
should be OK normally. And anyway, JSON and JSONB are similar enough for this not to
cause problems.
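The accept/reject behaviour for the four test values can be reproduced outside the database. This is only a minimal sketch, assuming Python's standard json module as a stand-in parser; it is not PostgreSQL's parser, and its error messages are different. The helper name is_valid_json is made up for the illustration.

```python
import json

# The four test values from above; the second one is deliberately broken.
TEST_VALUES = [
    '{"a":"abc","d":"def","z":[1,2,3]}',
    '{"a":"abc","d";"def","z":[1x2,3]}',
    '{"a":"abc","d":"def","z":[1,2,3]}',
    '{"a":"abc","d":"def","z":[1,2,3],"d":"overwritten"}',
]


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise (hypothetical helper)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# One flag per test value, in the same order as TEST_VALUES.
results = [is_valid_json(value) for value in TEST_VALUES]
print(results)
```

Only the second value is rejected. The value with the duplicated "d" key is still valid JSON; what happens to the duplicate is a storage question, not a syntax one.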

Here we see the whitespace removal. I'd say it's pretty cool. Of course it isn't if (for whatever reason) you want to retain the spaces, but they shouldn't be meaningful, so depending on them being there doesn't sound wise.
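The normalization described in the commit message (whitespace dropped, key order not preserved, last duplicate key wins) can be imitated roughly in Python. This is a sketch under assumptions, not what the server does internally: jsonb_like is a made-up name, the whitespace-heavy input is my own example, and numbers go through Python's int/float rather than PostgreSQL's numeric. The key ordering (shorter keys first, then bytewise) mirrors how jsonb sorts object keys.

```python
import json


def jsonb_like(text):
    """Approximate the text that jsonb would show for a JSON input (hypothetical helper)."""

    def normalize(value):
        if isinstance(value, dict):
            # jsonb orders object keys by length first, then bytewise.
            keys = sorted(
                value,
                key=lambda k: (len(k.encode("utf-8")), k.encode("utf-8")),
            )
            return {k: normalize(value[k]) for k in keys}
        if isinstance(value, list):
            # Array order is significant and is kept as is.
            return [normalize(item) for item in value]
        return value

    # json.loads ignores insignificant whitespace and keeps the last
    # value when an object key appears more than once.
    return json.dumps(normalize(json.loads(text)))


print(jsonb_like('{ "a" :   "abc",\n  "d":"def",  "z": [1,   2,3] }'))
print(jsonb_like('{"a":"abc","d":"def","z":[1,2,3],"d":"overwritten"}'))
print(jsonb_like('{"bb":1,"a":2}'))
```

The plain json type, by contrast, stores the input text as given, so whitespace, key order and duplicate keys all survive there.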

The nesting restriction referred to is really just that you cannot query against a “sub-document” extracted using the -> operator, for example, unless you have an appropriate expression index. This might be a particular problem if you were checking existence in your predicate (the ‘?’ operator), because that simply doesn’t work past the first nesting level. Those are the semantics, and it would be hard to make “nested existence” checking work with the current approach to GIN indexing; it mostly isn’t too compelling anyway. There may be some minor advantages to testing “existence” rather than “containment”. Perhaps even the term “containment” is misleading, since it’s a kind of nested containment, but that’s a holdover from hstore.
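The difference between nested containment and top-level-only existence is easier to see in a small model. The following is a rough Python sketch of the semantics only, operating on already-parsed values; the function names and the sample document (including its "n" key) are made up, it says nothing about how GIN indexes evaluate these operators, and it leaves out the special case where an array is allowed to contain a bare primitive.

```python
import json


def _scalar_equal(a, b):
    # bool is a subclass of int in Python, so keep true/false apart from 1/0.
    if isinstance(a, bool) or isinstance(b, bool):
        return isinstance(a, bool) and isinstance(b, bool) and a == b
    return a == b


def contains(left, right):
    """Rough model of containment: does left contain right, at any depth?"""
    if isinstance(left, dict) and isinstance(right, dict):
        # Every key of right must be present in left with a contained value.
        return all(k in left and contains(left[k], v) for k, v in right.items())
    if isinstance(left, list) and isinstance(right, list):
        # Every element of right must be contained in some element of left.
        return all(any(contains(l, r) for l in left) for r in right)
    if isinstance(left, (dict, list)) or isinstance(right, (dict, list)):
        # Mismatched structure (simplification: no array/primitive exception).
        return False
    return _scalar_equal(left, right)


def exists(doc, key):
    """Rough model of existence: is key a top-level key or string element?"""
    if isinstance(doc, dict):
        return key in doc
    if isinstance(doc, list):
        return any(isinstance(item, str) and item == key for item in doc)
    return isinstance(doc, str) and doc == key


# Hypothetical document: the first test value plus a nested object "n".
doc = json.loads('{"a":"abc","d":"def","z":[1,2,3],"n":{"k":"v"}}')

print(contains(doc, {"z": [3, 1]}))      # containment looks inside arrays
print(contains(doc, {"n": {"k": "v"}}))  # and inside nested objects
print(exists(doc, "n"))                  # top-level key
print(exists(doc, "k"))                  # nested key is not seen
print(exists(doc["n"], "k"))             # seen once the sub-document is extracted
```

The last two calls are the restriction in question: existence only finds "k" after the sub-document has been pulled out, which on the database side is the step that needs its own expression index.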

We want to discourage people from indexing everything, which is more or less what a straight GIN index on a jsonb column represents. People should continue to make informed decisions about what to index in a way consistent with actual querying patterns, since presumably in general it isn’t all that useful to be able to use an index scan to find *any* one thing.

Having said all that, if you’re using a jsonb_hash_ops GIN index (the operator class was renamed jsonb_path_ops before the 9.4 release), this approach actually works surprisingly well.

No doubt JSONB will be useful to web developers, but as a regular document store it seems useless or annoying to everyone else. One of the attractions of PostgreSQL is its comprehensive data type support, including user-defined data types. If dates and other types have to be cast into strings or integers to be stored, then they have to be cast back in every query that reads them. Besides being very tedious to code around, this negates any benefit of binary storage.
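The casting round trip being complained about looks roughly like this on the client side. It is a small Python sketch with a made-up record and date, using only the standard library; the point is just that JSON has no date type, so the conversion has to be done by hand in both directions.

```python
import json
from datetime import date

# Hypothetical record with a value that JSON cannot represent natively.
record = {"name": "example", "created": date(2014, 3, 23)}

# Writing: the date has to be turned into a string first.
stored = json.dumps({**record, "created": record["created"].isoformat()})

# Reading: the consumer has to know which strings are really dates
# and cast them back explicitly.
loaded = json.loads(stored)
loaded["created"] = date.fromisoformat(loaded["created"])

print(stored)
print(loaded == record)
```

Nothing in the stored document marks "created" as a date, so every reader has to carry that knowledge separately.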

I suppose if one wants a hierarchical data store, the xml type may be preferable as XML can be used in various ways to deal with rich data types. But that is still a pain. I would rather have something like hstore2, but with all data types supported. That way, the I/O through the database driver is just like any column in the database.