JSON Syntax Options

This page is used by the RDF WG to collect different approaches to enabling the key features of RDF in JSON.

URI Properties

RDF uses URIs to name things, including properties. A key benefit of this is that it allows different data sources to all use properties defined in open vocabularies, thus enabling shared understanding of data.

JSON, on the other hand, is typically used for domain-specific, siloed information where properties are simple lexical terms (like "name") and what the property "means" is documented out of band, for instance in API documentation or in a JSON Schema document.

What follows is a collection of approaches that enable the use of URI-identified properties in JSON.

Full URIs

{ "http://xmlns.com/foaf/0.1/name": "Bob" }

Benefits:

Unambiguous and easy to process.

When following your nose around the web, property equivalence can be tested directly on the URI found in the serialization.

Drawbacks:

Increased bytesize over the wire.

Can be verbose to use when using the returned (JSON.parsed) data without an API or tooling.

Verbose to author.

Example usage (assuming the returned data has been JSON.parsed):

obj["http://xmlns.com/foaf/0.1/name"]
obj[ foaf('name') ] // when using a tabulator ns style approach in your code
obj[ resolve('foaf:name') ] // when using a function which allows the resolution of CURIEs as found in the RDF API
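The foaf and resolve helpers used above are not defined on this page. The following is only a sketch of what they might look like, assuming a hypothetical ns factory and a hand-written prefixes table; neither is taken from Tabulator or from the RDF API draft.

```javascript
// Hypothetical helper: build a function that expands local names in a namespace.
function ns(base) {
  return function (localName) {
    return base + localName;
  };
}

// Assumed prefix table; a real application would supply its own.
const prefixes = {
  foaf: "http://xmlns.com/foaf/0.1/",
  rdfs: "http://www.w3.org/2000/01/rdf-schema#"
};

// Hypothetical CURIE resolver: "foaf:name" -> full URI.
// Throws on an unknown prefix rather than guessing.
function resolve(curie) {
  const i = curie.indexOf(":");
  const prefix = curie.slice(0, i);
  if (i < 0 || !Object.prototype.hasOwnProperty.call(prefixes, prefix)) {
    throw new Error("Unknown prefix in CURIE: " + curie);
  }
  return prefixes[prefix] + curie.slice(i + 1);
}

const foaf = ns(prefixes.foaf);
const obj = JSON.parse('{ "http://xmlns.com/foaf/0.1/name": "Bob" }');

console.log(obj[foaf("name")]);         // "Bob"
console.log(obj[resolve("foaf:name")]); // "Bob"
```

Either helper keeps the full URI in the data while hiding most of the verbosity from the code that reads it.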

non-colon names only: Easy to use when using the returned (JSON.parsed) data without an API or tooling.

Drawbacks:

Requires tooling to normalize TERMs prior to using the data when following your nose around the web.

Requires TERM resolution to do property comparison (equivalence must be between URIs not TERMs)

Unreliable when following your nose around the web (the same URI could be shortened to "foo" or "bar")
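To illustrate the normalization these drawbacks refer to, here is a minimal sketch. It assumes each source publishes some map from its TERMs to URIs; the sourceA and sourceB maps and the normalize function are hypothetical names made up for this example.

```javascript
// Hypothetical per-source term maps: each publisher shortens URIs its own way.
const sourceA = { name: "http://xmlns.com/foaf/0.1/name" };
const sourceB = { fullName: "http://xmlns.com/foaf/0.1/name" };

// Rewrite every key of a parsed object to its full URI.
// Keys with no mapping are kept as-is so no data is silently dropped.
function normalize(data, termMap) {
  const out = {};
  for (const key of Object.keys(data)) {
    const mapped = Object.prototype.hasOwnProperty.call(termMap, key);
    out[mapped ? termMap[key] : key] = data[key];
  }
  return out;
}

const a = normalize(JSON.parse('{ "name": "Bob" }'), sourceA);
const b = normalize(JSON.parse('{ "fullName": "Bob" }'), sourceB);

// The TERMs differ, but after normalization both objects use the same URI.
console.log(Object.keys(a)[0] === Object.keys(b)[0]); // true
```

Comparing "name" with "fullName" directly would report different properties, which is why equivalence has to be tested on the URIs.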

with colon names only: Verbose to use when using the returned (JSON.parsed) data without an API or tooling.

Example usage (assuming the returned data has been JSON.parsed):

obj.name; // non-colon - but ONLY when you are familiar with the data and NOT when following your nose
obj["rdfs:label"]; // with colon - but ONLY when you are familiar with the data and NOT when following your nose

Note:
This may look wonderful, but it comes with a one-vocab caveat: when publishers need terms from several vocabularies, they are likely to create "proxy" vocabularies that simply pull together many terms from different vocabularies and merge them. That carries a processing and understanding cost which can't be stepped into lightly.

Benefits:

Minimal bytesize over the wire

Familiar to traditional JSON users

Familiar to RDFa users

Easy to author

Unambiguous and easy to process.

Easy to use when using the returned (JSON.parsed) data without an API or tooling.

note: the additional types would need to be quoted like strings in order to keep JSON compatibility, e.g. "http://example.org/" rather than the same without quotes.

Benefits:

Potentially requires no special processing of data

Familiar to most users

Simple

Enough to cover most common use cases.

Drawbacks:

No way to use other common or custom datatypes

Property Range from Vocab

Either of the Limited Expressibility options could be augmented with type hinting from the range of the property being used.

Benefits:

Potentially requires no special processing of data (when not following your nose)

Familiar to most users

Simple

Enough to cover most common use cases.

Allows expression of common or custom datatypes

Drawbacks:

Potentially requires understanding of properties when following your nose & tooling to do so. (nathan: is this a drawback??)

Map the property to a datatype

Either of the Limited Expressibility options could be augmented with type hinting on the property. The hint could be included in the serialization, or kept in an external map as with the External Maps option for URIs.
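A rough sketch of how a property-to-datatype map might be applied to parsed data. The example.org properties, the "IRI" token and the converters table are all assumptions made for illustration, not part of any proposal on this page.

```javascript
// Hypothetical map from property to datatype; it could be embedded in the
// serialization or fetched separately. The property names are made up.
const datatypes = {
  "http://example.org/born": "http://www.w3.org/2001/XMLSchema#date",
  "http://example.org/homepage": "IRI"
};

// Assumed converters keyed by datatype.
const converters = {
  "http://www.w3.org/2001/XMLSchema#date": (v) => new Date(v),
  "IRI": (v) => new URL(v)
};

// Convert values whose property has a known datatype; anything else is
// passed through as the raw JSON value.
function applyDatatypes(data, map) {
  const out = {};
  for (const [prop, value] of Object.entries(data)) {
    const convert = converters[map[prop]];
    out[prop] = convert ? convert(value) : value;
  }
  return out;
}

const typed = applyDatatypes(
  JSON.parse('{ "http://example.org/born": "1967-03-01", "http://example.org/name": "Bob" }'),
  datatypes
);
console.log(typed["http://example.org/born"] instanceof Date); // true
console.log(typed["http://example.org/name"]);                 // "Bob"
```

A consumer that ignores the map still gets usable strings, which is the "no special processing when not following your nose" case.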

Benefits:

Potentially requires no special processing of data (when not following your nose)

Familiar to most users

Simple

Enough to cover most common use cases.

Allows expression of common or custom datatypes

Drawbacks:

Potentially requires understanding of properties when following your nose & new tooling to do so.

Datatypes from JSON schema

As above, what we're doing could be merged with JSON Schema. In fact, we could fully externalize and work with JSON Schema to create a single spec which covers most of the web's JSON needs, and our own RDF needs - but that's perhaps too wild for this group and out of charter.

(nathan likes this idea)

In-String TypedLiterals

This approach involves including both the data and the datatype in a single quoted string, for example "FDE3^^xsd:base64Binary"

note: the exact format of the combined string would be up for discussion. We may want to use full IRIs for datatypes, may explicitly offer a set of predefined tokens mapped to IRIs (e.g. "^int"), and may have the datatype prefixed or postfixed - many different approaches are possible.
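As a sketch of the special processing this option implies, the function below splits a combined string into value and datatype. It assumes the postfixed "^^" form shown in the example above; parseTypedLiteral is a hypothetical name.

```javascript
// Split "lexical^^datatype" on the LAST "^^" so the lexical form may itself
// contain "^^". The separator is an assumption; the exact format is open.
function parseTypedLiteral(s) {
  const i = s.lastIndexOf("^^");
  if (i < 0) {
    return { value: s, datatype: null }; // no datatype marker: plain string
  }
  return { value: s.slice(0, i), datatype: s.slice(i + 2) };
}

const lit = parseTypedLiteral("FDE3^^xsd:base64Binary");
console.log(lit.value);    // "FDE3"
console.log(lit.datatype); // "xsd:base64Binary"
```

Note that an ordinary string which happens to contain "^^" would be misread as typed, so every string value has to pass through a parser like this and publishers would need an escaping rule.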

Benefits:

Can express all common and custom datatypes

Drawbacks:

Always requires special processing

Unfamiliar to most typical JSON users

Verbose

What to do when you don't understand a datatype?

Paired Values - value/datatype

Using either the object or array syntax from JSON, we could specify typed literals like such:
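For illustration only, the two forms might look like the following. The "_value" key mirrors the value/language example further down this page; the "_datatype" key and the [value, datatype] ordering of the array form are assumptions, as is the readTyped helper.

```javascript
// Object form, with hypothetical "_value" / "_datatype" keys.
const asObject = JSON.parse(
  '{ "property": { "_value": "FDE3", "_datatype": "xsd:base64Binary" } }'
);

// Array form, assuming the order [value, datatype].
const asArray = JSON.parse('{ "property": ["FDE3", "xsd:base64Binary"] }');

// Read either form into one shape.
function readTyped(v) {
  if (Array.isArray(v)) {
    return { value: v[0], datatype: v[1] };
  }
  if (v !== null && typeof v === "object") {
    return { value: v._value, datatype: v._datatype };
  }
  return { value: v, datatype: null }; // bare JSON value, no datatype given
}

console.log(readTyped(asObject.property).datatype); // "xsd:base64Binary"
console.log(readTyped(asArray.property).value);     // "FDE3"
```

The array form is shorter, but it collides with the natural use of arrays for multiple values of one property, so a real design would have to pick one meaning.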

Languages

RDF currently supports specifying the language of strings (for example English or Dutch) on plain literals. How this is written is serialization specific, with RDFa delegating to the lang/xml:lang attributes and Turtle taking the "Bob"@en approach.

JSON currently has no support for specifying the language of strings.

No Language

It's an option. JSON natively supports Unicode, so strings like "花澄" are perfectly acceptable, and JSON is used effectively throughout the web without requiring a language tag. Furthermore, text often mixes several languages, in which case it is not clear which language tag to use. For example:

彭博社:2987名人大代表中70名最富的人资产总值为4931亿人民币，约751亿美元！The richest 70 of the 2,987 members have a combined wealth of 493.1 billion yuan ($75.1 billion)

Property Specifies Language

This option would involve language specific properties being created in vocabs, for example "rdfs:label-en" and "rdfs:label-ja".

Not saying much about this one as it's a huge change to RDF and quite possibly entirely impractical from almost every angle. But, it is an option.

Property Modifiers

This option involves adding a language hint to the property, as serialization sugar only, for example:

{ "label@en": "London" }
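A minimal sketch of the processing a consumer would need for keys like the one above, assuming "@" separates the property from the language. splitKey and groupByLanguage are hypothetical helper names.

```javascript
// Split a key such as "label@en" into property and language.
// Keys without "@" are treated as having no language.
function splitKey(key) {
  const i = key.lastIndexOf("@");
  return i < 0
    ? { property: key, language: null }
    : { property: key.slice(0, i), language: key.slice(i + 1) };
}

// Group values by property, then by language ("" when none is given).
function groupByLanguage(data) {
  const out = Object.create(null);
  for (const [key, value] of Object.entries(data)) {
    const { property, language } = splitKey(key);
    if (!out[property]) out[property] = Object.create(null);
    out[property][language || ""] = value;
  }
  return out;
}

const labels = groupByLanguage(
  JSON.parse('{ "label@en": "London", "label@fr": "Londres" }')
);
console.log(labels.label.en); // "London"
console.log(labels.label.fr); // "Londres"
```

Without such a step, "label@en" and "label@fr" are simply two unrelated keys to plain JSON code.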

Benefits:

Can express languages

Potentially lighter to process than "in-string language"

Potentially smaller bytesize on the wire than both of the paired values options (and allows repetition)

Drawbacks:

Always requires special processing to use the data

Unfamiliar to most typical JSON users

Can be verbose when working with data in many languages (requires a min of one property value pair per language)

In-String Language

This approach involves including both the data and the language in a single quoted string, for example "花澄@ja"

note: the exact format of the combined string would be up for discussion. We may want to use IRIs for languages, may explicitly offer a set of predefined tokens (e.g. "@en"), and may have the language prefixed ("ja@花澄") or postfixed ("花澄@ja") - many different approaches are possible.
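The sketch below shows one way a consumer might split such a string, assuming the postfixed "text@lang" form. parseLangString is a hypothetical name, and the tag check is a rough heuristic rather than full language tag validation.

```javascript
// Assumes the postfixed form "text@lang". A string that merely contains "@"
// is only treated as tagged if the suffix looks like a simple language tag.
function parseLangString(s) {
  const i = s.lastIndexOf("@");
  const tag = i < 0 ? "" : s.slice(i + 1);
  if (i < 0 || !/^[A-Za-z]{2,8}(-[A-Za-z0-9]{1,8})*$/.test(tag)) {
    return { value: s, language: null };
  }
  return { value: s.slice(0, i), language: tag };
}

const tagged = parseLangString("花澄@ja");
console.log(tagged.value);    // "花澄"
console.log(tagged.language); // "ja"

// The suffix "example.org" is not tag-shaped, so this stays a plain string.
const email = parseLangString("bob@example.org");
console.log(email.language);  // null
```

The heuristic still misreads a value like "bob@example", which is one reason this option always requires special processing and an agreed escaping rule.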

Benefits:

Can express languages

Drawbacks:

Always requires special processing (including tracking back over parsed data)

Unfamiliar to most typical JSON users

Paired Values - value/language

Using either the object or array syntax from JSON, we could specify plain literals with languages as such:

{ "property": {
    "_value": "花澄",
    "_language": "ja"
  }
}
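A sketch of reading such a paired value from JSON.parsed data. The textFor helper is hypothetical, and allowing an array of pairs (one per language) is an assumption that goes beyond the single-pair example above.

```javascript
// Note: no trailing comma after the last member; JSON.parse rejects one.
const data = JSON.parse('{ "property": { "_value": "花澄", "_language": "ja" } }');

// Return the text of a property for a wanted language, accepting a bare
// string, one paired value, or an array of either.
function textFor(value, lang) {
  const items = Array.isArray(value) ? value : [value];
  for (const item of items) {
    if (typeof item === "string") {
      if (lang === null) return item; // untagged string
    } else if (item && item._language === lang) {
      return item._value;
    }
  }
  return undefined; // nothing in the requested language
}

console.log(textFor(data.property, "ja")); // "花澄"
console.log(textFor(data.property, "en")); // undefined
```

Code that expects obj.property to be a string has to change to go through a helper like this, which is the "always requires special processing" drawback.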

Benefits:

Can express languages

Lighter to process than "in-string language"

Drawbacks:

Always requires special processing to use the data

Unfamiliar to most typical JSON users

Verbose

Paired Values - language arcs

Using either the object or array syntax from JSON, we could specify plain literals with languages as such:

Benefits:

Smaller bytesize on the wire than the other paired values option (allows repetition)

Drawbacks:

Always requires special processing to use the data

Unfamiliar to most typical JSON users

Externalized Languages / Language Mapped to Property

Can't think of a decent, reliable way to do this? Maybe somebody can.

Syntax Structure

RDF is very flexible syntax-wise, because it is a graph based data model (nodes and edges), and can be expressed in any number of ways, from a set of triples, through to key/value objects with a subject assigned.

JSON is typically used to express simple key/value objects, plain old data objects.

RDF in JSON can therefore be assembled as key/value objects with a subject assigned, in an N-Triples-like manner (a big list of triples), or anywhere in between, as with Turtle.

note: the benefits and disadvantages run much deeper than the simple ones mentioned in this section, as each option in this document has its own set of trade-offs. Some primary ones are listed here which are specific to the general syntax option.

Triples

This option involves specifying the serialization to be a simple set of s,p,o triples, an example may be:

and so forth, perhaps adopting some of the various options outlined in this document in the process.
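Purely as an illustration of a triples-style serialization, the sketch below uses an array of [subject, predicate, object] arrays. This shape, the example.org subject and the toObjects helper are assumptions made for this example, not a proposal from this page.

```javascript
// Hypothetical serialization: an array of [subject, predicate, object] arrays.
// The subject URI is made up for illustration.
const triples = JSON.parse(`[
  ["http://example.org/bob", "http://xmlns.com/foaf/0.1/name", "Bob"],
  ["http://example.org/bob", "http://xmlns.com/foaf/0.1/age", 44]
]`);

// Group triples by subject to get back to key/value objects.
// Repeated predicates are collected into arrays.
function toObjects(list) {
  const bySubject = {};
  for (const [s, p, o] of list) {
    const node = bySubject[s] || (bySubject[s] = {});
    if (Object.prototype.hasOwnProperty.call(node, p)) {
      node[p] = [].concat(node[p], o);
    } else {
      node[p] = o;
    }
  }
  return bySubject;
}

const bob = toObjects(triples)["http://example.org/bob"];
console.log(bob["http://xmlns.com/foaf/0.1/name"]); // "Bob"
console.log(bob["http://xmlns.com/foaf/0.1/age"]);  // 44
```

The grouping step hints at why direct use of the parsed data is awkward: finding one property of one subject means scanning or indexing the whole list first.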

Benefits:

Reduced bytesize over the wire

Drawbacks (depending how close to "Objects" you get):

Requires RDF Tooling to use for most practical purposes

Unfriendly for typical JSON Developers (and anybody working with the JSON.parsed data directly)

It's not triples, and it's not objects

Distinguishing Features:

approach is unconstrained and every option is viable, including multiple option combinations (multiple ways to state a property for example).

note: there may be more benefits, feel free to add, the original author of this document (nathan) can't see any though, to him this is just unfriendly turtle.

Objects

This approach starts with typical plain old simple objects as found in most JSON in the wild, then focusses on keeping it as close to the JSON data that's in the wild as possible, and allowing data from specific sources to be consumed without the use of RDF tooling. This lends more to a mapping based approach.

Example starting point:

{
"id": 1237642,
"name": "Bob",
"age": 44
}

A typical approach would be to start with a simple key/value object, then layer on subjects and additional datatypes (like IRIs and dates).
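One way to picture the layering is as a mapping kept outside the data, as in the sketch below. The mapping structure, the example.org base URI and the toTriples helper are hypothetical; the object is the starting point shown above.

```javascript
const plain = JSON.parse('{ "id": 1237642, "name": "Bob", "age": 44 }');

// Hypothetical mapping supplied out of band: which key identifies the
// subject, the base for minting its URI, and key -> property URI.
const mapping = {
  subjectKey: "id",
  subjectBase: "http://example.org/people/",
  properties: {
    name: "http://xmlns.com/foaf/0.1/name",
    age: "http://xmlns.com/foaf/0.1/age"
  }
};

// Produce [s, p, o] triples; keys with no mapping are skipped.
function toTriples(obj, map) {
  const subject = map.subjectBase + obj[map.subjectKey];
  const out = [];
  for (const [key, value] of Object.entries(obj)) {
    const mapped = Object.prototype.hasOwnProperty.call(map.properties, key);
    if (key !== map.subjectKey && mapped) {
      out.push([subject, map.properties[key], value]);
    }
  }
  return out;
}

const result = toTriples(plain, mapping);
console.log(result.length); // 2
console.log(result[0][0]);  // "http://example.org/people/1237642"
```

The published JSON stays untouched, so existing consumers keep using obj.name while RDF-aware consumers apply the mapping.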

Benefits:

Simple for developers to work with when using JSON.parsed data / not nose-following

Minimal bytesize over the wire

Familiar to most users

Easy to publish without requiring a full RDF tooling or tech stack changes

Potentially allows bootstrapping of many web 2.0 data sources.

Drawbacks:

Requires RDF Tooling to use when following your nose around the web

Takes more processing when working with the data like RDF than the Triples approach

Distinguishing Features:

The approach is constrained so that the end result is as close as possible to existing typical JSON usage, that is, to simple objects.

Summary

There are many different variations possible, especially when taking the "Iterative Reduction" approach.

Two points to consider from nathan:

Every follow-your-nose use case requires tooling and processing, so that point can be struck from most of the drawback sections. The only variables are "how much processing?", "how big? (bytesize)" and "can this be easily used as simple JSON.parsed data when not following your nose?".

It helps to have use cases, requirements and constraints when creating things. Both the "triples" and "objects" approaches have clear requirements and end goals; the "iterative reduction" option, on the other hand...