Cons

Current JSON formats are not aligned. They take different approaches: some aim to be friendly to JSON users, others to be familiar to existing RDF users.

Needs some R&D and alignment.

Risk that the resulting standard would not be adopted if it is not 'web author' friendly.

Deliverables

JSON Serialization of RDF

Questions to Contemplate

What are the use cases for the JSON serialization?

Are we to create a lightweight JSON based RDF interchange format optimized for machines and speed, or an easy to work with JSON view of RDF optimized for humans (developers)?

Is it necessary for developers to know RDF in order to use the simplest form of the RDF-in-JSON serialization?

Should we attempt to support more than just RDF? Key-value pairs as well? Literals as subjects?

Must RDF in JSON be 100% compatible with the JSON spec? Or must it only be able to be read by a JavaScript library and thus be JSON-like-but-not-compatible (and can thus deviate from the standard JSON spec)?

Must all major RDF concepts be expressible via the RDF in JSON syntax?

Should we go more for human-readability, or terse/compact/machine-friendly formats? What is the correct balance?

Should there be a migration story for the JSON that is already used heavily on the Web? For example, in REST-based services?

Should processing be a single-pass or multi-pass process? Should we support SAX-like streaming?

Should there be an API defined in order to easily map RDF-in-JSON to/from language-native formats?

RDF in JSON Use Cases

RDF REST Web Services

Frank wants to be able to easily post and get RDF data RESTfully via Web Services. He wants to make sure that the data that is exchanged looks very much like the JSON data that is passed to and from popular services like Twitter's API. He wants to utilize the current JSON-based tools and workflows that he uses for all of his other data on the Web, but add semantics to that data in a way that is easy to explain to his fellow developers.

Easy Transition to/from JSON Web Services

Stacy designed the data that is sent and received by her Web Services in a way that maps very easily to RDF. She wants to be able to take the data that she is already publishing and transform it into RDF for internal use. She wants to be able to do this without impacting the developers that are currently using her system. She also wants to be able to give the developers that care about RDF a data model that maps to RDF well. She would like to support both regular JSON developers and semantic web JSON developers at the same time via her JSON-based Web Services API.

Digital Signatures on Graphs

Graeme would like to publish assets for sale on his website via a JSON-based Web Services API. He would like this data to be cached on third party sites without the pricing information being changed or forged. He accomplishes this by digitally signing the graph of information that he publishes such that search engines and other caching mechanisms can relay the information without needing to directly access his site. By cryptographically signing the graph, he is also ensuring that information about the asset, including pricing information, cannot be changed or forged to different values.

Universal Payment Standard for the Web

The PaySwarm Web platform is an open web standard that enables Web browsers and Web devices to perform Universal Web Payment. The nascent standard is using a form of RDF in JSON extensively in order to support distributed listing of assets, description of licenses and digital contracts, and digital signatures on graphs of RDF information. Information is published via HTML+RDFa and then used in JSON-form when transmitted to and from PaySwarm-aware Web Services.

RDF in JSON Design Requirements

There should be two serialization formats

There should be a machine-friendly serialization format and there should be a human-friendly serialization format.

-1 Manu Sporny, given the limited time for this working group, I think we should focus on the human-friendly serialization format. RDF already has a number of machine-friendly serialization formats.

A primary goal SHOULD be to build a human-friendly version of the serialization for JSON developers

The serialization should be optimized for humans first, machines second. The ability for machines to quickly parse the file is secondary to the ability for developers to be able to use the serialization with JavaScript. A focus should be placed on making the serialization fit into JavaScript frameworks easily, even at the cost of JSON-LD processor implementation complexity.

-1 Lee. Given the existing work in the RDFa group on an API, I'd rather see a simple, machine-friendly format that implementations can then make available via an API. I'm not convinced that a standard human-friendly JSON format is a big win.

A primary goal SHOULD be to build a machine-optimized version of the serialization

The serialization should be optimized for machines first, humans second. The ability to use the serialization in JavaScript is secondary to the ability for machines to quickly parse the file. A focus should be placed on making implementations very easy to write.

The serialization SHOULD be able to transform most JSON in use today into RDF

There should be a flexible mechanism, such as a "context", that is capable of mapping from JSON key-value pairs to RDF triples. This mechanism could be specified either in-band or out-of-band from the serialization. Having this feature could map much of the existing JSON in the wild into RDF.
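
As a minimal sketch of what such a "context" might look like, the Python below maps plain JSON keys to predicate IRIs and emits one triple per mapped key. The `CONTEXT` dictionary, the `id` key used for the subject, and the `json_to_triples` helper are all hypothetical names chosen for illustration; they are not taken from any existing proposal.

```python
import json

# Hypothetical out-of-band context: plain JSON keys mapped to predicate IRIs.
CONTEXT = {
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": "http://xmlns.com/foaf/0.1/homepage",
}

def json_to_triples(doc, context, subject_key="id"):
    """Turn a flat JSON object into (subject, predicate, object) tuples."""
    subject = doc[subject_key]  # assumption: the subject IRI lives under "id"
    triples = []
    for key, value in doc.items():
        if key == subject_key or key not in context:
            continue  # keys without a mapping are skipped in this sketch
        triples.append((subject, context[key], value))
    return triples

doc = json.loads(
    '{"id": "http://example.org/people#frank", "name": "Frank", "age": 30}'
)
triples = json_to_triples(doc, CONTEXT)
```

Here the `age` key has no entry in the context, so it produces no triple. Whether unmapped keys should be dropped, rejected, or carried along is one of the open design choices.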

Developers do not need to be familiar at all with RDF to start using the serialization

Understanding the semantic web and the concepts of RDF (triples, graphs, etc.) should not be required in order to use the format. That means that the format may have a very simple, stripped down version for beginners and a more advanced set of features for semantic web enthusiasts.

The serialization MAY include features not in RDF

There are certain features, such as generic key-value pairs in JSON, that do not map well to RDF. They would map well if RDF had a concept of plain literals in the subject or predicate position. The serialization could include these concepts but may specify that the values may not be serialized to all RDF serialization formats (such as RDF/XML, TURTLE or RDFa).

The serialization MUST be 100% compatible with the JSON spec

Additional features such as comments or short-hand notation to support datatypes could be supported in the serialization if we extended the JSON format. This would mean that the serialization would be incompatible with vanilla JSON readers and writers. While this may make serialization nicer, we should not make any additions/modifications to the JSON format to ensure maximum compatibility with pre-existing processors.

All RDF concepts MUST be expressible in the serialization

There are concepts like RDF datatypes and g-snaps/graph literals that could be omitted from the serialization in order to reduce learning and implementation complexity.

-1 Manu Sporny, Good design is a balancing act - we should only include what will help the most number of people.

There should be a migration story for going from existing JSON in the wild to this new format

The serialization task force should ensure that there is a subset of the serialization that is useful to beginners who use pure JSON, then show how developers could sprinkle a little RDF into their JSON, then show how developers can fully migrate to the new serialization format. The transition to the serialization format will probably take multiple years, and it should be as smooth and organic as possible. We should also understand that many may not need to transition to RDF - JSON may work just fine for their application. We should not assume that people will go straight from regular JSON to the new serialization format.

Memory usage and CPU usage while processing SHOULD be a primary consideration

Memory and CPU usage for processing JSON is low. We should ensure that processing the serialization format is only slightly more complex than processing regular JSON.

+0 Manu Sporny, we want to be cognizant of resource usage but I don't think this should be a primary driver for design decisions for the language.

The serialization MUST support disjoint/unconnected graphs

All current RDF serialization formats allow you to express two graphs that are not necessarily connected to one another. The new serialization format should allow the same mechanism. This is also important because normalization is difficult to achieve in a general way without also supporting disjoint graphs in the serialization. JSON-LD disjoint graphs example.
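
One way to picture disjoint graphs in a single document is a top-level array in which each object describes one subject. The layout, the `id` key, and the `to_triples` helper below are assumptions made for this sketch only.

```python
import json

# Hypothetical layout: a top-level array, one object per subject.
document = json.loads("""
[
  {"id": "http://example.org/people#frank", "name": "Frank"},
  {"id": "http://example.org/assets#widget", "price": "9.99"}
]
""")

def to_triples(nodes):
    """Flatten each object into (subject, key, value) tuples."""
    triples = []
    for node in nodes:
        for key, value in node.items():
            if key != "id":
                triples.append((node["id"], key, value))
    return triples

triples = to_triples(document)
subjects = {s for s, _, _ in triples}
objects = {o for _, _, o in triples}
# No object value names another subject, so the two descriptions are unconnected.
is_disjoint = subjects.isdisjoint(objects)
```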

The serialization MUST provide a normalization algorithm

Normalization, also known as canonicalization, is typically used when determining whether two sub-graphs that are expressed in different ways are identical. It is also very useful when hashing sub-graphs for checksumming or digital signature purposes. JSON-LD normalization example.

+1 Manu Sporny, I think we need normalization because we need to have a good digital signatures story
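
To illustrate why normalization helps with comparison and hashing, here is a deliberately naive Python sketch. It assumes every object is a plain literal, that there are no blank nodes, and that no escaping is needed; a real normalization algorithm would have to handle all three. The `normalize` and `graph_hash` names are hypothetical.

```python
import hashlib

def normalize(triples):
    """Return a canonical text form: one line per triple, sorted, duplicates removed."""
    lines = {f'<{s}> <{p}> "{o}" .' for s, p, o in triples}
    return "\n".join(sorted(lines)) + "\n"

def graph_hash(triples):
    """Hash the canonical form so equal graphs yield equal digests."""
    return hashlib.sha256(normalize(triples).encode("utf-8")).hexdigest()

a = [
    ("http://example.org/a", "http://example.org/p", "1"),
    ("http://example.org/a", "http://example.org/q", "2"),
]
b = list(reversed(a))  # same graph, different statement order
```

With this sketch, `graph_hash(a)` and `graph_hash(b)` are equal because statement order is erased before hashing.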

The serialization SHOULD enable digital signatures

Digital signatures have a number of useful purposes. When combined with g-snaps/graph literals they provide a very easy way of establishing cryptographically verifiable provenance. These features are used heavily in electronic commerce. JSON-LD digital signature example.
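
The following sketch shows the general sign-then-verify flow on a JSON listing. Two stand-ins are used and should not be read as a proposal: an HMAC with a shared secret replaces a real public-key signature, and JSON with sorted keys replaces a proper graph normalization. The `SECRET` value and the helper names are hypothetical.

```python
import hashlib
import hmac
import json

SECRET = b"hypothetical-shared-secret"  # stand-in for a real signing key

def canonical_bytes(doc):
    """Serialize with sorted keys so key order does not affect the signature."""
    return json.dumps(doc, sort_keys=True, separators=(",", ":")).encode("utf-8")

def sign(doc, key=SECRET):
    return hmac.new(key, canonical_bytes(doc), hashlib.sha256).hexdigest()

def verify(doc, signature, key=SECRET):
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(sign(doc, key), signature)

listing = {"id": "http://example.org/assets#widget", "price": "9.99"}
signature = sign(listing)

reordered = {"price": "9.99", "id": "http://example.org/assets#widget"}
tampered = dict(listing, price="0.01")
```

A cache can relay `listing` together with `signature`; the reordered copy still verifies, while the copy with the altered price does not.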

The serialization SHOULD support advanced graph concepts

The serialization format should support advanced graph concepts such as g-box, g-snap and g-text such that you can make statements about snapshots of graphs. Annotating graphs with metadata, such as graph retrieval time and digital signatures on the contents of the graph, is an important feature for higher-level concepts like provenance. Sandro's explanation of advanced graph concepts.

The serialization MUST support automatic typing

Being able to transform a JSON document into a native object is one of the key benefits of using JSON over other serialization formats. Automatic typing of numbers and boolean values into language-native datatypes removes an extra step that developers would otherwise have to perform. For example, one could easily transform a serialized number that is an xsd:integer into a language-native integer. JSON-LD automatic typing example.
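
A small Python sketch of the idea: a standard JSON parser already yields native numbers and booleans, and a processor could attach an XSD datatype to each. The particular mapping (for example, float to xsd:double) and the `xsd_type` helper are assumptions for illustration.

```python
import json

XSD = "http://www.w3.org/2001/XMLSchema#"

def xsd_type(value):
    """Guess an XSD datatype IRI from a parsed JSON value."""
    # bool is checked first because bool is a subclass of int in Python
    if isinstance(value, bool):
        return XSD + "boolean"
    if isinstance(value, int):
        return XSD + "integer"
    if isinstance(value, float):
        return XSD + "double"
    return None  # strings stay plain literals in this sketch

doc = json.loads('{"age": 30, "active": true, "score": 1.5, "name": "Frank"}')
types = {key: xsd_type(value) for key, value in doc.items()}
```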

The serialization SHOULD support type coercion

While not immediately obvious, type coercion allows one to map regular JSON into RDF in a way that may add datatype decorators to object literals. In other words, it provides for a way to get Typed Literals from regular JSON data. JSON-LD type coercion example.
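
As a rough sketch, type coercion could be driven by a table that names the datatype for selected keys, so that ordinary JSON strings become typed literals without changing the JSON itself. The `COERCE` table and the `coerce` helper are hypothetical.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical coercion rules: JSON key -> datatype IRI.
COERCE = {
    "created": XSD + "dateTime",
    "price": XSD + "decimal",
}

def coerce(doc, rules):
    """Return {key: (value, datatype_or_None)} for a flat JSON object."""
    typed = {}
    for key, value in doc.items():
        if key in rules and isinstance(value, str):
            typed[key] = (value, rules[key])  # becomes a typed literal
        else:
            typed[key] = (value, None)  # left as a plain value
    return typed

doc = {"title": "Widget", "price": "9.99", "created": "2011-03-01T12:00:00Z"}
typed = coerce(doc, COERCE)
```

The input document stays regular JSON; only the rules table knows about RDF datatypes.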

The serialization SHOULD rely on microsyntaxes instead of nested structures

There are two common approaches to expressing RDF in JSON. One of them is to use nested structures to express language and type information for literals. The other approach is to use shallow structures with microsyntaxes mirroring TURTLE to express language and type information for literals.

-1 Richard Cyganiak It's ugly as hell and makes the language unusable without an API
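
To make the two approaches concrete, the sketch below parses a TURTLE-like microsyntax string into the kind of nested structure the other approach would use directly. The regular expression handles only the simple cases (no escaped quotes), and the `value`/`type`/`lang` keys are assumed names, not an agreed vocabulary.

```python
import re

# "text", "text"^^datatype, or "text"@lang; escaping is not handled here.
LITERAL = re.compile(
    r'^"(?P<value>.*)"(?:\^\^(?P<type>\S+)|@(?P<lang>[A-Za-z-]+))?$'
)

def parse_micro(text):
    """Convert a microsyntax string into a nested literal description."""
    match = LITERAL.match(text)
    if match is None:
        return {"value": text}  # assumption: unquoted input is a plain string
    return {k: v for k, v in match.groupdict().items() if v is not None}

typed = parse_micro('"5"^^xsd:integer')
tagged = parse_micro('"chat"@fr')
plain = parse_micro('"foo"')
```

The shallow form keeps documents flat but requires a parsing step like this before the parts of a literal can be used, which is the trade-off the vote above objects to.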

The serialization SHOULD provide an API

An API would allow developers to transform incoming documents into a format that is easier for them to work with. In other words, it would allow them to drop all type information if it wasn't useful to them, or remove any microsyntaxes that would get in the way of basic usage of the data. Keep in mind that even JSON has an API: JSON.parse(). JSON-LD API example.
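
As one possible shape for such an API call, the sketch below strips type and language information from a document, leaving plain values. It assumes literals are written as nested objects with `value`, `type`, and `lang` keys; both that layout and the `simplify` function are hypothetical.

```python
def simplify(node):
    """Recursively replace literal objects with their bare values."""
    if isinstance(node, dict):
        # Treat a dict as a literal only if it has "value" and nothing unexpected.
        if "value" in node and set(node) <= {"value", "type", "lang"}:
            return node["value"]
        return {key: simplify(child) for key, child in node.items()}
    if isinstance(node, list):
        return [simplify(child) for child in node]
    return node

incoming = {
    "name": {"value": "Frank", "lang": "en"},
    "age": {"value": "30", "type": "xsd:integer"},
    "tags": [{"value": "rdf"}, "json"],
}
basic = simplify(incoming)
```

After the call, `basic` is ordinary JSON-style data that can be used without any knowledge of datatypes or language tags.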

There SHOULD be one and only one way to serialize a given triple

The more different ways there are to express the same triple or graph, the harder it gets to use the host language's native toolbox (that is, pure JS expressions) to process data. At some point, using the host language becomes impossible without using a parser library layered on top of the host language, negating the benefit of basing the language on JSON in the first place. This is the lesson to be learnt from RDF/XML.

+0 Manu Sporny, while I agree in principle I don't know how we'd enforce this in practice - that is, what's the difference between "foo" and "foo"^^xsd:string in JSON? Would you serialize the plain literal "foo" and the Typed Literal "foo"^^xsd:string in the same way in JSON? If the answer is yes, isn't the translation lossy?