How can one create an RDF dataset without being a web server?

RESOLVED: We change the definition of RDF Datasets to allow blank node graph names. We note (in rdf-concepts) that earlier definitions of datasets did not include blank node graph names, and Skolemization may be useful in providing compatibility. This closes ISSUE-131.

===========

In general, the SPARQL definition of datasets (adopted into RDF 1.1 by WG resolution on 29 October 2012) satisfies our charter deliverable of allowing people to work with multiple graphs. However, it requires that each graph be labeled with an IRI, and creating such an IRI can be problematic.

It's easy enough for software to make up IRIs for graphs if it happens to be a web server, in charge of some range of web addresses. But how can other software do this? For instance, how can a web client create a dataset to send as one of several parameters in an HTTP POST operation? How can a web client use datasets for HTTP PATCH (as the LDP Working Group wants to do)? And how can something use datasets in a UDP- or TCP-based protocol?

At the moment, a few options come to mind:

Option 1 - Use RFC-4122 Random UUIDs as graph names. These are IRIs that look like urn:uuid:7a745845-5a5e-46ad-9ae7-6ec202741183, where the hex parts are 122 random bits, and 6 fixed bits (RFC-4122, sec 4.4). In theory, collision is unlikely if a good source of randomness is available. Perhaps the randomness can be improved by including a hash of the other parts of the dataset. Note that use of non-resolvable IRIs like this is bad practice for Linked Data.
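As a rough illustration of Option 1, the following Python sketch mints a graph name from a version 4 (random) UUID using the standard uuid module. The function name new_graph_name is made up for this example; it is one possible way to do this, not a recommended API.

```python
import uuid


def new_graph_name() -> str:
    """Return a fresh graph name as a urn:uuid IRI (hypothetical helper)."""
    # uuid4() draws 122 random bits; the remaining 6 bits are fixed and
    # encode the version (4) and the RFC 4122 variant.
    return uuid.uuid4().urn


name = new_graph_name()
parsed = uuid.UUID(name)  # the UUID constructor accepts the urn:uuid: form

print(name)
print("version:", parsed.version)
print("RFC 4122 variant:", parsed.variant == uuid.RFC_4122)
```

No web server or registered namespace is involved, which is the point of the option. How unlikely a collision is depends entirely on the quality of the platform's random source.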

Option 2 - Use a UUID-like string as an IRI base or prefix for graph names. (Slight variation on Option 1.) By going outside the RFC-4122 syntax, we can include a "local part" in the IRI. Something like:

Option 3 - Use a "relative" dataset, where the graph names are written as relative IRIs but the base for IRI-resolution is not known to the system generating the dataset and is assigned to some new, unique IRI base by each receiver. This is arguably not licensed by the current RDF drafts or the SPARQL 1.1 spec. Some client libraries will not store or serialize RDF with relative IRIs.

<#g3> { ... contents of graph 3 ... }
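The receiver side of Option 3 might look roughly like the Python sketch below: the receiver mints a new, unique base IRI for each incoming dataset and resolves the relative graph names against it. The host receiver.example, the /datasets/ path layout, and both function names are assumptions made for illustration only.

```python
import uuid
from urllib.parse import urljoin


def mint_base(authority: str = "http://receiver.example") -> str:
    """Mint a new base IRI for one received dataset (hypothetical layout)."""
    return f"{authority}/datasets/{uuid.uuid4()}"


def resolve_graph_names(relative_names, base):
    """Map each relative graph name to an absolute IRI under the given base."""
    return {name: urljoin(base, name) for name in relative_names}


base = mint_base()
print(resolve_graph_names(["#g3"], base))
```

An http base is used here because urljoin only performs relative resolution for schemes it treats as hierarchical; with a urn: base it would return the relative reference unchanged. Two receivers of the same dataset end up with different absolute graph names, which is the intended behavior of this option.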

Option 4 - Use blank nodes as graph names. This is not allowed in Datasets as defined in the current RDF drafts or the SPARQL 1.1 spec. Some client libraries will not store or serialize RDF datasets with blank node graph names. As with other uses of blank nodes, knowing they cannot be referenced by other documents allows certain optimizations, and they can be Skolemized for use in systems that do not want/allow blank nodes.

_:g4 { ... contents of graph 4 ... }
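For systems that do not accept blank node graph names, Skolemization could be sketched in Python as follows. Quads are modeled here as plain 4-tuples of strings with blank nodes written as "_:label", which is a simplification for this example rather than any library's data model. The authority http://example.org and the function name are assumptions; the /.well-known/genid/ path follows the Skolem IRI convention described in RDF 1.1 Concepts.

```python
import uuid

GENID_PATH = "/.well-known/genid/"


def skolemize_graph_names(quads, authority="http://example.org"):
    """Replace blank node graph names with fresh Skolem IRIs.

    Each quad is a (subject, predicate, object, graph) tuple of strings.
    A blank node used as a graph name is replaced everywhere it occurs in
    the dataset, so references to the graph stay consistent.
    """
    graph_bnodes = {g for (_, _, _, g) in quads if g.startswith("_:")}
    mapping = {
        label: f"{authority}{GENID_PATH}{uuid.uuid4().hex}"
        for label in graph_bnodes
    }
    return [tuple(mapping.get(term, term) for term in quad) for quad in quads]


quads = [
    ("http://example.org/a", "http://example.org/p", "http://example.org/b", "_:g4"),
    ("_:g4", "http://example.org/source", "http://example.org/doc", "http://example.org/meta"),
]
for quad in skolemize_graph_names(quads):
    print(quad)
```

Blank nodes that never appear as a graph name are left alone in this sketch; a system that disallows blank nodes entirely would extend the mapping to all of them.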

Option 5 - Do not directly support this use case in RDF 1.1. Instead, require systems to use an extended RDF which allows blank node graph names, e.g., JSON-LD, or variations on TriG and N-Quads which may arise for this purpose.

Related notes:

RESOLVED: We change the definition of RDF Datasets to allow blank node graph names. We note (in rdf-concepts) that earlier definitions of datasets did not include blank node graph names, and Skolemization may be useful in providing compatibility. This closes ISSUE-131.