GraphQL Schemas to RDF/SHACL

GraphQL includes a schema language for defining
object types and fields.
In order to use GraphQL with RDF-based technologies like TopBraid Suite (version 6 onwards),
we needed to define a mapping from the GraphQL schema language to RDF, in particular
to SHACL shapes.

This mapping makes it possible to leverage GraphQL schemas as RDF data models and use them in conjunction
with existing RDF models.
Further, the mapping enables us to use RDF technology (graph databases, rule engines, SPARQL and
data validation) in conjunction with GraphQL-based applications.
The mapping uses syntactic extensions (through GraphQL's directives extension point) to express richer
SHACL constraints.
This way, GraphQL can also be used as a user-friendly compact syntax for SHACL.

This document uses the prefix dash for the namespace http://datashapes.org/dash#,
which is accessible at http://datashapes.org/dash.
The prefix graphql stands for the namespace http://datashapes.org/graphql#,
which is accessible at http://datashapes.org/graphql.

Overview

GraphQL is an increasingly popular language for describing queries
and updates using a JSON-based architecture.
As the name suggests, GraphQL has been designed for graph-shaped data models consisting of objects that
have fields, and fields may hold scalar values (aka literals) or link to other objects.
GraphQL's schema language defines a syntax to declare the types of such objects and fields.

Graphs also play a fundamental role in the Semantic Web world built upon the RDF family
of languages. RDF introduces the concept of nodes that are either literals or resources with a URI or
identified by internal IDs only (blank nodes). These nodes are used in triples that link a subject via
a predicate (property) to an object node.
While RDF Schema has a notion of classes that bears similarities with GraphQL object types,
there is a closer resemblance to the concept of shapes from the SHACL specification.
Like GraphQL object types, SHACL shapes also define fields via so-called property shapes,
and can include constructs to define the permissible value types of these fields, and many other constraint types.
Furthermore, GraphQL is often implemented as a view over data that is stored elsewhere, in
different forms. SHACL shapes also represent views on RDF nodes, allowing different dicing and slicing
of data to support different use cases.

To bring these two worlds closer together we defined a mapping from GraphQL schemas to RDF/SHACL data models.
As a result, structure from existing GraphQL systems can be re-used and JSON-based data can be seamlessly
converted into RDF graphs, for example to accomplish data integration tasks.

All GraphQL documents described here are valid GraphQL syntax.
A key design principle for us was to avoid syntactical elements that would break a GraphQL parser.
In some places the directives extension point of GraphQL is used.
Directives were specifically designed for tools to hook into GraphQL with information that is ignored
by other tools that are not aware of their meaning.
As soon as one or more tools agree on a set of such directives, a dialect of GraphQL emerges,
and using these directives is not limited to one particular implementation approach.
See GraphQL Data Shapes Directives for a general overview of most of the
directives used here, written for users without prior knowledge of RDF technology.
The rest of this page assumes familiarity with RDF and SHACL.

The following example GraphQL file defines a couple of object types, with fields and an enumeration.

The GraphQL schema above can be translated into the following RDF/SHACL, in Turtle notation.
Note that in this document we use blank nodes to represent property shapes, for brevity.
In many practical applications it is more sensible to use URIs for them, such as
ex:User-name.

The remaining sections of this document drill down into the technical details of this mapping.
The mapping is defined in the direction from GraphQL to SHACL, allowing all GraphQL schemas to
be treated as RDF/SHACL models.
The mapping can also be applied in reverse order, to produce GraphQL schemas from existing
RDF/SHACL models. However, that reverse mapping is partial, i.e. not all SHACL constructs have
a GraphQL equivalent.

GraphQL Names to URIs

The concept of URIs plays a key role in RDF modeling.
URIs are used to uniquely identify resources, properties and even graphs.
GraphQL does not natively have a concept of global identifiers, nor does it have a concept of
namespaces (all GraphQL names are simple Java-like identifiers).
Therefore, we designed conventions and instructions on how to turn any GraphQL schema into RDF graphs and URIs.

Identifying Graphs and their Imports

A graph name is a URI that identifies an RDF graph in a data set and in Linked Data use cases.
In RDF, graphs may import each other and then reference each other's terms.
The GraphQL directive @graph can be used with a GraphQL schema
definition, to declare the URI of the graph and any imported graphs.
This is illustrated in the following example:

Note that the converter has inserted default prefixes for a collection
of the well-known namespaces owl, rdf, rdfs,
sh and xsd.
These are always assumed to be present, e.g. for parsing qnames during conversion.

At a schema definition, the directive @graph can have an argument
uri of type String to specify the URI of the (named) graph itself.
If no such argument has been found in the GraphQL document, then the system will use the declared
default namespace (see the next section), if that exists.
If no default namespace has been declared, then the surrounding code is expected to provide a
default graph URI.
This may, for example, be a URL related to the GraphQL service or a URL derived from the name of
the GraphQL schema file.

The resource representing the named RDF graph gets rdf:type graphql:Schema.
Adding rdf:type owl:Ontology as well is optional, yet recommended.

The argument imports of the @graph directive takes an array with items of
type String, each of which is turned into an owl:imports statement for the graph.
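
The fallback order for the graph URI and the triples attached to the graph resource can be sketched in a few lines of Python. This is only an illustration of the rules in this section, not the converter's actual code: the function names, the tuple representation of triples and the as_ontology flag are assumptions made for the example.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), hypothetical representation


def resolve_graph_uri(explicit_uri: Optional[str],
                      default_namespace: Optional[str],
                      caller_fallback: str) -> str:
    """Pick the graph URI using the fallback order described in this section."""
    if explicit_uri:
        return explicit_uri        # @graph(uri: "...") takes precedence
    if default_namespace:
        return default_namespace   # otherwise the declared default namespace
    return caller_fallback         # otherwise a URI supplied by the surrounding code


def graph_triples(graph_uri: str,
                  imports: List[str],
                  as_ontology: bool = True) -> List[Triple]:
    """Build the triples that describe the graph resource itself."""
    triples: List[Triple] = [(graph_uri, "rdf:type", "graphql:Schema")]
    if as_ontology:                # optional, yet recommended
        triples.append((graph_uri, "rdf:type", "owl:Ontology"))
    for imported in imports:       # one owl:imports statement per array item
        triples.append((graph_uri, "owl:imports", imported))
    return triples
```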

Namespace Prefixes

The GraphQL directive @prefixes can be used to declare RDF namespace prefixes.
These prefixes are used for the remainder of the conversion to turn GraphQL names into RDF IRIs.
The rule is that if a GraphQL name starts with a declared prefix and then the underscore,
then the IRI will be the namespace of the given prefix plus the remainder of the GraphQL name.
So with the prefix declaration above, the GraphQL name rdfs_Class is expanded into
the RDF resource rdfs:Class, aka http://www.w3.org/2000/01/rdf-schema#Class.
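
The prefix rule can be expressed as a small Python function. This is a minimal sketch, assuming the declared prefixes are available as a dictionary; the function name and the choice to try longer prefixes first are assumptions of the example, not part of the mapping.

```python
from typing import Dict, Optional


def expand_prefixed_name(name: str, prefixes: Dict[str, str]) -> Optional[str]:
    """Return the IRI for a GraphQL name of the form <prefix>_<local name>.

    Returns None if the name does not start with any declared prefix
    followed by an underscore.
    """
    # Assumption: longer prefixes are tried first, so that a prefix that
    # itself contains an underscore is not shadowed by a shorter one.
    for prefix in sorted(prefixes, key=len, reverse=True):
        marker = prefix + "_"
        if name.startswith(marker):
            return prefixes[prefix] + name[len(marker):]
    return None
```

With rdfs bound to http://www.w3.org/2000/01/rdf-schema#, calling expand_prefixed_name("rdfs_Class", prefixes) yields the full IRI of rdfs:Class, while a plain name such as person yields None and is left to the default namespace.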

GraphQL names that do not match a given prefix (i.e. any plain name without underscores) are mapped
to URIs based on a default namespace.
That default namespace can be defined explicitly by adding " (default)" to the end
of a declared namespace, e.g.

schema
    @prefixes (
        ex: "http://example.com/ (default)"
    ) ...

In this case, a GraphQL name person would become the URI http://example.com/person.
If no such default namespace has been defined, the system derives one from the graph URI
(declared using @graph(uri: ...) as described in the previous section).
If the graph URI ends with one of the gen-delims characters of RFC 3986, such as /, # or :,
then the graph URI itself becomes the default namespace.
Otherwise, the default namespace is the graph URI plus the # character.
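
The selection of the default namespace can be sketched as follows. The code is illustrative only: the function names are hypothetical, and the set of gen-delims characters is taken from RFC 3986 rather than from this document.

```python
from typing import Dict, Optional, Tuple

DEFAULT_MARKER = " (default)"
GEN_DELIMS = ":/?#[]@"  # gen-delims as defined in RFC 3986


def split_default(prefixes: Dict[str, str]) -> Tuple[Dict[str, str], Optional[str]]:
    """Strip the " (default)" marker and report which namespace carried it."""
    cleaned: Dict[str, str] = {}
    default_ns: Optional[str] = None
    for prefix, namespace in prefixes.items():
        if namespace.endswith(DEFAULT_MARKER):
            namespace = namespace[:-len(DEFAULT_MARKER)]
            default_ns = namespace
        cleaned[prefix] = namespace
    return cleaned, default_ns


def default_namespace(declared_default: Optional[str], graph_uri: str) -> str:
    """Use the declared default namespace, else derive one from the graph URI."""
    if declared_default is not None:
        return declared_default
    if graph_uri and graph_uri[-1] in GEN_DELIMS:
        return graph_uri           # the graph URI already ends with a delimiter
    return graph_uri + "#"         # otherwise append the # character
```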

The Root Query and Public Shapes

As explained in the following sections, GraphQL types are converted to SHACL shapes.
The property graphql:queryShape can be used to remember the SHACL shape that was created
from the GraphQL type referenced by query in the schema declaration.
The subject of the graphql:queryShape is the graph resource itself.
No graphql:queryShape triple is created if the GraphQL root query type is called _.

All shapes created for GraphQL object types and interfaces are recorded in the RDF schema resource
as values of graphql:publicShape.
This makes it possible to later reconstruct which of the shapes (in a data set of multiple RDF graphs)
belong together.
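
As a rough sketch of this bookkeeping, the following Python function emits the graphql:queryShape and graphql:publicShape triples for a schema. The function name, the tuple representation of triples and the shape_iris dictionary (GraphQL type name to shape IRI, supplied by the caller for all object types and interfaces) are assumptions of the example.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), hypothetical representation


def schema_shape_triples(graph_uri: str,
                         query_type: str,
                         shape_iris: Dict[str, str]) -> List[Triple]:
    """Record the root query shape and all public shapes at the graph resource."""
    triples: List[Triple] = []
    if query_type != "_":          # no queryShape triple for a root query type called _
        triples.append((graph_uri, "graphql:queryShape", shape_iris[query_type]))
    for shape_iri in shape_iris.values():
        triples.append((graph_uri, "graphql:publicShape", shape_iri))
    return triples
```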

Types to Node Shapes

GraphQL object types, interface types and union types are mapped to SHACL node shapes
as outlined in the following subsections.

Object Types

Each GraphQL type is turned into a SHACL node shape.
As shown in the following example, sh:node will be used when
this type is referenced by a field:

Use the @class directive if the type shall also be turned into an rdfs:Class:

type Human @class {
    ...
}

ex:Human
    a sh:NodeShape ;
    a rdfs:Class ;
    ...

To let our engine know that it should convert all GraphQL types in your file into shapes that are
also classes, annotate the schema with the @classes directive.
Individual types can then override this default using @noClass:

Note that if the value type of a field is converted into a node shape that is also a class
(either through the @class directive or the @classes directive on
the whole schema), then the property sh:class will be used instead of sh:node
to link shapes via fields.
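
The choice between sh:class and sh:node can be summarized in a short Python sketch. It assumes that the directive names found on the value type and on the schema are available as sets of strings, and that @noClass takes precedence if a type carries both @class and @noClass; both points are assumptions of the example.

```python
from typing import Set


def becomes_class(type_directives: Set[str], schema_directives: Set[str]) -> bool:
    """Decide whether a GraphQL type is converted into a shape that is also a class."""
    if "noClass" in type_directives:   # explicit opt-out on the individual type
        return False
    return "class" in type_directives or "classes" in schema_directives


def link_predicate(value_type_directives: Set[str], schema_directives: Set[str]) -> str:
    """Return the SHACL property used to link a field to its value type."""
    if becomes_class(value_type_directives, schema_directives):
        return "sh:class"
    return "sh:node"
```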

Use the subClassOf parameter of the @class directive to specify
one or more super classes, producing rdfs:subClassOf statements:

Each GraphQL type can be annotated with the @shape directive to give additional
input to the conversion.
@shape can take a parameter targetClass that is translated into
one or more sh:targetClass statements on the node shape.
The values of targetClass and subClassOf must be one of the following:

- GraphQL names (such as targetClass: Person)
- Strings (such as targetClass: "ex:Person")
- Arrays of the above.

In the case of strings, the values can be RDF qnames (using the defined namespace prefixes)
or, if this fails, full URIs. Strings may also be GraphQL names.
The following example refers to the class ex:Human, assuming that ex:
is the prefix of the default namespace:

Interface Types

GraphQL interfaces play a role very similar to that of object types, e.g. they can be used
as the type of a field.
The mapping of interfaces to SHACL is similar to that of object types.
To distinguish interfaces from object types, shapes created from interfaces have the marker
property graphql:isInterface set to true.

The only added feature is that object types can implement interfaces, creating a
one-level-deep form of type extension or inheritance.
In the RDF world such type extension is sometimes represented using rdfs:subClassOf
and in the case of SHACL via an sh:node link from the "subclass" to the "superclass".
Intuitively, ex:SubShape sh:node ex:SuperShape means that any instance that
conforms to ex:SubShape must also conform to ex:SuperShape.

Fields to Property Shapes

Each GraphQL field declaration (from object types and interface types) gets mapped into
a SHACL property shape, connected to the corresponding node shape via sh:property.
The details of how to construct these property shapes are described in the following sub-sections.

Field Names to Paths

By default, the name of a field gets translated into a URI for a property following the
namespace-based syntax rules from above.
These then become values of sh:path in the property shape, as already shown
in many examples.

SHACL also supports complex path expressions, using a SPARQL-based syntax, that can be used
to walk properties in the inverse direction or take multiple steps at once.
To produce such paths, use the @shape directive with path as shown:

The values of path must be SPARQL path expressions that can be parsed using
the available namespace prefixes.

The property graphql:name is recommended to remember the original GraphQL name
in case the shapes are later translated back to a GraphQL schema.
graphql:name is mandatory for property shapes that use path expressions.

Shall we support a special syntax that does not need prefixes, e.g. @shape(path: INV_subClassOf)
for sh:inversePath, or @shape(inversePath: subClassOf)?
The inverse use case is very common, so syntactic sugar may help.

Scalar Types

If the type of a GraphQL field is a scalar type, then there is a sh:datatype
constraint in the property shape.
The datatype is selected according to the following table:

GraphQL Scalar Type    RDF Data Type
Boolean                xsd:boolean
Float                  xsd:decimal
ID                     xsd:string
Int                    xsd:integer
String                 xsd:string

Property shapes derived from ID fields get the value true for the property
graphql:isIDField, in addition to sh:datatype xsd:string.
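
The scalar mapping, including the extra marker for ID fields, can be written as a lookup table. The following Python sketch is illustrative; the function name and the representation of constraints as predicate/value pairs are assumptions of the example.

```python
from typing import List, Tuple

# Built-in GraphQL scalar types and the RDF datatypes they map to
SCALAR_DATATYPES = {
    "Boolean": "xsd:boolean",
    "Float": "xsd:decimal",
    "ID": "xsd:string",
    "Int": "xsd:integer",
    "String": "xsd:string",
}


def scalar_constraints(scalar_type: str) -> List[Tuple[str, str]]:
    """Return the (predicate, value) pairs added to the property shape."""
    pairs = [("sh:datatype", SCALAR_DATATYPES[scalar_type])]
    if scalar_type == "ID":
        # ID fields are strings that carry an additional marker property
        pairs.append(("graphql:isIDField", "true"))
    return pairs
```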

For explicitly declared scalar types that go beyond the GraphQL standard, the system will produce an
instance of graphql:ScalarType which is then referenced using sh:node.
A user-defined scalar type can be annotated with the datatype argument in the
@shape directive as follows:

If the enum has a @class directive, then the resulting node
shape also becomes a class, and each value an instance of that class, with the name of the value
as its rdfs:label.
This option also allows the values to contain comments:

Handling of Language-Tagged Strings

In RDF, the special datatype rdf:langString is used to represent language-tagged
strings, such as "Haus"@de and "House"@en.
To work with this datatype in GraphQL, use the object type LangString.
It maps to JSON objects with two fields: string (the lexical value such as "Haus")
and lang (the language tag such as "de"):
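
To illustrate the shape of these JSON objects, the following Python sketch converts a language-tagged literal written in Turtle notation into a dictionary with the two fields named above. The function name and the simplified regular expression (which does not handle escaped quotes) are assumptions of the example.

```python
import re
from typing import Dict

# Simplified pattern for literals such as "Haus"@de (no escape handling)
_LANG_LITERAL = re.compile(
    r'^"(?P<string>.*)"@(?P<lang>[A-Za-z]+(?:-[A-Za-z0-9]+)*)$', re.DOTALL)


def lang_literal_to_json(literal: str) -> Dict[str, str]:
    """Split a language-tagged literal into its string and lang fields."""
    match = _LANG_LITERAL.match(literal)
    if match is None:
        raise ValueError("not a language-tagged literal: " + literal)
    return {"string": match.group("string"), "lang": match.group("lang")}
```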

Note that this rule does not apply to list-valued GraphQL fields, because
! means "not null", which still permits the empty array [].

Order of Fields

The property sh:order MAY be used to record the relative order of fields from
the original GraphQL type.
By default, the first property shape will get sh:order "0"^^xsd:decimal, etc.
However, if a field declares a different order using @display(order: 7) then
this number is used instead.
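
The ordering rule can be sketched in Python as follows. The sketch assumes that the default order of a field is its zero-based position within the type, and that an order declared via @display replaces that default without renumbering the other fields; the function name and input format are hypothetical.

```python
from decimal import Decimal
from typing import Dict, List, Optional, Tuple


def assign_orders(fields: List[Tuple[str, Optional[int]]]) -> Dict[str, Decimal]:
    """Map each field name to its sh:order value.

    Each input item is (field name, order declared via @display, or None).
    """
    orders: Dict[str, Decimal] = {}
    for position, (name, declared) in enumerate(fields):
        # Assumption: the default is the zero-based position of the field
        orders[name] = Decimal(position if declared is None else declared)
    return orders
```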

Input Types

GraphQL input types are used to formalize the arguments of fields in query instances.
They are not mapped to shapes but to a specialized structure from the graphql: namespace.
These structures can be useful for round-tripping of GraphQL documents, or to perform RDF queries
over them, for example to explore linkage between various web services.

Directives for SHACL Constraints

The @shape directive can be used to attach other SHACL constraints to property shapes.
These declarations may also be good practices for GraphQL schema development in general, and
can also be used by non-RDF tools.
The example below states that age >= 18.

We have defined an easy-to-use set of GraphQL directives that can significantly improve the
value of GraphQL schemas for JSON-based data processing.
They are described in detail on the GraphQL Data Shapes Directives
page which includes a table showing all supported constraint types.
They intuitively map to corresponding SHACL constraints from the sh: namespace.

We have intentionally left out some of the RDF-specific constraint types.
For example, sh:languageIn and sh:uniqueLang may not be frequently
needed from a GraphQL perspective where no language tags exist.
sh:property has been left out because it is already covered via fields.
sh:closed has been left out because it does not really make sense for property shapes.
The shape-based constraint types sh:not, sh:and, sh:or,
sh:xone and sh:qualifiedValueShape have been left out to reduce complexity.
sh:nodeKind has been left out because GraphQL strictly separates literals and non-literals,
and the distinction between blank nodes and URIs is not relevant because they are enforced by @uri
directives.
Any of them may however be supported in the future should users require them.

Directives for Display Metadata

In addition to constraints, property shapes may include various annotation properties.
The GraphQL Data Shapes Directives page illustrates various
use cases such as form building and comes with an example.
Here we use the same examples with the equivalent RDF/SHACL triples.