RDF is a flexible and extensible way to represent information
about World Wide Web resources. It is used to represent, among
other things, personal information, social networks and metadata
about digital artifacts such as music and images, as well as
to provide a means of integration over disparate sources of
information. A standardized query language for RDF data with
multiple implementations offers developers and end users a way
to write and to consume the results of queries across this wide
range of information. Used with a common protocol, applications can access
and combine information from across the web.

This section describes the status of this document at the time of its publication.
Other documents may supersede this document. A list of current W3C publications and the latest revision
of this technical report can be found in the
W3C technical reports index at http://www.w3.org/TR/.

Publication as a Working Draft does not imply endorsement by the W3C Membership.
This is a draft document and may be updated, replaced or obsoleted by other documents at any time.
It is inappropriate to cite this document as other than work in progress.

Per section
4 of the W3C Patent Policy, Working Group participants have
150 days from the title page date of this document to exclude
essential claims from the W3C RF licensing requirements with
respect to this document series. Exclusions are with respect to
the exclusion reference document, defined by the W3C Patent Policy
to be the latest version of a document in this series that is
published no later than 90 days after the title page date of this
document.

An RDF graph is a set of triples, each triple consisting of a subject,
a predicate and an object, as defined in
RDF Concepts and Abstract Syntax. These triples can come from a variety of
sources. For instance, they may come directly from an RDF
document. They may be inferred from other RDF triples. They may be
the RDF expression of data stored in other formats, such as XML or
relational databases.

SPARQL is a
query language for getting information from such RDF graphs. It provides
facilities to:

extract information in the form of URIs, blank nodes, plain and typed
literals.

extract RDF subgraphs.

construct new RDF graphs based on information in the queried graphs.

As a data access language, it is suitable for both local and
remote use. For use across networks, the companion SPARQL Protocol for RDF document
[11] describes a remote access protocol.

The SPARQL query language is based around matching graph patterns.
The simplest graph patterns are triple patterns, which are like an RDF triple
but with the possibility of a variable in any of the subject, predicate or object
positions. Combining these gives a basic graph pattern, where an exact match to a
graph is needed to fulfill a pattern.

Later sections describe how other graph patterns can be built using
the graph operators OPTIONAL and UNION, how patterns can be
grouped together, and how queries can extract
information from more than one graph. It is
also possible to restrict the values allowed in matching a pattern.

In this section, we cover simple triple patterns, basic graph patterns and the
SPARQL syntax related to these.

The example below shows a SPARQL query to find the title of a book
from the information in an RDF graph. The query consists of two
parts, the SELECT clause and the
WHERE clause. The SELECT clause identifies the variables of interest to the
application, and the WHERE clause has
one triple pattern.
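As a minimal sketch of such a query, the block below uses a single triple of sample data, shown as a comment. The book URI and the title are assumptions made for illustration and are not taken from this document.

```sparql
# Assumed data (Turtle syntax), for illustration only:
#   <http://example.org/book/book1>
#       <http://purl.org/dc/elements/1.1/title> "SPARQL Tutorial" .

SELECT ?title
WHERE
{
  <http://example.org/book/book1> <http://purl.org/dc/elements/1.1/title> ?title .
}
```

Against that data, the query would give a single solution in which the variable title is bound to "SPARQL Tutorial".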

Query Term Syntax

The terms delimited by "<>" are
relative URI references [RFC 3986]. After parsing,
these are resolved to give URIs. The term URI in this document refers to URIs
after resolution.

The query terms delimited by double quotes ("") are literals which,
following Turtle [15] syntax, consist of a string in quotes and either
an optional language tag, introduced with '@', or an optional datatype
URI, introduced by '^^'. Single quotes ('') are also allowed instead of
double quotes. As a convenience, integers can be written directly and are
interpreted as typed literals of datatype xsd:integer; floating point
numbers can also be written directly and are interpreted as xsd:double.
Literals of type xsd:boolean can also be written as true or false.

Variables in SPARQL queries have global scope; it is the same
variable everywhere in the query that the name is used. Variables are indicated by
'?'; the '?' does not form part of the variable. '$' is an alternative
to '?' to help where systems use '?' as a substitution character. In a query,
$abc and ?abc are the same
variable.

Because URIs can be long, SPARQL provides an
abbreviation mechanism. Prefixes can be defined and a QName-like
syntax [14] provides shorter forms. Prefixes may be used anywhere
after they are declared; redefining a prefix causes the new definition to be used
from that point in the syntax. The base URI for the resolution of relative URIs [RFC 3986]
can be explicitly declared with the BASE keyword.

Prefixes are syntactic: the prefix name does not affect the
query, nor do prefix names in queries need to be the same prefixes as used for
data. The following query is equivalent to any of the previous ones
and will give the same results when applied to the same data.
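A sketch of a query that relies on both BASE and an arbitrarily named prefix is given here. The prefix name x and the example.org URIs are hypothetical; the query is intended to mean the same as one that writes both URIs out in full.

```sparql
# The relative URI <book1> resolves against BASE to
#   <http://example.org/book/book1>
# and x:title expands to
#   <http://purl.org/dc/elements/1.1/title>
BASE    <http://example.org/book/>
PREFIX  x: <http://purl.org/dc/elements/1.1/>

SELECT  ?title
WHERE   { <book1> x:title ?title }
```

The choice of x rather than, say, dc has no effect on the results, since only the expanded URIs take part in matching.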

Result Descriptions used in this document

The term "binding" is used as a descriptive term to refer to a pair of
(variable, RDF term). In this document, we illustrate bindings in results in tabular form so if
variable x is bound to "Alice"
and variable y is bound to "Bob",
we show this as:

x          y
"Alice"    "Bob"

The building blocks of queries are triple patterns.
Syntactically, a SPARQL triple pattern is a subject, predicate
and object. The following triple pattern has a subject variable
(the variable book),
a predicate of dc:title and an object variable
(the variable title).

?book dc:title ?title .

Matching a triple pattern to a graph gives bindings between variables and
RDF terms so that the triple pattern, with the variables replaced by the
corresponding RDF terms, is a triple of the graph being matched.

"[The RDF core Working Group] noted that it is aware of no reason why literals should not
be subjects and a future WG with a less restrictive charter may
extend the syntaxes to allow literals as the subjects of statements."

There is a blank node [12] in this dataset, identified by
_:a.
The label is only used within the file for encoding purposes. The label
information is not in the RDF graph. No SPARQL query will be
able to identify that blank node by the label used in the serialization.

Blank Nodes and Queries

A blank
node can appear in a query pattern. It behaves as a variable,
although it cannot be mentioned in the query result form or anywhere else
outside a graph pattern.

Blank nodes in queries are distinct from all blank nodes in the data.
A blank node in a graph pattern does not match a blank node in the
data by blank node label.

Blank Nodes and Query Results

In the results of queries, the presence of blank nodes can be indicated by
labels in the serializations of results. An application or client
receiving the results of a query can
tell that two solutions or two variable bindings differ in blank nodes but this
information is only scoped to the results as defined in
"SPARQL
Variable Binding Results XML Format" or the
CONSTRUCT result form.

The results above could equally be given with different blank node labels because
the labels in the results only indicate whether RDF terms in the solutions were
the same or different.

x      name
_:r    "Alice"
_:s    "Bob"

These two results have the same information: the blank nodes used to match
the query are different in the two solutions. There is no relation
between using _:a in the results and any
blank node label in the data graph.

SPARQL uses a "Turtle-like" syntax for writing basic graph patterns,
with the addition of named variables. There are a number of syntactic forms
that abbreviate some common sequences of triples. These syntactic forms do
not change the meaning of the query.

Predicate-Object Lists

Triple patterns with common subject can be written so that the subject is written
once, and used for more than one triple pattern using the ";"
notation.

?x foaf:name ?name ;
foaf:mbox ?mbox .

This is the same as writing the triple patterns:

?x foaf:name ?name .
?x foaf:mbox ?mbox .

Object Lists

If triple patterns share both subject and predicate, then these can be written
using the "," notation.

?x foaf:nick "Alice" , "Alice_" .

is the same as writing the triple patterns:

?x foaf:nick "Alice" .
?x foaf:nick "Alice_" .

Blank Nodes

Blank nodes have labels which are scoped to the query. They are written
as "_:a" for a blank node with label "a".

A blank node that is used in only one place in the query syntax can be
abbreviated with "[]". A unique blank node will be created and used to form
the triple pattern.

The "[:p :v]" construct can be used to form triple patterns with a blank node for
subject.
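The two abbreviations might be used as in the sketch below. The mailbox URI and the choice of FOAF properties are assumptions for illustration, and the comments give the longer form that each abbreviation is meant to stand for.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name ?friend
WHERE
{
  # Same as:  _:b1 foaf:mbox <mailto:alice@example.org> .
  #           _:b1 foaf:name ?name .
  [ foaf:mbox <mailto:alice@example.org> ] foaf:name ?name .

  # Same as:  _:b2 foaf:knows ?friend .
  # where _:b2 is used nowhere else in the query.
  [] foaf:knows ?friend .
}
```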

An RDF Literal is written in SPARQL as a string containing the lexical form
of the literal, delimited by "", followed by an optional language tag
(indicated by '@') or optional datatype (indicated by '^^'). There are convenience forms for
literals of type xsd:integer, xsd:double or xsd:boolean.

Matching Integers

The pattern in the following query has a solution :x because 42 is syntax for
"42"^^<http://www.w3.org/2001/XMLSchema#integer>.

SELECT ?v WHERE { ?v ?p 42 }

Matching Arbitrary Datatypes

The following query has a solution :y. The query processor does not
have to have any understanding of the values in the space of the datatype because, in this case, lexical form and datatype URI both match exactly.
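A sketch of such a query follows, with the sample data shown as a comment. The datatype URI and the resource :y in the example.org namespace are assumptions for illustration.

```sparql
# Assumed data (Turtle syntax), for illustration only:
#   @prefix :   <http://example.org/ns#> .
#   @prefix dt: <http://example.org/datatype#> .
#   :y :p "abc"^^dt:specialDatatype .

SELECT ?v
WHERE { ?v ?p "abc"^^<http://example.org/datatype#specialDatatype> }
```

The literal in the query and the literal in the data have the same lexical form and the same datatype URI, so they are the same RDF term and the pattern matches without the processor interpreting the datatype.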

Graph pattern matching creates bindings of variables. It is
possible to further restrict solutions by constraining the allowable
bindings of variables to RDF Terms. Value constraints
take the form of boolean-valued expressions; the language also allows
application-specific constraints on the values in a query solution.

A graph pattern may involve a value constraint, which is a boolean-valued
expression of variables and RDF Terms that restricts query solutions.

Constraints may be restrictions of the value of an RDF Term or they may be
restrictions on some part of an RDF term, such as its lexical form. SPARQL defines a set of
functions and operations (sections 11.1 and 11.2) that all implementations must provide.
In addition, there is an extension
mechanism (section 11.3) for operations that are specific to an
application domain or kind of data.

A constraint may lead to an error condition when testing some RDF term.
The exact error will depend on the constraint: for example, in numeric
operations, solutions with variables bound to a non-number or a blank node will lead to an
error. Any potential solution that causes an error condition in a
constraint will not form part of the final results, but does not cause the query to fail.
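As an illustration, a numeric constraint could be written as in the sketch below. The sample data, the ns:price property and the book URIs are assumptions; the parenthesized FILTER form is used on the assumption that the grammar accepts it.

```sparql
# Assumed data (Turtle syntax), for illustration only:
#   @prefix dc: <http://purl.org/dc/elements/1.1/> .
#   @prefix :   <http://example.org/book/> .
#   @prefix ns: <http://example.org/ns#> .
#   :book1 dc:title "SPARQL Tutorial" ;  ns:price 42 .
#   :book2 dc:title "The Semantic Web" ; ns:price 23 .

PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX ns: <http://example.org/ns#>

SELECT ?title ?price
WHERE
{
  ?x ns:price ?price .
  ?x dc:title ?title .
  FILTER (?price < 30)
}
```

Only the solution for "The Semantic Web" with price 23 would remain. Had a price been recorded as a blank node, the comparison would raise an error for that solution, which would simply be dropped.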

Open: whether to allow "foo"@?v or ?v@fr or ?v^^xsd:integer
or "foo"^^?v
One way to address this is to allow expressions in SELECT

A graph pattern GP may be a set of graph patterns,
GPi. A solution of Graph Pattern GP on graph G
is any solution S such that for each element GPi of GP, S is a solution
of GPi.

Syntactically, a group of patterns is delimited with {}s
(that is, braces).

For any solution, the same variable is given the same value everywhere in
the set of graph patterns. A Basic Graph Pattern
is, as described above, a group of triple patterns. For example, this query has a
group pattern of one basic graph pattern as the query pattern.
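A sketch of a query of this shape is given here; the FOAF properties are chosen purely for illustration. The braces delimit the group, and the two triple patterns inside form one basic graph pattern.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name ?mbox
WHERE
{
  ?x foaf:name ?name .
  ?x foaf:mbox ?mbox .
}
```

Because ?x appears in both triple patterns, a solution must bind it to the same RDF term in each.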

Basic graph patterns allow applications to make queries where
the whole of the query pattern must match for there
to be a solution. For every solution of the query, every variable is bound to
an RDF Term in a pattern solution. However, RDF is semi-structured, so a regular, complete
structure cannot be assumed. It is useful to be able to have queries
that allow information to be added to the solution where the information is
available, but do not reject the solution just because that part of the
query pattern does not match. Optional matching provides this facility; if the
optional part does not lead to any solutions, variables can be left unbound.
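Optional matching might be used as in the following sketch. The sample data, shown as a comment, is an assumption for illustration: it records a mailbox for "Alice" but none for "Bob".

```sparql
# Assumed data (Turtle syntax), for illustration only:
#   @prefix foaf: <http://xmlns.com/foaf/0.1/> .
#   _:a foaf:name "Alice" .
#   _:a foaf:mbox <mailto:alice@example.org> .
#   _:b foaf:name "Bob" .

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name ?mbox
WHERE
{
  ?x foaf:name ?name .
  OPTIONAL { ?x foaf:mbox ?mbox }
}
```

Against that data the query would give two solutions, one for "Alice" and one for "Bob".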

There is no value of mbox in the solution where
the name is "Bob". It is left unbound.

This query finds the names of people in the data, and, if there is a
triple with predicate mbox and same subject, retrieves
the object of that triple as well. In the example, only a single triple pattern is given in
the optional match part of the query but, in general, it is any graph
pattern. The whole graph pattern of an
optional block must match for the optional to add to the query
solution.

In an optional match, either an additional graph pattern matches a graph and
so defines one or more pattern solutions, or gives an empty pattern
solution but does not cause matching to fail overall, leaving existing
solutions in the query results.

Optional patterns can occur inside any pattern, including a group graph pattern which
itself is optional, forming a nested pattern. The outer optional block must match for any
nested one to be matched.

This query finds the name, optionally the mbox, and also
the vCard given name; further, if there is a vCard Family name as well as the
Given name, the query gets that as well.

By nesting the
optional access to vcard:Family, the query only reaches
these if there is a vcard:N predicate.
It is possible to expand out optional blocks to remove nesting at
the cost of duplication of expressions. Here, the expression is a
simple triple pattern on vcard:N but it
could be a complex graph match with value constraints.
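A nested optional of the kind described could be sketched as follows. The variable names are assumptions, as is the use of the vCard RDF namespace shown in the PREFIX declaration.

```sparql
PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>

SELECT ?foafName ?mbox ?gname ?fname
WHERE
{
  ?x foaf:name ?foafName .
  OPTIONAL { ?x foaf:mbox ?mbox }
  OPTIONAL
  {
    # The outer optional block: both patterns must match together.
    ?x  vcard:N     ?vc .
    ?vc vcard:Given ?gname .
    # The nested optional is only tried when the outer block matched.
    OPTIONAL { ?vc vcard:Family ?fname }
  }
}
```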

SPARQL provides a means of combining graph patterns so that one of several
alternative graph patterns may match. If
more than one of the alternatives matches, all the possible pattern solutions
are found.
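A sketch of a query with two alternatives is shown here. The prefix names dc10 and dc11 are assumptions; the namespace URIs are those of the Dublin Core element sets, versions 1.0 and 1.1.

```sparql
PREFIX dc10: <http://purl.org/dc/elements/1.0/>
PREFIX dc11: <http://purl.org/dc/elements/1.1/>

SELECT ?title
WHERE
{
  { ?book dc10:title ?title }
  UNION
  { ?book dc11:title ?title }
}
```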

This query finds titles of the books in the data, whether the title is
recorded using Dublin Core properties
from version 1.0 or version 1.1. If the application wishes to know how exactly
the information was recorded, then the query:

will return results with the variables x or
y bound depending on which way the query
processor matches the pattern to the data. Note that, unlike an
OPTIONAL pattern, if
neither part of the UNION pattern matched, then the query pattern would not match.
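A variant that uses a different variable in each alternative, so that the results show which property was matched, might look like the sketch below. The prefix names are again assumptions.

```sparql
PREFIX dc10: <http://purl.org/dc/elements/1.0/>
PREFIX dc11: <http://purl.org/dc/elements/1.1/>

SELECT ?x ?y
WHERE
{
  { ?book dc10:title ?x }
  UNION
  { ?book dc11:title ?y }
}
```

Each solution comes from exactly one alternative, so in any one solution either x or y is bound and the other is left unbound.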

The working group decided on this design and closed the disjunction issue without reaching consensus. The objection was that adding UNION would complicate implementation and discourage adoption. If you have input to this aspect of SPARQL that the working group has not yet considered, please send a comment to public-rdf-dawg-comments@w3.org.

The RDF data model expresses information as graphs, comprising triples
with subject, predicate and object. Many RDF data stores hold multiple
RDF graphs, and record information about each graph, allowing an application to
make queries that involve information from more than one graph.

A SPARQL query is made against an RDF Dataset which represents
such a collection of graphs. Different parts of the query are matched against
different graphs as described in the next section. There
is one graph, the background graph, which does not have a name,
and zero or more named graphs, identified by URI reference.

An RDF dataset is a set { G, (u1, G1), (u2, G2), ..., (un, Gn) }
where G and each Gi are graphs, and each ui is a URI. Each ui
is distinct.

G is called the background graph. Gi are named graphs.

In the previous sections, all queries have been shown executed against
a single, background graph. A query does not need to involve the background graph; the
query can just involve the named graphs. A query processor is not required to support named graphs.

In this example, the background graph contains the publisher names of two
named graphs. The triples in the named graphs are not visible in the
background graph and, thought of as the default knowledge base, the
application is not directly trusting the information in the named graphs.

Example 2:

RDF data can be combined by RDF merge of graphs so the background graph
can be made to include the RDF merge of some or all of the information in
the named graphs. This information is then being published without
qualification, and a query application accepts it as coming from the
publisher, not just from a source (a named graph) that the publisher
incorporated.

In this next example, the named graphs contain the same information as
before. The RDF dataset includes an RDF merge of the named graphs in the
background graph, relabelling blank nodes to keep them distinct. Doing this is
trusting the contents of the named graphs. An implementation can efficiently
provide datasets of this form without duplicating stored triples.

Access to the graph labels of the collection of graphs being queried is by
variable in the GRAPH expression.

The query below matches the pattern on each of the named graphs in the
dataset and forms solutions which have the src
variable bound to URIs of the graph being matched. The pattern part of the
GRAPH only matches triples in a single named graph,
in the same way that a graph pattern matches the background graph when there is
no GRAPH clause being applied.
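A query of this form might be sketched as follows. The mailbox URI and the variable bobNick are assumptions made for illustration.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?src ?bobNick
WHERE
{
  # The inner pattern is matched against one named graph at a time;
  # ?src is bound to the URI of the graph that produced the match.
  GRAPH ?src
  {
    ?x foaf:mbox <mailto:bob@work.example> .
    ?x foaf:nick ?bobNick .
  }
}
```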

A variable used in the GRAPH clause may also be
used elsewhere in the query, whether in another GRAPH
clause or in a graph pattern matched against the background graph in the
dataset.

This can be used to find information in one part of a query, and to use it
to restrict the graphs matched in another part of the query. The query below uses the graph
with URI http://example.org/foaf/aliceFoaf to find the profile document for Bob;
it then matches another pattern against that graph. Note that the pattern in the
second GRAPH part finds the
blank node for the person with the same mail box (given by variable
mbox) as found in the first GRAPH
part, because the blank node used to match for variable
whom from Alice's FOAF file is not the
same as the blank node in the profile document
(they are in different graphs).
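The query described above could be sketched as follows. The variable names ppd, w and nick, the mailbox for Alice, and the use of rdfs:seeAlso to point at the profile document are assumptions for illustration.

```sparql
PREFIX data: <http://example.org/foaf/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?mbox ?nick ?ppd
WHERE
{
  # First part: find whom Alice knows and where their profile lives.
  GRAPH data:aliceFoaf
  {
    ?alice foaf:mbox <mailto:alice@work.example> ;
           foaf:knows ?whom .
    ?whom  foaf:mbox ?mbox ;
           rdfs:seeAlso ?ppd .
    ?ppd   rdf:type foaf:PersonalProfileDocument .
  }
  # Second part: ?ppd now restricts which named graph is matched.
  # The person is found again through ?mbox, using a new variable ?w.
  GRAPH ?ppd
  {
    ?w foaf:mbox ?mbox ;
       foaf:nick ?nick .
  }
}
```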

Query patterns can involve both the background graph and the named graphs. In
this example, an aggregator has read in a web resource on two different
occasions. Each time a graph is read into the aggregator, it is given a
URI by the local system. The graphs are nearly the same but the email address
for "Bob" has changed.

The background graph is being used to record the provenance information and
the RDF data actually read is kept in two separate graphs, each of which is
given a different URI by the system. The RDF dataset consists of two named
graphs and the information about them.

This section has been added back into the document because the
Working Group is now considering putting some language constructs to specify
datasets. As such, this text has been added but not fully integrated and
may be inconsistent with the rest of the document. Comments about the
design are especially welcome.

The FROM clause gives a URI that the query processor
can use to create the background graph and the FROM NAMED
clause can be used to specify named graphs.

A query processor may use these
URIs in any way to associate an RDF Dataset with a query. For example, it could use URIs to retrieve documents,
parse them and use the resulting triples as one of the graphs; alternatively, it might only service queries
that specify URIs of graphs that it already has stored.

The FROM clause gives a single URI that indicates the
graph to use as the background graph. This does not automatically put the graph
in as a named graph; a query can do this by also specifying the graph in the
FROM NAMED clause.
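A dataset description might be written as in the sketch below. All three graph URIs are hypothetical, and how a processor turns them into graphs is left open, as noted above.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?src ?name
FROM       <http://example.org/dataset/provenance>
FROM NAMED <http://example.org/foaf/aliceFoaf>
FROM NAMED <http://example.org/foaf/bobFoaf>
WHERE
{
  # Matched against each named graph in turn; the graph given by
  # FROM is the background graph and is not matched here.
  GRAPH ?src { ?x foaf:name ?name }
}
```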

Examples

Examples of where it matters: OPTIONALs and FILTERs

Evaluation Rules : Definitions

For pattern P, let var(P) be the variables mentioned
by P or any of its sub-patterns.
For pattern P, let var-u(P) be the variables mentioned
by P or any of its sub-patterns such that, if x is in var-u(P)
and P is a union expression, then x occurs in all sub-patterns.
For group GP = { (P1 or C1), (P2 or C2), ... (PN or CN) },
each Pi is a pattern and
each Ci is a constraint.

Fixed patterns: basic graph patterns and UNIONS

Recursive definition for nested patterns

Evaluation rule: Optional-1

Informally, this rule states that an optional pattern must be executed as if
it came after any basic graph patterns with which it has a variable in common.

If variable x in var(Pi), and Pi is an optional
and x in var(Pj) and Pj is a triple pattern or union
then
j < i

Evaluation rule: Optional-2

Informally, this rule states that there can't be two optionals with a
common variable, if that variable does not occur in a basic graph pattern as well.

If variable x in var(Pi), and Pi is an optional
and x in var-u(Pj) and Pj an optional, i != j
then x must occur in some fixed Pk

By rule Optional-1, k < i and k < j.

Evaluation rule: Constraint

Informally, this rule states that constraints are evaluated after variables
are assigned values.

If Ci is a constraint expression, variable x in var(Ci)
and x in var(Pj)
then
j < i

Query patterns generate a number of solutions and each solution is a set of
variables and associated RDF terms. These solutions are passed through a stage
to control the solution sequence, then passed to the result form for the query.

The controls on the sequence of solutions are:

Projection

DISTINCT: ensure solutions in the sequence are
unique.

ORDER BY: put the solutions in order

LIMIT: restrict the number of solutions
processed for query results

OFFSET: control where the solutions processed
start from in the overall sequence of solutions.

The effect is as if these controls are applied in the order given.

@@ToDo@@ could make sense, with LIMIT and OFFSET, in
CONSTRUCT and DESCRIBE

The solution sequence can be modified by adding the DISTINCT
keyword which ensures that every combination of variable bindings (i.e. each
solution) in the sequence is unique. Thought of as a table, each row is
different.
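As a small sketch, with the FOAF property chosen only for illustration, the query below would return each distinct name once, however many resources carry it.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT DISTINCT ?name
WHERE { ?x foaf:name ?name }
```

Because ?x is not projected, two people with the same name give identical solutions after projection, and DISTINCT keeps only one of them.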

The ORDER BY clause takes a solution sequence and
applies ordering conditions. An ordering condition can be a variable or a
function call. The direction of ordering is ascending by default. It can be
explicitly set to ascending or descending by enclosing the condition in
ASC[] or DESC[]
respectively. If multiple
conditions are given, then they are applied in turn until one gives
the indication of the ordering.

Using ORDER BY on a solution sequence for a result form other than
SELECT has no direct effect because only
SELECT returns a sequence of results, not an RDF
graph. However, in combination with LIMIT and
OFFSET, it can be used to return partial results.

Ordering by expression

When ordering a solution sequence involves an expression, it is possible that the
ordering conditions do not give a completely determined ordering for the sequence.
In this case, the relative order of solutions that the conditions do not distinguish
is not determined.

Ordering by RDF term

If an ordering condition is a variable, SPARQL defines a
fixed, arbitrary order between some kinds of RDF terms that would not otherwise
be ordered. This arbitrary order is
necessary to provide slicing of query solutions by use of
LIMIT and OFFSET.

(Lowest) no value assigned to the variable in this solution.

Blank nodes

URIs

RDF literals

A plain literal before an XSD string with the same lexical form.

RDF Literals are compared with the "<" operator (see below) where
possible.

If the ordering criteria do not specify the order of values, then
the ordering in the solution sequence is undefined. However, an
implementation must consistently impose the same order so that applying
LIMIT/OFFSET will not miss any solutions.

Ordering a sequence of solutions always results in a sequence with the
same number of solutions in it, even if the ordering criteria do not
differentiate between two solutions.

OFFSET causes the solutions generated to start
after the specified number of solutions. An OFFSET
of zero has no effect.

The order in which solutions are returned is undefined, so using
LIMIT and OFFSET to select
different subsets of the query solutions will not be useful unless the
order is made predictable by ensuring ordered results using
ORDER BY.
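Slicing an ordered sequence might be written as in this sketch. The page size of 5 and the offset of 10 are arbitrary values chosen for illustration.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name
WHERE    { ?x foaf:name ?name }
ORDER BY ?name
LIMIT    5
OFFSET   10
```

With the solutions ordered by name, this would skip the first ten and return at most the next five, so repeating the query with OFFSET 15 would continue where this slice ends.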

The CONSTRUCT result form returns a single RDF
graph specified by a graph template. The result is an RDF graph formed by taking each query solution
in the solution sequence, substituting for the variables into the
graph template and combining the triples into a single RDF graph by set union.

If any such instantiation produces a triple
containing an unbound variable, or an illegal RDF construct (such as a
literal in subject or predicate position) then that triple is not included
in the RDF graph, and a warning may be generated.
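A minimal sketch of a CONSTRUCT query follows, with sample data shown as a comment. The data, the subject URI in the template and the choice of vcard:FN are assumptions for illustration.

```sparql
# Assumed data (Turtle syntax), for illustration only:
#   @prefix foaf: <http://xmlns.com/foaf/0.1/> .
#   _:a foaf:name "Alice" .
#   _:a foaf:mbox <mailto:alice@example.org> .

PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>

CONSTRUCT { <http://example.org/person#Alice> vcard:FN ?name }
WHERE     { ?x foaf:name ?name }
```

The single solution binds name to "Alice", so the resulting graph would hold one triple, with subject <http://example.org/person#Alice>, predicate vcard:FN and object "Alice".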

Templates with Blank Nodes

A template can create an RDF graph containing blank nodes.
The labels are scoped to the template for each solution. If two such
prefixed names share the same label in the template, then there will be one
blank node created for each query solution but there will be different blank nodes
across triples generated by different query solutions.

The use of variable ?x in the template, which in this example will be
bound to blank nodes, causes an equivalent graph to be constructed with a
different blank node as shown by the document-scoped label.

Accessing Graphs in the RDF Dataset

Using CONSTRUCT it is possible to extract parts
of, or the whole of, graphs from the target RDF dataset. This first example
returns the graph (if it is in the dataset) with URI label
http://example.org/myGraph otherwise it returns an empty graph.
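Such a query might be sketched as follows, copying every triple of the named graph into the result.

```sparql
CONSTRUCT { ?s ?p ?o }
WHERE     { GRAPH <http://example.org/myGraph> { ?s ?p ?o } }
```

If no graph with that URI is in the dataset, the pattern has no solutions and the template is never instantiated, which gives the empty graph.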

The access to the graph can be conditional on other information.
Suppose the background graph contains metadata about the named graphs in the
dataset then a query like this next one can extract one graph based on
information about the named graph:

The DESCRIBE form returns a single RDF graph
containing RDF data
about resources. This data is not prescribed by a SPARQL query,
where the query client would need to know the structure of the RDF in the data
source, but, instead, is determined by the SPARQL query processor.

The query pattern is used to create a result set. The
DESCRIBE form takes each of the resources identified in a solution,
together with any resources directly named by URI, and assembles a single RDF
graph by taking a "description" from the target knowledge base. The
description is determined by the query processor implementation and should
provide a useful description of the resource, where "useful" is left to the nature
of the information in the data source.

If a data source has no information about a resource, no RDF triples are
added to the result graph but the query does not fail.

The working group adopted DESCRIBE without reaching consensus. The objection was that the expectations around DESCRIBE are very different from CONSTRUCT and SELECT, and hence it should be specified in a separate query language. If you have input to this aspect of SPARQL that the working group has not yet considered, please send a comment to public-rdf-dawg-comments@w3.org.

Explicit URIs

The DESCRIBE clause itself can take URIs to
identify the resources. The simplest query is just a URI in the
DESCRIBE clause:

DESCRIBE <http://example.org/>

Identifying Resources

The resources can also be a query variable from a result set. This enables
description of resources whether they are identified by URI or blank node in the
dataset being queried.
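A sketch of a DESCRIBE query that identifies the resource through a pattern is given here. The mailbox URI is an assumption for illustration.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

DESCRIBE ?x
WHERE    { ?x foaf:mbox <mailto:alice@example.org> }
```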

The property foaf:mbox is defined as being an
inverse functional property in the FOAF vocabulary so, if treated as such, this query will
return information about at most one person. If, however, the query pattern
has multiple solutions, the RDF data for each
is the union of all RDF graph descriptions.

Descriptions of Resources

The RDF returned is the choice of the
deployment and may be dependent on the query processor implementation, data
source and local configuration. It should be the useful information the server
has (within security matters outside of SPARQL) about a resource.
It may include information about other resources: the RDF data
for a book may also include details of the author.

which includes the blank node closure for the vcard vocabulary vcard:N. For a vocabulary such as FOAF, where the
resources are typically blank nodes, returning sufficient information to
identify a node such as the InverseFunctionalProperty
foaf:mbox_sha1sum as well as information such as name
and other details recorded would be appropriate. In the example,
the match to the WHERE clause was returned but this is not
required.

SPARQL expressions are constructed according to the grammar and provide access to named functions and syntactically constructed operations. The operands of these functions and operators are the subset of XML Schema DataTypes {xsd:string, xsd:decimal, xsd:double, xsd:dateTime} and types derived from xsd:decimal.
The SPARQL operations are listed in table 11.1 and are associated with productions in the grammar. In addition, SPARQL imports a subset of the XPath casting functions, listed in table 11.2, which are invokable by name within a SPARQL query. These functions and operators are taken from the XQuery 1.0 and XPath 2.0 Functions and Operators [17].

The namespace for XPath functions that are directly available by name is http://www.w3.org/2004/07/xpath-functions.
The associated namespace prefix used in this document is fn:. XPath operators are named with the prefix op:, XML Schema datatypes with the prefix xs:, and types of RDF terms with the prefix r:. SPARQL operators are named with the prefix sop:.

These invoke XQuery's numeric type promotion to cast function arguments to the appropriate type. In summary: each of the numeric types is promoted to any type higher in the above list when used as an argument to a function expecting that higher type. When an argument is promoted, the value is cast to the expected type. For instance, a "7"^^xs:decimal will be converted to a "7.0E0"^^xs:double when passed to an argument expecting an xs:double. Promotion does not change the bindings of variables.

The operators defined below that take numeric arguments expect all arguments to be the same type. This is accomplished by promoting the argument with the lower type to the same type as the other argument. For example, "7"^^xs:decimal+"6.5"^^xs:float would call op:numeric-add("7"^^xs:float, "6.5"^^xs:float). In addition, any r:Literal may be cast to xs:string or xs:numeric when used as an argument to an operator expecting that type.

XML Schema [] defines a set of types derived from decimal: integer; nonPositiveInteger; negativeInteger; long; int; short; byte; nonNegativeInteger; unsignedLong; unsignedInt; unsignedShort; unsignedByte and positiveInteger. These are all treated as decimals for computing effective boolean values. SPARQL does not specifically require integrity checks on derived subtypes. SPARQL has no numeric type test operators so the distinction between a primitive type and a type derived from that primitive type is unobservable.

SPARQL provides a subset of the functions and operators defined by XQuery Operator Mapping. XQuery 1.0 section 2.2.3 Expression Processing describes the invocation of XPath functions. The following rules accommodate the differences in the data and execution models between XQuery and SPARQL:

Unlike XPath/XQuery, SPARQL functions do not process node sequences. When interpreting the semantics of XPath functions, assume that each argument is a sequence of a single node.

Functions invoked with an argument of the wrong type will produce a type error.

Any expression other than value disjunction (||) that encounters a type error will produce a type error.

A value disjunction that encounters a type error on only one branch will return the result of evaluating the other branch.

A value disjunction that encounters type errors on both branches will produce a type error.
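The error-handling rules for value disjunction, as stated above, can be sketched in Python. This is an illustration only: SparqlTypeError and value_or are hypothetical names, and each branch is modelled as a zero-argument callable that either returns a boolean or raises.

```python
class SparqlTypeError(Exception):
    """Hypothetical stand-in for a SPARQL type error."""


def value_or(left, right):
    """Evaluate left || right; each branch is a zero-argument callable."""
    try:
        left_result = left()
    except SparqlTypeError:
        # The left branch failed, so the right branch decides the result.
        # If the right branch also fails, its type error propagates.
        return right()
    if left_result:
        return True
    try:
        return right()
    except SparqlTypeError:
        # Only the right branch failed: return the left branch's result.
        return left_result


def type_error():
    raise SparqlTypeError("argument of the wrong type")
```

For example, value_or(type_error, lambda: True) yields True, while value_or(type_error, type_error) raises SparqlTypeError.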

The SPARQL grammar identifies a set of operators (for instance, &&, *, isUri) used to construct constraints. The following table associates each of these grammatical productions with an operator defined by either the XQuery Operator Mapping or the additional SPARQL operators specified in section 11.2.2.

Some of the operators are associated with nested function expressions, e.g. fn:not(op:numeric-equal(A, B)). Note that per the XPath definitions, fn:not and op:numeric-equal return an error if their argument is an error.

†fn:string-match requires a collation to define character order and string equivalence. The XQuery 1.0 and XPath 2.0 Functions and Operators [F&O] defines the semantics of fn:string-compare and establishes a default collation. In addition, it identifies a specific collation with a distinguished name, http://www.w3.org/2004/10/xpath-functions/collation/codepoint, which provides the ability to compare strings based on code point values. Every implementation of SPARQL must support the collation based on code point values.

This section defines the operators introduced by the SPARQL Query language. The names of the operators are prefixed with sop:. The examples show the behavior of the operators as invoked by the appropriate grammatical constructs.

Returns TRUE if the two arguments are the same RDF term or they are literals known to have the same value. The latter is tested with an XQuery function appropriate to the arguments. This function is overloaded because there is no syntactic way to separate xs:string = xs:string from r:literal = r:literal (or r:uri or r:bNode). I think I'm happy with that. Are you, dear reader?

The following sop:RDFterm-equal example passes the test because the mbox terms are the same RDF term:

One may test that a graph pattern is not expressed by specifying an optional graph pattern that introduces a variable and testing to see that the variable is not bound. This is called Negation as Failure in logic programming.
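The idea can be modelled outside SPARQL with a small Python sketch, in which solutions are dictionaries from variable names to terms. The data and the helper names left_join and bound are hypothetical; the sketch only mirrors the combination of an optional match followed by an unbound-variable test.

```python
def left_join(solutions, optional_matches, key):
    """Extend each solution with optional matches on key, keeping unmatched ones."""
    joined = []
    for solution in solutions:
        matches = [m for m in optional_matches if m[key] == solution[key]]
        if matches:
            joined.extend({**solution, **m} for m in matches)
        else:
            # No optional match: the optional variables stay unbound.
            joined.append(dict(solution))
    return joined


def bound(solution, variable):
    """True if the variable has a binding in this solution."""
    return variable in solution


# Illustrative data: Bob has no mbox.
people = [
    {"person": "_:a", "name": "Alice"},
    {"person": "_:b", "name": "Bob"},
]
mboxes = [{"person": "_:a", "mbox": "mailto:alice@example.org"}]

joined = left_join(people, mboxes, "person")
# Keep only the solutions where the optional pattern did not match.
without_mbox = [s["name"] for s in joined if not bound(s, "mbox")]
```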

This query is similar to the one in 1.2.1.3 except that it matches the people with a name and an mbox which is a Literal. This would be used to look for erroneous data (foaf:mbox should only have a URI as its object).

Implementations may provide custom extended value testing
operations, for example, for specialized datatypes. These are
provided by functions in the query that return true or false for
their arguments.

A function returns an RDF term. It might be used to test some
application datatype not supported by the core SPARQL specification, or it might
be a transformation between datatype formats, for example into an XSD dateTime
RDF term from another date format.

The function is called during FILTER evaluation for each possible
query solution. A function is named by URI in a QName form, and returns an RDF
term.

If a query processor encounters a function that it does not
provide, the query is not executed and an error is returned.

Functions should have no side-effects. A SPARQL query processor may
remove calls to functions if it can optimize them away.
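As an illustration, a query processor might keep extension functions in a registry keyed by URI and reject a query up front when a function is missing. The following Python sketch assumes such a registry; the URI http://example.org/fn#even, the function is_even, and the helper names are all hypothetical.

```python
class UnknownFunctionError(Exception):
    """Raised when a query names a function the processor does not provide."""


def is_even(term):
    """Illustrative side-effect-free value test on a numeric lexical form."""
    return int(term) % 2 == 0


# Hypothetical registry mapping function URIs to implementations.
EXTENSION_FUNCTIONS = {"http://example.org/fn#even": is_even}


def check_functions(function_uris):
    """Reject the query before execution if any function is not provided."""
    missing = [uri for uri in function_uris if uri not in EXTENSION_FUNCTIONS]
    if missing:
        raise UnknownFunctionError(", ".join(missing))


def run_filter(solutions, function_uri, variable):
    """Call the function once for each candidate solution, as in FILTER."""
    check_functions([function_uri])
    function = EXTENSION_FUNCTIONS[function_uri]
    return [s for s in solutions if function(s[variable])]


kept = run_filter([{"x": "2"}, {"x": "3"}], "http://example.org/fn#even", "x")
```

Because is_even has no side effects, a processor would be free to cache or skip calls without changing the result.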

Section status: drafted – terminal syntax not
checked against that of the XML 1.1 spec

A SPARQL query is a sequence of characters in the language defined by the following grammar, starting with the Query production. The EBNF format is the same as that used in the XML 1.1
specification. Please see the
"Notation"
section of that specification for specific information about
the notation.

Whitespace

Whitespace is used to separate two terminals which would otherwise be (mis-)recognized
as one terminal. Whitespace in terminals is significant. Otherwise
whitespace is ignored. Terminals are shown below enclosed in
<> or shown in-line.

Keywords

Keywords are shown in uppercase and are matched in a case-insensitive
manner. The exception is the keyword 'a'
which, in line with Turtle and N3, is used in place of the URI
rdf:type (in full,
http://www.w3.org/1999/02/22-rdf-syntax-ns#type).

Comments

Comments in SPARQL queries take the form of '#', outside a URI or string,
and continue to the end of line or end of file if there is no end of line
after the comment marker.
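The comment rule can be sketched as a small scanner in Python. This is a simplification, not the grammar's tokenizer: it assumes a '<' starts a URI only when a closing '>' appears before any whitespace, and that strings are delimited by single or double quotes with backslash escapes. The function name strip_comments is hypothetical.

```python
def strip_comments(query):
    """Remove '#' comments that occur outside URIs and quoted strings."""
    out = []
    i, n = 0, len(query)
    while i < n:
        ch = query[i]
        if ch in "\"'":
            # Copy a quoted string verbatim, honoring backslash escapes.
            j = i + 1
            while j < n and query[j] != ch:
                j += 2 if query[j] == "\\" else 1
            out.append(query[i:j + 1])
            i = j + 1
        elif ch == "<":
            # Assumption: <...> is a URI only if '>' precedes any whitespace;
            # otherwise '<' is treated as the less-than operator.
            j = i + 1
            while j < n and query[j] != ">" and not query[j].isspace():
                j += 1
            if j < n and query[j] == ">":
                out.append(query[i:j + 1])
                i = j + 1
            else:
                out.append(ch)
                i += 1
        elif ch == "#":
            # Skip to the end of line, or to end of input if no newline follows.
            while i < n and query[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


query = 'SELECT ?x WHERE { ?x <http://example.org/ns#p> "a#b" } # trailing'
cleaned = strip_comments(query)
```

The '#' inside the URI and the '#' inside the string are preserved; only the trailing comment is dropped.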

We expect to state some formal characteristics of the grammar in later drafts.