The SPARQL Protocol and RDF Query Language (SPARQL) is a query language and protocol for RDF. This document
specifies the SPARQL Protocol; it uses WSDL 2.0 to describe a means for
conveying SPARQL queries to a SPARQL query processing service and returning the query results to the entity that
requested them. This protocol was developed by the W3C RDF Data Access
Working Group (DAWG), part of the Semantic Web Activity as described in
the activity statement.

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This November 2007 publication is a Proposed
Recommendation. W3C Members and other interested parties are invited to review
the document through 10 December 2007.

The Working Group produced a series of tests from the examples in this document,
along with a Python test harness for running
the tests against SPARQL Protocol for RDF endpoints.
The Working Group's SPARQL Protocol For RDF
Implementation Report uses this test suite to demonstrate that the goals for interoperable implementations and
two conformant SPARQL services have been achieved.

Publication as a Proposed Recommendation does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document (which refers to itself as "SPARQL Protocol for RDF") describes SPARQL Protocol, a means of conveying
SPARQL queries from query clients to query processors. SPARQL Protocol has been designed for compatibility with the SPARQL Query Language for RDF [SPARQL]. SPARQL Protocol is described
in two ways: first, as an abstract interface independent of any concrete realization, implementation, or binding to
another protocol; second, as HTTP and SOAP bindings of this interface. This document, as well as the
associated WSDL and W3C XML Schema documents, is primarily intended for software developers interested in implementing
SPARQL query services and clients.

When this document uses the words must, must not, should, should not, may and recommended,
and the words appear as emphasized text, they must be interpreted as described in RFC 2119 [RFC2119].

When this document contains excerpts from other documents, including WSDL and XML Schema instances, it uses the following namespace prefixes and namespace URIs:

The XML Schema document that normatively defines the types used in SPARQL Protocol.

SPARQL Protocol contains one interface, SparqlQuery, which in turn contains one operation,
query. SPARQL Protocol is described abstractly with WSDL 2.0 [WSDL2] in terms of a web service that implements its interface, types,
faults, and operations, as well as by HTTP and SOAP bindings. Note that while this document uses WSDL 2.0 to describe
SPARQL Protocol, there is no obligation on the part of any implementation to use any particular implementation strategy,
including the use of any WSDL library or programming language framework.

2.1.2 query In Message

Abstractly, the content of the In Message of SparqlQuery's query operation is an
instance of an XML Schema complex type, called st:query-request in Excerpt 1.0, composed of two further
parts: one SPARQL query string; and zero or one RDF dataset descriptions. The SPARQL query string,
identified by one query type, is defined by
[SPARQL] as "a sequence of characters in the language defined by the [SPARQL] grammar, starting with the Query
production". The RDF dataset description is composed of zero or one default RDF graphs — composed by the RDF
merge of the RDF graphs identified by zero or more default-graph-uri types — and by zero or more
named RDF graphs, identified by zero or more named-graph-uri types. These correspond to the
FROM and FROM NAMED keywords in [SPARQL], respectively.

The RDF dataset may be specified either in a [SPARQL] query using FROM and FROM NAMED
keywords; or it may be specified in the protocol described in this document; or it may be specified in both the query
string and in the protocol.

In the case where both the query and the protocol specify an RDF
dataset, but not the identical RDF dataset, the dataset
specified in the protocol
must be the RDF dataset consumed by
SparqlQuery's query operation.
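
The precedence rule above can be sketched as follows. This is a minimal illustration, not part of the protocol: the `Dataset` type and the `effective_dataset` function are hypothetical names, and the sketch assumes the dataset named in the query string has already been extracted from its FROM and FROM NAMED clauses.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Dataset:
    """Hypothetical RDF dataset description holding graph IRIs only."""
    default_graph_uris: Tuple[str, ...] = ()
    named_graph_uris: Tuple[str, ...] = ()


def effective_dataset(from_query: Optional[Dataset],
                      from_protocol: Optional[Dataset]) -> Optional[Dataset]:
    """Return the dataset the query operation should consume.

    A dataset given in the protocol request takes precedence over one
    given in the query string. If the protocol request gives none, the
    query's own dataset (possibly None) is used.
    """
    if from_protocol is not None:
        return from_protocol
    return from_query
```

A result of `None` here simply means that neither the query nor the protocol request described a dataset; what the service does in that case is outside this sketch.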

The BASE keyword in the query string defines the Base
IRI used to resolve relative IRIs per Uniform Resource Identifier
(URI): Generic Syntax [RFC3986] section 5.1.1, "Base URI
Embedded in Content". Section 5.1.2, "Base URI from the Encapsulating
Entity" defines how the Base IRI may come from an encapsulating
document, such as a SOAP envelope with an xml:base directive. The SPARQL
Protocol does not dereference query URIs so section 5.1.3 does not
apply. Finally, per section 5.1.4, SPARQL Protocol services must define
their own base URI, which may be the service invocation
URI.
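
The order in which a base IRI is chosen can be sketched as below. This is an illustration under stated assumptions: `resolve_iri` and its parameter names are hypothetical, the example URLs are invented, and Python's `urllib.parse.urljoin` is used as a stand-in for RFC 3986 reference resolution (it operates on the string as given and does no IRI-specific processing).

```python
from typing import Optional
from urllib.parse import urljoin


def resolve_iri(reference: str,
                service_base: str,
                query_base: Optional[str] = None,
                envelope_base: Optional[str] = None) -> str:
    """Resolve a possibly relative IRI against the first available base.

    Order of preference, following RFC 3986 section 5.1:
      1. the BASE declared in the query string (5.1.1),
      2. a base taken from the encapsulating entity (5.1.2),
      3. the service's own base, e.g. its invocation URI (5.1.4).
    """
    candidates = (query_base, envelope_base, service_base)
    base = next(b for b in candidates if b is not None)
    return urljoin(base, reference)
```

For example, `resolve_iri("books", "http://www.example/sparql/", query_base="http://www.other.example/data/")` resolves against the query's BASE rather than the service URI.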

Any message after the first in the pattern may be replaced with a fault message, which
must have identical direction. The fault message must be delivered to the same target node
as the message it replaces, unless otherwise specified by an extension or binding extension. If there is no path to this
node, the fault must be discarded.

Thus, the query operation contained in the SparqlQuery interface may return, in place of the Out Message, either the MalformedQuery message or the
QueryRequestRefused message, both of which are defined in this XML Schema fragment from protocol-types.xsd:

When the value of the query type is not a legal sequence of characters in the language defined by the
SPARQL grammar, the MalformedQuery or QueryRequestRefused fault message must be returned. According to the Fault Replaces Message Rule, if a WSDL fault is returned, including MalformedQuery, an Out Message must not be returned.

When the MalformedQuery fault message is returned, query processing services must
include explanatory, debugging, or other additional information for human consumption via the
fault-details type defined in Excerpt 1.3.

This WSDL fault message should be returned when a client submits a request that the service refuses
to process. The QueryRequestRefused fault message neither indicates whether the server may or may not
process a subsequent, identical request or requests, nor does it constrain a conformant SPARQL service from returning other HTTP status codes or HTTP
headers as appropriate given the semantics of [HTTP].

When the QueryRequestRefused fault message is returned, query processing services must
include explanatory, debugging, or other additional information intended for human consumption via the
fault-details type defined in Excerpt 1.3.

The SparqlQuery interface operation query described thus far is an abstract operation; it
requires protocol bindings to become an invocable operation. The next two sections of this document describe HTTP and SOAP
bindings. A conformant SPARQL Protocol service must
support the SparqlQuery interface; if a SPARQL Protocol service supports HTTP bindings, it
must support the bindings as described in protocol-query.wsdl. A SPARQL Protocol service may support
other interfaces. See 2.3 SOAP Bindings for more information.

[WSDL2-Adjuncts] defines a means of binding abstract interface operations to HTTP. The HTTP bindings for the
query operation (from protocol-query.wsdl) are as follows:

There are two HTTP bindings, queryHttpGet and queryHttpPost, both of which are
described as bindings of the SparqlQuery interface. In each of these bindings, the two faults described in the SparqlQuery interface, MalformedQuery and QueryRequestRefused, are bound
to HTTP status codes 400 Bad Request and
500 Internal Server Error, respectively [HTTP].
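
On the service side, the fault-to-status binding described above amounts to a small lookup table. The sketch below is illustrative only: `FAULT_STATUS` and `fault_status_line` are hypothetical names, and HTTP/1.1 is assumed for the status line.

```python
# Fault names as defined in the SparqlQuery interface, mapped to the
# HTTP status codes they are bound to in the HTTP bindings.
FAULT_STATUS = {
    "MalformedQuery": (400, "Bad Request"),
    "QueryRequestRefused": (500, "Internal Server Error"),
}


def fault_status_line(fault_name: str) -> str:
    """Build the HTTP status line for a named fault (HTTP/1.1 assumed)."""
    code, reason = FAULT_STATUS[fault_name]
    return f"HTTP/1.1 {code} {reason}"
```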

The queryHttpGet binding should be used except in cases where the URL-encoded query exceeds
practical limits, in which case the queryHttpPost binding should be used.

An Informative Note About Serialization Constraints. The output serialization of the
queryHttpGet and queryHttpPost bindings is intentionally under-constrained in order to reflect the
variety of serialization types of RDF graphs. The fault serialization of queryHttpGet and
queryHttpPost is also intentionally under-constrained. A conformant SPARQL Protocol service can provide alternative WSDL interfaces
and bindings with different constraints.

This binding of the query operation uses [HTTP] GET with the following serialization type
constraints: first, the value of whttp:faultSerialization is */*; second, the value of
whttp:inputSerialization is application/x-www-form-urlencoded with UTF-8 encoding; and,
third, the whttp:outputSerialization is application/sparql-results+xml with UTF-8 encoding,
application/rdf+xml with UTF-8 encoding, and */*.

This binding of the query operation uses [HTTP] POST with the following serialization type
constraints: first, the value of whttp:faultSerialization is */*; second, the value of
whttp:inputSerialization is application/x-www-form-urlencoded with UTF-8 encoding and
application/xml with UTF-8 encoding; and, third, the whttp:outputSerialization is
application/sparql-results+xml with UTF-8 encoding, application/rdf+xml with UTF-8 encoding, and
*/*.
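
From a client's point of view, the two HTTP bindings differ mainly in where the form-encoded parameters travel. The following sketch builds the request parts for either binding; it is not a complete client. The function name `build_query_request`, the 2000-character threshold standing in for "practical limits", and the example endpoint are all assumptions made for illustration.

```python
from typing import Dict, Optional, Sequence, Tuple
from urllib.parse import urlencode

# Assumed threshold for "practical limits" on URL length; the protocol
# itself does not define a number.
MAX_GET_URL_LENGTH = 2000


def build_query_request(
    endpoint: str,
    query: str,
    default_graph_uris: Sequence[str] = (),
    named_graph_uris: Sequence[str] = (),
    max_url_length: int = MAX_GET_URL_LENGTH,
) -> Tuple[str, str, Optional[bytes], Dict[str, str]]:
    """Return (method, url, body, headers) for a query operation."""
    # One query parameter, then zero or more graph parameters.
    params = [("query", query)]
    params += [("default-graph-uri", u) for u in default_graph_uris]
    params += [("named-graph-uri", u) for u in named_graph_uris]
    encoded = urlencode(params)  # UTF-8, application/x-www-form-urlencoded

    url = f"{endpoint}?{encoded}"
    if len(url) <= max_url_length:
        # GET binding: parameters travel in the URL's query component.
        return "GET", url, None, {}

    # POST binding: the same parameters travel in the request body.
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return "POST", endpoint, encoded.encode("utf-8"), headers
```

Calling `build_query_request("http://www.example/sparql/", "ASK {}")` yields a GET request, while a query long enough to push the URL past the threshold yields a POST request to the bare endpoint with the encoded parameters as the body.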

The following abstract HTTP trace examples illustrate invocation of
the query operation under several different
scenarios. These example traces are abstracted from complete HTTP
traces in three ways: (1) In each example the string
"EncodedQuery" represents the URL-encoded string equivalent of
the SPARQL query given in the first block of each example; (2) only
partial response bodies, containing the query results, are displayed;
(3) the URI values of default-graph-uri
and named-graph-uri are also not URL-encoded.

That query — against the RDF dataset identified by the value
of the default-graph-uri
parameter, http://www.other.example/books — executed
by that SPARQL query service, returns the following query
result:

This protocol operation contains an ambiguous RDF dataset: the dataset specified in the query is different than the one
specified in the protocol (by way of default-graph-uri and named-graph-uri parameters). A
conformant SPARQL Protocol service must resolve this ambiguity by executing the query against the RDF dataset specified in
the protocol:

Some SPARQL queries, perhaps machine generated, may be longer than
can be reliably conveyed by way of the HTTP GET binding described in
2.2 HTTP Bindings. In those cases
the POST binding described in 2.2 may be used. This SPARQL query

The name of the SOAP binding
of SparqlQuery's query operation
is querySoap; it is a SOAP binding because of the value
of the type attribute, which is set to the URI identifying
SOAP. The version of SOAP is 1.2. The underlying
protocol used in this SOAP binding is HTTP, as determined by the URI
value of the wsoap:protocol attribute. If a SPARQL
Protocol service supports SOAP bindings with the value of
the {http://www.w3.org/2006/01/wsdl/soap, protocol}
attribute set
to http://www.w3.org/2003/05/soap/bindings/HTTP,
it must support the bindings as described
in protocol-query.wsdl. SOAP
bindings with wsoap:protocol values set to transmission
protocols other than HTTP are not described in this document.

The two fault elements refer to the fault messages
defined in the SparqlQuery interface.

Finally, the operation element references
the query operation of the SparqlQuery
interface which has been previously described
in Excerpt 1.0 above. Since this SOAP
binding describes the operation as using HTTP as the underlying
transport protocol, the value of the wsoap:mep
attribute determines which HTTP method is to be used. This operation
is described as being implemented by a SOAP message exchange
pattern http://www.w3.org/2003/05/soap/mep/request-response,
which, according to [SOAP12] 7.4 Supported Features, is bound to an
HTTP POST method.

There are at least two possible sources of denial-of-service attacks against SPARQL protocol services. First,
under-constrained queries can result in very large numbers of results, which may require large expenditures of computing
resources to process, assemble, or return. Another possible source is queries containing very complex RDF dataset
descriptions (complex because of resource size, the number of resources to be retrieved, or a combination of size and
number), which the service may be unable to assemble without significant expenditure of resources, including bandwidth,
CPU, or secondary storage. In some cases such expenditures may effectively constitute a denial-of-service attack. A SPARQL
protocol service may place restrictions on the resources that it retrieves or on the rate at which external
resources are retrieved. There may be other sources of denial-of-service attacks against SPARQL query processing services.

Since a SPARQL protocol service may make HTTP requests of other origin servers on behalf of its clients, it may be used
as a vector of attacks against other sites or services. Thus, SPARQL protocol services may effectively act as proxies for
third-party clients. Such services may place restrictions on the resources that they retrieve or on the
rate at which external resources can be retrieved. SPARQL protocol services may log client requests in such
a way as to facilitate tracing them with regard to third-party origin servers or services.

SPARQL protocol services may choose to detect these and other costly, or otherwise unsafe, queries,
impose time or memory limits on queries, or impose other restrictions to reduce the service's (and other services')
vulnerability to denial-of-service attacks. They also may refuse to process such
query requests.
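
One simple form such restrictions could take is a pre-check applied before any query processing or graph retrieval begins. The sketch below is only an example of the idea: the limits, the names `MAX_QUERY_CHARS`, `MAX_GRAPH_URIS`, and `refusal_reason` are hypothetical, and a real service would likely combine checks like these with time and memory limits.

```python
from typing import Optional, Sequence

# Assumed policy values, chosen only for illustration.
MAX_QUERY_CHARS = 10_000
MAX_GRAPH_URIS = 8


def refusal_reason(query: str,
                   default_graph_uris: Sequence[str] = (),
                   named_graph_uris: Sequence[str] = ()) -> Optional[str]:
    """Return a human-readable reason to refuse a request, or None.

    A non-None result could be carried in the fault-details of a
    QueryRequestRefused fault.
    """
    if len(query) > MAX_QUERY_CHARS:
        return f"query longer than {MAX_QUERY_CHARS} characters"
    graph_count = len(default_graph_uris) + len(named_graph_uris)
    if graph_count > MAX_GRAPH_URIS:
        return (f"dataset description names {graph_count} graphs; "
                f"limit is {MAX_GRAPH_URIS}")
    return None
```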

Different IRIs may have the same appearance. Characters in different scripts may look similar (a Cyrillic "о" may appear
similar to a Latin "o"). A character followed by combining characters may have the same visual representation as another
character (LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT has the same visual representation as LATIN SMALL LETTER
E WITH ACUTE). Users of SPARQL must take care to construct queries with IRIs that match the IRIs in the data. Further
information about matching of similar characters can be found in Unicode Security Considerations [UNISEC] and Internationalized Resource Identifiers (IRIs) [RFC3987] Section 8.
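
The two cases mentioned above can be checked with Python's standard `unicodedata` module. This is a diagnostic sketch, not something the protocol performs: `same_after_nfc` is a hypothetical helper, and the example IRIs are invented.

```python
import unicodedata


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two IRIs after Unicode NFC normalization.

    Useful for spotting IRIs that differ only in composed versus
    decomposed characters. It does not detect cross-script look-alikes.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# LATIN SMALL LETTER E WITH ACUTE versus E + COMBINING ACUTE ACCENT.
composed = "http://www.example/caf\u00e9"
decomposed = "http://www.example/cafe\u0301"

# Latin "o" versus Cyrillic "о" (U+043E).
latin = "http://www.example/o"
cyrillic = "http://www.example/\u043e"
```

Here `composed` and `decomposed` are different code point sequences that become equal under NFC, whereas `latin` and `cyrillic` remain different even after normalization.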

may implement other interfaces, bindings of the operations of those interfaces, or bindings of the query operation other than the normative HTTP or SOAP bindings described by SPARQL Protocol for RDF; and

must be consistent with the normative constraints (indicated by [RFC 2119] keywords) described in 3. Policy Considerations.