Abstract

This document describes and includes test cases for software agents that
extract RDF from XML source documents by following the set of mechanisms
outlined in the Gleaning Resource Description from Dialects of Language
[GRDDL] specification. The test cases
demonstrate the expected behavior of a GRDDL-aware agent by
specifying one (or more) RDF graph serializations which are the GRDDL results
associated with a single source document.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document reconciles tests from other documents in the repository (see Acknowledgements) as well as material from the editor's draft of the primer.

This is a Last Call Working Draft of the GRDDL Test Cases. This document was developed by the GRDDL Working Group, which was chartered in July 2006 to review the specification and develop use cases, tutorial materials, and tests.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

Introduction

A set of test cases is provided as part of the definition of [GRDDL]. This
document presents those test cases. They are intended to provide examples
for, and clarification of, the normative behavior of a GRDDL-aware agent.
They should be used for testing the conformance of GRDDL-aware agents. The
normative tests cover behavior expected of a GRDDL-aware
agent. The informative tests demonstrate other permitted behavior with respect to
the issues resolved by the Working Group. This document itself has (as a GRDDL result) a manifest describing the test cases in RDF. For convenience, serializations of the GRDDL result are available as RDF/XML and Turtle.

Note: the zip archive does not include tests which require network connectivity in order to properly calculate their GRDDL results.

Test Manifest Format

This test collection uses an RDF vocabulary for manifests developed for
the RDF Test Cases
Recommendation. A GRDDL-aware agent can extract the test collection and
automatically test compliance by attempting to reproduce the expected GRDDL
result(s) associated with each test case. Some input documents have multiple output documents; see below.

Using the Test Driver

The test driver has options for --debug and such; invoke it with no arguments (or with
--help) for details:

Options:
  -r, --run      path to a GRDDL implementation to use to process the
                 source document (checking results)
  -u, --update   path to a GRDDL implementation to use to process the
                 source document
  --tester       The URI of an agent associated with the EARL test assertions.
                 A BNode is used if none is given
  --project      The URI of the EARL 'subject' (the implementation being tested).
                 A BNode is used if none is given
  --local        A boolean flag (false by default) which indicates whether to
                 run only the local tests

The tests do not require the use of this driver.
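
As an illustration of the option set listed above, the following is a minimal sketch of a command-line parser that accepts the same options. It is not the driver's actual code: the function name build_parser, the help strings, and the example path ./my-grddl are assumptions made for this example.

```python
import argparse

def build_parser():
    # Hypothetical parser mirroring the documented options; not the real driver.
    parser = argparse.ArgumentParser(
        description="Sketch of a GRDDL test driver command line")
    parser.add_argument("-r", "--run", metavar="PATH",
                        help="GRDDL implementation used to check results")
    parser.add_argument("-u", "--update", metavar="PATH",
                        help="GRDDL implementation used to process the source document")
    parser.add_argument("--tester", metavar="URI", default=None,
                        help="agent making the EARL assertions (blank node if omitted)")
    parser.add_argument("--project", metavar="URI", default=None,
                        help="EARL subject, the implementation under test (blank node if omitted)")
    parser.add_argument("--local", action="store_true",
                        help="run only the local tests (false by default)")
    parser.add_argument("--debug", action="store_true",
                        help="write extra diagnostics")
    return parser

# Example invocation with an assumed implementation path.
args = build_parser().parse_args(["--run", "./my-grddl", "--local"])
```

Leaving --tester and --project as None lets a later reporting step decide to substitute a blank node, matching the behavior described in the option list.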

EARL Reporting

In addition to writing various diagnostic messages to STDERR, the test
harness writes additional RDF data to STDOUT: an [EARL] test assertion about each test
it runs.

To tell it about the person running the tests and the software project
being tested, point it to a tester (a URI in a [FOAF] RDF graph) and a test
subject (a URI in a [DOAP] RDF graph).
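
The shape of one such assertion can be sketched as follows. This is an assumption-laden illustration, not the harness's output format: the term names (earl:Assertion, earl:assertedBy, earl:subject, earl:test, earl:result, earl:outcome, earl:passed, earl:failed) follow the EARL 1.0 schema, and the harness may have been written against a different draft of the vocabulary. The helper names and the example test URI are hypothetical.

```python
import itertools

EARL_NS = "http://www.w3.org/ns/earl#"  # EARL 1.0 namespace (assumed for this sketch)
_bnode_ids = itertools.count(1)

def node(uri):
    """Return a Turtle term: <uri> if a URI was given, otherwise a fresh blank node."""
    if uri:
        return "<%s>" % uri
    return "_:b%d" % next(_bnode_ids)

def earl_assertion(test_uri, passed, tester=None, project=None):
    """Serialize a single EARL assertion about one test as Turtle text."""
    outcome = "earl:passed" if passed else "earl:failed"
    lines = [
        "@prefix earl: <%s> ." % EARL_NS,
        "[] a earl:Assertion ;",
        "   earl:assertedBy %s ;" % node(tester),
        "   earl:subject %s ;" % node(project),
        "   earl:test <%s> ;" % test_uri,
        "   earl:result [ a earl:TestResult ; earl:outcome %s ] ." % outcome,
    ]
    return "\n".join(lines)

# Hypothetical test URI; no tester or project given, so blank nodes are used.
report = earl_assertion(
    "http://www.w3.org/2001/sw/grddl-wg/td/grddl-tests#exampleTest", True)
```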

Protocol Tracing

We find TCPWatch
useful for debugging [HTTP] protocol
interactions. If you start TCPWatch like so:

GRDDL Transform Library

A library of standard transforms is available for widespread use by authors.

Local Policies, Faithful Rendition, and Conformance

The GRDDL specification states that any transformation identified by an
author of a GRDDL source document will provide a Faithful Rendition of the
information expressed in the source document. The specification also grants a
GRDDL-aware agent the license to make a determination of whether or not to
apply a particular transformation guided by user interaction, a local
security policy, or the agent's capabilities. However, in defining these
tests it was assumed that the GRDDL-aware agent being tested is
using a security policy which does not prevent it from applying
transformations identified in each test. Such an agent should produce the GRDDL result
associated with each normative test, except as specified immediately below.

Some tests share a source document but have alternative expected results. For such tests, the test manifest includes a symmetric property [OWL] (http://www.w3.org/2001/sw/grddl-wg/td/grddl-test-vocabulary#alternative) asserted between them. A GRDDL-aware agent running the tests can take this into consideration.

Testing for Multiple Representations

Information resources can also have multiple
representations in response to content negotiation. In addition to the GRDDL results associated with each representation,
a test for the maximal result is included: the GRDDL result which consists of the merge of all possible GRDDL results.

Note, however, that the maximal result is not isomorphic with the other results. To aid a test harness in determining compliance for scenarios such as these, the tests have a property (http://www.w3.org/2001/sw/grddl-wg/td/grddl-test-vocabulary#subsumes) asserted from the test for the maximal result to the other tests in the group. A GRDDL-aware agent running the tests can take this into consideration.

Testing for Maximal Result

The remaining set of tests with multiple results are those where there is no ambiguity with the XPath data model associated with the source document, there is a single representation, and multiple GRDDL mechanisms apply.
In the absence of a policy which prevents each GRDDL result from being computed, a GRDDL-aware agent should produce the maximal result.
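
The relationship between a maximal result and the results it subsumes can be sketched with graphs modeled as sets of triples. This is a simplification under stated assumptions: the graphs contain no blank nodes, so a merge is a plain set union and comparison is set equality. A real harness would need to standardize blank nodes apart when merging and test graph isomorphism when comparing. The function names and example triples are hypothetical.

```python
def merge(*graphs):
    """Merge graphs given as sets of ground (blank-node-free) triples."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return frozenset(merged)

def subsumes(maximal, other):
    """True if every triple of `other` also appears in `maximal`."""
    return frozenset(other) <= frozenset(maximal)

# Two hypothetical GRDDL results for the same source document.
subject = "http://example.org/doc"
title = "http://purl.org/dc/elements/1.1/title"
result_a = {(subject, title, "Title from the first mechanism")}
result_b = {(subject, title, "Title from the second mechanism")}

maximal = merge(result_a, result_b)
```

Here maximal subsumes both result_a and result_b but equals neither, which is why a harness cannot accept the other results by comparing them directly with the maximal one.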

Test Naming Convention

Every test has a URI of the form:

http://www.w3.org/2001/sw/grddl-wg/td/grddl-tests#LOCALNAME

The test collection can either be run locally (see "Localized Tests") or over a network. Certain tests are marked as requiring a network connection with an open circle as their list item marker. These tests are asserted as members of the http://www.w3.org/2001/sw/grddl-wg/td/grddl-test-vocabulary#NetworkedTest class in the test manifest. A GRDDL-aware agent running the tests can take this into consideration.

The tests which require a network connection use absolute URIs (in the test manifest) to refer to their test material (input and output) using the form:

http://www.w3.org/2001/sw/grddl-wg/td/LOCALNAME

Tests which do not require a network connection use relative URIs (in the test manifest) instead.

Normative Tests

Each test has an input document and an output document. The output document is an RDF/XML document and
represents a GRDDL result of the input document.

Localized Tests

For the sake of convenience, this first set of normative tests covers
simple scenarios where neither namespace documents nor
absolute URIs are used. Such tests can run offline rather easily.

This test case exercises a single GRDDL transformation that is
identified using XHTML markup within the source document. Note that
this test case uses a transformation for RDFa that reflects
the status of RDFa markup as of the development of the test case.

This test case uses an inline GRDDL transformation reference (i.e.
within an a element) instead of one within a
link element. It also exercises the fact that the
rel attribute can take multiple space-separated values, and
only one of them needs to be equal to transformation to
indicate that the resource is in fact a GRDDL transformation.
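
The handling of rel values described above can be sketched as follows. The sketch assumes a well-formed XHTML source and checks that the head element's profile attribute includes the GRDDL data-view profile (http://www.w3.org/2003/g/data-view) before collecting references. The function name and the sample document are hypothetical, and the returned href values are left unresolved.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"

def transformation_links(xhtml_text):
    """Return href values of link and a elements whose rel includes 'transformation'."""
    root = ET.fromstring(xhtml_text)
    head = root.find("{%s}head" % XHTML)
    profiles = head.get("profile", "").split() if head is not None else []
    if GRDDL_PROFILE not in profiles:
        return []
    wanted = ("{%s}link" % XHTML, "{%s}a" % XHTML)
    hrefs = []
    for elem in root.iter():
        if elem.tag in wanted:
            # rel is a space-separated list; one matching token is enough.
            tokens = elem.get("rel", "").split()
            if "transformation" in tokens and elem.get("href"):
                hrefs.append(elem.get("href"))
    return hrefs

SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view"><title>Sample</title></head>
  <body><p><a rel="nofollow transformation" href="extract.xsl">transform</a></p></body>
</html>"""

links = transformation_links(SAMPLE)
```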

Namespace Documents and Absolute Locations

This test case exercises identifying GRDDL transformations using
profileTransformation assertions. In this case, an XHTML
document notes a profile URI to which it belongs. The profile document,
retrieved from the URI, identifies a GRDDL transformation for the
original document with a profileTransformation assertion in
its own GRDDL result.

This test case exercises identifying GRDDL transformations using
namespaceTransformation assertions. In this case, an XML
document has a root element with a namespace URI. The namespace
document, retrieved from the URI, is an RDF/XML document (and so
contributes to its own GRDDL results) and identifies a GRDDL
transformation for the original document with a
namespaceTransformation assertion.
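
The first step of the namespace mechanism, finding the namespace URI of the root element so that the namespace document can be retrieved, can be sketched as below. The function name and the example namespace are assumptions; retrieving the namespace document and reading its namespaceTransformation assertions are not shown.

```python
import xml.etree.ElementTree as ET

def root_namespace(xml_text):
    """Return the namespace URI of the root element, or None if it has none."""
    root = ET.fromstring(xml_text)
    # ElementTree writes qualified names as "{namespace-uri}local-name".
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return None

# Hypothetical source document.
namespace = root_namespace('<po:order xmlns:po="http://example.org/ns/po#"/>')
```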

Ambiguous Infosets, Representations, and Traversals

In this test case, the input file uses XInclude to include xinclude2.xml. The output has only one
triple unless the XML processor of the GRDDL implementation implements
XInclude. The output for this case assumes that the processor
does resolve XIncludes.
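
The difference between the two infosets can be illustrated with the standard library's XInclude support. The sketch below is only an analogy for the test: the element names and the included content are invented, and an in-memory table stands in for the file xinclude2.xml so that the example needs no file access.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical stand-in for the included file.
FILES = {"xinclude2.xml": "<item>included</item>"}

def loader(href, parse, encoding=None):
    """Load an included resource from the in-memory table."""
    if parse == "xml":
        return ET.fromstring(FILES[href])
    return FILES[href]

SOURCE = ('<doc xmlns:xi="http://www.w3.org/2001/XInclude">'
          '<item>own</item><xi:include href="xinclude2.xml"/></doc>')

# Infoset seen by a processor that ignores XInclude.
plain = ET.fromstring(SOURCE)

# Infoset seen by a processor that resolves XInclude.
expanded = ET.fromstring(SOURCE)
ElementInclude.include(expanded, loader=loader)

plain_items = len(plain.findall("item"))
expanded_items = len(expanded.findall("item"))
```

A transformation applied to the expanded tree sees more source content than one applied to the plain tree, which is how the two GRDDL results for this test come to differ.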

Note that the input is an RDF document with a GRDDL transformation and that, according to the rules given by the GRDDL specification, there are three distinct and equally valid output graphs for this document. This output is the merge of the graph given by the source document with the graph given by the result of the GRDDL transformation.

An XML document
which has an HTML namespace document,
which has a profile being an XML document,
which has an HTML namespace document,
which has a profile being an XML document,
which has an RDF namespace document.

The following four tests demonstrate GRDDL results
for a self-referencing input document. Unlike other tests of this kind, the last of these - the maximal result - is not exclusive.
This reflects an interpretation of SHOULD as used in section 7. GRDDL-Aware Agents of
[GRDDL] with regard to the computation of GRDDL results. In particular, this interpretation and the text in the section that follows
(8. Security considerations) permit an implementation to
pass only the first test due to security restrictions against computing recursive GRDDL results.

For this particular test, an XML document
is its own namespace document,
with a GRDDL transformation, specifying
a namespaceTransformation, which specifies
a further namespaceTransformation.
This result is the first possible GRDDL result.
Implementations that make no allowance
for such cases may produce
this result.
Document authors are advised against having
information resources whose GRDDL results depend
on other GRDDL results for the same resource.

An XML document
is its own namespace document,
with a GRDDL transformation, specifying
a namespaceTransformation, which specifies
a further namespaceTransformation.
This result is the merge of the
first two possible GRDDL results.
Implementations that make no special allowance
for or prohibition of
such cases may produce
this result. Document authors are advised against having
information resources whose GRDDL results depend
on other GRDDL results for the same resource.

An XML document
is its own namespace document,
with a GRDDL transformation, specifying
a namespaceTransformation, which specifies
a further namespaceTransformation.
This result is the merge of the
first three possible GRDDL results.
Implementations that make no special allowance
for
or prohibition of
such cases may produce
this result.
Document authors are advised against having
information resources whose GRDDL results depend
on other GRDDL results for the same resource.

An XML document
is its own namespace document,
with a GRDDL transformation, specifying
a namespaceTransformation, which specifies
a further namespaceTransformation.
This result is the merge of all possible GRDDL results.
Document authors are advised against having
information resources whose GRDDL results depend
on other GRDDL results for the same resource.

This test differs from the previous example of applying GRDDL to an RDF/XML document in that the RDF file is served with the media type "application/xml" (not best practice, but rather common). The output is the merge of the graph given by the source document with the graph given by the result of the GRDDL transformation.

This test exists to draw developers' attention to issues of content negotiation, in particular content negotiation over language as described and implemented by W3C QA. There are two valid GRDDL results of running this GRDDL transformation, depending on which language the GRDDL-aware agent requests, and an implementation of a GRDDL-aware agent only needs to retrieve the one that is appropriate for its HTTP request headers. This result follows from retrieving an English version of the HTML representation, so the GRDDL result contains English-language content.
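
How an agent's request selects one of the two representations can be sketched as below. The request is only constructed, never sent. The function name and the example.org URI are assumptions, and whether a given server honors the header depends on its configuration.

```python
from urllib.request import Request

def negotiated_request(uri, language):
    """Build (but do not send) a request that asks for one language variant."""
    return Request(uri, headers={"Accept-Language": language})

# Hypothetical source document URI; "en" selects the English variant.
request = negotiated_request("http://example.org/negotiated", "en")
preferred = request.get_header("Accept-language")
```

An agent that sent "de" instead would be expected to compare its output against the other valid GRDDL result for the same test input.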

This test case exercises resolution of relative references found in
the GRDDL results for a general XML document. In this case, according
to RFC
3986, section 5.1, a base URI for the relative reference is
recursively discovered on the encapsulating entity for the GRDDL
results, which is the root element of the input
document, in order to maintain fidelity to the faithful rendition
requirement. The root element assigns the base URI using the
mechanism described in XML
Base.

This test case exercises resolution of relative references found in
the GRDDL results for a general XML document. In this case, according
to RFC
3986, section 5.1, a base URI for the relative reference is
recursively discovered to be the URI used to retrieve the
input document, since no base URI is assigned in the content of the
encapsulating entity (that is, the root element of the input
document).

This test case exercises resolution of relative references found in
the GRDDL results for a general XML document when that document is
resolved through a protocol redirection mechanism. The base URI for these relative references is established by the
xml:base attribute on the root element, as for "An
xml document with an xml:base attribute".

This test case exercises resolution of relative references found in
the GRDDL results for a general XML document when that document is
resolved through a protocol redirection mechanism. The base URI of the
document is the target URI of the last redirection step; after
establishing this fact, this test case follows the same behavior as "A
similar xml document without an xml:base attribute".
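
The base URI rules exercised by the four test cases above can be sketched as follows. The sketch assumes that retrieval_uri is the URI the document was finally retrieved from, that is, the target of the last redirection step if any redirection occurred. The function name and the example URIs are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

def base_uri(xml_text, retrieval_uri):
    """Base URI for relative references in the GRDDL results of a document."""
    root = ET.fromstring(xml_text)
    declared = root.get(XML_BASE)
    if declared is not None:
        # xml:base on the root element takes precedence over the retrieval URI.
        return urljoin(retrieval_uri, declared)
    return retrieval_uri

with_base = base_uri('<doc xml:base="http://example.org/base/"/>',
                     "http://example.org/final/doc.xml")
without_base = base_uri("<doc/>", "http://example.org/final/doc.xml")

# A relative reference found in the GRDDL results resolves differently in each case.
resolved_with = urljoin(with_base, "data#me")
resolved_without = urljoin(without_base, "data#me")
```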

Informative Tests

This section includes tests that are not covered explicitly by the normative text of
the GRDDL specification but that demonstrate additional behavior that a GRDDL-aware agent may exhibit.
They reflect behavior suggested by the Working Group as a result
of resolving certain issues.

Security Tests

The following security tests are provided for implementers to
adapt and use for their implementation. Security issues are usually system specific, and it may be possible for a malicious party to access
XSLT version and vendor information concerning a specific GRDDL agent instance.

We do not provide instructions as to how to test your system
against these tests, since they are likely not to be directly
applicable. Developers of GRDDL-aware agents are encouraged to understand
these tests, and consider how their own systems may have
potential security weaknesses.

Sending user information to remote server
The suspect code occurs in the transform for the
profile document, and is
rdf:resource="security6.sxsl?{system-property('user.home')}".
This uses a Saxon extension to the XSLT system-property function.

The security tests were created during the development of the
Jena GRDDL Reader, which uses the
Saxon8.8 XSLT processor. They hence
illustrate how a malicious party may try to abuse features of such an implementation.