Abstract

The current Web is primarily made up of an enormous number of documents
that have been created using HTML. These documents contain significant
amounts of structured data, which is largely unavailable to tools and
applications. When publishers can express this data more completely, and
when tools can read it, a new world of user functionality becomes
available, letting users transfer structured data between applications
and web sites, and allowing browsing applications to improve the user
experience: an event on a web page can be directly imported into a
user's desktop calendar; a license on a document can be detected so that
users can be informed of their rights automatically; a photo's creator,
camera setting information, resolution, location and topic can be
published as easily as the original photo itself, enabling structured
search and sharing.

RDFa Core is a specification for attributes to express structured data
in any markup language. The embedded data already available in the
markup language (e.g., HTML) can often be reused by the RDFa markup, so
that publishers don't need to repeat significant data in the document
content. The underlying abstract representation is RDF [RDF11-PRIMER],
which lets publishers build their own vocabulary, extend others, and
evolve their vocabulary with maximal interoperability over time. The
expressed structure is closely tied to the data, so that rendered data
can be copied and pasted along with its relevant structure.

The rules for interpreting the data are generic, so that there is no
need for different rules for different formats; this allows authors and
publishers of data to define their own formats without having to update
software, register formats via a central authority, or worry that two
formats may interfere with each other.

RDFa shares some of the same goals with microformats [MICROFORMATS].
Whereas microformats specify both a syntax for embedding structured data
into HTML documents and a vocabulary of specific terms for each
microformat, RDFa specifies only a syntax and relies on independent
specification of terms (often called vocabularies or taxonomies) by
others. RDFa allows terms from multiple independently-developed
vocabularies to be freely intermixed and is designed such that the
language can be parsed without knowledge of the specific vocabulary
being used.

This document is a detailed syntax specification for RDFa, aimed at:

those looking to create an RDFa Processor, and who therefore need a
detailed description of the parsing rules;

those looking to integrate RDFa into a new markup language;

those looking to recommend the use of RDFa within their
organization, and who would like to create some guidelines for their
users;

anyone familiar with RDF, and who wants to understand more about
what is happening 'under the hood', when an RDFa Processor runs.

For those looking for an introduction to the use of RDFa and some
real-world examples, please consult the [RDFA-PRIMER].

How to Read this Document

First, if you are not familiar with either RDFa or RDF, and
simply want to add RDFa to your documents, then you may find the RDFa
Primer [RDFA-PRIMER] to be a better introduction.

If you are already familiar with RDFa, and you want to examine the
processing rules — perhaps to create an RDFa Processor — then you'll
find the Processing Model section of most
interest. It contains an overview of each of the processing steps,
followed by more detailed sections, one for each rule.

If you are not familiar with RDFa, but you are familiar with RDF, then
you might find reading the Syntax Overview useful before looking at the
Processing Model, since it gives a range of examples of markup that use
RDFa. Seeing some examples first should make reading the processing
rules easier.

If you are not familiar with RDF, then you might want to take a look
at the section on RDF Terminology
before trying to do too much with RDFa. Although RDFa is designed to
be easy to author — and authors don't need to understand RDF to use it
— anyone writing applications that consume RDFa will need to
understand RDF. There is a lot of material about RDF on the web, and a
growing range of tools that support RDFa. This document only contains
enough background on RDF to make the goals of RDFa clearer.

Note

RDFa is a way of expressing RDF-style
relationships using simple attributes in existing markup languages
such as HTML. RDF is fully internationalized, and permits the use of
Internationalized Resource Identifiers, or IRIs. You will see the term
'IRI' used throughout this specification. Even if you are not familiar
with the term IRI, you probably have seen the term 'URI' or 'URL'.
IRIs are an extension of URIs that permits the use of characters
outside those of plain ASCII. RDF allows the use of these characters,
and so does RDFa. This specification has been careful to use the
correct term, IRI, to make it clear that this is the case.

Note

Even though this specification exclusively
references IRIs, it is possible that a Host Language will
restrict the syntax for its attributes to a subset of IRIs
(e.g., @href in HTML5). Regardless of
validation constraints in Host Languages, an RDFa Processor
is capable of processing IRIs.

Status of This Document

This section describes the status of this document at the time of its publication.
Other documents may supersede this document. A list of current W3C publications and the
latest revision of this technical report can be found in the W3C technical reports index at
http://www.w3.org/TR/.

This document has been reviewed by W3C Members, by software developers, and by other W3C
groups and interested parties, and is endorsed by the Director as a W3C Recommendation.
It is a stable document and may be used as reference material or cited from another
document. W3C's role in making the Recommendation is to draw attention to the
specification and to promote its widespread deployment. This enhances the functionality
and interoperability of the Web.

1. Motivation

This section is non-normative.

RDF/XML [RDF-SYNTAX-GRAMMAR] provides sufficient flexibility to represent all
of the abstract concepts in RDF. However, it presents a number of
challenges. First, it is difficult or impossible to validate documents
that contain RDF/XML using XML Schemas or DTDs, which makes it difficult
to import RDF/XML into other markup languages. Whilst newer schema
languages such as RELAX NG [RELAXNG-SCHEMA] do provide a way to validate
documents that contain arbitrary RDF/XML, it will be a while before they
gain wide support.

Second, even if one could add RDF/XML directly into an XML dialect like
XHTML, there would be significant data duplication between the rendered
data and the RDF/XML structured data. It would be far better to add RDF
to a document without repeating the document's existing data. For
example, an XHTML document that explicitly renders its author's name in
the text — perhaps as a byline on a news site — should not need to repeat
this name for the RDF expression of the same concept: it should be
possible to supplement the existing markup in such a way that it can
also be interpreted as RDF.

Another reason for aligning the rendered data with the structured data
is that it is highly beneficial to express the web data's structure 'in
context'; as users often want to transfer structured data from one
application to another, sometimes to or from a non-web-based
application, the user experience can be enhanced. For example,
information about specific rendered data could be presented to the user
via 'right-clicks' on an item of interest. Moreover, organizations that generate
a lot of content (e.g., news outlets) find it easier to embed the
semantic data inline than to maintain it separately.

In the past, many attributes were 'hard-wired' directly into the markup
language to represent specific concepts. For example, in XHTML 1.1
[XHTML11] and HTML [HTML401] there is @cite;
the attribute allows an author to add information to a document which is
used to indicate the origin of a quote.

However, these 'hard-wired' attributes make it difficult to define a
generic process for extracting metadata from any document since an RDFa
Processor would need to know about each of the special attributes. One
motivation for RDFa has been to devise a means by which documents can be
augmented with metadata in a general, rather than hard-wired, manner.
This has been achieved by creating a fixed set of attributes and parsing
rules, but allowing those attributes to contain properties from any of a
number of the growing range of available RDF vocabularies. In most cases
the values of those properties are the information that is
already in an author's document.

RDFa alleviates the pressure on markup language designers to anticipate
all the structural requirements users of their language might have, by
outlining a new syntax for RDF that relies only on attributes. By
adhering to the concepts and rules in this specification, language
designers can import RDFa into their environment with a minimum of
hassle and be confident that semantic data will be extractable from
their documents by conforming processors.

2. Syntax Overview

This section is non-normative.

The following examples are intended to help readers who are not
familiar with RDFa to quickly get a sense of how it works. For a more
thorough introduction, please read the RDFa Primer [RDFA-PRIMER].

In RDF, it is common for people to shorten vocabulary terms via
abbreviated IRIs that use a 'prefix' and a 'reference'. This mechanism
is explained in detail in the section titled Compact URI Expressions.
The examples throughout this document assume that the following
vocabulary prefixes have been defined:

bibo:     http://purl.org/ontology/bibo/
cc:       http://creativecommons.org/ns#
dbp:      http://dbpedia.org/property/
dbp-owl:  http://dbpedia.org/ontology/
dbr:      http://dbpedia.org/resource/
dc:       http://purl.org/dc/terms/
ex:       http://example.org/
foaf:     http://xmlns.com/foaf/0.1/
owl:      http://www.w3.org/2002/07/owl#
rdf:      http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfa:     http://www.w3.org/ns/rdfa#
rdfs:     http://www.w3.org/2000/01/rdf-schema#
xhv:      http://www.w3.org/1999/xhtml/vocab#
xsd:      http://www.w3.org/2001/XMLSchema#

Note

In some of the examples below we have used IRIs with
fragment identifiers that are local to the document containing the RDFa
markup (e.g., 'about="#me"'). This
idiom, which is also used in RDF/XML [RDF-SYNTAX-GRAMMAR] and other
RDF serializations, gives a simple way to 'mint' new IRIs for entities
described by RDFa and therefore contributes considerably to the
expressive power of RDFa. The precise meaning of IRIs which include
fragment identifiers when they appear in RDF graphs is given in
Section 7 of [RDF-SYNTAX-GRAMMAR]. To ensure that such fragment
identifiers can be interpreted correctly, media type registrations
for markup languages that incorporate RDFa should directly or
indirectly reference this specification.

2.1 The RDFa Attributes

RDFa makes use of a number of commonly found attributes, as well as
providing a few new ones. Attributes that already exist in widely
deployed languages (e.g., HTML) have the same meaning they always did,
although their syntax has been slightly modified in some cases. For
example, in (X)HTML there is no clear way to add new @rel
values; RDFa sets out to explicitly solve this problem, and does so by
allowing IRIs as values. It also introduces the concepts of terms
and 'compact URI expressions' (referred to as CURIEs in this document),
which allow a full IRI value to be expressed succinctly. For a complete
list of RDFa attribute names and syntax, see Attributes and Syntax.

2.2 Examples

In (X)HTML, authors can include metadata and relationships concerning
the current document by using the meta and link
elements (in these examples, XHTML+RDFa [XHTML-RDFA] is used).
For example, the author of the page along with the pages
preceding and following the current page can be expressed using the
link and meta elements:

In simple cases the @property attribute can also be used
in place of @rel. Indeed, when the element does
not contain @rel, @datatype, or @content,
but there is, for example, an @href, the effect of @property
is analogous to the role of @rel. For example, the
previous example could have been written:

3. RDF Terminology

This section is non-normative.

The previous section gave examples of typical markup in order to
illustrate the structure of RDFa markup. RDFa is short for "RDF in
Attributes". In order to author RDFa you do not need to understand RDF,
although it would certainly help. However, if you are building a system
that consumes the RDF output of a language that supports RDFa you will
almost certainly need to understand RDF. This section introduces the
basic concepts and terminology of RDF. For a more thorough explanation
of RDF, please refer to the RDF Concepts document [RDF11-CONCEPTS] and
the RDF Syntax Document [RDF-SYNTAX-GRAMMAR].

3.1 Statements

The structured data that RDFa provides access to is a collection of
statements. A statement is a basic unit of information that
has been constructed in a specific format to make it easier to
process. In turn, by breaking large sets of information down into a
collection of statements, even very complex metadata can be processed
using simple rules.

To illustrate, suppose we have the following set of facts:

Example 13

Albert was born on March 14, 1879, in the German Empire. There is a picture of him at
the web address, http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg.

This would be quite difficult for a machine to interpret, and it is
certainly not in a format that could be passed from one data
application to another. However, if we convert the information to a
set of statements it begins to be more manageable. The same
information could therefore be represented by the following shorter
'statements':

Example 14

Albert was born on March 14, 1879.
Albert was born in the German Empire.
Albert has a picture at
http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg.

3.2 Triples

To make this information machine-processable, RDF defines a
structure for these statements. A statement is formally called a triple,
meaning that it is made up of three components. The first is the subject
of the triple, and is what we are making our statement about.
In all of these examples the subject is 'Albert'.

The second part of a triple is the property of the subject that we
want to define. In the examples here, the properties would be 'was
born on', 'was born in', and 'has a picture at'. These properties are
typically called predicates in RDF.

The final part of a triple is called the object. In the
examples here the three objects have the values 'March 14, 1879', 'the
German Empire', and
'http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg'.
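
As a minimal sketch, and not part of this specification, the three
statements can be held in a program as subject/predicate/object tuples.
Python is used here purely for illustration, and the names Triple and
statements are assumptions introduced for this example:

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: what it is about, which property, and the value."""
    subject: str
    predicate: str
    object: str


IMAGE = "http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg"

# The three statements from the text, still using the vague names.
statements = [
    Triple("Albert", "was born on", "March 14, 1879"),
    Triple("Albert", "was born in", "the German Empire"),
    Triple("Albert", "has a picture at", IMAGE),
]

# Every statement shares the same subject.
subjects = {t.subject for t in statements}
predicates = [t.predicate for t in statements]
```

At this stage the subject and predicates are still ambiguous strings; the
following section replaces them with IRI references.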

Note

RDFa supports internationalized characters in the subject, the
predicate, and the object.

3.3 IRI References

Breaking complex information into manageable units helps us be
specific about our data, but there is still some ambiguity. For
example, which 'Albert' are we talking about? If another system has
more facts about 'Albert', how could we know whether they are about
the same person, and so add them to the list of things we know about
that person? If we wanted to find people born in the German Empire,
how could we know that the predicate 'was born in' has the same
purpose as the predicate 'birthplace' that might exist in some other
system? RDF solves this problem by replacing our vague terms with
IRI references.

IRIs are most commonly used to identify web pages, but RDF makes use
of them as a way to provide unique identifiers for concepts. For
example, we could identify the subject of all of our statements (the
first part of each triple) by using the DBPedia [http://dbpedia.org]
IRI for Albert Einstein, instead of the ambiguous string 'Albert':

Example 15

<http://dbpedia.org/resource/Albert_Einstein>
has the name
Albert Einstein.
<http://dbpedia.org/resource/Albert_Einstein>
was born on
March 14, 1879.
<http://dbpedia.org/resource/Albert_Einstein>
was born in
the German Empire.
<http://dbpedia.org/resource/Albert_Einstein>
has a picture at
http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg.

IRI references are also used to uniquely identify the objects in
metadata statements (the third part of each triple). The picture of
Einstein is already an IRI, but we could also use an IRI to uniquely
identify the country 'German Empire'. At the same time we'll indicate
that the name and date of birth really are literals (and not IRIs), by
putting quotes around them:

Example 16

<http://dbpedia.org/resource/Albert_Einstein>
has the name
"Albert Einstein".
<http://dbpedia.org/resource/Albert_Einstein>
was born on
"March 14, 1879".
<http://dbpedia.org/resource/Albert_Einstein>
was born in
<http://dbpedia.org/resource/German_Empire>.
<http://dbpedia.org/resource/Albert_Einstein>
has a picture at
<http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg>.

IRI references are also used to ensure that predicates are
unambiguous; now we can be sure that 'birthplace', 'place of birth',
'Lieu de naissance' and so on, all mean the same thing:

3.4 Plain Literals

Although IRI resources are always used for subjects and predicates,
the object part of a triple can be either an IRI or a literal.
In the example triples, Einstein's name is represented by a plain
literal, specifically a basic string with no
type or language information:

3.5 Typed Literals

Some literals, such as dates and numbers, have very specific
meanings, so RDF provides a mechanism for indicating the type of a
literal. A typed literal
is indicated by attaching an IRI to the end of a plain literal,
and this IRI indicates the literal's datatype. This IRI is usually
based on datatypes defined in the XML Schema Datatypes specification
[XMLSCHEMA11-2]. The following syntax would be used to unambiguously
express Einstein's date of birth as a literal of type http://www.w3.org/2001/XMLSchema#date:
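
One way to picture a typed literal in a program is as a lexical string
paired with a datatype IRI. The Python sketch below is illustrative only:
the Literal class and its to_turtle method are assumptions made for this
example, and the rendering covers only the simple quoted form with an
optional ^^<datatype> suffix.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: Optional[str] = None  # None models a plain literal

    def to_turtle(self) -> str:
        # Escape backslashes and double quotes, then quote the string.
        escaped = self.lexical.replace("\\", "\\\\").replace('"', '\\"')
        quoted = '"' + escaped + '"'
        if self.datatype is None:
            return quoted
        # A typed literal attaches the datatype IRI to the end.
        return quoted + "^^<" + self.datatype + ">"


name = Literal("Albert Einstein")
birth = Literal("1879-03-14", XSD_DATE)

# A consumer that recognizes the datatype can map the lexical form to a
# native value (this simple case has no time zone suffix).
birth_value = date.fromisoformat(birth.lexical)
```

The datatype is what lets a consuming application treat "1879-03-14" as a
calendar date rather than as an opaque string.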

3.6 Turtle

RDF itself does not have one set way to express triples, since the
key ideas of RDF are the triple and the use of IRIs, and not
any particular syntax. However, there are a number of mechanisms for
expressing triples, such as RDF/XML [RDF-SYNTAX-GRAMMAR], Turtle
[TURTLE], and of course RDFa. Many discussions of RDF make use of
the Turtle syntax to explain their ideas, since it is quite
compact. The examples we have just seen are already using this syntax,
and we'll continue to use it throughout this document when we need to
talk about the RDF that could be generated from some RDFa. Turtle
allows long IRIs to be abbreviated by using an IRI mapping, which can
be used to express a compact IRI expression as follows:

When writing examples, you will often see the following IRI in the
Turtle representation:

Example 23

<>

This indicates the 'current document', i.e., the document being
processed. In the end there will always be a full IRI based on the
document's location, but this abbreviation serves to make examples
more compact. Note in particular that the whole technique of
abbreviation is merely a way to make examples more compact, and the
actual triples generated would always use the full IRIs.
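
The expansion of <> and of document-local fragment identifiers can be
sketched with ordinary relative reference resolution. The Python below is
a rough illustration: the BASE location is hypothetical, and
urllib.parse.urljoin works on URIs, so it stands in for full IRI handling
only in this ASCII-only case.

```python
from urllib.parse import urljoin

# Hypothetical location of the document being processed.
BASE = "http://example.org/people/albert.html"


def resolve(reference: str, base: str = BASE) -> str:
    """Resolve a relative reference against the document's base."""
    return urljoin(base, reference)


current_document = resolve("")   # what <> stands for
local_entity = resolve("#me")    # what about="#me" would identify
```

Under these assumptions the triples would carry the full values, never
the abbreviated forms.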

3.7 Graphs

A collection of triples is called a graph. An RDFa Processor places
all of the triples defined by this specification in the output
graph. For more information on graphs
and other RDF concepts, see [RDF-SYNTAX-GRAMMAR].

3.8 Compact URI Expressions

In order to allow for the compact expression of RDF statements, RDFa
allows the contraction of most IRI references into a
form called a 'compact URI expression', or CURIE. A
detailed discussion of this mechanism is in the section CURIE
and IRI Processing.

Note that CURIEs are only used in the markup and Turtle examples, and
will never appear in the generated triples, which are
defined by RDF to use IRI references.

3.9 Markup Fragments and RDFa

A growing use of embedded metadata is to take fragments of markup and
move them from one document to another. This may happen through the
use of tools, such as drag-and-drop in a browser, or through snippets
of code provided to authors for inclusion in their documents. A good
example of the latter is the licensing fragment
provided by Creative Commons.

However, those involved in creating fragments (either by building
tools, or authoring snippets), should be aware that this specification
does not say how fragments are processed. Specifically, the processing
of a fragment 'outside' of a complete document is undefined because
RDFa processing is largely about context. Future versions of this or
related specifications may do more to define this behavior.

Developers of tools that process fragments, or authors of fragments
for manual inclusion, should also bear in mind what will happen to
their fragment once it is included in a complete document. They should
carefully consider the amount of 'context' information that will be
needed in order to ensure a correct interpretation of their fragment.

3.10 A Description of RDFa in RDF Terms

The following is a brief description of RDFa in terms of the RDF
terminology introduced here. It may be useful to readers with an RDF
background:

In RDFa, a subject IRI reference is generally indicated
using @about and predicates are represented using one of
@property, @rel, or @rev.
Objects which are IRI references are represented using @resource,
@src, or @href, whilst objects that are literals
are represented either with @content or the content of
the element in question (with an optional datatype expressed using @datatype,
and an optional language expressed using a Host Language-defined
mechanism such as @xml:lang).
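
The mapping just described can be sketched for a single element. This
Python sketch is deliberately simplified and is not the Processing Model:
it assumes the predicate and datatype are already full IRIs, that there
is no inherited context (so a missing @about falls back to the document
base), and it ignores @rel, @rev, @typeof, language, and XML literals.
The function name describe and the tuple layout are assumptions.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urljoin

# (kind, value, datatype IRI or None)
Object = Tuple[str, str, Optional[str]]


def describe(attrs: Dict[str, str], text: str, base: str) -> Tuple[str, str, Object]:
    """Build one triple from one element's attributes and text content."""
    subject = urljoin(base, attrs.get("about", ""))
    predicate = attrs["property"]  # assumed to be a full IRI already

    if "content" in attrs or "datatype" in attrs:
        # Literal object: @content wins over the element's text.
        datatype = attrs.get("datatype") or None
        obj: Object = ("literal", attrs.get("content", text), datatype)
    else:
        for name in ("resource", "href", "src"):
            if name in attrs:
                # IRI object taken from the first attribute present.
                obj = ("iri", urljoin(base, attrs[name]), None)
                break
        else:
            obj = ("literal", text, None)
    return (subject, predicate, obj)
```

For example, an element with about="#me" and a foaf:name property yields
a literal object taken from the element text, while an element with a
property and an @href yields an IRI object.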

4. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples,
and notes in this specification are non-normative. Everything else in this specification is
normative.

The key words MAY, MUST, MUST NOT, RECOMMENDED, SHOULD, and SHOULD NOT are
to be interpreted as described in [RFC2119].

4.1 RDFa Processor Conformance

This specification uses the term output graph to mean all of the
triples asserted by a document according to the Processing Model
section. A conforming RDFa Processor MUST make available to a consuming
application a single RDF graph containing all possible triples generated
by using the rules in the Processing Model section. The term processor
graph is used to denote the collection of all informational, warning,
and error triples that MAY be generated by the RDFa Processor to report
its status. The output graph and the processor graph are separate graphs
and MUST NOT be stored in the same graph by the RDFa Processor. However,
processors may permit the two graphs to be retrieved together; see
Section 7.6.1 for details.

A conforming RDFa Processor MAY make available additional triples
that have been generated using rules not described here, but these
triples MUST NOT be made available in the output graph.
(Whether these additional triples are made available in one or more
additional RDF graphs is implementation-specific, and
therefore not defined here.)
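
The separation between the output graph and the processor graph can be
pictured with a small container type. This is a sketch under stated
assumptions, not a prescribed design: the class ProcessingResult, its
method names, and the sample status triple are all invented for
illustration.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class ProcessingResult:
    """Keeps document triples and status triples in separate graphs."""

    def __init__(self) -> None:
        self.output_graph: Set[Triple] = set()
        self.processor_graph: Set[Triple] = set()

    def add_output(self, triple: Triple) -> None:
        self.output_graph.add(triple)

    def add_status(self, triple: Triple) -> None:
        self.processor_graph.add(triple)

    def combined(self) -> Set[Triple]:
        # Returns a new set; neither stored graph is modified.
        return self.output_graph | self.processor_graph


result = ProcessingResult()
result.add_output(("http://example.org/doc", "http://purl.org/dc/terms/title", "Example"))
# Illustrative status triple; the blank node label is made up.
result.add_status(("_:status1", RDF_TYPE, "http://www.w3.org/ns/rdfa#Error"))
both = result.combined()
```

Retrieving the two graphs together produces a fresh collection, so the
stored graphs stay distinct.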

A conforming RDFa Processor MUST preserve white space in both plain
literals and XML literals.
However, it may be the case that the architecture in which a processor
operates has made changes to the white space in a document before that
document ever reaches the RDFa Processor (e.g., [XMLSCHEMA11-1]
processors are permitted to 'normalize' white space in attribute
values - see section 3.1.4). To ensure maximum consistency between
processing environments, authors SHOULD remove any unnecessary white
space in their plain and XML Literal content.

A conforming RDFa Processor MUST examine the media type of a document
it is processing to determine the document's Host Language. If the
RDFa Processor is unable to determine the media type, or does not
support the media type, the RDFa Processor MUST process the document
as if it were media type application/xml. See XML+RDFa
Document Conformance. A Host Language MAY specify additional
announcement mechanisms.

Note

A conforming RDFa Processor MAY use additional
mechanisms (e.g., the DOCTYPE, a file extension, the root element, an overriding
user-defined parameter) to
attempt to determine the Host Language if the media type is
unavailable. These mechanisms are unspecified.
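
The media type rule can be sketched as a small function. The Python below
is only an illustration: the function name, the handling of media type
parameters, and the set of supported types are assumptions, and the
optional additional mechanisms mentioned in the note are not modeled.

```python
from typing import Iterable, Optional

FALLBACK = "application/xml"

# Hypothetical set of media types that this processor supports.
SUPPORTED = {"text/html", "application/xhtml+xml", "application/xml", "text/xml"}


def effective_media_type(media_type: Optional[str],
                         supported: Iterable[str] = SUPPORTED) -> str:
    """Pick the media type whose Host Language rules will be applied."""
    if media_type is None:
        # The media type could not be determined.
        return FALLBACK
    # Drop parameters such as "; charset=utf-8" and normalize case.
    normalized = media_type.split(";", 1)[0].strip().lower()
    if normalized in set(supported):
        return normalized
    # Unsupported media types are processed as application/xml.
    return FALLBACK
```

For instance, "text/html; charset=utf-8" selects the HTML rules, while an
unknown or missing media type falls back to application/xml.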

4.2 RDFa Host Language Conformance

Host Languages that incorporate RDFa must adhere to the following:

All of the facilities required in this specification MUST be
included in the Host Language.

The required attributes defined in this specification MUST be included in
the content model of the Host Language.

Note

For the avoidance of doubt, there is no requirement that attributes
such as @href and @src are used in a
conforming Host Language. Nor is there any requirement that all
required attributes are incorporated into the content model of
all elements. The working group recommends that Host Language designers
ensure that the required attributes are incorporated into the content
model of elements that are commonly used throughout the
content model of the Host Language.

If the Host Language uses XML Namespaces [XML-NAMES], the
attributes in this specification SHOULD be defined in 'no
namespace' (e.g., when the attributes are used on elements in the
Host Language's namespace, they can be used with no qualifying
prefix: <myml:myElement property="license">).
When a Host Language does not use the attributes in 'no namespace',
they MUST be referenced via the XHTML Namespace (http://www.w3.org/1999/xhtml).

If the Host Language has its own definition for any attribute
defined in this specification, that definition MUST be such that the
processing required by this specification remains possible when the
attribute is used in a way consistent with the requirements herein.

4.3 XML+RDFa Document Conformance

This specification does not define a stand-alone document type. The
attributes herein are intended to be integrated into other host
languages (e.g., HTML+RDFa or XHTML+RDFa). However, this specification
does define processing rules for generic XML
documents - that is, those documents delivered as media types text/xml
or application/xml. Such documents must meet all of the
following criteria:

The document SHOULD use the attributes defined in this
specification in 'no namespace' (e.g., when the attributes are used on
elements they are used with no qualifying
prefix: <myml:myElement property="license">).

Note

It is possible that an XML grammar will have native attributes that
conflict with attributes in this specification. This could result in an RDFa
processor generating unexpected triples.

When an RDFa Processor processes an XML+RDFa document, it does so via
the following initial context:

5. Attributes and Syntax

This specification defines a number of attributes and the way in which
the values of those attributes are to be interpreted when generating RDF
triples. This section defines the attributes and the syntax of their
values.

href

a traditionally navigable IRI for expressing the
partner resource of a relationship (a 'resource object', in RDF
terminology);

inlist

An attribute used to indicate that the object
associated with a rel or property
attribute on the same element is to be added to the list for that
predicate. The value of this attribute MUST be ignored.
Presence of this attribute causes a list to be created if it does not already exist.

In all cases it is possible for these attributes to be used with
no value (e.g., @datatype="") or with a value that evaluates to
no value after evaluation using the rules for
CURIE and IRI Processing
(e.g., @datatype="[noprefix:foobar]").

5.1 Roles of attributes

The RDFa attributes play different roles in a semantically rich document.
Briefly, those roles are:

6. CURIE Syntax Definition

Note

The working group is currently examining the productions
for CURIE below in light of recent comments received from the RDF
Working Group and members of the RDFa Working
Group. It is possible that there will be minor changes to the production
rules below in the near future, and that these changes will be
backward incompatible. However, any such incompatibility will be
limited to edge cases.

The key component of RDF is the IRI, but these are usually long and
unwieldy. RDFa therefore supports a mechanism by which IRIs can be
abbreviated, called 'compact URI expressions' or simply, CURIEs.

When expanded, the resulting IRI MUST be a syntactically valid IRI
[RFC3987]. For a more detailed explanation see CURIE
and IRI Processing. The lexical space of a CURIE is as
defined in curie below. The value space
is the set of IRIs.

A CURIE is composed of two components, a prefix
and a reference. The prefix is separated from the
reference by a colon (:). In general use it is possible to
omit the prefix, and so create a CURIE that makes use of the 'default
prefix' mapping; in RDFa the 'default prefix' mapping is http://www.w3.org/1999/xhtml/vocab#.
It's also possible to omit both the prefix and the colon, and
so create a CURIE that contains just a reference which makes use of the
'no prefix' mapping. This specification does not define a 'no prefix'
mapping. RDFa Host Languages MUST NOT define a 'no prefix' mapping.

Note

The RDFa 'default prefix' should not be confused with the
'default namespace' as defined in [XML-NAMES]. An RDFa Processor MUST NOT treat an XML-NAMES 'default namespace' declaration as if it were
setting the 'default prefix'.

The production safe_curie is not required,
even in situations where an attribute value is permitted to be a CURIE
or an IRI: An IRI that uses a scheme that is not an in-scope mapping cannot
be confused with a CURIE. The concept of a safe_curie is retained for
backward compatibility.

Note

It is possible to define a CURIE prefix mapping in such a way that
it would overshadow a defined IRI scheme. For example, a document could map the prefix
'mailto' to 'http://www.example.com/addresses/'. Then a @resource that
contained 'mailto:user@example.com' might create a triple with the object
'http://www.example.com/addresses/user@example.com'. Moreover, it is possible
though unlikely, that schemes will be introduced in the future that will conflict
with prefix mappings defined in a document (e.g., the newly proposed 'widget'
scheme [WIDGETS-URI]). In neither case would this RDFa overshadowing of the
underlying scheme alter the way other consumers of the IRI treat that IRI. It
could, however, mean that the document author's intended use of the CURIE is
mis-interpreted by another consumer as an IRI. The working group considers this
risk to be minimal.
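
The overshadowing described in this note can be reproduced with a short
sketch. The Python below assumes an attribute whose value may be either a
CURIE or an IRI; the function name curie_or_iri is hypothetical, and
resolution of relative IRIs against a base is left out.

```python
from typing import Mapping


def curie_or_iri(value: str, prefixes: Mapping[str, str]) -> str:
    """Expand value as a CURIE if its prefix is in scope, else keep it."""
    prefix, colon, reference = value.partition(":")
    # Prefixes are compared case-insensitively.
    in_scope = {name.lower(): iri for name, iri in prefixes.items()}
    if colon and prefix.lower() in in_scope:
        return in_scope[prefix.lower()] + reference
    # No in-scope mapping: the value is treated as an IRI.
    return value


value = "mailto:user@example.com"
shadowed = curie_or_iri(value, {"mailto": "http://www.example.com/addresses/"})
untouched = curie_or_iri(value, {})
```

With the 'mailto' prefix mapped, the same string no longer denotes the
mailto IRI inside this document, which is the risk the note describes.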

In normal evaluation of CURIEs the following context information would
need to be provided:

a set of mappings from prefixes to IRIs;

a mapping to use with the default prefix (for example, :p);

a mapping to use when there is no prefix (for example, p);

a mapping to use with the '_' prefix, which is used to generate
unique identifiers (for example, _:p).

In RDFa these values are defined as follows:

the set of mappings from prefixes to IRIs is
provided by the current in-scope prefix declarations of the current
element during parsing;

the mapping to use with the default prefix is the
current default prefix mapping;

the mapping to use when there is no prefix is not
defined;

the mapping to use with the '_' prefix is not
explicitly stated, but since it is used to generate bnodes,
its implementation needs to be compatible with the RDF definition and
rules in Referencing Blank Nodes. A
document SHOULD NOT define a mapping for the '_' prefix. A Conforming
RDFa Processor MUST ignore any definition of a mapping for the '_'
prefix.

A CURIE is a representation of a full IRI. The rules for determining
that IRI are:

If a CURIE consists of an empty prefix and a reference,
the IRI is obtained by taking the current default prefix mapping and
concatenating it with the reference. If there is no
current default prefix mapping, then this is not a valid CURIE and
MUST be ignored.

Otherwise, if a CURIE consists of a non-empty prefix
and a reference, and if there is an in-scope mapping for prefix
(when compared case-insensitively), then the IRI is created by using
that mapping, and concatenating it with the reference.

Finally, if there is no in-scope mapping for prefix,
then the value is not a CURIE.
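
The three rules above can be sketched in a few lines of Python. This is a minimal illustration, not a normative algorithm: the function name expand_curie, its parameters, and the convention of returning None for "not a CURIE" are assumptions made for this example, and the special '_' prefix (bnodes) is deliberately left out.

```python
from typing import Dict, Optional


def expand_curie(curie: str,
                 prefix_mappings: Dict[str, str],
                 default_prefix_mapping: Optional[str] = None) -> Optional[str]:
    """Return the IRI for a CURIE, or None when the value is not a usable CURIE."""
    prefix, separator, reference = curie.partition(":")
    if not separator:
        # No prefix at all (e.g. "p"): RDFa defines no mapping for this case.
        return None
    if prefix == "":
        # Empty prefix (e.g. ":p"): use the current default prefix mapping.
        if default_prefix_mapping is None:
            return None  # not a valid CURIE, so it is ignored
        return default_prefix_mapping + reference
    # Prefixes are compared case-insensitively. This sketch assumes the keys
    # of prefix_mappings were lower-cased when the mappings were declared.
    mapping = prefix_mappings.get(prefix.lower())
    if mapping is None:
        return None  # no in-scope mapping: the value is not a CURIE
    return mapping + reference
```

With a mapping of 'dbr' to 'http://dbpedia.org/resource/', the value 'dbr:Albert_Einstein' expands by plain concatenation, while a value with an unknown prefix is reported as "not a CURIE" so that the caller can decide how to treat it.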

6.1 Why CURIEs and not QNames?

This section is non-normative.

In many cases, language designers have attempted to use QNames for an
extension mechanism [XMLSCHEMA11-2]. QNames do permit independent
management of the name collection, and can map the names to
a resource. Unfortunately, QNames are unsuitable in most cases because
1) the use of QNames as identifiers in attribute values and element
content is problematic as discussed in [QNAMES] and 2) the syntax of
QNames is overly restrictive and does not allow all possible IRIs to
be expressed.

A specific example of the problem this causes comes from attempting
to define the name collection for books. In a QName, the part after
the colon must be a valid element name, making an example such as the
following invalid: isbn:0321154991

This is not a valid QName simply because "0321154991" is not a valid
element name. Yet, in the example given, we don't really want to
define a valid element name anyway. The whole reason for using a QName
was to reference an item in a private scope - that of ISBNs. Moreover,
in this example, we want the names within that scope to map to an IRI
that will reveal the meaning of that ISBN. As you can see, the
definition of QNames and this (relatively common) use case are in
conflict with one another.
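
The restriction can be checked mechanically. The sketch below is an illustration only: it uses an ASCII-only approximation of the XML NCName production (the real production also admits many non-ASCII characters), and the helper name is_qname_ascii is an assumption made for this example.

```python
import re

# ASCII-only approximation of the XML NCName production (an assumption made
# for brevity; the real production also admits many non-ASCII characters).
_NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_qname_ascii(value: str) -> bool:
    """Check whether value looks like a QName: NCName, or NCName ':' NCName."""
    prefix, separator, local = value.partition(":")
    if not separator:
        return _NCNAME_ASCII.fullmatch(value) is not None
    # Both the prefix and the local part must be valid names.
    return (_NCNAME_ASCII.fullmatch(prefix) is not None
            and _NCNAME_ASCII.fullmatch(local) is not None)
```

Under this check 'isbn:0321154991' is rejected because the local part starts with a digit, whereas a CURIE places no such restriction on the reference.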

This specification addresses the problem by defining CURIEs.
Syntactically, CURIEs are a superset of QNames.

Note that this specification is targeted at language designers, not
document authors. Any language designer considering the use of QNames
as a way to represent IRIs or unique tokens should consider instead
using CURIEs:

CURIEs are designed from the ground up to be used in attribute
values. QNames are designed for unambiguously naming elements and
attributes.

CURIEs expand to IRIs, and any IRI can be represented by such an
expansion. QNames are treated as value pairs, but even if those
pairs are combined into a string, only a subset of IRIs can be
represented.

CURIEs can be used in non-XML grammars, and can even be used in
XML languages that do not support XML Namespaces. QNames are limited
to XML Namespace-aware XML Applications.

7. Processing Model

This section looks at a generic set of processing rules for creating a
set of triples that represent the structured data present in an RDFa
document. Processing need not follow the DOM traversal technique
outlined here, although the effect of following some other manner of
processing must be the same as if the processing outlined here were
followed. The processing model is explained using the idea of DOM
traversal, which makes it easier to describe (particularly in relation to
the evaluation context).

Note that in this section, explanations about the
processing model or guidance to implementors are enclosed in sections
like this.

7.1 Overview

Evaluating a document for RDFa triples is carried out by starting at
the document object, and then visiting each of its child elements in
turn, in document order, applying processing rules. Processing is
recursive in that for each child element the processor also visits
each of its child elements, and applies the same processing
rules.

Note

In some environments there will be little difference
between starting at the root element of the document, and starting at
the document object itself. It is defined this way because in some
environments important information is present at the document object
level which is not present on the root element.

As processing continues, rules are applied which may generate
triples, and may also change the evaluation context
information that will then be used when processing descendant
elements.
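
The recursive traversal can be sketched with Python's standard xml.etree.ElementTree module. This is an illustration under simplifying assumptions: it starts at the root element rather than at a document object, the "context" is reduced to a single subject value, and the only rule applied is "an @about value replaces the subject". The names walk and collect_subjects are hypothetical.

```python
import xml.etree.ElementTree as ET


def walk(element, context, visit):
    """Visit element, then its children in document order (depth-first)."""
    # The rules applied to this element may produce a changed context,
    # which is what the descendant elements will see.
    child_context = visit(element, context)
    for child in element:
        walk(child, child_context, visit)


def collect_subjects(markup, base):
    """Record (tag, subject) for every element, in visiting order."""
    seen = []

    def visit(element, subject):
        # Simplified rule: @about sets a new subject, otherwise inherit.
        subject = element.get("about", subject)
        seen.append((element.tag, subject))
        return subject

    walk(ET.fromstring(markup), base, visit)
    return seen
```

Note that a change made for one element affects only its descendants: a later sibling is processed with the context its own parent passed down.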

Note

This specification does not say anything about what
should happen to the triples generated, or whether more triples might
be generated during processing than are outlined here. However, to be
conforming, an RDFa Processor MUST act as if at a minimum the rules in
this section are applied, and a single RDF graph
produced. As described in the RDFa Processor
Conformance section, any additional triples generated MUST NOT
appear in the output graph. They MAY be included in
the processor graph.

7.2 Evaluation Context

During processing, each rule is applied using information provided
by an evaluation context. An initial context
is created when processing begins. That context has the following
members:

The base. This will usually be the IRI of the
document being processed, but it could be some other IRI, set by
some other mechanism, such as the (X)HTML base
element. The important thing is that it establishes an IRI against
which relative paths can be resolved.

The parent subject. The initial
value will be the same as the initial value of base,
but it will usually change during the course of processing.

The parent object. In some
situations the object of a statement becomes the subject of any
nested statements, and this member is used to convey this value.
Note that this value may be a bnode, since in some
situations a number of nested statements are grouped together on one
bnode. This means that the bnode must be
set in the containing statement and passed down.

A list of current, in-scope IRI
mappings.

A list of incomplete triples. A triple can be
incomplete when no object resource is provided alongside a predicate
that requires a resource (i.e., @rel or @rev).
The triples can be completed when a resource becomes available,
which will be when the next subject is specified (part of the
process called chaining).

A list mapping that associates IRIs with lists.

The language. Note that there is no default
language.

The term mappings, a list of terms and their
associated IRIs. This specification does not define an initial list.
Host Languages MAY define an initial list.

The default vocabulary, a value to use as the prefix
IRI when a term unknown to the RDFa
Processor is used. This specification does not
define an initial setting for the default vocabulary. Host Languages
MAY define an initial setting.

During the course of processing, new evaluation contexts
are created which are passed to each child element. The initial rules
described below will determine the values of the items in the context.
Then the core rules will cause new triples to be created by
combining information provided by an element with information from the
evaluation context.

During the course of processing a number of locally scoped values are
needed, as follows:

An initially empty list of IRI mappings, called the
local list of IRI mappings.

An initially empty list of incomplete triples,
called the local list of incomplete triples.

A value for the current
property value, the literal to use when creating triples
that have a literal object, or IRIs in the absence of @rel
or @rev.

A value for the current
object resource, the resource to use when creating triples
that have a resource object.

A value for the typed resource,
the source for creating rdf:type relationships to
types specified in @typeof.

The local term mappings, a list of terms and their
associated IRIs.

The local list mapping, which maps IRIs to lists.

A local default vocabulary, an IRI to use as a
prefix mapping when a term is used.
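
One possible in-memory shape for the evaluation context is sketched below. The field names mirror the members listed above, but the class itself, the representation of an incomplete triple as a (predicate, direction) pair, and the helper initial_context are assumptions made for this illustration; Host Languages may supply initial term mappings and a default vocabulary, which this sketch leaves empty.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class EvaluationContext:
    base: str                                  # IRI used to resolve relative paths
    parent_subject: str                        # initially the same as base
    parent_object: Optional[str] = None        # may be an IRI or a bnode
    iri_mappings: Dict[str, str] = field(default_factory=dict)
    # Each incomplete triple is kept as (predicate IRI, direction); the
    # direction strings are an assumption of this sketch.
    incomplete_triples: List[Tuple[str, str]] = field(default_factory=list)
    list_mapping: Dict[str, list] = field(default_factory=dict)
    language: Optional[str] = None             # there is no default language
    term_mappings: Dict[str, str] = field(default_factory=dict)
    default_vocabulary: Optional[str] = None   # no initial setting defined here


def initial_context(document_iri: str) -> EvaluationContext:
    """Create the context used when processing begins."""
    return EvaluationContext(base=document_iri, parent_subject=document_iri)
```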

7.3 Chaining

Statement chaining is an RDFa feature that allows the
author to link RDF statements together while avoiding unnecessary
repetitive markup. For example, if an author were to add statements as
children of an object that was a resource, these statements should be
interpreted as being about that resource:

In this example we can see that an object resource
('German_Empire'), has become the subject for nested statements. This
markup also illustrates the basic chaining pattern of 'A has a B has a
C' (i.e., Einstein has a birth place of the German Empire, which has a
long name of 'the German Empire').

It's also possible for the subject of nested statements to provide
the object for containing statements — essentially the
reverse of the example we have just seen. To illustrate, we'll take an
example of the type of chaining just described, and show how it could
be marked up more efficiently. To start, we mark up the fact that
Albert Einstein had, at some point in his life, a residence both in
the German Empire and in Switzerland:

The subject for 'the German Empire' would remain Albert Einstein (and
that would, of course, be an error). This is the main difference
between @property and @rel: the latter
induces chaining, whereas the former, usually, does not.

7.4 CURIE and IRI Processing

Since RDFa is ultimately a means for transporting RDF, a key concept
is the resource and its manifestation as an IRI. RDF deals
with complete IRIs (not relative paths); when converting RDFa to
triples, any relative IRIs MUST be resolved relative to the base IRI,
using the algorithm defined in section 6.5 of RFC 3987 [RFC3987], Reference
Resolution. The values of RDFa attributes
that refer to IRIs use three different datatypes: IRI,
SafeCURIEorCURIEorIRI, or TERMorCURIEorAbsIRI.
All these attributes are mapped, after processing, to IRIs. The
handling of these attributes is as follows:

IRI

The content is an IRI, and is used as such.

SafeCURIEorCURIEorIRI

When the value is surrounded by square brackets, then the
content within the brackets is evaluated as a CURIE according to
the CURIE Syntax Definition. If it is
not a valid CURIE, the value MUST be ignored.

Otherwise, the value is evaluated as a CURIE. If it is a valid
CURIE, the resulting IRI is used; otherwise, the value is
processed as an IRI.

Note

A consequence of this is that when the value of an attribute of this
datatype is the empty string (e.g., @about=""), that value resolves to an
IRI. An IRI of "" is a relative IRI that is interpreted as being the same as the base.
In other words, a value of "" will usually resolve to the IRI of the current document.

Note

A related consequence of this is that when the value of an attribute of this datatype is an empty SafeCURIE (e.g., @about="[]"), that value does not result in an IRI and therefore the value is ignored.

Note that it is possible for all values in an attribute to be ignored.
When that happens, the attribute MUST be treated as if it were empty.
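
The handling of a single SafeCURIEorCURIEorIRI value can be sketched as follows. This is an illustration, not a conforming implementation: the function names are hypothetical, None stands for "ignored", and urllib.parse.urljoin (which follows RFC 3986 rather than RFC 3987) is used as a stand-in for IRI reference resolution.

```python
from typing import Dict, Optional
from urllib.parse import urljoin


def _expand_curie(curie: str, prefixes: Dict[str, str],
                  default_mapping: Optional[str]) -> Optional[str]:
    """Expand a CURIE, or return None when the value is not a CURIE."""
    prefix, separator, reference = curie.partition(":")
    if not separator:
        return None
    mapping = default_mapping if prefix == "" else prefixes.get(prefix.lower())
    return None if mapping is None else mapping + reference


def resolve_value(value: str, prefixes: Dict[str, str], base: str,
                  default_mapping: Optional[str] = None) -> Optional[str]:
    """Resolve one SafeCURIEorCURIEorIRI value to an IRI (None = ignored)."""
    if value.startswith("[") and value.endswith("]"):
        # Safe CURIE: the bracketed content must be a valid CURIE.
        iri = _expand_curie(value[1:-1], prefixes, default_mapping)
        if iri is None:
            return None
    else:
        # Try as a CURIE first, then fall back to treating it as an IRI.
        iri = _expand_curie(value, prefixes, default_mapping)
        if iri is None:
            iri = value
    # Relative IRIs (including "") are resolved against the base.
    return urljoin(base, iri)
```

In this sketch an empty value resolves to the base, while the empty safe CURIE '[]' yields no IRI at all, matching the two notes above.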

For example, the full IRI for Albert Einstein on DBPedia is:

Example 31

http://dbpedia.org/resource/Albert_Einstein

This can be shortened by authors to make the information easier to
manage, using a CURIE. The first step is for the author to create a
prefix mapping that links a prefix to some leading segment of the IRI.
In RDFa these mappings are expressed using @prefix:

Example 32

<div prefix="db: http://dbpedia.org/">
...
</div>

Once the prefix has been established, an author can then use it to
shorten an IRI as follows:

The author is free to split the IRI at any point.
However, since a common use of CURIEs is to
make available libraries of terms and values, the prefix will usually
be mapped to some common segment that provides the most re-use, often
provided by those who manage the library of terms. For example, since
DBPedia contains an enormous list of resources, it is more efficient
to create a prefix mapping that uses the base location of the
resources:

Note that it is generally considered a bad
idea to use relative paths in prefix declarations. Since it is
possible that an author may ignore this guidance, it is further
possible that the IRI obtained from a CURIE is relative. However,
since all IRIs must be resolved relative to base before
being used to create triples, the use of relative paths should not
have any effect on processing.

7.4.1 Scoping of Prefix Mappings

CURIE prefix mappings are defined on the current element and its
descendants. The inner-most mapping for a given prefix takes
precedence. For example, the IRIs expressed by the following two
CURIEs are different, despite the common prefix, because the prefix
mappings are locally scoped:

In general it is bad practice to redefine prefix
mappings within a document. In particular, while it is permitted, mapping a
prefix to different values at different places within a document could lead to
confusion. The working group recommends that document authors use the same
prefix to map to the same vocabulary throughout a document. Many vocabularies
have recommended prefix names. The working group recommends that these names
be used whenever possible.
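
The "inner-most mapping wins" rule behaves like a stack of dictionaries. The sketch below models it with collections.ChainMap from the Python standard library; the function name enter_element and the 'http://example.org/other/' IRI are hypothetical and serve only this illustration.

```python
from collections import ChainMap
from typing import Dict


def enter_element(in_scope: ChainMap, declared: Dict[str, str]) -> ChainMap:
    """Return the mappings in scope for an element that declares `declared`."""
    # new_child() layers the element's own declarations over the inherited
    # ones without modifying the mappings seen by ancestors or siblings.
    return in_scope.new_child(dict(declared))


# Mappings declared on an outer element.
outer = ChainMap({"dbr": "http://dbpedia.org/resource/",
                  "foaf": "http://xmlns.com/foaf/0.1/"})
# A descendant redefines 'dbr' (permitted, but discouraged).
inner = enter_element(outer, {"dbr": "http://example.org/other/"})
```

Here 'dbr' expands differently inside and outside the inner element, while 'foaf' is inherited unchanged.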

7.4.2 General Use of CURIEs in Attributes

There are a number of ways that attributes make use of CURIEs, and
they need to be dealt with differently. These are:

An attribute may allow one or more values that are a mixture of
TERMs, CURIEs, and absolute IRIs.

An attribute may allow one or more values that are a mixture of
CURIEs and IRIs. In this case any value that is not a CURIE, as
outlined in section CURIE Syntax Definition,
will be processed as an IRI.

If the value is surrounded by square brackets, then
the content within the brackets is always evaluated according to
the rules in CURIE Syntax Definition -
and if that content is not a CURIE, then the content MUST be
ignored.

Note

An empty attribute value (e.g., typeof='')
is still a CURIE, and is processed as such. The rules
for this processing are defined in Sequence.
Specifically, however, an empty attribute value is never
treated as a relative IRI by this specification.

An example of an attribute that can contain a SafeCURIEorCURIEorIRI is @about.
To express an IRI directly, an author might do this:

Example 36

<div about="http://dbpedia.org/resource/Albert_Einstein">
...
</div>

whilst to express the IRI above as a CURIE an author would do this:

Example 37

<div about="dbr:Albert_Einstein">
...
</div>

The author could also use a safe CURIE, as follows:

Example 38

<div about="[dbr:Albert_Einstein]">
...
</div>

Since non-CURIE values MUST be ignored, the following value in @about
would not set a new subject, since @about
does not permit the use of TERMs, and the CURIE
has no prefix separator.

Example 39

<div about="[Albert_Einstein]">
...
</div>

However, this markup would set a subject, since it is not
a CURIE, but a valid relative IRI:

Example 40

<div about="Albert_Einstein">
...
</div>

Note that several RDFa attributes can also take TERMs as their value.
This is discussed in the next section.

7.4.3 General Use of Terms in Attributes

Some RDFa attributes have a datatype that permits a term to be referenced.
RDFa defines the syntax of a term as:

Otherwise, check if the term matches an item in the list of local
term mappings. First compare against the list case-sensitively,
and if there is no match then compare case-insensitively.
If there is a match, use the associated IRI.
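
The two-pass lookup described in this step can be sketched as follows. Only this step is covered: any earlier step in the term-resolution sequence is outside the sketch, and the function name resolve_term and the use of None for "no match" are assumptions.

```python
from typing import Dict, Optional


def resolve_term(term: str, local_term_mappings: Dict[str, str]) -> Optional[str]:
    """Look a term up case-sensitively first, then case-insensitively."""
    # First pass: exact, case-sensitive match.
    if term in local_term_mappings:
        return local_term_mappings[term]
    # Second pass: case-insensitive match against every known term.
    lowered = term.lower()
    for known, iri in local_term_mappings.items():
        if known.lower() == lowered:
            return iri
    return None  # no match in the local term mappings
```

Trying the exact match first matters when two terms differ only by case: the author's spelling then selects the intended one.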

7.4.5 Referencing Blank Nodes

In RDFa, it is possible to establish relationships using various
types of resource references, including bnodes. If a
subject or object is defined using a CURIE, and that CURIE
explicitly names a bnode, then a Conforming Processor
MUST create the bnode when it is encountered during
parsing. The RDFa Processor MUST also ensure that no bnode
created automatically (e.g., as a result of chaining) has a
name that collides with a bnode that is defined by
explicit reference in a CURIE.

In the above fragment, two bnodes are
explicitly created as the subject of triples. Those bnodes
are then referenced to demonstrate the relationship between the
parties. After processing, the following triples will be generated:

RDFa Processors use, internally, implementation-dependent
identifiers for bnodes. When triples are retrieved, new
bnode identifiers are used, which usually bear no relation to the
original identifiers. However, implementations do ensure that these
generated bnode identifiers are consistent: each bnode will have its
own identifier, all references to a particular bnode will use the
same identifier, and different bnodes will have different
identifiers.

As a special case, _: is also a valid reference for one
specific bnode.
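
One way to satisfy these constraints is to draw every internal identifier, whether for an explicitly named bnode or an automatically created one, from a single counter. The class below is a hypothetical sketch of that idea; the identifier format '_:b0', '_:b1', and so on is an arbitrary choice.

```python
import itertools


class BNodeAllocator:
    """Hands out internal bnode identifiers that can never collide."""

    def __init__(self):
        self._counter = itertools.count()
        self._by_label = {}

    def _fresh(self):
        return "_:b%d" % next(self._counter)

    def named(self, curie):
        """Map a document-level reference such as '_:a' to an internal id."""
        label = curie[2:]  # '_:' alone gives the empty label, one specific bnode
        if label not in self._by_label:
            self._by_label[label] = self._fresh()
        return self._by_label[label]

    def anonymous(self):
        """Create a bnode that has no name in the document (e.g. chaining)."""
        return self._fresh()
```

Because both kinds of bnode share the counter, an author who happens to write '_:b0' in a document cannot clash with an identifier the processor generated on its own.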

7.5 Sequence

Processing would normally begin after the document to be parsed has
been completely loaded. However, there is no requirement for this to
be the case, and it is certainly possible to use a stream-based
approach, such as SAX [SAX] to extract the RDFa information.
However, if some approach other than the DOM traversal technique
defined here is used, it is important to ensure that Host
Language-specific processing rules are applied (e.g., XHTML+RDFa
[XHTML-RDFA] indicates the base element can be used,
and base will affect the interpretation of IRIs in meta
or link elements even if those elements are before the base
element in the stream).

Note

In this section the term 'resource' is used to mean 'IRI
or bnode'. It is possible that this term will be replaced with
some other, more formal term after consulting with other groups. Changing this
term will in no way change this processing sequence.

At the beginning of processing, an initial evaluation
context is created, as follows:

the base is set to the IRI of the document (or
another value specified in a language specific manner such as the
HTML base element);

Processing begins by applying the processing rules below to the
document object, in the context of this initial evaluation
context. All elements in the tree are also processed
according to the rules described below, depth-first, although the evaluation
context used for each set of rules will be based on previous
rules that may have been applied.

Note

This specification defines processing rules for optional
attributes that may not be present in all Host Languages (e.g., @href).
If these attributes are not supported in the Host Language, then the
corresponding processing rules are not relevant for that language.

Note that some of the local variables are
temporary containers for values that will be passed to descendant
elements via an evaluation context. In some cases
the containers will have the same name, so to make it clear which
is being acted upon in the following steps, the local version of
an item will generally be referred to as such.

Mappings are defined via @prefix. Values in this attribute are
evaluated from beginning to end (e.g., left to right in typical
documents). For backward compatibility, RDFa Processors SHOULD also
permit the definition of mappings via @xmlns. In this case, the value
to be mapped is set by the XML namespace prefix, and the value to map
is the value of the attribute, an IRI. (Note that prefix mapping via
@xmlns is deprecated, and may be removed in a future version of this
specification.) When @xmlns is supported, such mappings MUST be
processed before processing any mappings from @prefix on the same
element.
Regardless of how the mapping is declared, the value to be
mapped MUST be converted to lower case, and the IRI is
not processed in any way; in particular if it is a relative path
it MUST NOT be resolved against the current base.
Authors SHOULD NOT use relative paths as the IRI.
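
The mapping rules in this step can be sketched as follows. The sketch assumes that the @prefix value is a whitespace-separated sequence of 'prefix: IRI' pairs, as in the @prefix example earlier in this document; it does no error recovery beyond skipping a pair whose first token lacks the trailing colon, and the function names are hypothetical.

```python
from typing import Dict


def parse_prefix_attribute(value: str) -> Dict[str, str]:
    """Turn an @prefix value into a dictionary of prefix mappings."""
    tokens = value.split()
    mappings: Dict[str, str] = {}
    # Pairs are evaluated from beginning to end, so a later pair for the
    # same prefix replaces an earlier one.
    for name, iri in zip(tokens[0::2], tokens[1::2]):
        if not name.endswith(":"):
            continue  # malformed pair, skipped in this sketch
        prefix = name[:-1].lower()  # the value to be mapped is lower-cased
        if prefix == "_":
            continue  # a mapping for '_' is ignored
        mappings[prefix] = iri  # the IRI is stored as written, unresolved
    return mappings


def mappings_for_element(xmlns: Dict[str, str], prefix_value: str) -> Dict[str, str]:
    """Combine @xmlns declarations with @prefix on the same element."""
    result = {name.lower(): iri for name, iri in xmlns.items() if name != "_"}
    # @xmlns is processed first, so @prefix takes effect afterwards.
    result.update(parse_prefix_attribute(prefix_value))
    return result
```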

If in any of the previous steps a typed resource was
set to a non-null value, it is now used to provide a subject for
type values;

One or more 'types' for the typed
resource can be set by using @typeof. If
present, the attribute may contain one or more IRIs, obtained
according to the section on CURIE
and IRI Processing, each of which is used to generate a
triple as follows:

If however current object
resource was set to null, but there are predicates present,
then they must be stored as incomplete triples,
pending the discovery of a subject that can be used as the object.
Also, current object resource should be set to a newly
created bnode (so that the incomplete triples have a
subject to connect to if they are ultimately turned into triples);
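
How stored incomplete triples are later turned into real triples can be sketched as below. The representation is an assumption of this illustration: each incomplete triple is a (predicate, direction) pair, with 'forward' standing for a predicate that came from @rel and 'reverse' for one that came from @rev.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def complete_triples(parent_subject: str,
                     incomplete: List[Tuple[str, str]],
                     new_subject: str) -> List[Triple]:
    """Complete pending triples once a subject becomes available."""
    triples: List[Triple] = []
    for predicate, direction in incomplete:
        if direction == "forward":
            # @rel: the parent subject points at the newly found resource.
            triples.append((parent_subject, predicate, new_subject))
        else:
            # @rev: the newly found resource points back at the parent subject.
            triples.append((new_subject, predicate, parent_subject))
    return triples
```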

The actual literal is either the value of @content
(if present) or a string created by concatenating, in turn,
the values of all descendant text nodes of the current
element. The final string includes the
datatype IRI, as described in [RDF-SYNTAX-GRAMMAR], which will
have been obtained according to the section on CURIE
and IRI Processing.

The actual literal is either the value of @content
(if present) or a string created by concatenating, in turn,
the values of all descendant text nodes of the current
element.

otherwise, as an XML
literal if @datatype is present and is
set to XMLLiteral in the vocabulary http://www.w3.org/1999/02/22-rdf-syntax-ns#.

The value of the XML literal
is a string created by serializing to text all nodes that
are descendants of the current element (i.e.,
not including the element itself), and giving it a datatype
of XMLLiteral in the vocabulary http://www.w3.org/1999/02/22-rdf-syntax-ns#.
The format of the resulting serialized content is as defined
in Exclusive XML Canonicalization Version 1.0 [XML-EXC-C14N].

Note

In order to maintain maximum portability of this literal,
any children of the current node that are elements MUST have
the current XML namespace declarations (if any) declared on
the serialized element. Since the child element node could
also declare new XML namespaces, the RDFa Processor MUST be
careful to merge these together when generating the
serialized element definition. For avoidance of doubt, any
re-declarations on the child node MUST take precedence over
declarations that were active on the current node.

Additionally, if there is a value for current
language then the value of the plain literal
should include this language information, as described in
[RDF-SYNTAX-GRAMMAR]. The actual literal is either the value of
@content (if present) or a string
created by concatenating the text content of each of the
descendant elements of the current element in
document order.
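
The rule for obtaining the lexical value of a non-XML literal can be sketched with xml.etree.ElementTree. This is an illustration only: the function name literal_value is hypothetical, and datatype and language handling are left out.

```python
import xml.etree.ElementTree as ET


def literal_value(element: ET.Element) -> str:
    """Return @content if present, else the concatenated descendant text."""
    content = element.get("content")
    if content is not None:
        # @content wins even when it is the empty string.
        return content
    # itertext() yields every descendant text node in document order.
    return "".join(element.itertext())
```

For markup such as <span property="dc:title">Albert <em>Einstein</em></span> the value is the text a reader sees, with the inline markup dropped.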

Finally, if there are one or more mappings in
the local list mapping, list triples are generated as
follows:

For each IRI in the local list mapping,
if the equivalent list does not exist in the
evaluation context, indicating that the list was
originally instantiated on the current element, use the list as
follows:

If there are zero items in the list associated with the IRI,
generate the following triple:

7.6 Processor Status

The processing rules covered in the previous section are designed to
extract as many triples as possible from a document. The RDFa
Processor is designed to continue processing, even in the event of
errors. For example, failing to resolve a prefix mapping or term
would result in the RDFa Processor skipping the generation of a triple
and continuing with document processing. There are cases where knowing
each RDFa Processor warning or error would be beneficial to authors.
The processor graph is designed as the mechanism
to capture all informational, warning, and error messages as triples
from the RDFa Processor. These status triples may be retrieved and
used to aid RDFa authoring or automated error detection.

If an RDFa Processor supports the generation of a processor graph,
then it MUST generate a set of triples when the following processing
issues occur:

An rdfa:Error MUST be generated when the document fails to be
fully processed as a result of non-conformant Host Language markup.

An rdfa:Warning MUST be generated when a CURIE prefix fails to be
resolved.

An rdfa:Warning MUST be generated when a Term fails to be resolved.

Other implementation-specific rdfa:Info, rdfa:Warning,
or rdfa:Error triples MAY be generated by the RDFa Processor.

7.6.1 Accessing the Processor Graph

Accessing the processor graph may be accomplished in
a variety of ways and is dependent on the type of RDFa Processor and
access method that the developer is utilizing.

SAX-based processors or processors that utilize function or method
callbacks to report the generation of triples are classified as event-based
RDFa Processors. For event-based RDFa Processors, the
software MUST allow the developer to register a function or callback
that is called when a triple is generated for the processor
graph. The callback MAY be the same as the one that is used
for the output graph as long as it can be determined
if a generated triple belongs in the processor graph
or the output graph.

A whole-graph RDFa Processor is defined as any RDFa
Processor that processes the entire document and only provides the
developer access to the triples after processing has completed. RDFa
Processors that typically fall into this category express their
output via a single call using RDF/XML, N3, Turtle, or N-Triples
notation. For whole-graph RDFa Processors, the software MUST allow
the developer to specify if they would like to retrieve the output
graph, the processor graph, or both graphs as
a single, combined graph from the RDFa Processor.
If the graph preference is not specified, the output graph MUST be returned.

A web service RDFa Processor is defined as any RDFa
Processor that is capable of processing a document by performing an
HTTP GET, POST or similar action on an RDFa Processor IRI. For this
class of RDFa Processor, the software MUST allow the caller to
specify if they would like to retrieve the output graph,
the processor graph, or both graphs as a single,
combined graph from the web service. The rdfagraph
query parameter MUST be used to specify the value. The allowable
values are output, processor or both
values, in any order, separated by a comma character.
If the graph preference is not specified, the output graph MUST be returned.
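
A web service might interpret the rdfagraph parameter along the following lines. This is a sketch, not a required implementation: the function name requested_graphs and the service URL used with it are hypothetical, and falling back to the output graph when no recognized value is supplied is an assumption that goes beyond the "not specified" case covered above.

```python
from typing import Set
from urllib.parse import parse_qs, urlsplit

ALLOWED_GRAPHS = {"output", "processor"}


def requested_graphs(url: str) -> Set[str]:
    """Return the set of graphs a caller asked for via ?rdfagraph=..."""
    values = parse_qs(urlsplit(url).query).get("rdfagraph")
    if not values:
        return {"output"}  # preference not specified: output graph
    # The values may appear in any order, separated by commas.
    requested = {item.strip() for item in values[0].split(",")}
    selected = requested & ALLOWED_GRAPHS
    # Assumption of this sketch: unrecognized values are dropped.
    return selected or {"output"}
```

A request such as 'http://example.org/distill?rdfagraph=processor,output' therefore selects both graphs, to be returned as a single combined graph.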

7.6.2 Processor Graph Terms

To ensure interoperability, a core hierarchy of classes is defined
for the content of the processor graph. Separate errors or warnings
are resources (typically blank nodes) of a specific type, with
additional properties giving more details on the error condition or
the warning. This specification defines only the top level classes
and the ones referring to the error and warning conditions defined explicitly
by this document. Other, implementation-specific subclasses may be
defined by the RDFa Processor.

The top level classes are rdfa:Error, rdfa:Warning,
and rdfa:Info, defined as part of the RDFa
Vocabulary. Furthermore, a single property is defined on those
classes, namely rdfa:context, that provides extra
context for the error, e.g., an HTTP response, XPath information, or
simply the IRI of the RDFa resource. Usage of this property is
optional, and more than one triple can be used with this predicate
on the same subject. Finally, error and warning instances SHOULD use
the dc:description and dc:date
properties. dc:description should provide a short,
human readable but implementation dependent description of the
error. dc:date should give the time when the error was
found and it is advised to be as precise as possible to allow the
detection of, for example, possible network errors.

The example below shows the triples that should be minimally
present in the processor graph as a result of an error (the content
of the literal for the dc:description predicate is
implementation dependent):
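
As a rough illustration of the shape of such status information, the sketch below builds the three triples for one status resource as plain Python tuples. Several details are assumptions made for this example: the helper name status_triples, the expansion of the rdfa: and dc: prefixes to 'http://www.w3.org/ns/rdfa#' and 'http://purl.org/dc/terms/', the use of xsd:dateTime for dc:date, and the representation of a typed literal as a (lexical form, datatype IRI) pair.

```python
from datetime import datetime, timezone
from typing import List, Tuple

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFA = "http://www.w3.org/ns/rdfa#"          # assumed expansion of rdfa:
DC = "http://purl.org/dc/terms/"             # assumed expansion of dc:
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def status_triples(node: str, status_class: str,
                   description: str, when: datetime) -> List[Tuple]:
    """Describe one error, warning or info resource in the processor graph."""
    return [
        (node, RDF_TYPE, RDFA + status_class),            # e.g. rdfa:Error
        (node, DC + "description", description),          # human-readable text
        (node, DC + "date", (when.isoformat(), XSD_DATETIME)),
    ]


example = status_triples("_:status0", "Error",
                         "The document could not be fully processed.",
                         datetime(2010, 6, 30, 13, 40, 23, tzinfo=timezone.utc))
```

The subject would typically be a blank node, and the wording of the description is left entirely to the implementation.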

A slightly more elaborate example makes use of the rdfa:context
property to provide further information, using external vocabularies
to represent HTTP headers or XPointer information (note that a
processor may not have this information in all cases, i.e., the rdfa:context
information is not required):

7.7 Vocabulary Expansion

Processors MAY perform vocabulary expansion by
utilizing limited RDFS and OWL entailment rules,
as described in RDFa
Vocabulary Expansion.

8. RDFa Processing in detail

This section is non-normative.

This section provides an in-depth examination of the processing steps
described in the previous section. It also includes examples which may
help clarify some of the steps involved.

The key to processing is that a triple is generated whenever a
predicate/object combination is detected. The actual triple generated
will include a subject that may have been set previously, so this is
tracked in the current evaluation context and is called
the parent subject. Since the subject will default to the
current document if it hasn't been set explicitly, then a
predicate/object combination is always enough to generate one or more
triples.

The attributes for setting a predicate are @rel, @rev
and @property, whilst the attributes for setting an object
are @resource, @href, @content,
and @src. @typeof is unique in that it sets both
a predicate and an object at the same time (and also a subject when it
appears in the absence of other attributes that would set a subject).
Inline content might also set an object, if @content is not
present, but @property is present.

Note

There are many examples in this section. The examples are
all written using XHTML+RDFa. However, the explanations are relevant
regardless of the Host Language.

8.1 Changing the Evaluation Context

8.1.1 Setting the current subject

When triples are created they will always be in relation to a
subject resource which is provided either by new subject
(if there are rules on the current element that have set a subject)
or parent subject, as passed in via the evaluation
context. This section looks at the specific ways in which
these values are set. Note that it doesn't matter how the subject is
set, so in this section we use the idea of the current
subject which may be either new subject
or parent subject.

8.1.1.1 The current document

When parsing begins, the current subject will be
the IRI of the document being parsed, or a value as set by a Host
Language-provided mechanism (e.g., the base element
in (X)HTML). This means that by default any metadata found in the
document will concern the document itself:

As processing progresses, any @about attributes will
change the current subject. The value of @about
is an IRI or a CURIE. If it is a relative IRI then it needs to be
resolved against the current base value. To
illustrate how this affects the statements, note in this markup
how the properties inside the (X)HTML body element
become part of a new calendar event object, rather than referring
to the document as they do in the head of the document:

@typeof defines typing triples. @typeof
works differently to other ways of setting a predicate since the
predicate is always rdf:type, which means that the
processor only requires the value of the type. The
question is: which resource gets this typing information?

If the element has an @about, which creates a new
context for statements, the typing relationships are defined on
that resource. For example, the following:

The @about attribute is the main source for typing;
if it is present on an element, it determines the effect of @typeof
with the highest priority. If @about is not
present, but the element is used only to define possible subject
resources via, e.g., @resource (i.e., there is no @rel, @rev, or @property
present), then that resource is used for the typed resource, just
like @about.

If an @rel is present (and still no @about)
then the explicit object of the triples defined by @rel
is typed. For example, in the case of:

Finally, @typeof also has the additional feature of
creating a new context for statements, in case no other
attributes define any. This involves generating a new bnode
(see below for more about bnodes). For example, an author may wish
to create markup for a person using the FOAF vocabulary, but
without having a clear identifier for the item:

A bnode is simply a unique
identifier that is only available to the processor, not to any
external software. By generating values internally, the processor
is able to keep track of properties for _:a as being
distinct from _:b. But by not exposing these values
to any external software, it is possible to have complete control
over the identifier, as well as preventing further statements
being made about the item.

As emphasized in the section on chaining,
one of the main differences between @property and @rel
(or @rev) is that the former does not induce
chaining. The only exception to this rule is when @typeof
is also present on the element. In that case the effect of @property
is identical to @rel. For example, the previous
example could have been written as:

generating the same triples as before. Here again, a @typeof without
an @about or a @resource can be regarded as a shorthand
for an additional @resource attribute referring to the identifier of a fresh bnode.
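The typing rules described in this section can be summarized in a short Python sketch. It is a simplification and not the normative processing sequence: the element is modeled as a plain dictionary of attribute names to already-resolved values, the helper names typed_resource and fresh_bnode are hypothetical, and host-language special cases (such as the root element) are ignored. Treating @href and @src like @resource is an assumption based on the note that the same principle applies to them.

```python
from itertools import count

_bnode_counter = count()


def fresh_bnode():
    """Return a new blank node label (hypothetical helper)."""
    return f"_:b{next(_bnode_counter)}"


def typed_resource(attrs):
    """Pick the resource that receives the rdf:type triples of @typeof.

    `attrs` maps attribute names to already-resolved values.
    Returns None when the element carries no @typeof.
    """
    if "typeof" not in attrs:
        return None
    # @about has the highest priority.
    if "about" in attrs:
        return attrs["about"]
    # Otherwise an explicit resource is typed, whether it acts as a
    # subject or as the object of @rel / @property.
    for name in ("resource", "href", "src"):
        if name in attrs:
            return attrs[name]
    # No identifier at all: @typeof implies a fresh bnode.
    return fresh_bnode()
```

With this sketch, an element carrying only @typeof yields a bnode label, while adding @about to the same element makes the @about value the typed resource.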

As described in the previous two sections, @about
will always take precedence and mark a new subject, but if no @about
value is available then @typeof will do the same job,
although using an implied identifier, i.e., a bnode.

But if neither @about nor @typeof is
present, there are a number of ways that the subject could be
arrived at. One of these is to 'inherit' the subject from the
containing statement, with the value to be inherited set either
explicitly, or implicitly.

The most usual way that an inherited subject might get set
would be when the parent statement has an object that is a
resource. Returning to the earlier example, in which the long
name for the German Empire was added, the following markup was
used:

In this situation, all statements that are 'contained' by the
object resource representing the German Empire (the value in @resource)
will have the same subject, making it easy for authors to add
additional statements:

Note also that the same principle described here applies to @src
and @href.

8.1.1.4.2 Inheriting an anonymous subject

There will be occasions when the author wants to connect the
subject and object as shown above, but is not concerned to name
the resource that is common to the two statements (i.e., the
object of the first statement, which is the subject of the
second). For example, to indicate that Einstein was influenced
by Spinoza the following markup could well be used:

In RDF terms, the item that 'represents' Einstein is anonymous,
since it has no IRI to identify it. However, the item is given
an automatically generated bnode, and it is onto
this identifier that all child statements are attached:

From the point of view of the markup, this latter layout is to
be preferred, since it draws attention to the 'hanging rel'. But
from the point of view of an RDFa Processor, all of these
permutations need to be supported.

8.2 Completing incomplete triples

When a new subject is calculated, it is also used to complete any
incomplete triples that are pending. This situation arises when the
author wants to 'chain' a number of statements together. For example,
an author could have a statement that Albert Einstein was born in the
German Empire:

When this happens the @rel for 'birth place' is
regarded as a 'hanging rel' because it has not yet generated any
triples, but these 'incomplete triples' are completed by the @about
that appears on the next line. The first step is therefore to store
the two parts of the triple that the RDFa Processor does
have, but without an object:

Example 85

<http://dbpedia.org/resource/Albert_Einstein> dbp:birthPlace ? .

Then as processing continues, the RDFa Processor encounters the
subject of the statement about the long name for the German Empire,
and this is used in two ways. First it is used to complete the
'incomplete triple':

<http://dbpedia.org/resource/German_Empire>
dbp:conventionalLongName "the German Empire" .

Note that each occurrence of @about will complete any
incomplete triples. For example, to mark up the fact that Albert
Einstein had a residence both in the German Empire and Switzerland, an
author need only specify one @rel value that is then used
with multiple @about values:

These examples show how @about completes triples, but
there are other situations that can have the same effect. For example,
when @typeof creates a new bnode (as
described above), that will be used to complete any 'incomplete
triples'. To indicate that Spinoza influenced both
Einstein and Schopenhauer, the following markup could be used:

This example has two 'hanging rels', and so two situations when
'incomplete triples' will be created. Processing would proceed as
follows; first an incomplete triple is stored:

Example 95

<http://dbpedia.org/resource/Baruch_Spinoza> dbp-owl:influenced ? .

Next, the RDFa Processor processes the predicate values for foaf:name,
dbp:dateOfBirth and dbp-owl:residence, but
note that only the first needs to 'complete' the 'hanging rel'. So
processing foaf:name generates two triples:

8.3 Object resolution

Although objects have been discussed in the previous sections, as
part of the explanation of subject resolution, chaining, evaluation
contexts, and so on, this section will look at objects in more detail.

A literal object can be set by @content or by the inline
text of the element when @property is used to express a predicate.
Note that the use of @content prohibits the inclusion of
rich markup in your literal. If the inline content of an element
accurately represents the object, then documents should rely upon
that rather than duplicating that data using @content.

An IRI resource object can be set using one of @rel
or @rev to express a predicate, and then either
using one of @href, @resource or @src
to provide an object resource explicitly, or using the
chaining techniques described above to obtain an object from a nested
subject, or from a bnode. Alternatively, the @property
can also be used to define an IRI resource; this requires the presence of a
@resource, @href, or @src and the
absence of @rel, @rev, @datatype,
or @content.

An object literal will be generated when @property
is present and no resource attribute is present. @property provides the predicate, and the
following sections describe how the actual literal to be generated
is determined.
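The conditions above, under which @property yields a resource or a literal, can be sketched as a small Python function. This is an illustrative reading of the text and not the normative algorithm: the function name object_kind and the dictionary representation of an element are assumptions, and the @typeof case follows the earlier statement that @property combined with @typeof behaves like @rel.

```python
def object_kind(attrs):
    """Classify the object produced by @property on one element.

    Returns "resource", "literal", or None when @property is absent.
    """
    if "property" not in attrs:
        return None
    # Any of these attributes forces @property to produce a literal.
    blocked = any(a in attrs for a in ("rel", "rev", "datatype", "content"))
    if not blocked:
        # An explicit resource attribute supplies an IRI object.
        if any(a in attrs for a in ("resource", "href", "src")):
            return "resource"
        # @typeof without @about: the typed resource (a bnode) is the object.
        if "typeof" in attrs and "about" not in attrs:
            return "resource"
    return "literal"
```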

8.3.1.1.1 Language Tags

In RDFa the Host Language may provide a mechanism for setting
the language tag. In XHTML+RDFa [XHTML-RDFA], for example,
the XML language attribute @xml:lang
or the attribute @lang is used to add
this information, whether the plain literal is designated by @content,
or by the inline text of the element:

This requires that an IRI mapping for the prefix rdf
has been defined.

In the examples given here the sup element is
actually part of the meaning of the literal, but there will be
situations where the extra markup means nothing, and can therefore
be ignored. In this situation omitting the @datatype
attribute or specifying an empty @datatype value can
be used to create a plain literal:

8.3.2 IRI object resolution

Most of the rules governing the processing of objects that are
resources are to be found in the processing descriptions given
above, since they are important for establishing the subject. This
section aims to highlight general concepts, and anything that might
have been missed.

One or more IRI objects are needed when @rel or
@rev is present. Each
attribute will cause triples to be generated when used with @href,
@resource or @src, or with the subject
value of any nested statement if none of these attributes are
present.
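The direction of the generated triples can be illustrated with a minimal Python sketch. The function name link_triples and the tuple representation of triples are assumptions made for illustration only: each @rel value produces a triple from the subject to the object, and each @rev value produces the reverse.

```python
def link_triples(subject, obj, rels=(), revs=()):
    """Generate triples for @rel (forward) and @rev (reverse) values."""
    triples = [(subject, predicate, obj) for predicate in rels]
    triples += [(obj, predicate, subject) for predicate in revs]
    return triples


# Hypothetical IRIs and predicates, chosen only for illustration.
person = "http://example.org/mark"
photo = "http://example.org/photo1.jpg"
triples = link_triples(person, photo, rels=["foaf:img"], revs=["dc:creator"])
```

Here the single element yields one triple stating that the person has the image, and a second triple, in the opposite direction, stating that the image has the person as its creator.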

It's also possible to use both @rel and @rev
at the same time on an element. This is particularly useful when
two things stand in two different relationships with each other,
for example when a picture is taken by Mark, but that
picture also depicts him:

8.3.2.3 Incomplete triples

When a triple predicate has been expressed using @rel
or @rev, but no @href, @src,
or @resource exists on the same element, there is a
'hanging rel'. This causes the current subject and all possible
predicates (with an indicator of whether they are 'forwards', i.e.,
@rel values, or not, i.e., @rev values),
to be stored as 'incomplete triples' pending discovery of a
subject that could be used to 'complete' those triples.
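The storing and completing of incomplete triples can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names store_incomplete and complete, and the use of plain tuples with a direction flag, are not part of this specification.

```python
def store_incomplete(subject, rels=(), revs=()):
    """Record a 'hanging rel': predicates still waiting for a resource."""
    pending = [(subject, predicate, "forward") for predicate in rels]
    pending += [(subject, predicate, "reverse") for predicate in revs]
    return pending


def complete(pending, new_subject):
    """Complete every pending triple with a newly established subject."""
    triples = []
    for parent, predicate, direction in pending:
        if direction == "forward":
            triples.append((parent, predicate, new_subject))
        else:
            triples.append((new_subject, predicate, parent))
    return triples


einstein = "http://dbpedia.org/resource/Albert_Einstein"
pending = store_incomplete(einstein, rels=["dbp:residence"])
# Each nested subject completes the same pending triples.
residences = []
for place in ("http://dbpedia.org/resource/German_Empire",
              "http://dbpedia.org/resource/Switzerland"):
    residences += complete(pending, place)
```

Because the pending list is not consumed, one @rel value can be reused for several nested subjects, which matches the residence example discussed earlier.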

There is no way for an application to rely on the relative order of
the two triples when, for example, querying a database containing
these triples. For most of the applications and data sets this is not
a problem, but, in some cases, the order is important. A typical case
is publications: when a book or an article has several co-authors, the
order of the authors may be important.

RDF has a set of predefined predicates that have an agreed-upon
semantic of order. For example, the publication: "Semantic Annotation
and Retrieval, by Ben Adida, Mark Birbeck, and Ivan Herman" could be
described in RDF triples using these terms as follows:

It would of course be possible to reproduce the same structure in
RDFa, using the RDF predicates rdf:first, rdf:rest,
as well as the special resource rdf:nil. However, to
make this easier, RDFa provides the @inlist attribute. This
attribute signals that the object generated on that element should
be put on a list; the list is used with the common predicate
and subject. Here is how the previous structure could look in
RDFa:

Incomplete Triples can also be
used in conjunction with lists when all list elements are resources
and not literals. For example, the previous example, this time with all
three authors referring to their FOAF profile, could have been written
as:

Note that it is also possible to express an empty list,
without @inlist, using:

Example 129

<span rel="prop" resource="rdf:nil"/>
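The list structure that @inlist produces can be sketched in Python. This is only an illustration of how rdf:first, rdf:rest and rdf:nil chain together: the function name list_triples, the bnode labels, and the predicate used in the usage line are assumptions, and the labels are unique only within a single call.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def list_triples(subject, predicate, items):
    """Emit the triples that link `subject` to an ordered RDF list."""
    if not items:
        # An empty list is simply rdf:nil.
        return [(subject, predicate, RDF + "nil")]
    # One bnode per list cell (labels are local to this call).
    nodes = [f"_:l{i}" for i in range(len(items))]
    triples = [(subject, predicate, nodes[0])]
    for i, item in enumerate(items):
        triples.append((nodes[i], RDF + "first", item))
        rest = nodes[i + 1] if i + 1 < len(items) else RDF + "nil"
        triples.append((nodes[i], RDF + "rest", rest))
    return triples


authors = list_triples("http://example.org/paper", "dc:creator",
                       ["Ben Adida", "Mark Birbeck", "Ivan Herman"])
```

The order of the authors is preserved by the chain of rdf:rest links, not by the order in which the triples happen to be stored.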

9. RDFa Initial Contexts

RDFa permits Host Languages to define an initial context.
Such a context is a collection of terms, prefix mappings, and/or a default
vocabulary declaration. An initial context is either intrinsically
known to the parser, or it is loaded as external documents and
processed. These documents MUST be defined in an approved RDFa Host
Language (currently XML+RDFa, XHTML+RDFa [XHTML-RDFA], and HTML+RDFa [HTML-RDFA]).
They MAY also be defined in other formats (e.g., RDF/XML
[RDF-SYNTAX-GRAMMAR], or Turtle [TURTLE]). When an initial
context document is processed, it is evaluated as follows:

Parse the content (according to the processing rules for that
document type) and extract the triples into a collection associated
with that IRI. Note: These triples MUST NOT be co-mingled with the
triples being extracted from any other IRI.

For every subject with a pair of predicates that have the values rdfa:prefix
and rdfa:uri, create a key-value mapping from the rdfa:prefix
object literal (the key) to the rdfa:uri object literal
(the value). Add this mapping to the list of IRI mappings
of the initial evaluation context, after
transforming the 'prefix' component to lower-case.

For every subject with a pair of predicates that have the values rdfa:term
and rdfa:uri, create a key-value mapping from the rdfa:term
object literal (the key) to the rdfa:uri object literal
(the value). Add this mapping to the term mappings of
the initial evaluation context.

When an RDFa Initial Context is defined using an RDF serialization, it
MUST use the vocabulary terms above to declare the components of the
context.

Note

Caching of the relevant triples retrieved via this
mechanism is RECOMMENDED. Embedding definitions for well known, stable
RDFa Initial Contexts in the implementation is RECOMMENDED.

Note

The object literal for the rdfa:uri
predicate MUST be an absolute IRI.

The object literal for the rdfa:term
predicate MUST match the production for term.

The object literal for the rdfa:prefix predicate MUST match
the production for prefix.

The object literal
for the rdfa:vocabulary predicate MUST be an
absolute IRI.

If one of the objects is not a literal or does not match its associated
production, if there is more than one rdfa:vocabulary
predicate, or if there are additional rdfa:uri or rdfa:term
predicates sharing the same subject, an RDFa Processor MUST NOT create
the associated mapping.
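The extraction of prefix and term mappings from an initial context document can be sketched in Python. This is a minimal sketch under several assumptions: triples are plain tuples whose objects are already known to be literals, the function name context_mappings is hypothetical, and the checks against the prefix and term productions and the absolute-IRI requirement are omitted. Skipping a subject with more than one rdfa:prefix value is also an assumption.

```python
from collections import defaultdict

RDFA = "http://www.w3.org/ns/rdfa#"


def context_mappings(triples):
    """Build (prefix mappings, term mappings) from context triples."""
    by_subject = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        by_subject[s][p].append(o)

    prefixes, terms = {}, {}
    for props in by_subject.values():
        uris = props.get(RDFA + "uri", [])
        prefix_values = props.get(RDFA + "prefix", [])
        term_values = props.get(RDFA + "term", [])
        # Additional rdfa:uri or rdfa:term values invalidate the mapping.
        if len(uris) != 1 or len(term_values) > 1:
            continue
        if len(prefix_values) == 1:
            # Prefixes are stored in lower-case.
            prefixes[prefix_values[0].lower()] = uris[0]
        if len(term_values) == 1:
            terms[term_values[0]] = uris[0]
    return prefixes, terms
```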

10. RDFa Vocabulary Expansion

Since RDFa is based on RDF, the semantics of RDF vocabularies can be
used to gain more knowledge about data. Vocabularies, properties and
classes are identified by IRIs, which enables them to be discoverable.
RDF data published at the location of these IRIs can be retrieved, and
descriptions of the properties and classes using specified semantics can
be applied.

RDFa Vocabulary Expansion is an optional processing step which may be
added once the normal processing steps described in Processing
Model are complete. Vocabulary expansion relies on a very small
sub-set of OWL entailment [OWL2-OVERVIEW] to add triples to the output
graph based on rules and property/class relationships described
in referenced vocabularies. Vocabulary expansion MAY be performed as
part of a larger RDF toolset including, for example, an OWL 2 RL
reasoner. Alternatively, using vocabulary data added to the output
graph in processing step 2 of Sequence,
expansion MAY also be done using a separate and dedicated (e.g., rule
based) reasoner after the output graph has been generated,
or as the last processing step by an RDFa processor.

It can be very useful to make generalized data available for
subsequent usage of RDFa-embedded data by expanding inferred statements
entailed by these semantics. This provides for existing vocabularies
that extend well-known vocabularies to have those properties added to
the output graph automatically. For example, the namespace document of
the Creative Commons vocabulary, i.e., http://creativecommons.org/ns,
defines cc:license to be a sub-property of dc:license.
By using the @vocab attribute, one can describe licensing
information as follows:

Other vocabularies, specifically intended to provide relations to
multiple vocabularies, could also be defined by publishers, allowing use
of terms in a single namespace which result in properties and/or classes
from other primary vocabularies being imported. This benefits publishers
as data is now more widely searchable and encourages the practice of
referencing well-known vocabularies.

A vocabulary graph is created as follows:
Each object IRI in the output graph whose subject is the current
document (base) IRI and whose predicate is
rdfa:usesVocabulary is dereferenced.
If the dereferencing yields the serialization of an RDF
graph, that serialization is parsed and the resulting graph is merged
with the vocabulary graph. (An RDFa processor capable of vocabulary
expansion MUST accept an RDF graph serialized in RDFa, and SHOULD
accept other standard serialization formats of RDF such as RDF/XML
[RDF-SYNTAX-GRAMMAR] and Turtle [TURTLE].)
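The construction of the vocabulary graph can be sketched in Python. The sketch makes several assumptions: graphs are sets of tuples, the retrieval and parsing of a vocabulary is delegated to a caller-supplied function (here called fetch_graph, a hypothetical name) that returns a collection of triples or None on failure, and the renaming of blank nodes that a real graph merge requires is not shown.

```python
RDFA_USES_VOCABULARY = "http://www.w3.org/ns/rdfa#usesVocabulary"


def build_vocabulary_graph(output_graph, base_iri, fetch_graph):
    """Collect the triples of every vocabulary the document uses."""
    vocabulary_graph = set()
    for s, p, o in output_graph:
        if s == base_iri and p == RDFA_USES_VOCABULARY:
            fetched = fetch_graph(o)  # dereference and parse (assumed helper)
            if fetched is not None:
                vocabulary_graph |= set(fetched)
    return vocabulary_graph
```

Passing the retrieval step in as a parameter keeps the sketch independent of any particular HTTP or parser library, and makes it easy to plug in a cache.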

Note

Note that if, in the second step, a particular
vocabulary is serialized in RDFa, that particular graph is not
expected to undergo any vocabulary expansion on its own.

The goal of the second step is to avoid adding
the "axioms", e.g., the sub-property definitions to the output graph.
Applications usually do not require any of this additional information.

10.1.1 RDFa Vocabulary Entailment

For the purpose of vocabulary processing, RDFa uses a very
restricted subset of the OWL vocabulary and is based on the RDF-Based
Semantics of OWL [OWL2-RDF-BASED-SEMANTICS]. The RDFa
Vocabulary Entailment uses the following terms:

rdf:type

rdfs:subClassOf

rdfs:subPropertyOf

owl:equivalentClass

owl:equivalentProperty

Note

RDFa Vocabulary Entailment considers only the entailment on individuals
(i.e., not on the relationships that can be deduced on the
properties or the classes themselves).

Note

While the formal definition of the RDFa Entailment
refers to the general OWL 2 Semantics, practical implementations may
rely on a subset of the OWL 2 RL Profile’s entailment expressed in
rules (section 4.3 of [OWL2-PROFILES]). The relevant rules, using the
rule identifiers of that section, are: prp-spo1, prp-eqp1,
prp-eqp2, cax-sco, cax-eqc1, and cax-eqc2.

The entailment described in this section is the minimum
useful level for RDFa. Processors may, of course, choose to follow
more powerful entailment regimes, e.g., include full RDFS [RDF11-MT]
or OWL [OWL2-OVERVIEW] entailments. Using those entailments
applications may perform datatype validation by checking rdfs:range
of a property, or use the advanced facilities offered by, e.g., OWL’s
property chains to interlink vocabularies further.
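A rule-based implementation of the minimal entailment can be sketched in Python. The sketch assumes that graphs are sets of tuples of IRI strings and that the vocabulary graph is kept separate from the output graph so that the axioms themselves are not added to the result; the function name expand is hypothetical. It applies the six rules named above (prp-spo1, prp-eqp1, prp-eqp2, cax-sco, cax-eqc1, cax-eqc2) until no new triple is produced.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUB_CLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
SUB_PROPERTY = "http://www.w3.org/2000/01/rdf-schema#subPropertyOf"
EQ_CLASS = "http://www.w3.org/2002/07/owl#equivalentClass"
EQ_PROPERTY = "http://www.w3.org/2002/07/owl#equivalentProperty"


def expand(output_graph, vocabulary_graph):
    """Return the output graph plus the triples entailed on individuals."""
    prop_up = set()   # (p1, p2): "x p1 y" entails "x p2 y"
    class_up = set()  # (c1, c2): "x a c1" entails "x a c2"
    for s, p, o in vocabulary_graph:
        if p == SUB_PROPERTY:
            prop_up.add((s, o))
        elif p == EQ_PROPERTY:
            prop_up.update({(s, o), (o, s)})
        elif p == SUB_CLASS:
            class_up.add((s, o))
        elif p == EQ_CLASS:
            class_up.update({(s, o), (o, s)})

    result = set(output_graph)
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for s, p, o in list(result):
            new = {(s, p2, o) for p1, p2 in prop_up if p1 == p}
            if p == RDF_TYPE:
                new |= {(s, RDF_TYPE, c2) for c1, c2 in class_up if c1 == o}
            new -= result
            if new:
                result |= new
                changed = True
    return result
```

Applied to the Creative Commons case mentioned earlier, a document triple using cc:license, together with a vocabulary statement that cc:license is a sub-property of dc:license, yields an additional triple with dc:license as predicate.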

10.2 Vocabulary Expansion Control of RDFa Processors

Conforming RDFa processors are not required to provide vocabulary
expansion.

If an RDFa processor provides vocabulary expansion, it MUST NOT be
performed by default. Instead, the processor MUST provide an option, vocab_expansion,
which, when used, instructs the RDFa processor to perform a vocabulary
expansion before returning the output graph.

Note

Although vocabulary expansion is described in terms of
a vocabulary graph and OWL 2 entailment rules, processors
are free to use any process which obtains equivalent results.

10.2.1 Notes to RDFa Vocabulary Implementations and Publishing

This section is non-normative.

For RDFa Processors caching the relevant graphs retrieved via this
mechanism is RECOMMENDED. Caching is usually based on HTTP response
headers like expiration time, cache control, etc.

For publishers of vocabularies, the IRI for the vocabularies SHOULD
be dereferenceable, and should return an RDF graph with the
vocabulary description. This vocabulary description SHOULD be
available encoded in RDFa, and MAY also be available in other RDF
serialization syntaxes (using content negotiation to choose among
the different formats). If possible, vocabulary descriptions SHOULD
include subproperty and subclass statements linking the vocabulary
terms to other, well-known vocabularies. Finally, HTTP responses
SHOULD include fields usable for cache control, e.g., expiration
date.

A. CURIE Datatypes

In order to facilitate the use of CURIEs in markup languages, this
specification defines some additional datatypes in the XHTML datatype
space (http://www.w3.org/1999/xhtml/datatypes/). Markup
languages that want to import these definitions can find them in the
"datatypes" file for their schema grammar:

B. The RDFa Vocabulary

The RDFa Vocabulary has three roles: it contains the predicates to
define the terms and prefixes in initial context
documents, it contains the classes and predicates for the messages that
a processor graph may contain and, finally, it contains
the predicate necessary for vocabulary processing. The IRI of the
vocabulary is http://www.w3.org/ns/rdfa#; the usual prefix
used in this document is rdfa.

B.1 Term and Prefix Assignments

The RDFa Vocabulary includes the following triples (shown here in
Turtle [TURTLE] format):

Example 135

@prefix dc: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdfa: <http://www.w3.org/ns/rdfa#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
<http://www.w3.org/ns/rdfa#> a owl:Ontology .
rdfa:PrefixOrTermMapping a rdfs:Class, owl:Class ;
  dc:description "The top level class for prefix or term mappings." .
rdfa:PrefixMapping dc:description "The class for prefix mappings." ;
  rdfs:subClassOf rdfa:PrefixOrTermMapping .
rdfa:TermMapping dc:description "The class for term mappings." ;
  rdfs:subClassOf rdfa:PrefixOrTermMapping .
rdfa:prefix a rdf:Property, owl:DatatypeProperty ;
  rdfs:domain rdfa:PrefixMapping ;
  dc:description "Defines a prefix mapping for an IRI; the value is supposed to be a NMTOKEN." .
rdfa:term a rdf:Property, owl:DatatypeProperty ;
  rdfs:domain rdfa:TermMapping ;
  dc:description "Defines a term mapping for an IRI; the value is supposed to be a NMTOKEN." .
rdfa:uri a rdf:Property, owl:DatatypeProperty ;
  rdfs:domain rdfa:PrefixOrTermMapping ;
  dc:description """Defines the IRI for either a prefix or a term mapping;
the value is supposed to be an absolute IRI.""" .
rdfa:vocabulary a rdf:Property, owl:DatatypeProperty ;
  dc:description """Defines an IRI to be used as a default vocabulary;
the value can be any string; for documentation purposes it is advised to use
the string ‘true’ or ‘True’.""" .

These predicates can be used to define the initial context
for a given Host Language.

These predicates are used to 'pair' IRI strings and their usage in
the form of a prefix and/or a term as part of, for example, a blank
node. An example can be as follows:

Example 136

[] rdfa:uri "http://xmlns.com/foaf/0.1/" ;
  rdfa:prefix "foaf" .

which defines a prefix for the FOAF IRI.

B.2 Processor Graph Reporting

The Vocabulary includes the following term definitions (shown here in
Turtle [TURTLE] format):

Example 137

@prefix dc: <http://purl.org/dc/terms/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdfa: <http://www.w3.org/ns/rdfa#> .
rdfa:PGClass a rdfs:Class, owl:Class ;
  dc:description "The top level class of the hierarchy." .
rdfa:Error dc:description "The class for all error conditions." ;
  rdfs:subClassOf rdfa:PGClass .
rdfa:Warning dc:description "The class for all warnings." ;
  rdfs:subClassOf rdfa:PGClass .
rdfa:Info dc:description "The class for all informations." ;
  rdfs:subClassOf rdfa:PGClass .
rdfa:DocumentError dc:description "An error condition to be used when the document fails to be fully processed as a result of non-conformant host language markup." ;
  rdfs:subClassOf rdfa:Error .
rdfa:VocabReferenceError dc:description "A warning to be used when the value of a @vocab attribute cannot be dereferenced, hence the vocabulary expansion cannot be completed." ;
  rdfs:subClassOf rdfa:Warning .
rdfa:UnresolvedTerm dc:description "A warning to be used when a Term fails to be resolved." ;
  rdfs:subClassOf rdfa:Warning .
rdfa:UnresolvedCURIE dc:description "A warning to be used when a CURIE prefix fails to be resolved." ;
  rdfs:subClassOf rdfa:Warning .
rdfa:context a owl:ObjectProperty, rdf:Property ;
  dc:description "Provides extra context for the error, e.g., http response, an XPointer/XPath information, or simply the IRI that created the error." ;
  rdfs:domain rdfa:PGClass .

B.3 Term for vocabulary expansion

The Vocabulary includes the following term definitions (shown here in
Turtle [TURTLE] format):

C. Changes

C.1 Major differences since the Last Published Recommendation

References to the other RDFa 1.1 documents, as well as to RDF 1.1 documents, have been updated

A minor clarification has been added to section 4.1 on how processors can return processor and output graphs

C.2 Major differences with RDFa Syntax 1.0

This specification introduces a number of new features, and extends
the behavior of some features from the previous version. The
following summary may be helpful to RDFa Processor developers, but
is not meant to be comprehensive.

Specific rules about XHTML have been moved into a companion
specification: [XHTML-RDFA].

Prefix mappings can now be declared using @prefix
in addition to @xmlns. The usage of @xmlns
has been deprecated.

Prefix names are now required to be converted to lower-case when
the mapping is defined. Prefixes are checked in a case-insensitive
manner during CURIE expansion.

You can now use an Absolute IRI everywhere you could previously
only use a CURIE (e.g., in the value of @datatype).

There is now a concept of a term. This concept has
replaced the concept of a 'reserved word'. It is possible now to
use a 'term' in most places where you could previously only use a
CURIE.

You can define a default prefix mapping (via @vocab)
that will be used on undefined terms.

When a triple would include an object literal, and there is no
explicit datatype attribute, the object literal will now be a
'plain literal'. In version 1.0 it would have been an
'XMLLiteral'.

The @inlist attribute can be used to instruct the
processor to generate RDF lists with the resources rather than
simple triples.

The effect of @src is now identical to that of @href,
rather than to that of @about as in version 1.0.

While this specification strives to be as backward compatible as
possible with [RDFA-SYNTAX], the changes above mean that there are
some circumstances where it is possible for different RDF triples to
be output for the same document when processed by an RDFa 1.0
processor vs. an RDFa 1.1 processor. In order to minimize these
differences, a document author can do the following: