The Resource Description Framework (RDF) is a framework for
representing information in the Web.

RDF Concepts and Abstract Syntax defines an abstract syntax
on which RDF is based, and which serves to link its concrete
syntax to its formal semantics. It also includes discussion of
design goals, key concepts, datatyping, character normalization
and handling of URI references.

This section describes the status of this document at the time of its
publication. Other documents may supersede this document. A list of current W3C
publications and the latest revision of this technical report can be found in
the W3C technical reports index at
http://www.w3.org/TR/.

Publication as a Proposed Recommendation
does not imply endorsement by the W3C Membership. This is a draft
document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite
this document as other than work in progress.

W3C Advisory Committee Representatives are now invited to submit their
formal review via Web form, as described in the Call
for Review. Additional comments may be sent to a Team-only list, w3t-semweb-review@w3.org.
The public is invited to send comments to
www-rdf-comments@w3.org
(archive)
and to participate in general discussion of related technology
on www-rdf-interest@w3.org (archive).
The review period extends until 19 January 2004.

This document defines an abstract syntax on which RDF is based,
and which serves to link its concrete syntax to its formal
semantics.
This abstract syntax is quite distinct from XML's tree-based infoset [XML-INFOSET].
This document also discusses design goals, key concepts, datatyping,
character normalization and handling of URI references.

Within this document, normative sections are explicitly labelled as such.
Explicit notes are informative.

The framework is designed so that vocabularies can be layered.
The RDF and RDF vocabulary definition (RDF schema)
languages
[RDF-VOCABULARY] are the first
such vocabularies.
Others (cf. OWL [OWL] and
the applications mentioned in the primer
[RDF-PRIMER]) are in development.

To do for machine processable information (application data)
what the World Wide Web has done for hypertext: to allow data to
be processed outside the particular environment in which it was
created, in a fashion that can work at Internet scale.

Interworking among applications: combining data from several
applications to arrive at new information.

Automated processing of Web information by software agents:
the Web is moving from having just human-readable information to
being a world-wide network of cooperating processes. RDF provides
a world-wide lingua franca for these processes.

RDF is designed to represent information in a minimally
constraining, flexible way. It can be used in isolated
applications, where individually designed formats
might be more direct and easily understood, but RDF's generality offers greater value from
sharing. The value of information thus increases as it becomes
accessible to more applications across the entire Internet.

RDF has a simple data model that is easy for applications to
process and manipulate. The data model is independent of any
specific serialization syntax.

Note: the term "model" used here in "data model" has a
completely different sense to its use in the term "model theory".
See [RDF-SEMANTICS]
for more information about "model
theory" as used in the literature of mathematics and logic.

RDF has a formal semantics which provides a dependable basis for
reasoning about the meaning of an RDF expression. In particular, it
supports rigorously defined notions of entailment which provide a
basis for defining reliable rules of inference in RDF data.

To facilitate operation at Internet scale, RDF is an
open-world framework that allows anyone to make statements
about any resource.

In general, it is not assumed that complete information
about any resource is available. RDF does not prevent anyone
from making assertions that are nonsensical or inconsistent
with other statements, or the world as people see it. Designers
of applications that use RDF should be aware of this and may
design their applications to tolerate incomplete or
inconsistent sources of information.

The underlying structure of any expression in RDF is a
collection of triples, each consisting of a subject, a
predicate and an object. A set of such triples is called an RDF
graph (defined more formally in
section 6). This can be
illustrated by a node and directed-arc diagram, in which each
triple is represented as a node-arc-node link (hence the term
"graph").

Each triple represents a statement of a relationship between
the things denoted by the nodes that it links. Each triple has
three parts: a subject, a predicate (also called a property)
and an object.

The assertion of an RDF triple says that some relationship,
indicated by the predicate, holds between the things denoted by
subject and object of the triple. The assertion of an RDF graph
amounts to asserting all the triples in it, so the meaning of
an RDF graph is the conjunction (logical AND) of the statements
corresponding to all the triples it contains. A formal account
of the meaning of RDF graphs is given in [RDF-SEMANTICS].
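As an informal illustration of this structure, the following Python sketch models a graph as a set of triples. It assumes, purely for brevity, that nodes and predicates are plain strings; the eg: names and the helper functions are hypothetical and are not part of RDF.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# An RDF graph is a set of triples, so repeating a triple adds nothing.
graph = {
    Triple("eg:alice", "eg:knows", "eg:bob"),
    Triple("eg:bob", "eg:knows", "eg:carol"),
    Triple("eg:alice", "eg:knows", "eg:bob"),  # duplicate, collapsed by the set
}


def asserts(graph, triple):
    """A graph asserts a triple exactly when the triple is a member of it."""
    return triple in graph


def asserts_all(graph, triples):
    """Asserting a graph amounts to asserting the conjunction of its triples."""
    return all(asserts(graph, t) for t in triples)
```

Because the graph is a set, the order in which triples are written carries no meaning, and asserts_all(graph, graph) holds for any graph.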

A node may be a URI with optional fragment identifier (URI reference, or URIref), a literal,
or blank (having no separate form of identification).
Properties are URI references. (See [URI], section 4, for a description of URI
reference forms, noting that relative URIs are not used in an
RDF graph. See also section
6.4.)

A URI reference or literal used as a node identifies what
that node represents. A URI reference used as a predicate
identifies a relationship between the things represented by the nodes it connects. A
predicate URI reference may also be a node in the graph.

A blank node is a node that is
not a URI reference or a literal. In the RDF abstract syntax, a
blank node is just a unique node that can be used in one or
more RDF statements, but has no intrinsic name.

A convention used by some linear representations of an RDF
graph to allow several statements to reference the same
unidentified resource is to use a blank node
identifier, which is a local identifier that can be
distinguished from all URIs and literals. When graphs are
merged, their blank nodes must be kept distinct if meaning is
to be preserved; this may call for re-allocation of blank node
identifiers. Note that such blank node identifiers are not part
of the RDF abstract syntax, and the representation of triples
containing blank nodes is entirely dependent on the particular
concrete syntax used.
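The following Python sketch shows one way an implementation might keep blank nodes distinct when merging. It assumes a linear representation in which blank node identifiers are strings beginning with "_:" (a convention borrowed from N-Triples-style notations); the function names and eg: terms are hypothetical.

```python
from itertools import count


class BlankNode:
    """A node with no intrinsic name; two blank nodes are equal only if identical."""
    __slots__ = ()


def parse(labelled_triples):
    """Read triples from ONE document, mapping each local identifier to a fresh node."""
    local = {}  # blank node identifier -> BlankNode, scoped to this document only

    def node(term):
        if term.startswith("_:"):
            if term not in local:
                local[term] = BlankNode()
            return local[term]
        return term

    return {(node(s), p, node(o)) for s, p, o in labelled_triples}


def merge(*graphs):
    """Union of the triples; blank nodes from different graphs stay distinct."""
    return set().union(*graphs)


def serialize(graph):
    """Re-allocate blank node identifiers for a linear representation."""
    labels, counter = {}, count()

    def term(n):
        if isinstance(n, BlankNode):
            if n not in labels:
                labels[n] = f"_:b{next(counter)}"
            return labels[n]
        return n

    return sorted((term(s), p, term(o)) for s, p, o in graph)
```

Two documents that both use the identifier _:x therefore contribute two different blank nodes to the merged graph, and serialize assigns them different identifiers.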

Datatypes are used by RDF in the representation of values such
as integers, floating point numbers and dates.

A datatype consists of a lexical space, a value space and a lexical-to-value
mapping, see section 5.

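For example, the lexical-to-value mapping for the XML Schema datatype
xsd:boolean, where each member of the value space
(represented here as 'T' and 'F') has two lexical representations,
is as follows:

Value Space               {T, F}
Lexical Space             {"0", "1", "true", "false"}
Lexical-to-Value Mapping  {<"true", T>, <"1", T>, <"0", F>, <"false", F>}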

There is no built-in concept of numbers or dates or other common
values. Rather, RDF defers to datatypes that are defined
separately, and identified with URI references.
The predefined XML Schema
datatypes [XML-SCHEMA2] are expected
to be widely used for this purpose.

RDF provides no mechanism for defining new datatypes. XML Schema
Datatypes [XML-SCHEMA2] provides an
extensibility framework suitable for defining new datatypes for use
in RDF.

Literals are used to identify values such as numbers and dates
by means of a lexical representation. Anything represented by a
literal could also be represented by a URI, but it is often more
convenient or intuitive to use literals.

A literal may be the object of an RDF statement, but not the
subject or the predicate.

Literals may be plain or typed:

A plain literal is a string combined
with an optional language tag. This may be used for
plain text in a natural language. As recommended in the RDF
formal semantics [RDF-SEMANTICS], these plain literals are
self-denoting.

A typed literal is a string combined with a
datatype URI. It denotes the
member of the identified datatype's value space obtained by
applying the lexical-to-value mapping to the literal string.

Continuing the example from section
3.3, the typed literals that can be defined using the XML
Schema datatype xsd:boolean are:

Typed Literal            Lexical-to-Value Mapping   Value
<xsd:boolean, "true">    <"true", T>                T
<xsd:boolean, "1">       <"1", T>                   T
<xsd:boolean, "false">   <"false", F>               F
<xsd:boolean, "0">       <"0", F>                   F
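The xsd:boolean example can be written down directly as a small lookup table. The Python sketch below is illustrative only: it stands in Python's True and False for the values shown as T and F, and the names XSD_BOOLEAN_L2V and boolean_value are hypothetical.

```python
# Lexical-to-value mapping of xsd:boolean as <lexical form, value> pairs.
XSD_BOOLEAN_L2V = {
    "true": True,
    "1": True,
    "false": False,
    "0": False,
}

# The lexical space is the set of strings the mapping is defined for;
# the value space is the set of values those strings map to.
lexical_space = set(XSD_BOOLEAN_L2V)
value_space = set(XSD_BOOLEAN_L2V.values())


def boolean_value(lexical_form: str) -> bool:
    """Apply the mapping; raises KeyError for a string outside the lexical space."""
    return XSD_BOOLEAN_L2V[lexical_form]
```

Each lexical form maps to exactly one value, while each value here has two lexical representations.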

For text that may contain
markup, use typed literals
with type rdf:XMLLiteral.
If language annotation is required,
it must be explicitly included as markup, usually by means of an
xml:lang attribute.
[XHTML] may be included within RDF
in this way. Sometimes, in this latter case,
an additional span or div
element is needed to carry an
xml:lang or lang attribute.

Some simple facts indicate a relationship between
two things.
Such a fact may be represented as an RDF triple in which the predicate
names the relationship, and the subject and object denote the two things.
A familiar representation of such a fact might be
as a row in a table in a relational database. The table has
two columns, corresponding to the subject and the object of the
RDF triple.
The name of the table corresponds to the predicate
of the RDF triple. A further familiar representation may be as a
two place predicate
in first order logic.

Relational databases permit a table to have an arbitrary number of columns,
a row of which expresses information corresponding to a predicate in first
order logic with an arbitrary number of places. Such a row, or predicate,
has to be decomposed for representation as RDF triples. A simple form of
decomposition introduces a new blank node, corresponding to the row, and a
new triple is introduced for each cell in the row. The subject of each
triple is the new blank node, the predicate corresponds to the column name,
and the object corresponds to the value in the cell. The new blank node may
also have an rdf:type property whose value corresponds
to the table name.

This information might correspond to a row in a table "STAFFADDRESSES",
with a primary key STAFFID, and additional columns STREET, STATE, CITY
and POSTALCODE.

Thus, a more complex fact is expressed in RDF using a
conjunction (logical-AND) of simple binary relationships. RDF does not
provide means to express negation (NOT) or disjunction (OR).
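The row decomposition described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the eg: prefix used to turn table and column names into predicate and class names, the function name row_to_triples, and the cell values are all hypothetical.

```python
class BlankNode:
    """Stands for the row itself; it has no name of its own."""
    __slots__ = ()


def row_to_triples(table_name, row):
    """Decompose one table row into binary relationships on a new blank node."""
    node = BlankNode()
    # Optional rdf:type triple whose value corresponds to the table name.
    triples = {(node, "rdf:type", f"eg:{table_name}")}
    # One triple per cell: predicate from the column name, object from the cell.
    for column, value in row.items():
        triples.add((node, f"eg:{column}", value))
    return node, triples


# Illustrative cell values for a STAFFADDRESSES row.
row = {
    "STAFFID": "85740",
    "STREET": "1501 Grant Avenue",
    "STATE": "Massachusetts",
    "CITY": "Bedford",
    "POSTALCODE": "01730",
}
node, triples = row_to_triples("STAFFADDRESSES", row)
```

The five-column row becomes six triples (five cells plus the type), all sharing the same blank node as subject; the row as a whole is the conjunction of those triples.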

Through its use of extensible URI-based vocabularies, RDF
provides for expression of facts about arbitrary subjects; i.e.
assertions of named properties about specific named things. A URI
can be constructed for any thing that can be named, so RDF facts
can be about any such things.

The ideas on meaning and inference in RDF are underpinned by the
formal concept of entailment, as
discussed in the RDF
semantics document [RDF-SEMANTICS].
In brief, an RDF expression A is said to
entail another RDF expression B if every possible
arrangement of things in the world that makes A true also makes B
true. On this basis, if the truth of A is presumed or demonstrated
then the truth of B can be inferred.

RDF uses URI references to identify resources and properties. Certain
URI references are given specific meaning by RDF. Specifically, URI
references with the following leading substring are defined by the RDF
specifications:

Used with the RDF/XML serialization, this URI prefix
string corresponds to XML namespace names [XML-NS] associated with the RDF
vocabulary terms.

Note: this namespace name is the same
as that used in the earlier RDF recommendation [RDF-MS].

Vocabulary terms in the rdf:
namespace are listed in
section 5.1 of the RDF syntax specification [RDF-SYNTAX]. Some of these terms are
defined by the RDF specifications to denote specific concepts.
Others have syntactic purpose (e.g. rdf:ID is part of
the RDF/XML syntax).

Each member of the lexical space is paired with (maps to) exactly one member
of the value space.

Each member of the value space may be paired with any number (including
zero) of members of the lexical space (lexical representations for that
value).

A datatype is identified by one or more URI references.

RDF may be used with any datatype definition that conforms to this
abstraction, even if not defined in terms of XML Schema.

Certain XML Schema built-in datatypes are not suitable for use
within RDF. For example, the
QName
datatype requires a namespace declaration to be in scope during
the mapping, and is not recommended for use in RDF.
[RDF-SEMANTICS] contains
a
more detailed discussion
of specific XML Schema built-in datatypes.

Note: When the datatype is defined using XML Schema:

All values correspond to some lexical form, either through
the lexical-to-value mapping of the datatype itself or, for a union
datatype, through the lexical mapping associated with one of the member
datatypes.

XML Schema facets remain part of the datatype and are used by the XML
Schema mechanisms that control the lexical space and the value space;
however, RDF does not define a standard mechanism to access these facets.

This section defines the RDF abstract syntax. The RDF abstract
syntax is a set of triples, called the RDF graph.

This section also defines equivalence between RDF graphs. A
definition of equivalence is needed to support the RDF Test Cases [RDF-TESTS] specification.

Implementation Note:
This abstract syntax is the
syntax over which the formal semantics are defined.
Implementations are free to represent RDF graphs in
any other equivalent form. As an example:
in an RDF graph,
literals with datatype rdf:XMLLiteral can be represented
in a non-canonical
format, and canonicalization performed during the comparison between two
such literals. In this example the comparisons may be
performed either between syntactic structures or
between their denotations in the domain of discourse.
Implementations that do not require any such comparisons can
hence be optimized.

and
would produce a valid URI character sequence (per RFC 2396 [URI], section 2.1)
representing an absolute URI with optional fragment identifier
when subjected to the encoding described below.

The encoding consists of:

encoding the Unicode string as UTF-8
[RFC-2279], giving a sequence of octet values.

%-escaping octets that do not correspond to permitted US-ASCII characters.

The disallowed octets that must be %-escaped include all those that do not
correspond to US-ASCII characters, and the excluded characters listed in
Section 2.4 of [URI], except for the number sign (#), percent sign (%),
and the square bracket characters re-allowed in [RFC-2732].

Disallowed octets must be escaped with the URI escaping mechanism (that is, converted to %HH,
where HH is the 2-digit hexadecimal numeral corresponding to the octet value).
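The two encoding steps can be sketched in Python as follows. This is an informal illustration, not a normative algorithm: the set of excluded US-ASCII characters reflects one reading of RFC 2396 section 2.4.3 (control characters, space, delimiters and "unwise" characters) with '#', '%', '[' and ']' left unescaped, and the function name is hypothetical. The sketch only performs the encoding; it does not check that the result is a valid absolute URI.

```python
# Excluded US-ASCII octets: controls and space (0x00-0x20), DEL (0x7F),
# and the delimiter / "unwise" characters, minus '#', '%', '[' and ']'.
EXCLUDED = set(range(0x00, 0x21)) | {0x7F} | {ord(c) for c in '<>"{}|\\^`'}


def encode_rdf_uri_reference(ref: str) -> str:
    """Encode a Unicode string as UTF-8, then %-escape the disallowed octets."""
    out = []
    for octet in ref.encode("utf-8"):          # step 1: Unicode string -> octets
        if octet >= 0x80 or octet in EXCLUDED:
            out.append(f"%{octet:02X}")        # step 2: %HH escape
        else:
            out.append(chr(octet))
    return "".join(out)


encoded = encode_rdf_uri_reference("http://example.org/caf\u00e9#a b")
# encoded == "http://example.org/caf%C3%A9#a%20b"
```

Note that equality of RDF URI references is still decided on the original Unicode strings, not on the encoded form.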

Two RDF URI references are equal if and only if they compare as
equal, character by character, as Unicode strings.

Note: RDF URI references are compatible with the
anyURI datatype as defined by XML schema datatypes [XML-SCHEMA2], constrained to be an
absolute rather than a relative URI reference.

Note: this section anticipates an RFC on Internationalized Resource
Identifiers. Implementations may issue warnings concerning the use
of RDF URI References that do not conform with [IRI draft] or its
successors.

Note: The restriction to absolute URI references is
found in this abstract syntax. When there is a well-defined base
URI, concrete syntaxes, such as RDF/XML, may permit relative URIs
as a shorthand for such absolute URI references.

Note: Because of the risk of confusion between
RDF URI references that would
be equivalent if dereferenced, the use of %-escaped characters in RDF URI
references is strongly discouraged. See also the
URI equivalence issue of
the Technical Architecture Group [TAG].

Note: Literals in which the lexical form begins with a
composing character (as defined by [CHARMOD]) are allowed; however, they may cause
interoperability problems, particularly with XML version 1.1 [XML 1.1].

Note: When using the language tag, care must be
taken not to confuse language with locale. The language
tag relates only to human language text. Presentational
issues should
be addressed in end-user applications.

Note: The case normalization of
language tags is part of
the description of the abstract syntax, and consequently the abstract
behaviour of RDF applications. It does not constrain an
RDF implementation to actually normalize the case. Crucially, the result
of comparing two language tags should not be sensitive to the case of
the original input.

Two literals are equal if and only if all of the following hold:

The strings of the two lexical forms compare equal, character
by character.

Either both or neither have language tags.

The language tags, if any, compare
equal.

Either both or neither have datatype URIs.

The two datatype URIs, if any, compare equal, character by
character.
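The equality conditions above can be expressed as a short Python sketch. The Literal class and literals_equal function are hypothetical names; language tags are compared after lower-casing, which is one way to make the comparison insensitive to the case of the original input.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Literal:
    lexical_form: str
    language: Optional[str] = None   # language tag, if any
    datatype: Optional[str] = None   # datatype URI, if any


def literals_equal(a: Literal, b: Literal) -> bool:
    # Lexical forms compare equal, character by character.
    if a.lexical_form != b.lexical_form:
        return False
    # Either both or neither have language tags.
    if (a.language is None) != (b.language is None):
        return False
    # Language tags, if any, compare equal (case-insensitively).
    if a.language is not None and a.language.lower() != b.language.lower():
        return False
    # Either both or neither have datatype URIs.
    if (a.datatype is None) != (b.datatype is None):
        return False
    # Datatype URIs, if any, compare equal, character by character.
    return a.datatype == b.datatype
```

This is purely syntactic equality: two typed literals with different lexical forms are unequal here even when they denote the same value.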

Note: RDF Literals are distinct and distinguishable
from RDF URI references; e.g. http://example.org as an RDF
Literal (untyped, without a language tag) is not equal to
http://example.org as an RDF URI reference.

The datatype URI refers to a datatype. For XML Schema
built-in datatypes, URIs such as
http://www.w3.org/2001/XMLSchema#int are used. The URI
of the datatype rdf:XMLLiteral may be used.
There may be other, implementation dependent, mechanisms by which
URIs refer to datatypes.

The value associated with a typed literal is found by
applying the lexical-to-value mapping associated with the datatype URI to
the lexical form.

If the lexical form is not in
the lexical space of the datatype associated with the datatype URI,
then no literal value can be associated with the typed literal.
Such a case, while in error, is not syntactically ill-formed.
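As an informal sketch of how an application might obtain the value of a typed literal, the following Python code looks up a lexical-to-value mapping by datatype URI. The registry DATATYPES and the function names are hypothetical, only xsd:int is registered, and the check of its lexical space (an optional sign followed by decimal digits, within the 32-bit range) is a simplified reading of XML Schema Part 2.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"


def xsd_int_mapping(lexical_form: str) -> int:
    """Lexical-to-value mapping for xsd:int; ValueError if outside the lexical space."""
    if not re.fullmatch(r"[+-]?[0-9]+", lexical_form):
        raise ValueError(f"not in the lexical space of xsd:int: {lexical_form!r}")
    value = int(lexical_form)
    if not -2**31 <= value <= 2**31 - 1:
        raise ValueError(f"out of range for xsd:int: {lexical_form!r}")
    return value


# Hypothetical registry: datatype URI -> lexical-to-value mapping.
DATATYPES = {XSD + "int": xsd_int_mapping}


def literal_value(lexical_form: str, datatype_uri: str):
    """Return the literal's value, or None when no value can be associated with it."""
    mapping = DATATYPES[datatype_uri]
    try:
        return mapping(lexical_form)
    except ValueError:
        # In error, but not syntactically ill-formed: the literal
        # remains a legitimate part of the graph.
        return None
```

Returning None rather than rejecting the literal mirrors the point made above: such a literal has no value, yet the graph containing it is still well-formed.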

Note: In application contexts, comparing the values of typed literals
(see section 6.5.2) is usually more helpful than comparing their
syntactic forms (see section 6.5.1). Similarly, for comparing RDF Graphs,
semantic notions of entailment (see [RDF-SEMANTICS]) are usually
more helpful than syntactic equality (see section 6.3).

RDF uses an RDF URI
Reference, which may include a fragment identifier, as a
context free identifier for a resource. RFC 2396 [URI] states that the meaning of a fragment
identifier depends on the MIME content-type of a document, i.e.
is context dependent.

These apparently conflicting views are reconciled by
considering that a URI reference in an RDF graph is treated
with respect to the MIME type application/rdf+xml [RDF-MIME-TYPE]. Given an RDF URI
reference consisting of an absolute URI and a fragment
identifier, the fragment identifier identifies the same thing
that it does in an application/rdf+xml representation of the
resource identified by the absolute URI component. Thus:

we assume that the URI part (i.e. excluding fragment
identifier) identifies a resource, which is presumed to have
an RDF representation. So when eg:someurl#frag is used in an RDF
document, eg:someurl is taken to
designate some RDF document (even when no such document can
be retrieved).

eg:someurl#frag means the thing
that is indicated, according to the rules of the application/rdf+xml MIME content-type as
a "fragment" or "view" of the RDF document at eg:someurl. If the document does not
exist, or cannot be retrieved, or is available only in
formats other than application/rdf+xml, then exactly what
that view may be is somewhat undetermined, but that does not
prevent use of RDF to say things about it.

the RDF treatment of a fragment identifier allows it to
indicate a thing that is entirely external to the document,
or even to the "shared information space" known as the Web.
That is, it can be a more general idea, like some particular
car or a mythical Unicorn.

in this way, an application/rdf+xml document acts as an
intermediary between some Web retrievable documents (itself,
at least, also any other Web retrievable URIs that it may
use, possibly including schema URIs and references to other
RDF documents), and some set of possibly abstract or non-Web
entities that the RDF may describe.

This provides a handling of URI references and their
denotation that is consistent with the RDF model theory and
usage, and also with conventional Web behavior. Note that
nothing here requires that an RDF application be able to
retrieve any representation of resources identified by the URIs
in an RDF graph.
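The split between the absolute URI and the fragment identifier can be illustrated with a few lines of Python. The function name is hypothetical, and the sketch deliberately performs no retrieval, in keeping with the observation that an RDF application need not dereference anything.

```python
def split_uri_reference(uri_ref: str):
    """Split an RDF URI reference into (absolute URI, fragment identifier or None)."""
    absolute_uri, separator, fragment = uri_ref.partition("#")
    return absolute_uri, (fragment if separator else None)


# eg:someurl is taken to designate some RDF document; "frag" is interpreted
# according to the rules of application/rdf+xml, whether or not that
# document can actually be retrieved.
document_uri, fragment = split_uri_reference("eg:someurl#frag")
```

For purposes of equality the RDF URI reference is still compared as a whole string; the split only reflects how its denotation is explained.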

This document contains a significant contribution from Pat
Hayes, Sergey Melnik and Patrick Stickler, under whose leadership
the framework described in the RDF family of specifications for
representing datatyped values, such as integers and dates, was
developed.

9.1 Normative References

[RDF-SEMANTICS]

RDF
Semantics, Hayes P. (Editor), W3C Proposed Recommendation (work in progress), 15 December 2003. This
version is http://www.w3.org/TR/2003/PR-rdf-mt-20031215/.
The latest version
is http://www.w3.org/TR/rdf-mt/.

[XML-NS]

Namespaces in XML, T. Bray, D. Hollander and A. Layman, Editors.
World Wide Web Consortium. 14 January 1999. This version is
http://www.w3.org/TR/1999/REC-xml-names-19990114/.
The latest version
of Namespaces in XML is available at
http://www.w3.org/TR/REC-xml-names/.

[UNICODE]

The Unicode Standard, Version 3, The Unicode
Consortium, Addison-Wesley, 2000. ISBN 0-201-61633-5, as updated
from time to time by the publication of new versions. (See http://www.unicode.org/unicode/standard/versions/
for the latest version and additional information on versions of
the standard and of the Unicode Character Database).

[XML-SCHEMA2]

XML Schema Part 2: Datatypes, W3C Recommendation, World Wide Web
Consortium, 2 May 2001. This version is
http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/. The latest version is available at
http://www.w3.org/TR/xmlschema-2/.

9.2 Informational References

[RDF-TESTS]

RDF
Test Cases, Grant J., Beckett D. (Editors), W3C Proposed Recommendation (work in progress), 15 December 2003. This
version is
http://www.w3.org/TR/2003/PR-rdf-testcases-20031215/. The latest
version is http://www.w3.org/TR/rdf-testcases/.

[RDF-PRIMER]

RDF Primer, Manola F., Miller E., Editors, W3C Proposed
Recommendation (work in progress), 15 December 2003. This
version is
http://www.w3.org/TR/2003/PR-rdf-primer-20031215/. The latest version is at
http://www.w3.org/TR/rdf-primer/.

[XML-SCHEMA1]

XML Schema Part 1: Structures,
W3C Recommendation, World Wide Web
Consortium, 2 May 2001.
This version is
http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/. The latest version is available at
http://www.w3.org/TR/xmlschema-1/.

[XML-NAMESPACES-1.1]

Namespaces
in XML 1.1, Tim Bray, Dave Hollander, Andrew Layman,
Richard Tobin, Editors. W3C Proposed Recommendation 05 November 2003.
This version is
http://www.w3.org/TR/2003/PR-xml-names11-20031105/. The latest version is available at
http://www.w3.org/TR/xml-names11/.

[XML-INFOSET]

XML
Information Set, John Cowan and Richard Tobin, W3C
Recommendation, 24 October 2001. This document is
http://www.w3.org/TR/2001/REC-xml-infoset-20011024/.
The latest version is available at
http://www.w3.org/TR/xml-infoset/.