Commenting on a snapshot, taken at about dusk UK time 11/Nov/2002,
of
C:\data\HP\dynamic\public\2002\20021030-reviewingRDFCoreDocs\mt\RDF_Semantics_latest.html

Critical issues which must be
addressed at this publication are strong orange and are enumerated here.
Suggested replacement text is in normal orange.

There are no critical issues at this time.

Issues which should be addressed
before last call are in strong purple - with suggested replacement
text in normal purple.

Stylistic issues are in strong green.
These are suggestions to the editor, which the editor may accept or reject
at his discretion without further discussion. Suggested replacement
text in normal green.

Status of this Document

This document is a W3C Working Draft for review by W3C members and other
interested parties. It is a draft document and may be updated, replaced, or
obsoleted by other documents at any time. It is inappropriate to use W3C
Working Drafts as reference material or to cite them as other than "work in
progress". A list of current public W3C Working Drafts can be found as part
of the W3C Technical Reports and
Publications.

[[[Editor's Note: further work on the overlap with
[RDF_CONCEPTS] is needed. Some of the following discussion may be replaced by
content in other documents.]]]

RDF is intended to be used to convey meanings. It is hard to give a
precise characterization of what this means, and the semantics given here
restricts itself to a formal notion of meaning which could be characterized
as the part that is common to all other accounts of meaning. Exactly what is
considered to be the 'meaning' of an assertion in RDF in some broad sense may
depend on many factors, including social conventions, comments in natural
language or links to other content-bearing documents. Most of this meaning
will be inaccessible to machine processing and is mentioned here only to
emphasize that the formal semantics described here is not intended to provide
an analysis of 'meaning' in this broad sense. However, users should operate
under the basic assumption that any such meaning is preserved by any formal
inference processes which preserve truth in the formal sense, so that any
terms in a formally sanctioned conclusion from a set of RDF graphs can be
interpreted as carrying the same informal meanings that they had in the
original graphs. We note in passing that this condition raises many complex
questions about how such informal meanings derived from several sources
should be combined, and that these questions are also not addressed by the
formal semantics. See [RDF_CONCEPTS] for further discussion. It has been
argued that certain sources of RDF assertions should be taken as more
authoritative or more reliable than others, in particular that assertions
made by the 'owner' of a uriref should be considered to be definitive in
determining the meaning of those urirefs. The semantics given here takes no
position on issues like this.

We use a basic technique for specifying the formal meaning of a formal
language called model-theoretic
semantics. This assumes that the language refers to a 'world', and
describes the minimal conditions that a world must satisfy in order to assign
an appropriate meaning for every expression in the language. A particular
world is called an interpretation, so that model theory might be
better called 'interpretation theory'. The idea is to provide an abstract,
mathematical account of the properties that any such interpretation must
have, making as few assumptions as possible about its actual nature or
intrinsic structure. Model theory tries to be metaphysically and
ontologically neutral. It is typically couched in the language of set theory
simply because that is the normal language of mathematics - for example, this
semantics assumes that names denote things in a set IR called the
'universe' - but the use of set-theoretic language here is not supposed to
imply that the things in the universe are set-theoretic in nature. Model
theory is usually most relevant to implementation via the notion of
entailment, described later, which makes it possible to define valid
inference rules.The chief utility of a formal semantic theory is not to
provide any deep analysis of the nature of the things being described by the
language or to suggest any particular processing model, but rather to provide
a technical way to determine when inference processes are valid, i.e. when
they preserve truth.

In this document we give two versions of the same semantic theory:
directly, and also (in an informative appendix) an 'axiomatic semantics' in
the form of a translation from RDF and RDFS into another formal language,
Lbase [LBASE] which has a predefined model-theoretic semantics.
The translation technique offers some advantages for machine processing and
may be found easier to read by some readers, so is described here as a
convenience. We believe that both of these descriptions, and also the closure
rules described in section 4, are all in exact correspondence, but only the
directly described model theory should be taken as normative.

There are several aspects of meaning in RDF which are ignored by this
semantics; in particular, it treats URI references as simple names, ignoring
aspects of meaning encoded in particular URI forms [RFC
2396] and does not provide any analysis of time-varying data or of
changes to URI references. It does not provide any analysis of indexical uses
of URI references, for example to mean 'this document'. It does not assign
any particular meaning to some parts of the RDF and RDFS namespaces, and in
some cases, notably the reification and container vocabularies, it assigns
less meaning than one might expect. These cases are noted in the text and the
limitations discussed in more detail. The semantics treats RDF and RDFS as
simple assertional languages, in which each triple expresses a simple
proposition. This imposes a fairly strict monotonic discipline on the
language, so that it cannot express closed world assumptions, local default
preferences, and several other commonly-used non-monotonic constructs.

Particular uses of RDF, including as a basis for more expressive languages
such as DAML and OWL, may impose further semantic conditions in addition to
those described here, and such extra semantic conditions can also be imposed
on the meanings of terms in particular RDF namespaces. We refer to these as semantic extensions. Semantic
extensions to RDF are constrained in this recommendation using the language
of RFC 2119. An example semantic extension is RDF Schema, the semantics of
which are defined in later parts of this document. All such extensions MUST
conform to the semantic conditions in this document. In more operational
terms, any entailment which is valid according to the semantics described
here MUST continue to be valid in any extended semantics imposed on an RDF
namespace. Any name for entailment in a semantic extension MUST be indicated
by the use of a namespace does it have
to be a namespace? entailment term, as introduced in section
3 below.

Any semantic theory must be attached to a syntax. This semantics is
defined as a mapping on the abstract syntax of RDF [RDF_CONCEPTS]. We use the
following terminology defined there: uriref, defined as RDF URI
Reference; literal, plain literal, typed literal,
blank node and triple.

The convention that relates a set of triples to a picture of an RDF graph
can be stated as follows. Draw one oval for each blank node and uriref, and
one rectangle for each literal, which occur in either the S or O position in
any triple in the set, and write each uriref or literal as the label of its
shape. Then for each triple <S,P,O>, draw an arrowed line from the
shape produced from S to the shape produced from O, and label it with P.
Technically, this is a picture of a mathematical structure which can be
described as a partially labeled directed pseudograph with unique node
labels, but we will simply refer to a set of triples as
an RDF graph.

In this document we will use the N-triples syntax
described in [RDFTestCases] to describe RDF
graphs. This notation uses a nodeID convention to indicate blank nodes in the
triples of a graph. Note that while
node
identifiers such as _:xxx serve to identify blank nodes in
the surface syntax, these expressions are not considered to be the
label of the graph node they identify; they are not names, and do not occur
in the actual graph. In particular, two N-triples
documents which differ only by re-naming their node identifiers will be
understood to describe identical RDF graphs.

The N-triples syntax requires that urirefs be given in full, enclosed in
angle brackets. In the interests of brevity, we use the imaginary URI scheme
do you
really mean to use a scheme here? Does it exist? is it not better to stick
to the qname form?In the interests of brevity we
use a shorter notation for urirefs based on XML QNames. Names of the form
prefix:name should be understood to have the 'prefix:' replaced by a string
of characters associated with the prefix. The following prefixes and
character strings are used in this document:'ex:' to provide illustrative examples.
To obtain a more realistic view of the normal appearance of the N-triples
syntax, the reader should imagine this replaced with something like
'http://example.org/rdf/mt/artificialExample/'. We will also make extensive
use of the Qname prefixes rdf: and rdfs: defined as follows:

Since Qname syntax is not legal in the N-triples syntax, and in the
interests of brevity and readability, we will use the convention whereby a
Qname is used without surrounding angle brackets to indicate the
corresponding uriref enclosed in angle brackets, eg the triple

Notice that one does not, in general, obtain the merge of a
set of graphs by concatenating their corresponding N-triples documents
and constructing the graph described by the merged document, since if some of
the documents use the same node identifiers, the merged document will
describe a graph in which some of the blank nodes have been 'accidentally'
merged. To merge Ntriples documents it is necessary to check if the same
nodeID is used in two or more documents, and to replace it with a distinct
nodeID in each of them, before merging the documents. Similar cautions apply
to merging graphs described by RDF/XML documents which contain nodeIDs..
s/.././

An instance of an RDF
graph is, intuitively, a similar graph in which some blank nodes may have
been replaced by urirefs or literals. However, it is technically
convenient to also allow blank nodes to be replaced by other blank nodes, so
we need to state this rather more precisely. Say that one triple is an
instance of another if it can be obtained by substituting zero or more
urirefs, literals or blank nodes for blank nodes in the original; and that a
graph is an instance of another just when every triple in the first graph is
an instance of a triple in the second graph, and every
triple in the second graph has an instance in the first graph. Note
that any graph is an instance of itself.

This allows blank nodes in the second graph to be replaced by
names in the instance (which might cause some nodes to be identified that
were previously distinct) but it also allows them to be replaced by other
blank nodes. In particular, this means that the two graphs:

<ex:a> <ex:b> _:xxx .

<ex:a> <ex:b> _:yyy .

and

<ex:a> <ex:b> _:zzz .

with, respectively, three nodes and two arcs, and two nodes
and one arc, are instances of each other. Similarly,

We do not impose any logical restrictions on the domains and ranges of
properties; in particular, a property may be applied to itself. When classes
are introduced in RDFS, we will allow them to contain themselves. This might
seem to violate one of the axioms of standard (Zermelo-Fraenkel) set theory,
the axiom of foundation, which forbids infinitely descending chains of
membership. However, the semantic model given here distinguishes properties
and classes considered as objects from their extensions - the sets of object-value pairs which
satisfy the property, or things that are 'in' instance of or members of the
class - thereby allowing the extension of a property or class to
contain the property or class itself without violating the axiom of
foundation. In particular, this use of a class extension mapping allows
classes to contain themselves. For example, it is quite OK for (the extension
of) a 'universal' class to contain the class itself as a member, a convention
that is often adopted at the top of a classification hierarchy. (If an
extension contained itself then the axiom would be violated, but that case
never arises.) The technique is described more fully in Hayes&Menzel.

In this respect, RDFS differs from many conventional ontology frameworks
such as UML which assume a more structured system of 'layers', or draw a
distinction between data and meta-data. However, while RDFS does not assume
the existence of such structure, it does not prohibit it. RDF allows such
loops, but it does not mandate their use for all parts of a user
vocabulary.If this aspect of RDFS is found worrying, then it is possible to
restrict oneself to a subset of RDF graphs which do not contain any such
'loops' of class membership or property application, and still retain much of
the expressive power of RDFS for many practical purposes.

The use of the explicit extension mapping also makes it possible for two
properties to have exactly the same values, or two classes to contain the
same instances, and still be considered distinct. This means that RDFS
classes can be considered to be rather more than simple sets; they can be
thought of as 'classifications' or 'concepts' which have a robust notion of
identity which goes beyond a simple extensional correspondence. This property
of the model theory has significant consequences in more expressive languages
built on top of RDF, such as OWL, which are capable of expressing identity
between properties and classes directly. This 'intensional' nature of classes
and properties is sometimes claimed to be a useful property of a descriptive
language, but a full discussion of this issue is beyond the scope of this
document.

Notice that the question of whether or not a class contains itself as a
member is quite different from the question of whether or not it is a
subclass of itself. All classes are subclasses of themselves.

As report by Arjon is
http://lists.w3.org/Archives/Public/www-rdf-comments/2002OctDec/0097.html
this rule, nor the equivalent one for subproperties is reflected in the
closure rules below.

RDF uses two kinds of referring expression, urirefs and literals. We make
very simple and basic assumptions about these. Urirefs are treated as logical
constants, i.e. as names which denote things (the things are called
'resources', following RFC 2396, but no assumptions
are made here about the nature of resources.) The meaning of a literal is
principally determined by its character string: it either refers to the value
mapped from the string by the associated datatype, or if no datatype is
provided then it refers to the Unicode string
itself. unless there is a lang code,
right? We do not take any position here on the way that
urirefs may be composed from other expressions, Do you need to say this - the uri part of an
RDF URI reference is always absolute - there are no relative URI's allowed in
the abstract syntax. e.g. from relative URIs or Qnames; the
semantics simply assumes that such lexical issues have been resolved in some
way that is globally coherent, so that a single uriref can be taken to have
the same meaning wherever it occurs. Similarly, the semantics has no special
provision for tracking temporal changes. It assumes, implicitly, that urirefs
have the same meaning whenever they occur. To provide an adequate
semantics which would be sensitive to temporal changes is a research problem
which is beyond the scope of this document.

We do not make any assumptions about the relationship between
the denotation of a uriref and a document or network resource which can be
obtained by using that uriref in an HTTP transfer protocol. It has been
argued that urirefs in the form of HTTP URIs should be required to denote the
document that results from such a retrieval. Such a requirement could be
added as an extra semantic condition, but this condition is not assumed in
this document.

The basic intuition of model-theoretic semantics is that
asserting a sentence makes a claim about the world: it is another way of
saying that the world is, in fact, so arranged as to be an interpretation
which makes the sentence true. In other words, an assertion amounts to
stating a constraint on the possible ways the world might be.
Notice that there is no presumption here that any assertion contains enough
information to specify a single unique interpretation. It is usually
impossible to assert enough in any language to completely constrain the
interpretations to a single possible world, so there is no such thing as
'the' unique RDF interpretation. In general, the larger an RDF graph is - the
more it says about the world - then the smaller the set of interpretations
that an assertion of the graph allows to be true - there are fewer ways the
world could be, while making the asserted graph true of it.

The following definition of an interpretation is couched in
mathematical language, but what it amounts to intuitively is that an
interpretation provides just enough information about a possible way the
world might be - a 'possible world' - in order to fix the truth-value (true
or false) of any ground RDF triple. It does this by specifying for each
uriref, what it is supposed to be a name of; and also, if it is used to
indicate a property, what values that property has for each thing in the
universe; and if it used to indicate a datatype, we assume that the datatype
defines a mapping between lexical forms and datatype values. This is just
enough information to fix the truth-value of any ground
triple, and hence any ground RDF graph.(We will show how to determine the
truthvalues of non-ground graphs in the following section.) Notice that if we
left any of this information out, it would be possible for some well-formed
triple to be left without a determinate value; and also that any other
information - such as the exact nature of the things in the universe - would,
regardless of its intrinsic interest, be irrelevant to the actual
truth-values of any triple.

All interpretations will be relative to a set of urirefs,
called the vocabulary of the interpretation; so that one should speak,
strictly, of an interpretation of an RDF vocabulary, rather than of RDF
itself. Some interpretations may assign special meanings to the symbols in a
particular namespace, which we will call a reserved vocabulary.
Interpretations which share the special meaning of a particular reserved
vocabulary will be named for that vocabulary, so that we will speak of
'rdf-interpretations' , 'rdfs-interpretations', etc.. An interpretation with
no reserved vocabulary will be called a simple interpretation, or
simply an interpretation. A simple interpretation can be viewed as having an
empty reserved vocabulary.

RDF uses several forms of literal. The chief semantic characteristic of
literals is that their meaning is determined by the form of the string they
contain. In the case of typed literals, however, the full specification of
the meaning depends on being able to access the datatype information which is
external to RDF itself; for this reason we postpone a full discussion of the
meaning of typed literals until later sections, where we introduce a special
notion of datatype interpretation. For now, we will assume that each
interpretation defines a mapping IL from typed literals to their
interpretations, and will impose stronger conditions on IL as the notion of
'interpretation' is extended in later sections. Simple what term did we decide to use for these?
"plain literals"? literals, without embedded datatypes, are
always interpreted as referring to themselves: either a character string or a
pair consisting of two character strings, the second of which is a language
tag.

The set of all possible values of all literals is assumed to be a set
called LV which is common to all RDF interpretations. Since the set of
datatypes is not restricted by RDF syntax, it is impossible to give a sharp
definition of LV, but it is required to contain all literal strings and also
all pairs consisting of a literal string and a language tag.

A simpleinterpretation I of a vocabulary V is defined
by:

1. A non-empty set IR of resources, called the domain or universe of I,
which is a superset of LV.

2. A distinguished subset IP of IR, the set of properties.

3. A mapping IEXT from IP into the powerset of IR x IR i.e. the set of
sets of pairs <x,y> with x and y in IR .

4. A mapping IS from V into IR

5. A mapping IL from typed literals into IR.

IEXT(x) is a set of pairs which identify the arguments for which the
property is true, i.e. a binary relational extension, called the extension of x. This trick of distinguishing a
relation as an object from its relational extension allows a property to
occur in its own extension, as noted earlier.

The assumption the
grammar? s/the/that/ IR is a superset of LV amounts to saying
that literal values are thought of as real entities that 'exist'. This
assumption may seem controversial, since it amounts to saying that literal
values are resources. We note however that this does not imply that literals
should be identified with urirefs. There is a technical reason why the range
of IL is IR rather than being restricted to LV. When we consider
interpretations which take account of datatype information, it is
syntactically possible for a typed literal to be internally inconsistent, and
we will require such badly typed literals to denote a non-literal value.

In the next sections we give the exact rules for how an interpretation of
a vocabulary determines the truth-values of any RDF graph, by a recursive
definition of the denotation - the semantic "value" - of any RDF expression
in terms of those of its immediate subexpressions. RDF has two kinds of
denotation: names denote things in the universe, and sets of triples denote
truthvalues.

The denotation of a ground RDF graph in I is given recursively by the
following rules, which extend the interpretation mapping I from labels to
ground graphs. These rules (and extensions of them given later) work by
defining the denotation of any piece of RDF syntax E in terms of the
denotations of the immediate syntactic constitutents of E, hence allowing the
denotation of any piece of RDF to be determined by a kind of syntactic
recursion.

if E is a plain literal then I(E) = E

if E is a typed literal than I(E) = IL(E)

if E is a uriref then I(E) = IS(E)

if E is a triple s p o . then I(E) = true if <I(s),I(o)> is
in IEXT(I(p)), otherwise I(E)= false.

if E is a ground RDF graph then I(E) = false if I(E') = false for
some triple E' in E, otherwise I(E) =true.

Notice that if the vocabulary of an RDF graph contains urirefs that are
not in the vocabulary of an interpretation I - that is, if I simply does not
give a semantic value to some name that is used in the graph - then these
truth-conditions will always yield the value false for some triple in the
graph, and hence for the graph itself. Turned around, this means that any
assertion of a graph implicitly asserts that all the names in the graph
actually refer to something in the world. Note that the
final condition implies that an empty graph (an empty set of triples) is
trivially true.

As an illustrative example, the following is a small
interpretation for the artificial vocabulary {ex:a, ex:b, ex:c}.
We use integers to indicate the 'things' in the universe. This is
not meant to imply that RDF interpretations should be interpreted as being
about arithmetic, but more to emphasize that the exact nature of the things
in the universe is irrelevant.(In this and subsequent examples we use the
greater-than and less-than symbols in several ways: following mathematical
usage to indicate abstract pairs and n-tuples; following Ntriples syntax to
enclose urirefs, and also as arrowheads when indicating mappings. We
apologize for any confusion.)

IR = LV union{1, 2};

IEXT: 1->{<1,2>,<2,1>}

IS: ex:a->1, ex:b->1,
ex:c->2

IL: any typed literal -> 2

Figure 1: An example of an interpretation. Note, this is not a picture
of an RDF graph.

This interpretation makes these triples true:

<ex:a> <ex:b> <ex:c> .

<ex:c> <ex:a> <ex:a> .

<ex:c> <ex:b> <ex:a> .

<ex:a> <ex:b> "whatever"^^ex:b .

For example, I(<ex:a> <ex:b> <ex:c> .) =
true if <I(ex:a),I(ex:c)> is in
IEXT(I(<ex:b>)), i.e. if <1,2> is in IEXT(1), which
is {<1,2>,<2,1>} and so does contain <1,2> and so
I(<ex:a <ex:b> ex:c>) is true.

The truth of the fourth literal is a consequence of the rather
idiosyncratic interpretation chosen here for typed literals; this kind of
oddity will be ruled out when we consider datatyped intepretations.

It makes these triples false:

<ex:a> <ex:c> <ex:b> .

<ex:a> <ex:b> <ex:b> .

<ex:c> <ex:a> <ex:c> .

<ex:a> <ex:b> "whatever" .

For example, I(<ex:a> <ex:c> <ex:b> .) =
true if <I(ex:a),I(<ex:b>)>,
i.e.<1,1>, is in IEXT(I(ex:c)); but I(ex:c)=2
and IEXT is not defined on 2, so the condition fails and I(<ex:a>
<ex:c> <ex:b> .) = false.

It makes all literals containing a plain literal false, since the property
extension does not have any pairs containing a character string.

To emphasize; this is only one possible interpretation of this
vocabulary; there are (infinitely) many others. For example, if we modified
this interpretation by attaching the property extension to 2 instead of 1,
none of the above six triples would be true.

Blank nodes are treated as simply indicating the
existence of a thing, without using, or saying anything about, the name of
that thing. (This is not the same as assuming that the blank node indicates
an 'unknown' uriref; for example, it does not assume that there is any uriref
which refers to the thing. See
http://www.w3.org/2000/03/rdf-tracking/#rdfms-identity-anon-resources
for a summary and pointers to further discussions on this issue. The
discussion of skolemization in the proof appendix is also
relevant.)

We now show how an interpretation can specify the truth-value of a graph
containing blank nodes. This will require some definitions, as the theory so
far provides no meaning for unlabeled nodes. Suppose I is an interpretation
and A is a mapping from some set of unlabeled nodes to the universe IR of I,
and define I+A to be an extended interpretation which is like I except that
it uses A to give the interpretation of unlabeled nodes. Define anon(E) to be
the set of unlabeled nodes in E. Then we can extend the above rules to
include the two new cases that are introduced when unlabeled nodes occur in
the graph:

If E is an unlabeled node then I+A(E) = A(E)

If E is an RDF graph then I(E) = true if I+A'(E) = true
for some mapping A' from anon(E) to IR, otherwise I(E)= false.

Notice that we have not changed the definition of an
interpretation; it still consists of the same values IR, IP, IEXT, IS and IL.
We have simply extended the rules for defining denotations under an
interpretation, so that the same interpretation that provides a truth-value
for ground graphs also assigns truth-values to graphs with unlabeled nodes,
even though it provides no denotation for the unlabeled nodes
themselves. Notice also that the unlabeled nodes themselves are
perfectlyon the grounds of deep
suspicion of all claims to perfection, d/perfectly/
well-defined entities; they differ from other nodes only in not being
assigned a denotation by an interpretation, reflecting the intuition that
they have no 'global' meaning (i.e. outside the graph in which they
occur).

This effectively treats all blank nodes as having the same meaning as
existentially quantified variables in the RDF graph in which they occur.
Notice however that that since two unlabeled nodes cannot have the same
label, there is no need to specify the scope of the quantifier within a
graph, and no need to use any explicit quantifier syntax.( If we were to
apply the semantics directly to N-triples syntax, we would need to
indicate the quantifier scope, since in this lexicalization syntax the same
node identifier may occur several times corresponding to a single blank node
in the graph. The above rule amounts to the convention that would place the
quantifiers just outside, or at the outer edge of, the N-triple document
corresponding to the graph.)

For example, with this convention, the graph defined by the following
triples is false in the interpretation shown in figure 1:

_:xxx <ex:a> <ex:b> .

<ex:c> <ex:b> _:xxx .

since if A' maps the unlabeled node to 1 then the first triple is false in
I+A', and if it maps it to 2 then the second triple is false.

Note that each of these triples, if thought of as a single graph, would be
true in I, but the whole graph is not; and that if a different nodeID were
used in the two triples, indicating that the RDF graph had two blank nodes
instead of one, then A' could map one node to 2 and the other to 1, and the
resulting graph would be true under the interpretation I.

Entailment is the key idea which connects model-theoretic
semantics to real-world applications. As noted earlier, making an assertion
amounts to claiming that the my cpu is a
bit overloaded at this point. Do you mean *the* world as in the real world,
do you mean *a* world, or do you mean some other specific world but I haven't
figured out which one it is. world is an interpretation which
assigns the value true to the assertion. If A entails B, then any
interpretation that makes A true also makes B true, so that an assertion of A
already contains the same "meaning" as an assertion of B; we could say that
the meaning of B is somehow contained in, or subsumed by, that of A. If A and
B entail each other, then they both "mean" the same thing, in the sense that
asserting either of them makes the same claim about the world. The interest
of this observation arises most vividly when A and B are different
expressions, since then the relation of entailment is exactly the appropriate
semantic licence to justify an application inferring or generating one of
them from the other. Through the notions of satisfaction, entailment and
validity, formal semantics gives a rigorous definition to a notion of
"meaning" that can be related directly to computable methods of determining
whether or not meaning is preserved by some transformation on a
representation of knowledge.

In this section we give a few basic results about simple entailment and
valid inference. Simple entailment can be recognized by relatively simple
syntactic comparisons. The two basic forms of simply valid proof step in RDF
are, in logical terms, the inference from (P and Q) to P, and the inference
from (foo baz) to (exists (?x) foo(?x)).

I can't parse that. I guess I don't
know the background logical stuff well enough. {P, Q} entails P I
understand. But I don't know how to read (foo baz). No doubt I'll get to
it.

Note, these results apply only to simple entailment, not to the more
subtle notions of entailment introduced in later sections. Proofs, all of
which are straightforward, are given in appendix B, which
also describes some other properties of entailment which may be of
interest.

The relationship between merging and entailment is simple,
and obvious from the definitions:

Merging lemma. The
merge of a set S of RDF graphs is entailed by S, and entails every member of
S.

This means that a set of graphs can be treated as
equivalent to its merge, i.e. a single graph, as far as the model theory is
concerned. In what follows, therefore, we will often not bother to
distinguish between a set of graphs and a single graph. This can be used to
simplify the terminology somewhat: for example, we can
paraphrase the definition of S entails E, above, by saying that S entails E
when every interpretation which satisfies S also satisfies E.

The interpolation lemma completely characterizes simple RDF entailment in
syntactic terms. To tell whether a set of RDF graphs entails another, find a
subgraph of their merge and replace names by unlabeled nodes to get the
second. Of course, there is no need to actually construct the merge. If
working backwards from the consequent E (the graph that may be entailed by
the others), the most efficient technique would be to treat unlabeled nodes
as variables in a process of subgraph-matching, allowing them to bind to
'matching' names in the antecedent graph(s) in S, i.e. those which may entail
the consequent graph. The interpolation lemma shows that this process is
valid, and is also complete if the subgraph-matching algorithm is. The
existence of complete subgraph-checking algorithms also shows that RDF entailment is decidable, i.e. there is a terminating
algorithm which will determine for any finite set S and any graph E, whether
or not S entails E.

Notice however that such a variable-binding process would only be
appropriate when applied to the conclusion of a proposed entailment.
This corresponds to using the document as a goal or a query, in contrast to
asserting it, i.e. claiming it to be true. If an RDF document is asserted,
then it would be invalid to bind new values to any of its unlabeled nodes,
since the resulting graph would not be entailed by the assertion, as
explained in the next section.

It might be thought that the operation of changing a bound variable would
be an example of an inference which was valid but not covered by the
interpolation lemma, e.g. the inference of

_:x <ex:a> <ex:b> .

from

_:y <ex:a> <ex:b> .

Recall however that by our conventions, these
two expressions describe identical RDF graphs.

Finally, the following is a trival consequence of the definition of
entailment, but it may be worth stating explicitly since many implemented
systems fail to satisfy it:

Monotonicity Lemma. Suppose S is a subgraph or subset of S' and S entails E. Then S' entails
E.

Notice that unlabeled nodes are not identified with other nodes in a
merge, and indeed this reflects a basic principle of RDF graph inference: in
contrast to names, which have a global identity which carries across all
graphs, blank nodes should not be identified with other nodes or re-labeled
with urirefs, in order to ensure that the resulting graph is entailed by what
one starts with. To state this condition precisely, we need
to first exclude a counterexample. It is possible for a graph to contain two
triples one of which is an instance of the other, for example:

This means that there is no valid RDF inference process which can produce
an RDF graph in which a single unlabeled node occurs in triples originating
from several different graphs. (Of course, such a graph can be constructed,
but it will not be entailed by the original documents. An assertion of such a
graph would reflect the addition of new information about the identity of two
unlabeled nodes.)

We emphasise again that these results apply only to simple entailment, not
to the namespace entailment relationships defined in rest of the document.

So far we have considered only the model theory of what might be called
the logical form of the RDF graph itself, without imposing any special
interpretations on any reserved vocabulary. In the rest of the document we
will extend the model theory to describe the semantic conditions reflecting
the intended meanings of the rdf: and rdfs: namespaces. Some of these
interpretations diverge in some ways from the meanings described in RDFMS and RDFSchema , and we
will note these differences as they arise. Some
stylistic care to ensure reader does not think this reference applies to the
current schema spec. ... described in the RDF Model and Syntax
specification [RDFMS] and the first RDF Schema candidate recommendation
[RDF-SCHEMA-1], ...

Although we will do this in stages, the same general technique is used
throughout. First we describe a reserved vocabulary, i.e. a set of urirefs
which will be given a special meaning; then we give the extra conditions on
an interpretation which capture those meanings; then we restrict the notions
of satisfiability and entailment to apply to these interpretations only. This
essentially imposes an a priori restriction on the world being
described that it satisfy the extra conditions. The new semantic conditions
are automatically assumed to be true; an interpretation which would violate
them is simply not allowed to count as a possible world. We call these restricted notions of satisfiability and
entailement vocabulary specific satisfiability and vocabulary specific
entailment respectively.

Since there are now many distinct notions of interpretation, entailment
and satisfiability, we use the conventional Qname namespace prefixes to
identify the various distinctions, eg an
rdf-interpretation is an interpretation satisfying the rdf semantic
conditions, rdf-entailment means entailment relative to such interpretations,
and so on. I'm not sure we need the
namespace prefix idea hear at all. Since there are now many
distinct notions of interpretation, entailment and satisfiability, we will
name them, e.g. an rdf-interpretation is an interpretation satisfying the rdf
semantic conditions, rdf-entailment means ...We call this general idea
vocabulary
entailment , i.e. entailment relative to a set of interpretations
which satisfy extra semantic conditions on a reserved vocabulary. Vocabulary
specific entailment is more powerful than
simple entailment, in the sense that a given set of premises entails more
consequences. In general, as the reserved vocabulary is increased and extra
semantic conditions imposed, the class of satisfying interpretations is
restricted, and hence the corresponding notion of entailment becomes more
powerful. For example, if S simply entails E then it also rdf-entails E,
since every rdf-interpretation is also a simple interpretation; but S may
rdf-entail E even though it does not simply entail it. Intuitively, a
conclusion may follow from some of the extra
assumptions incorporated in the semantic conditions imposed on the
reserved vocabulary.

Another way of expressing this is that any restriction on
interpretations decreases the number of possible ways that an
interpretation might be a counterexample to E's following from S.
I can't parse this. I keep thinking
about it to confirm I believe it, but it makes my head hurt. I understood, I
think, the para above. If this para is meant to make this idea easier to
understand, it doesn't work for me. This could be an editorial comment -
please delete. Or if its thought to be worth keeping, I'd like some help to
figure out what it means.

Simple entailment is the vocabulary entailment of the empty vocabulary. It
is therefore the weakest form of RDF entailment, which holds for any
reserved vocabulary; it is the entailment which depends only on the basic
logical form of RDF graphs, without making any further assumptions about the
meaning of any urirefs.

We will consider syntactic criteria for recognizing vocabulary entailment
in the next section.

I think one of the things that makes
me nervous about vocab entailment is the notion that folks can make new
vocabs from old, e.g. PRISM reuses DC, or at least some of it. There is
potentially a complicated structure of vocabularies. I'm not saying there is
anything wrong I can put my finger on here, I'm just abit twitchy. Could be
the sherry.

You've done it again. I've just
written down a worry and the next section deals with it.

This forces every rdf interpretation to contain a thing which can be
interpreted as the 'type' of properties. The second condition could be
regarded as defining IP to be the set of resources in the universe
of the interpretation which have the value I(rdf:Property) of
the property I(rdf:type). This way of construing subsets of the
universe will be central in interpretations of RDFS. The
third condition says that the empty list object is classified as being a
list: this is the only formal condition the RDF semantics places on the
collection vocabulary, described later.

For example, the following rdf-interpretation extends the simple
interpretation in figure 1:

This is not the smallest rdf-interpretation which extends the earlier
example, since we could have made I(rdf:Property) be 2 and
IEXT(T) be {<1,2>,<T,2>}, and managed without having P in the
universe. In general, a given entity in an interpretation may play several
'roles' at the same time, as long as this can be done without violating any
of the required semantic conditions. The above interpretation identifies
properties with lists, for example; of course, other interpretations might
not make such an identification.

It is important to note that every rdf-interpretation is also
a simple interpretation.The 'extra' structure does not prevent it acting in
the simpler role.

RDF provides vocabularies which are intended for use in describing
containers and bounded collections, and a reification vocabulary to enable an
RDF graph to describe, as well as exhibit, triples. Although these
vocabularies have reasonably clear informally intended conventional meanings,
we do not impose any further formal semantic
conditions on them, so the notions of rdf-entailment and rdf-interpretation
apply to them without further change. They are discussed here in order to
explain both the intuitive meanings intended, and also to note the intuitive
consequences which are not supported by the formal model theory.
Constraints are imposed on the meanings of these vocabularies in semantic
extensions. When the RDFS vocabulary is added, this part of the RDF
vocabulary will have nontrivial consequences; but until then, it can be added
to the RDF vocabulary without needing any further semantic conditions.We will refer to the complete set of all rdf urirefs,
consisting of the RDF reserved vocabulary, all of the reification, container
and collection vocabularies and the uriref rdf:value, as the
RDF vocabulary, rdfV.

The lack of a formal semantics for these vocabularies does not reflect any
technical semantic problems, but rather is a design decision to make it
easier to implement RDF reasoning engines which can check formal RDF
entailment. Since no extra formal semantic conditions are imposed on them,
they are not considered to be restricted vocabularies in
RDF. In RDFS, however, the entire RDF vocabulary
is considered to be a restricted vocabulary. I don't think we have met the term "restricted
vocabulary" yet. Confirmed.

The RDF reification vocabulary consists of a class name and three property
names.

RDF reification vocabulary

rdf:Statement rdf:subject rdf:predicate rdf:object

Semantic extensions MAY limit the interpretation of these so that a triple
of the form

aaa rdf:type rdf:Statement .

is true in I just when I(aaa) is a token of an RDF triple in some RDF
document, and the three properties, when applied to such a denoted triple,
have the same values as the respective components of that triple.

This may be illustrated by considering the following two RDF graphs, the
first of which consists of a single triple.

The second graph is called a reification of the triple in the first
graph, and the node which is intended to refer to the first triple - the
blank node in the second graph - is called, rather confusingly, a reified
triplenot in m&s. its a
reified statement. . (This can be a blank node or a
uriref.) In the intended interpretation of the
reification vocabulary, the second graph would be made true in an
interpretation I by interpreting the reified triple to refer to a token of
the triple in the first graph in some concrete RDF document, considering that
token to be valid RDF syntax, and then using I to interpret the syntactic
triple which the token instantiates, so that the subject, predicate
and object of that triple are interpreted in the same way in the reification
as in the reified triple. Formally, <x,y> is in
IEXT(I(rdf:subject)) just when x is a token of an RDF triple of
the form

aaa bbb ccc .

and y is I(aaa); similarly for predicate and object. Notice that the value
of the rdf:subject property is not the subject uriref itself but
its interpretation, and so this condition involves a two-stage interpretation
process: we have to interpret the reified node - the subject of the triples
in the reification - to refer to another triple, then treat that triple as
RDF syntax and apply the interpretation mapping again to get to the referent
of its subject. This requires triple tokens to exist as first-class entities
in the universe IR of an interpretation. In sum: the
meaning of the reification is that a document exists containing a triple
token which means whatever the first graph means.

We emphasize that the semantic extension described here requires the
reified triple that the reification describes - I(_:xxx) in the
above example - to be a particular token or instance of a
triple in a (real or notional) RDF document, rather than an 'abstract' triple
considered as a grammatical form. There could be
several such entities which have the same subject, predicate and object
properties. Although a graph is defined as a set of triples, several such
tokens with the same triple structure might occur in different documents.
Thus, it would be meaningful to claim that the blank node in the second graph
above does not refer to the triple in the first graph, but to some other
triple with the same structure. This particular interpretation of reification
was chosen on the basis of use cases where properties such as dates of
composition or provenance information have been applied to the reified
triple, which are meaningful only when thought of as referring to a
particular instance or token of a triple.

Although RDF applications may use reification to refer to triple tokens in RDF documents, the connection between the
document and its reification must be maintained by some means external to
RDF. RDF syntax provides no means to 'connect' an RDF triple to its
reification. Since an assertion of a reification of a triple does not
implicitly assert the triple itself, this means that there are no
entailment relationships which hold between a triple and a reification of it.
Thus the reification vocabulary has no effective semantic constraints on it,
other than those that apply to an RDF interpretation. The chief facts that
are worthy of note about RDF reification, in fact, are examples of
non-entailments.

A reification of a triple does not entail the triple, and is not entailed
by it. (The reason for first is clear, since the reification only asserts
that the triple token exists, not that it is
true. The second non-entailment is a consequence of the fact that asserting a
triple does not automatically assert that any triple tokens exist in the universe being described by the
triple. For example, the triple might be part of an ontology describing
animals, which could be satisfied by an interpretation in which the universe
contained only animals.)

Since the relation between triples and reifications of triples in any RDF
graph or graphs need not be one-to-one, asserting a property about some
entity described by a reification need not entail that the same property
holds of another such entity, even if it has the same components. For
example,

RDF provides vocabularies for describing three classes of
containers. A container is an entity whose 'members' are thought of
as the values of properties, each of which relates a particular 'position' in
the container to the entity, if there is one, which is 'at' that position.
(The rdfs vocabulary, described below, adds a generic membership property
which holds regardless of position, and classes containing all the containers
and all the membership properties.)

RDF Container Vocabulary

rdf:Seq rdf:Bag rdf:Alt rdf:_1 rdf:_2 ...

One should understand this RDF vocabulary as describing containers, rather
than as a vocabulary for constructing them, as would typically be supplied by
a programming language. On this view, the actual containers are entities in
the semantic universe, and RDF graphs which use the vocabulary simply provide
very basic information about these entities, enabling an RDF graph to
characterize the container type and give partial information about the
members of a container. Since the RDF container vocabulary is so limited,
many 'natural' assumptions concerning RDF containers are not formally
sanctioned by the RDF model theory. This should not be taken as meaning that
these assumptions are false, but only that RDF does not formally entail that
they must be true.

There are no special semantic conditions on the container vocabulary: the
only 'structure' which RDF presumes its containers to have is what can be
inferred from the use of this vocabulary and the semantic conditions on the
rest of the RDF vocabulary. Since the membership properties rdf:_1,
rdf:_2, ... are implicitly ordered by their very names, that order can
be thought of as an ordering of the elements of the container. This implicit
ordering of members of a container applies to all three kinds of container,
even though bags are normally thought of as unordered. RDF does not support
any entailments which could arise from re-ordering the elements of an
rdf:Bag. For example,

_:xxx rdf:type rdf:Bag .
_:xxx rdf:_1 <ex:a> .
_:xxx rdf:_2 <ex:b> .

does not entail

_:xxx rdf:_1 <ex:b> .
_:xxx rdf:_2 <ex:a> .

Notice that if this conclusion were valid, then the result of conjoining
it to the original graph would also be a valid entailment, which would assert
that both elements were in both positions. (This is a consequence of the fact
that RDF is a purely assertional language.)

There is no assumption that a property of a container applies to any of
the elements of the container, or that if a property applies to a container
then the property applies to any of the members of the container, or vice
versa. There is no requirement that the three container classes are disjoint,
so that for example something can be asserted to be both an rdf:Bag and an
rdf:Seq. There is no assumption that containers are gap-free, so that for
example

_:xxx rdf:type rdf:Seq.
_:xxx rdf:_1 <ex:a> .
_:xxx rdf:_3 <ex:c> .

does not entail

_:xxx rdf:_2 _:yyy .

That jars with M&S which says
they are contiguous. My, potentially flawed understanding, is that we
decided that containers are contiguous as per m&s, we just allow partial
descriptions of them. Would the above entailment cause problems?
i.e.

_:xxx rdf:type rdf:Seq
.

_:xxx rdf:_nnn _:yyy .

entails

_:xxx rdf:_mmm _:zzz .

when nnn > 1 and mmm =
nnn-1

There is no way in RDF to 'close' a container, i.e. to assert that it
contains only a fixed number of members. This is a reflection of the fact
that it is always consistent to add a triple to a graph asserting a
membership property of any container. And finally, there is no built-in
assumption that an RDF container has only finitely many members.

The informal purpose of the three container types is to allow applications
to encode the various intentions or expectations about different kinds of
containers. Sequences are thought of as totally ordered, bags as unordered
(that is, equivalent under re-orderings) and rdf:Alt containers are intended
to convey a series of alternative values of a property, which an application
can choose from. However, these informal interpretations are not reflected in
any RDF entailments. In particular, a triple with a rdf:Alt as a
subject or object should not be thought of as an encoding of a logical
disjunction.

3.2.3 Collections

RDF provides a vocabulary for describing collections, ie.'list structures'
in terms of head-tail links. Collections differ from containers in allowing
branching structure and in having an explicit terminator, allowing
applications to determine the exact set of items in the collection.

RDF Collection Vocabulary

rdf:List rdf:first rdf:rest rdf:nil

As with containers, no special semantic conditions are imposed on this
vocabulary other than the type of nil being
List. It is intended for use typically in a context where a
'well-formed' container is described using blank nodes to connect a sequence
of items, each described by three triples of the form

_:c1 rdf:type rdf:List .
_:c1 rdf:first aaa .
_:c1 rdf:rest _:c2

where the final item is indicated by the use of rdf:nil as
the value of the property rdf:rest. In a familiar convention,
rdf:nil can be thought of as the empty collection. Clearly,
any such graph amounts to an assertion that the collection, and all its
sub-collections, exist, and since the members of the collection can be
determined by inspection, this is often sufficient to enable applications to
determine what is meant. Note however that the semantics does not require any
collections to exist other than those mentioned explicitly in a graph (and
the empty collection). For example, the existence of a collection containing
two items does not automatically guarantee that the similar collection with
the items permuted also exists:

Also, RDF imposes no 'wellformedness' conditions on the use of this
vocabulary, so that it is possible to write RDF graphs which assert the
existence of highly peculiar objects such as lists with forked or non-list
tails, or multiple heads:

As this example shows, it is also possible to write a set of triples which
underspecify a collection by failing to specify its rdf:rest
property value.

Semantic extensions MAY place extra syntactic well-formedness restrictions
on the use of this vocabulary in order to rule out such graphs, and MAY
exclude interpretations of the collection vocabulary which violate the
convention that the subject of a 'linked' collection of three-triple items of
the form described above, ending with an item ending with
rdf:nil, denotes a totally ordered sequence whose members are
the denotations of the rdf:first values of the items, in the
order got by tracing the rdf:rest properties from the subject to
rdf:nil. This permits sequences which
contain other sequences.

(rdfs:comment, rdfs:seeAlso,
rdfs:isDefinedBy and rdfs:label are included here
because some constraints which apply to their use can be stated using
rdfs:domain, rdfs:range and rdfs:subPropertyOf.
Other than this, the formal semantics does not assign them any particular
meanings.)

Although not strictly necessary, it is convenient to state the RDFS
semantics in terms of a new semantic construct, a 'class', i.e. a resource
which represents a set of things in the universe which all have that class as
the value of their rdf:type property. Classes are defined to be
things of type rdfs:Class. We will assume that there is a
mapping ICEXT (for the Class Extension in I) from classes to their
extensions; the first semantic condition in the table below amounts to the
following definition of this mapping in terms of the relational extension of
rdf:type:

ICEXT(x) = {y | <y,x> is in IEXT(I(rdf:type)) }

Notice that a class may have an empty class extension, and that (as noted
earlier) two different class entities could have the same class extension;
and that given the above definition, the class extension of
rdfs:Class contains the class rdfs:Class.

An
rdfs-interpretation of V is an rdf-interpretation I of (V union rdfV union rdfsV) which satisfies the following
semantic conditions and all the triples in the subsequent table, which we
will call axiomatic triples. The first condition can be understood
as a definition of ICEXT and hence of IC, the set of classes. Since I is an
rdf-interpretation, this means that IP =
ICEXT(I(rdf:Property))

x is in ICEXT(y) iff <x,y> is in
IEXT(I(rdf:type))

IC = ICEXT(I(rdfs:Class))

IR = ICEXT(I(rdfs:Resource))

If <x,y> is in IEXT(I(rdfs:range)) and
<u,v> is in IEXT(x) then v is in ICEXT(y)

If <x,y> is in IEXT(I(rdfs:domain)) and
<u,v> is in IEXT(x) then u is in ICEXT(y)

<x,y> is in
IEXT(I(rdfs:subClassOf)) if and only if x and y are in
IC and ICEXT(x) is a subset of ICEXT(y)

<x,y> is in IEXT(I(rdfs:subPropertyOf)) if
and only if x and y are in IP and IEXT(x) is a subset of IEXT(y)

If x is in ICEXT(I(rdfs:ContainerMembershipProperty))
then <x,I(rdfs:member)> is in
IEXT(I(rdfs:subPropertyOf))

The truth of the axiomatic triples could be stated as conditions on IEXT
and ICEXT, but it is convenient to use the truth-of-triples formulation.
Similarly, the conditions on IC and IP in the first table could be stated as
axiomatic triples with property rdf:type and objects
rdfs:Class and rdfs:Property respectively.

The semantics given here for rdfs:range and
rdfs:domain do not entail that superclasses of domains or
ranges of a property must also be domains and ranges of that property. For
some purposes it may be more convenient to phrase the semantic conditions as
'iff' conditions, which would require imposing this as an extra condition on
the meanings of rdfs:range and rdfs:domain. This
extra condition will not effect any class-membership entailments on the
elements of the domains and ranges of the property. The semantics given here
was chosen because it is sufficient for all normal uses of these terms,
allows some subtleties in class reasoning, and places a lesser burden on
implementors. Some domain and range assertions are
omitted from the above table; in those cases, the domain or range of the
property may be taken to be rdfs:Resource, i.e. the universe;
such range and domain assertions are essentially vacuous.

We will not attempt to give a pictorial diagram of an
rdfs-interpretation.

The semantic conditions on rdfs-interpretations do not
include the condition that ICEXT(I(rdfs:Literal)) must be a
subset of LV. While this would seem to be required for
conformance with RDFMS, there is no way to impose this
condition by any RDF assertion or syntactic closure rule. This
limitation is due to the fact that RDF does not allow literals to occur in
the subject position of a triple, so there are severe restrictions on what
can be said about literals in RDF. Similarly, while properties may be
asserted of the the class rdfs:Literal, none of these can be
rdfs-validly transferred to literals
themselves.

For example, a triple of the form

<ex:a> rdf:type rdfs:Literal .

is consistent even though 'ex:a' is a uriref rather than a
literal. What it says is that I(ex:a) is a literal value, ie
that the uriref 'ex:a' denotes a literal value. There is
however no way in current RDF to specify exactly which literal value it
denotes.

Note that the interpolation lemma guarantees that any triple containing a
literal object entails a similar triple with a bnode as object:

<ex:a> <ex:b> "10" .

entails

<ex:a> <ex:b> _:xxx .

This means that literal denotes 'something', which could therefore also be
named, at least in principle, by a uriref.

3.4 Datatyped interpretations

A datatype is identified by a uriref and defines a set of character
strings called lexical forms and a mapping from that set to a set of values.
Exactly how these are defined is a matter external to RDF, but this is the
minimal structure required in order to state a semantics. In operational
terms, a reasoning engine would require that the uriref of a datatype
provides access to a process which can determine, for any character string,
whether or not it is a valid lexical form for that datatype, and for any two
such valid character strings, whether or not they map to the same value under
the lexical-to-value mapping. It may also use information about the identity
of datatype values from different datatypes, if that information is
available.

Since the set of possible datatypes is open-ended, we will assume that
datatype interpretations are defined relative to a particular set of
datatypes, and refer to D-interpretations where D is some set of
datatypes, which we will call recognized datatypes. A 'datatype-aware' RDF
engine SHOULD be competent to recognize at least the rdfs:XMLLiteral
datatype and the set of all the XML Schema primitive datatypes. We will call this set XSD. and
use the Qname prefix xsd: to refer to XML Schema datatypes in examples.

This is equisitely expressed, but
does not seem to belong in the semantics doc. Move to
concepts?

We will assume that there is a global mapping L2V from datatypes to their
lexical-to-value mappings; the valid lexical forms of a datatype d constitute
the domain of L2V(d), and the value spaces of d Care here: after your recent discussion with
Henry, are we sure that the xsd notion of a value space is exactly the set of
values of a datatype? Oh, and how many value spaces does a datatype have?
s/spaces/space/. ... and the range of L2V(d) is the set of values
in the value space of d. is the range of L2V(d). Recall that the set
LV is defined to include all members of all datatype value spaces, so that
L2V(d) must be a subset of LV.

A D-interpretation of a graph G is an rdfs-interpretation I of V,
where V contains the vocabulary of G, which satisfies the following extra
conditions:

ICEXT(I(rdfs:Datatype)) = D

For any typed literal "sss"^^ddd in G, if I(ddd) is in D and 'sss'
is a valid lexical form for I(ddd) then IL("sss"^^ddd) =
L2V(I(ddd))(sss)

For any typed literal "sss"^^ddd in G, if I(ddd) is in D and 'sss'
is not a valid lexical form for I(ddd) then IL("sss"^^ddd) is not in
LV

If x is in D, then ICEXT(x) is the value space of L2V(x) ditto

The first condition says that membership in the class
rdfs:Datatype means the same as being a recognized datatype.
Thus, the inclusion of a triple of the form

<ex:ddd> rdf:type rdfs:Datatype .

in an RDF graph can be understood by a datatype-aware RDF
reasoner as a claim that ex:ddd identifies a recognized
datatype. Such reasoners MAY post a warning or an error condition when they
are unable to access the relevant datatype information. While normal RDFS
reasoning is valid when applied to the datatype vocabulary, other
implications which depend on the properties of the datatype spaces may be
missed, and datatype clashes or other error conditions may be
undetectable.

The second condition says that the meaning of any typed literal which uses
a 'recognized' datatype is the value of the literal character string under
that datatype. For example, if I is an XSD-interpretation then
I("15"^^xsd:integer) must be the number fifteen. Notice that this applies
only to datatypes in D; typed literals whose type is not a recognized
datatype are treated as before, i.e. as denoting some unknown thing. This
means that their meanings can be further restricted by adding a suitable
extra datatype to the set of recognized datatypes.

The third condition requires that an 'ill-formed' typed literal, i.e. one
where the literal string is not in the lexical space of the datatype, i/does/ not denote any literal
value. Intuitively, such a name does not denote any value, but in order to
avoid the semantic complexities which arise from empty names, we require such
a typed literal to denote an 'arbitrary' value. Thus for example, if D
contains the XML schema datatypes, then all that can be concluded about
I("arthur"^^xsd:integer) is that it is not in
ICEXT(I(rdfs:Literal)). Datatype-aware RDF reasoners SHOULD post a warning or an error condition when such
literals are detected.

This comment is technical and beyond
my competence. We have LV which is a subset of IR. "arthur"^^xsd:integer
must denote something which is not in LV, you say here, I think. Are we sure
there is anything in IR not in LV? Nowhere so far have we required IR to be
a strict superset of LV. Or is that what you have just done with this
requirement. I'm waiting to get the next para
with baited breath, to see if the telepathy is still
working.

The final condition indicates that RDF uses a datatype uriref in two ways:
as a name for the datatype itself, and (when used as a class name) to
indicate the class containing the elements of the value space ditto of the datatype.

Users should take care to distinguish the value space of a datatype from
the class of its members. There you go
again. telepathy in action. Different issue though. For
example, the XML schema spec [XMLS] allows primitive datatypes whose elements
are considered unequal as elements of the value space but which are identical
when viewed as class members.

I would really like to stay away from
this issue. You no doubt understood Henry better than I, but I get the
impression there is still debate inside schema wg on this and the situation
may change. Could it be that xsd datatypes don't really denote say integers,
but pairs (primitive type, value). Or could it be that when we are talking
about equality, xsd are using a notion of equals that takes four arguments
xsd.equals(type1, val1, type2, val2). I dunno. I'm just thinking it would
be good if we could write our stuff to stay completely away from that issue
and as you say below, its up to the schema folks to explain, with our help of
course, what exactly their model is.

RDF does not impose any identity conditions on elements in value spaces,
nor assume any subclass relationships between datatype value classes.
Information about such relationships should be obtained from the
specifications of the datatypes themselves. Similarly, RDF does not assume
that its literal strings are identical to elements of the class
xsd:string, even though both are defined as sequences of unicode
characters. Users may wish to make such identifications, but are cautioned
that other users may disagree with any such claims or assumptions.

The treatment of unknown types provides a trivial proof of the following
lemma:

Datatype monotonicity lemma. If D is a subset of D' and S
D-entails E, then S D'-entails E.

It is possible for an RDF graph to have no D-interpretation which
satisfies it. For example, since XML Schema requires that the value spaces of
xsd:string and xsd:integer to be disjoint, so it is impossible to construct a
XSD-interpretation satisfying the graph

<ex:a> <ex:b> "25"^^xsd:integer .
<ex:b> rdfs:range xsd:string .

This situation could be characterized by saying that the graph is
XSD-inconsistent, or as a datatype clash. Note that it is possible
to construct a satisfying rdfs-interpretation for this graph, but any such
interpretation, when extended to an XSD-interpretation, would violate the XSD
conditions, since the class extensions of I(xsd:string) and
I(xsd:string) would have a nonempty intersection. Although the definition of entailment means that a
D-inconsistent graph D-entails any RDF graph, datatype-aware RDF reasoners
SHOULD NOT publish conclusions derived from a recognized datatype-clash
contradiction.

This semantics for datatypes is minimal. It makes no provision for
assigning a datatype to the range of a property,

Que?

prop rdfs:range xsd:integer .

We can say that
right?

for example, and does not provide any way of explicitly asserting that a
blank node denotes a particular value under a datatype mapping. There are
several technical difficulties in extending this to a broader class of
datatype usages while also preserving the simple 'core' of RDF. We expect
that the datatyping machinery will be extended in later versions of RDF.

We will say that S rdf-entails E (S rdfs-entails E, S
D-entails E) when every rdf-interpretation (every
rdfs-interpretation, every interpretation datatyped with respect to D) which
satisfies every member of S also satisfies E. This follows
the wording of the definition of simple entailment
in section 2, but refers only to rdf- , rdfs- or D-interpretations instead of
all simple interpretations. These are examples of vocabulary entailment, i.e. entailment relative
to a set of interpretations which satisfy extra semantic conditions on a
reserved vocabulary.

It is easy to see that the lemmas in section 2 do not hold for vocabulary
entailment. For example, the triple

rdf:type rdf:type rdf:Property .

is true in every rdf-interpretation, and hence rdf-entailed by the empty
set, which immediately contradicts the interpolation lemma for
rdf-entailment.

Rather than develop a separate theory of the syntactic conditions for
recognising entailment for each reserved vocabulary, we will use a general
technique for reducing these broader notions of entailment to simple
entailment, by defining the closure of an RDF graph relative to a set
of semantic conditions. The basic idea is to rewrite the semantic conditions
as a set of syntactic inference rules, and define the closure to be the
result of applying those rules to exhaustion. The resulting graphs will
contain RDF triples which explicitly state all the special meanings embodied
in the extra semantic conditions, in effect axiomatizing them in RDF itself.
A graph rdf-entails (rdfs-entails) another just when its rdf-closure
(rdfs-closure) simply entails it. It is not possible to provide such a tight
result for D-entailment closures since the relevant semantic conditions
require identities which cannot be stated in RDF.

The notion of closure used here is purely a formal device to relate two
notions of entailment. We do not mean to suggest that closure rules should be
used as a computational technique, or that actually generating the full
closure would be the best process to use in order to determine vocabulary
entailment. Implementors who wish to check any kind of entailment should use
a process which is optimised for the combinatorics of the particular set of
use cases that are most likely to arise in a given application area. In many
cases it may be more efficient to use a process of backchaining on the
closure rules, for example.

Closure rules correspond directly to implication axioms in the Lbase
translation given in the appendix.

1. Add the following triple (which is true in any rdf-interpretation):

rdf:nil rdf:type rdf:List .

2. Apply the following rule recursively to generate all legal RDF triples
(i.e. until none of the rules apply or the graph is unchanged.) Here
xxx and yyy stand for any uriref, bNode or
literal, aaa for any uriref.

if E contains

then add

rdf1

xxx aaa yyy .

aaa rdf:type rdf:Property .

Notice that this immediately generates the triple

rdf:type rdf:type rdf:Property .

which expresses a central semantic property of rdf
interpretations.

The following lemma is the basic result on rdf-entailment, and illustrates
a general pattern of how to characterize vocabulary entailment
syntactically.

1. Add the RDFS axiomatic triples from the table in section 3.3 and all
the following triples. There are many other triples which are true in every
rdfs-interpretation, but they will be generated from these by the closure
rules.

2. Add all triples of the following forms. This is an infinite set because
the RDF container vocabulary is infinite. However, since none of these
triples entail any of the others, it is only necessary, in practice, to add
the triples which use those container properties which actually occur in any
particular graph or set of graphs in order to check the rdfs-entailment
relation between those graphs.

3. Apply the following rules recursively to generate all legal RDF triples
(i.e. until none of the rules apply or the graph is unchanged.) Here,
xxx, yyy and zzz stand for
any uriref, bNode or literal, aaa for any uriref, and
uuu for any uriref or bNode (but not a literal).

If E contains:

then add:

rdf1

xxx aaa yyy .

aaa rdf:typerdf:Property
.

rdfs2

xxx aaa yyy .
aaa rdfs:domain zzz .

xxx rdf:type zzz .

rdfs3

xxx aaa uuu .
aaa rdfs:range zzz .

uuu rdf:type zzz .

rdfs4a

xxx aaa yyy .

xxx rdf:typerdfs:Resource
.

rdfs4b

xxx aaa uuu .

uuu rdf:typerdfs:Resource
.

rdfs5

aaa rdfs:subPropertyOf bbb .
bbb rdfs:subPropertyOf ccc .

aaa rdfs:subPropertyOf ccc .

rdfs6

xxx aaa yyy .
aaa rdfs:subPropertyOf bbb .

xxx bbb yyy .

rdfs7

xxx rdf:type
rdfs:Class .

xxx rdfs:subClassOfrdfs:Resource .

rdfs8

xxx rdfs:subClassOf yyy .
yyy rdfs:subClassOf zzz .

xxx rdfs:subClassOf zzz .

rdfs9

xxx rdfs:subClassOf yyy .
aaa rdf:type xxx .

aaa rdf:type yyy .

rdfs 10

xxx rdf:type rdfs:ContainerMembershipProperty .

xxx rdfs:subPropertyOf rdfs:member .

rdfs 11

xxx rdf:type rdfs:Datatype .

xxx rdfs:subClassOf rdfs:Literal .

The outputs of these rules will often trigger others. For example, these
rules will generate the complete transitive closures of all subclass and
subproperty heirarchies, together with all of the resulting type information
about everything which can be inferred to be a member of any of the classes,
and will propagate all assertions in the graph up the subproperty heirarchy,
re-asserting them for all super-properties. rdfs1 will
generate type assertions for all the property names used in the graph, and
rdfs3 together with the first triple in the above list will add all the types
for all the class names used. Any subproperty or subclass assertion will
generate appropriate type assertions for its subject and object via rdfs 2
and 3 and the domain and range assertions in the RDFS axiomatic triple set.
The rules will generate all assertions of the form

xxx rdf:type rdfs:Resource .

for every xxx in V, and of the form

xxx rdfs:subClassOf rdfs:Resource .

for every class name; and several more 'universal' facts, such as

rdf:Propertyrdf:typerdfs:Class
.

However, it is easy to see that (with the restriction noted of the
infinite sets to those membership properties which occur in the graph) the
rules will indeed terminate on any finite RDF graph, since there are only
finitely many triples that can be formed from a given finite vocabulary.

A similar result applies here as in the case of rdf-entailment, though it
takes considerably more work to prove:

We note in passing that the stronger 'iff' semantic conditions on
rdfs:domain and rdfs:range mentioned in section
3.3 would be captured by adding the additional rules

rdfs 2a

xxx rdfs:domain yyy .
yyy rdfs:subClassOf zzz .

xxx rdfs:domain zzz

rdfs 3a

xxx rdfs:range yyy .
yyy rdfs:subClassOf zzz .

xxx rdfs:range zzz

and that these would provide a redundant inference path to the conclusions
of rdfs 2 and 3.

4.3 Datatype entailments
(Informative)

In order to capture datatype entailment in terms of assertions and closure
rules, the rules need to refer to information about identity supplied by the
datatypes themselves; and to state the rules it is necessary to assume
syntactic conditions which can only be checked by consulting the datatype
sources. Since such questions are beyond the scope of RDF, it is impossible
to prove an entailment lemma for datatype closures analogous to those for RDF
and RDFS.

Datatype information can be characterized abstractly as a (demunerably
infinite) set of assertions about Unicode strings which specify, for each
legal lexical form of each datatype, the fact that it is indeed legal, and
which other lexical forms map to the same value under each datatype. Often
the content of these infinite sets can be captured by familiar recursive
rules on the lexical form of the equations and inequations, such as leading
zero suppression for numerals, but for our purposes we can characterize
the

The following rules are valid in every D-interpretation, provided that ddd
indicates a datatype in D, sss and ttt are character strings which are both
valid lexical forms for that datatype, and for the second rule that sss and
ttt are mapped into the same value by the datatype.

In the first rule, _:xxx is a new bnode, i.e. one that does not appear
elsewhere in the graph.

rdfD 1

ddd rdf:type rdfs:Datatype .
aaa ppp "sss"^^ddd .

aaa ppp _:xxx .
_:xxx rdf:type ddd .

rdfD 2

ddd rdf:type rdfs:Datatype .
aaa ppp "sss"^^ddd .

aaa ppp "ttt"^^ddd .

These rules come as close as one can get in RDF to asserting that the
typed literal denotes a literal value and that literals which map into the
same values are equal. Notice that it would be invalid to make these
inferences without checking that the literal string sss is in the lexical
space of the datatype, so these cannot be considered valid
rdfs-entailments.

These rules do not support any entailments based on identity between
values of different datatypes. An obvious generalization of the second rule
would permit such conclusions, but questions of identity between items in
value spaces of two different datatypes should be referred to the authorities
who defined the datatypes. These rules do however suffice to expose a
datatype clash, by a chain of reasoning of the following form:

As noted in the introduction, an alternative way to specify RDF
interpretations is to give a translation from RDF into a logical langauge
with a model theory already attached, as it were. This 'axiomatic semantics'
approach has been suggested and used previously (@@refs) with various
alternative versions of the target logical language. Here we use a version of
first-order logic which was designed to provide a semantic reference for such
translations from web-based languages, called Lbase
[LBASE].

To translate an RDF graph into the semantic reference language
Lbase, apply the following rules to each expression
noted. Each rule gives a translation TR[E] for the expression E, to
be applied recursively. To achieve a translation which reflects a namespace
entailment, add the axioms specified. Each namespace includes all axioms and
rules for preceding namespaces, so that the RDFS translation of a graph
should include the RDF translation as well as the RDFS axioms, and so on.

This translation uses the Lbase logical expressions
Lbase:String and
Lbase:XMLthing, which are true respectively of unicode
character strings, and anything that is denoted by a piece of well-formed XML
syntax; and it introduces some ad-hoc terminology in order to give a logical
account of the meanings implicit in the various literal constructions; for
example, the RDFS-D axioms use a predicate 'badLiteral' to flag cases of
typed literals which are illegally formed according to their attached
datatype.. The axioms given are sufficient to define the intended meanings of
the nonlogical vocabulary used.

The RDF-D axioms use two logical expressions whose meanings are not given
axiomatically but are intended to be described by the datatypes in D.
LegalLexicalForm(?y,?x) is true when ?y is a character string
which is a legal lexical for for the datatype ?x, and L2V(?y,?x)
is a term denoting the value of the legal lexical form ?y under the
L2V mapping of the datatype ?x. Such meanings can usually be defined by a
countable set of Lbase axioms all conforming to some
regular pattern; for example, the datatype xsd:integer can be defined by
infinitely many axioms of the form

LegalLexicalForm('345', xsd:integer) and L2V('345',
xsd:integer)=345

We give the axioms for the built-in RDF datatype rdfs:XMLLiteral, which
are trivial since they refer to the built-in Lbase
category Lbase:XMLthing. The use of the datatype as a function name is
defined by the axioms given: in effect, it means the same as the use of L2V
when the argument is a legal lexical form, but has a different meaning if
not.

Lbase translation rules

RDF expression E

Lbase expression
TR[E]

a plain literal "sss"

'sss', with all occurrences of the symbols ',/,(,),<,>
prefixed with /

a plain literal "sss"@tag

the term pair(TR["sss"],
'tag')

a typed literal "sss"^^ddd

the term
TR[ddd](TR["sss"])

rdfs:Resource

T

any other uriref aaa

aaa

a blank node

a variable (one distinct variable per blank node)

a triple aaa bbb ccc .

TR[bbb](TR[aaa],
TR[ccc])

an RDF graph

The existential closure of the conjunction of the translations of
all the triples in the graph.

Note, we did simply identify the use of the datatype function with the
assertion of the datatype value, in order to avoid contradictions when
describing a datatype error. This illustrates a general technique for
handling error conditions when translating into
Lbase.

Proof. Suppose I satisfies E'
and E' is an instance of E. Then for some mapping A
on the blank nodes of E', I+A satisfies every triple in E'. For each blank
node b in E, define B(b)=I+A(c), where c is the blank node or name that is
substituted for b in E', or c=b if nothing was substituted for it. Then
I+B(E)=I+A(E')=true, so I satisfies E. But I was arbitrary; so E' entails E. QED.

If an instance of a graph E' is a subgraph of another graph
E then E entails E'; this follows from the subgraph and instance lemmas. As
we show below, this is in fact a necessary as well as sufficient condition
for entailment, so it is useful to give a name to the syntactic condition
that captures non-entailment. Say that a
graph E' is separable from a graph E if no instance of E' is a
subgraph of E. In particular, a ground graph is separable from E just
when it is not a subgraph of E, and a ground triple is separable just in case
it isn't in the graph. Graphs which are not separable from E are entailed by
E; but for all others, there is a way to arrange the world so that they are
false and E true.

For ground graphs, the subgraph lemma can be
strengthened to provide simple necessary and sufficient conditions for
entailment.

Proof. We will construct the interpretation from the
graph, by providing 'just enough' entities and extensions to make the graph
true. Since the exact nature of the things in the universe is irrelevant,
it is convenient to use the nodes of the graph themselves as their own
denotations. (That was
Herbrand's idea.)

The universe of I is defined to be the set of names and blank nodes in
the graph.

Define IS to be the identity mapping on the vocabulary of the graph, and
IEXT as follows: <x,y> is in IEXT(z) just when there is a triple in
the graph of the form x z y . Define the mapping A to be the identity
mapping on blank nodes of the graph.

Clearly I satisfies all ground triples in the graph, and I+A satisfies
the entire graph; so I satisfies the graph. QED

An interpretation constructed in this way, so that the IS mapping is the
identity mapping, is called a Herbrand interpretation. A Herbrand
interpretation treats urirefs in the same way as literals, i.e. as denoting
their own syntactic form. Of course this may not be what was intended by the
writer of the RDF, but the lemma shows that any graph can be
interpreted in this way.

If I satisfies E, then I may contain more information than is necessary to
specify the truth of E; an interpretation - a world - can be larger than
strictly needed to establish the truthvalues of a particular set of triples.
It is therefore useful to define a notion of the minimal part of an
interpretation which is just enough to make a given graph true.

Say that I' is a subinterpretation of I when vocab(I')
is a subset of vocab(I), IR'is a subset of IR, I'(x)=I(x) wherever I'(x) is
defined, and IEXT'(x) is a subset of IEXT(x) wherever IEXT'(x) is defined.
Intuitively, I' defines a 'part' of the world defined by I. If a
subinterpretation of I satisfies E, then I must also satisfy E.

Herbrand interpretations are minimal, and every minimal interpretation has
a corresponding Herbrand interpretation which assigns the same truthvalues to
every triple, and hence to every graph.

It is clear that if I satisfies E, then a minimal satisfying
subinterpretation of I exists with a vocabulary precisely the vocabulary of
E. The minimal interpretations can be characterized by the following
lemma.

Minimality lemma. If I is a
minimal satisfying interpretation of E, then I fails to satisfy every triple which has no instance in E.

Proof. We will argue by reductio. Suppose I satisfies
some such triple S P O, i.e.. IEXT(I(P)) contains <I(S),I(O)>, and
consider the subinterpretation I' which is like I except that IEXT(I'(P))
does not contain that pair. Since S P O has no instances in E,
I'+A(x)=I+A(x) for any mapping A from blank nodes and any triple x in E,
and I satisfies E, so I' satisfies E; so I is not minimal.
QED

The property extensions in a minimal interpretation are 'shrink-wrapped'
onto the assertions in the graph. Notice that every thing in the universe of
a minimal interpretation of E must be the denotation of at least one node in
E, and that every pair in any property extension must have at least one
corresponding triple in E that it makes true; for if not, one could delete
some of the interpretation and still satisfy E. We will make use of this
property in later proofs.

Proof.The construction in the proof of the Herbrand
Lemma in fact establishes this result for arbitrary separable graphs.
Consider the Herbrand interpretation I constructed in the proof of the
Herbrand lemma, and let <S P O> be a triple which has no instances in
E. Then either S is a name and there are no triples of the form S P O' in
E, or O is a name and there are no triples of the form S' P O in E.
Consider the first case (the other case is similar); then by the
construction in the earlier proof, IEXT(I(P)) contains no pairs of the form
<I(S), x>; so there is no mapping A from blank nodes to IR that could
make the triple true in I+A; so the triple is false in I. Similarly for the
other case. QED

Merging lemma.
The merge of a set S of RDF graphs is entailed by S,
and entails every member of S.

Proof. Obvious, from definitions of entailment and merge. All members of S are true iff
all triples in the merge of S are true. QED.

Proof. Since E' is a proper
instance and E is lean, E' contains a triple S' P'
O' obtained by replacing a blank node by a name in the triple S P O from E,
while S' P' O' has no instances in E; for if not, the triple S P O would
have had a proper instance in E. By the strong Herbrand lemma, there exists
an interpretation which satisfies E but not E'. So E does not entail E'.
QED

Proof. First we assume that the blank nodes occur in
two distinct triples in E. Suppose that E contains the triples

S1 P1 _:x1 .

S2 P2 _:x2 .

where E' contains the triples

S1 P1 _:x .

S2 P2 _:x .

(The arguments for the cases where the blank nodes occur in other
positions in the triples are similar.) Since E is lean, it contains no
other triples of the form S1 P1 O' or S2 P2 O'. Let I be a Herbrand
interpretation of E; then I(S1) is distinct from I(S2) and IEXT(I(P1))
={<I(S1), _:x1>}and IEXT(I(P2))={<I(S2), _:x2>}. Let A be any
mapping from the blank nodes of E' to IR, then in order for both triples to
be true in I+A, I+A(_:x) would have to equal both _:x1 and _:x2; but these
are distinct; so I does not satisfy E'.

The only remaining case is where E contains a single triple with two
blank nodes which are identified in E':

_:x1 P _:x2 .

where E' contains

_:x P _:x .

The argument here is similar; the Herbrand interpretation I now has
IEXT(I(P)) = {<_:x1,_:x2>} and there is no mapping from the second
triple that could satisfy this, so again I satisfies E but not E'.
QED.

Note that the 'minimal' nature of the Herbrand construction provides an
interpretation that is sufficient to make a graph true, but only just
sufficient. This is a basic technique for showing that one graph does not
entail another and for establishing a precise correspondence between
syntactic relationships and entailment.

Proof. 'if' is a direct consequence of the merging and
instance lemmas.

To prove 'only if' we will show the converse. This is just a
re-statement of the strong Herbrand lemma. Assume that no subgraph of the
merge of S is an instance of E, i.e. that all subgraphs of the merge of S
fail to be instances of E; i.e., that E is separable from the merge of S.
Then by the strong Herbrand lemma the merge of S does not entail E. So, by
the merging lemma, S does not entail E. QED.

Skolemization is a syntactic transformation routinely used in automatic
inference systems in which existential variables are replaced by 'new'
functions - function names not used elsewhere - applied to any enclosing
universal variables. While not itself strictly a valid operation,
skolemization adds no new content to an expression, in the sense that a
skolemized expression has the same entailments as the original expression
provided they do not contain the new skolem functions.

The following lemma shows that skolemization has the same properties in
RDF as it has in conventional logics. Intuitively, this lemma shows that
asserting a skolemization expresses a similar content to asserting the
original graph, in many respects. In effect, it simply gives 'arbitrary'
names to the anonymous entities whose existence was asserted by the use of
blank nodes. However, care is needed, since these 'arbitrary' names have the
same status as any other urirefs once published. Also, skolemization would
not be an appropriate operation when applied to anything other than the
antecendent of an entailment. A skolemization of a query would represent a
completely different query.

Now, suppose that sk(E) entails F where F shares no vocabulary with V;
and suppose I is some interpretation satisfying E. Then for some mapping A
from the blank nodes of E, I+A satisfies E. Define an interpretation I' of
the vocabulary of sk(E) by: IR'=IR, IEXT'=IEXT, I'(x)=I(x) for x in the
vocabulary of E, and I'(x)=I+A(y) for x in V, where y is the blank node in
E that is replaced by x in sk(E).Clearly I' satisfies sk(E), so I'
satisfies F. But I'(F)=I+A(F) since the vocabulary of F is disjoint from
that of V; so I satisfies F. So E entails F. QED.

RDF closure lemma. Any satisfying
rdf-interpretation of E satisfies the rdf-closure of E; and any minimal
simple satisfying interpretation of the rdf-closure of E is a satisfying
rdf-interpretation of E.

The first part follows from the fact that the closure rules are all
rdf-valid. To show this, suppose I is an rdf-interpretation; then for any
aaa in the vocabulary of I, if a triple of the form xxx aaa yyy is true in
I, then IEXT(I(aaa)) is nonempty then I(aaa) is in IP, so IEXT(I(rdf:type))
contains <I(aaa),I(rdf:Property)>, so the triple aaa rdf:type
rdf:Property is true in I. Since I is an rdf-interpretation, its vocabulary
contains rdf:type and IP contains I(rdf:type), so in particular the
triple

rdf:type rdf:type rdf:Property .

is true in I. That establishes that the closure rules are rdf-valid.

To prove the other part of the lemma we must show that the closure rules
are together sufficient to force any minimal interpretation to be an
rdf-interpretation of E. The simplest way to argue this is to show the
converse, viz. that any minimal simple interpretation of the rdf-closure
that violates one of the semantic conditions for an rdf-interpretation of E
would thereby fail to satisfy the closure. Suppose therefore that I is a
minimal simple interpretation of the rdf-closure of E.

If I violates the first constraint then IP does not contain I(rdf:type);
in that case, the added triple in the first closure rule is false in I. So
assume that I violates the second constraint. Then there is some x in IP
for which IEXT(I(rdf:type)) does not contain <x,I(rdf:Property)>.
Since I is minimal, there is some node aaa in E with I(aaa)=x; and since
I(aaa) is in IP, there is a pair <y,z> in IEXT(I(aaa)) and a
triple

bbb aaa ccc .

in E with I(bbb)=y and I(ccc)=z. Then the closure of E contains the
triple

aaa rdf:type rdf:Property .

which is false in I. So I fails to satisfy the rdf-closure.
QED.

Notice the need for the minimality assumption, which
'forces' the semantic violation to be made explicit in the syntax of the
graph itself. The second part of the lemma could be false for an arbitrary
simple interpretation of the closure, which might fail to meet the required
semantic conditions on some part of the universe that was not referred to in
the graph itself. In general, one cannot infer, from the lack of an assertion
in a graph, that what that assertion would say if it were in the graph must
be false in a satisfying interpretation of the graph. Minimal
interpretations, however, embody a 'closed
world assumption' which would sanction such an inference. To prove an
entailment we need to prove something about all interpretations; but to prove
the converse, it is enough to show that a single interpretation exists
with the right properties, and this is where the special properties of
minimal interpretations are useful.

RDF entailment
lemma. S rdf-entails E if and only if the rdf-closure of the merge
of S simply entails E.

Proof. Follows from the merging lemma, the RDF closure
lemma and the definition of entailment. By the merging lemma, we can
identify S with the merge of S, i.e. we can treat a set of graphs as a
single graph MS.

So suppose that MS rdf-entails E, and let I be a simple interpretation
of the rdf-closure c(MS) of MS. Then there is a minimal simple
subinterpretation I' of I which satisfies c(MS); so, by the previous lemma,
I' is a satisfying rdf-interpretation of E. Therefore I satisfies E, since
I' is a subinterpretation of I.

Conversely, suppose that c(MS) simply entails E, and let I be an
rdf-interpretation of MS; then by the previous lemma, I satisfies c(MS), so
I satisfies E (since every rdf-interpretation is a simple interpretation).
QED.

RDFS Closure
Lemma. Any satisfying rdfs-interpretation of E satisfies the
rdfs-closure of E; and any minimal simple satisfying interpretation of the
rdf-closure of E is a satisfying rdfs-interpretation of E.

The first part follows from the fact that the rdfs closure rules are all
rdfs-valid, which can be checked case by case. For example, consider the
closure rule rdfs5, and suppose I is an rdfs-interpretation which
satisfies

aaa rdfs:subPropertyOf bbb .

bbb rdfs:subPropertyOf ccc .

Then by the semantic conditions on an rdfs-interpretation, IEXT(I(aaa))
is a subset of IEXT(I(bbb)) and IEXT(I(bbb)) is a subset of IEXT(I(ccc));
so IEXT(I(aaa)) is a subset of IEXT(I(ccc)); so I satisfies

aaa rdfs:subPropertyOf ccc .

The other cases are similarly straightforward.

To demonstrate the other part of the lemma we must show that the closure
rules are together sufficient to restrict any minimal interpretation to be
an rdfs-interpretation. The simplest way to argue this is to show the
converse, by demonstrating that any minimal simple interpretation of the
rdfs-closure c(E) of E that violates one of the semantic conditions for an
rdfs-interpretation of E would thereby make some triple in c(E) false.
Again, this can be checked by a detailed examination of the cases that
arise.

Suppose that I is a minimal simple interpretation of E. If I violates
any of the conditions involving the rdfs-vocabulary then it is easy to
check that one of the 'added' triples would be false, eg if
IEXT(I(rdfs:domain) does not contain <I(rdfs:domain),I(rdf:Property)>
then the triple

rdfs:domain rdfs:domain rdf:Property .

is false in I.

If I violates the condition on IEXT(I(rdfs:range)), then
there exist x, y, u and v in IR with <x,y> in
IEXT(I(rdfs:range)), <u,v> in IEXT(x) but v not in
ICEXT(y). Since I is a minimal interpretation and satisfies c(E), the
closure must contain two triples

aaa rdfs:range bbb .

ccc aaa ddd .

where I(aaa)=x, I(bbb)=y, I(ccc)=u and I(ddd)=v; but I makes the
triple

ddd rdf:type bbb .

false, since I(ddd) is not in ICEXT(I(bbb)); but by the closure rule
rdfs3, this triple is in c(E); so I fails to satisfy c(E). The
IEXT(I(rdfs:domain)) case is similar.

Finally, suppose that I violates the condition on
rdfs:subClassOf. Then for some x and y in IR, <x,y> is
in IEXT(I(rdfs:subClassOf)) but there is some z in ICEXT(x)
but not in ICEXT(y). Again, since I is minimal, these entities must occur
in triples in c(E) of the form respectively

aaa rdfs:subClassOf bbb .

ccc rdf:type aaa .

where I(aaa)=x, I(bbb)=y and I(ccc)=z, and where the triple

ccc rdf:type bbb .

is false in I; so I does not satisfy c(E) by a similar argument, using
the closure rule rdfs9. Again, the case for a violation of the the
condition on rdfs:subPropertyOf is similar.

There is one final complication. The semantic conditions for
rdfs:subClassOf require that if the class extension of aaa is a subset of
that of bbb, then aaa rdfs:SubClassOf bbb is true. This condition however
is not

The use of an explicit extension mapping to allow self-application without
violating the axiom of foundation was suggested by Chris Menzel. Peter
Patel-Schneider found several errors in earlier drafts, and suggested several
important technical improvements.