This document, developed by the Rule
Interchange Format (RIF) Working Group, defines a general RIF
Framework for Logic Dialects (RIF-FLD). The framework describes
mechanisms for specifying the syntax and semantics of logic RIF
dialects through a number of generic concepts such as signatures,
symbol spaces, semantic structures, and so on. The actual dialects
are required to specialize this framework to produce their syntaxes
and semantics.

May Be Superseded

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

No Endorsement

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

1 Overview of RIF-FLD

The RIF Framework for Logic Dialects (RIF-FLD) is a formalism
for specifying all logic dialects of RIF, including the RIF Basic
Logic Dialect [RIF-BLD]. It is
a logic in which both syntax and semantics are described through a
number of mechanisms that are commonly used for various logic
languages, but are rarely brought all together. Amalgamation of
several different mechanisms is required because the framework must
be broad enough to accommodate several different types of logic
languages and because various advanced mechanisms are needed to
facilitate translation into a common framework. RIF-FLD gives
precise definitions to these mechanisms, but allows certain details
to vary. The design of RIF envisages that future standard logic
dialects will be based on RIF-FLD. Therefore, any logic dialect
being developed to become a standard should either be a
specialization of FLD or justify its deviations from (or extensions
to) FLD.

The framework described in this document is very general and
captures most of the popular logic rule languages found in
Databases, Logic Programming, and on the Semantic Web. However, it
is anticipated that the needs of future dialects might stimulate
further evolution of RIF-FLD. In particular, future extensions
might include a logic rendering of actions as found in production
and reactive rule languages. This would support Semantic Web
services languages such as [SWSL-Rules] and [WSML-Rules].

This document is mostly intended for the designers of
future RIF dialects. All logic RIF dialects are required to be
derived from RIF-FLD by specialization, as explained
in Sections Syntax of a RIF
Dialect as a Specialization of RIF-FLD and Semantics of a RIF Dialect as
a Specialization of RIF-FLD. In addition to specialization, to
lower the barrier of entry for their intended audiences, a dialect
designer may choose to specify the syntax and semantics in a
direct, but equivalent, way, which does not require familiarity
with RIF-FLD. For instance, the RIF Basic Logic Dialect [RIF-BLD] is specified by specialization
from RIF-FLD and also directly, without relying on the framework.
Thus, the reader who is interested in RIF-BLD only can proceed
directly to that document.

RIF-FLD has the following main components:

Syntactic framework. This framework defines the
mechanisms for specifying the formal presentation syntax of RIF
logic dialects by specializing the presentation syntax of the
framework. The presentation syntax is used in RIF to define the
semantics of the dialects and to illustrate the main ideas with
examples. This syntax is not intended to be a concrete
syntax for the dialects; it leaves out details such as the
delimiters of the various syntactic components, parenthesizing,
precedence of operators, and the like. Since RIF is an interchange
format, it uses XML as its concrete syntax.

Semantic framework. The semantic framework describes the
mechanisms that are used for specifying the models of RIF logic
dialects.

XML serialization framework. This framework defines the
general principles that logic dialects are to use in specifying
their concrete XML-based syntaxes. For each dialect, its concrete
XML syntax is a derivative of the dialect's presentation syntax. It
can be seen as a serialization of that syntax.

Syntactic framework. The syntactic framework defines the
following types of RIF terms:

Constants and variables. These terms are common to most
logic languages.

Positional terms. These terms are commonly used in
first-order logic. RIF-FLD defines positional terms in a slightly
more general way in order to enable dialects with higher-order
syntax, such as HiLog [CKW93].

Terms with named arguments. These are like positional
terms except that each argument of a term is named and the order of
the arguments is immaterial. Terms with named arguments generalize
the notion of rows in relational tables, where column headings
correspond to argument names.

Frames. A frame term represents an assertion about an
object and its properties. These terms correspond to molecules of
F-logic [KLW95]. There is
syntactic similarity between terms with named arguments and frames,
since object properties resemble named arguments. However, the
semantics of these terms are different.

Classification. These terms are used to define the
subclass and class membership relationships. Like frames, they are
also borrowed from F-logic [KLW95].

Equality. These terms are used to equate other
terms.

Terms are then used to define several types of RIF-FLD
formulas. RIF dialects can choose to permit all or some of
the aforesaid categories of terms. The syntactic framework also
defines the following specialization mechanisms:

Symbol spaces.

Symbol spaces partition the set of non-logical symbols that
correspond to individual constants, predicates, and functions, and
each partition is then given its own semantics. A symbol space has
an identifier and a lexical space, which defines the "shape"
of the symbols in that symbol space. Some symbol spaces in RIF are
used to identify Web entities and their lexical space consists of
strings that syntactically look like internationalized resource
identifiers [RFC-3987], or
IRIs (e.g., http://www.w3.org/2007/rif#iri). Other symbol spaces are
used to represent the datatypes required by RIF (for example,
http://www.w3.org/2001/XMLSchema#integer).

Signatures.

Signatures determine which terms and formulas are
well-formed. The notion of a signature generalizes the notion of a
sort in classical first-order logic [Enderton01]. Each nonlogical symbol
(and some logical symbols, like =) has an associated
signature. A signature defines, in a precise way, the syntactic
contexts in which the symbol is allowed to occur.

For instance, the signature associated with a symbol p
might allow p to appear in a term of the form
f(p), but disallow it to occur in a term like
p(a,b). The signature for f, on the other hand,
might allow that symbol to appear in f(p) and
f(p,q), but disallow f(p,q,r) and f(f).
In this way, it is possible to control which symbols are used for
predicates and which for functions, where variables can occur, and
so on.

Restriction.

A dialect might impose further restrictions on the form of a
particular kind of terms or formulas.

Semantic framework. This framework defines the notion of
a semantic structure (also known as an interpretation in
the literature [Enderton01,
Mendelson97]). Semantic
structures are used to interpret formulas and to define logical
entailment. As with the syntax, this framework includes a
number of mechanisms that RIF logic dialects can specialize to suit
their needs. These mechanisms include:

Truth values. RIF-FLD is designed to accommodate
dialects that support reasoning with inconsistent and uncertain
information. Most of the logics that are designed to deal with
these situations are multi-valued. Consequently, RIF-FLD postulates
that there is a set of truth values, TV, which
includes the values t (true) and f
(false) and possibly others. For example, RIF Basic Logic Dialect
[RIF-BLD] is two-valued, but
other dialects can have additional truth values.

Datatypes. Some symbol spaces that are part of the RIF
syntactic framework have fixed interpretations. For instance,
symbols in the symbol space http://www.w3.org/2001/XMLSchema#string are always
interpreted as sequences of Unicode characters, and a ≠
b for any pair of distinct symbols. A symbol space whose
symbols have a fixed interpretation in any semantic structure is
called a datatype.

Entailment. This notion is fundamental to logic-based
dialects. Given a set of formulas (e.g., facts and rules)
G, entailment determines which other formulas necessarily
follow from G. Entailment is the main mechanism underlying
query answering in databases, logic programming, and the various
reasoning tasks in Description Logics.

A set of formulas G logically entails another formula
g if for every semantic structure I in some
set S, if G is true in I then
g is also true in I. Almost all logics
define entailment this way. The difference lies in which set
S they use. For instance, logics that are based on the
classical first-order predicate calculus, such as most Description
Logics, assume that S is the set of all semantic
structures. In contrast, most logic programming languages use
default negation. Accordingly, the set S contains
only the so-called "minimal" Herbrand models of G and,
furthermore, only the minimal models of a special kind. See
[Shoham87] for a more
detailed exposition of this subject.
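
To illustrate how the choice of the set S changes what is entailed, the following Python sketch checks entailment by brute force over a tiny two-valued propositional language. It is not part of RIF-FLD: the atoms p and q, the function names, and the use of subset-minimal models as a stand-in for the "minimal models of a special kind" are all assumptions made for this example.

```python
from itertools import product

ATOMS = ("p", "q")  # hypothetical propositional atoms


def all_structures():
    """Every two-valued structure over ATOMS, as a dict atom -> bool."""
    return [dict(zip(ATOMS, values))
            for values in product([False, True], repeat=len(ATOMS))]


def entails(premises, conclusion, structures):
    """G entails g w.r.t. S: g is true in every structure of S where G is true."""
    return all(conclusion(i) for i in structures
               if all(f(i) for f in premises))


def minimal_models(premises):
    """Models of the premises whose set of true atoms is subset-minimal."""
    models = [i for i in all_structures() if all(f(i) for f in premises)]

    def true_atoms(i):
        return {a for a in ATOMS if i[a]}

    return [m for m in models
            if not any(true_atoms(n) < true_atoms(m) for n in models)]


G = [lambda i: i["p"] or i["q"]]        # the single premise: p or q
g = lambda i: not (i["p"] and i["q"])   # candidate conclusion: not both

classical = entails(G, g, all_structures())   # S = all structures
minimal = entails(G, g, minimal_models(G))    # S = minimal models of G only
```

With S taken to be all structures, the structure that makes both p and q true is a counterexample, so g is not entailed. Restricting S to the minimal models removes that structure, and g then follows.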

XML serialization framework. This framework defines the
general principles for mapping the presentation syntax of RIF-FLD
to the concrete XML interchange format. This includes:

A specification of the XML syntax for RIF-FLD, including the
associated XML Schema document.

A specification of a one-to-one mapping from the presentation
syntax of RIF-FLD to its XML syntax. This mapping must map any
well-formed group formula of RIF-FLD to an XML document that is
valid with respect to the aforesaid XML Schema document.

This document is the latest draft of the RIF-FLD specification.
Each RIF dialect that is derived from RIF-FLD will be described in
its own document. The first of such dialects, RIF Basic Logic
Dialect, is described in [RIF-BLD].

2 Syntactic Framework

The next subsection explains how to derive the presentation
syntax of a RIF dialect from the presentation syntax of the RIF
framework. The actual syntax of the RIF framework is given in
subsequent subsections.

2.1 Syntax of a RIF Dialect as a
Specialization of RIF-FLD

The presentation syntax for a RIF dialect can be
obtained from the general syntactic framework of RIF by
specializing the following parameters, which are defined later in
this document:

The alphabet of RIF-FLD can be restricted (by omitting
symbols).

An assignment of signatures to each constant and
variable symbol.

Signatures determine which terms in the dialect are well-formed
and which are not.

The exact way signatures are assigned depends on the dialect. An
assignment can be explicit or implicit (for instance, derived from
the context in which each symbol is used).

The choice of the types of terms supported by the
dialect.

The RIF logic framework introduces the following types of
terms:

constant

variable

positional

with named arguments

equality

frame

class membership

subclass

external

A dialect might support all of these terms or just a subset. For
instance, some dialects might not support terms with named
arguments or frame terms or certain forms of external terms (e.g.,
external frames).

The choice of symbol spaces supported by the dialect.

Symbol spaces determine the syntax of the constant symbols that
are allowed in the dialect.

The choice of the formulas supported by the dialect.

RIF-FLD allows formulas of the following kinds:

Atomic

Conjunction

Disjunction

Classical negation

Default negation (as in logic programming)

Rule (as in logic programming as opposed to the classical
material implication)

Quantification (universal and existential)

A dialect might support all of these formulas or it might impose
various restrictions. For instance, the formulas allowed in the
conclusion and/or premises of implications might be restricted
(e.g., [RIF-BLD] essentially
allows Horn rules only), certain types of quantification might be
prohibited (e.g., [RIF-BLD]
disallows existential quantification in the rule head), classical
or default negation (or both) might not be allowed (as in RIF-BLD),
etc. A subdialect of RIF-BLD might disallow equality formulas in
the conclusions of the rules.

Note that although the presentation syntax of a RIF logic
dialect is normative, since semantics is defined in terms of that
syntax, the presentation syntax is not intended as a concrete
syntax, and conformant
systems are not required to implement it.

2.2 Alphabet

Definition
(Alphabet). The alphabet of the presentation
language of RIF-FLD consists of

a countably infinite set of constant symbols Const

a countably infinite set of variable symbols Var (disjoint from Const)

a countably infinite set of argument names ArgNames (disjoint from both Const and
Var)

connective symbols And, Or, Naf,
Neg, and :-

quantifiers Exists and Forall

the symbols =, #, ##,
->, External, Dialect, Base,
Prefix, and Import

the symbols Group and Document

auxiliary symbols (, ), [,
], <, >, and ^^.

The set of connective symbols, quantifiers, =, etc., is
disjoint from Const and Var. Variables are
written as Unicode strings preceded by the symbol "?". The
argument names in ArgNames are written as Unicode strings
that do not start with a "?". The syntax for constant
symbols is given in Section Symbol Spaces.

The symbols =, #, and ## are used in
formulas that define equality, class membership, and subclass
relationships. The symbol -> is used in terms that have
named arguments and in frame terms. The symbol External
indicates that an atomic formula or a function term is defined
externally (e.g., a builtin), Dialect is a directive used
to indicate the dialect of a RIF document (for those dialects that
require this), the symbols Base and Prefix enable
abridged representations of IRIs, and the symbol Import is
an import directive.

Finally, the symbol Document is used for specifying
RIF-FLD documents and the symbol Group is used to organize
RIF-FLD formulas into collections. ☐

2.3 Symbol Spaces

Throughout this document, we will be using the following
abbreviations:

xs: stands for the XML Schema URI
http://www.w3.org/2001/XMLSchema#

rdf: stands for
http://www.w3.org/1999/02/22-rdf-syntax-ns#

pred: stands for
http://www.w3.org/2007/rif-builtin-predicates#

rif: stands for the URI of RIF,
http://www.w3.org/2007/rif#

These and other abbreviations will be used as prefixes in the
compact URI notation [CURIE], a notation for succinct representation of IRIs. The
precise meaning of this notation in RIF is defined in [RIF-DTB].

The set of all constant symbols in a RIF dialect is partitioned
into a number of subsets, called symbol spaces, which are
used to represent XML Schema datatypes, datatypes defined in other
W3C specifications, such as rdf:XMLLiteral, and to distinguish other sets of
constants. All constant symbols have a syntax (and sometimes also
semantics) imposed by the symbol space to which they belong.

Definition
(Symbol space). A symbol space is a named subset
of the set of all constants, Const. The semantic aspects
of symbol spaces will be described in Section Semantic Framework. Each
symbol in Const belongs to exactly one symbol space.

Each symbol space has an associated lexical space and a unique
identifier. More precisely,

The lexical space of a symbol space is a
non-empty set of Unicode character strings.

The identifier of a symbol space is a sequence of
Unicode characters that form an absolute IRI.

Different symbol spaces cannot share the same identifier.

The identifiers for symbol spaces are not themselves
constant symbols in RIF. ☐

To simplify the language, we will often use symbol space
identifiers to refer to the actual symbol spaces (for instance, we
may use "symbol space xs:string" instead of "symbol space
identified by xs:string").

To refer to a constant in a particular RIF symbol space, we use
the following presentation syntax:

"literal"^^symspace

where literal is called the lexical part
of the symbol, and symspace is an identifier of the symbol
space. Here literal is a sequence of Unicode characters
that must be an element in the lexical space of the symbol
space symspace. For instance, "1.2"^^xs:decimal
and "1"^^xs:decimal are syntactically valid constants
because 1.2 and 1 are members of the lexical space of the XML
Schema datatype xs:decimal. On the other hand,
"a+2"^^xs:decimal is not a syntactically valid symbol,
since a+2 is not part of the lexical space of
xs:decimal.
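
The check described above can be sketched in Python as follows. This is a minimal illustration, not a RIF parser: the table LEXICAL_SPACES and the function parse_constant are hypothetical names, only two symbol spaces are covered, the regular expressions are simplified approximations of the XML Schema lexical spaces, and escaping of quotes inside the literal is ignored.

```python
import re

# Hypothetical table: symbol space identifier -> regex for its lexical space.
LEXICAL_SPACES = {
    "xs:decimal": re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)"),
    "xs:integer": re.compile(r"[+-]?[0-9]+"),
}

# Presentation syntax of a constant: "literal"^^symspace
CONSTANT = re.compile(r'"(?P<literal>.*)"\^\^(?P<symspace>\S+)', re.DOTALL)


def parse_constant(text):
    """Return (literal, symspace) if text is a valid constant, else None.

    Symbol spaces missing from LEXICAL_SPACES are rejected, which is a
    simplification of this sketch only.
    """
    m = CONSTANT.fullmatch(text)
    if m is None:
        return None
    literal, symspace = m.group("literal"), m.group("symspace")
    lexical_space = LEXICAL_SPACES.get(symspace)
    if lexical_space is None or lexical_space.fullmatch(literal) is None:
        return None
    return literal, symspace


ok_1 = parse_constant('"1.2"^^xs:decimal')
ok_2 = parse_constant('"1"^^xs:decimal')
bad = parse_constant('"a+2"^^xs:decimal')
```

Here ok_1 and ok_2 are accepted because their lexical parts belong to the lexical space of xs:decimal, while bad is rejected because a+2 does not.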

The set of all symbol spaces that partition Const is
considered to be part of the logic language of RIF-FLD.

RIF requires that all dialects include the symbol spaces listed
and described in Section Constants and
Symbol Spaces of [RIF-DTB]
as part of their language. These symbol spaces include constants
that belong to several important XML Schema datatypes, certain RDF
datatypes, and constant symbols specific to RIF. The latter include
the symbol spaces rif:iri and rif:local, which are used to represent
internationalized resource identifiers (IRIs) and constant symbols
that are not visible outside of the RIF document in which they
occur, respectively. Documents that are exchanged through RIF can
use additional symbol spaces.

2.4 Terms

The most basic construct of a logic language is a term.
RIF-FLD supports several kinds of terms: constants, variables, the
regular positional terms, plus terms with named
arguments, equality, classification terms, and
frames. The word "term" will be used to refer to any
kind of term.

Definition
(Term). A term is a statement of one of the
following forms:

Constants and variables. If t ∈ Const
or t ∈ Var then t is a simple
term.

Positional terms. If t and
t1, ..., tn are terms then
t(t1 ... tn) is a positional
term.

Positional terms in RIF-FLD generalize the regular notion of a
term used in first-order logic. For instance, the above definition
allows variables everywhere, as in ?X(?Y ?Z(?V
"12"^^xs:integer)), where ?X, ?Y,
?Z, and ?V are variables. Even
?X("abc"^^xs:string ?W)(?Y ?Z(?V
"33"^^xs:integer)) is a positional term (as in HiLog [CKW93]).

Terms with named arguments. A term with named
arguments is of the form
t(s1->v1 ...
sn->vn), where t,
v1, ..., vn are terms, and
s1, ..., sn are (not
necessarily distinct) symbols from the set ArgNames.

The term t here represents a predicate or a function;
s1, ..., sn represent
argument names; and v1, ...,
vn represent argument values. Terms with named
arguments are like regular positional terms except that the
arguments are named and their order is immaterial. Note that a term
with no arguments, like f(), is, trivially, both a positional term
and a term with named arguments.

For instance, "person"^^xs:string(name->?Y
address->?Z), ?X("123"^^xs:integer ?W)(arg->?Y
arg2->?Z(?V)), and
"Closure"^^rif:local(relation->"http://example.com/Flight"^^rif:iri)(from->?X
to->?Y) are terms with named arguments. The second of these
terms has a positional term ?X("123"^^xs:integer ?W), which occurs in
the position of a function, and the third term's function is
represented by a named arguments term.
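
One way to picture "named and unordered" arguments is as a multiset of name/value pairs. The Python sketch below is only an illustration under that assumption: the helper named_arg_term is hypothetical, terms are encoded as plain strings, and repeated pairs are kept as significant because argument names need not be distinct.

```python
from collections import Counter


def named_arg_term(head, *pairs):
    """Encode head(name->value ...) as (head, multiset of (name, value) pairs)."""
    return (head, Counter(pairs))


person = '"person"^^xs:string'

t1 = named_arg_term(person, ("name", "?Y"), ("address", "?Z"))
t2 = named_arg_term(person, ("address", "?Z"), ("name", "?Y"))  # reordered
t3 = named_arg_term(person, ("name", "?Y"), ("name", "?Y"))     # repeated pair
t4 = named_arg_term(person, ("name", "?Y"))

same_up_to_order = (t1 == t2)
repeats_matter = (t3 != t4)
```

A list or tuple of pairs would wrongly distinguish t1 from t2, and a set would wrongly identify t3 with t4, which is why a multiset is used in this sketch.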

Equality terms. An equality term has the
form t = s, where t and s are
terms.

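Classification terms. There are two kinds of
classification terms: class membership terms (or just
membership terms) and subclass terms.

t#s is a membership term if t and s are terms.

t##s is a subclass term if t and s are terms.

Frame terms. t[p1->v1 ... pn->vn] is a frame term (or
simply a frame) if t, p1, ..., pn, v1, ..., vn, n ≥ 0, are
terms.

Frame terms are used to describe properties of objects. As in
the case of the terms with named arguments, the order of the
properties pi->vi in a frame is
immaterial.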

Externally defined terms. If t is a constant,
positional term, a term with named arguments, or a frame term then
External(t) is an externally defined term.

Such terms are used for representing builtin functions and
predicates as well as "procedurally attached" terms or predicates,
which might exist in various rule-based systems, but are not
specified by RIF.

This syntax enables very flexible representations for externally
defined information sources: not only predicates and functions, but
also frames can be used. In this way, external sources can be
modeled as frames in an object-oriented way. For instance,
External("http://example.com/acme"^^rif:iri["http://example.com/mycompany/president"^^rif:iri(?Year)
-> ?Pres]) could be a representation for an external
method "http://example.com/mycompany/president"^^rif:iri
in an external object "http://example.com/acme"^^rif:iri.
☐

The above definitions are very general. They make no distinction
between constant symbols that represent individuals, predicates,
and function symbols. The same symbol can occur in multiple
contexts at the same time. For instance, if p, a,
and b are symbols then
p(p(a) p(a p b)) is a term. Even variables
and general terms are allowed to occur in the position of
predicates and function symbols, so
p(a)(?v(a c) p) is also a term.

Frame, classification, and other terms can be freely nested, as
exemplified by
p(?X q#r[p(1,2)->s](d->e f->g)).
Some language environments, like FLORA-2 [FL2], OO jDREW [OOjD], NxBRE [NxBRE], and
CycL [CycL] support fairly large
(partially overlapping) subsets of RIF-FLD terms, but most
languages support much smaller subsets. RIF dialects are expected
to carve out the appropriate subsets of RIF-FLD terms, and the
general form of the RIF logic framework allows a considerable
degree of freedom.

Observe that the argument names of frame terms,
p1, ..., pn, are terms and,
as a special case, can be variables. In contrast, terms with named
arguments can use only the symbols from ArgNames to
represent their argument names. They cannot be constants from
Const or variables from Var. The reason for this
restriction has to do with the complexity of unification, which is
an integral part of many inference rules underlying first-order logic.
We are not aware of any rule language where terms with named
arguments use anything more general than what is defined here.

Dialects can restrict the contexts in which the various terms
are allowed by using the mechanism of signatures. The RIF-FLD language associates a
signature with each symbol (both constant and variable symbols) and
uses signatures to define well-formed terms. Each RIF
dialect is expected to select appropriate signatures for the
symbols in its alphabet, and only the terms that are well-formed
according to the selected signatures are allowed in that particular
dialect.

2.5 Schemas for Externally Defined
Terms

This section introduces the notion of external schemas,
which serve as templates for externally defined terms. These
schemas determine which externally defined terms are acceptable in
a RIF dialect. Externally defined terms include RIF builtins, which
are specified in [RIF-DTB], but
are more general. They are designed to accommodate the ideas of
procedural attachments and querying of external data sources.
Because of the need to accommodate many different possibilities,
the RIF logical framework supports a very general notion of an
externally defined term. Such a term is not necessarily a function
or a predicate -- it can be a frame, a classification term, and so
on.

Definition (Schema for external term). An external
schema is a statement of the form (?X1
... ?Xn; τ) where

τ is a term of
one of these kinds: constant, positional, named-argument,
frame.

?X1 ... ?Xn is a list of
all distinct variables that occur in τ

The names of the variables in an external schema are immaterial,
but their order is important. For instance,
(?X ?Y; ?X[foo->?Y]) and
(?V ?W; ?V[foo->?W]) are considered to
be indistinguishable, but
(?X ?Y; ?X[foo->?Y]) and
(?Y ?X; ?X[foo->?Y]) are viewed as
different schemas.

A term t is an instance of an external
schema (?X1 ... ?Xn; τ)
iff t can be obtained from τ by a simultaneous
substitution ?X1/s1
... ?Xn/sn of the variables
?X1 ... ?Xn with terms
s1 ... sn, respectively. Some of the
terms si can be variables themselves. For
example, ?Z[foo->f(a ?P)] is an instance of
(?X ?Y; ?X[foo->?Y]) by the substitution
?X/?Z ?Y/f(a ?P). ☐
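
The instance relation can be sketched as one-way matching of a term against the template τ. The Python code below is an illustration under assumed encodings, none of which come from RIF: variables are strings starting with "?", other symbols are plain strings, a positional term f(a ?P) is the tuple ("pos", "f", "a", "?P"), and a single-property frame ?X[foo->?Y] is ("frame", "?X", ("foo", "?Y")). Frames with several properties, whose order is immaterial, are not handled.

```python
def is_var(t):
    """A variable is encoded as a string that starts with '?'."""
    return isinstance(t, str) and t.startswith("?")


def match(pattern, term, subst):
    """Extend subst so that pattern under subst equals term; None on failure."""
    if is_var(pattern):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(pattern, str) or isinstance(term, str):
        return subst if pattern == term else None
    if len(pattern) != len(term):
        return None
    for p, t in zip(pattern, term):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


def instance_substitution(schema, term):
    """Return the substitution showing term is an instance of schema, or None."""
    _variables, tau = schema
    return match(tau, term, {})


# The schema (?X ?Y; ?X[foo->?Y]) and the term ?Z[foo->f(a ?P)] from the text.
schema = (("?X", "?Y"), ("frame", "?X", ("foo", "?Y")))
term = ("frame", "?Z", ("foo", ("pos", "f", "a", "?P")))

subst = instance_substitution(schema, term)
not_instance = instance_substitution(schema, ("frame", "?Z", ("bar", "b")))
variable_case = instance_substitution(schema, "?W")
```

The result subst corresponds to the substitution ?X/?Z ?Y/f(a ?P). A bare variable such as ?W is never matched, because τ itself is never a variable.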

Observe that a variable cannot be an instance of an external
schema, since τ in the above definition cannot be a
variable. It will be seen later that this implies that a term of
the form External(?X) is not well-formed in RIF.

The intuition behind the notion of an external schema, such as
(?X ?Y; ?X["foo"^^xs:string->?Y]) or
(?V; "pred:isTime"^^rif:iri(?V)), is that
?X["foo"^^xs:string->?Y] or
"pred:isTime"^^rif:iri(?V) are invocation patterns for
querying external sources, and instances of those schemas
correspond to concrete invocations. Thus,
External("http://foo.bar.com"^^rif:iri["foo"^^xs:string->"123"^^xs:integer])
and External("pred:isTime"^^rif:iri("22:33:44"^^xs:time))
are examples of invocations of external terms -- one querying an
external source and another invoking a builtin.

Definition (Coherent set of external schemas). A set of
external schemas is coherent if there is no term,
t, that is an instance of two distinct schemas in the set.
☐

The intuition behind this notion is to ensure that any use of an
external term is associated with at most one external schema. This
assumption is relied upon in the definition of the semantics of
externally defined terms. Note that the coherence condition is easy
to verify syntactically and that it implies that schemas like
(?X ?Y; ?X[foo->?Y]) and
(?Y ?X; ?X[foo->?Y]), which differ only
in the order of their variables, cannot be in the same coherent
set.

It is important to keep in mind that external schemas are
not part of the language in RIF, since they do not appear
anywhere in RIF statements. Instead, like signatures, which are
defined below, they are best thought of as part of the grammar of
the language. In particular, they will be used to determine which
external terms, i.e., the terms of the form External(t),
are well-formed.

2.6 Signatures

In this section we introduce the concept of a signature,
which is a key mechanism that allows RIF-FLD to control the context
in which the various symbols are allowed to occur. For instance, a
symbol f with signature {(term term) => term,
(term) => term} can occur in terms like f(a b),
f(f(a b) a), f(f(a)), etc., if a
and b have signature term. But f is not
allowed to appear in the context f(a b a) because there is
no =>-expression in the signature of f to
support such a context.
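
The example above can be replayed with a small Python sketch. It is a simplification made for illustration: the signature name f_sig, the tables SIGNATURES and SYMBOLS, and the tuple encoding of terms are assumptions, each symbol has exactly one signature, and argument signatures must match exactly (the partial order on signature names is ignored).

```python
# The arrow expression (term term) => term is encoded as (("term", "term"), "term").
SIGNATURES = {
    "term": set(),  # no arrow expressions: cannot be applied to arguments
    "f_sig": {(("term", "term"), "term"), (("term",), "term")},  # hypothetical name
}

# Hypothetical assignment of signature names to symbols.
SYMBOLS = {"a": "term", "b": "term", "f": "f_sig"}


def signature_of(t):
    """Return the signature name of term t, or None if t is not well-formed.

    A symbol is a str; a positional term head(arg1 ... argn) is the tuple
    (head, arg1, ..., argn).
    """
    if isinstance(t, str):
        return SYMBOLS.get(t)
    head, *args = t
    head_sig = signature_of(head)
    arg_sigs = tuple(signature_of(a) for a in args)
    if head_sig is None or None in arg_sigs:
        return None
    # Look for an arrow expression whose argument signatures fit this context.
    for params, result in SIGNATURES[head_sig]:
        if params == arg_sigs:
            return result
    return None


ok_binary = signature_of(("f", "a", "b"))               # f(a b)
ok_nested = signature_of(("f", ("f", "a", "b"), "a"))   # f(f(a b) a)
ok_unary = signature_of(("f", ("f", "a")))              # f(f(a))
bad_arity = signature_of(("f", "a", "b", "a"))          # f(a b a)
```

The first three terms receive the signature term, while f(a b a) is rejected because no arrow expression of f takes three arguments.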

The above example provides intuition behind the use of
signatures in RIF-FLD. Much of the development, below, is inspired
by [CK95]. It should be
kept in mind that signatures are not part of the logic
language in RIF, since they do not appear anywhere in RIF-FLD
formulas. Instead they are part of the grammar: they are used to
determine which sequences of tokens are in the language and which
are not. The actual way by which signatures are assigned to the
symbols of the language may vary from dialect to dialect. In some
dialects (for example [RIF-BLD]), this assignment is derived from the context in
which each symbol occurs and no separate language for signatures is
used. Other dialects may choose to assign signatures explicitly. In
that case, they would require a concrete language for signatures
(which would be separate from the language for specifying the logic
formulas of the dialect).

Definition
(Signature name). Let SigNames be a non-empty,
partially-ordered finite or countably infinite set of symbols,
called signature names. Since signatures are not part
of the logic language, their names do not have to be disjoint from
Const, Var, and ArgNames. We require
that this set includes at least the following signature names:

atomic -- used to represent the syntactic context
where atomic formulas are allowed to appear.

= -- used for representing contexts where equality
terms can appear.

# -- a signature name reserved for membership
terms.

## -- a signature reserved for subclass terms.

-> -- a signature reserved for frame terms.
☐

Dialects may introduce additional signature names. For instance,
RIF Basic Logic Dialect [RIF-BLD] introduces one other signature name,
individual. The partial order on SigNames is
dialect-specific; it is used in the definition of well-formed terms
below.

We use the symbol < to represent the partial order
on SigNames. Informally, α < β means that
terms with signature α can be used wherever terms with
signature β are allowed. We will write α ≤ β if
either α = β or α < β.

Definition (Signature). A signature is a
statement of the form η{e1, ..., en,
...} where η ∈ SigNames is the name of the
signature and {e1, ..., en, ...} is
a countable set of arrow expressions. Such a set can thus be
infinite, finite, or even empty. In RIF-BLD, signatures can have at
most one arrow expression. Other dialects (such as HiLog [CKW93], for example) may require
polymorphic symbols and thus allow signatures with more than one
arrow expression in them.

For instance, (arg1->term arg2->term) => term
is an arrow signature expression with named arguments. The order of
the arguments in arrow expressions with named arguments is
immaterial, so any permutation of arguments yields the same
expression. ☐

RIF dialects are always associated with sets of coherent
signatures, defined next. The overall idea is that a coherent set
of signatures must include all the predefined signatures (such as
signatures for equality and classification terms) and the
signatures included in a coherent set should not conflict with each
other. For instance, two different signatures should not have
identical names and if one signature is said to extend another then
the arrow expressions of the supersignature should be included
among the arrow expressions of the subsignature (a kind of an arrow
expression "inheritance").

Definition (Coherent signature set). A set Σ of
signatures is coherent iff

Σ contains the special signature
atomic{ }, which represents the context of atomic
formulas.

Σ contains the signature ={e1, ...,
en, ...} for the equality symbol.

All arrow expressions ei here have the form
(κ κ) ⇒ γ (the arguments in an equation must be
compatible) and at least one of these expressions must have the
form (κ κ) ⇒ atomic (i.e., equation terms are also atomic
formulas). Dialects may further specialize this signature.

Σ contains the signature #{e1, ...,
en...}.

Here all arrow expressions ei are binary
(have two arguments) and at least one has the form (κ γ) ⇒
atomic. Dialects may further specialize this signature.

Σ contains the signature ##{e1, ...,
en...}.

Here all arrow expressions ei have the form
(κ κ) ⇒ γ (the arguments must be compatible) and at least
one of these arrow expressions has the form (κ κ) ⇒
atomic. Dialects may further specialize this signature.

Σ contains the signature ->{e1, ...,
en...}.

Here all arrow expressions ei are ternary
(have three arguments) and at least one of them is of the form
(κ1 κ2 κ3) ⇒ atomic.
Dialects may further specialize this signature.

Σ has at most one signature for any given signature
name.

Whenever Σ contains a pair of signatures, ηA
and κB, such that η<κ then
B⊆A.

Here ηA denotes a signature with the name η
and the associated set of arrow expressions A; similarly
κB is a signature named κ with the set of
expressions B. The requirement that B⊆A
ensures that symbols that have signature η can be used
wherever the symbols with signature κ are allowed.
☐

The requirement that coherent sets of signatures must include the
signatures for =, #, ->, and so on is
just a technicality needed to simplify the definitions. Some of
these signatures may go "unused" in a dialect even though,
technically speaking, they must be present in the signature set
associated with that dialect. If a dialect disallows equality,
classification terms, or frames in its syntax then the
corresponding signatures will remain unused. Such restrictions can
be imposed by specializing RIF-FLD -- see Section Syntax of a RIF Dialect as a
Specialization of RIF-FLD.

An incoherent set of signatures would be one that
includes signatures mysig{() ⇒ atomic} and
mysig{atomic ⇒ atomic} because it has two different
signatures with the same name. Likewise, if this set contains
mysig1{() ⇒ atomic} and
mysig2{atomic ⇒ atomic} and
mysig1 < mysig2 then it is
incoherent because the set of arrow expressions of
mysig1 does not contain the set of arrow
expressions of mysig2.

Each variable symbol is associated with exactly one
signature from a coherent set of signatures. A constant symbol can
have one or more signatures, and different symbols can be
associated with the same signature. (If variables were allowed to
have multiple signatures then well-formed terms would not be closed
under substitutions. For instance, a term like f(?X,?X)
could be well-formed, but f(a,a) could be ill-formed.)

Restrictions on the classes of terms allowed in the language of
the dialect.

Restrictions on the classes of formulas allowed in the language
of the dialect.

We have already seen how the alphabet and the symbol spaces are
used to define RIF terms. The
next section shows how signatures and external schemas are used to
further specialize this notion to define well-formed RIF-FLD
terms.

2.8 Well-formed Terms and
Formulas

Since signature names uniquely identify signatures in coherent
signature sets, we will often refer to signatures simply by their
names. For instance, if one of f's signatures is
atomic{ }, we may simply say that symbol f has signature atomic.

Definition (Well-formed term).

A constant or variable symbol with signature
η is a well-formed term with signature η.

A positional term t(t1 ...
tn), 0≤n, is well-formed and has a
signature σ iff

t is a well-formed term that has a signature that
contains an arrow expression of the form (σ1 ...
σn) ⇒ σ; and

Each ti is a well-formed term whose
signature is γi such that γi ≤
σi.

As a special case, when n=0 we obtain that
t( ) is a well-formed term with signature σ,
if t's signature contains the arrow expression () ⇒
σ.

A term with named arguments t(p1->t1 ...
pn->tn), 0≤n, is well-formed
and has a signature σ iff

t is a well-formed term that has a signature that
contains an arrow expression with named arguments of the form
(p1->σ1 ...
pn->σn) ⇒ σ; and

Each ti is a well-formed term whose
signature is γi, such that γi ≤
σi.

As a special case, when n=0 we obtain that
t( ) is a well-formed term with signature σ,
if t's signature contains the arrow expression () ⇒
σ.

An equality term of the form
t1=t2 is well-formed and has a
signature κ iff

The signature = has an arrow expression (σ σ) ⇒
κ

t1 and t2 are
well-formed terms with signatures γ1 and
γ2, respectively, such that γi ≤
σ, i=1,2.

A membership term of the form
t1#t2 is well-formed and has a
signature κ iff

The signature # has an arrow expression
(σ1 σ2) ⇒ κ

t1 and t2 are
well-formed terms with signatures γ1 and
γ2, respectively, such that γi ≤
σi, i=1,2.

A subclass term of the form
t1##t2 is well-formed and has a
signature κ iff

The signature ## has an arrow expression (σ σ) ⇒
κ

t1 and t2 are
well-formed terms with signatures γ1 and
γ2, respectively, such that γi ≤
σ, i=1,2.

A frame term of the form
t[s1->v1 ...
sn->vn] is well-formed and has a
signature κ iff

Note that, according to the definition of coherent sets of
schemas, a term can be an instance of at most one external schema.
☐

Note that, like constant symbols, well-formed terms can have
more than one signature. Also note that, according to the above
definition, f() and f are distinct terms.

Definition (Well-formed formula). A well-formed term is also
a well-formed atomic formula iff one of its
signatures is atomic or is ≤ atomic. Note
that equality, membership, subclass, and frame terms are atomic
formulas, since atomic is one of their signatures.

More general formulas are constructed out of atomic formulas
with the help of logical connectives. A well-formed
formula is a statement that can have one of the forms (1)
-- (9) below. The Group and Document formulas,
defined in (8) and (9), are aggregate formulas while the
formulas in (1) -- (7) are non-aggregate. This distinction
manifests itself in that Group and Document
cannot be part of non-aggregate formulas, and Document
cannot be part of a group.

Atomic: If φ is a well-formed atomic formula
then it is also a well-formed formula.

As a special case, Or() is treated as a contradiction,
i.e., a formula that is always false.

Classical negation: If φ is a non-aggregate
well-formed formula then Neg φ is also a well-formed
formula.

Default negation: If φ is a non-aggregate
well-formed formula then Naf φ is also a well-formed
formula.

Rule implication: If φ and ψ are
non-aggregate well-formed formulas then φ :- ψ is
also a well-formed formula.

Quantification: If φ is a non-aggregate
well-formed formula and ?V1, ...,
?Vn are variables then the following formulas
are also well-formed:

Exists ?V1
... ?Vn(φ)

Forall ?V1
... ?Vn(φ)

Group: If φ1, ...,
φn are well-formed non-Document
formulas then Group(φ1 ... φn) is a
well-formed RIF-FLD group formula (or simply a group
formula when context is clear).

Group formulas are intended to represent sets of formulas. Note
that some of the φi's can be group formulas
themselves, which means that groups can be nested.

Document: An expression of the form
Document(directive1 ...
directiven Γ) is a well-formed RIF-FLD
document formula (or simply a document formula if no
ambiguity arises), if

Γ is an optional well-formed group formula that
encompasses the logical content of the document.

directive1, ...,
directiven is an optional sequence of
directives. A directive can be a dialect directive, a
base directive, a prefix directive, or an import
directive.

A dialect directive has the form
Dialect(D), where D is a Unicode string that
specifies the name of a dialect. This directive specifies the
dialect of a RIF document. Some dialects may require this directive
in all of their documents, while others (notably, RIF-BLD) may not
allow it and instead may entirely rely on other syntax. (Purely
syntactic identification may not always be possible for dialects
that are syntactically identical but semantically different, such
as deductive databases with stable model semantics [GL88] and with well-founded
semantics [GRS91]. These two
dialects are examples where the Dialect directive might be
necessary.)

A base directive has the form Base(iri),
where iri is a Unicode string in the form of an IRI.

The Base directive does not affect the semantics. It
defines a syntactic shortcut for expanding relative IRIs into full
IRIs, as described in Section Constants and
Symbol Spaces of [RIF-DTB].

A prefix directive has the form Prefix(p
v), where p is an alphanumeric string that serves as
the prefix name and v is a macro-expansion for p
-- a string that forms an IRI.

Like the Base directive, the Prefix directives
do not affect the semantics of RIF documents. Instead, they define
shorthands to allow more concise representation of IRI constants.
This mechanism is explained in [RIF-DTB], Section Constants and
Symbol Spaces.

An import directive can have one of these two
forms: Import(t) or Import(t p). Here t
is an IRI constant and p is a term. The constant
t indicates the address of another document to be imported
and p is called the profile of import.

RIF-FLD defines the semantics for the directive
Import(t) only. The directive Import(t p) is
reserved for RIF dialects, which might use it to import non-RIF
logical entities, such as RDF data and OWL ontologies [RIF-RDF+OWL]. The profile might specify
what kind of entity is being imported and under what semantics (for
instance, the various RDF entailment regimes can be specified using
different profiles).

A document formula can contain at most one Dialect and
at most one Base directive. The Dialect
directive, if present, must be first, followed by an optional
Base directive, followed by any number of Prefix
directives, followed by any number of Import
directives.

In the definition of a formula, the component formulas
φ, φi, ψi, and
Γ are said to be subformulas of the respective
formulas (conjunction, disjunction, negation, implication, group,
etc.) that are built using these components. ☐
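
The ordering constraint on directives stated in the definition above (at most one Dialect, at most one Base, and the order Dialect, Base, Prefix, Import) can be sketched as a simple check. The representation of a directive as a (kind, argument) pair, the function name directives_well_ordered, and the example IRIs are assumptions made purely for illustration.

```python
# Hypothetical representation: each directive is a (kind, argument) pair.
ORDER = {"Dialect": 0, "Base": 1, "Prefix": 2, "Import": 3}

def directives_well_ordered(directives):
    """Check: at most one Dialect, at most one Base, and the required relative order."""
    kinds = [kind for kind, _ in directives]
    if any(kind not in ORDER for kind in kinds):
        return False  # not one of the four kinds of directive
    if kinds.count("Dialect") > 1 or kinds.count("Base") > 1:
        return False
    ranks = [ORDER[kind] for kind in kinds]
    return ranks == sorted(ranks)  # Dialect, then Base, then Prefixes, then Imports

good = [
    ("Dialect", "FLD"),
    ("Base", "http://example.org/base#"),
    ("Prefix", ("ex", "http://example.org/ns#")),
    ("Import", "http://example.org/other"),
]
bad = [("Import", "http://example.org/other"), ("Dialect", "FLD")]

print(directives_well_ordered(good))
print(directives_well_ordered(bad))
print(directives_well_ordered([]))  # all directives are optional
```

Only the sequence of directives is examined here; whether the arguments are legal IRIs or dialect names is outside the scope of the sketch.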

We illustrate the above definitions with the following examples.
In addition to atomic, let there be another signature,
term{ }, which is intended here to represent the
context of the arguments to positional terms or atomic
formulas.

Consider the term p(p(a) p(a b c)). If
p has the (polymorphic) signature
mysig{(term)⇒term, (term term)⇒term, (term term term)⇒term} and a, b,
c each has the signature term{ } then
p(p(a) p(a b c)) is a well-formed term with
signature term{ }. If instead p had the
signature mysig2{(term term)⇒term, (term term term)⇒term} then
p(p(a) p(a b c)) would not be a well-formed
term since then p(a) would not be well-formed (in this
case, p would have no arrow expression which allows
p to take just one argument).

For a more complex example, let r have the signature
mysig3{(term)⇒atomic, (atomic term)⇒term, (term term term)⇒term}. Then
r(r(a) r(a b c)) is well-formed. The
interesting twist here is that r(a) is an atomic formula
that occurs as an argument to a function symbol. However, this is
allowed by the arrow expression (atomic term)⇒
term, which is part of r's signature. If
r's signature were
mysig4{(term)⇒atomic, (atomic term)⇒atomic, (term term term)⇒term} instead, then
r(r(a) r(a b c)) would be not only a
well-formed term, but also a well-formed atomic formula.

An even more interesting example arises when the right-hand side
of an arrow expression is something other than term or
atomic. For instance, let John, Mary,
NewYork, and Boston have signatures
term{ }; flight and parent have
signature h2{(term term)⇒atomic}; and
closure has signature
hh1{(h2)⇒p2}, where
p2 is the name of the signature
p2{(term term)⇒atomic}. Then
flight(NewYork Boston),
closure(flight)(NewYork Boston),
parent(John Mary), and
closure(parent)(John Mary) would be well-formed
formulas. Such formulas are allowed in languages like HiLog
[CKW93], which support
predicate constructors like closure in the above example.
☐

2.9 Annotations in the Presentation
Syntax

RIF-FLD allows every term and formula (including terms and
formulas that occur inside other terms and formulas) to be
optionally preceded by an annotation of the form
(* id φ *) where id is a rif:iri
constant and φ is a RIF formula, which is not a
document-formula. Both items inside the annotation are optional.
The id part represents the identifier of the term (or
formula) to which the annotation is attached and φ is the
rest of the annotation. RIF-FLD does not impose any restrictions on
φ apart from what is stated above. This means that
φ may include variables, function symbols, rif:local constants, and so on.

Document formulas with and without annotations will be referred
to as RIF-FLD documents.

A convention is used to avoid a syntactic ambiguity in the above
definition. For instance, in (* id φ *) t[w -> v] the
annotation can be attributed to the term t or to the
entire frame t[w -> v]. Similarly, for an annotated
HiLog-like term of the form (* id φ *) f(a)(b,c), the
annotation can be attributed to the entire term f(a)(b,c)
or to just f(a). The convention adopted in RIF-FLD is that
any annotation is syntactically associated with the largest RIF-FLD
term or formula that appears to the right of that annotation.
Therefore, in our examples the annotation (* id φ *) is
considered to be attached to the entire frame t[w -> v]
and to the entire term f(a)(b,c).

Example 2 (A RIF-FLD document with nested groups and
annotations).

We illustrate formulas, including documents and groups, with the
following complete example (with apologies to Shakespeare for the
imperfect rendering of the intended meaning in logic). For better
readability, we use the shortcut notation defined in [RIF-DTB]. The example also illustrates
attachment of annotations.

Observe that the above set of formulas has a nested subset with
its own annotation, hamlet:facts, which contains only a
global IRI. ☐

2.10 EBNF Grammar for the Presentation
Syntax of RIF-FLD

Until now, to specify the syntax of RIF-FLD we relied on
"mathematical English," a special form of English for communicating
mathematical definitions, examples, etc. We will now specify the
syntax using the familiar EBNF notation. The following points about
the EBNF notation should be kept in mind:

The syntax of RIF-FLD relies on the signature mechanism and is
not context-free, so EBNF does not capture this syntax precisely.
As a result, the EBNF grammar defines a strict superset of
RIF-FLD (not all formulas that are derivable using the EBNF grammar
are well-formed).

The EBNF syntax is not a concrete syntax: it does not
address the details of how constants (defined in [RIF-DTB]) and variables are
represented, and it is not sufficiently precise about the
delimiters and escape symbols. White space is informally used as a
delimiter, and is implied in productions that use Kleene star. For
instance, TERM* is to be understood as
TERM TERM ... TERM, where each ' '
abstracts from one or more blanks, tabs, newlines, etc. This is
done intentionally since RIF's presentation syntax is used as a
tool for specifying the semantics and for illustration of the main
RIF concepts through examples.

In view of the above, the EBNF grammar can be viewed as just an
intermediary between the mathematical English and the XML. However,
it also gives a succinct overview of the syntax of RIF-FLD and as
such can be useful for dialect designers and users alike.

The RIF-FLD presentation syntax does not commit to any
particular vocabulary and permits arbitrary Unicode strings in
constant symbols, argument names, and variables. Constant symbols
can have this form: "UNICODESTRING"^^SYMSPACE, where
SYMSPACE is an ANGLEBRACKIRI or CURIE
that represents the identifier of the symbol space of the constant,
and UNICODESTRING is a Unicode string from the lexical
space of that symbol space. ANGLEBRACKIRI and
CURIE are defined in Section Shortcuts for Constants in RIF's Presentation Syntax of
[RIF-DTB]. Constant symbols can
also have several shortcut forms, which are represented by the
non-terminal CONSTSHORT. These shortcuts are also defined
in the same section of [RIF-DTB]. One of them is the CURIE shortcut, which
is used in the examples in this document. Names are Unicode
character sequences. Variables are composed of
UNICODESTRING symbols prefixed with a ?-sign.

RIF-FLD formulas and terms can be prefixed with optional
annotations, IRIMETA, for identification and metadata.
IRIMETA is represented using (*...*)-brackets
that contain an optional IRI constant as identifier followed by an
optional Frame or conjunction of Frames as
metadata. An IRICONST is the special case of a
Const with the symbol space rif:iri, again
permitting the shortcut forms defined in [RIF-DTB]. One such specialization is '"' IRI '"^^'
'rif:iri' from the Const production, where
IRI is a sequence of Unicode characters that forms an
internationalized resource identifier as defined by [RFC-3987].

3 Semantic Framework

Recall that the presentation syntax of RIF-FLD allows the use of
macros, which are specified via the Prefix and
Base directives, and various shortcuts for integers,
strings, and rif:local symbols. The semantics, below, is
described using the full syntax, i.e., we assume that all shortcuts
and macros have already been expanded, as defined in [RIF-DTB], Section Constants and
Symbol Spaces.

3.1 Semantics of a RIF Dialect as a
Specialization of RIF-FLD

The RIF-FLD semantic framework defines the notions of
semantic structures and of models for RIF-FLD
formulas. The semantics of a dialect is derived from
these notions by specializing the following parameters.

The effect of the syntax.

The syntax of a dialect may limit the kinds of terms that are
allowed. For instance, if a dialect's syntax excludes frames or
terms with named arguments then the parts of the semantic structures whose purpose
is to interpret those types of terms
(Iframe and
INF in this case) become
redundant.

Truth values.

The RIF-FLD semantic framework allows formulas to have truth
values from an arbitrary partially ordered set of truth values,
TV. A concrete dialect must select a concrete
partially or totally ordered set of truth values.

Datatypes.

A datatype is a symbol space whose symbols have a fixed
interpretation in any semantic structure. RIF-FLD defines a set of
core datatypes that each dialect is required to include as part of
its syntax and semantics. However, RIF-FLD does not limit dialects
to just the core types: they can introduce additional datatypes,
and each dialect must define the exact set of datatypes that it
includes.

Logical entailment.

Logical entailment in RIF-FLD is defined with respect to an
unspecified set of intended models. A RIF dialect must
define which models are considered to be intended. For instance,
one dialect might specify that all models are intended (which leads
to classical first-order entailment), another may consider only the
minimal models as intended, while a third one might only use
well-founded or stable models [GRS91, GL88].

These notions are defined in the remainder of this
specification.

3.2 Truth Values

Definition (Set of truth values). Each RIF dialect must
define the set of truth values, denoted by
TV. This set must have a partial order, called the
truth order, denoted <t. In some
dialects, <t can be a total order. We write a
≤t b if either a <t b
or a and b are the same element of TV.
In addition,

TV must be a complete lattice with respect to
<t, i.e., the least upper bound (lubt) and
the greatest lower bound (glbt) must exist for any
subset of TV.

TV is required to have two distinguished
elements, f and t, such that f ≤t elt and elt ≤t t for every
elt∈TV.

TV has an operator of negation,
~: TV → TV, such that

~ is self-inverse (an involution), i.e., applying ~ twice gives
the identity mapping.

~t = f (and thus
~f = t). ☐

RIF dialects can have additional truth values. For instance, the
semantics of some versions of NAF, such as well-founded
negation, requires three truth values: t, f, and
u (undefined), where f <t u
<t t. Handling of contradictions and
uncertainty usually requires at least four truth values: t,
u, f, and i (inconsistent). In this case, the
truth order is partial: f <t u
<t t and f <t i
<t t.
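
The four-valued case can be made concrete with a short sketch. The names TV, LESS, leq, lub, glb and NEG are hypothetical, and the choice to let negation leave u and i unchanged is an assumption made only for illustration; RIF-FLD requires only that negation swaps t and f and that applying it twice gives the identity.

```python
# Strict truth order, transitively closed: f is below u and i, which are below t.
TV = {"f", "u", "i", "t"}
LESS = {("f", "u"), ("f", "i"), ("u", "t"), ("i", "t"), ("f", "t")}

def leq(a, b):
    """The reflexive truth order: a equals b or a is strictly below b."""
    return a == b or (a, b) in LESS

def lub(values):
    """Least upper bound: the least element among all upper bounds of values."""
    uppers = [x for x in TV if all(leq(v, x) for v in values)]
    return next(x for x in uppers if all(leq(x, y) for y in uppers))

def glb(values):
    """Greatest lower bound: the greatest element among all lower bounds of values."""
    lowers = [x for x in TV if all(leq(x, v) for v in values)]
    return next(x for x in lowers if all(leq(y, x) for y in lowers))

# Assumed negation: swaps t and f; fixing u and i is an illustrative choice.
NEG = {"t": "f", "f": "t", "u": "u", "i": "i"}

print(lub(["u", "i"]))  # t, since u and i are incomparable
print(glb(["u", "i"]))  # f
print(lub([]), glb([]))  # f t: bounds of the empty set
```

Since u and i are incomparable, their least upper bound is t and their greatest lower bound is f, which is what makes this order a complete lattice without being total.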

3.3 Primitive Datatypes

Definition
(Primitive datatype). A primitive datatype (or
just a datatype, for short) is a symbol space that has

an associated set, called the value space,
and

a mapping from the lexical space of the symbol space to the
value space, called lexical-to-value-space mapping.
☐

Semantic structures are always defined with respect to a
particular set of datatypes, denoted by DTS. In a
concrete dialect, DTS always includes the datatypes
supported by that dialect. All RIF dialects must support the
primitive datatypes that are listed in Section Datatypes of
[RIF-DTB]. Their value spaces
and the lexical-to-value-space mappings for these datatypes are
described in the same section.

Although the lexical and the value spaces might sometimes look
similar, one should not confuse them. Lexical spaces define the
syntax of the constant symbols in the RIF language. Value spaces
define the meaning of the constants. The lexical and the
value spaces are often not even isomorphic. For example,
1.2^^xs:decimal and 1.20^^xs:decimal are two
legal -- and distinct -- constants in RIF because 1.2 and
1.20 belong to the lexical space of xs:decimal.
However, these two constants are interpreted by the same
element of the value space of the xs:decimal type.
Therefore, 1.2^^xs:decimal = 1.20^^xs:decimal is
a RIF tautology. Likewise, RIF semantics for datatypes implies
certain inequalities. For instance, abc^^xs:string ≠
abcd^^xs:string is a tautology, since the
lexical-to-value-space mapping of the xs:string type maps
these two constants into distinct elements in the value space of
xs:string.
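
The distinction between lexical and value spaces can be illustrated as follows. This is only a sketch: Python's Decimal and str types stand in for the lexical-to-value-space mappings of xs:decimal and xs:string and do not reproduce the exact lexical spaces defined in [RIF-DTB]; the names LEXICAL_TO_VALUE and value_of are hypothetical.

```python
from decimal import Decimal

# Hypothetical stand-ins for the lexical-to-value-space mappings of two datatypes.
LEXICAL_TO_VALUE = {
    "xs:decimal": Decimal,
    "xs:string": str,
}

def value_of(lexical, datatype):
    """Map a lexical form to its value under the given datatype."""
    return LEXICAL_TO_VALUE[datatype](lexical)

# The lexical forms are distinct strings, hence distinct constants ...
print("1.2" == "1.20")                                                   # False
# ... but both denote the same element of the value space.
print(value_of("1.2", "xs:decimal") == value_of("1.20", "xs:decimal"))   # True
# Distinct strings denote distinct values.
print(value_of("abc", "xs:string") != value_of("abcd", "xs:string"))     # True
```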

3.4 Semantic Structures

The central step in specifying a model-theoretic semantics for a
logic-based language is defining the notion of a semantic
structure. Semantic structures are used to assign truth values
to RIF-FLD formulas.

Definition (Semantic structure). A semantic
structure, I, is a tuple of the form
<TV, DTS, D,
IC, IV,
IF, Iframe,
INF, Isub,
Iisa, I=,
Iexternal,
Itruth>. Here D is a
non-empty set of elements called the domain of
I. We will continue to use Const to refer to
the set of all constant symbols and Var to refer to the
set of all variable symbols. TV denotes the set of
truth values that the semantic structure uses and DTS
is a set of identifiers for primitive datatypes.

The other components of I are total
mappings defined as follows:

IC maps Const to elements of
D.

This mapping interprets constant symbols.

IV maps Var to elements of
D.

This mapping interprets variable symbols.

IF maps D to functions
D* → D (here D* is a set
of all sequences of any finite length over the domain
D)

This mapping interprets positional terms.

INF interprets terms with named
arguments. It is a total mapping from D to the set of
total functions of the form
SetOfFiniteBags(ArgNames × D) →
D.

This is analogous to the interpretation of positional terms with
two differences:

Each pair <s,v> ∈ ArgNames ×
D represents an argument/value pair instead of just a
value in the case of a positional term.

The argument to a term with named arguments is a finite bag of
argument/value pairs rather than a finite ordered sequence of
simple elements.

Bags are used here because the order of the argument/value
pairs in a term with named arguments is immaterial and the pairs
may repeat: p(a->b a->b).

To see why such repetition can occur, note that argument names
may repeat: p(a->b a->c). This can be understood as
treating a as a set-valued argument. Identical
argument/value pairs can then arise as a result of a substitution.
For instance, p(a->?A a->?B) becomes p(a->b
a->b) if the variables ?A and ?B are both
instantiated with the symbol b.

Iframe is a total mapping from
D to total functions of the form
SetOfFiniteBags(D × D) →
D.

This mapping interprets frame terms. An argument, d ∈
D, to Iframe represents an
object and a finite bag {<a1,v1>, ...,
<ak,vk>} represents a bag (multiset) of
attribute-value pairs for d. We will see shortly how
Iframe is used to determine the truth
valuation of frame terms.

Bags are employed here because the order of the attribute/value
pairs in a frame is immaterial and the pairs may repeat. For
instance, o[a->b a->b]. Such repetitions arise
naturally when variables are instantiated with constants. For
instance, o[?A->?B ?C->?D] becomes
o[a->b a->b] if variables ?A and
?C are instantiated with the symbol a and
?B, ?D with b.
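
The role of bags can be seen in a short sketch that uses Python's Counter as a multiset. The function name instantiate and the encoding of a frame as a list of attribute/value pairs are assumptions made for illustration.

```python
from collections import Counter

def instantiate(pairs, substitution):
    """Apply a variable substitution to attribute/value pairs and return a bag."""
    def subst(symbol):
        return substitution.get(symbol, symbol)
    return Counter((subst(attr), subst(val)) for attr, val in pairs)

# The frame o[?A->?B ?C->?D] with ?A, ?C mapped to a and ?B, ?D mapped to b.
frame_pairs = [("?A", "?B"), ("?C", "?D")]
bag = instantiate(frame_pairs, {"?A": "a", "?C": "a", "?B": "b", "?D": "b"})

print(bag)                 # Counter({('a', 'b'): 2})
print(len(set(bag)))       # 1: a plain set would lose the repetition
print(Counter([("a", "b"), ("c", "d")]) == Counter([("c", "d"), ("a", "b")]))  # True
```

The last line shows the other property that bags provide: the order of the pairs does not matter.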

Isub gives meaning to the subclass
relationship. It is a total function D ×
D → D.

The operator ## is required to be transitive, i.e.,
c1 ## c2 and c2 ## c3 must
imply c1 ## c3. This is ensured by a restriction
in Section Interpretation of Formulas.

Iisa gives meaning to class
membership. It is a total function D ×
D → D.

The relationships # and ## are required to
have the usual property that all members of a subclass are also
members of the superclass, i.e., o # cl and
cl ## scl must imply o # scl.
This is ensured by a restriction in Section Interpretation of
Formulas.

For every external schema, σ, associated with the
language, Iexternal(σ) is assumed
to be specified externally in some document (hence the name
external schema). In particular, if σ is a schema
of a RIF builtin predicate or function,
Iexternal(σ) is specified in
[RIF-DTB] so that:

If σ is a schema of a builtin function then
Iexternal(σ) must be the function
defined in the aforesaid document.

If σ is a schema of a builtin predicate then
Itruth ο
(Iexternal(σ)) (the composition
of Itruth and
Iexternal(σ), a truth-valued
function) must be as specified in [RIF-DTB].

Note that, by definition, External(t) is well-formed
only if t is an instance of an external schema.
Furthermore, by the definition of coherent sets of external schemas, t
can be an instance of at most one such schema, so
I(External(t)) is well-defined.

The effect of signatures. For every signature,
sg, supported by a dialect, there is a subset
Dsg ⊆ D, called the
domain of the signature. Terms that have a given
signature, sg, must be mapped by I to
Dsg, and if a term has more than one
signature it must be mapped into the intersection of the
corresponding signature domains. To ensure this, the following is
required:

If sg < sg' then
Dsg⊆Dsg'.

If k is a constant that has signature sg then
IC(k) ∈
Dsg.

If ?v is a variable that has signature sg
then IV(?v) ∈
Dsg.

If sg has an arrow expression of the form (s1
... sn)⇒s then, for every
d∈Dsg,
IF(d) must map
Ds1× ... ×Dsn to
Ds.

If sg has an arrow expression of the form
(p1->s1 ... pn->sn)⇒s then, for
every d∈Dsg,
INF(d) must map the set
{<p1,Ds1>, ...,
<pn,Dsn>} to
Ds.

If the signature -> has arrow expressions
(sg,s1,r1)⇒k, ...,
(sg,sn,rn)⇒k, then, for every
d∈Dsg,
Iframe(d) must map
{<Ds1,Dr1>,
...,
<Dsn,Drn>}
to Dk.

If the signature # has an arrow expression (s
r)⇒k then Iisa must map
Ds×Dr to
Dk.

If the signature ## has an arrow expression (s
s)⇒k then Isub must map
Ds×Ds to
Dk.

If the signature = has an arrow expression (s
s)⇒k then I= must map
Ds×Ds to
Dk.

The effect of datatypes. The datatype identifiers
in DTS impose the following restrictions. If
dt ∈ DTS, let LSdt
denote the lexical space of dt,
VSdt denote its value space, and
Ldt: LSdt →
VSdt the lexical-to-value-space mapping.
Then the following must hold:

VSdt ⊆ D; and

For each constant lit^^dt such that lit ∈
LSdt,
IC(lit^^dt) =
Ldt(lit).

That is, IC must map the constants of a
datatype dt in accordance with
Ldt. ☐

RIF-FLD does not impose special requirements on
IC for constants in the symbol spaces that
do not correspond to the identifiers of the primitive datatypes in
DTS. Dialects may have such requirements, however. An
example of such a restriction could be a requirement that no
constant in a particular symbol space (such as rif:local) can be mapped to
VSdt of a datatype dt.

3.5 Annotations and the Formal
Semantics

RIF-FLD annotations are stripped before the mappings that
constitute RIF-FLD semantic structures are applied. Likewise, they
are stripped before applying the truth valuation,
TValI, defined in the next section. Thus,
identifiers and metadata have no effect on the formal
semantics.

Note that although annotations associated with RIF-FLD formulas
are ignored by the semantics, they can be extracted by XML tools.
Since annotations are represented by frame terms, they can be
reasoned with by the rules.

3.6 Interpretation of Non-document
Formulas

This section defines how a semantic structure, I,
determines the truth value TValI(φ) of a
RIF-FLD formula, φ, where φ is any formula other
than a document formula. Truth valuation of document formulas is
defined in the next section.

To this end, we define a mapping, TValI, from
the set of all non-document formulas to TV. Note that
the definition implies that TValI(φ) is
defined only if the set DTS of the datatypes
of I includes all the datatypes mentioned in
φ.

Definition
(Truth valuation). Truth valuation for
well-formed formulas in RIF-FLD is determined using the following
function, denoted TValI:

Note that, by definition, External(t) is well-formed
only if t is an instance of an external schema.
Furthermore, by the definition of coherent sets of external schemas, t
can be an instance of at most one such schema, so
I(External(t)) is well-defined.

Conjunction:
TValI(And(c1 ...
cn)) =
glbt(TValI(c1),
..., TValI(cn)).

The empty conjunction is treated as a tautology, so
TValI(And()) = t.

Disjunction:
TValI(Or(c1 ...
cn)) =
lubt(TValI(c1), ...,
TValI(cn)).

The empty disjunction is treated as a contradiction, so
TValI(Or()) = f.

Negation: TValI(Neg φ) =
~TValI(φ) and
TValI(Naf φ) =
~TValI(φ).

The symbol ~ here is the self-inverse operator of
negation on TV introduced in Section Truth Values. Note that both
classical and default negation are interpreted the same way in any
concrete semantic structure. The difference between the two kinds
of negation comes into play when logical entailment is defined.

Quantification:

TValI(Exists ?v1
... ?vn (φ)) =
lubt(TValI*(φ)).

TValI(Forall ?v1
... ?vn (φ)) =
glbt(TValI*(φ)).

Here lubt (respectively, glbt) is taken
over all interpretations I* of the form
<TV, DTS, D,
IC, I*V,
IF, Iframe,
INF, Isub,
Iisa, I=,
Iexternal,
Itruth>, which are exactly like
I, except that the mapping
I*V, is used instead of
IV. I*V is
defined to coincide with IV on all
variables except, possibly, on ?v1,...
,?vn.

Rule implication:

TValI(head :-
body)=t, if TValI(head)
≥t TValI(body).

TValI(head :-
body)=f otherwise.

Groups of formulas:

If Γ is a group formula of the form
Group(φ1 ... φn) then

TValI(Γ) =
glbt(TValI(φ1),
..., TValI(φn)).

This means that a group of formulas is treated as a conjunction.
☐

Note that rule implications and equality formulas are always
two-valued, even if TV has more than two values.
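
The clauses for conjunction, disjunction, negation, and rule implication can be sketched for a three-valued, totally ordered TV. Everything here is an assumption made for illustration: truth values are encoded as the integers 0, 1, 2 (so that glbt and lubt reduce to min and max), negation maps u to itself, and formulas are nested tuples evaluated by a hypothetical function tval.

```python
F, U, T = 0, 1, 2  # assumed integer encoding of f, u, t in increasing truth order

def tval(formula, atoms):
    """Evaluate a formula given as nested tuples; atoms maps atom names to truth values."""
    if isinstance(formula, str):
        return atoms[formula]
    op, *args = formula
    if op == "And":
        return min((tval(a, atoms) for a in args), default=T)  # empty And is t
    if op == "Or":
        return max((tval(a, atoms) for a in args), default=F)  # empty Or is f
    if op in ("Neg", "Naf"):
        return T - tval(args[0], atoms)  # swaps t and f, fixes u
    if op == ":-":
        head, body = args
        return T if tval(head, atoms) >= tval(body, atoms) else F
    raise ValueError(f"unknown connective: {op}")

atoms = {"p": U, "q": T}
print(tval(("And", "p", "q"), atoms))         # 1, i.e. u
print(tval((":-", "p", "q"), atoms))          # 0, i.e. f: head is below body
print(tval((":-", "q", ("Naf", "p")), atoms)) # 2, i.e. t
```

The rule implication clause returns only t or f, which illustrates the remark above that rule implications are two-valued even when TV has more than two values.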

3.7 Interpretation of
Documents

Document formulas are interpreted using semantic
multi-structures.

Definition (Semantic multi-structures). A semantic
multi-structure is a set
{IΔ1, ...,
IΔn}, n>0, where
IΔ1, ...,
IΔn are semantic
structures adorned with document formulas. These structures must be
identical in all respects except that the mappings
ICΔ1, ...,
ICΔn might
differ on the constants in Const that belong to the
rif:local symbol space. The above set is allowed to
have at most one semantic structure with the same adornment.
☐

Definition (Imported document). Let Δ be a document
formula and Import(t) be one of its import
directives, where t is an IRI constant that identifies
another document formula, Δ'. In this case, we say that
Δ' is directly imported into Δ.

A document formula Δ' is said to be
imported into Δ if it is either directly
imported into Δ or it is imported (directly or not) into
another formula, which is directly imported into Δ.
☐
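
The set of documents imported into a given document is the transitive closure of the direct-import relation. A minimal sketch follows, assuming a hypothetical mapping direct_imports from document identifiers to the identifiers named in their one-argument Import directives; the document names are invented for the example.

```python
def imported_documents(document, direct_imports):
    """Return every document imported into `document`, directly or indirectly.

    direct_imports maps a document identifier to the identifiers it imports directly.
    """
    imported = set()
    pending = list(direct_imports.get(document, []))
    while pending:
        current = pending.pop()
        if current not in imported:  # visit each document once, so cycles terminate
            imported.add(current)
            pending.extend(direct_imports.get(current, []))
    return imported

graph = {"main": ["lib1", "lib2"], "lib1": ["base"], "lib2": ["base"]}
print(sorted(imported_documents("main", graph)))  # ['base', 'lib1', 'lib2']
```

Tracking the documents already visited keeps the traversal finite even when import directives form a cycle.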

The above definition considers only one-argument import
directives, since two-argument directives are expected to be defined
directive, since two-argument directives are expected to be defined
on a case-by-case basis by other specifications that need to be
integrated with RIF.

The notion of semantic multi-structures will now be used to
define a semantics for RIF documents.

Definition (Truth valuation of document formulas). Let
Δ be a document formula and let Δ1,
..., Δk be all the RIF-FLD document formulas
that are imported (directly or indirectly, according to the
previous definition) into Δ. Let Γ,
Γ1, ..., Γk denote the
respective group formulas associated with these documents. If any
of these Γi is missing (which is a possibility,
since every part of a document is optional), assume that it is a
tautology, such as a = a, so that every TVal
function maps such a Γi to the truth value
t. Let I =
{IΔ,
IΔ1, ...,
IΔk, ...} be a
semantic multi-structure that contains semantic structures adorned
with at least the documents Δ, Δ1,
..., Δk. Then we define:

TValI(Δ) =
glbt(TValIΔ(Γ),
TValIΔ1(Γ1),
...,
TValIΔk(Γk)).

Note that this definition considers only those document formulas
that are reachable via the one-argument import directives.
Two-argument import directives are not covered by RIF-FLD. Their
semantics is supposed to be defined by other documents, such as
[RIF-RDF+OWL].
☐

The above definitions make the intent behind the rif:local constants clear: occurrences of these
constants in different documents can be interpreted differently
even if they have the same name. Therefore, each document can
choose the names for the rif:local constants freely and
without regard to the names of such constants used in the imported
documents.

In the remainder of this specification, every formula is assumed to
be part of some document. If a formula is not physically part of
any document, it will be said to belong to a special query
document. This allows us to define
TValI(φ), where I is a
multi-structure, for arbitrary formulas, not just for document
formulas: If φ is a formula that is not a document-formula
and I is a semantic multi-structure that contains a
component IΔ that corresponds to
the document of the formula φ, then
TValI(φ) is defined as
TValIΔ(φ).
Otherwise, TValI(φ) is undefined.

Definition (Models). A multi-structure I is a
model of a formula, φ, written as
I|=φ, iff
TValI(φ) is defined and equals t.
☐

3.8 Intended Semantic
Structures

The semantics of a set of formulas, Γ, is
the set of its intended semantic multi-structures.
RIF-FLD does not specify what these intended multi-structures are,
leaving this to RIF dialects. Different logic theories may have
different criteria for what is considered an intended semantic
multi-structure.

For classical first-order logic, every model is an intended
semantic multi-structure. For [RIF-BLD], which is based on Horn rules, intended
multi-structures are defined only for sets of rules: an intended
semantic multi-structure of a RIF-BLD set of formulas, Γ,
is the unique minimal Herbrand model of Γ. For the
dialects in which rule bodies may contain literals negated with the
negation-as-failure connective Naf, only some of
the minimal Herbrand models of a set of rules are intended. The two
most common theories of such intended models are the well-founded
models [GRS91] and the stable models [GL88]. Each logic dialect of
RIF must define its set of intended semantic multi-structures
precisely.

The following example illustrates the notion of intended
semantic structures. Suppose Γ consists of a single rule
formula p :- Naf q. If Naf were
interpreted as classical negation, then this rule would be simply
equivalent to Or(p q), and so it would have two kinds of
models: those where p is true and those where q
is true. In contrast to first-order logic, most rule-based systems
do not consider p and q symmetrically. Instead,
they view the rule p :- Naf q as a
statement that p must be true if it is not possible to
establish the truth of q. Since it is, indeed, impossible
to establish the truth of q, such theories would derive
p even though it does not logically follow from Or(p
q). The logic underlying rule-based systems also assumes that
only the minimal Herbrand models are intended (minimality
here is with respect to the set of true facts). Furthermore,
although our example has two minimal Herbrand models (one where
p is true and q is false, and the other where
p is false but q is true), only the first model
is considered to be intended.
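
The example can be replayed by brute force. The sketch below
enumerates the classical models of Or(p q), keeps the minimal ones,
and then selects the intended one. The selection step uses a
stable-model style check hard-coded for this single rule; that choice
is an assumption made for illustration, since RIF-FLD leaves the
definition of intended models to the dialects.

```python
from itertools import product

ATOMS = ("p", "q")


def satisfies_classically(model):
    # With Naf read as classical negation, p :- Naf q is equivalent to Or(p q).
    return "p" in model or "q" in model


def all_models():
    """Yield every set of true atoms that satisfies Or(p q)."""
    for bits in product([False, True], repeat=len(ATOMS)):
        model = frozenset(a for a, bit in zip(ATOMS, bits) if bit)
        if satisfies_classically(model):
            yield model


def minimal(models):
    """Keep the models that have no proper subset among the models."""
    models = list(models)
    return [m for m in models if not any(other < m for other in models)]


def is_stable(model):
    # Stable-model style check for the one rule p :- Naf q (an assumption):
    # if q is in the candidate the rule is deleted, otherwise Naf q is dropped
    # and the rule becomes the fact p. The least model of what remains must
    # coincide with the candidate.
    least = frozenset() if "q" in model else frozenset({"p"})
    return least == model


minimal_models = minimal(all_models())
intended = [m for m in minimal_models if is_stable(m)]
```

The list minimal_models holds the two models {q} and {p}, and
intended holds only {p}, matching the discussion above.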

The above concept of intended models and the corresponding
notion of logical entailment with respect to the intended models,
defined below, are due to [Shoham87].

3.9 Logical Entailment

We will now define what it means for a set of RIF-FLD formulas
to entail another RIF-FLD formula. This notion is typically used
for defining queries to knowledge bases and for other tasks, such
as testing subsumption of concepts (e.g., in OWL). We assume that
each set of formulas has an associated set of intended semantic
structures.

Definition
(Logical entailment). Let Γ be a RIF-FLD formula and
φ another RIF-FLD formula. We say that Γ entails φ, written as
Γ |= φ, if and only if, for every intended
semantic multi-structure I of Γ for which
both TValI(Γ) and
TValI(φ) are defined, it is the case
that TValI(Γ) ≤t TValI(φ). ☐

This general notion of entailment covers both first-order logic
and the non-monotonic logics that underlie many rule-based
languages [Shoham87].
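
To see how one definition covers both cases, the following sketch
checks entailment over a finite, explicitly listed collection of
intended structures. Everything here is a toy assumption: structures
are sets of true atoms, the only formulas are the rule p :- Naf q and
single atoms, and the truth order is f below t. Only the shape of the
entails function follows the definition.

```python
def entails(gamma, phi, intended, tval, leq_t):
    """True iff TVal(gamma) <=_t TVal(phi) in every intended structure
    of gamma in which both values are defined."""
    for structure in intended(gamma):
        g, f = tval(structure, gamma), tval(structure, phi)
        if g is None or f is None:
            continue  # undefined truth values do not count against entailment
        if not leq_t(g, f):
            return False
    return True


RULE = "p :- Naf q"


def tval(structure, formula):
    if formula == RULE:
        return "t"  # every structure listed below is a model of the rule
    return "t" if formula in structure else "f"


def leq_t(a, b):
    return a == "f" or b == "t"  # assumed order: f <=_t t


def classical(gamma):
    return [{"p"}, {"q"}, {"p", "q"}]  # every model is intended


def nonmonotonic(gamma):
    return [{"p"}]  # only the intended minimal Herbrand model


first_order = entails(RULE, "p", classical, tval, leq_t)
rule_based = entails(RULE, "p", nonmonotonic, tval, leq_t)
```

With all models intended, p is not entailed (first_order is False);
with only the model {p} intended, it is (rule_based is True).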

Note that one consequence of the multi-document semantics is that
local constants specified in one document cannot be queried from
another document. In particular, they cannot appear in entailed
formulas. For instance, if one document, Δ', has the fact
"http://example.com/ppp"^^rif:iri("abc"^^rif:local) while
another document formula, Δ, imports Δ' and has
the rule "http://example.com/qqq"^^rif:iri(?X) :-
"http://example.com/ppp"^^rif:iri(?X) , then Δ |=
"http://example.com/qqq"^^rif:iri("abc"^^rif:local) does
not hold. This is because "abc"^^rif:local in
Δ' and "abc"^^rif:local in the formula on the
right-hand side of |= are treated as different constants
by semantic multi-structures.
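
One informal way to picture this behavior is to tag every rif:local
constant with the document in which it occurs. The sketch below does
exactly that. It is a syntactic approximation of the model-theoretic
definition, and the helper adorn and the document labels are
hypothetical.

```python
RIF_LOCAL = "rif:local"
RIF_IRI = "rif:iri"


def adorn(const, document):
    """Tag a local constant with its document; other constants are global."""
    lexical, symspace = const
    return (lexical, symspace, document if symspace == RIF_LOCAL else None)


ABC = ("abc", RIF_LOCAL)
PPP = ("http://example.com/ppp", RIF_IRI)
QQQ = ("http://example.com/qqq", RIF_IRI)

# The fact ppp(abc) stated in the imported document.
facts = {(adorn(PPP, "imported"), adorn(ABC, "imported"))}

# The rule qqq(?X) :- ppp(?X) stated in the importing document.
derived = {
    (adorn(QQQ, "importing"), arg)
    for pred, arg in facts
    if pred == adorn(PPP, "importing")
}

# The entailed formula qqq(abc) belongs to a separate query document.
query = (adorn(QQQ, "query"), adorn(ABC, "query"))
answer = query in derived
```

The value of answer is False: the IRI constants coincide across
documents, but the two occurrences of the local constant do not.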

As explained in the overview section, the design of RIF envisions that the
presentation syntaxes of future logic RIF dialects will be
specializations of the presentation syntax of RIF-FLD. This means
that every well-formed formula in the presentation syntax of a
standard logic RIF dialect must also be well-formed in RIF-FLD. The
goal of the XML serialization framework is to provide a similar
yardstick for the RIF XML syntax. This amounts to the requirement
that any conformant XML document for a logic RIF dialect must also
be a conformant XML document for RIF-FLD (conformance is
defined below). In terms of the presentation-to-XML syntax
mappings, this means that each mapping for a logic RIF dialect must
be a restriction of the corresponding mapping for RIF-FLD. For
instance, the mapping from the presentation syntax of RIF-BLD to XML in
[RIF-BLD] is a restriction of
the presentation-syntax-to-XML mapping for RIF-FLD. In this way,
RIF-FLD provides a framework for extensibility and mutual
compatibility between XML syntaxes of RIF dialects.

Recall that the syntax of RIF-FLD is not context-free and thus
cannot be fully captured by EBNF or XML Schema. Still, validity
with respect to XML Schema can be a useful test. To reflect this
state of affairs, we define two notions of syntactic correctness.
The weaker notion checks correctness only with respect to XML
Schema, while the stricter notion represents "true" syntactic
correctness.

Definition (Valid XML document in a logic dialect). A
valid RIF-FLD document in the XML syntax is an XML
document that is valid with respect to the XML schema in Appendix
XML Schema for FLD.

If a dialect, D, specializes RIF-FLD then its XML schema
is a specialization of the XML schema of RIF-FLD. A valid XML
document in D is then one that is valid with respect to the
XML schema of D. ☐

Definition (Conformant XML document in a logic dialect). If a dialect, D, specializes RIF-FLD then an XML document
is conformant with respect to D if and only if it is a valid
document in D and it is an image of a well-formed document
in the presentation syntax of D.

Note that if D requires the directive
Dialect(D) as part of its syntax then this implies
that any D-conformant document must have this directive.
☐

A round-tripping of a conformant document in a
dialect, D, is a semantics-preserving mapping to a document
in any language L followed by a semantics-preserving mapping
from the L-document back to a conformant D-document.
While semantically equivalent, the original and the round-tripped
D-documents need not be identical. Metadata should survive
round-tripping.

4.1 XML for the RIF-FLD
Language

RIF-FLD uses [XML1.0]
for its XML syntax. The XML serialization for RIF-FLD is
alternating or fully striped [ANF01]. A fully striped
serialization views XML documents as objects and divides all XML
tags into class descriptors, called type tags, and property
descriptors, called role tags [TRT03]. We follow the
tradition of using capitalized names for type tags and lowercase
names for role tags.

The all-uppercase classes in the EBNF of the presentation
syntax, such as FORMULA, become XML Schema groups in
Appendix XML Schema for FLD.
They act like macros and are not visible in instance markup. The
other classes as well as non-terminals and symbols (such as
Exists or =) become XML elements with optional
attributes, as shown below.

The XML syntax for symbol spaces uses the type
attribute associated with the XML element Const. For
instance, a literal in the xs:dateTime datatype is
represented as
<Const type="&xs;dateTime">2007-11-23T03:55:44-02:30</Const>.
RIF-FLD also uses the ordered attribute to indicate that
the children of args and slot elements are
ordered.
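
As an illustration, the Const element above can be produced with a
standard XML library. The sketch assumes that the entity &xs; stands
for the XML Schema namespace IRI http://www.w3.org/2001/XMLSchema#,
writes that IRI out in full, and omits the RIF namespace declaration
for brevity.

```python
import xml.etree.ElementTree as ET

# Assumption: the expansion of the &xs; entity used in this document.
XS = "http://www.w3.org/2001/XMLSchema#"

# The symbol space goes into the type attribute; the lexical form is the text.
const = ET.Element("Const", {"type": XS + "dateTime"})
const.text = "2007-11-23T03:55:44-02:30"

xml_text = ET.tostring(const, encoding="unicode")
```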

Example 3 (Serialization of a nested RIF-FLD group with
annotations).

This example shows an XML serialization for the formulas in
Example 2. For convenience of reference, the original formulas are
included at the top. For better readability, we again use the
shortcut syntax defined in [RIF-DTB].

4.2 Mapping from the RIF-FLD Presentation
Syntax to the XML Syntax

This section defines a normative mapping,
χfld, from the presentation syntax of Section
EBNF Grammar for the
Presentation Syntax of RIF-FLD to the XML syntax of RIF-FLD.
The mapping is given via tables where each row specifies the
mapping of a particular syntactic pattern in the presentation
syntax. These patterns appear in the first column of the tables and
the bold-italic symbols represent metavariables. The
second column represents the corresponding XML patterns, which may
contain applications of the mapping χfld to
these metavariables. When an expression
χfld(metavar) occurs in
an XML pattern in the right column of a translation table, it
should be understood as a recursive application of
χfld to the presentation syntax represented by
the metavariable. The XML syntax result of such an application is
substituted for the expression
χfld(metavar). A
sequence of terms containing metavariables with subscripts is
indicated by an ellipsis. A metavariable or a well-formed XML
subelement is marked as optional by appending a bold-italic
question mark, ?, to its right.
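
The recursive reading of χfld can be sketched for a small fragment of
the language. The code below is not the normative mapping. The tuple
encoding of presentation-syntax terms is hypothetical, and the
element names (And with formula, Equal with left and right, Var,
Const) are assumptions that follow the striping convention of type
tags and role tags.

```python
import xml.etree.ElementTree as ET

XS_STRING = "http://www.w3.org/2001/XMLSchema#string"


def chi(term):
    """Map a toy presentation-syntax term to an XML element, recursively."""
    kind = term[0]
    if kind == "Var":                      # ?name
        element = ET.Element("Var")
        element.text = term[1]
        return element
    if kind == "Const":                    # "lexical"^^symspace
        element = ET.Element("Const", {"type": term[2]})
        element.text = term[1]
        return element
    if kind == "Equal":                    # left = right
        element = ET.Element("Equal")
        ET.SubElement(element, "left").append(chi(term[1]))
        ET.SubElement(element, "right").append(chi(term[2]))
        return element
    if kind == "And":                      # And(f1 ... fn)
        element = ET.Element("And")
        for sub in term[1]:
            # Each conjunct sits inside a role tag; chi is applied recursively.
            ET.SubElement(element, "formula").append(chi(sub))
        return element
    raise ValueError(f"unsupported term kind: {kind}")


formula = ("And", [("Equal", ("Var", "X"), ("Const", "a", XS_STRING))])
xml_text = ET.tostring(chi(formula), encoding="unicode")
```

Each call to chi on a subterm plays the role of an occurrence of
χfld(metavar) in the right column of a translation table.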

4.2.1 Mapping of the Non-annotated
RIF-FLD Language

The χfld mapping from the presentation
syntax to the XML syntax of the non-annotated RIF-FLD Language is
given by the table below. Each row indicates a translation
χfld(Presentation) = XML.
Since the presentation syntax of RIF-FLD is context sensitive, the
mapping must differentiate between the terms that occur in the
position of the individuals and the terms that occur as atomic
formulas. To this end, in the translation table, the positional and
named argument terms that occur in the context of atomic formulas
are denoted by the expressions of the form pred(...)
and the terms that occur as individuals are denoted by expressions
of the form func(...). In the table, each
metavariable for an (unnamed) positional
argumenti is assumed to be instantiated to
values unequal to the instantiations of named arguments
unicodestringj->fillerj. Regarding the next-to-last row,
we assume that shortcuts for constants [RIF-DTB] have already been expanded to their full form
("..."^^symspace).

Note that while the Import and the Dialect
directives are handled by the presentation-to-XML syntax mapping,
the Prefix and Base directives are not. Instead,
these directives should be handled by macro-expanding the
associated shortcuts (compact URIs). Namely, a prefix name declared
in a Prefix directive is expanded into the associated IRI,
while relative IRIs are completed using the IRI declared in the
Base directive. The mapping χfld
applies only to such macro-expanded documents. RIF-FLD also allows
other treatments of Prefix and Base provided that
they produce equivalent XML documents. One such treatment is
employed in the examples in this document, especially Example 3. It
replaces prefix names with definitions of XML entities as follows.
Each Prefix declaration becomes an ENTITY
declaration [XML1.0]
within a DOCTYPE DTD attached to the RIF-FLD
Document. The Base directive is mapped to the
xml:base attribute [XML-Base] in the XML Document tag. Compact URIs of
the form prefix:suffix are then mapped to
&prefix;suffix.
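
Both treatments can be sketched in a few lines. The function names
expand and to_entity_reference are hypothetical, and the prefix table
and base IRI in the example are illustrative values rather than
anything prescribed by RIF-FLD.

```python
from urllib.parse import urljoin


def expand(name, prefixes, base=None):
    """Macro-expand a compact URI or a relative IRI into a full IRI."""
    prefix, sep, suffix = name.partition(":")
    if sep and prefix in prefixes:
        # A declared prefix name is replaced by its associated IRI.
        return prefixes[prefix] + suffix
    if base is not None:
        # A relative IRI is completed against the Base IRI.
        return urljoin(base, name)
    return name


def to_entity_reference(curie):
    """Alternative treatment: prefix:suffix becomes &prefix;suffix."""
    prefix, _, suffix = curie.partition(":")
    return f"&{prefix};{suffix}"


prefixes = {"xs": "http://www.w3.org/2001/XMLSchema#"}
full_iri = expand("xs:dateTime", prefixes)
completed = expand("ppp", prefixes, base="http://example.com/")
entity_form = to_entity_reference("xs:dateTime")
```

The entity form is equivalent to the expanded one as long as the
DOCTYPE declares the entity xs with the same IRI as the Prefix
directive.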

4.2.2 Mapping of RIF-FLD
Annotations

The χfld mapping from RIF-FLD annotations in
the presentation syntax to the XML syntax is specified by the table
below. It extends the translation table of Section Mapping of the
Non-annotated RIF-FLD Language. The metavariable
Typetag in the presentation and XML syntaxes stands
for any of the class names And, Or,
External, Document, or Group,
Quantifier for Exists or Forall,
and Negation for Neg or Naf. The
dollar sign, $, stands for any of the binary infix operator
names #, ##, =, or :-, while
Binop stands for their respective class names
Member, Subclass, Equal, or
Implies. The metavariable attr? is used with
Typetag to capture the optional dialect
attribute (with its value) of Document. Again, each
metavariable for an (unnamed) positional
argumenti is assumed to be instantiated to
values unequal to the instantiations of named arguments
unicodestringj->fillerj.

5 Conformance of RIF Processors with RIF
Dialects

RIF does not require or expect conformant systems to implement
the presentation syntax of a RIF dialect. Instead, conformance is
described in terms of semantics-preserving transformations.

Let Τ be a set of datatypes that includes the datatypes specified
in [RIF-DTB], and suppose Ε is
a set of external predicates and functions that includes the
built-ins listed in [RIF-DTB].
Let D be a RIF dialect (e.g., [RIF-BLD]). We say that a formula φ is a
DΤ,Ε formula iff

it is a formula in the dialect D,

all the datatypes used in φ are in Τ, and

all the externally defined functions and predicates used in φ
are in Ε.

A RIF
processor is a conformant DΤ,Ε consumer iff it implements a
semantics-preserving mapping, μ, from the set of all
DΤ,Ε formulas to the language L of the
processor.

Formally, this means that for any pair φ, ψ of
DΤ,Ε formulas for which φ
|=D ψ is defined, φ
|=D ψ iff μ(φ)
|=L μ(ψ). Here
|=D denotes the logical entailment in
the RIF dialect D and |=L is the
logical entailment in the language L of the RIF
processor.
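
The condition can be spot-checked on a finite sample of formula
pairs, as sketched below. Such a check can refute
semantics-preservation but cannot establish it, since the condition
ranges over all pairs of formulas. The two toy languages and the
mappings mu and broken_mu are assumptions made for illustration.

```python
from itertools import product


def preserves_entailment(pairs, mu, entails_d, entails_l):
    """Check phi |=D psi iff mu(phi) |=L mu(psi) on the given pairs."""
    for phi, psi in pairs:
        in_d = entails_d(phi, psi)
        if in_d is None:
            continue  # entailment in D is undefined for this pair
        if in_d != entails_l(mu(phi), mu(psi)):
            return False
    return True


# Toy dialect D: a formula is a frozenset of atoms, read as a conjunction.
def entails_d(phi, psi):
    return psi <= phi


# Toy processor language L: a formula is a sorted tuple of atoms.
def entails_l(a, b):
    return set(b) <= set(a)


def mu(phi):
    return tuple(sorted(phi))


def broken_mu(phi):
    return tuple(sorted(phi))[:1]  # drops conjuncts, so it loses information


formulas = [frozenset({"p"}), frozenset({"q"}), frozenset({"p", "q"})]
pairs = list(product(formulas, repeat=2))
good = preserves_entailment(pairs, mu, entails_d, entails_l)
bad = preserves_entailment(pairs, broken_mu, entails_d, entails_l)
```

On this sample, mu passes the check while broken_mu fails it, for
instance on the pair consisting of the conjunction of p and q and the
atom q.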

A RIF processor is a conformant DΤ,Ε producer iff it implements a
semantics-preserving mapping, ν, from a subset of the language
L of the processor to the set of DΤ,Ε
formulas.

Formally, this means that for any pair φ, ψ of formulas in the
aforesaid subset of L for which φ
|=L ψ is defined, φ
|=L ψ iff ν(φ)
|=D ν(ψ).

A conformant document in a logic RIF dialect D
is one which conforms to all the syntactic constraints of D,
including the ones that cannot be checked by an XML Schema
validator (see Definition Conformant XML document in a logic dialect).

7 Appendix: XML Schema for
RIF-FLD

The namespace of RIF is
http://www.w3.org/2007/rif#.

XML schemas for the RIF-FLD language are defined below and are
also available here
with additional examples. For modularity, we define a
Baseline schema and a Skyline schema. Baseline is the
schema module that provides the foundation up to FORMULAs
without Implies. Skyline provides the full schema by
augmenting Baseline with the Implies FORMULA as
well as with Group and Document.