RuleML has been designed for the interchange of the major kinds of Web rules in an XML format that is uniform across various rule languages and platforms. It has broad coverage and is defined as an extensible family of languages, whose modular system of schemas permits rule interchange with high precision.

The scope of this specification covers

content models for the named sublanguages, available from the Content Models document (along with a glossary of all elements, available from the Glossary);

a mapping from the Version 1.0 syntax to the "normalized" Version 1.0 syntax as given in the Normalizer XSLT stylesheet;

a mapping from the preceding Version 0.91 XML syntax to Version 1.0 XML syntax as given in the Upgrader XSLT stylesheet.

The scope of this specification does not cover tools that make use of RuleML languages, such as parsers, inference engines or editors. For a listing of such tools, see the RuleML Implementations wiki page.

Version 1.0 is a "Rosetta Stone" release where two schema languages, XSD and Relax NG, are independently employed to formalize the syntax, to the extent possible within each language. See Design and Implementation of Highly Modular Schemas for XML: Customization of RuleML in Relax NG (RuleML in Relax NG) and the RuleML MYNG Wiki page for the details of this re-engineering effort. MYNG is an acronym for "Modular sYNtax confiGurator", or "Modularize Your NG", and may be pronounced either "ming" or "my N G".

The Version 1.0 XSD schemas are a minor modification of the patched XSD schemas of Version 0.91, which follow the tree-based modularization approach. A procedure for validation against the XSD schemas is described in Appendix 3.

The Relax NG schemas are a new component of the Version 1.0 release, and are intended to replace the hand-written XSD schemas in future releases. The Relax NG schemas are available for two serializations, "normal" and "relaxed", and follow a lattice-based modularization (RuleML in Relax NG). The relaxed-form serialization is permissive relative to the grammar given in the Content Models document, i.e. every Version 1.0 instance that is valid with respect to the Content Models document will validate against the relaxed serialization Relax NG schemas. The normal serialization Relax NG schemas validate instances in the normal form, which is described in the Normalizer section. Procedures for validation against the Relax NG schemas are described in Appendix 5.

While the DTD[1] schemas of RuleML have been provided up to and including RuleML 0.85[2] (see archive), for versions later than RuleML 0.85, conversion tools (such as XMLSpy[3] and oXygen[4]) are recommended for obtaining DTDs from the newer RuleML schemas in XSD or Relax NG.

1 Overview

An introduction to RuleML is given in our Primer. Also, the paper Overarching discusses the upper layer of the RuleML hierarchy of rules. In that terminology, the system of RuleML languages presented here covers derivation rules, which include Horn Logic, and part of deliberation rules, in particular, First-Order Logic (FOL) with equality and negation-as-failure.

This is because we think it is important to start with a subset of simple rules (derivation rules), test and refine our principal strategy using these, and then proceed to more general categories of rules in the hierarchy (deliberation rules), as well as to other kinds of rules (reaction rules).

The grammar of Version 1.0 is partially described in the Content Models document, where the content model of each individual element is compared across the named sublanguages, as specified within the XSD Schemas. The RuleML grammar is also partially described by the Relax NG Schemas.

Below is a summary of the changes in Version 1.0:

The terminology "Type" and "role" (when referring to XML tags or elements) has been changed to "Node" and "edge".

Schematron dependency has been reduced (as comments in the XSD schema code).

A new edge element <act> is introduced as an optional wrapper of all performatives (e.g. <Assert>) in the <RuleML> element.

A new edge element <meta> is introduced as a non-skippable wrapper of a formula expressing meta-knowledge, allowed zero or more times in the header of non-leaf Node elements.

A new attribute @node is introduced as an optional IRI identifier of a Node element, useful for referencing the element in external meta-knowledge.

The edge element <oid> is no longer allowed on all non-leaf Node elements, but is restricted to <Atom> and <Expr>, for use as an identifier for the object of slotted atoms and expressions.

The attributes @xml:base and @xml:id are allowed on all elements.

Additional freedom in element order with partial stripe-skipping (see Overarching for an explanation of stripe-skipping and element order in RuleML) has been introduced in the "relaxed-form serialization", as specified in the Relax NG schemas.

Readers who already know Version 0.91 may want to refer to the Changes section.

2 Examples

Numerous sample RuleML documents have been prepared and maintained; some exemplify features of RuleML and are useful didactically while others are mostly for testing purposes. Updated examples (e.g., own.ruleml and reify.ruleml) accompany the Version 1.0 release and can be found with all the others in the Examples directory. The following examples are new:

several examples in the MYNG examples directory illustrating the new features introduced in the Relax NG schemas, including greater flexibility in stripe-skipping, decreased dependence on element order and a fine-grained modularization of features.

Examples from previous versions of RuleML are also maintained (e.g. 0.91 examples).

Upgrader transformation can be accomplished using a web-based XSLT tool, Online XSLT 2.0 Service, provided by W3C. Instructions for this process can be found in Appendix 4.

The stylesheet has also been tested using oXygen version 12.2, whereby it was confirmed that all examples in the directory http://ruleml.org/1.0/exa are properly upgraded. However, the upgraded instance documents will have the RuleML namespace as the default namespace, independent of the choice of prefix for the RuleML namespace in the original instance. A similar 091-to-ruleml100.xslt has been developed that uses the prefix "ruleml" for the RuleML namespace.

An XSLT processor which may be used to perform these transformations on a whole directory at once is Saxon, using the following command:

4 XSLT-Based Normalizer

RuleML has always allowed abbreviated encoding (skipped edge tags and default attribute values) and some freedom in the ordering of elements. The new Relax NG schemas allow even more freedom in element ordering than the XSD schemas, with all permutations possible as long as the missing edge tags may be unambiguously reconstructed. An XSLT stylesheet has been developed (see RON, also see the Normalizer directory) for normalizing the syntax used in a given Version 1.0 instance, undoing the abbreviated encoding and re-ordering of elements. An example of the normal form is the expanded version of the compact 'own' example.

The goals of this normalizer include the following:

Reconstruct all skipped or optional edge tags to produce a fully striped form;

Perform canonical child ordering of sub-elements;

Make all default attributes explicit.

We say that using this normalizer followed by schema-based RuleML validation performs "normalidation" on RuleML instances. That is to say, not only are missing edge tags and attributes with default values inserted into the Version 1.0 instance in order to normalize it, but the general tree structure of the file is also validated syntactically to ensure that Node and edge elements only appear in correct positions.

The following is a list of the operations the Version 1.0 XSLT Normalizer should perform whenever it is used on an XML instance. Note that the Normalizer is intended to successfully transform XML instances that may not be valid Version 1.0 (validation is the second step of the "Normalidation" process); therefore no assumptions are made about the names or number of child elements in the instance being transformed.

When <if>-<then> edge elements are both missing, wrap the first "naked Node" (a Node element that has a Node parent element) child within an <if> edge element, and the second naked Node child within a <then> edge element.

If exactly one of the <if>-<then> edge elements is missing, wrap the first naked RuleML Node element within the missing edge element.

Re-order the elements as <oid>, then <if>, followed by <then>, followed by any other children.
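The three operations above can be sketched in Python with the standard library. This is only an illustration, not the Normalizer stylesheet itself: the function name `normalize_implies` is hypothetical, namespaces are ignored, and the sketch assumes that Node tags start with an uppercase letter while edge tags start with a lowercase letter.

```python
import xml.etree.ElementTree as ET

def is_node(elem):
    # Assumption: Node tags are capitalized, edge tags are lowercase.
    return elem.tag[:1].isupper()

def normalize_implies(implies):
    """Wrap naked Node children in <if>/<then>, then reorder the children."""
    children = list(implies)
    present = {c.tag for c in children}
    # Edge tags still to be reconstructed, in the order they are handed out.
    missing = [tag for tag in ("if", "then") if tag not in present]
    wrapped = []
    for child in children:
        if is_node(child) and missing:
            edge = ET.Element(missing.pop(0))
            edge.append(child)
            child = edge
        wrapped.append(child)
    # Stable sort: <oid>, <if>, <then>, then any other children.
    rank = {"oid": 0, "if": 1, "then": 2}
    wrapped.sort(key=lambda c: rank.get(c.tag, 3))
    implies[:] = wrapped
    return implies

rule = ET.fromstring(
    "<Implies><Atom><Rel>premise</Rel></Atom>"
    "<Atom><Rel>conclusion</Rel></Atom></Implies>"
)
print(ET.tostring(normalize_implies(rule), encoding="unicode"))
# <Implies><if><Atom><Rel>premise</Rel></Atom></if><then><Atom><Rel>conclusion</Rel></Atom></then></Implies>
```

When only one edge is missing, the `missing` list holds a single tag, so only the first naked Node is wrapped and any further naked Nodes are left untouched.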

Add <arg> edge elements (with an index attribute) wrapping each naked Node. The naked Node children that may occur within this element in a valid Version 1.0 instance include: <Ind>, <Var>, <Fun>, <Plex>, <Reify>, <Data>, or <Skolem>.

Re-order the child elements as <oid>, followed by <degree>, followed by <op>, followed by positional arguments (<arg>, then <repo>) followed by slotted arguments (<slot>, then <resl>), followed by any other children.
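A minimal Python sketch of the <arg> wrapping and child re-ordering described above follows. It is not the Normalizer stylesheet: `normalize_atom` is a hypothetical name, namespaces are ignored, and the index numbering (counting positional arguments in document order, including any <arg> already present) is an assumption of this sketch. Wrapping a naked <Rel> in <op> is not covered here, so the example input keeps <op> explicit.

```python
import xml.etree.ElementTree as ET

# Naked Node children listed in the text as candidates for <arg> wrapping.
ARG_NODES = {"Ind", "Var", "Fun", "Plex", "Reify", "Data", "Skolem"}
# Canonical child order; tags not listed here sort after all of these.
ORDER = ["oid", "degree", "op", "arg", "repo", "slot", "resl"]

def normalize_atom(atom):
    """Wrap naked argument Nodes in <arg index="n"> and reorder children."""
    position = 0
    result = []
    for child in list(atom):
        if child.tag == "arg":
            position += 1  # assumption: existing <arg> edges count too
        elif child.tag in ARG_NODES:
            position += 1
            arg = ET.Element("arg", {"index": str(position)})
            arg.append(child)
            child = arg
        result.append(child)
    rank = {tag: i for i, tag in enumerate(ORDER)}
    # The sort is stable, so arguments keep their relative order.
    result.sort(key=lambda c: rank.get(c.tag, len(ORDER)))
    atom[:] = result
    return atom

# Infix-style input: the <op> edge sits between the two arguments.
atom = ET.fromstring(
    "<Atom><Var>person</Var><op><Rel>own</Rel></op><Ind>car</Ind></Atom>"
)
print(ET.tostring(normalize_atom(atom), encoding="unicode"))
# <Atom><op><Rel>own</Rel></op><arg index="1"><Var>person</Var></arg><arg index="2"><Ind>car</Ind></arg></Atom>
```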

Add <arg> edge elements (with an index attribute) to each of the remaining naked child Nodes. The naked Node children that may occur within this element in a valid Version 1.0 instance include: <Const>, <Var>, <Reify>, <Uniterm>, <Skolem>.

Re-order the child elements as <oid>, followed by <degree>, followed by <op>, followed by positional arguments (<arg>, then <repo>), followed by slotted arguments (<slot>, then <resl>), followed by any other children.

If there are two naked child Nodes, wrap the first naked Node child with a <left> element and the second naked Node child with a <right> element.

Partial completion is also accepted. Example: if the first child has a <left> edge element, and the second child has no <left> or <right> edge element, the XSLT will wrap the second child with the <right> edge element.

Re-order the child elements as <oid>, followed by <degree>, followed by <left>, followed by <right>, followed by any other children.
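The <left>/<right> reconstruction, including the partial-completion case, can be sketched as follows. This is an illustrative reading of the text, not the stylesheet: `normalize_equal` is a hypothetical name, namespaces are ignored, Node tags are assumed to be capitalized, and handing out whichever of <left>/<right> is absent to the naked Nodes in document order is this sketch's generalization of the example given above.

```python
import xml.etree.ElementTree as ET

def normalize_equal(equal):
    """Wrap naked Node children in <left>/<right>, then reorder children."""
    children = list(equal)
    present = {c.tag for c in children}
    # Only edges that are absent are reconstructed (partial completion).
    missing = [tag for tag in ("left", "right") if tag not in present]
    result = []
    for child in children:
        # Assumption: a capitalized tag marks a naked Node.
        if child.tag[:1].isupper() and missing:
            edge = ET.Element(missing.pop(0))
            edge.append(child)
            child = edge
        result.append(child)
    rank = {"oid": 0, "degree": 1, "left": 2, "right": 3}
    result.sort(key=lambda c: rank.get(c.tag, 4))
    equal[:] = result
    return equal

# Partial completion: <left> is given, the second child is naked.
eq = ET.fromstring("<Equal><left><Var>x</Var></left><Ind>a</Ind></Equal>")
print(ET.tostring(normalize_equal(eq), encoding="unicode"))
# <Equal><left><Var>x</Var></left><right><Ind>a</Ind></right></Equal>
```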

7 Modularization

The Relax NG schemas use a different modularization approach based on lattices. The details are given in the publication about the re-engineering of RuleML in Relax NG and the RuleML MYNG Wiki page. MYNG is accessible through a GUI or, for advanced users, a PHP script.

8 XSD Schemas

The XSD Schemas specification of Version 1.0 has been created using an approach that is consistent with that of all earlier XSDs, and back to that of the (version 0.85) DTDs. The Content Models document should be read in parallel to the XSD schemas, because it gives high-level, complete documentation of the XSDs. Likewise, the Glossary can help to find quick descriptions of, and cross-references between, the Version 1.0 XML elements.

10 Appendices

Appended below are a simple example rulebase (Appendix 1), the XSD Schema for the Datalog sublanguage of RuleML to which the rulebase conforms (Appendix 2), instructions for how to validate instances against the XSD schema (Appendix 3), instructions for how to transform with XSLT stylesheets (Appendix 4), and instructions for how to validate instances against the Relax NG schema (Appendix 5).

Note: The validation may take a while, and may require a full refresh when re-validating to avoid caching. Also note: Depending on your browser, you may want to select a different output using the radio buttons just above the "Get Results" button.

For the xml resource use a file that is either considered compact RuleML (for the normalizer) or a RuleML file using an older version of RuleML (for the upgrader). Compact examples for the normalizer can be found in the Normalizer directory, while old version examples for the upgrader can be found in the Upgrader directory.

Once both URLs have been entered, click transform. To see the result, in some browsers you need to do "View | Page Source".

10.6 Appendix 5: Validating an Instance against a Relax NG schema

Validator.nu is an easy-to-use online tool for validating an XML instance against a Relax NG schema. A basic procedure for using this tool is as follows:

File Upload - use the browse dialog to locate the file on your local hard drive or network;

Text Field - type or paste text directly into the text area.

Skip the Encoding field

Type or paste a URL to the schema you wish to validate against. Only schemas available from the internet can be used. For example, the URL http://ruleml.org/1.0/relaxng/schema_rnc.php may be used to validate all Version 1.0 instances.

Skip the rest of the fields, and click Validate.

The result will appear below, followed by a copy of the instance source.

10.7 Appendix 6: Changes

Changes in the Version 1.0 XSD release relative to the previous XSD version 0.91 are detailed below, including examples where appropriate.

The terminology "Type" and "role" (when referring to XML tags or elements) has been changed to "Node" and "edge", in order to avoid the confusion caused by simultaneous usage of multiple meanings of these words, as well as to emphasize the connection to graphs, especially the RDF graph model.

Schematron dependency has been reduced (as comments in the code). For example, the following code provides annotation related to restrictions on the interpretation attribute of nested functions.

Attribute @uri is replaced with @iri. This attribute is used to identify, by Internationalized Resource Identifier (IRI), the elements <Ind>, <Rel> and <Fun> as individuals, relations and functions, respectively. For example:
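As a hedged illustration in Python (not the Upgrader stylesheet itself), the renaming of @uri to @iri on <Ind>, <Rel> and <Fun> could be sketched as below. The function name `rename_uri_to_iri` and the example.org identifiers are hypothetical, and namespaces are ignored for brevity.

```python
import xml.etree.ElementTree as ET

# Elements on which the text says @uri is replaced with @iri.
IRI_ELEMENTS = {"Ind", "Rel", "Fun"}

def rename_uri_to_iri(root):
    """Rename the uri attribute to iri on <Ind>, <Rel> and <Fun> elements."""
    for elem in root.iter():
        if elem.tag in IRI_ELEMENTS and "uri" in elem.attrib:
            elem.set("iri", elem.attrib.pop("uri"))
    return root

old = ET.fromstring(
    '<Atom><Rel uri="http://example.org/own">own</Rel>'
    '<Ind uri="http://example.org/mary">Mary</Ind></Atom>'
)
print(ET.tostring(rename_uri_to_iri(old), encoding="unicode"))
# <Atom><Rel iri="http://example.org/own">own</Rel><Ind iri="http://example.org/mary">Mary</Ind></Atom>
```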

A new edge element <act>, with a required @index attribute, is introduced as an optional wrapper for all performatives of the <RuleML> element, as shown here:

<RuleML>
  <act index="1">
    <Assert/>
  </act>
</RuleML>
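A small Python sketch of adding the optional <act> wrapper to the performatives of a <RuleML> element is given here for illustration. The function name `wrap_performatives` is hypothetical, the set of performative tags (<Assert>, <Retract>, <Query>) and the numbering of @index from 1 in document order are assumptions of this sketch, and namespaces are ignored.

```python
import xml.etree.ElementTree as ET

# Assumption: these are the performative Nodes allowed inside <RuleML>.
PERFORMATIVES = {"Assert", "Retract", "Query"}

def wrap_performatives(ruleml):
    """Wrap each naked performative child of <RuleML> in <act index="n">."""
    index = 0
    result = []
    for child in list(ruleml):
        if child.tag == "act":
            index += 1  # assumption: existing <act> edges are counted
        elif child.tag in PERFORMATIVES:
            index += 1
            act = ET.Element("act", {"index": str(index)})
            act.append(child)
            child = act
        result.append(child)
    ruleml[:] = result
    return ruleml

doc = ET.fromstring(
    "<RuleML><Assert><Atom><Rel>p</Rel></Atom></Assert>"
    "<Query><Atom><Rel>p</Rel></Atom></Query></RuleML>"
)
print(ET.tostring(wrap_performatives(doc), encoding="unicode"))
# <RuleML><act index="1"><Assert><Atom><Rel>p</Rel></Atom></Assert></act><act index="2"><Query><Atom><Rel>p</Rel></Atom></Query></act></RuleML>
```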

A new attribute @node is introduced as an optional IRI identifier of a Node element, useful for referencing the element in external meta-knowledge. It is allowed on all Nodes, including leaf Nodes such as <Data>, <Reify> and <Rel>.

<Implies node="#rule1">
  <if>...</if><then>...</then>
</Implies>

A new edge element <meta> is introduced as a non-skippable wrapper of a formula expressing meta-knowledge, allowed zero or more times in the header of non-leaf Node elements. When a slotted <Atom> is wrapped in <meta>, the object of the slotted frame is assumed (unless specified otherwise) to be the reification of the parent Node, which may be specified with the @node attribute, or may be anonymous.

The edge element <oid> is no longer allowed on all non-leaf Node elements, but is restricted to <Atom> and <Expr> (and <Uniterm> in the SWSL branch) for use as an identifier for the object of slotted atoms and expressions. The functionality of providing an identifier for attaching meta-knowledge, such as the defeasible priority of rules, is transferred to the @node attribute.

The attributes @xml:base and @xml:id are allowed on all elements. The base for resolving relative IRI references is explicitly declared with @xml:base. The attribute @xml:id provides an alternate means of identifying elements, and is applicable to edges as well as nodes. The semantics of the @xml:id identifier, which denotes the information resource fragment that is the element and is referentially opaque, differs from that of the @node identifier, which denotes a reification of the logical or extra-logical entity associated with the element and is referentially transparent.

Additional freedom in element order with partial stripe-skipping has been introduced in the "relaxed form serialization", as specified in the Relax NG schemas. In particular, infix and postfix operator notation is now allowed in the relaxed form serialization as long as the <op> element is not skipped.

10.8 Appendix 7: Issues

Features that are "deprecation candidates", i.e. that may be deprecated in future releases, include:

<Reify> at Datalog and lower, as this introduces the possibility of nesting reification to an arbitrary level;

the (SWSL) syntax, whose current syntax may be deprecated in favor of better integration with the rest of the RuleML syntax.

During the review of the Version 0.91 XSD schemas that was conducted in conjunction with the re-engineering of RuleML in Relax NG, a number of shortcomings and defects were identified. A number of these issues were patched in the Version 0.91 Patched release, and these patched XSDs form the basis of the hand-written XSDs for Version 1.0. There are several issues that could not be resolved in the handwritten XSDs, including:

Reification in Languages Below Datalog: the Binary sublanguages have an overly-permissive schema for reification (<Reify>). For example, in bindatagroundfact, it is valid to reify a universal quantification (<Forall>), while it is not allowed to construct a universal quantification in the language. It is not possible to fix this issue without discarding the tree-based modularization approach of the hand-written schemas. Because Version 1.0 is a "Rosetta" release, we have retained the hand-written XSD schemas that contain this issue for consistency with legacy schemas. However, the Relax NG schemas are not affected by this issue, and thus there is a discrepancy between the XSD and Relax NG schemas in these unusual cases.

Circular Group Definitions: all schemas that redefine the Datalog XSD schema use circular group definitions, leading to an inability to validate against these schemas using some validators, such as Xerces. Similar to the issue above, it is not possible to fix this issue within the modularization approach of the hand-written XSD schemas. However, XSV and Saxon validators are lax in enforcing Constraints on Model Group Schema Components (§3.8.6) of the XSD specification, Part 1, and are able to work around it. Relax NG schemas and the XSD schemas automatically generated from the Relax NG schemas are not affected by this issue.

The latter two issues will be resolved in Version 1.1 by discontinuing support of the handwritten XSDs, relying instead on XSD schemas automatically generated from the Relax NG schemas.