Schema Specification of RuleML 0.86

David Hirtle, Harold Boley, Carlos Damasio, Benjamin Grosof, Said Tabet, Gerd Wagner

Version History, 2001-01-25: Version 0.7
Version History, 2001-07-11: Version 0.8
Version History, 2003-12-09: Version 0.85
Version History, 2004-06-23: Version 0.86

Latest version: www.ruleml.org/spec

Archived DTDs: 0.85 DTD Directory

All Schemas: XSD Directory

Some Examples: Examples Directory


This is a complete and stable XML Schema specification for RuleML which overcomes certain issues encountered in the transitional release of RuleML 0.85. RuleML 0.85 now serves as a "Rosetta stone" for the DTD and XSD specifications of RuleML, useful for alignment between the two. The Document Type Definition (DTD) specification of RuleML will no longer be maintained, but will continue to be available as an archive.

Each XML Schema Definition (XSD) in the evolving hierarchy corresponds to a specific RuleML sublanguage. The implementation uses a modularization approach similar to the one in XHTML in order to offer appropriate flexibility and accommodate different implementations and approaches.

Overview

The upper layer of the RuleML hierarchy of rules is discussed in our main page's design section. In that terminology, the system of RuleML XSDs presented here only covers derivation rules, not reaction rules.

This is because we think it is important to start with a subset of simple rules, test and refine our principal strategy using these, and then work 'up' to the more general categories of rules in the hierarchy. For this we choose Datalog, a language corresponding to relational databases (ground facts without complex domains or 'constructors') augmented by views (possibly recursive rules), and work a few steps upwards to further declarative rules as allowed in (equational) Horn logic. We also introduce a URL/URI module corresponding to simple objects. The 'UR'-Datalog join of both of these classes then permits inferences over RDF-like 'resources' and can be re-specialized to RDF triples.

As an XSD-stable release of the transitional RuleML 0.85, version 0.86 adds the same features, which are listed in the Changes section.

A full discussion of (most of) these additions can be found in the Object-Oriented RuleML section. Further changes include n-tuple (tup) being renamed (com)plex to accommodate unordered complex structures, and role-list (roli) now being simulated by complex structures containing nothing but metaroles (_slot). A presentation summarizing this work (as part of 0.85) entitled Object-Oriented RuleML: Re-Modularized and XML Schematized via Content Models (PDF) is also available.

The only change since 0.85 in this release concerns the (now stable) XSD implementation. To avoid certain limitations in XML Schema, the RuleML sublanguages have been remodularized since version 0.85.

Note that there are currently no plans to continue maintaining a DTD specification of RuleML, and as such the 0.85 DTDs are expected to be the last. As always, the DTDs etc. of previous versions (including 0.85) will remain untouched, and can be considered as languages by themselves.

Changes

First, two long-planned changes have now been officially incorporated: and/or nesting in the body and two kinds of negation. Also, non-positional user-level roles have been incorporated without duplicating the number of existing sublanguages. Finally, handles for term typing and URI-grounded clauses have been established as a couple of preliminary XML attributes.
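
As a rough illustration of the first two changes, the rule below nests an 'or' inside an 'and' in its body and applies negation as failure to one conjunct. This is only a sketch: the tag names 'naf' (negation as failure) and 'neg' (strong negation) are assumptions not stated in the text above, the relation and variable names are invented, and which sublanguage XSD admits such a body is not specified here.

```xml
<!-- Hypothetical rule: discount(customer) holds if the customer is
     premium or regular, and is not known to be blocked.
     ASSUMPTION: 'naf' is the negation-as-failure tag; strong negation
     would use 'neg' in the same position. -->
<imp>
  <_head>
    <atom>
      <_opr><rel>discount</rel></_opr>
      <var>customer</var>
    </atom>
  </_head>
  <_body>
    <and>
      <!-- 'or' nested inside 'and' -->
      <or>
        <atom>
          <_opr><rel>premium</rel></_opr>
          <var>customer</var>
        </atom>
        <atom>
          <_opr><rel>regular</rel></_opr>
          <var>customer</var>
        </atom>
      </or>
      <naf>
        <atom>
          <_opr><rel>blocked</rel></_opr>
          <var>customer</var>
        </atom>
      </naf>
    </and>
  </_body>
</imp>
```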

Also, a few tags have been renamed: '_r' became '_slot', 'n' has been expanded to 'name', and 'w' is now 'weight'. This deviation from the earlier OO RuleML is the result of an attempt to increase readability.

Some of the OO DTD changes (namely, user-level roles via the metarole '_slot' and role weights via the attribute 'weight') are detailed in the OO RuleML section, but specific implementations are often more complex in order to comply with the XML 1.0 specification, which prohibits non-deterministic (i.e. ambiguous) content models.

user-level roles:

<!--
intuitively, the content model is
"((_opr, (_slot)*, (ind | var | cterm | plex)*, (_slot)*) | ((_slot)*, (ind | var | cterm | plex)+, (_slot)*, _opr))"
but this is non-deterministic
-->
<!ELEMENT atom (
                  ( _opr,
                    (_slot)*, ( (ind | var | cterm | plex)+, (_slot)*)?
                  ) 
                |
                  (
                     (
                        ((_slot)+, ( (ind | var | cterm | plex)+, (_slot)* )?)
                      |
                        ((ind | var | cterm | plex)+, (_slot)*)
                     ),
                     _opr
                  )
               )>

<!--
again, the content model is
"((_opc, (_slot)*, (ind | var | cterm | plex)*, (_slot)*) | ((_slot)*, (ind | var | cterm | plex)+, (_slot)*, _opc))"
but this is also ambiguous
-->
<!ELEMENT cterm (
                   ( _opc,
                     (_slot)*, ( (ind | var | cterm | plex)+, (_slot)*)?
                   ) 
                 |
                   (
                      (
                         ((_slot)+, ( (ind | var | cterm | plex)+, (_slot)* )?)
                       |
                         ((ind | var | cterm | plex)+, (_slot)*)
                      ),
                      _opc
                   )
                )>

<!ELEMENT plex (
                (_slot)*, ( (ind | var | cterm | plex)+, (_slot)* )?
              )>

<!ELEMENT _slot (ind | var | cterm | plex)>
<!ATTLIST _slot name CDATA #REQUIRED>
<!ATTLIST _slot card CDATA #IMPLIED>
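
For illustration, the following atom fragments are intended to match the 'atom' content model declared above. The slot names (seller, buyer, item) are hypothetical and chosen only for this sketch; the relation and individuals are borrowed from the example rulebase in Appendix 2.

```xml
<!-- Prefix _opr followed by slots only (no positional arguments) -->
<atom>
  <_opr><rel>sell</rel></_opr>
  <_slot name="seller"><ind>John</ind></_slot>
  <_slot name="buyer"><ind>Mary</ind></_slot>
  <_slot name="item"><ind>XMLBible</ind></_slot>
</atom>

<!-- Prefix _opr, then slots, positional arguments, and further slots -->
<atom>
  <_opr><rel>sell</rel></_opr>
  <_slot name="seller"><ind>John</ind></_slot>
  <ind>Mary</ind>
  <_slot name="item"><ind>XMLBible</ind></_slot>
</atom>

<!-- Postfix _opr after positional arguments -->
<atom>
  <ind>John</ind>
  <ind>Mary</ind>
  <ind>XMLBible</ind>
  <_opr><rel>sell</rel></_opr>
</atom>
```

Each '_slot' carries the required 'name' attribute and exactly one child, as the '_slot' declaration demands.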

URI-grounding:

<!-- Note: 'href' is only an example attribute, easily replaced with e.g. 'wid' and 'widref' -->
<!ATTLIST ind href CDATA #IMPLIED>
<!ATTLIST rel href CDATA #IMPLIED>
<!ATTLIST ctor href CDATA #IMPLIED>

term typing:

<!ATTLIST ind type CDATA #IMPLIED>
<!ATTLIST var type CDATA #IMPLIED>
<!ATTLIST cterm type CDATA #IMPLIED>

weighted extension:

<!ATTLIST _slot weight CDATA #IMPLIED>
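
A small sketch of how these attributes might appear together in an instance. The URIs, the type names and the weight values are made up for illustration; since all three attributes are declared as CDATA, the DTD itself does not constrain their values.

```xml
<!-- Hypothetical atom combining URI-grounding (href), term typing (type)
     and slot weights (weight). All attribute values are assumptions. -->
<atom>
  <_opr><rel href="http://example.org/vocab#sell">sell</rel></_opr>
  <_slot name="seller" weight="0.7">
    <ind href="http://example.org/people#John" type="Person">John</ind>
  </_slot>
  <_slot name="item" weight="0.3">
    <var type="Book">object</var>
  </_slot>
</atom>
```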

These modifications can be seen in the 0.85 DTDs, as well as in the XML Schema representation. Validation instructions are included in each directory, as well as in the appendix.

Re-Modularization

The modularization used for Versions 0.7 and 0.8 was inverted in RuleML 0.85 to be more intuitive. The motivating factors behind this switch were simplicity (a single root with two distinct branches), consistency (inheritance in a single direction, for obvious super/subclass relationships) and efficiency (non-redundant implementation).

However, a limitation within XML Schema prevented this approach from being easily implemented (resulting in the unstable XSDs in 0.85), so the modularization was re-analyzed and now a new model has been devised. This updated model reflects both the -- now fully-validating -- (XSD) implementation and expressiveness layering of RuleML, simultaneously capturing both the abstract and concrete.

Content Model-Based Approach

DTDs have limited support for modularity, but it can be achieved in a roundabout way using macro-like parameter entities. In particular, the contents of an external file can be included using an externally-linked parameter entity. For example, the following includes the contents of datalog.dtd:

<!ENTITY % datalog_include SYSTEM "datalog.dtd">
%datalog_include;

Simple inclusion is not enough, though: overriding is also necessary. Previously, this was managed using INCLUDE/IGNORE sections: the section that declared the element which had to be changed was simply IGNOREd, then the element was re-declared.

In Version 0.85, this clumsy method of overriding is replaced by a more elegant one: every element's content model is explicitly defined by a parameter entity. The metarole '_slot', for example, is declared as follows:

<!ENTITY % _slot.content "(ind)"> 
<!ELEMENT _slot %_slot.content;>

Since parameter entities can overwrite one another (even across files), this content model is easily replaced with another specified in a different DTD altogether, much like re-assigning a global variable in traditional programming languages. For example, the content model of the metarole '_slot' is just "(ind)" in urcbindatagroundfact.dtd (as above), but is extended to permit a variable (thus, becoming "(ind | var)") in urcbindatalog.dtd:

<!ENTITY % _slot.content "(ind | var)">

(Note that this overriding entity must be declared before the other files are included, because in XML 1.0 the first declaration of an entity is the binding one.)

A visual demonstration of this process can be found on slide 25 of the Object-Oriented RuleML: Re-Modularized and XML Schematized via Content Models presentation.

The content model-based approach to modularization also works for XML Schema, using groups (and attributeGroups) instead of parameter entities. For example,

<!ENTITY % _slot.content "(ind)"> 
<!ELEMENT _slot %_slot.content;>

becomes

<xs:attributeGroup name="_slot.attlist"/>
<xs:group name="_slot.content">
   <xs:sequence>
      <xs:element ref="ind"/>
   </xs:sequence>
</xs:group>
<xs:complexType name="_slot.type" mixed="true">
   <xs:group ref="_slot.content"/>
   <xs:attributeGroup ref="_slot.attlist"/>
</xs:complexType>
<xs:element name="_slot" type="_slot.type"/>

There is no need for workarounds in XSD: <redefine> makes the specified changes and includes everything else. For example,

<!ENTITY % _slot.content "(ind | var)">

<!ENTITY % include SYSTEM "urcbindatagroundlog.dtd">
%include;

becomes

<xs:redefine schemaLocation="urcbindatagroundlog.xsd">
   <xs:group name="_slot.content">
      <xs:choice>
         <xs:group ref="_slot.content"/>
         <xs:element ref="var"/>
      </xs:choice>
   </xs:group>
</xs:redefine>
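
Attribute lists can presumably be extended in the same way, by redefining an attributeGroup that refers to itself. The following is a sketch under stated assumptions rather than an excerpt from the RuleML schemas: the file name base_sublanguage.xsd is hypothetical, and declaring 'weight' as an optional xs:string is merely the closest XSD analogue of the DTD's "CDATA #IMPLIED".

```xml
<?xml version="1.0"?>
<xs:schema
  targetNamespace="http://www.ruleml.org/0.86/xsd"
  xmlns="http://www.ruleml.org/0.86/xsd"
  xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- ASSUMPTION: base_sublanguage.xsd is a placeholder for a schema
       that already declares the (possibly empty) _slot.attlist group -->
  <xs:redefine schemaLocation="base_sublanguage.xsd">
    <xs:attributeGroup name="_slot.attlist">
      <!-- the self-reference keeps whatever the group already contained -->
      <xs:attributeGroup ref="_slot.attlist"/>
      <!-- the new attribute is added on top -->
      <xs:attribute name="weight" type="xs:string" use="optional"/>
    </xs:attributeGroup>
  </xs:redefine>

</xs:schema>
```

Within <redefine>, a group or attributeGroup may refer to itself exactly once, which is what turns the redefinition into an extension instead of a replacement.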

XSD Representation

A stable XML Schema representation of RuleML version 0.86 has been created with a further improved modularization and implementation approach that is consistent with the (version 0.85) DTDs.

Validation

To ensure validation stability, the 0.86 XSDs have been tested (using corresponding instance documents/examples) with various validators/tools. A summary of these validation results follows:

W3C XML Schema Validator (XSV) v 2.7-1
All examples and schemas validate perfectly. See Appendix 3 for instructions on validating an example against its XML Schema and the corresponding output.

Altova XMLSpy v 2004 rel. 4
XMLSpy actually uses two validators: one available in the text/grid view and another exclusively in the schema view. The two validators currently may produce different results, as happens to be the case with the RuleML 0.86 XSDs. The "schema" view is preferred for validating RuleML schemas; they validate perfectly with the slight modification of commenting out self-references as follows:

   <xs:attributeGroup name="rel.attlist">
      <!-- <xs:attributeGroup ref="rel.attlist"/> -->
      <xs:attributeGroup ref="href.attrib"/>
   </xs:attributeGroup>

Saxon v SA 8.0
All examples and schemas validate perfectly (see Michael Kay's confirmation). Sample output:

   $ java com.saxonica.Validate -t http://www.ruleml.org/0.86/exa/own.ruleml
   Saxon-SA 8.0 from Saxonica
   Java version 1.4.1_02
   Processing http://www.ruleml.org/0.86/exa/own.ruleml
   Execution time: 1132 milliseconds

Future Work

Appendix

Appended below is the XML Schema (version 0.86) for a Datalog subset of RuleML (Appendix 1). Also appended below is a simple example rulebase that conforms to that XSD (Appendix 2), and instructions for how to validate the example against the schema (Appendix 3).

The entire family of 0.86 XSDs, specified in a modular fashion, is available at http://www.ruleml.org/0.86/xsd. The XSD files in the main directory represent actual sublanguages of RuleML, whereas those in the modules subdirectory are elementary components included in the main XSDs. For more information, see the Re-Modularization section.

More sample files -- each referring to the most specific XSD which validates it -- can be found at http://www.ruleml.org/0.86/exa.

Appendix 1: XSD for a Datalog subset of RuleML (datalog.xsd)

<?xml version="1.0"?>
<xs:schema 
targetNamespace="http://www.ruleml.org/0.86/xsd" 
xmlns="http://www.ruleml.org/0.86/xsd"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
>

	<xs:annotation>
		<xs:documentation xml:lang="en">
			XML Schema for a Datalog RuleML sublanguage
			File: datalog.xsd
			Version: 0.86
			Last Modification: 2004-06-23
		</xs:documentation>
	</xs:annotation>

	<!--
		Note that datalog is entirely composed of modules and
		that all other schema drivers rely on it, making it the
		"root" of the sublanguage family tree.
	-->
	
	<!--
		Datalog includes the following modules:
		* toplevel
		* desc
		* clause
		* boole
		* atom
		* role
		* term

		For details on each module, including what element and/or
		attribute declarations they contain, please refer to them
		individually.
	-->
	   
	<xs:include schemaLocation="modules/toplevel_module.xsd"/>

	<xs:include schemaLocation="modules/desc_module.xsd"/>

	<xs:include schemaLocation="modules/clause_module.xsd"/>

	<xs:include schemaLocation="modules/boole_module.xsd"/>

	<xs:include schemaLocation="modules/atom_module.xsd"/>

	<xs:include schemaLocation="modules/role_module.xsd"/>

	<xs:include schemaLocation="modules/term_module.xsd"/>
	
</xs:schema>

Appendix 2: Example rulebase in RuleML (own.ruleml)

<?xml version="1.0"?>

<rulebase
xmlns="http://www.ruleml.org/0.86/xsd"
xsi:schemaLocation="http://www.ruleml.org/0.86/xsd http://www.ruleml.org/0.86/xsd/datalog.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>

<!-- start XML comment ...

This example rulebase contains four rules.
The first and second rules are implications; the third and fourth ones are facts.

In English:

The first rule implies that a person owns an object
if that person buys the object from a merchant and the person keeps the object.

As an OrdLab Tree:

 imp~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          *                         *
     head *                    body *
          *                         *
        atom~~~~~~~~~~~~~~~~~~     and~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                 *     |     |           |                                   |
             opr *     |     |           |                                   |
                 *     |     |           |                                   |        
                rel   var   var        atom~~~~~~~~~~~~~~~~~~~~~~~~~~~     atom~~~~~~~~~~~~~~~~~~
                 .     .     .                  *     |      |       |              *     |     |
                 .     .     .              opr *     |      |       |          opr *     |     |
                 .     .     .                  *     |      |       |              *     |     |
                own  person object             rel   var    var     var            rel   var   var
                                                .     .      .       .              .     .     . 
                                                .     .      .       .              .     .     .
                                                .     .      .       .              .     .     .
                                               buy  person merchant object        keep  person object

... end XML comment -->


<imp>
  <_head>
    <atom>
      <_opr><rel>own</rel></_opr>
      <var>person</var>
      <var>object</var>
    </atom>
  </_head>
  <_body>
    <!-- explicit 'and' -->
    <and>
      <atom>
        <_opr><rel>buy</rel></_opr>
        <var>person</var>
        <var>merchant</var>
        <var>object</var>
      </atom>
      <atom>
        <_opr><rel>keep</rel></_opr>
        <var>person</var>
        <var>object</var>
      </atom>
    </and>
  </_body>
</imp>

<!-- The second rule implies that a person buys an object from a merchant
if the merchant sells the object to the person. -->

<imp>
  <_head>
    <atom>
      <_opr><rel>buy</rel></_opr>
      <var>person</var>
      <var>merchant</var>
      <var>object</var>
    </atom>
  </_head>
  <_body>
    <atom>
      <_opr><rel>sell</rel></_opr>
      <var>merchant</var>
      <var>person</var>
      <var>object</var>
    </atom>
  </_body>
</imp>
 
 
<!-- The third rule is a fact that asserts that
John sells XMLBible to Mary. -->
 
<fact>
  <_head>
    <atom>
      <_opr><rel>sell</rel></_opr>
      <ind>John</ind>
      <ind>Mary</ind>
      <ind>XMLBible</ind>
    </atom>
  </_head>
</fact>
 
<!-- The fourth rule is a fact that asserts that
Mary keeps XMLBible.
 
Observe that this fact is binary - i.e., there are two arguments
for the relation. RDF viewed as a logical knowledge representation
is, likewise, binary, although its arguments have type restrictions,
e.g., the first must be a resource (basically, a URI). Some of the
DTD's on the RuleML website handle URL's/URI's (UR's); see especially
urc-datalog.dtd for inferencing with RDF-like facts -->
 
<fact>
  <_head>
    <atom>
      <_opr><rel>keep</rel></_opr>
      <ind>Mary</ind>
      <ind>XMLBible</ind>
    </atom>
  </_head>
</fact>
  
</rulebase>
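
To make the intended reading of this rulebase concrete, the two facts below are what the rules above should allow a Datalog engine to derive. This is a hand-worked sketch, not the output of any tool. Applying the second rule to the 'sell' fact binds merchant = John, person = Mary and object = XMLBible, which yields the 'buy' atom; the first rule then combines that atom with the 'keep' fact to yield the 'own' atom.

```xml
<!-- Derived via the second rule from sell(John, Mary, XMLBible):
     Mary buys XMLBible from John. -->
<fact>
  <_head>
    <atom>
      <_opr><rel>buy</rel></_opr>
      <ind>Mary</ind>
      <ind>John</ind>
      <ind>XMLBible</ind>
    </atom>
  </_head>
</fact>

<!-- Derived via the first rule from the atom above together with
     keep(Mary, XMLBible): Mary owns XMLBible. -->
<fact>
  <_head>
    <atom>
      <_opr><rel>own</rel></_opr>
      <ind>Mary</ind>
      <ind>XMLBible</ind>
    </atom>
  </_head>
</fact>
```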

Appendix 3: Instructions on validating the example against the XSD

Validating a RuleML 0.86 Sample Document: own.ruleml
====================================================

1. Direct your browser to 
http://www.w3.org/2001/03/webdata/xsv
(Validator for XML Schema REC 20010502 version).

2. Enter the following URL of our example RuleML file (or any other) into the textfield
preceded by "Address(es)": http://www.ruleml.org/0.86/exa/own.ruleml

3. If desired, check the "Show Warnings" box.

4. Click the "Get Results" button.

Note: The validation may take a while, and may require a full
refresh when re-validating to avoid caching.

Also note: Depending on your browser, you may want to select a different
output using the radio buttons just above the "Get Results" button.

***
 
You should get the following output (using the default output):

Schema validating with XSV 2.7-1 of 2004/04/01 13:40:50

  • Target: http://www.ruleml.org/0.86/exa/own.ruleml
       (Real name: http://www.ruleml.org/0.86/exa/own.ruleml
        Length: 3850 bytes
        Last Modified: Thu, 24 Jun 2004 18:24:59 GMT
        Server: Apache/1.3.27 (Unix) (Red-Hat/Linux) mod_fastcgi/2.4.2 mod_ssl/2.8.12 OpenSSL/0.9.6b DAV/1.0.3 PHP/4.1.2 mod_perl/1.26)
  • docElt: {http://www.ruleml.org/0.86/xsd}rulebase
  • Validation was strict, starting with type {http://www.ruleml.org/0.86/xsd}:rulebase.type
  • schemaLocs: http://www.ruleml.org/0.86/xsd -> http://www.ruleml.org/0.86/xsd/datalog.xsd
  • The schema(s) used for schema-validation had no errors
  • No schema-validity problems were found in the target

Schema resources involved

Attempt to load a schema document from http://www.ruleml.org/0.86/xsd/datalog.xsd (source: schemaLoc) for http://www.ruleml.org/0.86/xsd, succeeded

Attempt to load a schema document from http://www.ruleml.org/0.86/xsd/modules/toplevel_module.xsd (source: include) for http://www.ruleml.org/0.86/xsd, succeeded

Attempt to load a schema document from http://www.ruleml.org/0.86/xsd/modules/desc_module.xsd (source: include) for http://www.ruleml.org/0.86/xsd, succeeded

Attempt to load a schema document from http://www.ruleml.org/0.86/xsd/modules/clause_module.xsd (source: include) for http://www.ruleml.org/0.86/xsd, succeeded

Attempt to load a schema document from http://www.ruleml.org/0.86/xsd/modules/boole_module.xsd (source: include) for http://www.ruleml.org/0.86/xsd, succeeded

Attempt to load a schema document from http://www.ruleml.org/0.86/xsd/modules/atom_module.xsd (source: include) for http://www.ruleml.org/0.86/xsd, succeeded

Attempt to load a schema document from http://www.ruleml.org/0.86/xsd/modules/role_module.xsd (source: include) for http://www.ruleml.org/0.86/xsd, succeeded

Attempt to load a schema document from http://www.ruleml.org/0.86/xsd/modules/term_module.xsd (source: include) for http://www.ruleml.org/0.86/xsd, succeeded



Site Contact: Harold Boley. Page Version: 2004-07-09


"Practice what you preach": XML source of this homepage at index.xml (index.xml.txt);
transformed to HTML via the adaptation of Michael Sintek's SliML XSLT stylesheet at homepage.xsl (View | Page Source)