XML.com: XML From the Inside Out

Relax NG, Compared

January 23, 2002

Introduction

This article is a companion to two works already published on XML.com: my introduction to W3C XML Schema, a tutorial that introduces the language's main features in a progression that I hope is intuitive; and my comparison of the main schema languages, an attempt to provide an objective and practical feature-by-feature comparison of XML schema languages. In this new article, I have taken the same approach as in the W3C XML Schema tutorial, but this time I have implemented the schemas using RELAX NG.

While the result is neither an optimal tutorial for RELAX NG -- since the progression designed for W3C XML Schema is not ideal for getting started with RELAX NG -- nor an impartial comparison between these languages, I think it provides a good starting point for those of us who know W3C XML Schema and want to see quickly how RELAX NG differs. Links are provided throughout to the corresponding sections of the W3C XML Schema tutorial, and you are encouraged to follow both simultaneously.

Table of Contents

Introducing Our First Schema
Slicing the Schema
Defining Named Types
Groups, Compositors and Derivation
Content Types
Building Usable -- and Reusable -- Schemas
Namespaces

Introducing Our First Schema

[Corresponding chapter for W3C XML Schema]

The example document is the same one that we used in our W3C XML Schema tutorial:

<?xml version="1.0" encoding="UTF-8"?>
<book isbn="0836217462">
 <title>
  Being a Dog Is a Full-Time Job
 </title>
 <author>Charles M. Schulz</author>
 <character>
  <name>Snoopy</name>
  <friend-of>Peppermint Patty</friend-of>
  <since>1950-10-04</since>
  <qualification>
    extroverted beagle
  </qualification>
 </character>
 <character>
  <name>Peppermint Patty</name>
  <since>1966-08-22</since>
  <qualification>bold, brash and tomboyish</qualification>
 </character>
</book>

To describe the document, we will follow the same design style that we used for our first W3C XML Schema and design the schema as a "Russian doll", with each element definition nested inside the definition of its parent.

A RELAX NG schema is very close to a textual description of a vocabulary. To describe this document, we could say that we define a grammar starting with an element named book, and this is pretty much what we will write as a RELAX NG schema:

<?xml version="1.0" encoding="UTF-8"?>
<grammar 
 xmlns="http://relaxng.org/ns/structure/1.0">
 <start>
  <element name="book">
       .../...
  </element>
 </start>
</grammar>

To describe the element named book, we could say that it is composed of an attribute named isbn, an element named title, an element named author and zero or more elements named character:

<?xml version="1.0" encoding="UTF-8"?>
<grammar 
 xmlns="http://relaxng.org/ns/structure/1.0">
 <start>
  <element name="book">
   <attribute name="isbn">
        .../...
   </attribute>
   <element name="title">
        .../...
   </element>
   <element name="author">
        .../...
   </element>
   <zeroOrMore>
    <element name="character">
         .../...
    </element>
   </zeroOrMore>
  </element>
 </start>
</grammar>

RELAX NG has a clear separation between structure and datatypes, and we will see later on how we can plug a datatype system into our schema. For the moment, we will just consider that the values are not typed, i.e. that they are just text, and say so:

<?xml version="1.0" encoding="UTF-8"?>
<grammar 
 xmlns="http://relaxng.org/ns/structure/1.0">
 <start>
  <element name="book">
   <attribute name="isbn">
    <text/>
   </attribute>
   <element name="title">
    <text/>
   </element>
   <element name="author">
    <text/>
   </element>
   <zeroOrMore>
    <element name="character">
        .../...
    </element>
   </zeroOrMore>
  </element>
 </start>
</grammar>
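
To make the role of the zeroOrMore wrapper more concrete, here is a minimal sketch of a variant in which a book must contain at least one character. This variant is hypothetical and is not one of the tutorial's schemas; it relies on the RELAX NG oneOrMore pattern and, to stay self-contained, simplifies the content of character to plain text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical variant, not part of the tutorial's schema:
     a book must contain at least one character element. -->
<grammar
 xmlns="http://relaxng.org/ns/structure/1.0">
 <start>
  <element name="book">
   <attribute name="isbn">
    <text/>
   </attribute>
   <element name="title">
    <text/>
   </element>
   <element name="author">
    <text/>
   </element>
   <!-- oneOrMore replaces zeroOrMore: an empty list is no longer allowed -->
   <oneOrMore>
    <!-- Assumption: character is reduced to plain text for brevity -->
    <element name="character">
     <text/>
    </element>
   </oneOrMore>
  </element>
 </start>
</grammar>
```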

The last thing that we need to do is to define the element named character. This can be done the same way, by saying that it is composed, in this order, of an element named name, an optional element named friend-of, an element named since, and an element named qualification:

<?xml version="1.0" encoding="UTF-8"?>
<grammar 
 xmlns="http://relaxng.org/ns/structure/1.0">
 <start>
  <element name="book">
   <attribute name="isbn">
    <text/>
   </attribute>
   <element name="title">
    <text/>
   </element>
   <element name="author">
    <text/>
   </element>
   <zeroOrMore>
    <element name="character">
     <element name="name">
      <text/>
     </element>
     <optional>
      <element name="friend-of">
       <text/>
      </element>
     </optional>
     <element name="since">
      <text/>
     </element>
     <element name="qualification">
      <text/>
     </element>
    </element>
   </zeroOrMore>
  </element>
 </start>
</grammar>
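
As a quick illustration of what this schema rejects, here is a hypothetical instance, derived from the example document, that a RELAX NG validator should report as invalid: its only character element omits since, which the schema requires between the optional friend-of and qualification.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical instance: well-formed, but expected to be
     rejected by the schema above. -->
<book isbn="0836217462">
 <title>Being a Dog Is a Full-Time Job</title>
 <author>Charles M. Schulz</author>
 <character>
  <name>Snoopy</name>
  <friend-of>Peppermint Patty</friend-of>
  <!-- since is missing here, although the schema requires it -->
  <qualification>extroverted beagle</qualification>
 </character>
</book>
```

Removing friend-of instead would leave the document valid, since that element is wrapped in optional.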

Our first schema is now complete, and we can use it to validate our instance document with, for example, Jing, the open source Java implementation of RELAX NG written by James Clark.

To make the schema more comparable to the W3C XML Schema version, we need to see how a datatype system can be embedded. This is done by declaring which datatype library we will use, through the datatypeLibrary attribute, and replacing the "text" elements with "data" elements. The editors of RELAX NG believe that there can be no universal datatype system and that, beyond some very basic universal types, each application domain has its own requirements. RELAX NG therefore defines a generic mechanism for plugging in external type systems. The current implementations support W3C XML Schema datatypes. To use this datatype system in our schema, we update it and write:

<?xml version="1.0" encoding="UTF-8"?>
<grammar 
 xmlns="http://relaxng.org/ns/structure/1.0"
 datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
 >
 <start>
  <element name="book">
   <attribute name="isbn">
    <data type="nonNegativeInteger"/>
   </attribute>
   <element name="title">
    <data type="token"/>
   </element>
   <element name="author">
    <data type="token"/>
   </element>
   <zeroOrMore>
    <element name="character">
     <element name="name">
      <data type="token"/>
     </element>
     <optional>
      <element name="friend-of">
       <data type="token"/>
      </element>
     </optional>
     <element name="since">
      <data type="date"/>
     </element>
     <element name="qualification">
      <data type="token"/>
     </element>
    </element>
   </zeroOrMore>
  </element>
 </start>
</grammar>
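
To show what the datatypes add, here is a hypothetical instance that satisfies the untyped schema but should be rejected by the typed one. The attribute and element values are illustrative assumptions chosen to break the nonNegativeInteger and date types; they do not come from the original example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical instance: accepted when every value is plain text,
     expected to fail once the W3C XML Schema datatypes are applied. -->
<!-- The isbn value ends with the letter X, so it is not a nonNegativeInteger -->
<book isbn="080442957X">
 <title>Being a Dog Is a Full-Time Job</title>
 <author>Charles M. Schulz</author>
 <character>
  <name>Snoopy</name>
  <!-- Not in the YYYY-MM-DD lexical form required by the date type -->
  <since>04/10/1950</since>
  <qualification>extroverted beagle</qualification>
 </character>
</book>
```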
