RELAX NG's Compact Syntax
by Michael Fitzgerald
Going Context Free
I mentioned earlier in the article that the compact syntax can look like a
context-free grammar. The following example uses a start symbol and other
symbols that serve as terminals and non-terminals. For example, the symbol
year, on the left side of the equals sign, may be considered a
non-terminal, and the element definition on the right side, a terminal:
# RELAX NG schema for a date
start = date
date = element date { attribute type { text },
  (year & month & day), limits* }
year = element year { text }
month = element month { text }
day = element day { text }
include "limits.rnc"
The following instance is valid with regard to the foregoing:
<?xml version="1.0" encoding="UTF-8"?>
<date type="US">
  <month>June</month>
  <day>1</day>
  <year>2002</year>
  <limits days="30"/>
</date>
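Because the content model joins year, month, and day with the interleave operator (&) and marks limits with *, other arrangements should validate as well. The following is a minimal sketch of a second instance, assuming the same schema; the type value "ISO" is purely illustrative, since the schema accepts any text there. The three children appear in a different order and the optional limits element is left out:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical instance: same schema, children reordered, no limits element -->
<date type="ISO">
  <year>2002</year>
  <day>1</day>
  <month>June</month>
</date>
```

Note that any limits elements would still have to follow the year, month, and day group, because the comma in the content model imposes a sequence.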
When translated, this example creates a different RELAX NG schema than the
ones shown previously, producing a <grammar> and
<start> element and several <define>
elements, as seen in this incomplete fragment:
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- RELAX NG schema for a date -->
  <start>
    <ref name="date"/>
  </start>
  <define name="date">
    <element name="date">
      <attribute name="type"/>
      <interleave>
        <ref name="year"/>
        <ref name="month"/>
        <ref name="day"/>
      </interleave>
      <zeroOrMore>
        <ref name="limits"/>
      </zeroOrMore>
    </element>
  </define>
A <grammar> element is a container for definitions. The
<start> element indicates the document element for an
instance, just as a document type declaration does. The
<define> elements contain patterns which can be referenced by
name (with a <ref> element) and therefore easily reused.
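The fragment above stops before the remaining definitions. As a rough sketch of how the rest of the translated grammar might look, each remaining compact definition would become a <define> and the include directive an <include> element. The exact output depends on the translator, and the href value shown is an assumption:

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- The start element and the date definition go here, as in the fragment above -->
  <define name="year">
    <element name="year">
      <text/>
    </element>
  </define>
  <define name="month">
    <element name="month">
      <text/>
    </element>
  </define>
  <define name="day">
    <element name="day">
      <text/>
    </element>
  </define>
  <!-- Assumed href: the translated counterpart of limits.rnc -->
  <include href="limits.rng"/>
</grammar>
```

Each <ref name="..."/> in the date definition resolves to the <define> with the same name, including the limits definition pulled in by the include.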
Back in the compact schema, a symbol for the limits pattern
(modified with *) was added to the end of the content model for
date, but where is it defined? It's defined in the included schema
limits.rnc (see the last line of the preceding compact example), which
looks like this:
# Limits for year, months, and days
limits =
  element limits {
    attribute years { text }?,
    attribute months { text }?,
    attribute days { text }?
  }
When processed, the included compact schema is translated into RELAX NG XML
syntax as well. The resulting filename, limits.rng, is inferred
from limits.rnc. The included pattern contains a definition for the
limits element, which may contain up to three optional
attributes. The absence of element children indicates that its content is
empty.
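For comparison, here is a sketch of what the translated limits.rng might contain. This is an approximation written by hand from the compact schema above, not captured tool output, so formatting and details may differ from what a given translator emits. Each ? in the compact syntax corresponds to an <optional> wrapper:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Limits for year, months, and days -->
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <define name="limits">
    <element name="limits">
      <optional>
        <attribute name="years"/>
      </optional>
      <optional>
        <attribute name="months"/>
      </optional>
      <optional>
        <attribute name="days"/>
      </optional>
    </element>
  </define>
</grammar>
```

The included grammar has no <start> of its own; it only contributes the limits definition to the grammar that includes it.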
Conclusion
It would take several more articles to cover all aspects of RELAX NG in fair detail. This article has only touched lightly on its compact syntax and some of the more commonly used structures of the language. I have neglected some interesting things: for example, lists, name classes, merging grammars, and combining definitions. If you've gotten behind the wheel and tested these examples for yourself, you likely have a good feel for just how easy RELAX NG's compact syntax is to learn and use.
Related Links
"The Design of RELAX NG," a paper by James Clark
RELAX NG 1.0 DTD compatibility specification
RELAX NG compact syntax specification
Jing, James Clark's RELAX NG processor (Java)
Trang, James Clark's RELAX NG compact syntax processor (Java)
Multi-schema Validator (MSV), Sun's schema validator (by Kawaguchi Kohsuke)
Murata Makoto's online RELAX NG validator (Java/JSP)
Eric van der Vlist's online RELAX NG validator (Python)