Combining RELAX NG and Schematron
by Eddie Robertsson
Now that all the Schematron rules are defined, the only remaining task is to insert them into the main RELAX NG schema. As already mentioned, a RELAX NG schema allows any element not in the RELAX NG namespace to appear anywhere in the schema where markup is allowed. However, to keep the RELAX NG schema well organized and easy to read, it is recommended that you embed the Schematron rules in one of two places:
Insert all the embedded Schematron rules at the beginning of the RELAX NG schema as a child of the top-level element. Then you always know that if you have embedded rules, they will be specified together and in the same place.
Specify each Schematron rule on the element pattern that specifies the context of the embedded rule. In the previous example this means that one of the Schematron rules would be embedded on the element pattern for the item element and the other on the element pattern for the amount element in the payment section.
I prefer to embed each Schematron rule in the element that defines the context, but it is really up to the developer which method to use. Another good rule to follow is to always declare the Schematron namespace on the top-level element in the RELAX NG schema. That way you know that if the top-level element contains a declaration for the Schematron namespace, then the schema contains embedded Schematron rules. The complete RELAX NG schema for the purchase order with embedded Schematron rules might look like this:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
xmlns:sch="http://www.ascc.net/xml/schematron">
<start>
<ref name="purchaseOrder"/>
</start>
<define name="purchaseOrder">
<element name="purchaseOrder">
<attribute name="date">
<data type="date"/>
</attribute>
<ref name="deliveryDetails"/>
<element name="items">
<oneOrMore>
<ref name="item"/>
</oneOrMore>
</element>
<ref name="payment"/>
</element>
</define>
<define name="deliveryDetails">
<element name="deliveryDetails">
<element name="name"><text/></element>
<element name="address"><text/></element>
<element name="phone"><text/></element>
</element>
</define>
<define name="item">
<element name="item">
<sch:pattern
name="Check that the pricing and currency of an item is correct.">
<sch:rule context="purchaseOrder/items/item">
<sch:assert
test="number(price) * number(quantity) = number(totalAmount)">
The total amount for the item doesn't add up to (quantity * price).
</sch:assert>
<sch:assert
test="price/@currency = totalAmount/@currency">
The currency in price doesn't match the currency in totalAmount.
</sch:assert>
</sch:rule>
</sch:pattern>
<attribute name="id">
<data type="string">
<param name="pattern">\d{3}-[A-Z]{2}</param>
</data>
</attribute>
<element name="productName"><text/></element>
<element name="quantity">
<data type="int"/>
</element>
<element name="price">
<ref name="currency"/>
</element>
<element name="totalAmount">
<ref name="currency"/>
</element>
</element>
</define>
<define name="payment">
<element name="payment">
<attribute name="type">
<choice>
<value>Prepaid</value>
<value>OnArrival</value>
</choice>
</attribute>
<element name="amount">
<sch:pattern
name="Check that the total amount is correct and that the currencies match">
<sch:rule context="purchaseOrder/payment/amount">
<sch:assert
test="number(.) = sum(/purchaseOrder/items/item/totalAmount)">
The total purchase amount doesn't match the cost of all items.
</sch:assert>
<sch:assert
test="not(/purchaseOrder/items/item/totalAmount/@currency != @currency)">
The currency in at least one of the items doesn't match the
currency for the total amount.
</sch:assert>
</sch:rule>
</sch:pattern>
<ref name="currency"/>
</element>
</element>
</define>
<define name="currency">
<attribute name="currency">
<choice>
<value>AUD</value>
<value>USD</value>
<value>SEK</value>
</choice>
</attribute>
<data type="int"/>
</define>
</grammar>
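For comparison, the first placement option (all embedded rules gathered at the beginning of the schema) might be laid out as in the sketch below. It reuses the item pattern from the schema above unchanged; the comments stand in for the parts of the schema that stay the same, and the exact layout is only one possible arrangement.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
  datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
  xmlns:sch="http://www.ascc.net/xml/schematron">
  <!-- All embedded Schematron rules come first, as children of the top-level element -->
  <sch:pattern
    name="Check that the pricing and currency of an item is correct.">
    <sch:rule context="purchaseOrder/items/item">
      <sch:assert
        test="number(price) * number(quantity) = number(totalAmount)">
        The total amount for the item doesn't add up to (quantity * price).
      </sch:assert>
      <sch:assert
        test="price/@currency = totalAmount/@currency">
        The currency in price doesn't match the currency in totalAmount.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
  <!-- The pattern for purchaseOrder/payment/amount would follow here, unchanged -->
  <start>
    <ref name="purchaseOrder"/>
  </start>
  <!-- The define elements for purchaseOrder, deliveryDetails, item, payment
       and currency follow as before, without any sch:pattern children -->
</grammar>
```

Because each rule carries its own context attribute, moving a pattern to the top of the schema should not change which elements it applies to; only the readability trade-off differs.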
Dependency between XML documents
Like most other XML schema languages, RELAX NG lacks the ability to specify constraints between XML instance documents. In many XML applications, this is a very useful functionality. A typical example would be to check if a certain ID reference has a corresponding ID in a different document. For the purchase order example in the preceding section, this could be a simple database file where all the available products are listed. Typically a simple database would contain the following information:
The date when the database was updated
One or more products
Each product has an id, a name, a description, a price and the number of items in stock
A sample XML instance document for the database would look like this:
<?xml version="1.0" encoding="UTF-8"?>
<products lastUpdated="2002-10-22">
<product id="123-XY">
<productName>Coffin</productName>
<description>Standard coffin, Size 200x80x50cm</description>
<numberInStock>4</numberInStock>
<price currency="AUD">2300</price>
</product>
<product id="112-AA">
<productName>Shovel</productName>
<description>Plastic grip shovel</description>
<numberInStock>2</numberInStock>
<price currency="AUD">75</price>
</product>
</products>
With the corresponding RELAX NG schema:
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
<start>
<ref name="products"/>
</start>
<define name="products">
<element name="products">
<attribute name="lastUpdated">
<data type="date"/>
</attribute>
<oneOrMore>
<ref name="product"/>
</oneOrMore>
</element>
</define>
<define name="product">
<element name="product">
<attribute name="id">
<data type="string">
<param name="pattern">\d{3}-[A-Z]{2}</param>
</data>
</attribute>
<element name="productName"><text/></element>
<element name="description"><text/></element>
<element name="numberInStock">
<data type="int"/>
</element>
<element name="price">
<ref name="currency"/>
</element>
</element>
</define>
<define name="currency">
<attribute name="currency">
<choice>
<value>AUD</value>
<value>USD</value>
<value>SEK</value>
</choice>
</attribute>
<data type="int"/>
</define>
</grammar>
Looking back at the purchase order in the preceding section, each item purchased was specified as:
<item id="123-XY">
<productName>Coffin</productName>
<quantity>1</quantity>
<price currency="AUD">2300</price>
<totalAmount currency="AUD">2300</totalAmount>
</item>
Since there also exists a database for each product available for purchase, there are now at least two more constraints that can be checked for each purchase order:
Make sure that each item's id exists as a product id in the database
Make sure that the quantity ordered is less than or equal to the total number of products in stock for each item in the purchase order
Since these constraints require checks between XML documents, they
can only be checked by Schematron processors that support
XSLT's document() function (or similar functionality). If
a Schematron processor based on XSLT is used, this is not a problem;
but most XPath implementations of Schematron do not have this type of
functionality. If you use an XSLT implementation, the Schematron rule
for the first constraint can be specified like this:
<sch:pattern name="Check that the item exists in the database."
xmlns:sch="http://www.ascc.net/xml/schematron">
<sch:rule context="purchaseOrder/items/item">
<sch:assert test = "document('Products.xml')/products/product/@id = @id"
>The item doesn't exist in the database.</sch:assert>
</sch:rule>
</sch:pattern>
Here the document() function is used to access the XML
instance document that contains the available products. Once
the document() function has retrieved the external
document, you can use normal XPath expressions to select the nodes of
interest. In this example, the id attributes of all
the product elements in products are compared to the id
of the item that is currently being checked. The comparison is
true as soon as one of them matches, so if an item
element's id value does not exist in the database
(Products.xml), the assertion will fail.
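The test relies on XPath's rule that an = comparison involving a node-set is true if at least one node in the set has a matching value. As a sketch of an equivalent formulation, assuming the same XSLT-based processor and the same Products.xml file, the lookup can be written with an explicit predicate instead:

```xml
<sch:pattern name="Check that the item exists in the database."
  xmlns:sch="http://www.ascc.net/xml/schematron">
  <sch:rule context="purchaseOrder/items/item">
    <!-- Select the product whose id equals the id of the item being checked.
         current() is an XSLT function, so this form also needs an XSLT-based
         processor. A non-empty node-set converts to true, an empty one to false. -->
    <sch:assert test="document('Products.xml')/products/product[@id = current()/@id]"
      >The item doesn't exist in the database.</sch:assert>
  </sch:rule>
</sch:pattern>
```

Both forms should report the same items; the predicate version simply makes the "is there such a product" question explicit.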
The easiest way to check the second constraint is to use a different rule where the context is restricted using predicates. Here is an example of how this can be specified:
<sch:pattern name="Check that there are enough items in stock for the purchase."
xmlns:sch="http://www.ascc.net/xml/schematron">
<sch:rule
context="purchaseOrder/items/item[@id = document('Products.xml')/products/product/@id]">
<sch:assert
test="number(document('Products.xml')/products/product[@id = current()/@id]/numberInStock)
>= number(quantity)">
There are not enough items of this type in stock for this quantity.
</sch:assert>
</sch:rule>
</sch:pattern>
This rule is a bit more complicated than the previous ones. The
first thing that is different is that the context specification for
this rule is using a predicate to limit the number of elements
checked. In this case, the predicate is used because instead of
selecting all the item elements in the document, only
the item elements with an id that exists in
the database should be selected. This ensures that when the processor
checks the assertion, it is certain that the item being validated
exists in the database.
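To see how the two patterns divide the work, consider the following fragment from the items section of a purchase order, checked against the sample Products.xml shown earlier. The items are hypothetical: the id 999-ZZ, the product name Lantern, and the quantities and prices are invented for illustration.

```xml
<items>
  <!-- In the database, but only 2 shovels are in stock: the existence check
       passes and the stock check fails, because 2 >= 3 is false. -->
  <item id="112-AA">
    <productName>Shovel</productName>
    <quantity>3</quantity>
    <price currency="AUD">75</price>
    <totalAmount currency="AUD">225</totalAmount>
  </item>
  <!-- Not in the database: the existence check fails. The predicate on the
       rule context keeps the stock rule from applying to this item. -->
  <item id="999-ZZ">
    <productName>Lantern</productName>
    <quantity>1</quantity>
    <price currency="AUD">40</price>
    <totalAmount currency="AUD">40</totalAmount>
  </item>
</items>
```

Without the predicate, the stock rule would also apply to 999-ZZ. The lookup would return an empty node-set, number() would convert it to NaN, and the comparison would be false, so the same missing product would be reported twice.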