XML.com: XML From the Inside Out

Combining RELAX NG and Schematron
by Eddie Robertsson

Validation using Extraction

To extract the embedded Schematron rules from the RELAX NG schema, the RNG2Schtrn.xsl stylesheet can be used. This stylesheet will also extract Schematron rules that have been declared in RELAX NG modules that are included in or referenced from the base schema.
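The idea behind extraction can be illustrated with a small Python sketch. It is not the RNG2Schtrn.xsl implementation: it assumes the Schematron 1.5 namespace (http://www.ascc.net/xml/schematron), copies only sch:ns and sch:pattern elements, and does not follow included or referenced RELAX NG modules. The function name and the sample schema are hypothetical.

```python
import xml.etree.ElementTree as ET

# Schematron 1.5 namespace (assumed to be the one used for the embedded rules)
SCH_NS = "http://www.ascc.net/xml/schematron"

# Hypothetical RELAX NG fragment with one embedded Schematron pattern
SAMPLE_RNG = """\
<element name="item" xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:sch="http://www.ascc.net/xml/schematron">
  <sch:pattern name="Check item id">
    <sch:rule context="item">
      <sch:assert test="@id">An item must have an id.</sch:assert>
    </sch:rule>
  </sch:pattern>
  <attribute name="id"><text/></attribute>
</element>
"""


def extract_schematron(rng_source: str) -> str:
    """Collect embedded Schematron elements into a standalone sch:schema."""
    ET.register_namespace("sch", SCH_NS)
    root = ET.fromstring(rng_source)
    schema = ET.Element(f"{{{SCH_NS}}}schema")
    # Namespace declarations (sch:ns) go first, then the patterns.
    for local_name in ("ns", "pattern"):
        for element in root.iter(f"{{{SCH_NS}}}{local_name}"):
            schema.append(element)
    return ET.tostring(schema, encoding="unicode")
```

Calling extract_schematron(SAMPLE_RNG) returns a Schematron schema that contains the single embedded pattern and none of the RELAX NG elements. Because the selection is based only on the Schematron namespace, the same approach does not depend on which element of the host schema carries the rules.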

The result of the transformation is a complete Schematron schema that can be used to validate the XML instance document with a Schematron processor, as described in the section Introduction to Schematron. The XML instance document is then validated against the RELAX NG schema using a normal RELAX NG processor, which ignores all the embedded rules. Validation results are therefore available from both Schematron validation and RELAX NG validation, and if needed they can be merged into one report. The whole process is described in the following figure:

As shown in the figure, there are two distinct paths in the validation process. If timing requirements are important, each path can be implemented as a separate process and the two can be executed in parallel.
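One way to run the two paths side by side is sketched below in Python. The command lines are the Jing and Saxon invocations used in this section; that both tools are available on the PATH is an assumption, and the function names are hypothetical. The "ok" flag only records whether every command in a path exited with code 0. It should not be read as a validation verdict, since how each tool reports validation failures through its exit code is not assumed here.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_path(steps):
    """Run the commands of one validation path in order; return (ok, output)."""
    output = []
    for argv in steps:
        try:
            proc = subprocess.run(argv, capture_output=True, text=True)
        except OSError as exc:  # for example, the executable was not found
            output.append(f"could not run {argv[0]}: {exc}\n")
            return False, "".join(output)
        output.append(proc.stdout + proc.stderr)
        if proc.returncode != 0:
            # Later steps depend on earlier ones, so stop at the first failure.
            return False, "".join(output)
    return True, "".join(output)


def validate_in_parallel(paths):
    """Run each named path in its own thread; return {name: (ok, output)}."""
    with ThreadPoolExecutor(max_workers=max(1, len(paths))) as pool:
        futures = {name: pool.submit(run_path, steps)
                   for name, steps in paths.items()}
        return {name: future.result() for name, future in futures.items()}


# The two paths of the validation process (assumes jing and saxon are on the PATH)
PATHS = {
    "relaxng": [
        ["jing", "PurchaseOrder.rng", "Sample.xml"],
    ],
    "schematron": [
        ["saxon", "-o", "PurchaseOrder.sch", "PurchaseOrder.rng", "RNG2Schtrn.xsl"],
        ["saxon", "-o", "validate.xsl", "PurchaseOrder.sch", "schematron-basic.xsl"],
        ["saxon", "Sample.xml", "validate.xsl"],
    ],
}
```

Calling validate_in_parallel(PATHS) returns one (ok, output) pair per path. Threads are sufficient here because each thread spends its time waiting for an external process.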

A batch file that validates an XML instance document against both a RELAX NG schema and its embedded Schematron rules, using the Win32 executables of Jing and Saxon, could look like this:

echo Running Jing validation on Sample.xml...

   jing PurchaseOrder.rng Sample.xml

echo Creating Schematron schema from PurchaseOrder.rng...

   saxon -o PurchaseOrder.sch PurchaseOrder.rng RNG2Schtrn.xsl

echo Running Basic Schematron validation on file Sample.xml...

   saxon -o validate.xsl PurchaseOrder.sch schematron-basic.xsl
   saxon Sample.xml validate.xsl

The XML instance document is first validated against the RELAX NG schema using Jing, and then against the embedded Schematron rules using Saxon. Example output could look like this:

Running Jing validation on Sample.xml...

Error at URL "file:/C:/Sample.xml", line number 7: unknown element "BogusElement"

Creating Schematron schema from PurchaseOrder.rng...

Running Basic Schematron validation on file Sample.xml...

From pattern "Check that each team is registered in the tournament":
   Assertion fails: "The item doesn't exist in the database." at 
    /purchaseOrder[1]/items[1]/item[2]
     <item id="112-AX">...</>

Done.
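If one combined report is wanted, the two outputs can be merged with a few lines of Python. The sketch below assumes the message formats shown in the sample output above ("Error at URL" for Jing and "Assertion fails:" for the basic Schematron stylesheet); other processors or stylesheets will need different patterns. The function name is hypothetical, and the location line that follows each failed assertion is not captured.

```python
def merge_reports(jing_output: str, schematron_output: str) -> str:
    """Combine the output of both validation paths into one text report."""
    # Assumed message formats, taken from the sample output in this section.
    relaxng_errors = [line.strip() for line in jing_output.splitlines()
                      if line.strip().startswith("Error at URL")]
    failed_assertions = [line.strip() for line in schematron_output.splitlines()
                         if line.strip().startswith("Assertion fails:")]

    verdict = "invalid" if (relaxng_errors or failed_assertions) else "valid"
    lines = [f"Result: {verdict}",
             f"RELAX NG errors: {len(relaxng_errors)}"]
    lines += [f"  {error}" for error in relaxng_errors]
    lines.append(f"Schematron assertion failures: {len(failed_assertions)}")
    lines += [f"  {failure}" for failure in failed_assertions]
    return "\n".join(lines)
```

The document is reported as valid only when neither path produced a message, which matches the two-path process described earlier.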

The Topologi Schematron Validator is a free graphical validator that can validate an XML instance document against a RELAX NG schema with embedded Schematron rules.

Summary

Schematron is a very good complement to RELAX NG, and there is little that cannot be validated by the combination of the two. This article has shown how to embed Schematron rules in a RELAX NG schema and has provided guidelines for performing validation. A Java implementation of Schematron that works as a wrapper around Xalan can be downloaded from Topologi. This implementation also contains classes to perform RELAX NG validation (using Jing) with embedded Schematron rules.

It is up to each project and use case to evaluate whether embedding Schematron rules in RELAX NG schemas is a suitable technique to achieve more powerful validation. Following is a list of some advantages to take into account:

  • By combining the power of RELAX NG and Schematron, the limit for what can be performed in terms of validation is raised to a new level.

  • Many of the constraints that previously had to be checked in the application can now be moved out of the application and into the schema.

  • Since Schematron lets you provide your own error messages (the content of the assertion elements), you can ensure that each message is as explanatory as needed.

And some disadvantages:

  • In time-critical applications the overhead of processing the embedded Schematron rules may be too high. This is especially true if XSLT implementations of Schematron are used in conjunction with the extraction method in the preceding section. Extensive use of XSLT's document() function is also very resource-demanding and time-consuming.

  • Since the extraction of Schematron rules from a RELAX NG schema is performed with XSLT, embedded Schematron rules are only supported in RELAX NG schemas that use the full XML syntax.

The ability to combine embedded Schematron rules with a different schema language is not unique to RELAX NG and should be possible in all XML schema languages that use XML syntax and have an extensibility mechanism. The only thing needed is to modify the XSLT extractor stylesheet to accommodate the extension mechanism in the host XML schema language used.

Acknowledgements

I would like to thank Rick Jelliffe and Mike Fitzgerald for comments and suggestions on this article.
