XML.com: XML From the Inside Out

Combining RELAX NG and Schematron
by Eddie Robertsson

Validation using Extraction

To extract the embedded Schematron rules from the RELAX NG schema, the RNG2Schtrn.xsl stylesheet can be used. This stylesheet will also extract Schematron rules that have been declared in RELAX NG modules that are included in or referenced from the base schema.
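To make the extraction step concrete, here is a minimal Python sketch of the idea behind the stylesheet. It is not a replacement for RNG2Schtrn.xsl: it uses Python's ElementTree instead of XSLT, copies only embedded pattern elements, and does not follow included or referenced modules. The Schematron 1.5 namespace, the sample schema, and the function name extract_schematron are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the rules are embedded using the Schematron 1.5 namespace.
SCH_NS = "http://www.ascc.net/xml/schematron"
RNG_NS = "http://relaxng.org/ns/structure/1.0"

# Hypothetical RELAX NG schema with one embedded Schematron pattern.
SAMPLE_RNG = f"""\
<grammar xmlns="{RNG_NS}" xmlns:sch="{SCH_NS}">
  <start>
    <element name="items">
      <sch:pattern name="Quantities are positive">
        <sch:rule context="item">
          <sch:assert test="number(@quantity) > 0">Quantity must be positive.</sch:assert>
        </sch:rule>
      </sch:pattern>
      <oneOrMore>
        <element name="item"><attribute name="quantity"/></element>
      </oneOrMore>
    </element>
  </start>
</grammar>
"""


def extract_schematron(rng_source: str) -> ET.Element:
    """Collect every embedded sch:pattern into a new sch:schema element."""
    grammar = ET.fromstring(rng_source)
    schema = ET.Element(f"{{{SCH_NS}}}schema")
    # iter() walks the tree in document order, so patterns at any depth are found.
    for pattern in grammar.iter(f"{{{SCH_NS}}}pattern"):
        schema.append(pattern)
    return schema


if __name__ == "__main__":
    ET.register_namespace("sch", SCH_NS)
    print(ET.tostring(extract_schematron(SAMPLE_RNG), encoding="unicode"))
```

The sketch selects elements purely by namespace, which is why a RELAX NG processor that ignores foreign-namespace elements and a Schematron extractor can share the same file.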

The result of the transformation is a complete Schematron schema that can be used to validate the XML instance document with a Schematron processor, as described in the section Introduction to Schematron. The XML instance document is separately validated against the RELAX NG schema using a normal RELAX NG processor, which ignores all the embedded rules. Validation results are therefore available from both Schematron validation and RELAX NG validation and, if needed, can be merged into one report. The whole process is described in the following figure:

As shown in the figure, there are two distinct paths in the validation process. If timing requirements are important, each path can be implemented as a separate process and the two can be executed in parallel.
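The two paths could be driven in parallel along the following lines. This is only a sketch under stated assumptions: the function names run_path and validate_in_parallel are hypothetical, and the demo uses stand-in commands (the Python interpreter printing a line) so that it runs without Jing or Saxon installed. Commands within one path run in order, because the Schematron path consists of dependent steps.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_path(commands):
    """Run one validation path: commands execute in order, output is collected."""
    lines = []
    for command in commands:
        result = subprocess.run(command, capture_output=True, text=True)
        lines.extend((result.stdout + result.stderr).splitlines())
    return lines


def validate_in_parallel(paths):
    """Run each named path in its own thread and merge the output into one report."""
    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        futures = {name: pool.submit(run_path, cmds) for name, cmds in paths.items()}
        results = {name: future.result() for name, future in futures.items()}
    report = []
    for name, lines in results.items():
        report.append(f"== {name} ==")
        report.extend(lines)
    return report


if __name__ == "__main__":
    # Stand-in commands; real use would list the validator command lines instead.
    demo = {
        "RELAX NG": [[sys.executable, "-c", "print('relaxng path finished')"]],
        "Schematron": [
            [sys.executable, "-c", "print('extract rules')"],
            [sys.executable, "-c", "print('validate instance')"],
        ],
    }
    print("\n".join(validate_in_parallel(demo)))
```

Threads are enough here because each command already runs as its own operating-system process; the threads only wait for them. The merged report keeps the output of each path under its own heading, which is one simple way to combine the two sets of results.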

A batch file that uses the Win32 executables of Jing and Saxon to validate an XML instance document against both a RELAX NG schema and its embedded Schematron rules could look like this:

echo Running Jing validation on Sample.xml...

   jing PurchaseOrder.rng Sample.xml

echo Creating Schematron schema from PurchaseOrder.rng...

   saxon -o PurchaseOrder.sch PurchaseOrder.rng RNG2Schtrn.xsl

echo Running Basic Schematron validation on file Sample.xml...

   saxon -o validate.xsl PurchaseOrder.sch schematron-basic.xsl
   saxon Sample.xml validate.xsl

First, the XML instance document is validated against the RELAX NG schema using Jing. The embedded Schematron rules are then extracted and applied to the document using Saxon. Example output could look like this:

Running Jing validation on Sample.xml...

Error at URL "file:/C:/Sample.xml", line number 7: unknown element "BogusElement"

Creating Schematron schema from PurchaseOrder.rng...

Running Basic Schematron validation on file Sample.xml...

From pattern "Check that each team is registered in the tournament":
   Assertion fails: "The item doesn't exist in the database." at 
    /purchaseOrder[1]/items[1]/item[2]
     <item id="112-AX">...</>

Done.		

The Topologi Schematron Validator is a free graphical validator that can validate an XML instance document against a RELAX NG schema with embedded Schematron rules.

Summary

Schematron is a very good complement to RELAX NG, and there is little that cannot be validated by the combination of the two. This article has shown how to embed Schematron rules in a RELAX NG schema and has given guidelines for how to perform validation. A Java implementation of Schematron that works as a wrapper around Xalan can be downloaded from Topologi. This implementation also contains classes to perform RELAX NG validation (using Jing) with embedded Schematron rules.

It is up to each project and use case to evaluate whether embedding Schematron rules in RELAX NG schemas is a suitable technique for achieving more powerful validation. Following is a list of some advantages to take into account:

  • By combining the power of RELAX NG and Schematron, the limit for what can be performed in terms of validation is raised to a new level.

  • Many of the constraints that previously had to be checked in the application can now be moved out of the application and into the schema.

  • Since Schematron lets you provide your own error messages (the content of the assertion elements), you can ensure that each message is as explanatory as needed.

And some disadvantages:

  • In time-critical applications the overhead of processing the embedded Schematron rules may be too large. This is especially true if XSLT implementations of Schematron are used in conjunction with the extraction method in the preceding section. Extensive use of XSLT's document() function is also resource-intensive and time-consuming.

  • Since the extraction of Schematron rules from a RELAX NG schema is performed with XSLT, embedded Schematron rules are only supported in RELAX NG schemas that use the full XML syntax.

Embedding Schematron rules in another schema language is not unique to RELAX NG; it should be possible in any XML schema language that uses XML syntax and has an extensibility mechanism. The only change needed is to adapt the XSLT extractor stylesheet to the extension mechanism of the host schema language.

Acknowledgements

I would like to thank Rick Jelliffe and Mike Fitzgerald for comments and suggestions on this article.

1 to 3 of 3
  1. Combining schema languages
    2004-02-17 13:40:06 Mark Seaborne
    Interesting article, thanks.


    I wonder if you have looked at all at how you can achieve pretty much the same results you have with Relax NG + Schematron, but with WXS + XForms Model.


    The XForms model, among other things, can be used to define XPath based constraints on an XML instance (including equivalent to the document() function), combined with a WXS schema.


    You could even use XForms + XHTML + CSS as a dynamic report on the validity of a particular instance if you so wished.


    I remember thinking the first time I looked at the XForms model that it does pretty much the same job as Schematron, and wondered why the authors hadn't just used Schematron, instead of coming up with their own syntax.


    Never mind, it does a job that you have demonstrated to be very useful.


    All the best


    Mark Seaborne

  2. Schematron+relax ng article
    2004-02-14 00:32:54 Dave Pawson
    1. Page 2, you say embed Schematron rules at the top level, then don't do it.
    2. Which version of schematron please?
    3. Why such a long complex example, why not focus on the issue being discussed, not the instance validation.
    4. No links to the command line Schematron stylesheet you use.


  3. RELAX NG / Schematron / OASIS CAM
    2004-02-12 19:23:07 David Webber
    Eddie,


    Wish you had posted a note to the RELAX NG list on OASIS vis this!


    I would have been able to point you at the work on OASIS CAM - which does not only what you describe - but a whole lot more too - and works with simple XML structures standalone.


    you can find a presentation on OASIS CAM - Content Assembly Mechanism - here:


    http://xml.gov/presentations.asp


    and see January 21st on CAM.


    You've validated the idea with your article - now OASIS CAM allows you to take it to another whole level beyond that - it's like having RELAX NG, Schematron, and XSLT all rolled into one.


    Enjoy, DW.
