Comparing XML Schema Languages
by Eric van der Vlist

Examplotron

Overview

Author: Eric van der Vlist
Status: Unofficial
Location: http://examplotron.org/
PSVI: No (not yet)
Structures: Yes
Datatypes: No (not directly)
Integrity: No (not directly)
Rules: Yes, through XPath expressions
Vendor support: None
Miscellaneous: Schema by example (a sample document is a schema) with rules checking (syntax borrowed from Schematron).

Examplotron is an experiment to define a schema language based on sample trees, not unlike early proposals for XPath. An Examplotron schema for our sample could be:

<?xml version="1.0" encoding="UTF-8"?>
<library xmlns:eg="http://examplotron.org/0/">
  <book id="_0836217462" 
     eg:occurs="+" 
     eg:assert="not(following-sibling::book/@id=@id) and @id=concat('_', isbn)">
    <isbn>0836217462</isbn>
    <title>Being a Dog Is a Full-Time Job</title>
    <author-ref id="Charles-M.-Schulz"  eg:occurs="*"/>
    <character-ref id="Peppermint-Patty"  eg:occurs="*"/>
  </book>
  <author id="Charles-M.-Schulz" eg:occurs="*">
    <name>Charles M. Schulz</name>
    <nickName>SPARKY</nickName>
    <born>November 26, 1922</born>
    <dead>February 12, 2000</dead>
  </author>
  <character id="Peppermint-Patty" eg:occurs="*">
    <name>Peppermint Patty</name>
    <since>Aug. 22, 1966</since>
    <qualification>bold, brash and tomboyish</qualification>
  </character>
</library>
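
To make the eg:assert rule above more concrete, here is a hypothetical instance fragment (invented for illustration, not taken from the article's sample). It assumes the assertion is evaluated with each book element as the XPath context node. Under that assumption the first book should fail on both counts: its id does not equal concat('_', isbn), and a following sibling book repeats the same id.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical instance, invented for illustration only. -->
<library>
  <!-- Expected to fail: @id is "_0000000000" but concat('_', isbn)
       gives "_0836217462"; the next book also repeats this id. -->
  <book id="_0000000000">
    <isbn>0836217462</isbn>
    <title>Being a Dog Is a Full-Time Job</title>
  </book>
  <!-- Expected to pass when checked alone: the id matches the isbn
       and no following sibling book carries the same id. -->
  <book id="_0000000000">
    <isbn>0000000000</isbn>
    <title>Placeholder Title</title>
  </book>
</library>
```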

Mix and Match

We have seen that the features of some of these languages are more complementary than overlapping, and there is room for interesting combinations, especially with Schematron and the structure and datatype-based languages.

Some early implementations are available which support the embedding of Schematron rules in xs:annotation/xs:appinfo W3C XML Schema elements. The combination of W3C XML Schema and Schematron enables the use of each language for the purpose for which it was designed: structure and datatype validation for W3C XML Schema, and rules for Schematron. The power of the rules expressed with Schematron can also compensate for the weaknesses of W3C XML Schema.
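
As a rough sketch of what such an embedding might look like, the fragment below places a Schematron pattern inside xs:annotation/xs:appinfo of a simplified book declaration. The content model is deliberately cut down from our sample, the sch prefix is bound to the Schematron 1.5 namespace as an assumption, and how a given tool extracts and runs the embedded rules is implementation-specific.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: a simplified "book" declaration, not the full sample schema. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sch="http://www.ascc.net/xml/schematron">
  <xs:element name="book">
    <xs:annotation>
      <xs:appinfo>
        <!-- Rule part: a co-occurrence constraint expressed in XPath. -->
        <sch:pattern name="Book id matches ISBN">
          <sch:rule context="book">
            <sch:assert test="@id = concat('_', isbn)">
              The id attribute must be the ISBN prefixed with an underscore.
            </sch:assert>
          </sch:rule>
        </sch:pattern>
      </xs:appinfo>
    </xs:annotation>
    <!-- Structure and datatype part: handled by W3C XML Schema itself. -->
    <xs:complexType>
      <xs:sequence>
        <xs:element name="isbn" type="xs:string"/>
        <xs:element name="title" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:ID" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A plain W3C XML Schema processor ignores the content of xs:appinfo, so the same file stays usable for structure and datatype validation even where no Schematron-aware tool is available.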

Discussions are also underway to embed Schematron rules in RELAX NG schemas. This would then lead to a possible combination of RELAX NG for the structure, W3C XML Schema part 2 for the datatypes and Schematron for the rules, which would certainly demonstrate the extensibility of XML applications.
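
Since these discussions had not settled on a convention, the following is only a speculative sketch of such a three-way combination: RELAX NG patterns for the structure, the W3C XML Schema datatype library for the datatypes, and a Schematron pattern carried as a foreign-namespace element. The placement of the sch:pattern element, the Schematron 1.5 namespace, and the ISBN regular expression are all assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Speculative sketch: the embedding convention shown here is hypothetical. -->
<element name="book"
         xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:sch="http://www.ascc.net/xml/schematron"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <!-- Rules: a foreign element, skipped by a plain RELAX NG validator. -->
  <sch:pattern name="Book id matches ISBN">
    <sch:rule context="book">
      <sch:assert test="@id = concat('_', isbn)">
        The id attribute must be the ISBN prefixed with an underscore.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
  <!-- Structure: RELAX NG patterns; datatypes: W3C XML Schema Part 2. -->
  <attribute name="id">
    <data type="ID"/>
  </attribute>
  <element name="isbn">
    <data type="string">
      <!-- Illustrative facet: nine digits followed by a digit or X. -->
      <param name="pattern">[0-9]{9}[0-9X]</param>
    </data>
  </element>
  <element name="title">
    <text/>
  </element>
</element>
```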

Comparisons

To wrap up, I will summarize the pros and cons of each language.

Tool support (as of today)

  1. Best: DTD

  2. Most promising: W3C XML Schema

  3. Challenger: RELAX NG

  4. Niche: Schematron and Examplotron

Features

  1. Structures: DTD, W3C XML Schema, RELAX NG, Examplotron.

  2. Datatype: W3C XML Schema

  3. Integrity: W3C XML Schema, Schematron, Examplotron

  4. Rules: Schematron, Examplotron

Flexibility (ability to describe a wide range of structures)

  1. Most flexible: Schematron (but everything needs to be defined "by hand").

  2. Most flexible structure-based language: RELAX NG.

  3. Challenger: Examplotron

  4. Behind: W3C XML Schema

  5. Least flexible: DTD (lack of namespace support)

So What?

There is currently no perfect XML schema language. Fortunately, there are a number of good choices, each with strengths and weaknesses, and these choices can be combined. Your job may be as simple as picking the right combination for your application.


Comments

  • ISBN checksum validation
    2006-06-26 20:15:28 Conal

    I found this article while searching for a way to validate ISBNs using Schematron. The article hints that this is possible, but doesn't give an example. For future reference, the following XPath expression can be used in Schematron as the "test" attribute of an "assert" element:


    substring(
      '0123456789X',
      1 + (11 - (
        10 * substring(translate(., ' -', ''), 1, 1) +
        9 * substring(translate(., ' -', ''), 2, 1) +
        8 * substring(translate(., ' -', ''), 3, 1) +
        7 * substring(translate(., ' -', ''), 4, 1) +
        6 * substring(translate(., ' -', ''), 5, 1) +
        5 * substring(translate(., ' -', ''), 6, 1) +
        4 * substring(translate(., ' -', ''), 7, 1) +
        3 * substring(translate(., ' -', ''), 8, 1) +
        2 * substring(translate(., ' -', ''), 9, 1)
      ) mod 11) mod 11,
      1
    ) = substring(translate(., ' -', ''), 10, 1)

  • Where is RDF Schema
    2001-12-13 07:51:41 John McClure

    RDF Schema, particularly when combined with DAML+OIL, is a schema definition language also available from the W3C; it amazes me that the author failed to include it in the survey. Sure, the author mentions RDF while discussing XML-Data, but that is hardly giving RDF Schema its due.


    The author's vision may need some fine tuning: the 'schema' represented by objects in a software program is best coordinated with the schema of the data in an XML document -- that is what RDF Schema gives us, simply and forcefully.

    • Where is RDF Schema
      2001-12-13 15:45:28 Eric van der Vlist

      I see them as belonging to very different levels.


      RDF Schema constrains the relations between the triples of an RDF application independently of its XML serialization (and even if the application doesn't have any XML serialization).


      The XML schema languages, on the other hand, constrain the XML syntax and, even when applied to RDF documents, they restrict the triples only in a very indirect way.


      Depending on your needs, you may then use only one of them (RDF Schema if you care only about the triples, an XML schema language if you care only about the syntax), both of them (if you care about the triples and want to fix a syntax too) or neither...


      Since I see them as independent, I thought that setting them against each other by including RDF Schema in the comparison would have been more confusing than helpful.


      That's also why I haven't mentioned UML or other modeling technologies.

  • vendor support of schematron
    2001-12-13 01:16:29 bryan rasmussen

    I think it's an error to say that Schematron has low vendor support. Considering that there is a Schematron implementation written in XSLT, it follows that vendor support is very high; the same goes for Examplotron.

    • vendor support of schematron
      2001-12-13 15:49:35 Eric van der Vlist

      Yes, you're right.


      In fact all the schema languages which I have mentioned have a high level of support since they come with open source implementations which you can embed in your applications.


      What I meant is that support for them within applications such as XML editors or browsers (and even in other specifications such as XPath 2.0) is currently weaker than the support for DTDs and W3C XML Schema.