XML.com: XML From the Inside Out

The W3C XML Schema Specification in Context
by Rick Jelliffe

Table of Contents

W3C XML Schemas operates on the Information Set of a Document

W3C XML Schemas and W3C XML Markup Declarations (DTDs)

Role of W3C XML Markup Declarations (DTDs) in the Immediate Future

W3C XML Schema and ISO SGML Markup Declarations (SGML DTDs)

W3C XML Schema and ISO SGML Extended Facilities (Meta-DTDs and Lexical Types)

Schema Languages influenced by W3C XML Schemas

Schema Languages influenced by W3C XML Schemas

JIS RELAX and Schematron are schema languages influenced to varying degrees by W3C XML Schema. Both were created by W3C XML Schema WG members and may be seen as "minority reports" espousing alternative features or approaches to those of W3C XML Schemas. Both have adopted suggestions made during the development of W3C XML Schemas that did not make the final cut. However, in public material the authors of both have stressed that the design differences largely flow from having a different answer to the question, "what problem should Schemas solve?" In particular, both share the view that a schema language should not make information set contributions: a document should not require schema processing to have a complete information set.

DSD takes an opposite approach, paying much attention to various defaulting issues, including defaults which make information set contributions.
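To make "information set contribution" concrete, here is a minimal Python sketch of schema-driven defaulting. It does not use DSD or any real schema processor; the `DEFAULTS` table and the `para`/`align` names are hypothetical, chosen only to show that the application sees different data depending on whether the defaulting step ran.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaulting table: (element name, attribute name) -> default value.
DEFAULTS = {("para", "align"): "left"}

def apply_defaults(root):
    """Add any missing defaulted attributes, mutating the tree in place."""
    for element in root.iter():
        for (tag, attr), value in DEFAULTS.items():
            if element.tag == tag and attr not in element.attrib:
                element.set(attr, value)
    return root

doc = ET.fromstring('<doc><para/><para align="right"/></doc>')

# What an application sees without schema processing.
before = [p.get("align") for p in doc.iter("para")]

# What it sees after the (hypothetical) schema has contributed defaults.
after = [p.get("align") for p in apply_defaults(doc).iter("para")]
```

The first `para` has no `align` value until the defaulting pass runs, which is exactly the dependence on schema processing that RELAX and Schematron avoid.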

Interested readers may also enjoy the paper "Comparative Analysis of Six XML Schema Languages", Lee and Chu, ACM SIGMOD Record 29(3), September 2000. Its comments on the various schema languages refer to earlier drafts, so individual comments may be out of date.

JIS RELAX

JIS RELAX is based on supporting document editing and schema modularization: it treats the schema as a grammar definition. This can be contrasted with XML Schemas, which treats the schema as a type definition system.

A subset called RELAX Core is fairly convertible to W3C XML Schemas and to DTDs. RELAX adopts W3C XML Schema's datatypes and annotation elements, and uses the same naming conventions as far as possible. It does not provide type derivation. Some features of W3C XML Schemas are not supported.

A second level, RELAX Namespace, adds further features, particularly in the areas of modularization and schema combination.

Not every full RELAX Schema can be converted fully into a W3C XML Schema: modularization information will be lost, as will the selection of type based on all the information present in a start-tag (an excellent design feature which DSD and Schematron share, following the lead in Dave Raggett's Assertion Grammars). However, useful conversion is certainly possible.
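The idea of selecting a type from everything in the start-tag, not just the element name, can be sketched in Python as follows. This is not RELAX syntax or a RELAX implementation; the `TYPE_RULES` table, the `value` element, and the `kind` attribute are hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: (element name, required attribute values, selected type).
# Rules are tried in order, so the attribute-free fallback comes last.
TYPE_RULES = [
    ("value", {"kind": "integer"}, "IntegerValue"),
    ("value", {"kind": "date"}, "DateValue"),
    ("value", {}, "StringValue"),
]

def select_type(element):
    """Pick a type using both the tag name and the attributes of the start-tag."""
    for tag, required, type_name in TYPE_RULES:
        if element.tag == tag and all(
            element.get(name) == wanted for name, wanted in required.items()
        ):
            return type_name
    return None  # no rule applies to this start-tag

row = ET.fromstring(
    '<row><value kind="integer">7</value>'
    '<value kind="date">2001-01-01</value>'
    '<value>seven</value></row>'
)
selected = [select_type(child) for child in row]
```

All three children share the name `value`, yet each gets a different type, which is the kind of selection that a name-and-parent-type lookup alone cannot express.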

Refer: http://www.xml.gr.jp/relax/ especially the FAQ. JIS RELAX is mooted for release as an ISO Technical Report.

Schematron

My Schematron schema language is based on making assertions about the presence or absence of patterns in the document object tree. Paths and expressions use the version of W3C XPath paths and expressions available in W3C XSLT. Schematron schemas in particular allow the validation of co-occurrence constraints in a document where the presence or absence or value of some element or attribute in some document impacts the presence or absence or value of another element or attribute, possibly in another document.
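As a rough illustration of assertion-based validation, the following Python sketch pairs a context path with a test and a message, in the spirit of a Schematron rule. It is not Schematron itself: real Schematron uses full XPath via XSLT, whereas this uses the limited path subset in Python's `xml.etree.ElementTree`, and the `RULES` table and the `order`/`item` vocabulary are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: (context path, test on each context node, failure message).
RULES = [
    # Co-occurrence: an item that declares a currency must also declare a price.
    (".//item[@currency]",
     lambda e: e.get("price") is not None,
     "item with currency must also have price"),
    # Presence: every order must contain at least one item.
    (".//order",
     lambda e: e.find("item") is not None,
     "order must contain an item"),
]

def check(root, rules=RULES):
    """Return the messages of all failed assertions, in rule order."""
    failures = []
    for context, test, message in rules:
        for node in root.findall(context):
            if not test(node):
                failures.append(message)
    return failures

DOC = """<orders>
  <order><item price="10" currency="USD"/></order>
  <order><item currency="EUR"/></order>
  <order/>
</orders>"""

failures = check(ET.fromstring(DOC))
```

Here the second order fails the co-occurrence rule and the third fails the presence rule; neither constraint needs a grammar for the whole document.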

Schematron relates to XML Schemas in two ways: first, a W3C XML Schema may be partially but usefully transformed into a Schematron schema, though this may be quite complex to achieve; second, a Schematron schema can be embedded into a W3C XML Schema in an <appinfo> element, providing an extension to express co-occurrence constraints. Schematron demonstrates that path expressions are a useful tool in the vocabulary of schema language designers, perhaps as useful as grammars, though with different modeling capabilities.

A Schematron implementation may take only a few hundred lines (given the availability of a W3C XSL library) while a W3C XML Schema implementation may take tens of thousands of lines.

Refer: http://www.ascc.net/xml/resource/schematron/schematron.html and, for an interview with me, see http://www.xmlhack.com/read.php?item=121

Document Structure Descriptions

DSD is a grammar-based approach that transplants some of the useful mechanisms of CSS into the schema world. It does not handle namespaces yet.

DSD allows a kind of simple path expressions for getting context-dependent validation, inspired by W3C CSS selectors. In this it provides more than W3C XML Schema but much less than Schematron, which is not constrained to the ancestor tree. In W3C XML Schemas, a type can be selected based on the element type name and in the context of a particular type (e.g., by the element's name and the parent's type).
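The W3C XML Schema behavior described here, where a child's type depends on its name together with its parent's type, can be sketched in Python as a lookup table. This is an illustration only, not a schema processor; the `SCHEMA` table and all type and element names in it are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: each type maps a child element name to the child's type.
SCHEMA = {
    "BookType": {"title": "BookTitleType", "author": "PersonType"},
    "PersonType": {"title": "HonorificType", "name": "NameType"},
}

def assign_types(element, type_name, out=None, path=""):
    """Walk the tree, typing each child by (parent type, child name)."""
    out = {} if out is None else out
    path = f"{path}/{element.tag}"
    out[path] = type_name
    allowed = SCHEMA.get(type_name, {})  # leaf types allow no children
    for child in element:
        child_type = allowed.get(child.tag)
        if child_type is None:
            raise ValueError(f"{child.tag} not allowed in {type_name}")
        assign_types(child, child_type, out, path)
    return out

doc = ET.fromstring(
    "<book><title>XML</title>"
    "<author><title>Dr</title><name>Ann</name></author></book>"
)
types = assign_types(doc, "BookType")
```

The two `title` elements receive different types because their parents have different types. Nothing beyond the parent's type and the child's name is consulted, which is why this is weaker than the ancestor-based selection in DSD or the arbitrary paths in Schematron.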

DSD also follows W3C CSS by allowing gradual declaration of allowed contents and attributes, as part of a comprehensive defaulting strategy.

Refer: http://www.brics.dk/DSD/ but note that some of the statements about XML Schemas are now out-of-date. W3C XML Schemas has edged unsystematically closer to DSD:

  • W3C XML Schemas do allow the parent type to determine the type of a child element, not only the name of an element; an element type is like a non-terminal;
  • W3C XML Schemas do allow the type of the target of a reference to be constrained, using the identity constraint mechanism which utilizes XPaths;
  • W3C XML Schemas do allow the piecemeal declaration of types by the substitution group mechanism.