XML.com: XML From the Inside Out

The W3C XML Schema Specification in Context
by Rick Jelliffe

Table of Contents

W3C XML Schemas operates on the Information Set of a Document

W3C XML Schemas and W3C XML Markup Declarations (DTDs)

Role of W3C XML Markup Declarations (DTDs) in the Immediate Future

W3C XML Schema and ISO SGML Markup Declarations (SGML DTDs)

W3C XML Schema and ISO SGML Extended Facilities (Meta-DTDs and Lexical Types)

Schema Languages influenced by W3C XML Schemas

W3C XML Schema and ISO SGML Extended Facilities (Meta-DTDs and Lexical Types)

A W3C XML Schema is a high-level specification of an architecture. W3C XML Schemas could be implemented as

  • a transformation on the document to add xsi:type attributes, based on the type derivation mechanism;
  • a transformation on the schema to derive an effective schema, expressed according to the ISO HyTime Architectural Forms Definition Requirements;
  • architectural parse of the document using the effective schema as a meta-DTD and the xsi:type attribute as the element form.
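The first of these steps can be illustrated with a minimal sketch. It assumes a hypothetical, hand-written mapping from element names to type names (TYPE_OF); a real implementation would compute that mapping from the schema's type derivations, which this sketch does not attempt. The namespace URI is the one used by the final W3C XML Schema Recommendation.

```python
import xml.etree.ElementTree as ET

# Namespace of the xsi:type attribute in the W3C XML Schema Recommendation.
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
XSI_TYPE = f"{{{XSI_NS}}}type"

# Hypothetical mapping from element name to schema type name.
# In practice this would be derived from the schema, not written by hand.
TYPE_OF = {"order": "OrderType", "item": "ItemType"}


def annotate_types(root, type_of=TYPE_OF):
    """Add xsi:type to every element with a known type.

    Elements that already carry xsi:type, or whose name is not in the
    mapping, are left unchanged.
    """
    for elem in root.iter():
        if XSI_TYPE not in elem.attrib and elem.tag in type_of:
            elem.set(XSI_TYPE, type_of[elem.tag])
    return root


doc = annotate_types(ET.fromstring("<order><item/><note/></order>"))
```

After this pass, an architectural processor could read the xsi:type attribute as the element form, as described in the third step.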

It has not yet been proven that every W3C XML Schema constraint can be expressed using meta-DTDs and the other standard features of the ISO SGML Extended Facilities (given in the Annexes to the ISO HyTime standard). Consequently, an architectural validation system using meta-DTDs in ISO SGML markup declaration syntax may not completely validate every W3C XML Schema instance. In particular, the use of namespaces makes the required transformations harder to understand. Conversely, it is certainly not true that every schema definable using Architectural Forms has an equivalent W3C XML Schema: attribute renaming cannot be performed, for example. The tag/type distinction is the same as the element-form/architecture distinction: an abstract element type is a "base" (architectural) element.

W3C XML Schemas provides similar lexical capabilities to the ISO SGML Extended Facilities Lexical Definition Requirements, using a non-standard regular expression syntax.

W3C XML Schema and Perl Regular Expressions

| Perl Regular Expressions | W3C XML Schema Regular Expressions | Comments |
|---|---|---|
| ^ = beginning of string | ^ = character ^ only | All regular expression matches start from the beginning of the string. For substring matching use .*substring.* |
| $ = end of string | $ = character $ only | All regular expression matches end at the end of the string |
| Zero-width assertions, look-ahead and look-behind, back references | Not available | |
| Non-greedy + and * | Not available | |
| \c | An XML NAME character | |
| \i | An XML initial NAME (i.e., SGML NAMESTRT) character | |
| \033 and \xAB | XML Numeric Character Reference must be used | |
| \p{} | \p{} | The character classes allowed are the Unicode Consortium's character classes. |
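The implicit anchoring described in the first two rows can be approximated in a Perl-style engine. The sketch below uses Python's re module, whose syntax is close to Perl's; the helper name xsd_style_match is hypothetical. It only illustrates whole-string matching and is not a translator: Python still treats ^ and $ as anchors and does not support \c or \i.

```python
import re


def xsd_style_match(pattern, value):
    """Return True only if the pattern matches the entire value.

    This mimics the whole-string matching of W3C XML Schema patterns;
    re.search, by contrast, accepts a match anywhere in the value.
    """
    return re.fullmatch(pattern, value) is not None


# Whole-string semantics: a three-digit pattern rejects a four-digit value.
three_ok = xsd_style_match(r"\d{3}", "123")
four_ok = xsd_style_match(r"\d{3}", "1234")

# Perl-style unanchored search finds three digits inside the longer value.
found_inside = re.search(r"\d{3}", "1234") is not None

# Substring matching under whole-string semantics needs explicit .* padding.
padded_ok = xsd_style_match(r".*abc.*", "xxabcxx")
bare_ok = xsd_style_match(r"abc", "xxabcxx")
```

Here three_ok, found_inside and padded_ok are true, while four_ok and bare_ok are false, which is the behaviour the Comments column describes.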

 
