The W3C XML Schema Specification in Context
by Rick Jelliffe
Role of W3C XML Markup Declarations (DTDs) in the Immediate Future
W3C XML Markup Declarations (DTDs) are not superseded by W3C XML Schemas 1.0, and there is no general way to specify that DTD processing should not occur, nor any way to verify that it has not. Markup Declarations will therefore remain useful in the near future for tasks unrelated to schemas, in particular as a simple and terse syntax for moving document constants into headers, which is not really a matter of data or structural type specification:
- Entity declarations;
- Namespace declarations;
- Global variable defaults, particularly for xml:space;
- Attribute defaulting.
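As a rough illustration of the four uses listed above, here is a minimal sketch of an internal subset. The element name `memo`, the entity `company`, the `status` attribute, and the namespace URI are all hypothetical, chosen only for the example.

```xml
<?xml version="1.0"?>
<!DOCTYPE memo [
  <!-- Entity declaration: a document constant moved into the header -->
  <!ENTITY company "Example Corporation">

  <!-- xmlns:     namespace declaration supplied as a fixed attribute value
       xml:space: document-wide whitespace default
       status:    ordinary attribute default -->
  <!ATTLIST memo
    xmlns     CDATA               #FIXED "http://example.org/ns/memo"
    xml:space (default|preserve)  "preserve"
    status    (draft|final)       "draft">
]>
<memo>Issued by &company;.</memo>
```

A processor that reads the internal subset should report `memo` as if all three attributes had been written on the start tag, with `&company;` expanded in the content. The document is well-formed but not valid, since no element declarations are given; none of these four tasks requires them.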
The terseness of DTDs and their widespread deployment in XML processors make them a suitable notation for simple client-side validation; a W3C XML Schema may be transformed into the closest approximating W3C XML DTD. However, because a W3C XML Schema does not have a one-to-one correspondence between element name and content model, the closest approximating DTD may be less strict than what is needed or desired.
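The loss of strictness can be seen in a small sketch. The vocabulary below (`catalog`, `person`, `product`, `name`, `first`, `last`) is hypothetical, and the schema namespace URI is assumed to be the one used by the final Recommendation. The element name `name` is declared twice, locally, with two different content models:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="catalog">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="person" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <!-- Under person, "name" has element content -->
              <xs:element name="name">
                <xs:complexType>
                  <xs:sequence>
                    <xs:element name="first" type="xs:string"/>
                    <xs:element name="last" type="xs:string"/>
                  </xs:sequence>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
        <xs:element name="product" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <!-- Under product, "name" is plain text -->
              <xs:element name="name" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A DTD allows only one declaration per element name, so one plausible approximation has to merge both uses of `name` into a single mixed content model:

```xml
<!ELEMENT catalog (person+, product+)>
<!ELEMENT person  (name)>
<!ELEMENT product (name)>
<!-- One declaration must cover both uses of "name" -->
<!ELEMENT name    (#PCDATA | first | last)*>
<!ELEMENT first   (#PCDATA)>
<!ELEMENT last    (#PCDATA)>
```

This DTD accepts every document the schema accepts, but it also accepts a `product` whose `name` contains `first` and `last` children, which the schema would reject.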
W3C XML Schema and ISO SGML Markup Declarations (SGML DTDs)
ISO SGML (IS 8879:1986 as amended 1997) provides features and capabilities beyond those of W3C XML 1.0. ISO SGML allows the specification of many different kinds of grammars: different levels of tag and delimiter omission, contextual delimiter recognition, and richer support for modeling a document as an asynchronous tree of elements and tree of entities, each of which can have local links and other attributes. Consequently, an ISO SGML DTD is really a grammar specification rather than a data schema per se, though in practice such a regular grammar contains enough structural definition to make it useful for many kinds of data modeling.
| ISO SGML Markup Declarations | W3C XML Schemas | Comments |
|---|---|---|
| CDATA Declared Content Type | Not supported by XML. | This is a parser function. |
| RCDATA Declared Content Type | Not supported by XML. | This is a parser function. |
| ANY Declared Content Type | Supported by the `<any/>` particle. The urType (the top-most type from which all other types are derived), called "anyType", allows any subelements and any attributes. | Various wildcards allow ANY to be restricted to certain namespaces. Also, substitution groups allow a content model to name the position of a particle but allow the name and complex type to be specified elsewhere. |
| & Connector | Conjunction. The `<all>` element allows this functionality at the top level of an element only. | In an `<all>` group, the elements have a maxOccurs of 1. |
| ISO SGML Content Models | No equivalent of allowing #PCDATA as a particular particle in content models. | W3C XML Schemas keeps the ISO SGML requirement for unambiguous content models. The big difference between ISO SGML DTDs and W3C XML Schemas is the tag/type distinction. |
| Global Inclusion Exceptions | Not supported directly. A single-level inclusion can be made by using type refinement on the complex types elsewhere. | However, the effect of a global inclusion can be achieved by deriving restricted types for each complex type possible underneath an element to any level. This may double the number of declarations for each inclusion. |
| Global Exclusion Exceptions | Not supported directly. A single-level exclusion can be made using type refinement. | However, the effect of a global exclusion can be achieved by deriving restricted types for each complex type possible underneath an element to any level. This may double the number of declarations for each exclusion. |
| NUTOKEN, NUTOKENS Attribute Types | Can be supported using regular expressions, subclassing the simple type "token". NUTOKENS can be supported by deriving a list type. | |
| NAME, NAMES Attribute Types | NAME is supported by the simple type "Name". NAMES can be supported by deriving a list type. | To get a type closer to the ISO SGML Reference Concrete Syntax defaults, derive a type from NCName (which allows no colons) and restrict the type further to characters less than 0xFF using regular expressions and the pattern attribute. |
| ENTITY Types (e.g., SDATA, CDATA) and Data Attributes (Attributes on Entities) | Not supported. | |
| #CONREF Attribute Keyword | Not supported. | No support for keying occurrence from the value of an attribute. |
| SUBDOC | Not supported. | However, the key mechanism allows a scoping of IDs and IDREFs, and both the namespace mechanism and the tag/type mechanism allow element names to refer to different types in different contexts. In a sense, each new namespace encountered is a SUBDOC, as they all will have a separate schema. |
| LINK Attribute Groups | Not supported. | However, a similar effect can be gained using type refinement, so that different default and fixed attribute values are added in different contexts. |
| Data Attributes on Elements | Not supported. | |
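
A few of the equivalences in the table can be sketched in schema syntax. This is an illustration under stated assumptions rather than a definitive mapping: the names `address`, `street`, `city`, `postcode`, `nutoken`, `nutokens`, and `names` are hypothetical, the namespace URI is assumed to be that of the final Recommendation, and the NUTOKEN pattern assumes the Reference Concrete Syntax name characters (letters, digits, period, hyphen) while ignoring its length limit.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Approximates the SGML "&" connector: the children may appear in
       any order, each at most once, at the top level of the content model -->
  <xs:element name="address">
    <xs:complexType>
      <xs:all>
        <xs:element name="street" type="xs:string"/>
        <xs:element name="city" type="xs:string"/>
        <xs:element name="postcode" type="xs:string" minOccurs="0"/>
      </xs:all>
    </xs:complexType>
  </xs:element>

  <!-- NUTOKEN approximation: a token restricted by a regular
       expression so that it must start with a digit -->
  <xs:simpleType name="nutoken">
    <xs:restriction base="xs:token">
      <xs:pattern value="[0-9][A-Za-z0-9.\-]*"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- NUTOKENS approximation: a whitespace-separated list of nutoken -->
  <xs:simpleType name="nutokens">
    <xs:list itemType="nutoken"/>
  </xs:simpleType>

  <!-- NAMES approximation: a whitespace-separated list of Name -->
  <xs:simpleType name="names">
    <xs:list itemType="xs:Name"/>
  </xs:simpleType>

</xs:schema>
```

Unlike their SGML counterparts, the list types as written accept an empty value; a further restriction with a minimum length of one item would be needed to rule that out.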