XML.com: XML From the Inside Out

W3C XML Schema Design Patterns: Avoiding Complexity

November 20, 2002

Introduction

Over the course of the past year, during which I've worked closely with W3C XML Schema (WXS), I've observed many schema authors struggle with various aspects of the language. Given the size and relative complexity of the WXS recommendation (parts one and two), it seems that many schema authors would be best served by understanding and using an effective subset instead of attempting to comprehend all of its esoterica.

There have been a few public attempts to define an effective subset of W3C XML Schema for general usage; the most notable are W3C XML Schema Made Simple by Kohsuke Kawaguchi and the X12 Reference Model for XML Design by the Accredited Standards Committee (ASC) X12. However, both documents are extremely conservative and advise against useful features of WXS without adequately describing the cost of doing so.

This article is primarily a counterpoint to Kohsuke's and considers each of his original guidelines; the goal is to provide a set of solid guidelines about what you should and shouldn't do when working with WXS.

The Guidelines

I've altered some of Kohsuke's original guidelines:

  • Do use element declarations, attribute groups, model groups, and simple types.
  • Do use XML namespaces as much as possible. Learn the correct way to use them.
  • Do not try to be a master of XML Schema. It would take months.
  • Do use complex types and attribute declarations.
  • Do not use notations.
  • Do use local declarations.
  • Do carefully use substitution groups.
  • Do carefully use a schema without the targetNamespace attribute (aka chameleon schema).

I propose some additional guidelines as well:

  • Do favor key/keyref/unique over ID/IDREF for identity constraints.
  • Do not use default or fixed values, especially for types of xs:QName.
  • Do not use type or group redefinition.
  • Do use restriction and extension of simple types.
  • Do use extension of complex types.
  • Do carefully use restriction of complex types.
  • Do carefully use abstract types.
  • Do use elementFormDefault set to qualified and attributeFormDefault set to unqualified.
  • Do use wildcards to provide well-defined points of extensibility.

The guidelines qualified with the word carefully are best avoided by novice users unless absolutely required by the problem being solved.

Why You Should Use Global And Local Element Declarations

An element declaration is used to specify the structure, type, occurrence, and value constraints for an element. The element declaration is the most important and common piece of a schema document.

Element declarations that appear as children of the xs:schema element are global elements, which can be reused by referencing them in other parts of the schema or from other schema documents. They can also be members of substitution groups. Since the WXS recommendation doesn't provide a mechanism for specifying the root element of the document being validated, any global element can be used as the root element for a valid document.

Element declarations that appear within complex type or model group definitions, and that aren't references to a global element, are local elements. Unlike global elements, there can be many local element declarations with the same name and differing types in a schema as long as the local elements are not declared at the same level. Section 3.3 of the W3C XML Schema Primer gives the following example:

You can only declare one global element called "title", and that element is bound to a single type (e.g., xs:string or PersonTitle). However, you can locally declare one element called "title" that has a string type, and is a subelement of "book". Within the same schema (target namespace) you can declare a second element also called "title" that is an enumeration of the values "Mr Mrs Ms".
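The Primer's description can be sketched as a schema. This is only an illustration: the file name titles.xsd, the type names Book and Person, and the name element are assumptions introduced here; only title and PersonTitle come from the quoted passage.

titles.xsd
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
 targetNamespace="http://www.example.com"
 xmlns="http://www.example.com">

 <xs:element name="book" type="Book" />
 <xs:element name="person" type="Person" />

 <!-- local <title> declared as a plain string -->
 <xs:complexType name="Book">
  <xs:sequence>
   <xs:element name="title" type="xs:string" />
  </xs:sequence>
 </xs:complexType>

 <!-- second local <title>: same name, different type,
    legal because it is declared in a different type -->
 <xs:complexType name="Person">
  <xs:sequence>
   <xs:element name="title" type="PersonTitle" />
   <xs:element name="name" type="xs:string" />
  </xs:sequence>
 </xs:complexType>

 <!-- enumeration of the values Mr, Mrs, Ms -->
 <xs:simpleType name="PersonTitle">
  <xs:restriction base="xs:string">
   <xs:enumeration value="Mr" />
   <xs:enumeration value="Mrs" />
   <xs:enumeration value="Ms" />
  </xs:restriction>
 </xs:simpleType>
</xs:schema>

Had title been declared globally instead, it could be bound to only one of these two types.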

Global element declarations should be used for elements that will be reused from the target schema as well as from other schema documents, when the element and its associated type are comfortably bound together for widespread use. Local elements are to be favored when element declarations only make sense in the context of the declaring type and are unlikely to be reused.
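As a rough sketch of reuse from another schema document, the pair of schemas below declares a global element in one file and references it from a second file through xs:import. The file names, the namespace URIs, and the Catalog element are hypothetical and chosen only for illustration.

common.xsd
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
 targetNamespace="http://www.example.com/common">
 <!-- global element intended for reuse -->
 <xs:element name="language" type="xs:string" />
</xs:schema>

catalog.xsd
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
 targetNamespace="http://www.example.com/catalog"
 xmlns:c="http://www.example.com/common">
 <xs:import namespace="http://www.example.com/common"
  schemaLocation="common.xsd" />
 <xs:element name="Catalog">
  <xs:complexType>
   <xs:sequence>
    <!-- reference to the global element declared in common.xsd -->
    <xs:element ref="c:language" maxOccurs="unbounded" />
   </xs:sequence>
  </xs:complexType>
 </xs:element>
</xs:schema>

A local element declared inside the Catalog type could not be referenced this way, which is why elements meant for sharing are declared globally.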

By default, global elements have a namespace name equivalent to that of the target namespace of the schema, while local elements have no namespace name. So, by default, elements in an XML document which are meant to be validated against global element declarations should have a namespace name identical to that of the global element's schema target namespace. Those which are to be validated against local elements should have no namespace name. For example, consider this schema:

test.xsd
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" 
 targetNamespace="http://www.example.com"
 xmlns="http://www.example.com">

 <!-- global element declaration validates
    <language> elements from http://www.example.com
    namespace  -->
 <xs:element name="language" type="xs:string" />
 <xs:element name="Root" type="sequenceOfLanguages" />
 <xs:element name="Root2" type="sequenceOfLanguages2" />
 
 <!-- complex type with local element declaration
    validates <language> elements without a namespace
    name -->
 <xs:complexType name="sequenceOfLanguages" >  
  <xs:sequence>
   <xs:element name="language" type="xs:NMTOKEN" maxOccurs="unbounded" />
  </xs:sequence>
 </xs:complexType>

 <!-- complex type with reference to global
    element declaration -->
 <xs:complexType name="sequenceOfLanguages2" >  
  <xs:sequence>
   <xs:element ref="language" maxOccurs="10" />
  </xs:sequence>
 </xs:complexType>
</xs:schema>

test.xml
<?xml version="1.0"?>
<ex:Root xmlns:ex="http://www.example.com">
 <language>EN</language> 
</ex:Root> 

test2.xml
<?xml version="1.0"?>
<ex:Root2 xmlns:ex="http://www.example.com">
 <ex:language>English</ex:language> 
 <ex:language>Klingon</ex:language> 
</ex:Root2> 
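The default described above can be changed with the elementFormDefault attribute on xs:schema. The following is a minimal sketch, assuming a trimmed copy of the schema above saved under the hypothetical name test-qualified.xsd; with elementFormDefault set to qualified, the local language element takes on the target namespace, so the instance must qualify it.

test-qualified.xsd
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
 targetNamespace="http://www.example.com"
 xmlns="http://www.example.com"
 elementFormDefault="qualified">

 <xs:element name="Root" type="sequenceOfLanguages" />

 <!-- local element declaration now validates <language>
    elements from the http://www.example.com namespace -->
 <xs:complexType name="sequenceOfLanguages" >
  <xs:sequence>
   <xs:element name="language" type="xs:NMTOKEN" maxOccurs="unbounded" />
  </xs:sequence>
 </xs:complexType>
</xs:schema>

test-qualified.xml
<?xml version="1.0"?>
<ex:Root xmlns:ex="http://www.example.com">
 <ex:language>EN</ex:language>
</ex:Root>

Under this setting, the unqualified <language> child used in the earlier test.xml would no longer be valid.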

Why You Should Use Global And Local Attribute Declarations

An attribute declaration is used to specify the type, optionality, and defaulting information for an attribute.

Attribute declarations that appear as children of the xs:schema element are global attributes, which can be reused by referencing them in other parts of the schema or from other schema documents. Attribute declarations that appear within complex type definitions, and that do not reference global attributes, are local attributes.

Global attribute declarations should be used for attributes that will be reused from the target schema as well as from other schema documents. Local attributes should be used when attribute declarations only make sense in the context of the declaring type and are unlikely to be reused. Since attributes are usually tightly coupled to their parent elements, local attribute declarations are typically favored by schema authors. But there are cases where global attributes that can apply to many elements from multiple namespaces are useful (for example, xsi:type and xsi:schemaLocation).

By default, global attributes have a namespace name equivalent to that of the target namespace of the schema, while local attributes have no namespace name. Thus, attributes which are to be validated against global attribute declarations should have a namespace name identical to that of the global attribute's schema target namespace. Those to be validated against local attributes should have no namespace name. For example:

test.xsd
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" 
 targetNamespace="http://www.example.com" 
 xmlns="http://www.example.com">

 <!-- global attribute declaration validates
    language attributes from http://www.example.com namespace  --> 
 <xs:attribute name="language" type="xs:string" />
 <xs:element name="Root" type="sequenceOfNotes" />
 <xs:element name="Root2" type="sequenceOfNotes2" />

 <!-- complex type with local attribute
    declaration validates language attributes without a
    namespace name -->
 <xs:complexType name="sequenceOfNotes" >  
  <xs:sequence>
   <xs:element name="Note" type="xs:string" />
  </xs:sequence>
  <xs:attribute name="language" type="xs:NMTOKEN"  /> 
 </xs:complexType>

 <!-- complex type with reference to
    global attribute declaration -->
 <xs:complexType name="sequenceOfNotes2" >  
  <xs:sequence>
   <xs:element name="Note" type="xs:string" />
  </xs:sequence>
  <xs:attribute ref="language" />
 </xs:complexType>
</xs:schema>

test.xml
<?xml version="1.0"?>
<ex:Root xmlns:ex="http://www.example.com" language="EN" >
 <Note>Nothing to see here</Note> 
</ex:Root> 

test2.xml
<?xml version="1.0"?>
<ex:Root2 xmlns:ex="http://www.example.com" ex:language="The English Language">
 <Note>Nothing to see here</Note> 
</ex:Root2> 
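As an illustration of a global attribute from another namespace being applied to an element, the variant of test2.xml below adds xsi:schemaLocation. The file name test2-located.xml is hypothetical, and the sketch assumes the schema is stored as test.xsd next to the instance; processors generally treat the location as a hint and are free to ignore it.

test2-located.xml
<?xml version="1.0"?>
<ex:Root2 xmlns:ex="http://www.example.com"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://www.example.com test.xsd"
 ex:language="The English Language">
 <Note>Nothing to see here</Note>
</ex:Root2>

The xsi:schemaLocation attribute does not have to be declared in test.xsd; attributes from the XML Schema instance namespace are permitted on any element.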
