XML.com: XML From the Inside Out


Transforming XML Schemas

January 15, 2003

A W3C XML Schema (WXS) document contains valuable information that can be used throughout a system or application, but the complexity that WXS allows can make this difficult in practice. XSLT, however, can concisely and efficiently manipulate WXS documents in order to perform a number of tasks, including creating HTML input forms, generating query interfaces, documenting data structures and interfaces, and controlling a variety of user interface elements.

As an example, this article describes an XSLT document which creates an XHTML form based on the WXS definition of a complex element. For brevity and clarity, the article omits several WXS and XHTML form aspects, including attribute definitions, keys, imported/included schemas, and qualified name issues. How you implement these additional features depends greatly on how you use WXS and on your application. Building a stylesheet that handles every possible WXS feature takes considerable effort and is often unnecessary.

Much of the information -- occurrence constraints, data types, special restrictions, and enumerations -- needed to build an XHTML form is already contained in a WXS document. Missing bits such as label text and write restrictions can be added into WXS's <annotation> element.

The stylesheet will perform four distinct tasks:

  1. Find the definition of the target complex element that we want.
  2. Build a form element for the target element.
  3. Find the definitions of the target element's valid children.
  4. Build an input element for each of the simple child elements.

In order to do this, the stylesheet will apply different template rules to similar WXS elements depending on the task at hand. To make this possible, the stylesheet will use a separate mode for each task. The modes are default, targetElementForm, findChildNodes, and childNodeInput. Using a separate mode for each task will also help when modifying, expanding, or deciphering this stylesheet.
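As a toy illustration of how modes let one stylesheet treat the same WXS element differently, the sketch below visits each global element twice and produces different text each time. It borrows two of the mode names listed above, but the template bodies are placeholders and not the templates developed in this article.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xsl:output method="text"/>

  <!-- Default mode: hand each global element to two different tasks. -->
  <xsl:template match="xs:schema">
    <xsl:apply-templates select="xs:element" mode="targetElementForm"/>
    <xsl:apply-templates select="xs:element" mode="childNodeInput"/>
  </xsl:template>

  <!-- Same match pattern, different mode: placeholder for building a form. -->
  <xsl:template match="xs:element" mode="targetElementForm">
    <xsl:text>form for </xsl:text>
    <xsl:value-of select="@name"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Same match pattern again: placeholder for building an input. -->
  <xsl:template match="xs:element" mode="childNodeInput">
    <xsl:text>input for </xsl:text>
    <xsl:value-of select="@name"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Because the processor only considers templates in the mode named by the calling xsl:apply-templates, the two xs:element rules never compete with each other.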

Seek and Transform I

When looking someone up in the phonebook, unique names are very helpful: "Smith, Julian Forsythe" is better than "Smith, J". Similarly, we need to name the target element uniquely. This can be tricky because an element name is not necessarily unique; multiple locally defined elements or attributes can share the same name. For inspiration, we can look to NUNs (Normalized Universal Names), which are path statements used to uniquely identify components of a schema. NUNs made their first appearance in the Schema Formal Description Working Draft. The W3C recently issued a formal draft of NUNs under the new name Schema Component Designators (SCD).

For this exercise we will use a simplified form of NUN based on element names to identify the target element and pass it into the stylesheet as a parameter. So, for example, the NUN for the definition of the global <actionItem> element would be "element::actionItem", and the NUN for an element <employee>, local to the global <workGroup> element, would be "element::workGroup/element::employee".
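To make the two example NUNs concrete, here is a small schema fragment with each definition labeled by the simplified NUN that would identify it. The element names come from the examples above; the types and occurrence constraints are assumptions made only for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- NUN: element::actionItem -->
  <xs:element name="actionItem" type="xs:string"/>

  <!-- NUN: element::workGroup -->
  <xs:element name="workGroup">
    <xs:complexType>
      <xs:sequence>
        <!-- NUN: element::workGroup/element::employee -->
        <xs:element name="employee" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```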

This article will preface the description of each set of template rules with a model based on UML Activity Diagrams. In these diagrams, template rules are shown as states; modes are shown as composite states; <xsl:apply-templates> actions are shown as arrows; branching by template rule match patterns is shown as hollow circles; and other logical branches and the XSLT entry point are shown as solid circles.

Figure: Diagram of the default mode

The first template to fire, match='xs:schema', will extract the first element name from the NUN string. The first element name in the NUN will always be a global element, so its definition will be a child of the <xs:schema> element.

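<xsl:param name="targetNUN"/>

<xsl:template match="xs:schema">
  <!-- Append a trailing slash so that every name in the NUN ends with '/'. -->
  <xsl:variable name="NUNModified" select="concat($targetNUN,'/')"/>
  <!-- This select is missing from the listing; it is reconstructed from the NUN format described above. -->
  <xsl:variable name="NUNToken"
    select="substring-after(substring-before($NUNModified,'/'),'element::')"/>
  <xsl:apply-templates select="xs:element[@name=$NUNToken]">
    <xsl:with-param name="NUNRemain" select="substring-after($NUNModified,'/')"/>
  </xsl:apply-templates>
</xsl:template>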

The other templates in the mode will then recursively parse each element name in the NUN, drilling down through the nested local element definitions until all names from the NUN have been parsed and the target element has been found.

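<xsl:template match="xs:element[@name]">
  <xsl:param name="NUNRemain"/>
  <xsl:choose>
    <xsl:when test="string-length($NUNRemain)=0">
      <xsl:apply-templates select="." mode="targetElementForm"/>
    </xsl:when>
    <xsl:otherwise> <!-- the two select values below are reconstructed; the listing is truncated -->
      <xsl:apply-templates select="xs:complexType/*">
        <xsl:with-param name="NUNRemain" select="substring-after($NUNRemain,'/')"/>
        <xsl:with-param name="NUNToken" select="substring-after(substring-before($NUNRemain,'/'),'element::')"/>
      </xsl:apply-templates>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>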

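<xsl:template match="xs:sequence|xs:choice|xs:all">
  <xsl:param name="NUNRemain"/>
  <xsl:param name="NUNToken"/>
  <!-- Reconstructed select: the original expression is missing from this listing. -->
  <xsl:apply-templates select="xs:element[@name=$NUNToken]|xs:sequence|xs:choice|xs:all">
    <xsl:with-param name="NUNRemain" select="$NUNRemain"/>
    <xsl:with-param name="NUNToken" select="$NUNToken"/>
  </xsl:apply-templates>
</xsl:template>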

<xsl:template match="*"/>
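The string handling behind this recursion can be tried in isolation. The following standalone sketch splits a NUN into its element names using the same trailing-slash trick and the same substring-before() and substring-after() calls. The named template printTokens and the default parameter value are hypothetical and exist only for this demonstration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Default value is for demonstration; normally passed in by the caller. -->
  <xsl:param name="targetNUN" select="'element::workGroup/element::employee'"/>

  <xsl:template match="/">
    <xsl:call-template name="printTokens">
      <!-- Trailing slash guarantees every name is followed by '/'. -->
      <xsl:with-param name="NUNRemain" select="concat($targetNUN,'/')"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical helper: print one element name per line, then recurse. -->
  <xsl:template name="printTokens">
    <xsl:param name="NUNRemain"/>
    <xsl:if test="string-length($NUNRemain) &gt; 0">
      <xsl:value-of
        select="substring-after(substring-before($NUNRemain,'/'),'element::')"/>
      <xsl:text>&#10;</xsl:text>
      <xsl:call-template name="printTokens">
        <xsl:with-param name="NUNRemain" select="substring-after($NUNRemain,'/')"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Run against any XML input with the default parameter, this prints workGroup and employee on separate lines. The recursion stops when the remainder is empty, which is the same test the default-mode templates use to recognize the target element.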

Powers of Annotation

Now that the stylesheet has found the target element and has begun creating a form, it needs an informative title. WXS's <annotation> element lets us attach human-friendly information, and almost any kind of application-specific information we want, to components of the schema. A detailed description of the data contained in a simple element can be placed into <documentation>. For our form, we can also use the <appinfo> element to indicate whether an element is read-only and how it should be labeled.
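An annotated element definition might look something like the sketch below. The frm:label element is the one the stylesheet reads; the frm namespace URI, the frm:readOnly element, and the sample text are assumptions made up for this example.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:frm="http://example.com/ns/form">
  <xs:element name="actionItem">
    <xs:annotation>
      <xs:appinfo>
        <!-- Label text shown in the form title. -->
        <frm:label>Action Item</frm:label>
        <!-- Hypothetical flag for write restrictions. -->
        <frm:readOnly>false</frm:readOnly>
      </xs:appinfo>
      <xs:documentation>A task assigned to a member of a work group.</xs:documentation>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name="description" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Because the contents of <appinfo> live in their own namespace, a schema validator ignores them, while a stylesheet that declares the same namespace prefix can select them directly.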

The targetElementForm mode contains a single template rule, which builds the form container. It then enters the findChildNodes mode by selecting the anonymous or named complex type definitions for the element.

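<xsl:template match="xs:element[@name]" mode="targetElementForm">
  <b>New <xsl:value-of select="xs:annotation/xs:appinfo/frm:label"/></b><br/>
  <xsl:copy-of select="xs:annotation/xs:documentation/node()"/>
  <!-- The form and table wrappers and the hidden input's value are reconstructed; supply the form's action and method for your application. -->
  <form>
    <input type="hidden" name="%%elementNUN" value="{$targetNUN}"/>
    <table>
      <xsl:apply-templates select="*|/xs:schema/xs:complexType[@name=current()/@type]"
        mode="findChildNodes"/>
      <tr><td colspan="2">
        <input type="submit" value="Save Changes"/>
      </td></tr>
    </table>
  </form>
</xsl:template>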

<xsl:template match="*" mode="targetElementForm"/>

WXS's <annotation> element will also be useful later on when building XHTML input elements for the target element's children.

Seek and Transform II

The next step is to find the definitions for the target element's child elements. Fortunately, with a bit of recursion, XSLT can easily handle the complex content models that WXS makes possible.

The templates of the findChildNodes mode will recursively walk through the content model of the target element and apply the template rules to locate the definitions of its child nodes. These few templates can handle named types, sequences, group references, and type extensions. Though not shown in these examples, a stylesheet could use these templates to track occurrence constraints and determine whether a child element is required or optional.
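As a rough sketch of the occurrence tracking mentioned above, the standalone stylesheet below carries an optional flag down through the content model and reports each child element as required or optional. It is a simplified variant and not the article's stylesheet: the parameter name optional, the entry point, and the text output are all assumptions, and it only steps through complex types and sequences.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xsl:output method="text"/>

  <!-- Entry point for the sketch: start from every global complex type. -->
  <xsl:template match="/">
    <xsl:apply-templates select="xs:schema/xs:complexType" mode="findChildNodes"/>
  </xsl:template>

  <!-- Step through containers, remembering whether any ancestor was optional. -->
  <xsl:template match="xs:complexType|xs:sequence" mode="findChildNodes">
    <xsl:param name="optional" select="false()"/>
    <xsl:apply-templates select="*" mode="findChildNodes">
      <!-- An absent minOccurs defaults to 1, and the comparison is then false. -->
      <xsl:with-param name="optional" select="$optional or @minOccurs = 0"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Report each child element as required or optional. -->
  <xsl:template match="xs:element[@name]" mode="findChildNodes">
    <xsl:param name="optional" select="false()"/>
    <xsl:value-of select="@name"/>
    <xsl:choose>
      <xsl:when test="$optional or @minOccurs = 0"> (optional)</xsl:when>
      <xsl:otherwise> (required)</xsl:otherwise>
    </xsl:choose>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Ignore everything else, such as xs:annotation. -->
  <xsl:template match="*" mode="findChildNodes"/>
</xsl:stylesheet>
```

A fuller version would also need to decide how to treat the members of an xs:choice, since each alternative is individually optional even when the choice itself is required.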

Figure: Model of the findChildNodes mode

The first template matches content model elements that the stylesheet will just step through.

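<!-- The match pattern is truncated in this listing; the alternatives after xs:complexContent are assumed. -->
<xsl:template match="xs:complexType|xs:complexContent|xs:sequence|xs:group[@name]" mode="findChildNodes">
  <xsl:apply-templates select="*" mode="findChildNodes"/>
</xsl:template>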

References to global group and element definitions are resolved by the following two templates. Included and imported schemas are not shown in this example. However, one method for handling referenced WXS documents is to build a node set of the referenced WXS documents' <xs:schema> elements with the help of XSLT's document() function. You can then use your favorite XSLT 1.0 processor's node-set() extension function (.NET, MSXML, Xalan, Saxon, EXSLT) to assign the node set to a global variable, which you can use in place of /xs:schema in the following templates.

<xsl:template match="xs:group[@ref]" mode="findChildNodes">
  <xsl:apply-templates select="/xs:schema/xs:group[@name=current()/@ref]"
    mode="findChildNodes"/>
</xsl:template>

<xsl:template match="xs:element[@ref]" mode="findChildNodes">
  <xsl:apply-templates select="/xs:schema/xs:element[@name=current()/@ref]"
    mode="findChildNodes"/>
</xsl:template>
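For the included-schema case mentioned above, a minimal sketch might look like the following. It assumes only a single level of xs:include, ignores xs:import and namespace issues, and uses a hypothetical global variable named schemas. In this simple case document() already returns a node set, so a plain union expression is enough; the node-set() extension becomes necessary only if the collection is first assembled as a result tree fragment.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- The main schema plus the root of every directly included schema.
       Each schemaLocation is resolved relative to the including document. -->
  <xsl:variable name="schemas"
    select="/xs:schema | document(/xs:schema/xs:include/@schemaLocation)/xs:schema"/>

  <!-- Same as the templates above, with $schemas in place of /xs:schema. -->
  <xsl:template match="xs:group[@ref]" mode="findChildNodes">
    <xsl:apply-templates select="$schemas/xs:group[@name=current()/@ref]"
      mode="findChildNodes"/>
  </xsl:template>

  <xsl:template match="xs:element[@ref]" mode="findChildNodes">
    <xsl:apply-templates select="$schemas/xs:element[@name=current()/@ref]"
      mode="findChildNodes"/>
  </xsl:template>
</xsl:stylesheet>
```

Nested includes would require walking the included documents recursively, which is where a node-set() based approach starts to pay off.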
