Transforming XML Schemas

January 15, 2003

Eric Gropp

A W3C XML Schema (WXS) document contains valuable information that can be used throughout a system or application, but the complexity that WXS allows can make this difficult in practice. XSLT, however, can concisely and efficiently manipulate WXS documents in order to perform a number of tasks, including creating HTML input forms, generating query interfaces, documenting data structures and interfaces, and controlling a variety of user interface elements.

As an example, this article describes an XSLT document which creates an XHTML form based on the WXS definition of a complex element. For brevity and clarity, the article omits several WXS and XHTML form aspects, including attribute definitions, keys, imported/included schemas, and qualified name issues. How these additional features are implemented can depend greatly on your use of WXS and on your application. However, building a stylesheet that handles every possible WXS feature can be quite an effort and may often be unnecessary.

Much of the information -- occurrence constraints, data types, special restrictions, and enumerations -- needed to build an XHTML form is already contained in a WXS document. Missing bits such as label text and write restrictions can be added into WXS's <annotation> element.

The stylesheet will perform four distinct tasks:

  1. Find the definition of the target complex element that we want.
  2. Build a form element for the target element.
  3. Find the definitions of the target element's valid children.
  4. Build an input element for each of the simple child elements.

In order to do this, the stylesheet will apply different template rules to similar WXS elements depending on the task at hand. To make this possible, the stylesheet will use a separate mode for each task. The modes are default, targetElementForm, findChildNodes, and childNodeInput. Using a separate mode for each task will also help when modifying, expanding, or deciphering this stylesheet.
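
As a rough orientation, the overall layout of such a stylesheet might look like the skeleton below. It is only a sketch: the frm namespace URI is a placeholder (the article does not specify one), and each mode is represented by nothing more than its catch-all rule.

```xml
<?xml version="1.0"?>
<!-- Skeleton only. The frm namespace URI below is an assumed placeholder. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:frm="http://example.com/ns/form"
    exclude-result-prefixes="xs frm">

  <xsl:output method="xml" indent="yes"/>

  <!-- Simplified NUN of the element to build a form for -->
  <xsl:param name="targetNUN"/>

  <!-- Task 1, default mode: locate the target element's definition -->
  <xsl:template match="*"/>

  <!-- Task 2: build the form container -->
  <xsl:template match="*" mode="targetElementForm"/>

  <!-- Task 3: walk the content model to find child definitions -->
  <xsl:template match="*" mode="findChildNodes"/>

  <!-- Task 4: emit an input control for each simple child -->
  <xsl:template match="*" mode="childNodeInput"/>

</xsl:stylesheet>
```

Because each mode ends in an empty catch-all rule, any WXS construct the stylesheet does not explicitly handle is silently skipped rather than leaking text into the output.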

Seek and Transform I

When looking someone up in the phonebook, unique names are very helpful: "Smith, Julian Forsythe" is better than "Smith, J". Similarly, we need to name the target element uniquely. This can be tricky because an element name is not necessarily unique; multiple locally defined elements or attributes can share the same name. For inspiration, we can look to NUNs (Normalized Universal Names), which are path statements used to uniquely identify components of a schema. NUNs made their first appearance in the Schema Formal Description Working Draft. The W3C recently issued a formal draft of NUNs under the new name Schema Component Descriptors (SCD).

For this exercise we will use a simplified form of NUN based on element names to identify the target element and pass it into the stylesheet as a parameter. So, for example, the NUN for the definition of the global <actionItem> element would be "element::actionItem", and the NUN for an element <employee>, local to the global <workGroup> element, would be "element::workGroup/element::employee".
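
To make the naming concrete, here is a small hypothetical schema with the simplified NUN of each definition shown in a comment. The element types are placeholders chosen only to keep the fragment short.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- NUN: element::actionItem -->
  <xs:element name="actionItem" type="xs:string"/>

  <!-- NUN: element::workGroup -->
  <xs:element name="workGroup">
    <xs:complexType>
      <xs:sequence>
        <!-- NUN: element::workGroup/element::employee -->
        <xs:element name="employee" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```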

This article will preface the description of each set of template rules with a model that is based on UML Activity Diagrams. In these diagrams, template rules are shown as states; modes are shown as composite states; <apply-templates> actions are shown as arrows; branching by template rule match patterns is shown as hollow circles; and other logical branches and the XSLT entry point are shown as solid circles.

Diagram of default mode

The first template to fire, match='xs:schema', will extract the first element name from the NUN string. The first element name in the NUN will always be a global element, therefore its definition will be a child of the <xs:schema> element.

<xsl:param name="targetNUN"/>

<xsl:template match="xs:schema">
  <xsl:variable name="NUNModified" select="concat($targetNUN,'/')"/>
  <xsl:variable name="NUNToken"
       select="substring-after(substring-before($NUNModified,'/'),'element::')"/>
  <xsl:apply-templates select="xs:element[@name=$NUNToken]">
    <xsl:with-param name="NUNRemain" select="substring-after($NUNModified,'/')"/>
  </xsl:apply-templates>
</xsl:template>
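
The string handling in this template can be tried in isolation. The following standalone stylesheet is a sketch for experimentation only; the default parameter value is an example, and it can be run against any XML input document.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Example value; override it from the command line or the calling API -->
  <xsl:param name="targetNUN" select="'element::workGroup/element::employee'"/>

  <xsl:template match="/">
    <!-- A trailing slash guarantees substring-before always finds a separator -->
    <xsl:variable name="NUNModified" select="concat($targetNUN,'/')"/>
    <xsl:text>token=</xsl:text>
    <xsl:value-of
        select="substring-after(substring-before($NUNModified,'/'),'element::')"/>
    <xsl:text> remain=</xsl:text>
    <xsl:value-of select="substring-after($NUNModified,'/')"/>
  </xsl:template>

</xsl:stylesheet>
```

With the default value, this prints token=workGroup remain=element::employee/. With a processor such as xsltproc, the parameter could be supplied with something like --stringparam targetNUN "element::actionItem".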

The other templates in the mode will then recursively parse each element name in the NUN, drilling down through the nested local element definitions until all names from the NUN have been parsed and the target element has been found.

<xsl:template match="xs:element[@name]">
  <xsl:param name="NUNRemain"/>
  <xsl:choose>
    <!-- Nothing left to parse: this element is the target -->
    <xsl:when test="string-length($NUNRemain)=0">
      <xsl:apply-templates select="." mode="targetElementForm"/>
    </xsl:when>
    <!-- Otherwise descend, passing the next name and the remaining path -->
    <xsl:otherwise>
      <xsl:apply-templates select="xs:complexType/*">
        <xsl:with-param name="NUNRemain"
                        select="substring-after($NUNRemain,'/')"/>
        <xsl:with-param name="NUNToken"
             select="substring-after(substring-before($NUNRemain,'/')
                     ,'element::')"/>
      </xsl:apply-templates>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

<xsl:template match="xs:sequence|xs:choice|xs:all">
  <xsl:param name="NUNRemain"/>
  <xsl:param name="NUNToken"/>
  <xsl:apply-templates
       select="xs:sequence|xs:choice|xs:all|xs:element[@name=$NUNToken]">
    <xsl:with-param name="NUNRemain" select="$NUNRemain"/>
    <xsl:with-param name="NUNToken" select="$NUNToken"/>
  </xsl:apply-templates>
</xsl:template>

<xsl:template match="*"/>

Powers of Annotation

Now that the stylesheet has found the target element and has begun creating a form, it needs an informative title. WXS's <annotation> element lets us associate human-friendly information and almost any kind of application specific information we want in the schema. A detailed description of the data contained in a simple element can be placed into <documentation>. For our form, we can also use the <appinfo> element to indicate whether an element is read-only and how it should be labeled.
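
For illustration, an annotated definition might look like the fragment below. This is a hypothetical example: the element names are invented, and the frm namespace URI is a placeholder, since the article names the frm:label and frm:readonly elements but not their namespace.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:frm="http://example.com/ns/form">

  <xs:element name="actionItem">
    <xs:annotation>
      <!-- Application-specific hint: the title to show on the form -->
      <xs:appinfo><frm:label>Action Item</frm:label></xs:appinfo>
      <!-- Human-readable description, copied into the page as-is -->
      <xs:documentation>A task assigned to a member of a work group.</xs:documentation>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name="summary" type="xs:string">
          <xs:annotation>
            <xs:appinfo><frm:label>Summary</frm:label></xs:appinfo>
          </xs:annotation>
        </xs:element>
        <!-- Marked read-only, so no input control should be generated -->
        <xs:element name="createdBy" type="xs:string">
          <xs:annotation>
            <xs:appinfo><frm:readonly/></xs:appinfo>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```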

The targetElementForm mode contains a single template rule, which builds the form container. It then enters the findChildNodes mode by selecting the anonymous or named complex type definitions for the element.

<xsl:template match="xs:element[@name]" mode="targetElementForm">
  <b>New <xsl:value-of select="xs:annotation/xs:appinfo/frm:label"/></b><br/>
  <xsl:copy-of select="xs:annotation/xs:documentation/node()"/>

  <form>
    <input type="hidden" name="%%elementNUN"
     value="{/xs:schema/@targetNamespace}#{$targetNUN}"/>
    <table>
      <!-- Anonymous type (child) or named type (looked up by @type) -->
      <xsl:apply-templates select="*|/xs:schema/xs:complexType[@name=current()/@type]"
                           mode="findChildNodes"/>
      <tr><td colspan="2">
        <input type="submit" value="Save Changes"/>
      </td></tr>
    </table>
  </form>
</xsl:template>

<xsl:template match="*" mode="targetElementForm"/>

WXS's <annotation> element will also be useful later on when building XHTML input elements for the target element's children.

Seek and Transform II

The next step is to find the definitions for the target element's child elements. Fortunately, with a bit of recursion, XSLT can easily handle the complex content models that WXS makes possible.

The templates of the findChildNodes mode will recursively walk through the content model of the target element and apply the template rules to locate the definitions of its child nodes. These few templates can handle named types, sequences, group references, and type extensions. Though not shown in these examples, a stylesheet could use these templates to track occurrence constraints and determine whether a child element is required or optional.
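
One way the occurrence tracking mentioned above might be approached is sketched below. This is not part of the article's stylesheet: the optional parameter, the <field> output element, and the entry template are all hypothetical, and group references and extensions are left out to keep the sketch short.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Illustrative entry point: report on every global element's anonymous type -->
  <xsl:template match="/">
    <fields>
      <xsl:apply-templates select="xs:schema/xs:element/xs:complexType"
                           mode="findChildNodes"/>
    </fields>
  </xsl:template>

  <!-- Pass-through rule carrying an "optional" flag: once an enclosing
       model group has minOccurs="0", everything inside it is optional -->
  <xsl:template match="xs:complexType|xs:complexContent|xs:sequence|xs:all"
                mode="findChildNodes">
    <xsl:param name="optional" select="false()"/>
    <xsl:apply-templates select="*" mode="findChildNodes">
      <xsl:with-param name="optional" select="$optional or @minOccurs='0'"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- A child is required only if neither it nor an enclosing group is optional -->
  <xsl:template match="xs:element[@name]" mode="findChildNodes">
    <xsl:param name="optional" select="false()"/>
    <field name="{@name}" required="{not($optional or @minOccurs='0')}"/>
  </xsl:template>

  <xsl:template match="*" mode="findChildNodes"/>

</xsl:stylesheet>
```

The flag has to be passed explicitly at every step because XSLT 1.0 does not forward parameters automatically from one template to the next.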

Model of the findChildNodes mode

The first template matches content model elements that the stylesheet will just step through.

<xsl:template match="xs:complexType|xs:complexContent|
                     xs:sequence|xs:all|xs:group[@name]"
              mode="findChildNodes">
  <xsl:apply-templates select="*" mode="findChildNodes"/>
</xsl:template>

References to global group and element definitions are resolved by the following two templates. Included and imported schemas are not shown in this example. However, one method for handling referenced WXS documents is to build a node set of the referenced WXS documents' <xs:schema> elements with the help of XSLT's document() function. You can then use your favorite XSLT 1.0 processor's node-set() extension function (.NET, MSXML, Xalan, Saxon, EXSLT) to assign the node set to a global variable, which you can use in place of /xs:schema in the following templates.

<xsl:template match="xs:group[@ref]" mode="findChildNodes">
  <xsl:apply-templates select="/xs:schema/xs:group[@name=current()/@ref]"
                       mode="findChildNodes"/>
</xsl:template>

<xsl:template match="xs:element[@ref]" mode="findChildNodes">
  <xsl:apply-templates select="/xs:schema/xs:element[@name=current()/@ref]"
                       mode="findChildNodes"/>
</xsl:template>
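
As a rough sketch of how referenced schemas might be folded in, the variant below collects the main schema and its directly included or imported schemas into one global variable. It is an assumption-laden simplification rather than the article's method: it relies only on document() and does not use node-set(), the variable name schemas is invented, it follows only one level of references, it expects every reference to carry a schemaLocation, and it ignores namespace prefixes in @ref.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- The main schema plus every directly included or imported schema.
       Each schemaLocation is resolved relative to the referencing document. -->
  <xsl:variable name="schemas"
      select="/xs:schema |
              document(/xs:schema/xs:include/@schemaLocation |
                       /xs:schema/xs:import/@schemaLocation)/xs:schema"/>

  <!-- Same two rules as above, searching $schemas instead of /xs:schema -->
  <xsl:template match="xs:group[@ref]" mode="findChildNodes">
    <xsl:apply-templates select="$schemas/xs:group[@name=current()/@ref]"
                         mode="findChildNodes"/>
  </xsl:template>

  <xsl:template match="xs:element[@ref]" mode="findChildNodes">
    <xsl:apply-templates select="$schemas/xs:element[@name=current()/@ref]"
                         mode="findChildNodes"/>
  </xsl:template>

</xsl:stylesheet>
```

A global variable matters here because, once processing moves into an included document, the path /xs:schema refers to that document's root rather than the main schema's.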

When an <extension> is encountered, the stylesheet will process the extension's base and local content models in sequence.

<xsl:template match="xs:extension" mode="findChildNodes">
  <xsl:apply-templates select="/xs:schema/xs:complexType[@name=current()/@base]"
                       mode="findChildNodes"/>
  <xsl:apply-templates select="*" mode="findChildNodes"/>
</xsl:template>

Once the definition of the child element is found, the template determines whether it is a simple type and exempt from any application specific restrictions. If this is true, it builds the table row, and then applies the templates from the childNodeInput mode to construct the input element.

<xsl:template match="xs:element[@name]" mode="findChildNodes">
  <xsl:if test="not(xs:complexType|
                /xs:schema/xs:complexType[@name=current()/@type]|
                xs:annotation/xs:appinfo/frm:readonly)">

    <tr>
      <td>
        <xsl:choose>
          <xsl:when test="xs:annotation/xs:appinfo/frm:label">
            <xsl:value-of select="xs:annotation/xs:appinfo/frm:label"/>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="@name"/>
          </xsl:otherwise>
        </xsl:choose>
      </td>
      <td>
        <xsl:apply-templates select="." mode="childNodeInput">
          <xsl:with-param name="nodeName" select="@name"/>
        </xsl:apply-templates>
      </td>
    </tr>

  </xsl:if>
</xsl:template>

<xsl:template match="*" mode="findChildNodes"/>

Interpreting Simple Type Definitions

The templates of the childNodeInput mode will walk the schema to find the base type of each simple element and then output the appropriate XHTML form element for the type. If the element's type is a restriction of a base type, it will further modify the XHTML form element with XHTML or custom attributes.

Model of the childNodeInput mode

The first two template rules match elements whose simple type is either defined by name elsewhere in the schema or declared anonymously in place. The priority attribute of the first template is set to zero so that it will have a lower priority than the template rules matching native WXS types, whose match patterns default to a priority of 0.5.

<xsl:template match="xs:element[@type]" mode="childNodeInput" priority="0">
  <xsl:param name="nodeName"/>
  <xsl:param name="nodeValue" select="@default"/>
  <xsl:apply-templates
       select="/xs:schema/xs:simpleType[@name=current()/@type]/xs:restriction"
       mode="childNodeInput">
    <xsl:with-param name="nodeName" select="$nodeName"/>
    <xsl:with-param name="nodeValue" select="$nodeValue"/>
  </xsl:apply-templates>
</xsl:template>

<xsl:template match="xs:element[xs:simpleType]" mode="childNodeInput">
  <xsl:param name="nodeName"/>
  <xsl:param name="nodeValue" select="@default"/>
  <xsl:apply-templates select="xs:simpleType/xs:restriction" mode="childNodeInput">
    <xsl:with-param name="nodeName" select="$nodeName"/>
    <xsl:with-param name="nodeValue" select="$nodeValue"/>
  </xsl:apply-templates>
</xsl:template>
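
For reference, the two declaration styles that these rules resolve might look like the following. The type and element names are made up for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Named simple type: found through the xs:element[@type] rule -->
  <xs:simpleType name="shortText">
    <xs:restriction base="xs:string">
      <xs:maxLength value="40"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="title" type="shortText"/>

  <!-- Anonymous simple type: found through the xs:element[xs:simpleType] rule -->
  <xs:element name="code">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[A-Z]{3}"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

</xs:schema>
```

In both cases the rule hands the <xs:restriction> element to the same mode, so the base type and its facets are processed identically from that point on.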

The following template rules are a sample of the templates to match the native WXS types.

<xsl:template match="xs:element[@type='xs:string']|xs:restriction[@base='xs:string']"
              mode="childNodeInput">
  <xsl:param name="nodeName"/>
  <xsl:param name="nodeValue" select="@default"/>
  <xsl:choose>
    <!-- Length-limited strings get a single-line text box -->
    <xsl:when test="xs:maxLength">
      <input type="text" name="{$nodeName}" value="{$nodeValue}">
        <xsl:apply-templates select="*" mode="childNodeInput"/>
      </input>
    </xsl:when>
    <!-- Unbounded strings get a multi-line text area -->
    <xsl:otherwise>
      <textarea name="{$nodeName}">
        <xsl:apply-templates select="*" mode="childNodeInput"/>
        <xsl:value-of select="$nodeValue"/>
      </textarea>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

<xsl:template match="xs:element[@type='xs:boolean']|xs:restriction[@base='xs:boolean']"
              mode="childNodeInput">
  <xsl:param name="nodeName"/>
  <xsl:param name="nodeValue" select="@default"/>
  <!-- XHTML <input> elements must be empty, so the labels follow them -->
  <input type="radio" name="{$nodeName}" value="true">
    <xsl:if test="$nodeValue='true'">
      <xsl:attribute name="checked">checked</xsl:attribute>
    </xsl:if>
  </input>
  <xsl:text> Yes </xsl:text>
  <input type="radio" name="{$nodeName}" value="false">
    <xsl:if test="$nodeValue='false'">
      <xsl:attribute name="checked">checked</xsl:attribute>
    </xsl:if>
  </input>
  <xsl:text> No </xsl:text>
</xsl:template>

Some restriction facets map directly to XHTML input element attributes. For example, <xs:maxLength value="10"/> maps to the maxlength="10" attribute of an <input> element. Other restriction facets such as <xs:pattern> and <xs:maxInclusive> do not map to XHTML nodes. One approach to ensure conformant input from the user is to add custom attributes to the XHTML <input> element, and use a client-side script to validate the user's input using the custom attributes.

<xsl:template match="xs:maxLength" mode="childNodeInput">
  <!-- XHTML attribute names are lowercase -->
  <xsl:attribute name="maxlength">
    <xsl:value-of select="@value"/>
  </xsl:attribute>
</xsl:template>

<xsl:template match="xs:pattern" mode="childNodeInput">
  <!-- Custom attribute, to be read by a client-side validation script -->
  <xsl:attribute name="validationRegExp">
    <xsl:value-of select="@value"/>
  </xsl:attribute>
</xsl:template>
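
Range facets could be handled in the same way. The sketch below is hypothetical: the attribute names validationMin and validationMax are invented, and the rules only take effect if the template for the corresponding base type (a numeric type, for instance, which is not shown in this article) applies templates to the restriction's children inside the <input> element.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Copy numeric bounds into custom attributes for a client-side script -->
  <xsl:template match="xs:minInclusive" mode="childNodeInput">
    <xsl:attribute name="validationMin">
      <xsl:value-of select="@value"/>
    </xsl:attribute>
  </xsl:template>

  <xsl:template match="xs:maxInclusive" mode="childNodeInput">
    <xsl:attribute name="validationMax">
      <xsl:value-of select="@value"/>
    </xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
```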

This stylesheet will map restrictions that have enumeration facets, regardless of the base type, to a <select> XHTML element. WXS annotation elements can be useful here, when we need to associate human-readable text with an enumeration facet:

<xs:enumeration value="AK">
  <xs:annotation>
    <xs:documentation>Alaska</xs:documentation>
  </xs:annotation>
</xs:enumeration>

is transformed into:

<option value="AK">Alaska</option>

The template rule that matches each enumeration facet checks for annotation information as well as whether the enumeration is the default value.

<xsl:template match="xs:restriction[xs:enumeration]" mode="childNodeInput"
              priority="1">
  <xsl:param name="nodeName"/>
  <xsl:param name="nodeValue"/>
  <select name="{$nodeName}">
    <xsl:apply-templates select="*" mode="childNodeInput">
      <xsl:with-param name="nodeValue" select="$nodeValue"/>
    </xsl:apply-templates>
  </select>
</xsl:template>

<xsl:template match="xs:enumeration" mode="childNodeInput">
  <xsl:param name="nodeValue"/>
  <option value="{@value}">
    <!-- Preselect the option that matches the default value; the attribute
         content is kept on one line so no whitespace ends up in its value -->
    <xsl:if test="@value=$nodeValue">
      <xsl:attribute name="selected">selected</xsl:attribute>
    </xsl:if>
    <!-- Prefer the annotated label; fall back to the raw value -->
    <xsl:choose>
      <xsl:when test="xs:annotation/xs:documentation">
        <xsl:value-of select="xs:annotation/xs:documentation"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="@value"/>
      </xsl:otherwise>
    </xsl:choose>
  </option>
</xsl:template>

<xsl:template match="*" mode="childNodeInput"/>

Conclusion

Using WXS as the common resource for data typing in your application can have big payoffs. By allowing components and interfaces to automatically reflect changes to an application's data model, you can greatly increase the reusability and flexibility of a system. XSLT is a useful, largely platform-independent, and highly portable tool for making this possible.