XSLT Extensions

September 5, 2001

Bob DuCharme

If the specialized elements of the XSLT namespace and the combined functions of XSLT and XPath aren't enough to perform the transformations you need, XSLT gives you ways to incorporate additional instruction elements and functions into your stylesheets. Most XSLT processors offer several extension elements and functions of their own, because extensions are a good way for a processor to distinguish itself from the competition. This month we'll examine the use of extensions and some ways that a stylesheet can gracefully handle the possibility that the XSLT processor doesn't recognize an extension designed for use with another processor.

Extension Elements

There are three categories of elements that can be in an XSLT stylesheet:

  • Elements from the XSLT namespace that tell the processor how to transform the source tree into the result tree.

  • Literal result elements of any namespace you like that get added to the result tree just as they are shown in the stylesheet.

  • Extension elements: customized instruction elements that can be used along with the instructions from the XSLT namespace.

Warning: Elements from the XSLT namespace fall into two categories: top-level elements, which are children of the xsl:stylesheet element with general instructions about handling the source document, and the children of the xsl:template elements known as instructions, which give specific instructions about nodes to add to the result tree. Extension elements cannot be top-level elements; they are always new instructions.

It's an important part of an XSLT processor's job to recognize all the elements in a stylesheet from the XSLT namespace and to carry out their instructions. If literal result elements can be from any namespace (see the earlier "Transforming XML" column Namespaces and XSLT Stylesheets for more on namespaces), letting you add elements from the HTML, XLink, or any other namespace to the result tree, how does a processor know which elements are extension elements? Because the stylesheet must explicitly list which namespaces are to be treated as extension element namespaces in the extension-element-prefixes attribute.

Let's look at an example. Some XSLT stylesheet authors are frustrated by XSLT's prohibition against changing the value of a variable during the execution of a stylesheet. Michael Kay added an assign extension element to his Saxon processor that lets you change a variable's value all you want. In the following stylesheet, the http://icl.com/saxon namespace (the one his processor expects to find for Saxon extension elements) is declared as a namespace with a prefix of "saxon", and this "saxon" namespace prefix is included in the value of the xsl:stylesheet element's extension-element-prefixes attribute. (The sample documents, stylesheets, and output shown in this column are available in this zip file.)

<!-- xq610.xsl: converts xq610.xsl into xq611.txt -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:saxon="http://icl.com/saxon"
     extension-element-prefixes="saxon"
     version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:variable name="color" 
       saxon:assignable="yes">red</xsl:variable>

  <xsl:template match="/">

    <saxon:assign name="color">blue</saxon:assign>

    The color variable has the following value:
    <xsl:value-of select="$color"/>
  </xsl:template>

</xsl:stylesheet>

Tip: The extension-element-prefixes attribute is usually included with the xsl:stylesheet element, but it can also be added to any literal result element or extension element, where it must be written in the XSLT namespace as xsl:extension-element-prefixes. If it's an attribute of an element other than the stylesheet's root xsl:stylesheet element, then it's only effective within the element where it's an attribute -- in other words, any extension elements from the specified namespace can only be used in that element or in one of its descendants. That's why it's more convenient to declare any extension namespaces in an extension-element-prefixes attribute of the xsl:stylesheet element; you can then use the extension elements anywhere you want in the document.

The preceding stylesheet, which you can run using any XML document as input, doesn't do much. First, it declares a variable named "color" and assigns it the value "red". Next, the single template rule in the stylesheet adds the phrase "The color variable has the following value:" to the result tree followed by the variable's value as put there by the xsl:value-of instruction. The special part comes inside that template rule just before this text gets added to the result tree: the saxon:assign extension element assigns the value "blue" to the "color" variable. (XSLT also allows extension attributes as well as extension elements, and the special Saxon attribute saxon:assignable is added to XSLT's xsl:variable element to let the saxon:assign element know that changing this variable's value is okay.) The output, when run with the Saxon processor, shows that the variable's value was successfully changed:


    The color variable has the following value:
    blue

When run with the Xalan Java processor, or any processor other than Saxon, the saxon:assign element has no effect:


    The color variable has the following value:
    red

A lack of error messages might be considered a good thing, but in this case it's a bad thing: a stylesheet instruction failed to execute, so some sort of message about this failure would make for a more robust system. Fortunately, XSLT offers two ways to check whether an extension element will work or not: fallback and the element-available() function.

If an XSLT processor doesn't implement a particular extension element, it must look for an xsl:fallback child of that element and add its contents to the result tree. In the following revision of the stylesheet above, the xsl:fallback element has no contents to add to the result tree, but instead sends an xsl:message text string to wherever such strings go for the processor in question. (The xsl:message element sends a text message somewhere other than the result tree -- for example, to a command line window -- and is useful for debugging and status messages. Not all XSLT processors support it fully.)

<!-- xq617.xsl: converts xq617.xsl to xq637.txt  -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:saxon="http://icl.com/saxon"
     extension-element-prefixes="saxon"  
     version="1.0">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:variable name="color" 
       saxon:assignable="yes">red</xsl:variable>

  <xsl:template match="/">

    <saxon:assign name="color">blue<xsl:fallback>
        <xsl:message>This XSLT processor doesn't support saxon:assign.
        </xsl:message></xsl:fallback></saxon:assign>

    The color variable has the following value:
    <xsl:value-of select="$color"/>
  </xsl:template>

</xsl:stylesheet>

When run with the Saxon processor, this creates the same result as the earlier Saxon run. No xsl:message text appears at the command line because the saxon:assign element enclosing the xsl:fallback element executed successfully. When run with the Xalan Java processor, however, the stylesheet creates the same result as with the earlier Xalan run and sends the following message to the command line window:

This XSLT processor doesn't support saxon:assign. 
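An xsl:fallback element can also contribute to the result tree instead of only sending a message. The following is a minimal sketch, not one of the column's sample files, in which the fallback writes an XML comment into the output so that the result document itself records that saxon:assign was skipped. The comment text is an illustrative assumption.

```xml
<!-- Sketch only: a fallback that writes to the result tree. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:saxon="http://icl.com/saxon"
     extension-element-prefixes="saxon"
     version="1.0">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:variable name="color"
       saxon:assignable="yes">red</xsl:variable>

  <xsl:template match="/">

    <saxon:assign name="color">blue<xsl:fallback>
        <!-- Instantiated only when saxon:assign is not implemented. -->
        <xsl:comment>saxon:assign unavailable; color keeps its default</xsl:comment>
      </xsl:fallback></saxon:assign>

    The color variable has the following value:
    <xsl:value-of select="$color"/>
  </xsl:template>

</xsl:stylesheet>
```

A processor that implements saxon:assign ignores the xsl:fallback child entirely, so the comment should only show up in output produced by other processors.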

While the xsl:fallback element gives you a way to handle the failure of an extension element, the boolean element-available() function gives you a way to check whether the extension element is supported before even trying to execute it.

The following revision of the stylesheet above has an xsl:choose element that uses this function to test whether the saxon:assign element is supported. If so, the saxon:assign element inside the xsl:when element gets evaluated; if not, the message "This XSLT processor doesn't support saxon:assign" gets sent wherever that XSLT processor sends such messages.

<!-- xq619.xsl: converts xq619.xsl  -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:saxon="http://icl.com/saxon"
     extension-element-prefixes="saxon"  
     version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:variable name="color" 
       saxon:assignable="yes">red</xsl:variable>


  <xsl:template match="/">

    <xsl:choose>
      <xsl:when test="element-available('saxon:assign')">
        <saxon:assign name="color">blue</saxon:assign>
      </xsl:when>
      <xsl:otherwise>
        <xsl:message>This XSLT processor doesn't support saxon:assign.
        </xsl:message>
      </xsl:otherwise>
    </xsl:choose>
    The color variable has the following value:
    <xsl:value-of select="$color"/>
  </xsl:template>

</xsl:stylesheet>

The results, when run with Saxon and then Xalan Java, are the same as with the previous version of the stylesheet: with Saxon, the result tells us that the color variable was successfully set to "blue", and no extra messages appear in the command prompt window; when run with Xalan Java, the result tells us that the color variable remains at the original setting of "red" and the xsl:message element sends the message about the lack of support for saxon:assign to the command prompt window.
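If carrying on with the wrong color value is unacceptable, the xsl:otherwise branch can stop the transformation instead of merely reporting the problem. The sketch below assumes that aborting is the policy you want; it relies on the standard terminate attribute of xsl:message, and the wording of the message is illustrative.

```xml
<!-- Sketch only: abort when the extension element is missing. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:saxon="http://icl.com/saxon"
     extension-element-prefixes="saxon"
     version="1.0">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:variable name="color"
       saxon:assignable="yes">red</xsl:variable>

  <xsl:template match="/">

    <xsl:choose>
      <xsl:when test="element-available('saxon:assign')">
        <saxon:assign name="color">blue</saxon:assign>
      </xsl:when>
      <xsl:otherwise>
        <!-- terminate="yes" ends the transformation after the message. -->
        <xsl:message terminate="yes">This stylesheet requires saxon:assign. Stopping.</xsl:message>
      </xsl:otherwise>
    </xsl:choose>
    The color variable has the following value:
    <xsl:value-of select="$color"/>
  </xsl:template>

</xsl:stylesheet>
```

With this version, a processor lacking saxon:assign should produce the message and no misleading "red" result, which may be preferable when later processing depends on the assignment having happened.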

Using Built-in Extension Functions

An XSLT processor can add additional functions to the selection required of it by the XSLT and XPath specifications. To use one of these extension functions, you only have to declare the namespace and then reference that namespace when calling the function. (When using extension elements, you need the extension-element-prefixes attribute to tell the processor that extension elements from certain namespaces will be used in the stylesheet, but there's no need for this when using extension functions.)

To demonstrate the use of an extension function available in an XSLT processor, we'll look at Xalan Java's tokenize() function, which is similar to the Perl programming language's split() function: it splits up a string whenever it finds a certain character. If you tell it to split up "red,green,light blue" at the commas, you'll get "red", "green" and "light blue". Xalan's tokenize() function accepts two parameters: a string of text delimited by a certain character and an optional string showing the characters used as the delimiter. (The default delimiters are the whitespace characters.) It then splits up the string wherever it finds the delimiting character, creating a node list that your stylesheet can iterate across using an xsl:for-each instruction.
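The color example just described can be sketched as a small stylesheet. This is an illustration rather than one of the column's sample files: the colors and color element names are hypothetical, the string is hard-coded instead of being read from the source document, and the stylesheet assumes it is run with Xalan Java, since it makes no check that the function is available.

```xml
<!-- Sketch only: splits a hard-coded string at its commas.
     The colors and color element names are hypothetical. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:xalan="http://xml.apache.org/xalan"
     exclude-result-prefixes="xalan"
     version="1.0">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:template match="/">
    <colors>
      <!-- First argument: the string to split.
           Second argument: the delimiter character. -->
      <xsl:for-each select="xalan:tokenize('red,green,light blue', ',')">
        <color><xsl:value-of select="."/></color>
      </xsl:for-each>
    </colors>
  </xsl:template>

</xsl:stylesheet>
```

Each pass through the xsl:for-each loop sees one token as the context node, so every color element should hold one of the three pieces of the original string.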

Our stylesheet will use the tokenize() function to split up the fields in the employee elements of the following document:

<employees>
<employee>Herbert,Johnny,09/01/1998,95000</employee>
<employee>Hill,Graham,08/20/2000,89000</employee>
<employee>Hill,Phil,04/23/1999/100000</employee>
<employee>Moss,Sterling,10/16/2000,97000</employee>
</employees>
    

The following stylesheet converts these employee elements into a table. The two parameters passed to the tokenize() function are "." (an abbreviation of "self::node()", which passes the contents of the employee context node) and a comma enclosed in single quotes to show that it's the delimiting character in the first string. The stylesheet also uses the function-available() function, which all XSLT processors must support, to check whether the tokenize() function is available for use by the stylesheet. If it is, the contents of the xsl:when element get added to the result tree: the tokenize() function splits up the employee contents into a node list and the xsl:for-each instruction goes through that list, adding the contents of each node to the result tree enclosed by an entry element. If the function isn't available, the xsl:otherwise instruction just adds the contents of the source tree employee element inside the row element as one big entry element.

<!-- xq620.xsl: converts xq621.xml into xq622.xml, xq623.xml -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:xalan="http://xml.apache.org/xalan"
     exclude-result-prefixes="xalan"
     version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:template match="employees">
    <table>
      <xsl:apply-templates/>
    </table>
  </xsl:template>

  <xsl:template match="employee">
    <row>
      <xsl:choose>
        <xsl:when test="function-available('xalan:tokenize')">
          <xsl:for-each select="xalan:tokenize(.,',')">
            <entry><xsl:value-of select="."/></entry>
          </xsl:for-each>
        </xsl:when>
        <xsl:otherwise>
          <entry><xsl:value-of select="."/></entry>
        </xsl:otherwise>
      </xsl:choose>
    </row>
  </xsl:template>

</xsl:stylesheet>

For the processor to recognize the function as an extension function, the stylesheet calls tokenize() using the namespace prefix ("xalan") that goes with the declaration identifying the namespace. Because an XSLT processor passes along namespace declarations for any referenced namespaces to the result tree, but we don't want this declaration in our result document, the stylesheet has an exclude-result-prefixes attribute in the xsl:stylesheet element to prevent this.

When run with the Xalan Java processor, the stylesheet splits up each employee element into separate entry elements in each row.

<table>
<row><entry>Herbert</entry><entry>Johnny</entry>
<entry>09/01/1998</entry><entry>95000</entry></row>
<row><entry>Hill</entry><entry>Graham</entry>
<entry>08/20/2000</entry><entry>89000</entry></row>
<row><entry>Hill</entry><entry>Phil</entry>
<entry>04/23/1999/100000</entry></row>
<row><entry>Moss</entry><entry>Sterling</entry>
<entry>10/16/2000</entry><entry>97000</entry></row>
</table>

When run with the Saxon XSLT processor, which doesn't support Xalan's tokenize() function, each employee element gets added to the result tree as one big entry element.

<table>
<row><entry>Herbert,Johnny,09/01/1998,95000</entry></row>
<row><entry>Hill,Graham,08/20/2000,89000</entry></row>
<row><entry>Hill,Phil,04/23/1999/100000</entry></row>
<row><entry>Moss,Sterling,10/16/2000,97000</entry></row>
</table>

Saxon actually has its own tokenize() extension function, but to use it you have to declare Saxon's extension namespace and call the function with that namespace's prefix. If the stylesheet above were used in a production environment in which Xalan Java and Saxon were both available, the xsl:choose element could include another xsl:when element to check whether saxon:tokenize() is available and to use it if so.
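A sketch of that two-processor version might look like the following. It is not one of the column's sample files, and it assumes that saxon:tokenize() accepts the same two arguments as Xalan's function and lives in the http://icl.com/saxon namespace used earlier for saxon:assign; check your Saxon version's documentation before relying on either assumption.

```xml
<!-- Sketch only: tries Xalan's tokenize(), then Saxon's, then gives up.
     The saxon:tokenize() argument list is assumed to match Xalan's. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:xalan="http://xml.apache.org/xalan"
     xmlns:saxon="http://icl.com/saxon"
     exclude-result-prefixes="xalan saxon"
     version="1.0">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:template match="employees">
    <table>
      <xsl:apply-templates/>
    </table>
  </xsl:template>

  <xsl:template match="employee">
    <row>
      <xsl:choose>
        <xsl:when test="function-available('xalan:tokenize')">
          <xsl:for-each select="xalan:tokenize(.,',')">
            <entry><xsl:value-of select="."/></entry>
          </xsl:for-each>
        </xsl:when>
        <xsl:when test="function-available('saxon:tokenize')">
          <xsl:for-each select="saxon:tokenize(.,',')">
            <entry><xsl:value-of select="."/></entry>
          </xsl:for-each>
        </xsl:when>
        <xsl:otherwise>
          <!-- Neither extension function exists: keep the text whole. -->
          <entry><xsl:value-of select="."/></entry>
        </xsl:otherwise>
      </xsl:choose>
    </row>
  </xsl:template>

</xsl:stylesheet>
```

Because xsl:choose only evaluates the branch whose test succeeds, the call to the function a processor lacks is never evaluated, which is what lets both calls sit in the same stylesheet.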

Take a look through your XSLT processor's documentation to see what extension functions are available. Along with debugging features, it's another area where processor developers strive to stand out from the competition, because it's an obvious place to add features unavailable in other processors. It's also a place where developers can address any deficiencies they see in XSLT by adding features they feel should have been there in the first place.