Shortening XSLT Stylesheets

June 11, 2003

XSLT is often considered too verbose. As stylesheet code grows, it tends to become unreadable. This is not a fate stylesheet authors have to accept: there are strategies for keeping your XSLT code short. This article proposes some ways to shorten stylesheets without loss of functionality, and takes a look at XSLT 2.0 user-defined functions.

Current uses of XSLT often go beyond simple tag transformation. The more popular XSLT gets, the more people use it for application logic. But even when you create simple HTML links, you have to combine diverse pieces of information coming from parameters and variables or from one or more input documents.

First of all, let's assume we have an input file with lines like these:

<page id="index" title="XML" src="home.xml" >

  <page id="download" title="downloads" src="downl.xml"/>

  <page id="weabus" title="XML-Webmaster" src="we.xml"/>

  <page id="xmllinks" title="links" src="links.xml" />

</page>

This describes a hierarchy of web pages to be constructed from the XML files mentioned in the src attributes. Each page is further described by a title and id attribute.

Strategy 1: Use Attribute Value Templates

The usual way to evaluate XSLT variables, attributes, or nodes is an <xsl:value-of select="expr"/> statement or, to be more precise, the expression in its select attribute. So if you generate HTML link tags from the title and src attributes, your XSLT code might look like this:

<a>

 <xsl:attribute name="href">

   <xsl:value-of select="@src"/>

 </xsl:attribute>

 <xsl:value-of select="@title"/>

</a>

Using the xsl:attribute statement enables you to set both the name and the value of the attribute at runtime. If the attribute name is known and only the attribute value has to be computed, an attribute value template makes the xsl:attribute element unnecessary:

<a href="{@src}">

   <xsl:value-of select="@title"/>

</a>

Attribute values of literal result elements are normally copied to the output tree as plain strings. An attribute value template (AVT) is nothing more than an XPath expression that is evaluated and converted to a string inside the attribute value. To distinguish expressions from literal text, you wrap each expression in curly braces, which tells the XSLT processor to evaluate it. AVTs can be used in the attributes of literal result elements and in a few attributes of XSLT elements, such as those used for sorting and numbering. Keep in mind that AVTs are not interpreted in attributes of top-level elements or in attributes whose value is already an expression, such as select. (XSLT 1.0 §7.6.2)
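
As a minimal sketch of these rules, the following stylesheet turns each child page element of the sample input into a link. The list markup and the "nav-" prefix are illustrative assumptions, not part of the original samples. It shows that one attribute value may mix literal text with several expressions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Root page: emit a list with one entry per child page -->
  <xsl:template match="/page">
    <ul>
      <xsl:apply-templates select="page"/>
    </ul>
  </xsl:template>

  <!-- Child pages: one link each. Every {...} part is evaluated as XPath,
       everything outside the braces is copied as literal text. -->
  <xsl:template match="page/page">
    <li>
      <a href="{@src}" id="nav-{@id}" title="{@title} ({@src})">
        <xsl:value-of select="@title"/>
      </a>
    </li>
  </xsl:template>

</xsl:stylesheet>
```

For the download page of the sample input, the link would carry href="downl.xml", id="nav-download" and title="downloads (downl.xml)".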

Strategy 2: Use Attribute Sets

When the output format is HTML or XSL Formatting Objects, stylesheet authors are forced to produce a number of attributes repeatedly. Assume you have to write several small tables that always have similar border and spacing settings, like this:

<table border="1" cellpadding="1" cellspacing="2">

....

</table>

It is enough to have all the attributes in the target document; there is no need to list them in your stylesheet again and again. It is better to pack the attributes into a named attribute set, which is defined as a top-level element at the beginning of your stylesheet:

<xsl:attribute-set name="tb.basics">

   <xsl:attribute name="border">1</xsl:attribute>

   <xsl:attribute name="cellspacing">2</xsl:attribute>

   <xsl:attribute name="cellpadding">1</xsl:attribute>

   <xsl:attribute name="bgcolor">#FF0000</xsl:attribute>

</xsl:attribute-set>

In the table tag you can now refer to this definition with the xsl:use-attribute-sets attribute:

<table xsl:use-attribute-sets="tb.basics">

It is worth noting that the table tag is a literal result element and does not belong to the XSLT namespace. Thus you have to use the xsl namespace prefix explicitly for the use-attribute-sets attribute inside the tag. It is also possible to use attribute sets within attribute sets, a feature that is sometimes compared to a kind of inheritance.

<xsl:attribute-set name="tb.center" 

                      use-attribute-sets="tb.basics">

   <xsl:attribute name="align">center</xsl:attribute>

</xsl:attribute-set>

Using the tb.center attribute set creates all attributes from tb.basics plus the additional align="center" attribute. You can even override attributes from the attribute set inside the referencing tag, as shown in this extract from the shortcut_1.xslt sample.

<table xsl:use-attribute-sets="tb.basics">

   <xsl:apply-templates/>

</table>





<table xsl:use-attribute-sets="tb.center" bgcolor="green">

   <xsl:apply-templates/>

</table>

On its own, defining attribute sets adds lines to a stylesheet, which is the opposite of what we want to achieve. It is therefore highly recommended to combine attribute sets with another strategy.

Strategy 3: Use Include Files

Once you're accustomed to working with attribute sets, you'll have a collection of common settings that you need frequently in your work. The trick is to put them into a separate stylesheet file, such as commons.xslt. Usage is quite easy: pull in the definitions with <xsl:include href="commons.xslt"/> at the beginning of your stylesheet (see shortcut_2.xslt and commons.xslt).

The xsl:include element effectively inserts the content of the referenced file at the position where the include statement is placed. This means that every definition from the external stylesheet is treated as if it were inside the including file. It also makes sense to place template rules or named templates in external files. In that case, however, you should pay attention to the template priority rules when more than one template matches a specific tag. xsl:import can be used for a similar purpose, but the rules for import precedence become quite complex when more than one file or a cascade of files is imported.

I consider it a good strategy to keep the attribute sets in one file, the most commonly used templates in another, and the application-dependent code inside the main stylesheet. To avoid hurting your brain, avoid cascaded xsl:import statements.
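
This modular layout can be sketched with two files. The first is a reduced commons.xslt that holds only the tb.basics attribute set from above; the second is a hypothetical main stylesheet stored in the same directory. Both are illustrative assumptions and not the contents of the original sample files.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- commons.xslt: shared definitions only, no application logic -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:attribute-set name="tb.basics">
    <xsl:attribute name="border">1</xsl:attribute>
    <xsl:attribute name="cellspacing">2</xsl:attribute>
    <xsl:attribute name="cellpadding">1</xsl:attribute>
  </xsl:attribute-set>

</xsl:stylesheet>
```

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Main stylesheet (hypothetical), stored next to commons.xslt -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- xsl:include must be a top-level element;
       a relative href is resolved against this file -->
  <xsl:include href="commons.xslt"/>

  <!-- Application-dependent code: one table row per child page -->
  <xsl:template match="/page">
    <table xsl:use-attribute-sets="tb.basics">
      <xsl:for-each select="page">
        <tr><td><xsl:value-of select="@title"/></td></tr>
      </xsl:for-each>
    </table>
  </xsl:template>

</xsl:stylesheet>
```

The main stylesheet stays short because it only names the attribute set; the definition itself lives in the shared file and can be reused by other stylesheets.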

Strategy 4: Use the Concat Function

A great shortening mechanism is the use of the concat string function inside select expressions. It took me some time to understand this function, especially in combination with the attribute value templates. A chunk of information can be placed inside a select attribute or an AVT using the concat function.

Let's assume that a parameter datadir, which holds a platform-specific directory path, is passed to the stylesheet, along with a second parameter holding a file extension. For a Windows environment this might be:

<xsl:param name="datadir">d:\data\</xsl:param>

<xsl:param name="extension">.html</xsl:param>

Now a filename for each page of our input file must be created. We will combine the directory name, the id, and the extension param into a variable called fullname. This might be done with the following XSLT code:

<xsl:variable name="fullname">

   <xsl:value-of select="$datadir"/>

   <xsl:value-of select="@id"/>

   <xsl:value-of select="$extension"/>

</xsl:variable>

But if you decide to define the value of the variable inside the select attribute, you may use the concat function to combine the variables and the id attribute.

<xsl:variable name="fullname" 

              select="concat($datadir,@id,$extension)"/>

The concat function takes a comma-separated list of two or more string arguments. If the arguments are not strings but node sets, they are converted into their string values. Using concat can improve legibility, especially when you misuse XSLT to generate SQL or PHP code. If your xsl:output method is set to text, you may produce a function call in your output file with:

create_html_header(<xsl:value-of select="@title"/>,

      <xsl:value-of select="@author"/> );

or use the concat function inside a select expression:

concat('create_html_header(', @title, ', ', @author, ');')

A great advantage of the concat function is that it combines XPath expressions and strings in one statement. Very often, multiple xsl:value-of statements can be compressed into a single line of code. Since concat is an XPath function, it can be used in attribute value templates as well.
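
As a small sketch of that last point, the following stylesheet builds the link target for each child page with concat inside an AVT. It reuses the datadir and extension parameters from above; the list markup is an illustrative assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Parameters from the article; the values are defaults
       that can be overridden when the stylesheet is invoked -->
  <xsl:param name="datadir">d:\data\</xsl:param>
  <xsl:param name="extension">.html</xsl:param>

  <xsl:template match="/page">
    <ul>
      <xsl:apply-templates select="page"/>
    </ul>
  </xsl:template>

  <!-- concat joins directory, id and extension inside the AVT,
       so neither a variable nor xsl:attribute is needed -->
  <xsl:template match="page/page">
    <li>
      <a href="{concat($datadir, @id, $extension)}">
        <xsl:value-of select="@title"/>
      </a>
    </li>
  </xsl:template>

</xsl:stylesheet>
```

For the download page, the expression evaluates to d:\data\download.html. Writing href="{$datadir}{@id}{$extension}" would give the same result; concat is mostly a matter of taste here, but it keeps the whole value in one expression that can also be moved into a select attribute unchanged.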

Strategy 5: Use XSLT 2.0 User-defined Functions

If you are interested in the development of XSLT 2.0, you should consider trying the new user-defined functions. An experimental implementation of parts of XSLT 2.0 is available in Michael Kay's Saxon 7.x XSLT processor. At first glance, user-defined functions are very similar to named templates. The big difference is that they can be called from within XPath expressions, that is, inside select attributes and even inside AVTs. Consider the following small example, which also uses some new date and time features. It produces a string containing a copyright notice, a timestamp, and the vendor identification of the XSLT processor. You can find the full sample in shortcut_3.xslt and commons.xslt.

<xsl:function name="udf:impressum">

<xsl:param name="cr"/>

  <xsl:value-of 

    select="concat('&#169; ',$cr, string(current-date()),

    ', ', system-property('xsl:vendor'), ' V. ',

    system-property('xsl:version'))"/>

</xsl:function>

This function receives a single parameter, a string with the name of the copyright owner. Unlike with named templates, you must pass every parameter declared inside the function, and you have to keep track of the argument order as well. There is also no way to give a parameter a default value in its xsl:param declaration, as you can with named templates. The function may be called in an expression like this: <xsl:value-of select="udf:impressum(' M. Knobloch ')"/>

Please note that you have to define a separate namespace for your own functions. In the example I used the domain name of my website (xmlns:udf="http://www.xml-web.de/udf") with udf (user-defined function) as the namespace prefix. Of course you can use your own namespace definition, but make sure the namespace is declared wherever you wish to call your function.
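
To show the namespace declaration and the function call in one place, here is a minimal, self-contained sketch. It assumes a processor that implements the XSLT 2.0 syntax as eventually standardized (the as attribute and xsl:sequence); the Saxon 7.x releases followed a working draft, so details may differ there. The div wrapper and the simplified return string are illustrative assumptions and not taken from shortcut_3.xslt.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:udf="http://www.xml-web.de/udf"
                exclude-result-prefixes="xs udf">

  <xsl:output method="html"/>

  <!-- Returns a one-line imprint: copyright owner, current date,
       and the vendor of the XSLT processor -->
  <xsl:function name="udf:impressum" as="xs:string">
    <xsl:param name="cr" as="xs:string"/>
    <xsl:sequence
      select="concat('&#169; ', $cr, ' ', string(current-date()),
                     ', ', system-property('xsl:vendor'))"/>
  </xsl:function>

  <!-- The function is called once inside an AVT
       and once inside a select expression -->
  <xsl:template match="/page">
    <div title="{udf:impressum('M. Knobloch')}">
      <xsl:value-of select="udf:impressum('M. Knobloch')"/>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

The udf prefix is declared on the xsl:stylesheet element, so it is in scope for both the function declaration and every call. Listing it in exclude-result-prefixes keeps the declaration out of the generated HTML.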

Conclusion

Verbose stylesheets are not unavoidable. You can combine the preceding strategies to shorten XSLT stylesheets considerably. Regarding performance, it's hard to say much, given the differences in XSLT processor performance (see Fast XSLT by Steve Punte). But my guess is that one function call is cheaper than multiple xsl:value-of calls.