
Shortening XSLT Stylesheets

by Manfred Knobloch
June 11, 2003

XSLT is often considered to be too verbose. As stylesheet code grows, it tends to become unreadable. This is not a fate stylesheet authors have to accept. There are some strategies to keep your XSLT code short. This article proposes some ways of shortening stylesheets without loss of functionality, and takes a look at XSLT 2.0 user-defined functions.

Current uses of XSLT often go beyond simple tag transformation. The more popular XSLT gets, the more people use it for application logic. But even when you create simple HTML links, you have to combine diverse pieces of information coming from parameters and variables or from one or more input documents.

First of all, let's assume we have an input file with lines like these:

<page id="index" title="XML" src="home.xml" >
  <page id="download" title="downloads" src="downl.xml"/>
  <page id="weabus" title="XML-Webmaster" src="we.xml"/>
  <page id="xmllinks" title="links" src="links.xml" />
</page>

This describes a hierarchy of web pages to be constructed from the XML files mentioned in the src attributes. Each page is further described by a title and id attribute.

Strategy 1: Use Attribute Value Templates

The usual way to evaluate XSLT variables, attributes or nodes is via a <xsl:value-of select="expr"/> statement or, to be more precise, with the help of the select expression. So if you generate HTML link tags from the title and src attributes your XSLT code might look like this:

<a>
 <xsl:attribute name="href">
   <xsl:value-of select="@src"/>
 </xsl:attribute>
 <xsl:value-of select="@title"/>
</a>

Using the xsl:attribute statement enables you to set the name and the value of the attribute at runtime. If the attribute name is known and only the attribute value has to be computed, the use of an attribute value template cuts off the xsl:attribute line:

<a href="{@src}">
   <xsl:value-of select="@title"/>
</a>

Attribute values written to the result tree are normally copied as literal text from the literal result elements in the stylesheet. An attribute value template (AVT) is simply an XPath expression that is evaluated and converted to a string inside the attribute value. To distinguish the expression from literal text you wrap it in curly brackets, which tells the XSLT processor to evaluate it. AVTs can be used in the attributes of literal result elements, which do not belong to the xsl namespace, and in a few attributes of XSLT elements, such as those that control sorting and numbering. Keep in mind that AVTs are not allowed in the attributes of top-level elements, and that they are not interpreted inside attributes that already expect an expression or a pattern, such as select or match. (XSLT 1.0 §7.6.2)
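As a minimal sketch of these rules, the following stylesheet runs against the sample input above. It mixes literal text and two expressions in one attribute value and uses an AVT in the order attribute of xsl:sort, one of the XSLT attributes that accepts them. The order parameter and the "#id" fragment in the link are assumptions made for illustration; they are not part of the article's sample files. If you need a literal curly bracket in such an attribute, double it ({{ or }}).

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Hypothetical parameter: 'ascending' or 'descending' -->
  <xsl:param name="order" select="'ascending'"/>

  <xsl:template match="/page">
    <ul>
      <xsl:for-each select="page">
        <!-- order is an XSLT attribute that is interpreted as an AVT -->
        <xsl:sort select="@title" order="{$order}"/>
        <li>
          <!-- literal text and two expressions in a single attribute value -->
          <a href="{@src}#{@id}" title="Page: {@title}">
            <xsl:value-of select="@title"/>
          </a>
        </li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
```

Note that select="@title" on xsl:sort is written without curly brackets, because select already expects an expression.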

Strategy 2: Use Attribute Sets

When the output format is HTML, or XSL Formatting Objects, stylesheet authors are forced to produce a number of attributes repeatedly. Assume you have to write several small tables always having similar border and spacing settings like this:

<table border="1" cellpadding="1" cellspacing="2">
....
</table>

It is enough to have all the attributes in the target document; there is no need to list them in your stylesheet again and again. It is better to pack the attributes into a named attribute set, which is defined as a top-level element at the beginning of your stylesheet:

<xsl:attribute-set name="tb.basics">
   <xsl:attribute name="border">1</xsl:attribute>
   <xsl:attribute name="cellspacing">2</xsl:attribute>
   <xsl:attribute name="cellpadding">1</xsl:attribute>
   <xsl:attribute name="bgcolor">#FF0000</xsl:attribute>
</xsl:attribute-set>

In the table tag you can now refer to this definition with the xsl:use-attribute-sets notation:

<table xsl:use-attribute-sets="tb.basics">

It is worth noting that the table tag is a literal result element and does not belong to the xsl namespace. Thus you are forced to explicitly use the xsl namespace prefix for the use-attribute-sets attribute inside the tag. It is also possible to use attribute sets within attribute sets, a feature that is sometimes compared to a kind of inheritance.

<xsl:attribute-set name="tb.center" 
                      use-attribute-sets="tb.basics">
   <xsl:attribute name="align">center</xsl:attribute>
</xsl:attribute-set>

Using the tb.center attribute set creates all attributes from tb.basics plus the additional align="center" attribute. You can even override attributes from the attribute set inside the referencing tag, as shown in this extract from the shortcut_1.xslt sample.

<table xsl:use-attribute-sets="tb.basics">
   <xsl:apply-templates/>
</table>


<table xsl:use-attribute-sets="tb.center" bgcolor="green">
   <xsl:apply-templates/>
</table>
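Since shortcut_1.xslt is not reproduced here, the following is a minimal, self-contained sketch that puts the pieces above into one stylesheet. The row layout (one table row per child page, showing id and title) is an assumption made for illustration and is not taken from the sample file.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:attribute-set name="tb.basics">
    <xsl:attribute name="border">1</xsl:attribute>
    <xsl:attribute name="cellspacing">2</xsl:attribute>
    <xsl:attribute name="cellpadding">1</xsl:attribute>
    <xsl:attribute name="bgcolor">#FF0000</xsl:attribute>
  </xsl:attribute-set>

  <!-- tb.center pulls in everything from tb.basics and adds align -->
  <xsl:attribute-set name="tb.center" use-attribute-sets="tb.basics">
    <xsl:attribute name="align">center</xsl:attribute>
  </xsl:attribute-set>

  <!-- The outermost page element becomes the table -->
  <xsl:template match="/page">
    <!-- bgcolor written on the tag replaces the bgcolor from the set -->
    <table xsl:use-attribute-sets="tb.center" bgcolor="green">
      <xsl:apply-templates select="page"/>
    </table>
  </xsl:template>

  <!-- Hypothetical row layout: one row per child page -->
  <xsl:template match="page">
    <tr>
      <td><xsl:value-of select="@id"/></td>
      <td><xsl:value-of select="@title"/></td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
```

The override works because attributes from the attribute set are added first and attributes written directly on the literal result element are added afterwards, replacing any with the same name.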

Taken on their own, attribute sets can make a stylesheet longer, because every set has to be declared before it can be used. This is the opposite of what we want to achieve, so it is highly recommended to combine attribute sets with another strategy.

Strategy 3: Use Include Files

Once you're accustomed to working with attribute sets, you'll have a collection of common settings needed frequently during your work. The trick is to put them into a separate stylesheet file, like commons.xslt. Usage is quite easy when you connect the definitions with <xsl:include href="/2003/06/11/examples/commons.xslt"/> at the beginning of your stylesheet (see shortcut_2.xslt and commons.xslt).

The xsl:include simply copies the content of the referenced file to the position where the include statement is placed. This means that every definition from the external stylesheet is treated as if it were inside the including file. It also makes sense to place template rules or named templates in external files. But in that case you should take care of processing priority rules when there is more than one matching template for a specific tag. For a similar purpose xsl:import can be used, but the rules for import precedence become quite complex when more than one file or a cascade of files is imported.

I consider it a good strategy to modularize the attribute sets in one file, the most commonly used templates in another, and the application-dependent code inside the main stylesheet. To avoid hurting your brain, avoid cascaded xsl:import statements.
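A main stylesheet following this layout might look like the sketch below. It assumes that a file named commons.xslt sits next to the main stylesheet and that it is itself a complete stylesheet whose xsl:stylesheet element contains the tb.basics attribute set from Strategy 2; the relative href and the single-column table are illustrative assumptions.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- xsl:include must be a top-level element.
       Assumption: commons.xslt is in the same directory and
       defines the tb.basics attribute set. -->
  <xsl:include href="commons.xslt"/>

  <xsl:output method="html"/>

  <!-- Application-dependent code stays in the main stylesheet -->
  <xsl:template match="/page">
    <table xsl:use-attribute-sets="tb.basics">
      <xsl:for-each select="page">
        <tr>
          <td><xsl:value-of select="@title"/></td>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>
```

If you use xsl:import instead, the xsl:import elements have to come before all other top-level elements of the importing stylesheet.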

Strategy 4: Use the Concat Function

A great shortening mechanism is the use of the concat string function inside select expressions. It took me some time to understand this function, especially in combination with the attribute value templates. A chunk of information can be placed inside a select attribute or an AVT using the concat function.

Let's assume we have a parameter datadir passed to the stylesheet, which holds a platform-specific description of a directory, and a second parameter holding a file extension. For a Windows environment this might be:

<xsl:param name="datadir">d:\data\</xsl:param>
<xsl:param name="extension">.html</xsl:param>

Now a filename for each page of our input file must be created. We will combine the directory name, the id, and the extension param into a variable called fullname. This might be done with the following XSLT code:

<xsl:variable name="fullname">
   <xsl:value-of select="$datadir"/>
   <xsl:value-of select="@id"/>
   <xsl:value-of select="$extension"/>
</xsl:variable>

But if you decide to define the value of the variable inside the select attribute, you may use the concat function to combine the variables and the id attribute.

<xsl:variable name="fullname" 
              select="concat($datadir,@id,$extension)"/>

The concat function takes a comma-separated list of two or more arguments. Arguments that are not strings, such as node sets, are converted to their string values. Using concat can improve legibility, especially when you misuse XSLT to generate SQL or PHP code. If your xsl:output method is set to text, you may produce a function call in your output file with:

create_html_header(<xsl:value-of select="@title"/>,
      <xsl:value-of select="@author"/> );

or use the concat function inside a select expression:

concat('create_html_header(',@title,',',@author,');')

A great advantage of the concat function is that it combines XPath expressions and strings in one statement. Very often multiple xsl:value-of statements can be compressed into a single line of code. Since concat is an XPath function, it can be used in attribute value templates as well.
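To illustrate the last point, here is a small sketch that calls concat inside an attribute value template. It reuses the datadir and extension parameters and the sample input from above; writing one link per page element is an assumption made only to keep the example short.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:param name="datadir">d:\data\</xsl:param>
  <xsl:param name="extension">.html</xsl:param>

  <xsl:template match="page">
    <!-- concat is evaluated inside the curly brackets of the AVT -->
    <a href="{concat($datadir, @id, $extension)}">
      <xsl:value-of select="@title"/>
    </a>
    <!-- descend into nested page elements -->
    <xsl:apply-templates select="page"/>
  </xsl:template>
</xsl:stylesheet>
```

The same attribute could be written as href="{$datadir}{@id}{$extension}". Which form reads better is a matter of taste; concat tends to win once the value is also needed in a select attribute.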

Strategy 5: Use XSLT 2.0 User-defined Functions

If you are interested in the development of XSLT 2.0, you should consider trying the new user-defined functions. An experimental implementation of parts of XSLT 2.0 is available with Michael Kay's Saxon 7.x XSLT processor. At first glance, user-defined functions are very similar to named templates. The big difference is that they can be called from within XPath expressions, which means inside select attributes and even inside AVTs. Consider the following small example, which also uses some new date and time features. It produces a string containing a copyright notice, a timestamp, and the vendor identification of the XSLT processor. You can find the full sample in shortcut_3.xslt and commons.xslt.

<xsl:function name="udf:impressum">
<xsl:param name="cr"/>
  <xsl:value-of 
    select="concat('&#169; ',$cr, string(current-date()),
    ', ', system-property('xsl:vendor'), ' V. ',
    system-property('xsl:version'))"/>
</xsl:function>

This function receives a single parameter, a string with the name of the copyright owner. Unlike with named templates, every parameter declared inside the function must be passed when the function is called, and you have to keep track of the argument order as well. There is therefore no way to give an argument a default value inside the xsl:param declaration, as you can with named templates. The function may be called inside an expression like this: <xsl:value-of select="udf:impressum(' M. Knobloch ')"/>

Please note that you have to define a separate namespace for your own functions. In the example I used the domain name of my website (xmlns:udf="http://www.xml-web.de/udf") with udf (user-defined function) as the namespace prefix. Of course you can use your own namespace definition, but make sure the namespace is declared wherever you wish to call your function.
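For orientation, the sketch below shows how the function and its namespace declaration might fit into a complete stylesheet. It is written against the syntax of the final XSLT 2.0 Recommendation as I understand it (xsl:sequence and as attributes), which may differ from what the working draft and Saxon 7.x accept. The xs prefix, the type annotations, the extra space before the date, and the p element are assumptions added for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:udf="http://www.xml-web.de/udf"
                exclude-result-prefixes="xs udf">
  <xsl:output method="html"/>

  <!-- The function name must be in a non-null namespace -->
  <xsl:function name="udf:impressum" as="xs:string">
    <xsl:param name="cr" as="xs:string"/>
    <xsl:sequence
      select="concat('&#169; ', $cr, ' ', string(current-date()),
                     ', ', system-property('xsl:vendor'), ' V. ',
                     system-property('xsl:version'))"/>
  </xsl:function>

  <xsl:template match="/page">
    <!-- Called once inside an AVT and once inside a select attribute -->
    <p title="{udf:impressum('M. Knobloch')}">
      <xsl:value-of select="udf:impressum('M. Knobloch')"/>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

The exclude-result-prefixes attribute keeps the xs and udf namespace declarations out of the generated HTML.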

Conclusion

Verbose stylesheets are not unavoidable. You can combine the preceding strategies to shorten XSLT stylesheets considerably. Regarding performance, it's hard to say much, given the differences between XSLT processors. (See Fast XSLT by Steve Punte.) But my guess is that one function call is cheaper than multiple xsl:value-of calls.

Related Links

XSL Transformations (XSLT) Version 1.0. W3C Recommendation

XSL Transformations (XSLT) Version 2.0. W3C Working Draft 2 May 2003

XML Path Language (XPath) Version 2.0. W3C Working Draft 2 May 2003

XML.COM article about XPath 2.0 by Evan Lenz

XML.COM article about XSLT 2.0 by Evan Lenz


Do you have other ways of keeping XSLT readable? Share them with other readers in our forum.
  • Named templates
    2003-06-19 13:09:50 Lars Marius Garshol

    Perhaps the most obvious way to shorten stylesheets is to move repeated logic or code bits out into named templates. This can often dramatically shorten a stylesheet, and of course also contributes to its legibility.


    Another trick is to inline templates whenever possible. That is, rather than keep something in a separate <xsl:template match="whatever">... you can just inline it through judicious use of <xsl:when> and <xsl:for-each>. Again, this also helps greatly to make the stylesheets more readable.

  • Shorthands and EXSLT
    2003-06-18 13:20:22 Tom Moertel

    You asked, "Do you have other ways of keeping XSLT readable?"


    One of the things that I do is to use a shorthand notation.


    In essence, XSLT is a programming language embedded within XML. While XML is great for writing documents, it isn't a clean and tidy medium in which to express programming languages.


    So I don't write XSLT in XML anymore. Instead, I use a shorthand notation.


    One such notation is XSLTXT, "the XSLT compact form." It is open source and tailor made for writing XSLT.


    http://savannah.nongnu.org/projects/xsltxt


    A more general notation is PXSL ("pixel"), the Parsimonious XML Shorthand Language. It's an open-source, extensible shorthand for markup-dense XML, and it also has built-in shortcuts for writing XSLT. (Disclaimer: I'm one of the guys who created PXSL.)


    http://community.moertel.com/pxsl/


    For advanced users, PXSL has a macro facility that can be used to refactor XSLT code. One particularly good application of its macros is to compartmentalize and reuse the boilerplate code that seems so common to XSLT stylesheets. If you're interested in seeing an example of this, I wrote a diary about it on kuro5hin.org:


    http://www.kuro5hin.org/story/2003/6/4/12434/75716


    Another option is to write less XSLT code in the first place by using a richer XSLT vocabulary. One such vocabulary is EXSLT, the result of a community initiative to augment XSLT with common, much-needed extensions. Most XSLT processing engines support EXSLT out of the box, and the few that don't can make use of EXSLT implementations written in vanilla XSLT.


    http://www.exslt.org/


    If your XSLT stylesheets are becoming unreadable, do consider these options. They make a big difference.


  • Great Article (small error though?)
    2003-06-12 08:33:07 chris hoke

    a really interesting article, as i have never used attribute sets and had not thought about these possible uses of concat...


    just one thing:


    concat('create_html_header(',@title,@author,');')


    should really be


    concat('create_html_header(',@title,',',@author,');')


    should it not?





  • Another tip: Entity References
    2003-06-11 17:43:13 Doug Ransom

    I like to use entity references for repeated strings and reusable portions of xpaths, especially for concatenating xpath statements together to form a longer path.
    For example


    <!DOCTYPE xsl:stylesheet [
    <!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#" >
    <!ENTITY rss "http://purl.org/rss/1.0/" >
    <!ENTITY content "http://purl.org/rss/1.0/modules/content/">
    <!ENTITY xsl "http://www.w3.org/1999/XSL/Transform" >
    <!ENTITY html "http://www.w3.org/1999/xhtml">
    <!ENTITY linkpredbase "html:link[@rel='alternate'][@type='application/rss+xml']">
    <!ENTITY linkpred "&linkpredbase;[contains(@href,':')]" >
    <!ENTITY rellinkpred "&linkpredbase;[@href][not(contains(@href,':'))]">
    <!ENTITY relrsslink "html:head[&linkpredbase;]">
    <!ENTITY rsslink "html:head[&linkpred;]">
    <!ENTITY baseuri "html:base/@href">
    <!ENTITY badChannelLocation "/html:html[descendant::rss:channel[preceding::rss:*]]">
    <!ENTITY content-meta "/html:html/html:head/html:meta[@name='copy-descriptions-to-content']">
    <!ENTITY linkWithAnchorStart "rss:link[descendant::html:a/@href" >
    <!ENTITY linkWithAnchor "&linkWithAnchorStart;]">
    <!ENTITY dc-meta "html:meta[starts-with(@name,'DC')]">
    <!ENTITY dc "http://purl.org/dc/elements/1.1/">
    ]>
    ....
    <xsl:when
    test="&content-meta;">
    <xsl:value-of select="&content-meta;/@content" />
    </xsl:when>






    • Another tip: Entity References
      2003-06-15 06:15:25 manfred Knobloch

      Good strategy if you can use DTDs. This restriction makes me use other ways. Sometimes I use the document() function to include larger parts. The advantage over entities is that the values come in at stylesheet runtime. Entities are expanded at parsing time, so you have no conditional inclusion of values.
      But you are right: Entities are made for shortcutting.