Understanding the node-set() Function
The XSLT language is capable of achieving many tasks, but some
surprisingly trivial requirements, such as calculating the total
amount of an invoice, cannot be expressed in a straightforward way.
This article describes how you can get round this by using a very
powerful extension function in your stylesheets: the
node-set() function.
In XSLT you can assign a value of any XPath data type to a variable. For example, to store all books from a catalog in a variable for further processing, you can use the following instruction:
<xsl:variable name="books" select="//book"/>
The variable $books now contains a set of nodes. Thus
you can use this variable in other XPath expressions without any
limitations. For example, you can use the expression
$books/title to get the titles of all books from the
catalog.
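As a minimal sketch of this, the following stylesheet uses $books in two further XPath expressions. It assumes a source document that contains book elements with title children, as in the example above; the text output layout is only illustrative.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output method="text"/>

  <!-- A node-set: every book element in the source document -->
  <xsl:variable name="books" select="//book"/>

  <xsl:template match="/">
    <!-- A node-set variable can be passed to any XPath function... -->
    <xsl:text>Number of books: </xsl:text>
    <xsl:value-of select="count($books)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- ...and can start a location path -->
    <xsl:for-each select="$books/title">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```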
So far, so good, but XSLT added a new data type called "result
tree fragment" to XPath. You can think of a result tree fragment
(RTF) as a fragment or chunk of XML code. You can assign a result
tree fragment to a variable directly, or a result tree fragment can
arise from applying templates or other XSLT instructions. The
following code assigns a simple fragment of XML to the variable
$author.
<xsl:variable name="author">
<firstname>Jirka</firstname>
<surname>Kosek</surname>
<email>jirka@kosek.cz</email>
</xsl:variable>
Now let's say we want to extract the e-mail address from the
$author variable. The most obvious way is to use an
expression such as $author/email. But this will fail, as
you can't apply XPath navigation to a variable of the type "result
tree fragment."
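For contrast, here is a small sketch of what XSLT 1.0 does permit on a result tree fragment without any extension: copying the whole fragment with xsl:copy-of, and treating it as a string. The wrapper element names (result, copied, as-string) are made up for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:variable name="author">
    <firstname>Jirka</firstname>
    <surname>Kosek</surname>
    <email>jirka@kosek.cz</email>
  </xsl:variable>

  <xsl:template match="/">
    <result>
      <!-- Permitted: copy the complete fragment to the output -->
      <copied><xsl:copy-of select="$author"/></copied>
      <!-- Permitted: use the fragment as a string
           (the concatenated text of the whole fragment) -->
      <as-string><xsl:value-of select="$author"/></as-string>
      <!-- Not permitted in XSLT 1.0: a path step such as
           $author/email, because an RTF is not a node-set -->
    </result>
  </xsl:template>
</xsl:stylesheet>
```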
If we want to get around this limitation, we can use an extension
function which is able to convert a result tree fragment back to a
node-set. This function is not a part of the XSLT or XPath standards;
thus, stylesheets which use it will not be as portable as ones which
don't. However, the advantages of node-set() usually
outweigh portability issues.
Extension functions always reside in a separate namespace. In order
to use them we must declare this namespace as an extension namespace
in our stylesheet. The namespace in which the node-set()
function is implemented is different for each processor, but
fortunately many processors also support EXSLT, so we can use
the following declarations at the start of our stylesheet.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
extension-element-prefixes="exsl"
version="1.0">
...
<!-- Now we can convert result tree fragment back to node-set -->
<xsl:value-of select="exsl:node-set($author)/email"/>
...
</xsl:stylesheet>
The expression exsl:node-set($author) converts the
result tree fragment to a node-set; we can take it as a start for
further XPath navigation. If our processor is not EXSLT-aware we must
change the namespace http://exslt.org/common according to
Table 1.
Table 1. Support for node-set() in XSLT processors
| Processor | Function name | Namespace |
|---|---|---|
| EXSLT aware processors (Saxon, xsltproc, Xalan-J, jd.xslt, 4XSLT) | node-set() | http://exslt.org/common |
| MSXML | node-set() | urn:schemas-microsoft-com:xslt |
| Xalan-C | nodeset() | http://xml.apache.org/xalan |
| Sablotron | Can operate on result tree fragments directly | |
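If a stylesheet has to run on more than one of these processors, one possible approach is to test for each function with the standard function-available() function before calling it. The sketch below assumes only the EXSLT and MSXML rows of the table; the msxsl prefix and the error message are my own choices.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:exsl="http://exslt.org/common"
                xmlns:msxsl="urn:schemas-microsoft-com:xslt"
                exclude-result-prefixes="exsl msxsl"
                version="1.0">
  <xsl:output method="text"/>

  <xsl:variable name="author">
    <firstname>Jirka</firstname>
    <surname>Kosek</surname>
    <email>jirka@kosek.cz</email>
  </xsl:variable>

  <xsl:template match="/">
    <xsl:choose>
      <!-- EXSLT-aware processors -->
      <xsl:when test="function-available('exsl:node-set')">
        <xsl:value-of select="exsl:node-set($author)/email"/>
      </xsl:when>
      <!-- MSXML -->
      <xsl:when test="function-available('msxsl:node-set')">
        <xsl:value-of select="msxsl:node-set($author)/email"/>
      </xsl:when>
      <!-- No known node-set() implementation: stop with a message -->
      <xsl:otherwise>
        <xsl:message terminate="yes">No node-set() extension function is available.</xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

A processor must not report an error merely because an expression mentions an unavailable extension function; the error arises only if the function is actually called, which is what makes the xsl:choose guard work.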
After this rather theoretical introduction, I will now show you how
you can use node-set() for something more useful.
Let's suppose that we want to create a stylesheet that renders a simple XML invoice as nice HTML for browsing and printing. For the sake of simplicity our invoice contains just items; each item has a description, an ordered quantity and a unit price.
<?xml version="1.0" encoding="utf-8"?>
<invoice>
<item>
<description>Pilsner Beer</description>
<qty>6</qty>
<unitPrice>1.69</unitPrice>
</item>
<item>
<description>Sausage</description>
<qty>3</qty>
<unitPrice>0.59</unitPrice>
</item>
<item>
<description>Portable Barbecue</description>
<qty>1</qty>
<unitPrice>23.99</unitPrice>
</item>
<item>
<description>Charcoal</description>
<qty>2</qty>
<unitPrice>1.19</unitPrice>
</item>
</invoice>
We don't want to be responsible for putting a damper on the party,
so we will write a stylesheet for turning this XML into HTML. However,
there is one complication: the rendered invoice should certainly
contain the total amount. This might look like a simple task, but it
will quickly become apparent that XPath and XSLT 1.0 offer no direct
way to do it. XPath provides us with the sum() function, but it can
only sum the values of existing nodes, and in our example we want to
calculate a sum of subtotals (qty * unitPrice), which are
not present in the source XML and thus are not accessible to XPath's
sum(). The only pure XSLT 1.0 solution is to use
recursive processing, which leads to code that is neither clear nor
easy to understand. (A pure XSLT solution is presented in the
invoice-noext.xsl stylesheet in the ZIP archive with all examples.)
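To give an idea of what such recursion looks like, here is one possible sketch. It is not the invoice-noext.xsl file from the archive; the template name total and its parameters items and sum are invented for this illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:call-template name="total">
      <xsl:with-param name="items" select="invoice/item"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Recursive helper: adds the subtotal of the first item to the
       running sum and calls itself with the remaining items -->
  <xsl:template name="total">
    <xsl:param name="items"/>
    <xsl:param name="sum" select="0"/>
    <xsl:choose>
      <xsl:when test="$items">
        <xsl:call-template name="total">
          <xsl:with-param name="items"
                          select="$items[position() &gt; 1]"/>
          <xsl:with-param name="sum"
                          select="$sum + $items[1]/qty * $items[1]/unitPrice"/>
        </xsl:call-template>
      </xsl:when>
      <!-- No items left: emit the accumulated total -->
      <xsl:otherwise>
        <xsl:value-of select="$sum"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

The running total has to be threaded through a parameter because XSLT variables cannot be updated after they are bound.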
The whole task will be much easier if we decide to utilize the
node-set() function. In the first step we calculate
subtotals for each item and store them as a fragment of XML.
<xsl:variable name="subTotals">
<xsl:for-each select="invoice/item">
<number><xsl:value-of select="qty * unitPrice"/></number>
</xsl:for-each>
</xsl:variable>
The variable $subTotals now holds the subtotals, where
each subtotal is marked up with a number element.
<number>10.14</number>
<number>1.77</number>
<number>23.99</number>
<number>2.38</number>
Now we can get the total invoice amount quite easily by summing up
values stored in number nodes:
sum(exsl:node-set($subTotals)/number).
Here is a complete working stylesheet.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
extension-element-prefixes="exsl"
version="1.0">
<xsl:template match="/">
<html>
<head>
<title>Invoice</title>
</head>
<body>
<h1>Invoice</h1>
<!-- Format invoice items as a table -->
<table border="1" style="text-align: center">
<tr>
<th>Description</th>
<th>Quantity</th>
<th>Unit price</th>
<th>Subtotal</th>
</tr>
<xsl:for-each select="invoice/item">
<tr>
<td><xsl:value-of select="description"/></td>
<td><xsl:value-of select="qty"/></td>
<td><xsl:value-of select="unitPrice"/></td>
<td><xsl:value-of select="qty * unitPrice"/></td>
</tr>
</xsl:for-each>
<tr>
<th colspan="3">Total</th>
<th>
<!-- Gather subtotals into variable -->
<xsl:variable name="subTotals">
<xsl:for-each select="invoice/item">
<number>
<xsl:value-of select="qty * unitPrice"/>
</number>
</xsl:for-each>
</xsl:variable>
<!-- Sum subtotals stored as a result tree fragment
in the variable -->
<xsl:value-of
select="sum(exsl:node-set($subTotals)/number)"/>
</th>
</tr>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
Multipass processing is another situation in which the
node-set() function is essential. In some situations it
is hard to do the transformation in a single step; some
post-processing on the result is needed. If we want to do this during
a single transformation without the need for storing a temporary
result, and without the need for repeated invocation of the XSLT
processor, we must capture the result of the first transformation in a
variable as a result tree fragment (RTF), convert the RTF to a
node-set, and feed this node-set to templates which are responsible for
post-processing.
We can demonstrate this technique on a very simple but real problem. Suppose that we must change an existing stylesheet to display a small image before each external link, in order to inform the user that an Internet connection is needed to follow the link. The conventional approach is to change the existing stylesheet so that it emits icons in the appropriate places. But in the case of a very complex stylesheet this can be very time-consuming work.
Our approach will give up on modifying the existing stylesheet.
Instead we will capture its output and we will modify links in the
captured output. In order to capture the output of another stylesheet
we must import that stylesheet, and in the template for the root node
we must invoke the original templates using
xsl:apply-imports inside a variable definition.
<xsl:variable name="content">
<xsl:apply-imports/>
</xsl:variable>
The variable $content now holds the complete output
from the original stylesheet. In this output we must change all
occurrences of external links such as:
<a href="http...">text</a>
to
<a href="http..."><img src="external.gif" width="16"
height="16" border="0"> text</a>
All other text and markup should remain untouched. To copy the XML tree without modification we can use a very simple template that copies all element, attribute and text nodes.
<xsl:template match="@*|*|text()">
<xsl:copy>
<xsl:apply-templates select="@*|*|text()"/>
</xsl:copy>
</xsl:template>
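The template above deliberately copies only elements, attributes and text. If comments and processing instructions should also survive the copy, a common variant matches node() instead. The following is a sketch of that variant as a complete stylesheet, not something the original stylesheet requires.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Identity template: node() also matches comments and
       processing instructions, not just elements and text -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```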
A second template is needed to process external links in a different way. Because its match pattern names a specific element and carries a predicate, it has a higher default priority (0.5) than the previous copy-only template (-0.5) and will override it.
<xsl:template match="a[starts-with(@href,'http')]">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<img src="external.gif" width="16" height="16" border="0"/>
<xsl:text> </xsl:text>
<xsl:apply-templates select="*|text()"/>
</xsl:copy>
</xsl:template>
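Note that we must copy the original attributes of the
a element before inserting the image. XSLT does not allow attributes
to be added to an element once child nodes have been added to it; a
processor will either signal an error or ignore the attributes.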
In the final stylesheet, the preceding templates must be in their own mode to prevent conflicts with the original stylesheet. To show that there are real situations where you don't have enough time to get to grips with other people's work, I'm using the DocBook XSL stylesheets as my original stylesheet. You can process any valid DocBook document with our final stylesheet.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:exsl="http://exslt.org/common"
extension-element-prefixes="exsl"
version="1.0">
<!-- Import original stylesheet -->
<xsl:import
href="http://docbook.sourceforge.net/release/xsl/current/html/docbook.xsl"/>
<xsl:template match="/">
<!-- Grab result of original stylesheet -->
<xsl:variable name="content">
<xsl:apply-imports/>
</xsl:variable>
<!-- Pass grabbed content to postprocessing templates -->
<xsl:apply-templates select="exsl:node-set($content)"
mode="decoratelinks"/>
</xsl:template>
<!-- Default postprocessing is just copying of nodes -->
<xsl:template match="@*|*|text()" mode="decoratelinks">
<xsl:copy>
<xsl:apply-templates select="@*|*|text()"
mode="decoratelinks"/>
</xsl:copy>
</xsl:template>
<!-- Absolute links starting with "http" are external and we
must add icon to them -->
<xsl:template match="a[starts-with(@href,'http')]"
mode="decoratelinks">
<xsl:copy>
<!-- Copy original <a> attributes -->
<xsl:apply-templates select="@*" mode="decoratelinks"/>
<!-- Insert image -->
<img src="external.gif" width="16" height="16" border="0"/>
<xsl:text> </xsl:text>
<!-- Copy content (subelements and text nodes) of <a> -->
<xsl:apply-templates select="*|text()"
mode="decoratelinks"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
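To try the same two-pass pattern without downloading the DocBook stylesheets, the sketch below replaces the imported stylesheet with a small first pass in a separate mode. The input vocabulary (a links element containing link elements with href attributes) and the mode name render are hypothetical and exist only for this illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:exsl="http://exslt.org/common"
                extension-element-prefixes="exsl"
                version="1.0">

  <xsl:template match="/">
    <!-- Pass 1: build plain HTML and capture it as an RTF -->
    <xsl:variable name="content">
      <xsl:apply-templates select="links" mode="render"/>
    </xsl:variable>
    <!-- Pass 2: convert the RTF to a node-set and decorate links -->
    <xsl:apply-templates select="exsl:node-set($content)"
                         mode="decoratelinks"/>
  </xsl:template>

  <!-- Pass 1 templates (stand-in for the imported stylesheet) -->
  <xsl:template match="links" mode="render">
    <ul>
      <xsl:apply-templates select="link" mode="render"/>
    </ul>
  </xsl:template>

  <xsl:template match="link" mode="render">
    <li><a href="{@href}"><xsl:value-of select="."/></a></li>
  </xsl:template>

  <!-- Pass 2: by default just copy nodes -->
  <xsl:template match="@*|*|text()" mode="decoratelinks">
    <xsl:copy>
      <xsl:apply-templates select="@*|*|text()"
                           mode="decoratelinks"/>
    </xsl:copy>
  </xsl:template>

  <!-- Pass 2: add an icon to links starting with "http" -->
  <xsl:template match="a[starts-with(@href,'http')]"
                mode="decoratelinks">
    <xsl:copy>
      <xsl:apply-templates select="@*" mode="decoratelinks"/>
      <img src="external.gif" width="16" height="16" border="0"/>
      <xsl:text> </xsl:text>
      <xsl:apply-templates select="*|text()"
                           mode="decoratelinks"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

A suitable test input would be a links element with one link whose href starts with http and one whose href is a relative file name; only the first should receive the icon.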
XSLT 2.0 and XPath 2.0 are slowly progressing toward W3C
Recommendation. You might be wondering whether the
node-set() function will be part of these standards. The
answer is no, but don't worry. The authors of XSLT 2.0 made an
important decision: result tree fragments are gone. There will be no
need to use the node-set() function in XSLT 2.0 as you
can operate directly on XML fragments stored in a variable, as on any
other node-set. Regardless, you should put the node-set()
function in your bag of tools as it will take several years before
XSLT 2.0 will be deployed as widely as XSLT 1.0 is deployed today.
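As a sketch of what this looks like in practice, here is the invoice total again, assuming an XSLT 2.0 processor. The variable is used directly in a path expression, and no extension namespace is declared.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Gather subtotals into a variable, exactly as before -->
    <xsl:variable name="subTotals">
      <xsl:for-each select="invoice/item">
        <number><xsl:value-of select="qty * unitPrice"/></number>
      </xsl:for-each>
    </xsl:variable>
    <!-- In XSLT 2.0 the variable holds a temporary tree, so a
         path step can be applied to it without node-set() -->
    <xsl:value-of select="sum($subTotals/number)"/>
  </xsl:template>
</xsl:stylesheet>
```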
XML.com Copyright © 1998-2006 O'Reilly Media, Inc.