The Path of Control
May 4, 2005
Just as XPath 1.0 was part of XSLT 1.0, XPath 2.0 plays an important role in XSLT 2.0, and all the new features of XPath 2.0 will be available to XSLT 2.0 stylesheet developers. In earlier columns on XSLT 2.0, I reviewed its new data model and some of its new functions, and this month I'm going to cover XPath's new ability to do some things that every real programming language can do: conditional statements and iteration, or, as they're more colloquially known, "if" statements and "for" loops. We'll also look at a useful related technique for checking whether certain conditions do or don't exist in a set of nodes.
To be honest, I had some difficulty understanding exactly what XPath's if
statements and for
loops brought to XSLT. I could think of examples to
demonstrate their use, but the examples were either toy examples with no real utility
or
implementations of tasks that I could accomplish just fine with XSLT 1.0 and XPath
1.0. I
began to suspect that these were added to XPath more for the benefit of XQuery 1.0,
a W3C
specification that also depends heavily on XPath 2.0, than for XSLT 2.0. With some
help from
the members of the Saxon
mailing list, particularly Dave Pawson and Michael Kay, I eventually saw that
conditional expressions and for
loops make a real contribution to XSLT 2.0. I
also found an XSL-List posting with a position
paper that Michael wrote for the W3C XSL Working Group to evaluate how badly XPath
needed these constructs, and I recommend it to anyone interested in some theoretical
background on the role that XPath conditional and iteration statements can play in
XSLT. (By
"theoretical," I don't mean "highly abstract"; I just mean that the paper provides
a good "big picture" of the potential fit of all the pieces of the problem.) Several
of my
examples here are based on examples in that paper.
Conditional Expressions
Conditional expressions let you evaluate a condition and only take action if the condition
is true. In most programming languages, a conditional expression also lets you specify
another action to take if the condition is false; it's always been a bit annoying
that XSLT
1.0's xsl:if
instruction had no corresponding xsl:else
instruction. You can get the same
effect with an xsl:choose
element that has one xsl:when
child and
an xsl:otherwise
child, but this is pretty verbose, even for XSLT.
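To give a sense of that verbosity, here is a rough sketch of the XSLT 1.0 idiom. It assumes, purely for illustration, a cost element whose sign decides the color attribute of a font element in the output:

```xml
<xsl:template match="cost">
  <font>
    <!-- xsl:attribute must come before any child content of font -->
    <xsl:attribute name="color">
      <xsl:choose>
        <xsl:when test=". >= 0">blue</xsl:when>
        <xsl:otherwise>red</xsl:otherwise>
      </xsl:choose>
    </xsl:attribute>
    <xsl:apply-templates/>
  </font>
</xsl:template>
```

Six lines of markup do the work of a single if/else decision.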
An XPath 2.0 if
expression lets you pack a lot (even an else
condition!) into a pretty small space. In the following template rule, a conditional
expression in an attribute value template for the font
element's
color
attribute sets its value to "blue" if the cost
element's value is greater than or equal to zero and "red" otherwise:
<xsl:template match="cost">
  <font color="{if (. >= 0) then 'blue' else 'red'}">
    <xsl:apply-templates/>
  </font>
</xsl:template>
The else
part of an XPath if
expression is not only available,
it's required, because the entire expression must return something whether the condition
is
true or not. (As the opening of the XPath 2.0 spec tells us, "XPath 2.0 is an
expression language.") The parentheses are also required.
The spec's syntax for conditional expressions shows ExprSingle
where the example
above has the strings "blue" and "red". The spec's production for
ExprSingle
shows that it can be an if
expression in and of
itself, letting us build some very complex conditional expressions. In the following
modification of the template rule from above, I've replaced the original "blue" and
"red"
strings with new if
expressions:
<xsl:template match="cost">
  <font color="{if (. >= 0)
                then if (. > 50) then 'blue' else 'lightblue'
                else if (. &lt; -50) then 'red' else 'pink'}">
    <xsl:apply-templates/>
  </font>
</xsl:template>
Remember, there's nothing wrong with some extra white space in an XPath expression. Written on a single line as "{if (. >= 0) then if (. > 50) then 'blue' else 'lightblue' else if (. &lt; -50) then 'red' else 'pink'}", the color attribute value works just as well, but the control flow is much more difficult to follow than when each branch gets its own line.
The values returned by XPath 2.0 if expressions can be node sets, which is where they really move beyond the capabilities of the XSLT 1.0 xsl:if instruction. For example, when the following template rule finds a checkBook element, it sums up all the values of its credit element children if the $depositReport variable is equal to a Boolean true, and it sums up the debit children otherwise:
<xsl:template match="checkBook">
  <xsl:value-of select="sum(if ($depositReport) then credit else debit)"/>
</xsl:template>
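The template rule doesn't show where $depositReport comes from. One plausible arrangement, sketched here as an assumption rather than as the only way to do it, is a global stylesheet parameter. The input document in the comment is invented for this sketch.

```xml
<!-- Hypothetical input, invented for this sketch:
     <checkBook>
       <credit>100</credit>
       <debit>25</debit>
       <credit>50</credit>
       <debit>10</debit>
     </checkBook>
-->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Assumed declaration: a global parameter that defaults to true() -->
  <xsl:param name="depositReport" select="true()"/>

  <xsl:template match="checkBook">
    <!-- The if expression picks which children get passed to sum() -->
    <xsl:value-of select="sum(if ($depositReport) then credit else debit)"/>
  </xsl:template>

</xsl:stylesheet>
```

With the default value, the credit children are summed and the output for the input above is 150. Changing the default to false() sums the debit children instead, giving 35.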
For Expressions
Many programming languages offer a choice of expressions that execute instructions
a
specific number of times, but a for
loop (named for the keyword that usually
begins it) is the most common. XSLT 1.0's xsl:for-each instruction is different: it lets you iterate across a set of nodes, performing one or more actions on each.
To execute something a specific number of times, XSLT 1.0 makes you use the recursion
techniques of its ancestors Scheme and LISP. (For a comparison and demonstration of
these
two techniques, see my earlier column Getting
Loopy.)
XPath 2.0's for
expressions (and XSLT 2.0's xsl:for-each
instructions) let you iterate across a set of nodes. They also let you iterate across
a
range of numbers, executing anything you like a specific number of times, and returning
a
sequence. A for
expression always returns a sequence; the one in the following
returns the numbers 2, 4, 6, 8, and 10:
<xsl:value-of select="for $i in (1 to 5) return $i * 2"/>
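When xsl:value-of receives a sequence like this in a version="2.0" stylesheet, it outputs every item, separated by a single space unless told otherwise. The following sketch shows the default and the XSLT 2.0 separator attribute side by side:

```xml
<!-- Default separator is a single space: 2 4 6 8 10 -->
<xsl:value-of select="for $i in (1 to 5) return $i * 2"/>

<!-- An explicit separator: 2, 4, 6, 8, 10 -->
<xsl:value-of select="for $i in (1 to 5) return $i * 2" separator=", "/>
```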
Making use of the individual members of the returned sequence would require a
for-each
loop to iterate through that sequence, so in many cases the XPath
for
loop doesn't necessarily mean tighter code in your stylesheet. However,
when you pass such a sequence (especially a sequence created from a document's nodes,
as
opposed to a range of numbers like in the example above) to a function that can operate
on
it, the power of the XPath 2.0 for expression becomes more apparent. (Don't forget that XSLT 2.0 lets you write your own new functions in your stylesheet, which amplifies this power even more.) For example, imagine that you have the following source document:
<items>
  <item price="14" quantity="4">fountain</item>
  <item price="9" quantity="5">bottle rack</item>
  <item price="8" quantity="2">bicycle wheel</item>
  <item price="10" quantity="3">50 cc of Paris air</item>
</items>
The following template rule, upon finding the items
element, multiplies the
price
and quantity
attribute values for each item
element and sums up the results:
<xsl:template match="items">
  <xsl:value-of select="sum(for $i in item return $i/@price * $i/@quantity)"/>
</xsl:template>
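To see what sum() actually receives, it can help to output the intermediate sequence as well. This variation is only a sketch for checking the arithmetic against the sample document. The " + " separator and the xsl:text glue are my additions.

```xml
<xsl:template match="items">
  <!-- Line totals for the sample document: 56 + 45 + 16 + 30 -->
  <xsl:value-of select="for $i in item return $i/@price * $i/@quantity"
                separator=" + "/>
  <xsl:text> = </xsl:text>
  <!-- The same sequence passed to sum(): 147 -->
  <xsl:value-of select="sum(for $i in item return $i/@price * $i/@quantity)"/>
</xsl:template>
```

For the four sample items, this writes 56 + 45 + 16 + 30 = 147.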
The identification of the nodes to iterate over need not be so simple. For example, we could add the predicate shown in the following if we only want to total up the cost of buying items with a price value below 10:
<xsl:template match="items">
  <xsl:value-of select="sum(for $i in item[@price &lt; 10] return $i/@price * $i/@quantity)"/>
</xsl:template>
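The point about writing your own functions can be sketched with the same data. The my:line-total function and its namespace URI are hypothetical names invented for this example. Only xsl:function itself comes from XSLT 2.0.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my-functions"
    exclude-result-prefixes="xs my">

  <xsl:output method="text"/>

  <!-- Hypothetical helper: price times quantity for one item element -->
  <xsl:function name="my:line-total" as="xs:double?">
    <xsl:param name="item" as="element(item)"/>
    <xsl:sequence select="$item/@price * $item/@quantity"/>
  </xsl:function>

  <xsl:template match="items">
    <!-- The for expression now calls the function for each selected item -->
    <xsl:value-of select="sum(for $i in item[@price &lt; 10] return my:line-total($i))"/>
  </xsl:template>

</xsl:stylesheet>
```

For the sample document, only the bottle rack and the bicycle wheel have a price below 10, so the result is 45 + 16 = 61.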
Quantified Expressions
Quantified expressions let you do existential quantification (checking whether some
condition is true for some member of a set) and universal quantification (checking
whether
some condition is true for all members of a set). Because a quantified expression returns a Boolean value, we can use it as the test condition in an XPath 2.0 if expression, or even in a regular XSLT xsl:if or xsl:when test.
When the single template rule in the following stylesheet finds the items element in the sample document above, it checks that less than $100 is being spent on each item by multiplying the quantity times the price for each item and then comparing the result with the number 100. If, as the syntax says, every item satisfies that condition, the if expression returns the string "approved". Otherwise it returns the string "rejected".
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:template match="items">
    <xsl:variable name="amountsOK"
        select="every $c in item satisfies ($c/@price * $c/@quantity &lt; 100)"/>
    <xsl:value-of select="if ($amountsOK) then 'approved' else 'rejected'"/>
  </xsl:template>
</xsl:stylesheet>
To demonstrate the use of existential quantification, we can get the same effect by checking for the opposite. If the test stored in the foundBadPurchase variable in the next template rule finds at least one item for which price times quantity is greater than or equal to 100, the some...in...satisfies expression returns a Boolean true, and the if expression rejects the purchase.
<xsl:template match="items">
  <xsl:variable name="foundBadPurchase"
      select="some $c in item satisfies ($c/@price * $c/@quantity >= 100)"/>
  <xsl:value-of select="if ($foundBadPurchase) then 'rejected' else 'approved'"/>
</xsl:template>
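One edge case is worth knowing about: over an empty sequence, every...satisfies is true and some...satisfies is false. An items element with no item children would therefore be "approved" by both versions. The following sketch shows the behavior. The nosuch element name is a placeholder of mine for a child that doesn't exist in the sample document.

```xml
<xsl:template match="items">
  <!-- nosuch matches nothing, so both expressions range over an empty sequence -->

  <!-- Vacuously true, because no member fails the condition: outputs true -->
  <xsl:value-of select="every $c in nosuch satisfies ($c/@price > 0)"/>
  <xsl:text> </xsl:text>

  <!-- No member exists to satisfy the condition: outputs false -->
  <xsl:value-of select="some $c in nosuch satisfies ($c/@price > 0)"/>
</xsl:template>
```

If an empty order should be rejected, the test needs an additional check such as exists(item).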
The syntax for for expressions and the syntax for quantified expressions both have our friend ExprSingle in several places. We already saw that the definition for ExprSingle lets us use an if expression as an ExprSingle; it also shows that a for expression and a quantified expression both qualify as an ExprSingle. This lets us go well beyond the incorporation of if expressions inside of if expressions: you can embed if expressions, quantified expressions, and for expressions inside of each other to whatever depth you wish. (Michael Kay's position paper mentioned above describes the desire of some, when this was all being designed, to restrict the use of conditional expressions to the top level--"whatever that means", as he so elegantly put it--and points out what a mess this could have caused.)
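As a small sketch of this kind of nesting, the following template rule puts a quantified expression in the condition of an if expression, a for expression in its then branch, and another if expression inside the for. The 'large' and 'small' labels and the threshold of 50 are invented for illustration.

```xml
<xsl:template match="items">
  <xsl:value-of select="
    if (every $c in item satisfies ($c/@price * $c/@quantity &lt; 100))
    then (for $i in item
          return if ($i/@price * $i/@quantity >= 50)
                 then 'large'
                 else 'small')
    else 'rejected'"/>
</xsl:template>
```

For the sample items document, whose line totals are 56, 45, 16, and 30, this outputs large small small small. The parentheses around the for expression are optional, but they make it easier to see which else belongs to which if.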
If you pursue ambitious levels of complexity with your combinations of these expressions, remember my earlier point about extra white space having no effect on the execution of an XPath expression. If you're going to get complex, make it readable. Another classic coding trick for enhancing readability is to store a moderately complex expression in a variable, as I did with $amountsOK and $foundBadPurchase in the last two examples, and then reference those variables from other expressions, instead of putting all the complexity in one place. These principles certainly aren't exclusive to XSLT, but if you've ever heard people complain about the readability of XSLT stylesheets, then you'll know that this extra effort is worthwhile.