Writing Your Own Functions in XSLT 2.0
Most XSLT 1.0 processors, particularly the ones written in Java, let you write extension functions in the processor's host language, link them in, and then call those functions from stylesheets. The XSLT 1.0 spec spells out specific ways to check whether a particular extension function is available and how to recover gracefully if not. In the September 2001 "Transforming XML" column, I presented examples of extension elements and functions.
If you wanted to write your own functions within a stylesheet, there were ways to fake it with named templates, but faking it won't be necessary with XSLT 2.0, which lets you write your own functions using XSLT syntax. These functions return values that can be used all over your stylesheet, even in XPath expressions.
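For contrast, here is a minimal sketch of what the named-template workaround could look like in XSLT 1.0. The template name upperCase and its inString parameter are hypothetical names chosen for illustration, not taken from any earlier column.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical named template standing in for a function:
       it writes an upper-case copy of $inString to the result. -->
  <xsl:template name="upperCase">
    <xsl:param name="inString"/>
    <xsl:value-of select="translate($inString,
        'abcdefghijklmnopqrstuvwxyz',
        'ABCDEFGHIJKLMNOPQRSTUVWXYZ')"/>
  </xsl:template>

  <xsl:template match="/">
    <!-- The "return value" has to be captured in a variable first. -->
    <xsl:variable name="upper">
      <xsl:call-template name="upperCase">
        <xsl:with-param name="inString" select="'Red'"/>
      </xsl:call-template>
    </xsl:variable>
    <!-- Only now can the result take part in an XPath expression. -->
    <xsl:value-of select="$upper = 'RED'"/>
  </xsl:template>

</xsl:stylesheet>
```

The extra xsl:variable and xsl:call-template wrapping is the clumsy part: a named template can't be called from inside an XPath expression, so every use needs this scaffolding.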
Let's look at a simple example. The following stylesheet creates a result tree upon seeing the root of any document, so you can run it with itself as input. It declares a function called foo:compareCI, which does a case-insensitive comparison of two strings and returns the same values as the XSLT 2.0 compare() function described in last month's column.
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:foo="http://whatever">

  <!-- Compare two strings ignoring case, returning same
       values as compare(). -->
  <xsl:function name="foo:compareCI">
    <xsl:param name="string1"/>
    <xsl:param name="string2"/>
    <xsl:value-of select="compare(upper-case($string1),upper-case($string2))"/>
  </xsl:function>

  <xsl:template match="/">
    compareCI red,blue: <xsl:value-of select="foo:compareCI('red','blue')"/>
    compareCI red,red: <xsl:value-of select="foo:compareCI('red','red')"/>
    compareCI red,Red: <xsl:value-of select="foo:compareCI('red','Red')"/>
    compareCI red,Yellow: <xsl:value-of select="foo:compareCI('red','Yellow')"/>
  </xsl:template>

</xsl:stylesheet>
The first thing to notice is that the declared function must come from a namespace outside of the XSLT namespace. In the example I assigned a namespace prefix of foo to the http://whatever URL to make it clear that you can use any namespace, as long as it's not the XSLT namespace. The URL I specified wasn't serious, but works anyway. You'll probably want to pick a URL associated with your company or project.
The actual function declaration in the sample stylesheet is in an xsl:function element. Its structure is pretty straightforward: a name attribute stores the function's name, and optional xsl:param child elements name parameters that can be passed to the function, just like xsl:param elements do in XSLT 1.0's xsl:template elements. In the example above, the two parameters passed are the two strings to be compared.
The function's only remaining line is an xsl:value-of instruction, which calls XPath 2.0's upper-case() function on both strings before comparing them and outputs the result. The return value of the function is the sequence of nodes that it outputs. If you want, you can add an as attribute to the xsl:function element to indicate a specific data type that the function returns. Because my foo:compareCI() function returns the integer returned by its call to the compare() function, I could have added an as="xs:integer" attribute value to the xsl:function element (which would have required declaration of the http://www.w3.org/2001/XMLSchema namespace to go with that "xs" prefix), but I wanted to keep my first example function as simple as possible.
When run with Saxon 7's experimental XSLT 2.0 support, this stylesheet creates the following output:
<?xml version="1.0" encoding="UTF-8"?>
    compareCI red,blue: 1
    compareCI red,red: 0
    compareCI red,Red: 0
    compareCI red,Yellow: -1
The third line is the most important here because it shows that the function considers "red" and "Red" to be equal. (See last month's column for the meaning of the various return values.)
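To illustrate the as attribute mentioned above, here is a sketch of how the typed version of foo:compareCI might look. It makes two assumptions beyond the original example: the parameters are also typed as xs:string, and the body uses the xsl:sequence instruction from the final XSLT 2.0 Recommendation to return the integer directly, which a processor tracking an earlier working draft may not support.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:foo="http://whatever">

  <xsl:output method="text"/>

  <!-- Same comparison as before, with the return type and
       (as an added assumption) the parameter types declared. -->
  <xsl:function name="foo:compareCI" as="xs:integer">
    <xsl:param name="string1" as="xs:string"/>
    <xsl:param name="string2" as="xs:string"/>
    <!-- xsl:sequence returns the xs:integer itself rather than
         a text node holding its string form. -->
    <xsl:sequence select="compare(upper-case($string1),
                                  upper-case($string2))"/>
  </xsl:function>

  <xsl:template match="/">
    <xsl:value-of select="foo:compareCI('red','Red')"/>
  </xsl:template>

</xsl:stylesheet>
```

Declaring the xs prefix on the xsl:stylesheet element is what makes the xs:integer and xs:string type names resolvable.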
XSLT 2.0 functions can be recursive. The following stylesheet includes a substring function that expects you to pass it a string (inString) and the length of a substring to pull from that string (length), starting at its first character. Instead of always breaking after length characters, though, this function only breaks there if it finds a word boundary character. Otherwise, it breaks at the last word boundary before that. It does this by calling itself with the same inString value and a length value of length - 1. Before making each recursive call, the first xsl:when element in the function's xsl:choose element checks whether $length is less than or equal to 0 and returns the entire string if so, because if $length was decremented that far, there's no point in continuing. The second xsl:when element checks whether the passed string is already shorter than the requested length, in which case it just returns the whole string. The third and last xsl:when element checks whether character number $length + 1 of $inString is a member of the list of delimiter characters defined near the beginning of the stylesheet, and if so, returns the string up to that point, because its job is done. If none of these conditions are true, the xsl:otherwise element makes the recursive call.
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:foo="http://whatever">

  <xsl:output method="text"/>

  <xsl:variable name="delimiters"> ,."!?()</xsl:variable>

  <xsl:function name="foo:substrWordBoundary">
    <xsl:param name="inString"/>
    <xsl:param name="length"/>
    <xsl:choose>
      <xsl:when test="$length &lt;= 0">
        <xsl:value-of select="$inString"/>
      </xsl:when>
      <xsl:when test="string-length($inString) &lt;= $length">
        <xsl:value-of select="$inString"/>
      </xsl:when>
      <xsl:when test="contains($delimiters,substring($inString,$length + 1,1))">
        <xsl:value-of select="substring($inString,1,$length)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="foo:substrWordBoundary($inString,$length - 1)"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

  <xsl:template match="/">
    20 chars: <xsl:value-of select="foo:substrWordBoundary('This is a test.Right? Yes.',20)"/>
    10 chars: <xsl:value-of select="foo:substrWordBoundary('This is a test.Right? Yes.',19)"/>
    already short enough: <xsl:value-of select="foo:substrWordBoundary('catatonic',15)"/>
    no boundaries: <xsl:value-of select="foo:substrWordBoundary('catatonic',5)"/>
  </xsl:template>

</xsl:stylesheet>
The four strings passed to the function test several possible outcomes. With any source document, the stylesheet creates this result:
    20 chars: This is a test.Right
    10 chars: This is a test
    already short enough: catatonic
    no boundaries: catatonic
What happens if we pass a bad parameter to the function? For example, what if we added this new line after the "no boundaries" line, passing the string "five" instead of a numeric digit as the second parameter?
bad parameter: <xsl:value-of select="foo:substrWordBoundary('catatonic','five')"/>
Without executing the function on any of the legitimate input, Saxon 7 immediately tells us about the following problem:
Error at xsl:choose on line 13 of file:/C:/dat/writing/trxml/temp/sswb1.xsl:
  Cannot compare xs:string to xs:integer
Transformation failed: Run-time errors were reported
The stronger typing offered by XSLT 2.0 lets us plan for this a little better. By adding an as attribute to the function's declaration for the length parameter, like this,
<xsl:param name="length" as="xs:integer"/>
we tell the XSLT processor to check the types of the parameters when they're passed, instead of waiting for the bad data to blow up in some line of the stylesheet that doesn't know what to do with it. (Don't forget to add xmlns:xs="http://www.w3.org/2001/XMLSchema" to the other namespace declarations in the stylesheet's start-tag.) With length declared using this typing, Saxon 7 catches the error sooner and delivers a more informative error message:
Error at xsl:value-of on line 35 of sswb2.xsl:
  Required type of second argument of *** call to user function ***() is xs:integer; supplied value has type xs:string
Transformation failed: Failed to compile stylesheet. 1 error detected.
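Once length is declared as xs:integer, a caller whose value arrives as a string has to convert it explicitly. The following sketch shows one way this might be done with the xs:integer() constructor function. The maxLength stylesheet parameter and its default value are hypothetical, and the as="xs:string" on inString is an added assumption rather than part of the original stylesheet.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:foo="http://whatever">

  <xsl:output method="text"/>

  <!-- Hypothetical stylesheet parameter holding a string value. -->
  <xsl:param name="maxLength" select="'15'"/>

  <xsl:variable name="delimiters"> ,."!?()</xsl:variable>

  <xsl:function name="foo:substrWordBoundary">
    <xsl:param name="inString" as="xs:string"/>
    <xsl:param name="length" as="xs:integer"/>
    <xsl:choose>
      <xsl:when test="$length &lt;= 0">
        <xsl:value-of select="$inString"/>
      </xsl:when>
      <xsl:when test="string-length($inString) &lt;= $length">
        <xsl:value-of select="$inString"/>
      </xsl:when>
      <xsl:when test="contains($delimiters,
                      substring($inString,$length + 1,1))">
        <xsl:value-of select="substring($inString,1,$length)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of
            select="foo:substrWordBoundary($inString,$length - 1)"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

  <xsl:template match="/">
    <!-- xs:integer() converts the string before the type check;
         a value such as 'five' would fail here, at the call site. -->
    <xsl:value-of select="foo:substrWordBoundary(
        'This is a test.Right? Yes.', xs:integer($maxLength))"/>
  </xsl:template>

</xsl:stylesheet>
```

The conversion keeps any failure at the point where the bad value enters, which is the same benefit the typed parameter gives inside the function.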
Nearly all serious programming languages offer the ability to declare and use your own functions; most programmers have become accustomed to the modularity and scalability advantages that this gives them. Now XSLT 2 developers will have these advantages as well.
When I learned about the xsl:function element and its idea of node sequences, I realized that I could implement the classic LISP car and cdr functions, which return the first item and the remainder of a list, respectively. LISP does stand for "LISt Processing," after all, and not "Lots of Irritating Silly Parentheses." These two functions don't do much by themselves, but as two of the basic building blocks of LISP and later Scheme, they've provided the foundation for useful applications for over 40 years. (The origin of the names "car" and "cdr," the latter pronounced "could-er," is one of the classic twisted stories from the history of computer science.)
After the following stylesheet declares these two functions, it outputs the sample input list delimited by pipe characters. It then tests the functions individually and combines them into a more complex expression to extract the third member of the list sequence:
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:foo="http://whatever">

  <xsl:output method="text"/>

  <xsl:variable name="seq1" select="('a','b','c','d')"/>

  <xsl:function name="foo:cdr">
    <xsl:param name="seq"/>
    <xsl:for-each select="subsequence($seq,2)">
      <xsl:value-of select="."/>
    </xsl:for-each>
  </xsl:function>

  <xsl:function name="foo:car">
    <xsl:param name="seq"/>
    <xsl:value-of select="item-at($seq,1)"/>
  </xsl:function>

  <xsl:template match="/">
    seq1: <xsl:value-of select="string-join($seq1,'|')"/>
    car(seq1): <xsl:value-of select="string-join(foo:car($seq1),'|')"/>
    cdr(seq1): <xsl:value-of select="string-join(foo:cdr($seq1),'|')"/>
    car(cdr(cdr(seq1))): <xsl:value-of select=
      "string-join(foo:car(foo:cdr(foo:cdr($seq1))),'|')"/>
  </xsl:template>

</xsl:stylesheet>
The output shows that it works. It may not look particularly useful, but it should provoke a smirk from some of the grayer-haired developers out there:
    seq1: a|b|c|d
    car(seq1): a
    cdr(seq1): b|c|d
    car(cdr(cdr(seq1))): c
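The stylesheet above was written against Saxon 7's experimental support. As far as I know, the item-at() function did not survive into the final XPath 2.0 function library, so a processor that follows the finished Recommendation would likely need a variant along these lines. This is a sketch under that assumption: it uses a positional predicate in place of item-at(), and it uses xsl:sequence with declared item() types so that the functions return the original items instead of text nodes.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:foo="http://whatever">

  <xsl:output method="text"/>

  <xsl:variable name="seq1" select="('a','b','c','d')"/>

  <!-- First item of the sequence (empty if the sequence is empty). -->
  <xsl:function name="foo:car" as="item()?">
    <xsl:param name="seq" as="item()*"/>
    <xsl:sequence select="$seq[1]"/>
  </xsl:function>

  <!-- Everything after the first item. -->
  <xsl:function name="foo:cdr" as="item()*">
    <xsl:param name="seq" as="item()*"/>
    <xsl:sequence select="subsequence($seq, 2)"/>
  </xsl:function>

  <xsl:template match="/">
    <xsl:value-of select="string-join(foo:cdr($seq1), '|')"/>
    <xsl:text>&#10;</xsl:text>
    <xsl:value-of select="foo:car(foo:cdr(foo:cdr($seq1)))"/>
  </xsl:template>

</xsl:stylesheet>
```

Run against any source document, this should write b|c|d on the first line and c on the second, matching the corresponding lines of the output above.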
XML.com Copyright © 1998-2006 O'Reilly Media, Inc.