XML.com: XML From the Inside Out

What's New in XSLT 2.0
by Evan Lenz

Grouping Simplified

XSLT 1.0 did not include built-in support for grouping. Many grouping problems can be solved using techniques such as the Muenchian Method, but such solutions tend to be complex and verbose. One of XSLT 2.0's requirements was that it must simplify grouping. As we shall see from a simple example below, it is well on its way to meeting that goal.
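
For reference before turning to that example, the Muenchian Method can be sketched as follows. This is a minimal XSLT 1.0 illustration, not a canonical form: it assumes an input document whose root cities element contains city elements carrying name, country, and pop attributes, and the key name city-by-country is invented for the sketch.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Index every city element by the value of its country attribute.
       The key name "city-by-country" is hypothetical. -->
  <xsl:key name="city-by-country" match="city" use="@country"/>

  <xsl:template match="/">
    <table>
      <!-- Keep only the first city (in document order) for each country:
           a city passes the test when it is the same node as the first
           node the key returns for its country. -->
      <xsl:for-each select="cities/city[generate-id() =
                     generate-id(key('city-by-country', @country)[1])]">
        <tr>
          <td><xsl:value-of select="@country"/></td>
          <td>
            <!-- All cities that share the current city's country. -->
            <xsl:for-each select="key('city-by-country', @country)">
              <xsl:value-of select="@name"/>
              <xsl:if test="position() != last()">, </xsl:if>
            </xsl:for-each>
          </td>
          <td>
            <xsl:value-of
                select="sum(key('city-by-country', @country)/@pop)"/>
          </td>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>

</xsl:stylesheet>
```

Even in this short sketch the key lookup has to be written three times, which gives a sense of the verbosity involved.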

An example that's used in both the Requirements document and the XSLT 2.0 working draft involves converting the list of cities in the following simple XML document,

<cities>
  <city name="milan"  country="italy"   pop="5"/>
  <city name="paris"  country="france"  pop="7"/>
  <city name="munich" country="germany" pop="4"/>
  <city name="lyon"   country="france"  pop="2"/>
  <city name="venice" country="italy"   pop="1"/>
</cities>

to an HTML table that groups the cities by the country they are in, as follows:

<table>
   <tr>
      <th>Country</th>
      <th>City List</th>
      <th>Population</th>
   </tr>
   <tr>
      <td>italy</td>
      <td>milan, venice</td>
      <td>6</td>
   </tr>
   <tr>
      <td>france</td>
      <td>paris, lyon</td>
      <td>9</td>
   </tr>
   <tr>
      <td>germany</td>
      <td>munich</td>
      <td>4</td>
   </tr>
</table>

The difficult part of this transformation is generating the last three rows, one for each country. An XSLT 1.0 solution can be seen below:

  <xsl:for-each select="cities/city[not(@country =
                           preceding::*/@country)]">
    <tr>
      <td><xsl:value-of select="@country"/></td>
      <td>
        <xsl:for-each select="../city[@country = current()/@country]">
          <xsl:value-of select="@name"/>
          <xsl:if test="position() != last()">, </xsl:if>
        </xsl:for-each>
      </td>
      <td><xsl:value-of select="sum(../city[@country =
                 current()/@country]/@pop)"/></td>
    </tr>
  </xsl:for-each>

In the above example, we first select the first city in document order for each distinct country; a city qualifies only if no earlier node carries the same country value. The following XPath expression does this:

cities/city[not(@country = preceding::*/@country)]

Then, for each group, we need to be able to refer back to all members of the group, in order to get the list of city names for each country as well as the total population for each country. In each case, we have some redundancy because the only way to refer to the current group is with an expression such as the following:

../city[@country = current()/@country]

This is clearly not an ideal situation, since the redundancy tends to make it rather error-prone. Enter xsl:for-each-group, XSLT 2.0's answer to many of your grouping problems. The following example shows the much simpler XSLT 2.0 solution to this problem. The new features are xsl:for-each-group with its group-by attribute, the current-group() function, and the separator attribute on xsl:value-of:

  <xsl:for-each-group select="cities/city" group-by="@country">
    <tr>
      <td><xsl:value-of select="@country"/></td>
      <td>
        <xsl:value-of select="current-group()/@name" separator=", "/>
      </td>
      <td><xsl:value-of select="sum(current-group()/@pop)"/></td>
    </tr>
  </xsl:for-each-group>

In the above example, xsl:for-each-group initializes the "current group" as part of the XPath evaluation context. The current group is simply a sequence. Once we've set up our group using the group-by attribute, we can thereafter refer to the current group using the current-group() function. This completely eliminates the redundancy that was present in the XSLT 1.0 solution.

Note also the separator attribute on xsl:value-of. The mere presence of this attribute instructs the processor to output not just the string value of the first member of the sequence (XSLT 1.0's behavior), but the string values of all members of the sequence, in sequence order. The attribute itself is optional; when it is present, its value is the string inserted as a delimiter between adjacent values in the output. For the sake of backward compatibility with XSLT 1.0, only the string value of the sequence's first member is output when the separator attribute is not present.
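
The separator attribute can also be tried on its own, without any grouping. The following is a minimal sketch, assuming the cities document shown earlier as input and a processor that implements the behavior described in this article:

```xml
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
               version="2.0">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Writes the name of every city, in document order,
         with " | " between consecutive names. -->
    <xsl:value-of select="cities/city/@name" separator=" | "/>
  </xsl:template>

</xsl:transform>
```

For the five cities in the sample document, this writes "milan | paris | munich | lyon | venice".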

Finally, xsl:for-each-group can solve different kinds of grouping problems depending on which of three attributes you use: group-by (which we saw in action above), group-adjacent (which groups nodes that are adjacent in document order, e.g. transforming inline <para> elements into block <para> elements), and group-starting-with (which starts a new group whenever an element in the sequence matches a given pattern). Examples of each of these can be found in the latest XSLT 2.0 Working Draft in "13.3 Examples of Grouping".
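
To give a feel for group-starting-with, here is a minimal sketch that turns a flat run of headings and paragraphs into nested sections. The input vocabulary (body, h2, p) and the output element names (chapter, section, para) are assumptions made for illustration; they are not taken from the working draft.

```xml
<!-- Assumed input:
       <body>
         <h2>Intro</h2>
         <p>One</p>
         <p>Two</p>
         <h2>Details</h2>
         <p>Three</p>
       </body>
-->
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
               version="2.0">

  <xsl:template match="body">
    <chapter>
      <!-- Each h2 child starts a new group; the elements that follow it,
           up to the next h2, belong to the same group. -->
      <xsl:for-each-group select="*" group-starting-with="h2">
        <!-- The context node is the first member of the group,
             which here is the h2 that started it. -->
        <section title="{self::h2}">
          <xsl:for-each select="current-group()[self::p]">
            <para><xsl:value-of select="."/></para>
          </xsl:for-each>
        </section>
      </xsl:for-each-group>
    </chapter>
  </xsl:template>

</xsl:transform>
```

With the assumed input, this should produce one section titled "Intro" containing two para elements, followed by one section titled "Details" containing a single para.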

User-defined Functions

XSLT 2.0 introduces the ability for users to define their own functions, which can then be used in XPath expressions. This is an extremely powerful mechanism that should prove to be very useful. Stylesheet functions, as they are called, are defined using the xsl:function element. This element has one required attribute, the name attribute. It contains zero or more xsl:param elements, followed by zero or more xsl:variable elements, followed by exactly one xsl:result element. This restricted content model may sound limiting, but you will discover that the real power lies in the use of XPath 2.0 to define the result in the select attribute of the xsl:result element. As you may recall, XPath 2.0 includes the ability to do conditional expressions (if...then...else) and iterative expressions (for...return).

As the following example (taken straight from the latest working draft) shows, most of the work is done inside the select attribute of xsl:result. This stylesheet invokes the user's recursively-defined function, str:reverse(), to output the string "MAN BITES DOG".

<xsl:transform 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:str="http://user.com/namespace"
  version="2.0"
  exclude-result-prefixes="str">

<xsl:function name="str:reverse">
  <xsl:param name="sentence"/>
  <xsl:result 
     select="if (contains($sentence, ' '))
             then concat(str:reverse(substring-after($sentence, ' ')),
                         ' ',
                         substring-before($sentence, ' '))
             else $sentence"/>             
</xsl:function>

<xsl:template match="/">
<output>
  <xsl:value-of select="str:reverse('DOG BITES MAN')"/>
</output>
</xsl:template>

</xsl:transform>

Other Useful Stuff

XSLT 2.0 includes a number of other useful features that we won't go into in detail here. They include a mechanism for defining a default namespace for XPath expressions, the ability to use variables in match pattern predicates, named sort specifications, the ability to read external files as unparsed text, and so on.

In addition, a large part of the XSLT 2.0 specification remains to be written, particularly the material dealing with the construction and copying of W3C XML Schema-typed content. About this, the latest working draft says, "This is work in progress. Facilities for associating type information with constructed elements and attributes are likely to appear in future drafts of XSLT 2.0."

Getting Your Hands Dirty

For those of you who can't wait to start trying some of this stuff out, Michael Kay has released Saxon 7.0, which includes an "experimental implementation of XSLT 2.0 and XPath 2.0". It implements a number of features in the XSLT 2.0 and XPath 2.0 working drafts, with particular attention to those features that are likely the most stable. I've tested each of the examples in this article, and Saxon 7.0 executes them all as expected.

XSLT 2.0 is still very much a work in progress, so be forewarned that a number of things could change between now and the time it reaches Recommendation status. Until then, the public is encouraged to review the specification and send their comments to xsl-editors@w3.org.


