XML.com: XML From the Inside Out

Finding the First, Last, Biggest, Smallest

August 07, 2002

Sometimes you want to know which element or record is first or last in a given set, or which one has the greatest or smallest value among the corresponding values in that set: for example, which employee element has the lowest value for a hireDate subelement or attribute. These operations are typically performed by a query language, but you don't need a separate query language to do them when you're developing with XSLT. If you can describe a set of nodes from a document with a single-step XPath expression, then you can get the first of those nodes by adding a predicate of [1] to that expression, and you can get the last one by adding a predicate of [last()]. To get the element or attribute with the greatest or smallest value, you can sort the nodes using any of the sorting options that we saw in last month's column and then pick out the node at either end of the sorted list.

To demonstrate, we'll pull out different titles from the following chapter document. (All stylesheets, sample input, and sample output can be found in this zip file.) Note the nesting structure of the chapter element and its descendants that begin with a title element.

<chapter><title>"Paradise Lost" Excerpt</title>
  <para>Then with expanded wings he steers his flight</para>
  <figure><title>"Incumbent on the Dusky Air"</title>
  <graphic fileref="pic1.jpg"/></figure>
  <para>Aloft, incumbent on the dusky Air</para>
  <sect1>
    <para>That felt unusual weight, till on dry Land</para>
    <figure><title>"He Lights"</title>
    <graphic fileref="pic2.jpg"/></figure>
    <para>He lights, if it were Land that ever burned</para>
    <sect2>
      <para>With solid, as the Lake with liquid fire</para>
      <figure><title>"The Lake with Liquid Fire"</title>
      <graphic fileref="pic3.jpg"/></figure>
    </sect2>
  </sect1>
</chapter>

The following template lists the first and last title elements in the chapter document by adding the [1] and [last()] predicates to the XPath expression descendant::title, which selects all of the title elements within the chapter element, whether they're children, grandchildren, or more distant descendants.

<!-- xq358.xsl: converts xq357.xml into xq359.txt -->

<xsl:template match="chapter">
  First title in chapter: 
  <xsl:value-of select="descendant::title[1]"/>
  Last title in chapter: 
  <xsl:value-of select="descendant::title[last()]"/>
</xsl:template>

Although the first title element in the chapter is a child of the chapter element and the last title element is a great-great-grandchild (being a grandchild of the sect2 element, which is a grandchild of chapter), the template rule finds them and adds their contents to the result tree:


  First title in chapter: 
  "Paradise Lost" Excerpt
  Last title in chapter: 
  "The Lake with Liquid Fire"

Why doesn't this work with a multi-step XPath expression? Because a predicate in an XPath location step applies only to the nodes selected by that location step. For example, let's say we want the title of the last figure element in the chapter shown above. The XPath expression in the following template won't do it.

<!-- xq360.xsl: converts xq357.xml into xq361.txt -->

<xsl:template match="chapter">
  Last figure title in chapter?
     <xsl:value-of select="descendant::figure/title[last()]"/>
  No.
</xsl:template>

The [last()] predicate here isn't asking for the last figure title in the chapter; it's asking for the last title element within each figure element. Each figure element has only one title, so the expression returns a node-set containing every figure element's title element. When an XSLT 1.0 xsl:value-of instruction converts a node-set to a text node for the result tree, it uses only the first node in document order, so we see the first figure element's title element in the result:


  Last figure title in chapter?
     "Incumbent on the Dusky Air"
  No.

What if we really do want the title of the last figure element in the chapter? The secret to getting the first or last node of a node list described by a more complex XPath expression is to have an xsl:for-each instruction get the list of nodes in question and to then get the last (or first) one in that list.

For example, the following template rule has an xsl:for-each instruction going through the title elements of all the figure elements descended from the context node. As it goes through them, one xsl:if element checks whether each node is the first in this list, and if so, it adds a message about this to the result tree. A second xsl:if element does the same for the last node in the list.

<!-- xq362.xsl: converts xq357.xml into xq363.txt -->

<xsl:template match="chapter">

  <xsl:for-each select="descendant::figure/title">

    <xsl:if test="position() = 1">
      First figure title in chapter: <xsl:value-of select="."/>
    </xsl:if>

    <xsl:if test="position() = last()">
      Last figure title in chapter: <xsl:value-of select="."/>
    </xsl:if>

  </xsl:for-each>

</xsl:template>

The result shows just what we wanted: the title of the first figure element in the document and the title of the last figure element in the document.

      First figure title in chapter: "Incumbent on the Dusky Air"
      Last figure title in chapter: "The Lake with Liquid Fire"
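
As a side note that goes beyond the sample files for this column, XPath 1.0 also lets you express the same selections in a single select attribute, without xsl:for-each. The following is a minimal sketch, assuming the same chapter input and an XSLT 1.0 processor. Wrapping the whole path in parentheses makes the predicate apply to the combined node-set in document order, and moving the predicate onto the figure step picks the last figure before stepping down to its title.

```xml
<!-- Sketch only; not one of the column's numbered sample files. -->
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="chapter">
    <!-- Parentheses: the predicate filters the whole node-set. -->
    First figure title in chapter:
    <xsl:value-of select="(descendant::figure/title)[1]"/>
    Last figure title in chapter:
    <xsl:value-of select="(descendant::figure/title)[last()]"/>
    <!-- Predicate on the figure step: last figure, then its title. -->
    Title of last figure in chapter:
    <xsl:value-of select="descendant::figure[last()]/title"/>
  </xsl:template>

</xsl:stylesheet>
```

For the chapter document above, the first expression selects "Incumbent on the Dusky Air" and the other two both select "The Lake with Liquid Fire". The xsl:for-each approach is still the one to use when the nodes have to be sorted first, because a predicate alone only sees document order.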

What if we wanted the figure titles that were the first and last alphabetically? We simply add an xsl:sort instruction inside the xsl:for-each element.

<!-- xq364.xsl: converts xq357.xml into xq365.txt -->

<xsl:template match="chapter">

  <xsl:for-each select="descendant::figure/title">

    <xsl:sort/>

    <xsl:if test="position() = 1">
      First figure title in chapter: <xsl:value-of select="."/>
    </xsl:if>

    <xsl:if test="position() = last()">
      Last figure title in chapter: <xsl:value-of select="."/>
    </xsl:if>

  </xsl:for-each>

</xsl:template>

The result shows the first and last entries from an alphabetically sorted list of figure titles.


      First figure title in chapter: "He Lights"
      Last figure title in chapter: "The Lake with Liquid Fire"

Because the xsl:sort instruction has no select attribute to identify a sort key, a default sort key of "." is used. This makes the string-value of the current node (in this case, each node that the xsl:for-each element is iterating over) the sort key. (Last month's column described the various attributes of xsl:sort that control how the sorting is performed, along with their default values; see also my book XSLT Quickly for more discussion of the xsl:sort instruction.)
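
To make those defaults visible, here is a sketch of the same stylesheet with the select, data-type, and order attributes written out using their XSLT 1.0 default values. It is an illustration rather than one of the column's sample files, and it should behave the same as the bare xsl:sort element shown above.

```xml
<!-- Sketch only; not one of the column's numbered sample files. -->
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="chapter">
    <xsl:for-each select="descendant::figure/title">

      <!-- Equivalent to <xsl:sort/>: defaults spelled out. -->
      <xsl:sort select="." data-type="text" order="ascending"/>

      <xsl:if test="position() = 1">
      First figure title in chapter: <xsl:value-of select="."/>
      </xsl:if>

      <xsl:if test="position() = last()">
      Last figure title in chapter: <xsl:value-of select="."/>
      </xsl:if>

    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Changing order to "descending" reverses the sorted list, so the title that was reported as last would then be at position 1.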

In addition to using the xsl:sort instruction to find the first and last values alphabetically, you can use it to find the first and last or greatest and smallest values for any sort key. For example, let's say we want to know who has the highest and lowest salaries of all the employees in the following list.

<employees>

  <employee hireDate="04/23/1999">
    <last>Hill</last>
    <first>Phil</first>
    <salary>100000</salary>
  </employee>

  <employee hireDate="09/01/1998">
    <last>Herbert</last>
    <first>Johnny</first>
    <salary>95000</salary>
  </employee>

  <employee hireDate="08/20/2000">
    <last>Hill</last>
    <first>Graham</first>
    <salary>89000</salary>
  </employee>

</employees>

The following template rule sorts the employee elements within the employees element by their salary, with a data-type attribute telling the XSLT processor to treat the salary values as numbers and not as strings. (Otherwise, a salary of "100000" would come before a salary of "89000".) As with the previous example, two xsl:if elements add messages to the result for the first and last nodes in the list that the xsl:for-each instruction is counting through.

<!-- xq367.xsl: converts xq366.xml into xq368.txt -->

<xsl:template match="employees">
  <xsl:for-each select="employee">
    <xsl:sort select="salary" data-type="number"/>

    <xsl:if test="position() = 1">
      Lowest salary: <xsl:apply-templates/>
    </xsl:if>

    <xsl:if test="position() = last()">
      Highest salary: <xsl:apply-templates/>
    </xsl:if>

  </xsl:for-each>
</xsl:template>

Because this list is sorted numerically by employee salary, the result tells us which employees have the lowest and highest salaries:


      Lowest salary: 
    Hill
    Graham
    89000
  
      Highest salary: 
    Hill
    Phil
    100000
  
    

If the employees' salary figures were stored in an attribute instead of in an element, finding the largest and smallest salary figures would be the same, except that the template would sort the employee elements using the salary attribute value as a sort key instead of the salary child element.
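
As a sketch of that variation, assume a hypothetical version of the input in which each employee element carries its salary as an attribute, for example <employee hireDate="04/23/1999" salary="100000">, with the last and first child elements unchanged. Only the sort key changes from salary to @salary. Because xsl:apply-templates with no select attribute does not process attributes, this sketch writes the name and salary out with xsl:value-of instead.

```xml
<!-- Sketch only; assumes salary is an attribute of employee. -->
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="employees">
    <xsl:for-each select="employee">
      <!-- Sort key is now the attribute, still compared as a number. -->
      <xsl:sort select="@salary" data-type="number"/>

      <xsl:if test="position() = 1">
      Lowest salary: <xsl:value-of
          select="concat(first, ' ', last, ' ', @salary)"/>
      </xsl:if>

      <xsl:if test="position() = last()">
      Highest salary: <xsl:value-of
          select="concat(first, ' ', last, ' ', @salary)"/>
      </xsl:if>

    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With the same salary figures as the element-based list, this reports Graham Hill at 89000 as the lowest and Phil Hill at 100000 as the highest.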

Remember, for anything you can sort on, you can always find the first or last values of the sorted list. This makes it easy to find the biggest, smallest, earliest, latest, or whatever values the first and last entries of that sorted list represent.
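
The hireDate attributes in the employee list above are one case where the sort key needs some preparation, because MM/DD/YYYY strings don't sort chronologically as they stand. The following sketch, which is an illustration and not one of the column's sample files, assumes every hireDate value is zero-padded exactly as in the sample. It uses the XPath substring() and concat() functions to rearrange each date into YYYYMMDD form and sorts on that.

```xml
<!-- Sketch only; assumes hireDate is always zero-padded MM/DD/YYYY. -->
<xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="employees">
    <xsl:for-each select="employee">
      <!-- Build YYYYMMDD: year (chars 7-10), month (1-2), day (4-5). -->
      <xsl:sort data-type="number"
          select="concat(substring(@hireDate, 7, 4),
                         substring(@hireDate, 1, 2),
                         substring(@hireDate, 4, 2))"/>

      <xsl:if test="position() = 1">
      Earliest hire: <xsl:value-of
          select="concat(first, ' ', last, ' ', @hireDate)"/>
      </xsl:if>

      <xsl:if test="position() = last()">
      Latest hire: <xsl:value-of
          select="concat(first, ' ', last, ' ', @hireDate)"/>
      </xsl:if>

    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For the three employees shown earlier, the earliest hire is Johnny Herbert (09/01/1998) and the latest is Graham Hill (08/20/2000).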


