Sorting in XSLT

July 3, 2002

Bob DuCharme

XSLT's xsl:sort instruction lets you sort a group of similar elements. Attributes for this element let you add details about how you want the sort done -- for example, you can sort using alphabetic or numeric ordering, sort on multiple keys, and reverse the sort order.

To demonstrate different ways to sort, we'll use the following document.

<employees>

  <employee hireDate="04/23/1999">
    <last>Hill</last>
    <first>Phil</first>
    <salary>100000</salary>
  </employee>

  <employee hireDate="09/01/1998">
    <last>Herbert</last>
    <first>Johnny</first>
    <salary>95000</salary>
  </employee>

  <employee hireDate="08/20/2000">
    <last>Hill</last>
    <first>Graham</first>
    <salary>89000</salary>
  </employee>
</employees>

(All stylesheets, input documents, and output documents shown in this article are in this zip file.) This first stylesheet sorts the employee children of the employees element by salary.

<!-- xq424.xsl: converts xq423.xml into xq425.xml -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="employees">
    <xsl:apply-templates>
      <xsl:sort select="salary"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="employee">
    Last:      <xsl:apply-templates select="last"/>
    First:     <xsl:apply-templates select="first"/>
    Salary:    <xsl:apply-templates select="salary"/>
    Hire Date: <xsl:apply-templates select="@hireDate"/>
    <xsl:text>
  </xsl:text>

  </xsl:template>

</xsl:stylesheet>

It's pretty simple. The employees element's template has an xsl:apply-templates instruction with an xsl:sort child that tells the XSLT processor to sort the employees element's child elements. The xsl:sort instruction's select attribute specifies the sort key to use: the employee elements' salary values. (If you omit the select attribute, the XSLT processor uses the string value of each node being sorted as its sort key.) The employee element's template rule adds the value of each child element and of the hireDate attribute to the result tree, each preceded by a label, and a final xsl:text element adds a line break after each hire date value.

Note: Most xsl:apply-templates elements you see in XSLT stylesheets are empty. When you sort an element's children, the xsl:sort element goes between the start- and end-tags of the xsl:apply-templates element that tells the XSLT processor to process these children. The only other place you can put an xsl:sort instruction is inside of the xsl:for-each instruction used to iterate across a node set, as we'll see below.

With the document above, this stylesheet gives us this output:

    Last:      Hill
    First:     Phil
    Salary:    100000
    Hire Date: 04/23/1999
  
    Last:      Hill
    First:     Graham
    Salary:    89000
    Hire Date: 08/20/2000
  
    Last:      Herbert
    First:     Johnny
    Salary:    95000
    Hire Date: 09/01/1998

The employees are sorted by salary, but they're sorted alphabetically -- "1" comes before "8" and "9", so a salary of "100000" comes first. But we want the salary values treated as numbers, so we make a simple addition to the template's xsl:sort instruction:

<!-- xq426.xsl: converts xq423.xml into xq427.xml -->

<xsl:template match="employees">
  <xsl:apply-templates>
    <xsl:sort select="salary" data-type="number"/>
  </xsl:apply-templates>
</xsl:template>

Now, the output is sorted by the salary element's numeric value:

  Last:      Hill
  First:     Graham
  Salary:    89000
  Hire Date: 08/20/2000

  Last:      Herbert
  First:     Johnny
  Salary:    95000
  Hire Date: 09/01/1998

  Last:      Hill
  First:     Phil
  Salary:    100000
  Hire Date: 04/23/1999

To reverse the order of this or any other sort, add an order attribute with a value of "descending":

<!-- xq428.xsl: converts xq423.xml into xq429.xml -->

<xsl:template match="employees">
  <xsl:apply-templates>
    <xsl:sort select="salary" data-type="number" order="descending"/>
  </xsl:apply-templates>
</xsl:template>

Whether the data-type attribute has a value of "number" like the stylesheet above or "text" (the default), an order value of "descending" reverses the order of the sort:

  Last:      Hill
  First:     Phil
  Salary:    100000
  Hire Date: 04/23/1999

  Last:      Herbert
  First:     Johnny
  Salary:    95000
  Hire Date: 09/01/1998

  Last:      Hill
  First:     Graham
  Salary:    89000
  Hire Date: 08/20/2000
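
The same attribute works with the default text ordering. As a minimal sketch (this stylesheet is not one of the files in the article's zip file, and its select="employee" and shortened employee template are my own simplifications to keep the output compact), the following sorts the employees by last name in reverse alphabetical order:

```xml
<!-- Sketch only: not part of the article's zip file -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="employees">
    <!-- select="employee" skips the white space between employee elements -->
    <xsl:apply-templates select="employee">
      <!-- data-type defaults to "text", so this is a reverse alphabetical sort -->
      <xsl:sort select="last" order="descending"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="employee">
    <xsl:value-of select="last"/>
    <xsl:text>, </xsl:text>
    <xsl:value-of select="first"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

With the employees document above, both Hill entries should come out ahead of Herbert. Because the two Hill entries have equal sort keys, they stay in document order relative to each other.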

If your xsl:apply-templates (or xsl:for-each) element has more than one xsl:sort instruction inside of it, the XSLT processor treats them as multiple keys to the sort. For example, the stylesheet with this next template sorts the employees by last name and then by first name so that any employees with the same last name will be in first name order.

<!-- xq430.xsl: converts xq423.xml into xq431.xml -->

<xsl:template match="employees">
  <xsl:apply-templates>
    <xsl:sort select="last"/>
    <xsl:sort select="first"/>
  </xsl:apply-templates>
</xsl:template>

When applied to the document above, the result shows Johnny Herbert before Phil and Graham Hill, and the secondary sort puts Graham Hill before Phil Hill:

  Last:      Herbert
  First:     Johnny
  Salary:    95000
  Hire Date: 09/01/1998

  Last:      Hill
  First:     Graham
  Salary:    89000
  Hire Date: 08/20/2000

  Last:      Hill
  First:     Phil
  Salary:    100000
  Hire Date: 04/23/1999
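
Each xsl:sort instruction carries its own attributes, so the keys of a multiple-key sort don't have to agree on data type or direction. The following is a sketch under that assumption (it is not one of the article's numbered stylesheets, and the compact employee template is only there to keep the output short): it sorts by last name in ascending text order and then, within each last name, by salary from highest to lowest.

```xml
<!-- Sketch only: not part of the article's zip file -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="employees">
    <xsl:apply-templates select="employee">
      <!-- primary key: last name, ascending text order (the defaults) -->
      <xsl:sort select="last"/>
      <!-- secondary key: salary, compared as a number, highest first -->
      <xsl:sort select="salary" data-type="number" order="descending"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="employee">
    <xsl:value-of select="last"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="first"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="salary"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the employees document, this should list Johnny Herbert first, then Phil Hill (100000) ahead of Graham Hill (89000).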
  

The sort key doesn't need to be an element child of the sorted elements. The xsl:sort instruction's select attribute can take any XPath expression as a sort key. For example, the following version sorts the employees by their hireDate attribute values:

<!-- xq432.xsl: converts xq423.xml into xq433.xml -->

<xsl:template match="employees">
  <xsl:apply-templates>
    <xsl:sort select="@hireDate"/>
  </xsl:apply-templates>
</xsl:template>

Treating the dates as strings doesn't do much good, because they're sorted alphabetically,

  Last:      Hill
  First:     Phil
  Salary:    100000
  Hire Date: 04/23/1999

  Last:      Hill
  First:     Graham
  Salary:    89000
  Hire Date: 08/20/2000

  Last:      Herbert
  First:     Johnny
  Salary:    95000
  Hire Date: 09/01/1998
  

but it's easy enough to have three sort keys based on the year, month, and day substrings of the date string:

<!-- xq434.xsl: converts xq423.xml into xq435.xml -->

<xsl:template match="employees">
  <xsl:apply-templates>
    <xsl:sort select="substring(@hireDate,7,4)"/> <!-- year  -->
    <xsl:sort select="substring(@hireDate,1,2)"/> <!-- month -->
    <xsl:sort select="substring(@hireDate,4,2)"/> <!-- day   -->
  </xsl:apply-templates>
</xsl:template>

This stylesheet sorts the dates properly. (An important feature of XSLT 2.0 -- and, some say, the one that's going to slow its progress toward Recommendation status the most -- is the ability to handle typed data. Once in place, you'll be able to just say "this attribute is a date, so sort it that way.")

  Last:      Herbert
  First:     Johnny
  Salary:    95000
  Hire Date: 09/01/1998

  Last:      Hill
  First:     Phil
  Salary:    100000
  Hire Date: 04/23/1999

  Last:      Hill
  First:     Graham
  Salary:    89000
  Hire Date: 08/20/2000
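
If you'd rather use one key than three, the pieces of the date can be glued together into a single sortable value. This is a sketch rather than one of the article's stylesheets, and it assumes every hireDate value is in zero-padded MM/DD/YYYY form, so that the month is characters 1 and 2, the day is characters 4 and 5, and the year starts at character 7:

```xml
<!-- Sketch only: not part of the article's zip file -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="employees">
    <xsl:apply-templates select="employee">
      <!-- "04/23/1999" becomes the string "19990423", compared as a number -->
      <xsl:sort data-type="number"
                select="concat(substring(@hireDate,7,4),
                               substring(@hireDate,1,2),
                               substring(@hireDate,4,2))"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="employee">
    <xsl:value-of select="@hireDate"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="last"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

For the three employees shown earlier, the keys work out to 19980901, 19990423, and 20000820, which puts them in hire date order.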
  

All the examples so far have sorted the children (the employee elements) of an element (employees) using values that belong to those children (the salary, first, and last child elements or the hireDate attribute) as sort keys. The hireDate examples showed that the expression used as the xsl:sort element's select attribute doesn't have to be a child element name; it can be an attribute name instead, or even a value returned by a function such as substring().

Your sort key can be an even more complex XPath expression. For example, the next stylesheet sorts the wine elements in this document's winelist element, but not by a child of the wine element; it sorts the wine elements by a grandchild of the wine elements: the prices child's discounted element.

<winelist>

  <wine grape="Chardonnay">
    <winery>Lindeman's</winery>
    <product>Bin 65</product>
    <year>1998</year>
    <prices>
      <list>6.99</list>
      <discounted>5.99</discounted>
      <case>71.50</case>
    </prices>
  </wine>

<wine grape="Chardonnay">
  <winery>Benziger</winery>
  <product>Carneros</product>
  <year>1997</year>
  <prices>
    <list>10.99</list>
    <discounted>9.50</discounted>
    <case>114.00</case>
  </prices>
</wine>

  <wine grape="Cabernet">
    <winery>Duckpond</winery>
    <product>Merit Selection</product>
    <year>1996</year>
    <prices>
      <list>13.99</list>
      <discounted>11.99</discounted>
      <case>143.50</case>
    </prices>
  </wine>

  <wine grape="Chardonnay">
    <winery>Kendall Jackson</winery>
    <product>Vintner's Reserve</product>
    <year>1998</year>
    <prices>
      <list>12.50</list>
      <discounted>9.99</discounted>
      <case>115.00</case>
    </prices>
  </wine>
</winelist>

The sort key is only slightly more complicated than those shown in the earlier examples. It's an XPath expression saying "the discounted child of the prices element".

<!-- xq437.xsl: converts xq436.xml into xq438.xml -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

  <xsl:template match="winelist">
    <xsl:copy>
      <xsl:apply-templates>
        <xsl:sort data-type="number" select="prices/discounted"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="*">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

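The entire stylesheet is not very big. It just copies the winelist element and the elements inside it, with the wine elements sorted according to the sort key. (Neither template rule copies attributes, so the grape attributes don't appear in the result.)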

<?xml version="1.0" encoding="UTF-8"?>
<winelist>
<wine>
    <winery>Lindeman's</winery>
    <product>Bin 65</product>
    <year>1998</year>
    <prices>
      <list>6.99</list>
      <discounted>5.99</discounted>
      <case>71.50</case>
    </prices>
  </wine><wine>
  <winery>Benziger</winery>
  <product>Carneros</product>
  <year>1997</year>
  <prices>
    <list>10.99</list>
    <discounted>9.50</discounted>
    <case>114.00</case>
  </prices>
</wine><wine>
    <winery>Kendall Jackson</winery>
    <product>Vintner's Reserve</product>
    <year>1998</year>
    <prices>
      <list>12.50</list>
      <discounted>9.99</discounted>
      <case>115.00</case>
    </prices>
  </wine><wine>
    <winery>Duckpond</winery>
    <product>Merit Selection</product>
    <year>1996</year>
    <prices>
      <list>13.99</list>
      <discounted>11.99</discounted>
      <case>143.50</case>
    </prices>
  </wine></winelist>
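
The sort key can also be computed rather than read directly from a node. As a sketch (not one of the article's stylesheets; the text output and the use of format-number() are my own choices for illustration), this version sorts the wines by the size of the discount, that is, the list price minus the discounted price:

```xml
<!-- Sketch only: not part of the article's zip file -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="winelist">
    <xsl:apply-templates select="wine">
      <!-- the spaces around "-" make it a subtraction, not part of a name -->
      <xsl:sort data-type="number"
                select="prices/list - prices/discounted"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="wine">
    <xsl:value-of select="winery"/>
    <xsl:text>: </xsl:text>
    <!-- round the difference to two decimal places for display -->
    <xsl:value-of
         select="format-number(prices/list - prices/discounted, '0.00')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

With the winelist document above, the smallest discount (Lindeman's, about 1.00) should come first and the largest (Kendall Jackson, about 2.51) last, with Benziger and Duckpond in between.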

Let's look at how the xsl:for-each instruction can use xsl:sort. The following stylesheet takes the same winelist document above and lists the wines. When it gets to a Chardonnay, it lists all the other Chardonnays alphabetically.

<!-- xq439.xsl: converts xq436.xml into xq440.txt -->
<!DOCTYPE stylesheet [
<!ENTITY space "<xsl:text> </xsl:text>">
]>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
  <xsl:output method="xml" omit-xml-declaration="yes" indent="no"/>

  <xsl:template match="wine">
   <xsl:apply-templates select="winery"/>&space;
   <xsl:apply-templates select="product"/>&space;
   <xsl:apply-templates select="year"/>&space;
   <xsl:apply-templates select="@grape"/>
   <xsl:if test="@grape = 'Chardonnay'">
    <xsl:text>
  other Chardonnays:
</xsl:text>
    <xsl:for-each 
      select="preceding-sibling::wine[@grape = 'Chardonnay'] |
                     following-sibling::wine[@grape = 'Chardonnay']">
      <xsl:sort select="winery"/>
      <xsl:text>    </xsl:text>
      <xsl:value-of select="winery"/>&space;
      <xsl:value-of select="product"/><xsl:text>
</xsl:text>
    </xsl:for-each>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>

Before we examine how the stylesheet does this, let's take a look at the result:

Lindeman's Bin 65 1998 Chardonnay
  other Chardonnays:
    Benziger Carneros
    Kendall Jackson Vintner's Reserve

Benziger Carneros 1997 Chardonnay
  other Chardonnays:
    Kendall Jackson Vintner's Reserve
    Lindeman's Bin 65

  Duckpond Merit Selection 1996 Cabernet

  Kendall Jackson Vintner's Reserve 1998 Chardonnay
  other Chardonnays:
    Benziger Carneros
    Lindeman's Bin 65
    

(First, notice the "&space;" entity references throughout the stylesheet. Instead of writing "<xsl:text> </xsl:text>" over and over because I needed single spaces in so many places, it was easier to declare an entity named space in the DOCTYPE declaration with this xsl:text element as content and to then plug it in with an entity reference whenever I needed it.) The xsl:template template rule for the wine element has xsl:apply-templates instructions for its winery, product, and year element children followed by one for its grape attribute. Then, if the grape attribute has a value of "Chardonnay", it adds the text "other Chardonnays:" to the result tree followed by the list of Chardonnays, which are added to the result tree using an xsl:for-each instruction.

The select attribute of the xsl:for-each instruction selects all the wine elements that are either preceding siblings of the current node with a grape value of "Chardonnay" or following siblings of the current node with the same grape value. (The "|" symbol is XPath's union operator, which combines the two node sets into one; that's the "or" part.) The first instruction in this xsl:for-each element is an xsl:sort element, which tells the XSLT processor to sort the selected nodes alphabetically in "winery" order instead of leaving them in document order. For each wine element that meets the select attribute's condition, the template then adds some white space indenting with an xsl:text element, followed by the value of the wine element's winery child, a space, and the value of its product child. The sorted order shows in the result: after the first "other Chardonnays:" label, "Kendall Jackson" comes after "Benziger"; after the second, "Lindeman's" comes after "Kendall Jackson"; and after the last one, "Lindeman's" comes after "Benziger".

Because the xsl:for-each instruction lets you grab and work with any node set that you can describe using an XPath expression, the ability to sort those node sets makes xsl:for-each one of XSLT's most powerful features.
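
To illustrate that point, here is a minimal sketch (not one of the article's stylesheets; the price threshold of 10 is an arbitrary value chosen for the example) that uses an XPath predicate to pick out only the wines with a discounted price under 10 and then sorts that node set by year, newest first, and by winery within each year:

```xml
<!-- Sketch only: not part of the article's zip file -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

  <xsl:output method="text"/>

  <xsl:template match="winelist">
    <!-- "<" must be written as &lt; inside an attribute value -->
    <xsl:for-each select="wine[prices/discounted &lt; 10]">
      <!-- xsl:sort instructions must come first inside xsl:for-each -->
      <xsl:sort select="year" data-type="number" order="descending"/>
      <xsl:sort select="winery"/>
      <xsl:value-of select="year"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="winery"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With the winelist document above, three wines pass the predicate. The two from 1998 (Kendall Jackson, then Lindeman's) should be listed before the 1997 Benziger.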

Next month, we'll see how xsl:sort can help find the first, last, biggest, and smallest value in a document according to a certain criterion. (If you can't wait, see my book XSLT Quickly, from which these columns are excerpted.)