XML.com: XML From the Inside Out

Automatic Numbering, Part Two

December 11, 2002

In last month's column, we saw how XSLT's xsl:number element lets you add numbers to your output for numbered lists or chapter and section headers. We looked at some of the attributes of this element that let you control its appearance -- for example, how to output the numbers as Roman numerals, as letters instead of numbers, and how to add leading zeros. We also saw the basics of numbering sections and subsections, but only the basics. This month we'll learn how to gain real control over section numbering, and we'll look at a more efficient alternative to xsl:number that's sometimes better for simple numbering.

We left off adding section numbers to the output of the following document's sections. They were all whole numbers: 1., 2., 3., and so forth, restarting the numbering with each chapter. (All sample documents and stylesheets are available in this zip file.)

<book><title>Title of Book</title>
 <chapter><title>First Chapter</title>
  <sect1><title>First Section, First Chapter</title>
    <figure><title>First picture in book</title>
      <graphic fileref="pic1.jpg"/></figure>
  </sect1>
 </chapter>
 <chapter><title>Second Chapter</title>
  <sect1><title>First Section, Second Chapter</title>
   <sect2>
    <title>First Subsection, First Section, Second Chapter</title>
    <figure><title>Second picture in book</title>
      <graphic fileref="pic2.jpg"/></figure>
   </sect2>
   <sect2>
    <title>Second Subsection, First Section, Second Chapter</title>
    <figure><title>Third picture in book</title>
      <graphic fileref="pic1.jpg"/></figure>
   </sect2>
   <sect2>
    <title>Third Subsection, First Section, Second Chapter</title>
    <figure><title>Fourth picture in book</title>
      <graphic fileref="pic1.jpg"/></figure>
   </sect2>
  </sect1>
  <sect1><title>Second Section, Second Chapter</title>
   <para>The End.</para>
  </sect1>
 </chapter>
</book>

If we want the sect1 elements in chapter 2 to be numbered 2.1, 2.2, and so on, the "sect1" template rule from our last stylesheet needs a count attribute to tell the XSLT processor which of the multiple levels of elements to count. The template rule for the nested color elements didn't need this because when no count attribute is specified, the XSLT processor counts any node with the same node type and name as the current node. In that case the current node was a color element, so it counted all the color element nodes, even when some were inside of others.

In this next template rule the xsl:number instruction counts chapter and sect1 elements to figure out the number to assign to each sect1 element:

<!-- xq408.xsl: converts xq405.xml into xq409.txt -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes"/>

<xsl:template match="sect1">
  <xsl:number format="1. " level="multiple" count="chapter|sect1"/>
  <xsl:apply-templates/>
</xsl:template>

</xsl:stylesheet>

Applying this template rule to the same source document numbers the sect1 elements as 1.1, 2.1, and 2.2, but it doesn't number the chapter elements:

Title of Book
 First Chapter
  1.1. First Section, First Chapter
    First picture in book
      
  
 
 Second Chapter
  2.1. First Section, Second Chapter
   
    First Subsection, First Section, Second Chapter
    Second picture in book
      
   
   
    Second Subsection, First Section, Second Chapter
    Third picture in book
      
   
   
    Third Subsection, First Section, Second Chapter
    Fourth picture in book
      
   
  
  2.2. Second Section, Second Chapter
   The End.
  
 

That's because it's a template rule for the sect1 element, and the stylesheet had no template rule to add numbers for the chapter elements. The next stylesheet includes a "chapter" template rule along with a sect2 template rule that counts the chapter, sect1, and sect2 elements to create three-level numbers for the sect2 elements.

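<!-- xq410.xsl: converts xq405.xml into xq411.txt -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes"/>

<xsl:template match="chapter">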
  <xsl:number format="1. "/>
  <xsl:apply-templates/>
</xsl:template>

<xsl:template match="sect1">
  <xsl:number format="1. " level="multiple" count="chapter|sect1"/>
  <xsl:apply-templates/>
</xsl:template>

<xsl:template match="sect2">
  <xsl:number format="1. " level="multiple" 
              count="chapter|sect1|sect2"/>
  <xsl:apply-templates/>
</xsl:template>

</xsl:stylesheet>

With the same document as input, the output shows first, second, and third level numbered headers for the chapter, sect1, and sect2 elements:

Title of Book
 1. First Chapter
  1.1. First Section, First Chapter
    First picture in book
      
  
 
 2. Second Chapter
  2.1. First Section, Second Chapter
   2.1.1. 
    First Subsection, First Section, Second Chapter
    Second picture in book
      
   
   2.1.2. 
    Second Subsection, First Section, Second Chapter
    Third picture in book
      
   
   2.1.3. 
    Third Subsection, First Section, Second Chapter
    Fourth picture in book
      
   
  
  2.2. Second Section, Second Chapter
   The End.
  
 
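The three template rules in xq410.xsl differ only in how many element names their count attributes list. Because a level value of "multiple" numbers only the ancestors (and the element itself) that match the count pattern, a single rule covering all three element types should produce the same numbering. The stylesheet below is a sketch of that idea, not one of the column's sample files:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes"/>

<!-- One rule for all three levels. For each matched element, the
     processor numbers only the chapter, sect1, and sect2 elements on
     its ancestor-or-self path, so a chapter gets a one-part number,
     a sect1 a two-part number, and a sect2 a three-part number. -->
<xsl:template match="chapter|sect1|sect2">
  <xsl:number format="1. " level="multiple"
              count="chapter|sect1|sect2"/>
  <xsl:apply-templates/>
</xsl:template>

</xsl:stylesheet>
```

The separate rules remain the better choice when each level needs its own format value or its own surrounding markup.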

What if we don't want renumbering to restart with the beginning of each chapter or section? For example, if we number the figure elements with the following template rule,

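<!-- xq412.xsl: converts xq405.xml into xq413.txt -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes"/>

<xsl:template match="figure">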
  <xsl:number format="1. "/><xsl:apply-templates/>
</xsl:template>

</xsl:stylesheet>

they all come out as number 1, because each is the first figure element within its particular container element:

Title of Book
 First Chapter
  First Section, First Chapter
    1. First picture in book
      
  
 
 Second Chapter
  First Section, Second Chapter
   
    First Subsection, First Section, Second Chapter
    1. Second picture in book
      
   
   
    Second Subsection, First Section, Second Chapter
    1. Third picture in book
      
   
   
    Third Subsection, First Section, Second Chapter
    1. Fourth picture in book
      
   
  
  Second Section, Second Chapter
   The End.
  
 

We want the figures to be numbered sequentially throughout the book, so we set the xsl:number element's level attribute to "any". This tells the XSLT processor to count every node like the current node (in this case, every figure element) that appears at or before the current node anywhere in the document, regardless of nesting level. With this one small change in the previous template rule,

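<!-- xq414.xsl: converts xq405.xml into xq415.txt -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes"/>

<xsl:template match="figure">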
  <xsl:number format="1. " level="any"/><xsl:apply-templates/>
</xsl:template>

</xsl:stylesheet>

the result versions of the figure elements are numbered 1, 2, 3, and 4 throughout the document regardless of the source tree level where each one is located:

Title of Book
 First Chapter
  First Section, First Chapter
    1. First picture in book
      
  
 
 Second Chapter
  First Section, Second Chapter
   
    First Subsection, First Section, Second Chapter
    2. Second picture in book
      
   
   
    Second Subsection, First Section, Second Chapter
    3. Third picture in book
      
   
   
    Third Subsection, First Section, Second Chapter
    4. Fourth picture in book
      
   
  
  Second Section, Second Chapter
   The End.
  
 

If you don't want numbering to start at the beginning of your document and keep advancing throughout that document, but you also don't want an element type's number reset to 1 each time that element shows up in a new container, you can use the xsl:number element's from attribute to constrain a level value of "any". If the from attribute names an element type (and it can name several, because you can use a pattern here), counting restarts each time one of those elements starts, and only when one of those named elements starts.

This gives you more flexibility than using a level value of multiple because the XSLT processor won't worry about the number of hierarchical levels between the elements being counted and the ones used for resetting the counting. For example, the following template rule is just like the last one except for its from value of "chapter":

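<!-- xq416.xsl: converts xq405.xml into xq417.txt -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes"/>

<xsl:template match="figure">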
  <xsl:number format="1. " level="any" from="chapter"/>
  <xsl:apply-templates/>
</xsl:template>

</xsl:stylesheet>

It numbers the figure elements sequentially, regardless of their level, restarting the count with each new chapter:

Title of Book
 First Chapter
  First Section, First Chapter
    1. First picture in book
      
  
 
 Second Chapter
  First Section, Second Chapter
   
    First Subsection, First Section, Second Chapter
    1. Second picture in book
      
   
   
    Second Subsection, First Section, Second Chapter
    2. Third picture in book
      
   
   
    Third Subsection, First Section, Second Chapter
    3. Fourth picture in book
      
   
  
  Second Section, Second Chapter
   The End.
  
 
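A per-chapter figure count is often paired with the chapter number itself, as in "2-3" for the third figure of the second chapter. One way to sketch this is with two xsl:number instructions in the same rule. The chapter-hyphen-figure style and the format value of "1-" are illustrative choices here, not something taken from the column's sample files:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes"/>

<xsl:template match="figure">
  <!-- Chapter number: with the default level of "single", the
       processor climbs to the nearest chapter ancestor and counts
       it among its chapter siblings. -->
  <xsl:number count="chapter" format="1-"/>
  <!-- Figure number within that chapter, exactly as in xq416.xsl. -->
  <xsl:number format="1. " level="any" from="chapter"/>
  <xsl:apply-templates/>
</xsl:template>

</xsl:stylesheet>
```

With the sample document, the four figure titles should come out prefixed with 1-1., 2-1., 2-2., and 2-3.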

You can use the from attribute with a level value of "multiple", but the XSLT processor will still reset the counting with descendants of any from elements. For example, if the xsl:number instruction in the last example had said <xsl:number format="1. " level="multiple" from="chapter|figure"/>, all the figures would have been number 1, because a level setting of "multiple" counts each element only among its siblings, and each figure in this document is the only figure in its sect1 or sect2 container.

Heavy use of the xsl:number element can cost you, in the computer science sense of the word. (By "heavy use" I don't necessarily mean stylesheets that use this instruction a lot -- if your stylesheet only uses it once in one template and your source document has the XSLT processor calling that template 1000 times, that's heavy use.) For really simple numbering, an xsl:value-of instruction that uses the position() function in its select attribute can mean a faster document transformation than using the xsl:number instruction.

The following template rule uses the position() function in an xsl:value-of element's select attribute to put numbers before each color element's content as it's added to the result tree. The xsl:text element adds a carriage return after the contents of each color element.

<!-- xq418.xsl: converts xq394.xml into xq419.txt -->

  <xsl:template match="colors">
    <xsl:for-each select="color">
      <xsl:value-of select="position()"/>. <xsl:value-of select="."/>
      <xsl:text>
</xsl:text>
    </xsl:for-each>
  </xsl:template>

With the simple colors document from last month's column (xq394.xml), it creates this output:

1. red
2. green
3. blue
4. yellow

Adding numbers this way doesn't give you all the formatting control that you have with the xsl:number instruction, but if you only need a simple sequence of numbers in your list, doing it this way can mean much faster transformation times.
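
Speed is not the only difference. Because position() reports a node's place in the list currently being processed, and not its place in the source tree, it keeps producing 1, 2, 3 even after the list has been reordered. The following sketch adds an xsl:sort to the xsl:for-each instruction shown above. It is not one of the column's sample files, and it assumes that xq394.xml is a colors element whose color children each hold a single color name:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="text"/>

<xsl:template match="colors">
  <xsl:for-each select="color">
    <!-- Sort the selected color elements alphabetically by content.
         position() then reflects the sorted order. -->
    <xsl:sort select="."/>
    <xsl:value-of select="position()"/>. <xsl:value-of select="."/>
    <xsl:text>
</xsl:text>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

Under that assumption about the input, the list should read 1. blue, 2. green, 3. red, 4. yellow. An xsl:number instruction in the same spot would instead number each color by where it sits in the source document.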

Remember that position() in this example refers to the node's position among the nodes selected for the xsl:for-each instruction -- in this case, the color children of the colors element. If you had tried to number the color elements this way,

<!-- xq420.xsl: converts xq394.xml into xq421.txt -->

  <xsl:template match="color">
      <xsl:value-of select="position()"/>. <xsl:apply-templates/>
  </xsl:template>
    

the XSLT processor would have counted each color element's position among all of the colors element's children, including the text nodes storing the carriage returns between each color element in the source tree. The result would be this:


  2. red
  4. green
  6. blue
  8. yellow

The text nodes holding those carriage returns are the first, third, fifth, seventh, and ninth child nodes of the colors element. Numbers don't show up for them because the stylesheet only has a template rule for color elements, and none to add the carriage returns to the result tree. Still, the position() function counts both the color children and the text node children between them to determine the numbers to put before each color name in the result document. (One thing that makes it confusing is that carriage returns are whitespace, so the extra nodes being counted are nodes that you can't see.) In the previous example, the xsl:for-each instruction ensured that the position() function only counted the nodes that we wanted it to count: the color element nodes.
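
If you would rather keep a separate "color" template rule, the same effect should be possible by controlling which nodes are handed to it. In this sketch (not one of the column's sample files, and again assuming a colors element with color children), a "colors" template rule selects only the color elements, so the whitespace text nodes never enter the list that position() counts:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
<xsl:output method="text"/>

<!-- Pass only the color children along. The whitespace text nodes
     between them are not selected, so they are not counted. -->
<xsl:template match="colors">
  <xsl:apply-templates select="color"/>
</xsl:template>

<xsl:template match="color">
  <xsl:value-of select="position()"/>. <xsl:apply-templates/>
  <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
```

Another option is a top-level <xsl:strip-space elements="colors"/> declaration, which removes the whitespace-only text nodes from the colors element before any counting happens.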

For more on the vagaries of handling whitespace with XSLT, see the three-part series I did in this column last November, December, and January, or see my book XSLT Quickly.