XML.com: XML From the Inside Out

Axis Powers: Part Two

January 03, 2001

In part one of this two-part column, we looked at the role of axes in the XPath expressions that are used so often in XSLT stylesheets, and we looked more closely at the child, parent, ancestor, ancestor-or-self, following-sibling, preceding-sibling, and attribute axes. In this column, we'll examine the remaining axes offered by the XPath specification and see what they can do in XSLT stylesheets.

The preceding and following Axes

The preceding and following axis specifiers let you address nodes that aren't necessarily siblings. The preceding axis contains all the nodes that end before the context node begins, and the following axis contains all the nodes that begin after the context node ends (attribute and namespace nodes are never on either axis). We'll use these axes in a template rule for the test elements in the sample document below. We want this template to add messages naming the titles of the preceding and following chapters.

<story>

  <chapter><title>Chapter 1</title>
    <para>A Dungeon horrible, on all sides round</para>
  </chapter>

  <chapter><title>Chapter 2</title>
    <para>More unexpert, I boast not: them let those</para>
    <test/>
    <para>Contrive who need, or when they need, not now.</para>
    <sect><title>Chapter 2, Section 1</title>
      <para>For while they sit contriving, shall the rest,</para>
      <test/>
      <para>Millions that stand in Arms, and longing wait</para>
    </sect>
  </chapter>

  <chapter><title>Chapter 3</title>
    <para>So thick a drop serene hath quenched their Orbs</para>
  </chapter>

</story>

The first test element needs to point at the nodes preceding and following its parent because it's a child of a chapter element. The second test element must point at the nodes preceding and following its grandparent because it's in a sect element in a chapter element.

Despite the two test elements' different depths in the source tree, the preceding and following axes let a stylesheet use the same template rule for both. (The comment in each sample stylesheet identifies the filenames of the stylesheet, the sample input, and the output in the zip file of sample code.)

<!-- u85-8-7.xsl: converts u85-8-19.xml into u85-8-8.xml -->

  <xsl:template match="test">

    Previous chapter:
    (<xsl:value-of select="preceding::chapter[1]/title"/>)
    Next chapter:
    (<xsl:value-of select="following::chapter/title"/>)
    <xsl:apply-templates/>

  </xsl:template>

The first xsl:value-of element's XPath expression tells the XSLT processor "go to the first node named chapter that finished before the context node (for this template, the test element) started, and get its title element's contents." As with the preceding-sibling example in last month's column, this XPath expression needs the [1] predicate to show that, of all the chapter elements preceding the context node, the XSLT processor should grab the first one. Because preceding is a reverse axis, positions are counted backwards from the context node, so "the first one" means the nearest chapter that ends before the test element begins.
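To see the backwards counting at work, here is a small sketch (a hypothetical stylesheet, not one of the files in the sample zip) that compares preceding::para[1] with (preceding::para)[1] for each test element in the sample document above. Inside a location step, the predicate counts along the reverse axis; once the node set is wrapped in parentheses, it is counted in document order instead.

```xml
<!-- Hypothetical example, not in the sample zip file:
     reverse-axis counting versus document-order counting -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <xsl:for-each select="//test">
    <!-- [1] inside the step: the nearest para that ended before this test -->
    <xsl:text>nearest: </xsl:text>
    <xsl:value-of select="preceding::para[1]"/>
    <!-- [1] on the parenthesized set: the first such para in document order -->
    <xsl:text>&#10;earliest: </xsl:text>
    <xsl:value-of select="(preceding::para)[1]"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

For the second test element, the "nearest" line should show "For while they sit contriving, shall the rest," and the "earliest" line should show "A Dungeon horrible, on all sides round".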

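The second xsl:value-of element's XPath expression tells the XSLT processor to get the contents of the title element in the first chapter element that begins after the context node ends. It needs no [1] predicate because following is a forward axis and, in XSLT 1.0, xsl:value-of uses only the first node in document order of the node set it selects. When a stylesheet with this template is run with the sample document above, the output for both test elements in Chapter 2 is the same.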

<story>

  <chapter><title>Chapter 1</title>
    <para>A Dungeon horrible, on all sides round</para>
  </chapter>

  <chapter><title>Chapter 2</title>
    <para>More unexpert, I boast not: them let those</para>
    
    Previous chapter:
    (Chapter 1)
    Next chapter:
    (Chapter 3)
    
    <para>Contrive who need, or when they need, not now.</para>
    <sect><title>Chapter 2, Section 1</title>
      <para>For while they sit contriving, shall the rest,</para>

    Previous chapter:
    (Chapter 1)
    Next chapter:
    (Chapter 3)
    
      <para>Millions that stand in Arms, and longing wait</para>
    </sect>
  </chapter>

  <chapter><title>Chapter 3</title>
    <para>So thick a drop serene hath quenched their Orbs</para>
  </chapter>

</story>
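Because every test element in this sample has exactly one chapter ancestor, the same titles can also be reached the long way: climb to the enclosing chapter with the ancestor axis, then step sideways with the sibling axes from part one. The sketch below (hypothetical, not in the sample zip file) prints both versions side by side. It assumes chapters are never nested inside other chapters; with nested chapters the two approaches could disagree.

```xml
<!-- Hypothetical example, not in the sample zip file:
     preceding/following compared with ancestor plus sibling axes -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <xsl:for-each select="//test">
    <!-- Previous chapter, found directly and via the enclosing chapter -->
    <xsl:value-of select="preceding::chapter[1]/title"/>
    <xsl:text> = </xsl:text>
    <xsl:value-of
      select="ancestor::chapter/preceding-sibling::chapter[1]/title"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Next chapter, found directly and via the enclosing chapter -->
    <xsl:value-of select="following::chapter[1]/title"/>
    <xsl:text> = </xsl:text>
    <xsl:value-of
      select="ancestor::chapter/following-sibling::chapter[1]/title"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

The preceding and following versions are shorter, and they don't need to know how deep the context node sits inside its chapter.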

descendant and descendant-or-self

The descendant axis refers to the context node's children, the children's children, and any other descendants of the context node at any level. For example, let's say that when we transform the following document we want to list all of a chapter's pictures at the beginning of the result tree's chapter. Of its three figure elements, the first is a child of the chapter element, the second is a grandchild, and the third is a great-grandchild.

<chapter>
  <par>Then with expanded wings he steers his flight</par>
  <figure><title>"Incumbent on the Dusky Air"</title>
  <graphic fileref="pic1.jpg"/></figure>
  <par>Aloft, incumbent on the dusky Air</par>
  <sect1>
   <par>That felt unusual weight, till on dry Land</par>
   <figure><title>"He Lights"</title>
   <graphic fileref="pic2.jpg"/></figure>
   <par>He lights, if it were Land that ever burned</par>
   <sect2>
    <par>With solid, as the Lake with liquid fire</par>
    <figure><title>"The Lake with Liquid Fire"</title>
    <graphic fileref="pic3.jpg"/></figure>
   </sect2>
  </sect1>
</chapter>

After the chapter start-tag in the result tree document, we want to see the string "Pictures:" as a header; after that, we want the titles of all the chapter's figure elements:

<chapter>
Pictures: 
"Incumbent on the Dusky Air"
"He Lights"
"The Lake with Liquid Fire"

  <par>Then with expanded wings he steers his flight</par>
  <figure><title>"Incumbent on the Dusky Air"</title>
  <graphic fileref="pic1.jpg"/></figure>
  <par>Aloft, incumbent on the dusky Air</par>
  <sect1>
    <par>That felt unusual weight, till on dry Land</par>
    <figure><title>"He Lights"</title>
    <graphic fileref="pic2.jpg"/></figure>
    <par>He lights, if it were Land that ever burned</par>
    <sect2>
      <par>With solid, as the Lake with liquid fire</par>
      <figure><title>"The Lake with Liquid Fire"</title>
      <graphic fileref="pic3.jpg"/></figure>
    </sect2>
  </sect1> </chapter>

The following template rule uses the descendant axis specifier to find these figure titles even though they are at three different levels in the source tree document. The template first adds the string "Pictures:" to the result tree after the chapter start-tag.

We saw in our earlier examples of axis specifiers that xsl:value-of adds a text node for only the first of the nodes that the location path selects. In this case we want all of them, so the template uses an xsl:for-each instruction to iterate across the node set. The xsl:for-each element's select attribute names the node set to iterate over, and an xsl:value-of instruction adds the contents of each figure element's title to the result tree. The xsl:text element with only a line break as its content adds that character after each title that the xsl:value-of element adds to the result tree.

<!-- u85-8-10.xsl: converts u85-8-9.xml into u85-8-11.xml -->

<xsl:template match="chapter">
  <chapter>
Pictures: 
<xsl:for-each select="descendant::figure">
<xsl:value-of select="title"/><xsl:text>
</xsl:text>
    </xsl:for-each>
    <xsl:apply-templates/>
  </chapter>
</xsl:template>

(Indenting the elements of your stylesheets makes them easier to read, but it can also indent your output. In the template above, some of the instructions are not indented so that the list of picture titles in the result tree will not be indented either.)
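Another way to keep the output flush left is to put all literal text inside xsl:text elements. An XSLT 1.0 processor strips whitespace-only text nodes from the stylesheet, so the instructions can then be indented freely. The following is a sketch of that approach, not one of the files in the sample zip; the copy-everything template at the end is an assumption about the rest of the original stylesheet, which the column doesn't show.

```xml
<!-- Hypothetical variation on u85-8-10.xsl, not in the sample zip file -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

<xsl:template match="chapter">
  <chapter>
    <!-- xsl:text keeps the stylesheet's own indentation out of the result -->
    <xsl:text>&#10;Pictures:&#10;</xsl:text>
    <xsl:for-each select="descendant::figure">
      <xsl:value-of select="title"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
    <xsl:apply-templates/>
  </chapter>
</xsl:template>

<!-- Assumed: copy all other nodes through unchanged -->
<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
```

The character reference &#10; is a line feed, so each title still ends up on its own line.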

As you may guess about an axis named descendant-or-self, it covers a context node's descendants and the context node itself. Imagine that you wanted to list all the people who worked on the following chapter by listing the values of any author attributes anywhere in it.

<chapter author="jm">
  <par>Then with expanded wings he steers his flight</par>
  <par author="ar">Aloft, incumbent on the dusky Air</par>
  <sect1 author="bd">
    <par>That felt unusual weight, till on dry Land</par>
    <par>He lights, if it were Land that ever burned</par>
    <sect2 author="jm">
      <par>With solid, as the Lake with liquid fire</par>
    </sect2>
  </sect1>
</chapter>

The chapter template rule that makes this possible uses an asterisk node test with a descendant-or-self axis specifier in an xsl:for-each instruction to go through all the elements that qualify as descendant-or-self nodes.

<!-- u85-8-13.xsl: converts u85-8-12.xml into u85-8-14.xml -->

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
     version="1.0">

<xsl:template match="chapter">
  <chapter>
Authors
<xsl:for-each select="descendant-or-self::*/@author">
<xsl:value-of select="."/><xsl:text>
</xsl:text>
    </xsl:for-each>
    <xsl:apply-templates/>
  </chapter>
</xsl:template>

</xsl:stylesheet>

The descendant-or-self::* part is the first step of the location path in the xsl:for-each element's select attribute. The second step (@author) checks for an author attribute of each of these descendant-or-self nodes. (Remember, @author is just an abbreviation of attribute::author.) Inside the xsl:for-each instruction, the xsl:value-of element grabs a string version of each selected node and adds it to the result tree; the xsl:text element after that adds a line break.

The output shows every author value in document order, even when they're repeated:

<?xml version="1.0" encoding="utf-8"?><chapter>
Authors
jm
ar
bd
jm

  Then with expanded wings he steers his flight
  Aloft, incumbent on the dusky Air
  
    That felt unusual weight, till on dry Land
    He lights, if it were Land that ever burned
    
      With solid, as the Lake with liquid fire
    
  
</chapter>
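If the repeated "jm" is unwanted, one possible fix combines this axis with two others from this series. The sketch below (hypothetical, not in the sample zip file) keeps an author attribute only when no element that precedes or contains its parent element carries the same value. It is meant for small documents like this one, not as a general-purpose grouping technique.

```xml
<!-- Hypothetical example, not in the sample zip file:
     list each author value only once -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

<xsl:output method="text"/>

<xsl:template match="chapter">
  <!-- ".." steps from the attribute to its parent element, so the
       element's own author attribute is not compared with itself -->
  <xsl:for-each select="descendant-or-self::*/@author[
        not(. = ../preceding::*/@author) and
        not(. = ../ancestor::*/@author)]">
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

With the sample chapter above, this should list jm, ar, and bd once each. The ancestor test is needed because an element's ancestors are never on its preceding axis.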

There is an abbreviation for this axis as well or, rather, for a common XPath fragment that uses this axis: "//" means the same as "/descendant-or-self::node()/". Note the slashes that begin and end the fragment: the abbreviation stands for the whole sequence, so it's as if you dropped "descendant-or-self::node()" from between the two slashes and left only "//".

It's a nice shortcut to refer to any descendant of a given element with a particular name. For example, while chapter/title means "any title child of a chapter element," chapter//title means "any title descendant of a chapter element." An XPath expression that begins with these two slashes refers to any descendant of the document's root with that name -- in other words, any element with that name. For example, //title refers to any title element anywhere in the document.
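Counting the nodes that each expression selects makes the difference concrete. The following sketch (hypothetical, not in the sample zip file) is meant to be run against the story document from the first section of this column.

```xml
<!-- Hypothetical example, not in the sample zip file:
     child step versus "//" on the story sample document -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

<xsl:output method="text"/>

<xsl:template match="story">
  <!-- Only title elements that are children of a chapter -->
  <xsl:text>chapter/title: </xsl:text>
  <xsl:value-of select="count(chapter/title)"/>
  <!-- title elements at any depth inside a chapter -->
  <xsl:text>&#10;chapter//title: </xsl:text>
  <xsl:value-of select="count(chapter//title)"/>
  <!-- title elements anywhere in the document -->
  <xsl:text>&#10;//title: </xsl:text>
  <xsl:value-of select="count(//title)"/>
  <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
```

The counts should be 3, 4, and 4: the sect element's title is a grandchild of its chapter, so only the two "//" expressions pick it up.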

This abbreviation doesn't have to be used with elements. The XPath expression chapter//@author refers to any author attribute in a chapter or in one of its descendants and //comment() refers to all the comments in a document.
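As a quick check that the abbreviation and the spelled-out fragment really select the same attribute nodes, here is a sketch (hypothetical, not in the sample zip file) for the chapter document with the author attributes shown earlier.

```xml
<!-- Hypothetical example, not in the sample zip file:
     "//" with attributes, abbreviated and unabbreviated -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <!-- Name of the element that owns each author attribute, then its value -->
  <xsl:for-each select="chapter//@author">
    <xsl:value-of select="name(..)"/>
    <xsl:text>: </xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:for-each>
  <!-- The unabbreviated form should select the same number of nodes -->
  <xsl:text>abbreviated: </xsl:text>
  <xsl:value-of select="count(chapter//@author)"/>
  <xsl:text>&#10;unabbreviated: </xsl:text>
  <xsl:value-of
    select="count(chapter/descendant-or-self::node()/@author)"/>
  <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
```

Both counts should be 4, and the list should include the chapter element's own author attribute, because descendant-or-self covers the chapter itself.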

The "//" abbreviation has one interesting advantage over the XPath fragment it represents: in addition to using it in XPath expressions, you can use it in match patterns (that is, in the value of an xsl:template element's match attribute). Match pattern syntax is a subset of XPath expression syntax; there are various things you can use in an XPath expression that you can't use in a match pattern. The steps of a pattern may use only the child and attribute axes, so the descendant-or-self axis is one of these things, but you are free to use the "//" abbreviation in both XPath expressions and match patterns.
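Here is a sketch of "//" in a match pattern (hypothetical, not in the sample zip file), meant for the chapter document with the three figure elements shown earlier. The empty text() template is there only to keep the rest of the document's text out of the output.

```xml
<!-- Hypothetical example, not in the sample zip file:
     the "//" abbreviation in a match pattern -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

<xsl:output method="text"/>

<!-- Matches figure elements at any depth inside a sect1 element -->
<xsl:template match="sect1//figure">
  <xsl:value-of select="title"/>
  <xsl:text>&#10;</xsl:text>
</xsl:template>

<!-- Suppress all text that the template above doesn't output -->
<xsl:template match="text()"/>

</xsl:stylesheet>
```

This should output the titles "He Lights" and "The Lake with Liquid Fire" (with their quotation marks) but not the title of the first figure, which is a child of the chapter element and not inside the sect1 element.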

self

The self axis refers to the context node itself. We've already used an abbreviation of it several times in the last few examples: the single period (.) is an abbreviation of self::node(), which says "give me the context node, whatever its type is." (The node() node test is true for any node. The XPath expression needs it there because a node test is a required part of every location step.) Where you see <xsl:value-of select="."/> in the last few examples, the stylesheets would work exactly the same if the select attribute had the value self::node().
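The equivalence is easy to check. The sketch below (hypothetical, not in the sample zip file) prints each author attribute of the earlier chapter document twice, once with the abbreviation and once with the full form.

```xml
<!-- Hypothetical example, not in the sample zip file:
     "." and self::node() select the same node -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

<xsl:output method="text"/>

<xsl:template match="/">
  <xsl:for-each select="//@author">
    <!-- Abbreviated form -->
    <xsl:value-of select="."/>
    <xsl:text> = </xsl:text>
    <!-- Unabbreviated form; node() is true for attribute nodes as well -->
    <xsl:value-of select="self::node()"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

Each output line should show the same value on both sides of the equals sign. Note that self::* would select nothing here, because the asterisk node test on the self axis is true only for elements and the context nodes in this loop are attributes.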

namespace

The last axis is namespace. This points to a node set consisting of the namespace bound to the built-in xml prefix and any other namespaces that are in scope for the context node. For example, the following template rule lists the prefixes for a test element's namespace nodes:

<!-- u85-8-16.xsl: converts u85-8-17.xml into u85-8-18.xml -->

<xsl:template match="test">
  <xsl:for-each select="namespace::*"> 
    <xsl:value-of select="name()"/><xsl:text> </xsl:text>
  </xsl:for-each>
</xsl:template>

For a document like this with three namespaces declared,

<test xmlns:snee="http://www.snee.com/dtds/test"
      xmlns:glikk="http://www.glikk.com/dtds/test"
      xmlns:flunn="http://www.flunn.com/dtds/test">
this is a test.
</test>

the stylesheet lists the prefixes for the built-in xml namespace and the three declared ones, with a space after each name (the order may vary from processor to processor):

xml flunn glikk snee
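Each namespace node also has a string value, which is the namespace URI that the prefix is bound to. The following sketch (a hypothetical variation on the template rule above, not in the sample zip file) lists each prefix together with its URI.

```xml
<!-- Hypothetical variation on u85-8-16.xsl, not in the sample zip file -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

<xsl:output method="text"/>

<xsl:template match="test">
  <xsl:for-each select="namespace::*">
    <!-- name() returns the prefix of a namespace node -->
    <xsl:value-of select="name()"/>
    <xsl:text> = </xsl:text>
    <!-- the string value of a namespace node is its namespace URI -->
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

For the sample document, the snee, glikk, and flunn prefixes should be paired with the URIs in their declarations, and the xml prefix with http://www.w3.org/XML/1998/namespace.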

Some XPath axes are more popular than others, so popular that many XSLT developers don't realize that there are 13 in all for use in XPath expressions. Play around with all of them (you can download the examples shown in this column here), and you'll find that the lesser-known ones can often solve your thornier XPath problems.