
Top Ten Tips to Using XPath and XPointer
John Simpson is the author of XPath and XPointer
XPath and XPointer allow XML developers and document authors to find and manipulate specific needles of content in an XML document's haystack. From mindful use of predicates, to processor efficiency, to exploring both the standards themselves and extensions to them, this article offers ten tips -- techniques and gotchas -- to bear in mind as you use XPath and XPointer in your own work.
1. Beware of whitespace when counting nodes.
Consider a simple document, such as the following:
<year>
<month monthnum="4">April</month>
</year>
Ask yourself: how many children does the year element
have?
If you think the answer is one, the month element, you're
wrong. Answering the question with XPath might look something like
this:
count(/year/node())
If you submit this XPath expression to a "pure" XPath-aware processor,
such as the one built into the Saxon XSLT engine, you're told that
year has three children. What's going on? The first bit of
content that follows the opening of the year element isn't
the month element, although it looks that way to the human
eye. Rather, the first bit of content within the year element
is a text node (a number of blank spaces, a newline, and a few more blank
spaces), which precedes the opening of the month
element. There's also a child text node (a simple newline) following the
month element's close, just before the close of the
year element. That is, to an XPath-aware processor, this
document resembles Figure 1.
Figure 1: An XPath processor's-eye view of a document with "invisible" whitespace
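The same node count can be reproduced outside an XSLT engine. The following is a minimal sketch in Python, assuming the standard library's xml.dom.minidom module as a stand-in for a "pure" processor; the variable names are illustrative only and are not part of the original article. It builds the sample document with a newline before and after the month element and counts the children of year two ways.

```python
from xml.dom import minidom

# The sample document from this tip, with a newline before and after <month>.
SOURCE = '<year>\n<month monthnum="4">April</month>\n</year>'

year = minidom.parseString(SOURCE).documentElement

# childNodes holds every child node, whitespace-only text nodes included,
# which is the set that count(/year/node()) counts.
all_children = year.childNodes

# Keep only the element children, ignoring the text nodes.
element_children = [n for n in all_children if n.nodeType == n.ELEMENT_NODE]

print(len(all_children))      # 3: text node, month element, text node
print(len(element_children))  # 1: the month element only
```

In XPath itself, the element-only count corresponds to count(/year/*), since * on the child axis matches element nodes only.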
What do I mean by a "pure" XPath-aware processor? One that reports every node in the document, whitespace-only text nodes included. The one to look out for is the MSXML processor, freely provided by Microsoft both for use as a stand-alone product and embedded in Internet Explorer (MSIE). When you use the preceding XPath expression to view the document in MSIE, via an XSLT transformation, you get the mistaken (albeit, perhaps, more commonsensical) answer: one child.
MSXML includes an XML parser, an XPath-compliant XSLT
engine, and an interface to the outside world (like MSIE). The parser
and the XSLT engine are both excellent, standards-compliant
components. It's the latter which produces the seemingly non-conformant
behavior when dealing with whitespace-only text nodes. This behavior is
controlled by a property, preserveWhiteSpace, with true or
false values. The default is false, which causes MSIE to display the
document incorrectly. In order to change this behavior -- and make MSIE
behave "purely" -- you must use scripting to set
preserveWhiteSpace to true explicitly.
2. Keep an open mind about predicates: nested, "compound," and so on.
The predicate is such a powerful, valuable piece of a location step's real estate that you may be reluctant to try anything beyond the simplest ones. Don't be. The predicate is there to enable you to grab exactly the node(s) you need from among all those candidates visible along a given axis from the context node. Why limit yourself to selecting, say, just the nth child or elements with a particular attribute? Stretch your wings by experimenting with multiple predicates in a given location step or path.
Here's a simple XML document:
<roofers>
<roofing-material>
<manufacturer>Salitieri</manufacturer>
<type>thatch</type>
</roofing-material>
<roofing-material>
<manufacturer>Nash</manufacturer>
<type>shingles</type>
</roofing-material>
<roofing-material>
<manufacturer>DeBrutus</manufacturer>
<type>fiberglass</type>
</roofing-material>
<roofing-material>
<manufacturer>Short</manufacturer>
<type>shingles</type>
</roofing-material>
</roofers>
Suppose you want to locate a node-set consisting of any roofing-material
element whose type is "shingles", but only if the
manufacturer is "Nash". Any of the following three approaches
works (pay special attention to the predicates):
//roofing-material[type="shingles"][manufacturer="Nash"]
//roofing-material[type="shingles" and manufacturer="Nash"]
//roofing-material[type[preceding-sibling::manufacturer="Nash"]="shingles"]
While the results of these three approaches are identical for this document, in other documents they might be quite different. And all three -- including that weird-looking "nested predicate" in the third example -- are perfectly legal XPath 1.0.
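To check that the stacked and compound forms agree on this document, here is a rough sketch in Python using the standard library's xml.etree.ElementTree. Two assumptions apply. ElementTree implements only a subset of XPath 1.0: it accepts the stacked form directly, but its documented subset does not list an "and" operator, so the compound predicate is emulated here with an ordinary Python filter rather than evaluated as XPath. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The roofers document from this tip.
ROOFERS = """<roofers>
<roofing-material>
<manufacturer>Salitieri</manufacturer>
<type>thatch</type>
</roofing-material>
<roofing-material>
<manufacturer>Nash</manufacturer>
<type>shingles</type>
</roofing-material>
<roofing-material>
<manufacturer>DeBrutus</manufacturer>
<type>fiberglass</type>
</roofing-material>
<roofing-material>
<manufacturer>Short</manufacturer>
<type>shingles</type>
</roofing-material>
</roofers>"""

root = ET.fromstring(ROOFERS)

# Stacked predicates, passed to ElementTree's XPath subset as written.
stacked = root.findall('.//roofing-material[type="shingles"][manufacturer="Nash"]')

# Compound predicate, emulated in plain Python (not evaluated as XPath).
compound = [
    m for m in root.iter("roofing-material")
    if m.findtext("type") == "shingles" and m.findtext("manufacturer") == "Nash"
]

print([m.findtext("manufacturer") for m in stacked])  # ['Nash']
print(stacked == compound)                            # True
```

A full XPath 1.0 engine would be needed to run the nested-predicate form as written; the sketch above only covers the first two.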
Beware of one trap when using the approach employed in the first of the three preceding location paths -- I think of it as a "stacked" predicate. The order in which predicates appear on the stack can affect the final result. Each succeeding predicate is evaluated in terms of the narrowed context provided by the preceding one(s), and not just in terms of the general context in which a single predicate would be evaluated. Here's a sample document to illustrate this point.
<tosses>
<toss result="heads"/>
<toss result="heads"/>
<toss result="tails"/>
<toss result="heads"/>
</tosses>
Now consider the following two location paths into this document, each using a stacked predicate:
(//toss)[@result="heads"][3]
(//toss)[3][@result="heads"]
See the difference? The first path locates (a) all toss elements
whose result attribute equals "heads", and then (b) the
third one of those toss elements. Therefore, it selects the
fourth toss element in the document.
The second path selects the third toss
element, and then the stacked predicate applies a further screen,
selecting the third toss element only if its
result attribute has a value of "heads". Because the third
toss element's result attribute is "tails",
this location path returns an empty node-set.
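The order dependence can be traced step by step. The sketch below is written in Python and deliberately does not use an XPath engine: it emulates each predicate as a filter over a list, so that every predicate sees only the nodes that survived the previous one. The helper names by_result and by_position are hypothetical, introduced here for illustration.

```python
import xml.etree.ElementTree as ET

# The tosses document from this tip.
TOSSES = """<tosses>
<toss result="heads"/>
<toss result="heads"/>
<toss result="tails"/>
<toss result="heads"/>
</tosses>"""

# Equivalent of (//toss): every toss element, in document order.
tosses = list(ET.fromstring(TOSSES).iter("toss"))


def by_result(nodes, value):
    """Emulate [@result="value"]: keep nodes whose result attribute matches."""
    return [n for n in nodes if n.get("result") == value]


def by_position(nodes, position):
    """Emulate [n]: keep the node at 1-based position n in the current list."""
    return nodes[position - 1:position]


# (//toss)[@result="heads"][3]: filter to heads first, then take the third.
heads_then_third = by_position(by_result(tosses, "heads"), 3)

# (//toss)[3][@result="heads"]: take the third toss first, then test it.
third_then_heads = by_result(by_position(tosses, 3), "heads")

# Report results as 1-based positions in the original document.
print([tosses.index(n) + 1 for n in heads_then_third])  # [4]
print([tosses.index(n) + 1 for n in third_then_heads])  # []
```

Swapping the two filter calls is all that separates the two results, which mirrors swapping the two predicates in the location path.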

