XML.com: XML From the Inside Out


Top Ten Tips to Using XPath and XPointer


August 21, 2002

John Simpson is the author of XPath and XPointer

XPath and XPointer allow XML developers and document authors to find and manipulate specific needles of content in an XML document's haystack. From mindful use of predicates, to processor efficiency, to exploring both the standards themselves and extensions to them, this article offers ten tips -- techniques and gotchas -- to bear in mind as you use XPath and XPointer in your own work.

1. Beware of whitespace when counting nodes.

Consider a simple document, such as the following:

      <month monthnum="4">April</month> 

Ask yourself: how many children does the year element have?

If you think the answer is one, the month element, you're wrong. Answering the question with XPath might look something like this:


If you submit this XPath expression to a "pure" XPath-aware processor, such as the one built into the Saxon XSLT engine, you're told that year has three children. What's going on? The first bit of content that follows the opening of the year element isn't the month element, although it looks that way to the human eye. Rather, the first bit of content within the year element is a text node (a number of blank spaces, a newline, and a few more blank spaces), which precedes the opening of the month element. There's also a child text node (a simple newline) following the month element's close, just before the close of the year element. That is, to an XPath-aware processor, this document resembles Figure 1.

Figure 1: An XPath processor's-eye view of a document with "invisible" whitespace
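To see the three children for yourself without an XSLT engine, here is a minimal sketch using Python's standard xml.dom.minidom, which keeps whitespace-only text nodes much as a "pure" processor does. The year wrapper and its indentation are my assumptions about the sample document, and the XPath equivalent of the count would be something like count(/year/node()), which is also an assumption.

```python
from xml.dom import minidom

# Assumed reconstruction of the sample document: a <year> element wrapping
# the <month> element, with a newline and indentation around it.
DOC = '<year>\n   <month monthnum="4">April</month>\n</year>'

year = minidom.parseString(DOC).documentElement


def describe(node):
    """Return a (kind, value) pair for a child node of <year>."""
    if node.nodeType == node.TEXT_NODE:
        return ("text", node.data)
    return ("element", node.tagName)


# Every child node, including the whitespace-only text nodes.
children = [describe(n) for n in year.childNodes]

# Only the element children, which is what the human eye counts.
element_children = [n for n in year.childNodes if n.nodeType == n.ELEMENT_NODE]

for kind, value in children:
    print(kind, repr(value))
print("all children:", len(children))
print("element children:", len(element_children))
```

The gap between the two counts is exactly the pair of "invisible" text nodes on either side of the month element.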

What do I mean by a "pure" XPath-aware processor? I mean one that reports whitespace-only text nodes as described above. The exception to look out for is the MSXML processor, freely provided by Microsoft both for use as a stand-alone product and embedded in Internet Explorer (MSIE). When you use the preceding XPath expressions to view the document in MSIE, via an XSLT transformation, you get the mistaken (albeit, perhaps, more commonsensical) answer: one child.

MSXML includes an XML parser, an XPath-compliant XSLT engine, and an interface to the outside world (like MSIE). The parser and the XSLT engine are both excellent, standards-compliant components. It's the latter which produces the seemingly non-conformant behavior when dealing with whitespace-only text nodes. This behavior is controlled by a property, preserveWhiteSpace, with true or false values. The default is false, which causes MSIE to display the document incorrectly. In order to change this behavior -- and make MSIE behave "purely" -- you must use scripting to set preserveWhiteSpace to true explicitly.

2. Keep an open mind about predicates: nested, "compound," and so on.

The predicate is such a powerful, valuable piece of a location step's real estate that you may be reluctant to try anything beyond the simplest ones. Don't be. The predicate is there to enable you to grab exactly the node(s) you need from among all those candidates visible along a given axis from the context node. Why limit yourself to selecting, say, just the nth child or elements with a particular attribute? Stretch your wings by experimenting with multiple predicates in a given location step or path.

Here's a simple XML document:


Suppose you want to locate a node-set consisting of any roofing-material element whose type is "shingles", but only if the manufacturer is "Nash". Any of the following approaches works (pay special attention to the predicates):

//roofing-material[type="shingles" and manufacturer="Nash"] 

While the results of these three approaches are identical for this document, in other documents they might be quite different. And all three -- including that weird-looking "nested predicate" in the third example -- are perfectly legal XPath 1.0.
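As a rough illustration of two of these styles agreeing on one document, here is a sketch using Python's standard xml.etree.ElementTree. The document below is hypothetical: only the element names roofing-material, type, and manufacturer come from the path above, and the materials wrapper and the values are invented. ElementTree implements only a subset of XPath, with no "and" and no nested predicates, so the compound test is spelled out in plain Python instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the <materials> wrapper and the data are invented.
DOC = """<materials>
  <roofing-material><type>shingles</type><manufacturer>Nash</manufacturer></roofing-material>
  <roofing-material><type>shingles</type><manufacturer>Acme</manufacturer></roofing-material>
  <roofing-material><type>tile</type><manufacturer>Nash</manufacturer></roofing-material>
</materials>"""

root = ET.fromstring(DOC)

# "Stacked" predicates: each bracketed test filters the result of the one before.
stacked = root.findall(".//roofing-material[type='shingles'][manufacturer='Nash']")

# "Compound" predicate, emulated: both conditions are checked in a single test,
# as type="shingles" and manufacturer="Nash" would be in full XPath 1.0.
compound = [
    material
    for material in root.iter("roofing-material")
    if material.findtext("type") == "shingles"
    and material.findtext("manufacturer") == "Nash"
]

print(len(stacked), len(compound))
print(stacked == compound)
```

Neither predicate here depends on position, which is why the stacked and compound forms pick the same element.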

Beware of one trap when using the approach employed in the first of the three preceding location paths -- I think of it as a "stacked" predicate. The order in which predicates appear on the stack can affect the final result. Each succeeding predicate is evaluated in terms of the narrowed context provided by the preceding one(s), and not just in terms of the general context in which a single predicate would be evaluated. Here's a sample document to illustrate this point.

      <toss result="heads"/> 
      <toss result="heads"/> 
      <toss result="tails"/> 
      <toss result="heads"/> 

Now consider the following two location paths into this document, each using a stacked predicate:


See the difference? The first path locates (a) all toss elements whose result attribute equals "heads", and then (b) the third one of those toss elements. Therefore, it selects the fourth toss element in the document.

The second path selects the third toss element, and then the stacked predicate applies a further screen, selecting the third toss element only if its result attribute has a value of "heads". Because the third toss element's result attribute is "tails", this location path returns an empty node-set.
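The renumbering that happens between stacked predicates can be modeled in a few lines of Python. This is only a sketch of the evaluation rule, not an XPath engine: apply_predicates is a hypothetical helper, the tosses root element is assumed, and the two orderings stand in for paths of roughly the form toss[@result="heads"][3] and toss[3][@result="heads"]. I model the rule by hand because the position predicate in Python's own ElementTree does not renumber in this way.

```python
import xml.etree.ElementTree as ET

# Sample document from the text; the <tosses> root element is an assumption.
DOC = """<tosses>
  <toss result="heads"/>
  <toss result="heads"/>
  <toss result="tails"/>
  <toss result="heads"/>
</tosses>"""


def apply_predicates(nodes, predicates):
    """Filter nodes through each predicate in turn, XPath 1.0 style.

    Each predicate sees positions renumbered from 1 within the list of
    nodes that survived the predicates before it.
    """
    for predicate in predicates:
        nodes = [n for pos, n in enumerate(nodes, start=1) if predicate(n, pos)]
    return nodes


def is_heads(node, pos):
    return node.get("result") == "heads"


def is_third(node, pos):
    return pos == 3


tosses = list(ET.fromstring(DOC))  # the four <toss> elements, in document order

heads_then_third = apply_predicates(tosses, [is_heads, is_third])
third_then_heads = apply_predicates(tosses, [is_third, is_heads])

# Report each survivor by its position in the original document.
print([tosses.index(n) + 1 for n in heads_then_third])  # prints [4]
print([tosses.index(n) + 1 for n in third_then_heads])  # prints []
```

Swapping the order of the two predicates is the only difference between the two calls, yet one returns the fourth toss and the other returns nothing.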
