XML.com: XML From the Inside Out

Top Ten Tips to Using XPath and XPointer
by John E. Simpson | Pages: 1, 2, 3, 4

3. The string-value of a node is just a special case of the string-value of a node-set.

Consider another simple document,

   <quotation> 
      <source> 
       <author>Firesign Theatre</author> 
       <work year="1970">
        Don't Crush that Dwarf, Hand Me The Pliers
       </work> 
      </source> 
      <text>
       And there's hamburger all over the highway in Mystic, Connecticut.
      </text> 
   </quotation>

Ignoring the "invisible" whitespace (as discussed in tip #1), what's the string-value of the quotation element? In XPath terms, we're seeking the string-value of the element node identified by this location path:

   /quotation

Like the string-value of any other element, it's the concatenated values of all the text nodes descended from the element, taken in document order -- that is:

Firesign TheatreDon't Crush that Dwarf, Hand Me The PliersAnd there's hamburger all over the highway in Mystic, Connecticut.

That old devil, common sense, might lead you to conclude that the following XPath location path has the exact same string-value:

   /quotation/*

Run this path through an XPath processor, though, and what you get is simply

Firesign TheatreDon't Crush that Dwarf, Hand Me The Pliers

The behavior here is summed up in the rule of thumb, "The string-value of any node-set is the string-value of only the first node in the set," where "first" means first in document order. The second location path returns a node-set consisting of two nodes, the source and text elements. The first of these, source, is the only one whose string-value counts as the string-value of the entire node-set.

So what about the first case? Does that imply an exception to the general rule of thumb, an exception for root elements? No. The first location path selects a node-set which just happens to consist of a single node, and that node, of course, is thus the "first" node in the node-set. Both location paths obey the general rule.
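The rule can be sketched in a few lines of Python. This is only an emulation using the standard library's ElementTree, not a real XPath processor: the helper names `string_value` and `node_set_string_value` are made up for illustration, and stripping each text chunk stands in for "ignoring the invisible whitespace."

```python
import xml.etree.ElementTree as ET

DOC = """<quotation>
  <source>
    <author>Firesign Theatre</author>
    <work year="1970">
      Don't Crush that Dwarf, Hand Me The Pliers
    </work>
  </source>
  <text>
    And there's hamburger all over the highway in Mystic, Connecticut.
  </text>
</quotation>"""


def string_value(element):
    """Concatenate all descendant text in document order.

    Assumption: leading/trailing whitespace of each chunk is dropped,
    mirroring the article's decision to ignore "invisible" whitespace.
    """
    return "".join(chunk.strip() for chunk in element.itertext())


def node_set_string_value(nodes):
    """String-value of a node-set: that of its first node, or "" if empty."""
    return string_value(nodes[0]) if nodes else ""


root = ET.fromstring(DOC)

# /quotation selects a node-set holding one node, the root element.
whole = node_set_string_value([root])

# /quotation/* selects two nodes (source, text); only source counts.
children = node_set_string_value(list(root))

print(whole)
print(children)
```

Both calls go through the same function, which is the point: the single-node case needs no special handling.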

4. Remember the difference between value1 != value2 and not(value1 = value2).

This one can be very confounding if you're not careful. Take a look at these two sample location paths:

   //employee[@id != "emp1002"] 
   //employee[not(@id = "emp1002")]

The first example selects each employee element node which has an id attribute whose value does not equal "emp1002" -- note that this excludes those with no id attribute at all. The second selects all employee element nodes which do not have an id attribute whose value is "emp1002". So assume, then, a document with an employee element such as this:

   <employee>...</employee>

The first location path will not select this employee element: since it has no id attribute at all, it does not have an id attribute whose value is not emp1002 (or anything else, for that matter). The second, on the other hand, will select this employee element: it has no id attribute whose value is emp1002.
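To see the difference concretely, the two predicates can be emulated in plain Python. This is a sketch of the semantics only, not output from an XPath engine; the staff wrapper element and the A/B/C contents are invented sample data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: two employees with ids, one without.
DOC = """<staff>
  <employee id="emp1001">A</employee>
  <employee id="emp1002">B</employee>
  <employee>C</employee>
</staff>"""

root = ET.fromstring(DOC)
employees = root.findall(".//employee")


def has_id_not_equal(element, value):
    # Emulates [@id != value]: an id attribute must exist AND differ.
    return "id" in element.attrib and element.attrib["id"] != value


def has_id_equal(element, value):
    # Emulates [@id = value]: an id attribute must exist AND match.
    return "id" in element.attrib and element.attrib["id"] == value


# //employee[@id != "emp1002"]
not_equal = [e.text for e in employees if has_id_not_equal(e, "emp1002")]

# //employee[not(@id = "emp1002")]
negated = [e.text for e in employees if not has_id_equal(e, "emp1002")]

print(not_equal)  # ['A']
print(negated)    # ['A', 'C']
```

The employee with no id attribute (C) fails both comparisons, so it is dropped by the first predicate but kept once the failed equality test is negated.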

5. Find and use a good tool for testing your XPath expressions.

After a while, you may become so self-assured when slinging XPath that you never need to test the results: you'll instinctively know the effect of one expression versus slightly different ones. Still, it's hard to imagine you'll be prepared for absolutely every eventuality, every nuance in a given document's node tree, every wrinkle in the XPath spec. At such a time, you absolutely must have a good tool to reassure you that you're on the right, well, path.

One common class of XPath testing tools, naturally, consists of all the production-grade XSLT processors: Saxon, Xalan, MSXML, and so on. In order to interpret and act on the template rules and other code in XSLT stylesheets, these processors must first "know" XPath. If you've coded your location paths properly, the transformation cranked through the processor will work properly; if you haven't, it won't -- properly or maybe at all.

This is a rather indirect kind of confirmation, though. Wouldn't it be more useful to have a tool which, say, "lit up" the selected node(s) for a given XPath expression, especially if you're using XPath in a non-XSLT context like XPointer or XQuery?

My own work takes place mostly on Microsoft Windows platforms. Even if yours doesn't, you probably have access to a Windows PC. If you do, you're in luck: there's a great tool written by Dimitre Novatchev, "XPath Visualiser". You can obtain it from the Top XML (formerly VBXML) site.

XPath Visualiser isn't a standalone software package. Instead, it's a frameset HTML document which you open in MSIE. The top frame includes some form elements, such as fields for entering the name of the document you want to view and the XPath expression you want to test, and a "Select Nodes" button to highlight all nodes selected by the XPath expression you've entered. The default expression, as shown in Figure 2, selects all element nodes in the document.

Figure 2: XPath Visualiser (default XPath expression).
