What's New in XPath 2.0
by Evan Lenz
Cardinal rule #4: Sequences may contain duplicates.
Also unlike XPath 1.0 node-sets (and sets in general), sequences may contain duplicates. For example, we can modify our expression above slightly:
(/foo/bar, /foo, /foo/bar)
This sequence consists of the bar element(s), followed
by the foo element(s), followed again by the same
bar element(s). In XPath 1.0, it was impossible to
construct such a collection, because, by definition, node-sets may not
contain the same node more than once.
The rise and fall of the node-set
In XPath 1.0, if you wanted to process a collection of nodes, you had to deal with node-sets. In XPath 2.0, the concept of the node-set has been generalized and extended. As we've seen, sequences may contain simple-typed values as well as nodes. We've also seen that sequences differ from node-sets in that they are ordered and may contain duplicates. The question naturally arises: how can you do away with node-sets without breaking XPath?
How to emulate sets in a sequence-only world
Indeed, XPath 1.0 node-sets were unordered. However, in XPath's most common context, XSLT, the nodes within a node-set are always processed in some order. The default order used to process node-sets was document order (since document order is always defined for all nodes). In XSLT 2.0, the default order used to process a node collection (i.e. a sequence) is not necessarily document order but, rather, the order of the sequence itself. To maintain backward compatibility with XPath 1.0, path expressions (and other 1.0 expressions such as union expressions) are defined to always return their results in document order. Specifically, whenever "/" is the top-level operator of an expression, you can expect the result to be in document order. In addition, duplicates are automatically removed from the result. XPath 2.0 is thus able to emulate node-sets in a sequence-only world.
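The emulation described above can be modeled outside XPath. The following is only a rough Python sketch, assuming a small made-up document and a hypothetical helper named as_node_set; it is not how any XPath processor is actually implemented. It treats a sequence as an ordered list that may repeat nodes, and shows what sorting into document order and removing duplicates by node identity does to it.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
doc = ET.fromstring("<foo><bar>1</bar><bar>2</bar></foo>")

# Document order: the index of each element in a pre-order walk of the tree.
position = {id(node): i for i, node in enumerate(doc.iter())}

def as_node_set(sequence):
    """Sort nodes into document order and drop repeats (compared by identity)."""
    seen = set()
    result = []
    for node in sorted(sequence, key=lambda n: position[id(n)]):
        if id(node) not in seen:
            seen.add(id(node))
            result.append(node)
    return result

bars = doc.findall("bar")
# Models the sequence (/foo/bar, /foo, /foo/bar): ordered, with duplicates.
sequence = bars + [doc] + bars

node_set = as_node_set(sequence)
print([n.tag for n in sequence])  # the raw sequence, duplicates included
print([n.tag for n in node_set])  # document order, each node once
```

The raw list keeps five entries in the order they were written, while the node-set view keeps three, starting with the foo element because it comes first in the document.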
If you didn't follow all of that, don't worry. You may not have even realized before now that XPath 1.0 node-sets were unordered. It's mostly for the benefit of specification writers, who like to reassure themselves that everything is consistent and well-defined. Just rest assured that sequences are in fact ordered and path expressions pretty much behave the way they used to.
Some good keywords to learn
In addition to introducing many new datatypes and functions, XPath 2.0 introduces a number of new keyword-based operators, some of which we'll look at below.
Operations on sequences
Perhaps the most powerful new operator in XPath 2.0 for processing
sequences is the for expression. It enables iteration
over sequences, returning a new value for each member in the argument
sequence. This is similar to what can be done with
xsl:for-each, but it is different in that it is an actual
expression that returns a sequence which can, in turn, be processed as
such.
Consider the following example, which returns a sequence of simple-typed values, one for each item in a purchase order, each giving the total cost of that item.
for $x in /order/item return $x/price * $x/quantity
We could then get the total cost of the order by using the
sum() function.
sum(for $x in /order/item return $x/price * $x/quantity)
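For readers who think more easily in a general-purpose language, the two expressions above map closely onto a list comprehension and a sum. This is a minimal Python sketch, assuming a made-up purchase order and treating price and quantity as floating-point numbers; the element names follow the article, and the sample values are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order; the values are invented for illustration.
order = ET.fromstring(
    "<order>"
    "<item><price>2.50</price><quantity>4</quantity></item>"
    "<item><price>10.00</price><quantity>1</quantity></item>"
    "</order>"
)

# for $x in /order/item return $x/price * $x/quantity
line_totals = [
    float(item.findtext("price")) * float(item.findtext("quantity"))
    for item in order.findall("item")
]

# sum(for $x in /order/item return $x/price * $x/quantity)
order_total = sum(line_totals)

print(line_totals, order_total)
```

As in the XPath version, the intermediate result is an ordinary sequence of values, so it can be passed straight to another function without building a temporary tree.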
Cases such as these are much easier to solve using sequences in
XPath 2.0 than they were in XSLT/XPath 1.0. Without sequences, this
problem is much harder to solve and usually involves constructing a
temporary "result tree fragment" and then using the
node-set() extension function.
Conditional expressions
Among the more powerful (and oft-requested) constructs added to XPath 2.0 is the conditional expression. Here's an example that's included in the XPath 2.0 working draft.
if ($widget1/unit-cost < $widget2/unit-cost) then $widget1 else $widget2
Quantifiers
The XPath 1.0 equals operator (=) was one of the more
powerful aspects of the language. It was powerful because it could
compare node-sets. Consider the following expression.
/students/student/name = "Fred"
In XPath 1.0, this expression returns true if any student name is equal to "Fred". This might be called existential quantification because it tests for the existence of a member satisfying some condition. XPath 2.0 preserves this functionality but also provides a more explicit way of testing:
some $x in /students/student/name satisfies $x = "Fred"
This formulation is more powerful because you can replace
$x = "Fred" with any condition you want, not just
equality comparisons. Also, XPath 1.0 does not provide a direct way for
testing to see if every student is named "Fred". XPath 2.0
introduces an explicit way to do universal quantification, using a
similar syntax to the above.
every $x in /students/student/name satisfies $x = "Fred"
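The two quantifiers behave much like the any and all built-ins found in many languages. Here is a small Python sketch, assuming an invented student list, that mirrors both expressions. It also shows the edge case worth remembering: over an empty sequence, "some" is false and "every" is true.

```python
import xml.etree.ElementTree as ET

# Hypothetical data, invented for illustration.
students = ET.fromstring(
    "<students>"
    "<student><name>Fred</name></student>"
    "<student><name>Wilma</name></student>"
    "</students>"
)
names = [n.text for n in students.findall("student/name")]

# some $x in /students/student/name satisfies $x = "Fred"
some_fred = any(x == "Fred" for x in names)

# every $x in /students/student/name satisfies $x = "Fred"
every_fred = all(x == "Fred" for x in names)

# Over an empty sequence, "some" is false and "every" is true.
some_empty = any(x == "Fred" for x in [])
every_empty = all(x == "Fred" for x in [])

print(some_fred, every_fred, some_empty, every_empty)
```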
Intersections, differences, unions
In XPath 1.0, the only real set operator was the union operator
(|). This meant that it was very awkward to determine
whether a given node was in a given node-set. For example, to
determine whether the node $x is included in the
/foo/bar node-set, we'd have to write something like
/foo/bar[generate-id(.)=generate-id($x)]
or like
count(/foo/bar)=count(/foo/bar | $x)
XPath 2.0's introduction of the intersect operator
alleviates some of the pain. Instead of going through the above
gyrations, we can simply write
$x intersect /foo/bar
XPath 2.0 also introduces the except operator, which
can be very handy when we need to select all of a given node-set,
except for certain nodes. In XPath 1.0, if we wanted to, for example,
select all attributes except for the one with a given
namespace-qualified name, we'd have to write
@*[not(namespace-uri()='http://example.com' and local-name()='foo')]
or
@*[not(generate-id(.)=generate-id(../@exc:foo))]
Once again, XPath 2.0 comes to our rescue with the following pleasant alternative:
@* except @exc:foo
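Both intersect and except compare nodes by identity, not by value, and return their results in document order without duplicates. The Python sketch below models that behavior under some assumptions: the document is made up, the helper names intersect, except_ and doc_order are hypothetical, and elements stand in for attributes because ElementTree does not expose attributes as node objects.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for illustration.
foo = ET.fromstring("<foo><bar/><bar/><baz/></foo>")

# Document order: the index of each element in a pre-order walk.
position = {id(n): i for i, n in enumerate(foo.iter())}

def doc_order(nodes):
    """Return the nodes sorted into document order."""
    return sorted(nodes, key=lambda n: position[id(n)])

def intersect(a, b):
    """Nodes that occur in both a and b, compared by identity."""
    b_ids = {id(n) for n in b}
    kept = {id(n): n for n in a if id(n) in b_ids}
    return doc_order(kept.values())

def except_(a, b):
    """Nodes that occur in a but not in b, compared by identity."""
    b_ids = {id(n) for n in b}
    kept = {id(n): n for n in a if id(n) not in b_ids}
    return doc_order(kept.values())

bars = foo.findall("bar")
x = [bars[1]]                # plays the role of $x
baz = foo.findall("baz")

# $x intersect /foo/bar : non-empty exactly when $x is one of the bar nodes
is_member = bool(intersect(x, bars))
not_member = bool(intersect(baz, bars))

# Analogous to "@* except @exc:foo": every child of foo except $x
rest = except_(list(foo), x)

print(is_member, not_member, [n.tag for n in rest])
```

Note that the two bar elements are empty and look identical, yet only the one that actually is $x is excluded, which is the point of comparing by identity.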
Worrying about data types
If you take a peek at the XPath 2.0 spec, you'll see that I've left
out a lot of keywords, including things like cast,
treat, assert, and instance
of. These are important parts of the language, but their
importance partially depends on which context you're using XPath 2.0
in. If you will be using XPath in the context of XSLT 2.0, you may not
need to use these every day. You certainly will want to use them in
certain cases (for example, when casting a string to a date), but you
won't be required to use them. In the context of XQuery 1.0, however,
you may need to become intimately familiar with them.
The reason is that XQuery 1.0 is designed to be a statically typed language. Query analysis and optimization are aided by knowing, before the query is ever executed, what datatypes its expressions will return. This is only possible if the user explicitly specifies what type each of her expressions is to return. The other advantage of this approach is that errors can be caught early, thereby helping to enforce the correctness of queries.
There is certainly a tradeoff between usability and type safety. To serve the needs of both communities (sometimes artificially divided into the document-oriented and data-oriented worlds), XPath 2.0 provides a means by which the context can decide where it would like to stand in this tradeoff. Effectively, XPath 2.0 can be parameterized by its context. This may sound like a recipe for non-interoperability. However, it is important to identify the guiding principle behind the approach that has been taken. The principle is that any XPath 2.0 expression that returns a result rather than an error in one context will return that same result in any other context where it does not produce an error. Thus, while an expression may produce an error in one context and not in another, it will never produce two different results. In other words, you always get either a right answer or an error. There is never more than one right answer.
The intended upshot for XSLT users is that they won't have to worry about most of this stuff, most of the time. A given XPath 2.0 expression may throw an "exception" in the XQuery context, but the same expression results in a silently invoked fallback conversion when in the context of XSLT.
Conclusion
It will likely become clear that XPath 2.0 represents a very significant upgrade to XPath 1.0. Its growth has been driven both by the demands of the XPath 1.0 user community and by the requirements for XQuery 1.0. Even if you don't agree with the entire outcome, it's hard to deny that it represents a remarkable collaboration. With any luck, it will also represent a very powerful, standard tool for several user communities.
Comment on this Article
- string-to-codepoints
2005-07-31 04:05:34 wshubeir
<xsl:value-of select="string-to-codepoints('A')"/>
gives me the following error:
'string-to-codepoints' is not a valid XSLT or XPath function.
Please advise.
Is it a matter of upgrading to XPath 2.0? If so, what is the procedure for this upgrade?
thanks
- Data Mapping with XPath
2003-08-04 07:19:59 Faisal Azhar
Hi
I want to map data with an XML schema and then generate an XML file based on that schema and data. Can XPath help me with this problem?
Thanks
Faisal.
- XPath 1.0 *can* do universals, no?
2003-06-19 13:47:05 Lars Huttar
I don't understand the statement that
"Also, XPath 1.0 does not provide a way for testing to see if every student is named "Fred"."
What about
not(/students/student/name != "Fred")
?
Or do you mean that there is no way to do this for arbitrary conditions? It seems to me that if you want to do
every $x in /students/student/name satisfies p($x)
(where p is some condition on $x)
then in XPath 1.0 you could do
not(/students/student/name[not(p(.))])
I'm not complaining about the new syntax -- it's fine with me, and certainly clearer than the last expression. I'm just questioning the statement about XPath 1.0, because I think I might have missed something.
Lars
- Why losing time in XPath 2.0?
2002-12-31 06:54:17 Sandro camillo
I do not see the necessity of W3C work around XPath 2.0.
For me, we have only two languages in this scope: XPath 1.0 and
the new XQuery 1.0 (December WD).
In my point of view the mix of XPath 2.0/XQuery 1.0 is not useful to the understanding and creation of clear standards.
Perhaps the work around XPath 2.0 is correct, but the name "XPath 2.0" for it could be changed to something like "common source of XQuery/XSLT".
Does nobody have ideas in this direction?
P.S. I beg your pardon for my poor English.
- Next article?
2002-07-14 07:10:06 Dave Pawson
Nice one Evan.
How about an article on grouping using XSLT 2?
- Is this better?
2002-04-15 12:59:58 Ian Ornstein
XML is fine. I have used it in production.
XSLT or more correctly XPATH should have been thrown out like a first pancake. The language is not easily read. Each developer that picks up some XSLT written by someone other than himself will be guessing at what it means. Why is this?
It is because the language was designed using what I call Morse Code - dots, dashes, slashes and dollar signs. Whatever happened to computer languages with words? I suppose there are some programmers who...enjoy regular expressions. Then there are those of us who have managed to avoid them. I suppose that's why there are different flavors of ice cream. But for a standard for all of us, I really wish that the language used words so that the intent of the code was more apparent.
- Is this better?
2003-06-19 13:54:42 Lars Huttar
Ian, it looks to me like XPath 2.0 is quite a bit better in this respect than 1.0.
Most of the added keywords are words, not punctuation. And they're used in actual sentences! (gasp)
Not sure how non-native-English speakers feel about this, but I would think people of your persuasion would be pleased.
Lars
- XPath and XSLT 2.0 support
2002-04-11 20:08:21 Paul Strand
Understandably, the W3C isn't done with their working draft, so I can't expect to start writing "code" based off of it right now. However, books have been written and enhancements made based off of the aborted 1.1 spec, so I'm naturally curious when I can get my hands on the new functionality (most notably, a replacement to using extension functions for node-set).
With that at the heart of my earnestness, my questions are understandably erratic--I want my new toy. Is there an ETA on 2.0? What is it? In the meantime, what support exists in "beta" form? Where is Xalan in all of this? Are they getting prepped (so I can take the plunge a month or two later), or will they start once there is something formal to work from?
In unbridled anticipation,
Paul
- XPath and XSLT 2.0 support
2002-04-12 02:01:34 Anthony Coates
Try Saxon 7.0 from
http://saxon.sourceforge.net/
Cheers,
Tony.
- getting attribute names of an XML element
2002-04-07 16:09:41 Morgan Nagarajan
Are there any provisions to display the attribute names of an XML element? It's easy to get an attribute value, but I could not find any easy way at all to get the names of the attributes (note that I don't know beforehand which attribute names exist in an XML element).
For example, consider an XML element,
<CurrentStatus ServiceHealth="Available"/>
I wanna print the attribute name (i.e. ServiceHealth) by means of XPath in an XSLT stylesheet.
any help?
thanks,
-Morgan
- getting attribute names of an XML element
2002-04-16 10:11:23 Bernhard Zwischenbrugger
Try this:
<xsl:for-each select="/CurrentStatus/@*">
<xsl:value-of select="name()"/>
</xsl:for-each>
Hope it helps
Bernhard Zwischenbrugger
http://datenkueche.com
