XML.com: XML From the Inside Out

Implementing XPath for Wireless Devices, Part II

July 17, 2002

In the first part of this article, we introduced XPath and discussed various XPath queries ranging from simple to complex. By applying XPath queries to sample XML files, we elaborated upon various important definitions of XPath such as location step, context node, location path, axes, and node-test. We then discussed complex XPath queries that combine more than one simple query. We also discussed the abstract structure of Wireless Binary XML (WBXML), which is the wireless counterpart of XML. Finally we presented the design of a simple XPath processing engine.

In this part, we will discuss the features of XPath which allow for complex search operations on an XML file. We will discuss predicates or filtered queries and the use of functions in XPath. We will present various XPath queries for the processing of WSDL and WML. We will also enhance the simple design of our XPath engine to include support for predicates, functions, and different data types.

Filtered Queries and Predicates

Let's start with a simple query which will return the root node of any XML file:

./node()

We can take this further with another simple query, which selects all the immediate children of the root node:

./node()/*

What if you want to find all the nodes that are the immediate children of the root node and have a type attribute? The following query will help:

./node()/*[attribute::type]

This query will return the binding element from Listing 1. This shows that the expression attribute::type written within square brackets acts as a filter. Filters in XPath are called predicates and are written inside square brackets. A predicate acts on a node-set (in this example, all immediate children of the root node) and applies a filtering condition to it (here, that the node must have a type attribute). The result is a reduced, that is, filtered, node-set.
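As a rough illustration, the same kind of filter can be tried with Python's standard xml.etree.ElementTree module. The document below is a made-up, namespace-free stand-in for Listing 1, and its element and attribute values are assumptions. ElementTree implements only a subset of XPath, so the predicate is written in the abbreviated form [@type] rather than [attribute::type]. Because findall() is called on the root element itself, ./* here plays the role of ./node()/* in the query above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for Listing 1 (no namespaces).
SAMPLE = """
<definitions name="BillingService">
  <message name="MonthlyBill"><part name="amount"/></message>
  <message name="TotalBill"><part name="total"/></message>
  <portType name="BillingPortType"/>
  <binding name="BillingBinding" type="BillingPortType"/>
</definitions>
"""

root = ET.fromstring(SAMPLE)

# Node-set before filtering: every immediate child of the root element.
children = root.findall("./*")

# Node-set after filtering: only the children that carry a type attribute.
with_type = root.findall("./*[@type]")

child_tags = [child.tag for child in children]
filtered_tags = [child.tag for child in with_type]
print(child_tags)
print(filtered_tags)
```

The predicate does not change how the children are found; it only removes the nodes that fail the condition from the node-set that the location step has already produced.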

Predicates can range from simple to very complex. Perhaps the simplest form of XPath predicate is just a number, as shown in the following query, which returns the second child (a message element) of the root element:

./node()/*[2]

The query ./node()/message[attribute::name="TotalBill"]/text() looks for the message child of the root element whose name attribute has the value TotalBill, and returns all text nodes of that message element. In Listing 1, the matching element is the second of the two message elements.
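A minimal sketch of an attribute-value predicate, again using Python's xml.etree.ElementTree on a hypothetical stand-in for Listing 1 (the element contents are assumptions). ElementTree's XPath subset has no text() node test, so the sketch stops at selecting the message element and checks which of the message siblings was matched.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for Listing 1 (no namespaces).
SAMPLE = """
<definitions name="BillingService">
  <message name="MonthlyBill"><part name="amount"/></message>
  <message name="TotalBill"><part name="total"/></message>
  <portType name="BillingPortType"/>
  <binding name="BillingBinding" type="BillingPortType"/>
</definitions>
"""

root = ET.fromstring(SAMPLE)

# Keep only the message children whose name attribute equals TotalBill.
matches = root.findall("./message[@name='TotalBill']")
names = [m.get("name") for m in matches]

# Work out where the match sits among all message children (1-based).
all_messages = root.findall("./message")
position = all_messages.index(matches[0]) + 1

print(names, position)
```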

XPath Functions

Suppose you want to answer the following questions about the WSDL file in Listing 1:

1. What is the value of the name attribute of the last operation element?
2. How many message child elements does the definitions element have?
3. What is the name of the first child element of the root element?

The last() Function

Used as a predicate, the last() function selects the last node in the node-set. The following query, when applied to the WSDL file in Listing 1, will return the second message element (i.e., the message element whose name is TotalBill):

./node()/message[last()]

Note that the following query also returns the same message element:

./node()/message[2]

The only difference between the two queries is that we have replaced the last() function with the number 2. It is correct to conclude that the last() function in this case is actually returning the number 2 (the number of nodes in the node-set of that particular location step). Apply the same two queries to the WSDL file of Listing 2 (you may use the XPath Tester application mentioned in the resources) and you will see that this time the two queries do not return the same result. There are three message elements in Listing 2, so the last() function is now returning the number 3.

Notice from this discussion that the last() function always returns a number.
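The comparison can be reproduced in a small sketch with Python's xml.etree.ElementTree, whose XPath subset supports both [last()] and numeric predicates. The two documents are built in code as hypothetical stand-ins for Listing 1 (two message elements) and Listing 2 (three message elements). The helper functions and the name YearlyBill are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def make_doc(message_names):
    """Build a hypothetical definitions element with one message per name."""
    root = ET.Element("definitions")
    for message_name in message_names:
        ET.SubElement(root, "message", {"name": message_name})
    return root

def names_for(root, query):
    """Run a query from the root element and list the matched name attributes."""
    return [m.get("name") for m in root.findall(query)]

two = make_doc(["MonthlyBill", "TotalBill"])                   # like Listing 1
three = make_doc(["MonthlyBill", "TotalBill", "YearlyBill"])   # like Listing 2

# With two message elements, last() behaves like the number 2.
two_last = names_for(two, "./message[last()]")
two_second = names_for(two, "./message[2]")

# With three message elements, last() behaves like the number 3.
three_last = names_for(three, "./message[last()]")
three_second = names_for(three, "./message[2]")

print(two_last, two_second)
print(three_last, three_second)
```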

The position() Function

If you apply the following queries to the WSDL file in Listing 2,

./node()/message[1]/part
./node()/message[2]/part
./node()/message[3]/part

they will return the part children of the first, second, and third message elements respectively. This shows that each node in the node-set has a proximity position. The proximity position of the first node is one, that of the second node is two, and so on.

What if you want to find all the message elements except the second? You can use the position() function which works on the proximity position of a context node. The following query will return the first and third message elements of Listing 2:

./node()/message[position()!=2]

The position() function simply returns the proximity position of the context node being evaluated. The predicate [position()!=2] compares that proximity position with the number 2 and includes the context node in the resulting node-set only if its proximity position is not equal to two.
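One way to picture how an engine might evaluate such a predicate is sketched below in Python. This is not the engine design from this article: apply_predicate, the Predicate type, and the message names are hypothetical, and nodes are represented by plain strings to keep the sketch short. Each node is tested together with its proximity position and the size of the node-set, which are the values position() and last() would return. In XPath 1.0 a bare number such as [2] is shorthand for [position()=2], so numeric predicates fit the same scheme.

```python
from typing import Callable, List, Sequence

# A predicate receives the node, its 1-based proximity position, and the
# size of the node-set (the values of position() and last()).
Predicate = Callable[[str, int, int], bool]

def apply_predicate(node_set: Sequence[str], predicate: Predicate) -> List[str]:
    """Return the nodes for which the predicate holds, in their original order."""
    size = len(node_set)
    return [
        node
        for position, node in enumerate(node_set, start=1)
        if predicate(node, position, size)
    ]

# Hypothetical names standing in for the three message elements of Listing 2.
messages = ["MonthlyBill", "TotalBill", "YearlyBill"]

# [position()!=2]: keep every node except the second.
not_second = apply_predicate(messages, lambda node, position, last: position != 2)

# [last()], i.e. [position()=last()]: keep only the final node.
only_last = apply_predicate(messages, lambda node, position, last: position == last)

# [2], i.e. [position()=2]: keep only the second node.
second = apply_predicate(messages, lambda node, position, last: position == 2)

print(not_second, only_last, second)
```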
