Implementing XPath for Wireless Devices, Part II
In the first part of this article, we introduced XPath and discussed various XPath queries ranging from simple to complex. By applying XPath queries to sample XML files, we elaborated upon various important definitions of XPath such as location step, context node, location path, axes, and node-test. We then discussed complex XPath queries that combine more than one simple query. We also discussed the abstract structure of Wireless Binary XML (WBXML), which is the wireless counterpart of XML. Finally we presented the design of a simple XPath processing engine.
In this part, we will discuss the features of XPath which allow for complex search operations on an XML file. We will discuss predicates or filtered queries and the use of functions in XPath. We will present various XPath queries for the processing of WSDL and WML. We will also enhance the simple design of our XPath engine to include support for predicates, functions, and different data types.
Filtered Queries and Predicates
Let's start with a simple query that, evaluated with the document root as the context node, returns the root element of any XML file:
./node()
We can take this further with another simple query, which selects all the immediate children of the root node:
./node()/*
What if you want to find all the nodes that are the immediate children of the
root node and have a type attribute? The following query will help:
./node()/*[attribute::type]
This query will return the binding element from Listing 1. This shows that the expression
attribute::type written within square brackets acts as a
filter. Filters in XPath are called predicates and are written inside square
brackets. A predicate acts on a node-set (in this example, all immediate
children of the root node) and applies a filtering condition (here, that the
node must have a type attribute) to that node-set. The
result is a reduced, that is, filtered, node-set.
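To make the filtering step concrete, here is a minimal Python sketch using the standard-library xml.etree.ElementTree module. Listing 1 is not reproduced here, so the WSDL fragment below is a hypothetical stand-in with invented names and no namespaces. ElementTree implements only a subset of XPath, so the query is written in its abbreviated form; because ET.fromstring already returns the root element, the leading ./node() step drops out.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Listing 1 (names invented, namespaces omitted).
WSDL = """
<definitions name="BillingService">
  <message name="BillRequest"><part name="customerId"/></message>
  <message name="TotalBill"><part name="amount"/></message>
  <portType name="BillingPort"/>
  <binding name="BillingBinding" type="BillingPort"/>
</definitions>
"""

root = ET.fromstring(WSDL)  # the root (definitions) element

# The node-set before filtering: all immediate children of the root element.
children = list(root)

# Abbreviated equivalent of ./node()/*[attribute::type]
filtered = root.findall("*[@type]")

# The same predicate written out by hand, to show what the filter does.
by_hand = [child for child in children if "type" in child.attrib]

print([e.tag for e in children])  # ['message', 'message', 'portType', 'binding']
print([e.tag for e in filtered])  # ['binding']
```

The hand-written list comprehension yields the same reduced node-set as the predicate, which is all a predicate is: a test applied to each node of the input set.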
Predicates can range from simple to very complex. Perhaps the simplest form of
XPath predicate is just a number, as shown in the following query, which returns
the second child (a message element) of the root element:
./node()/*[2]
The query ./node()/message[attribute::name="TotalBill"]/text()
looks for the message child of the root element whose
name attribute has the value TotalBill, and returns
all text nodes of that message element. In Listing 1, the predicate
matches the second of the two message elements.
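The two predicate forms just described can be tried with a short Python sketch. The document below is again a hypothetical stand-in for Listing 1, and the sketch stops at the message element rather than descending to its text nodes. The numeric predicate on * is emulated with plain list indexing, an assumption made to keep the example within the XPath subset that ElementTree handles reliably.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Listing 1 (names invented, namespaces omitted).
WSDL = """<definitions name="BillingService">
  <message name="BillRequest"><part name="customerId"/></message>
  <message name="TotalBill"><part name="amount"/></message>
  <binding name="BillingBinding" type="BillingPort"/>
</definitions>"""

root = ET.fromstring(WSDL)

# ./node()/*[2]: a bare number selects by 1-based position,
# so it maps to index 1 in a 0-based Python list.
second_child = list(root)[2 - 1]

# ./node()/message[attribute::name="TotalBill"] in abbreviated form.
matches = root.findall("message[@name='TotalBill']")

print(second_child.tag, second_child.get("name"))  # message TotalBill
print(len(matches))                                # 1
```

In this stand-in document both queries arrive at the same element, because the message named TotalBill happens to be the second child of the root element.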
XPath Functions
Suppose you want to answer the following questions about the WSDL file in Listing 1:
1. What is the value of the name attribute of the last operation element?
2. How many message child elements does the definitions element have?
3. What is the name of the first child element of the root element?
The last() Function
Used in a predicate, the last() function selects the last node in the
node set. The following query, when applied to the WSDL file in Listing 1, will return the second
message element (i.e. the message element whose name
is TotalBill):
./node()/message[last()]
Note that the following query also returns the same message element:
./node()/message[2]
The only difference between the two queries is that we have replaced the
last() function with the number 2. It is correct to conclude that
the last() function in this case is actually returning the number 2
(the number of nodes in the node set of the particular location step). Apply the
same two queries to the WSDL file of Listing 2 (you may use the XPath
Tester application mentioned in the resources) and you will see that this time
the two queries do not return the same result. There are three message elements
in Listing 2, so the
last() function is now returning the number 3.
Notice from this discussion that the last() function always returns
a number.
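The comparison between the two listings can be reproduced with the sketch below. Both documents are hypothetical stand-ins (two message elements for Listing 1, three for Listing 2), and the helper name names is invented for illustration. ElementTree accepts last() and plain integers as position predicates, so the queries are used nearly verbatim.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins: Listing 1 has two messages, Listing 2 has three.
LISTING_1 = """<definitions>
  <message name="BillRequest"/>
  <message name="TotalBill"/>
</definitions>"""

LISTING_2 = """<definitions>
  <message name="BillRequest"/>
  <message name="TotalBill"/>
  <message name="BillFault"/>
</definitions>"""


def names(xml_text, path):
    """Return the name attributes of the elements selected by path."""
    root = ET.fromstring(xml_text)
    return [e.get("name") for e in root.findall(path)]


# Two messages: last() evaluates to 2, so both queries agree.
print(names(LISTING_1, "message[last()]"), names(LISTING_1, "message[2]"))
# ['TotalBill'] ['TotalBill']

# Three messages: last() evaluates to 3, so the results differ.
print(names(LISTING_2, "message[last()]"), names(LISTING_2, "message[2]"))
# ['BillFault'] ['TotalBill']
```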
The position() Function
If you apply the following queries to the WSDL file in Listing 2,
./node()/message[1]/part
./node()/message[2]/part
./node()/message[3]/part
they will return the part children of the first, second, and third
message elements respectively. This shows that each node in the node set has a
proximity position. The proximity position of the first node
is one, that of the second node is two, and so on.
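A quick way to see proximity positions at work is to run the three queries in a loop. The document below is a hypothetical stand-in for Listing 2, with invented part names.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Listing 2 (names invented, namespaces omitted).
LISTING_2 = """<definitions>
  <message name="BillRequest"><part name="customerId"/></message>
  <message name="TotalBill"><part name="amount"/></message>
  <message name="BillFault"><part name="reason"/></message>
</definitions>"""

root = ET.fromstring(LISTING_2)

# message[1]/part, message[2]/part, message[3]/part in turn.
part_names = {}
for n in (1, 2, 3):
    parts = root.findall(f"message[{n}]/part")
    part_names[n] = [p.get("name") for p in parts]

print(part_names)  # {1: ['customerId'], 2: ['amount'], 3: ['reason']}
```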
What if you want to find all the message elements except the
second? You can use the position() function which works on the
proximity position of a context node. The following query will return the first
and third message elements of Listing 2:
./node()/message[position()!=2]
The position() function simply returns the proximity position of
the context node being evaluated. The predicate [position()!=2]
will compare the proximity position with the number 2 and include the context
node in the node-set only if its proximity position is not equal to two.
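One way an XPath engine might evaluate such predicates is to hand each node its proximity position and the size of the node-set, then keep the nodes for which the test holds. The sketch below illustrates that idea and is not the engine design from this article; the function name apply_predicate and the sample document are assumptions. ElementTree is used only to parse the XML and collect the message elements, and the position() test itself is written in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Listing 2 (names invented, namespaces omitted).
LISTING_2 = """<definitions>
  <message name="BillRequest"/>
  <message name="TotalBill"/>
  <message name="BillFault"/>
</definitions>"""


def apply_predicate(node_set, predicate):
    """Keep the nodes for which predicate(node, position, size) is true.

    position is the 1-based proximity position (what position() returns);
    size is the number of nodes in the set (what last() returns).
    """
    size = len(node_set)
    return [
        node
        for position, node in enumerate(node_set, start=1)
        if predicate(node, position, size)
    ]


root = ET.fromstring(LISTING_2)
messages = root.findall("message")

# ./node()/message[position()!=2]
not_second = apply_predicate(messages, lambda node, position, size: position != 2)

# ./node()/message[last()], which is shorthand for [position()=last()]
last_only = apply_predicate(messages, lambda node, position, size: position == size)

print([m.get("name") for m in not_second])  # ['BillRequest', 'BillFault']
print([m.get("name") for m in last_only])   # ['BillFault']
```

Passing both position and size to every predicate lets the same helper express [2], [last()], and [position()!=2] without special cases.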
