Implementing XPath for Wireless Devices, Part II
In the first part of this article, we introduced XPath and discussed various XPath queries ranging from simple to complex. By applying XPath queries to sample XML files, we elaborated upon various important definitions of XPath such as location step, context node, location path, axes, and node-test. We then discussed complex XPath queries that combine more than one simple query. We also discussed the abstract structure of Wireless Binary XML (WBXML), which is the wireless counterpart of XML. Finally, we presented the design of a simple XPath processing engine.
In this part, we will discuss the features of XPath which allow for complex search operations on an XML file. We will discuss predicates or filtered queries and the use of functions in XPath. We will present various XPath queries for the processing of WSDL and WML. We will also enhance the simple design of our XPath engine to include support for predicates, functions, and different data types.
Let's start with a simple query which will return the root node of any XML file:
We can take this further with another simple query, which selects all the immediate children of the root node:
What if you want to find all the nodes that are the immediate children of the root node and have a type attribute? The following query will help:
This query will return the binding element from Listing 1. This shows that the code attribute::type written within square brackets acts as a filter. Filters in XPath are called predicates and are written inside square brackets. A predicate acts on a node-set (in this example, all immediate children of the root node) and applies a filtering condition (here, that the node must have a type attribute) to it. The result is a reduced, that is, filtered, node-set.
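The filtering behaviour described above can be tried with Python's standard xml.etree.ElementTree module. This is only a minimal sketch: the WSDL string is a hypothetical, simplified stand-in for Listing 1 (namespaces omitted), and ElementTree accepts only the abbreviated form @type rather than attribute::type.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for Listing 1 (not the article's listing).
WSDL = """\
<definitions name="BillingService">
  <message name="MonthlyBill"><part name="month"/></message>
  <message name="TotalBill"><part name="total"/></message>
  <portType name="BillingPortType"/>
  <binding name="BillingBinding" type="BillingPortType"/>
  <service name="BillingService">
    <port name="BillingPort" binding="BillingBinding"/>
  </service>
</definitions>
"""

root = ET.fromstring(WSDL)

# "*" selects every immediate child of the root (the unfiltered node-set).
all_children = root.findall("*")

# "[@type]" is the predicate: keep only children that have a type attribute.
filtered = root.findall("*[@type]")

print([child.tag for child in all_children])
print([child.tag for child in filtered])
```

With this stand-in document the predicate reduces five children to the single binding element.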
Predicates can range from simple to very complex. Perhaps the simplest form of XPath predicate is just a number, as shown in the following query, which returns the second child (a message element) of the root element:
will look for a particular message child of the root element whose name attribute has the value TotalBill. The query will return all text nodes of that message element. This query will return the second of the two message elements of Listing 1.
Suppose you want to answer the following questions about the WSDL file in Listing 1:
1. What is the value of the name attribute of the last operation element?
2. How many message child elements does the definitions element have?
3. What is the name of the first child element of the root element?
The last() function will always point to the last node in the node set. The following query, when applied to the WSDL file in Listing 1, will return the second message element (i.e. the message element whose name is TotalBill):
Note that the following query also returns the same message element:
The only difference between the two queries is that we have replaced the last() function with the number two (2). It is correct to conclude that the last() function in this case is actually returning the number 2 (the number of nodes in the node set of the particular location step). Apply the same two queries to the WSDL file of Listing 2 (you may use the XPath Tester application mentioned in the resources) and you will see that this time the two queries do not return the same result. There are three message elements in Listing 2, so the last() function is now returning the number 3.
Notice from this discussion that the last() function always returns the number of nodes in the node set of its location step.
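The comparison between last() and a fixed number can be reproduced with a short sketch. The two documents below are hypothetical stand-ins for Listing 1 (two message elements) and Listing 2 (three message elements), and the helper names() is introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins: two message children versus three.
TWO = "<definitions><message name='A'/><message name='B'/></definitions>"
THREE = ("<definitions><message name='A'/><message name='B'/>"
         "<message name='C'/></definitions>")


def names(xml_text, path):
    """Return the name attributes of the elements selected by path."""
    root = ET.fromstring(xml_text)
    return [element.get("name") for element in root.findall(path)]


# With two messages, last() and 2 select the same element.
print(names(TWO, "message[last()]"), names(TWO, "message[2]"))

# With three messages, last() now means 3, so the results differ.
print(names(THREE, "message[last()]"), names(THREE, "message[2]"))
```

The fixed index keeps pointing at the second message, while last() follows the size of the node set.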
If you apply the following queries to the WSDL file in Listing 2, they will return the part children of the first, second, and third message elements respectively. This shows that each node in the node set has a proximity position. The proximity position of the first node is one, that of the second node is two, and so on.
What if you want to find all the message elements except the second? You can use the position() function, which works on the proximity position of a context node. The following query will return the first and third message elements of Listing 2:
The position() function simply returns the proximity position of the context node being evaluated. The predicate will compare the proximity position with the number 2 and include the context node in the node-set only if its proximity position is not equal to two.
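ElementTree's XPath subset does not implement position(), so the sketch below emulates the predicate in plain Python: enumerate() supplies the 1-based proximity position, and the comparison with 2 is applied by hand. The document is a hypothetical stand-in for Listing 2.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Listing 2: three message children.
THREE = ("<definitions><message name='A'/><message name='B'/>"
         "<message name='C'/></definitions>")

root = ET.fromstring(THREE)
messages = root.findall("message")

# Emulate a "position is not 2" predicate: pair each node with its
# proximity position (starting at 1) and keep those whose position != 2.
kept = [node for position, node in enumerate(messages, start=1)
        if position != 2]

print([node.get("name") for node in kept])
```

Only the first and third message elements survive the filter.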
How many message children does the portType element in Listing 1 have? Count them and you will find two message elements. Specifying a "how many" question in XPath is a two-step procedure. First, write an XPath query that finds all the elements you wish to count. Then pass that query to the count() function, as shown below:
The count() function calculates and returns the number of nodes in the resulting node-set of the XPath query.
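The two-step procedure (select, then count) can be sketched as follows. ElementTree has no count() function, so the hypothetical helper count() below simply measures the length of the selected node list. The document is a simplified stand-in for Listing 1, and the example counts the message children of the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for Listing 1.
WSDL = """\
<definitions name="BillingService">
  <message name="MonthlyBill"><part name="month"/></message>
  <message name="TotalBill"><part name="total"/></message>
  <service name="BillingService">
    <port name="BillingPort" binding="BillingBinding"/>
  </service>
</definitions>
"""


def count(context, path):
    """Emulate XPath count(): the size of the node-set selected by path."""
    return len(context.findall(path))


root = ET.fromstring(WSDL)

# Step 1 is the path "message"; step 2 hands it to count().
message_count = count(root, "message")
print(message_count)
```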
What does the following query return when applied to the WSDL file of Listing 1?
It returns the fifth child (the service element) of the root element. The service element itself is a complete structure and contains child elements. Therefore, the returned value of this XPath query is actually an XML node and not just the name of an element.
The name() function returns the name of the XML node in question. For example, the following query will return the string "service" when applied to Listing 1:
Similarly, the following query will return the string "wsd:definitions" (fully qualified name of the root element with the namespace prefix):
The local-name() and namespace-uri() functions are similar to the name() function, except that the local-name() function returns only the local name of the element without the namespace prefix, and the namespace-uri() function returns only the namespace URI. For example, try the following queries on Listing 1:
The first query returns a string "definitions", while the second returns "http://schemas.xmlsoap.org/wsdl/".
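The split between local name and namespace URI can be observed in Python as well. ElementTree stores an element's tag as "{uri}local" and does not keep the prefix, so the prefixed name "wsd:definitions" is not directly recoverable here. The helper split_tag() is a hypothetical name introduced for this sketch, and the one-element document stands in for the root of Listing 1.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the root element of Listing 1.
DOC = ('<wsd:definitions name="BillingService" '
       'xmlns:wsd="http://schemas.xmlsoap.org/wsdl/"/>')


def split_tag(tag):
    """Split ElementTree's '{uri}local' tag into (namespace_uri, local_name)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag  # element is in no namespace


root = ET.fromstring(DOC)
namespace_uri, local_name = split_tag(root.tag)

print(local_name)      # counterpart of local-name()
print(namespace_uri)   # counterpart of namespace-uri()
```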
We have seen that the name(), local-name(), and namespace-uri() functions return strings. XPath offers several functions for the processing of strings, such as starts-with(). For example, the following query demonstrates the use of a string function:
The above query will look for the second child of the root element, then it will find all the part child elements of the root's second child. It will then look for the name attribute of the part child elements, and, as a last step, it will convert the value of the name attribute to string form. When applied to Listing 1, it will yield
XPath also provides several functions that return true or false (Boolean data type). Consider the following query:
It returns true when applied to Listing 1. That's because the boolean() function checks whether a node-set resulting from an XPath query is empty or not (in our case, it contains two children of the root element). If it is empty, the function returns false, otherwise true.
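The emptiness test performed by boolean() on a node-set can be sketched as below. The helper xpath_boolean() is a hypothetical name, not part of ElementTree, and the document is a simplified stand-in for Listing 1.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for Listing 1.
WSDL = """\
<definitions name="BillingService">
  <message name="MonthlyBill"/>
  <message name="TotalBill"/>
  <service name="BillingService"/>
</definitions>
"""


def xpath_boolean(context, path):
    """Emulate boolean() on a node-set: true unless the node-set is empty."""
    return len(context.findall(path)) > 0


root = ET.fromstring(WSDL)

has_messages = xpath_boolean(root, "message")  # two nodes selected
has_imports = xpath_boolean(root, "import")    # nothing selected

print(has_messages, has_imports)
```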
The following WSDL processing scenario uses all the XPath concepts which we've discussed so far. The search requirement for the scenario is as follows:
Find a service element which is a direct child of the definitions (root) element and whose name attribute matches the name attribute of the definitions element. Then look into that service element and find a port element whose binding attribute matches the name attribute of a binding element, which is a direct child of the definitions element.
This WSDL processing can be fulfilled in four steps:
1. Find the value of the name attribute of the definitions (root) element. The following XPath query (which returns the string BillingService from Listing 1) performs this job:
2. Then find the service element whose name attribute matches the name of the definitions element. The following query contains the query of point 1 in a predicate and will return that service element:
3. Then find the value of the name attribute of the binding element:
4. Finally, look for the required port element (whose binding attribute matches the name of the binding element of point 3) inside the service element of point 2:
This example demonstrates that XPath predicates can contain simple logical conditions, function calls or even complete XPath queries.
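The four steps above can be walked through in Python. This is a sketch under stated assumptions: the WSDL string is a hypothetical, simplified stand-in for Listing 1 (no namespaces, and binding references without prefixes), and because ElementTree cannot nest one query inside another predicate, the value found in an earlier step is substituted into the next path string by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for Listing 1.
WSDL = """\
<definitions name="BillingService">
  <message name="MonthlyBill"/>
  <message name="TotalBill"/>
  <portType name="BillingPortType"/>
  <binding name="BillingBinding" type="BillingPortType"/>
  <service name="BillingService">
    <port name="BillingPort" binding="BillingBinding"/>
    <port name="OtherPort" binding="UnknownBinding"/>
  </service>
</definitions>
"""

root = ET.fromstring(WSDL)

# Step 1: value of the name attribute of the definitions (root) element.
definitions_name = root.get("name")

# Step 2: the service child whose name attribute equals that value.
service = root.find("service[@name='%s']" % definitions_name)

# Step 3: the name attribute of each binding child of the root.
binding_names = [binding.get("name") for binding in root.findall("binding")]

# Step 4: ports inside the service whose binding attribute matches one of
# the names collected in step 3.
ports = []
for binding_name in binding_names:
    ports.extend(service.findall("port[@binding='%s']" % binding_name))

print(definitions_name, binding_names, [port.get("name") for port in ports])
```

A full XPath engine would express steps 1 to 4 as one query with nested predicates; the manual substitution here only mirrors that nesting.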
WML is an XML language defined by the WAP Forum. WML provides a presentation format for small-device displays. WML is to a small-device display what HTML is to a personal computer.
Imagine a WML file consisting of a deck of cards, where each card is wrapped in a card element. Listing 3 is a simple WML file.
The following XPath query will return all elements contained within the first card (the card element whose id is "first") of Listing 3:
The next query returns the textual contents of the first paragraph of the second card:
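Both kinds of WML query can be approximated with ElementTree's XPath subset. The deck below is a hypothetical stand-in for Listing 3, and the choice of p (paragraph) elements inside each card is an assumption made for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for Listing 3: a deck with two cards.
WML = """\
<wml>
  <card id="first"><p>Welcome</p><p>Choose an option</p></card>
  <card id="second"><p>Your bill</p><p>Thank you</p></card>
</wml>
"""

deck = ET.fromstring(WML)

# All p elements inside the card whose id attribute is "first".
in_first_card = deck.findall("card[@id='first']/p")

# The first paragraph of the second card; .text holds its textual content.
first_para_second_card = deck.find("card[2]/p[1]")

print([p.text for p in in_first_card])
print(first_para_second_card.text)
```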
We will now see how to include support for predicates and functions in the simple design of our XPath engine.
The four pseudo-code classes XPathExpression (Listing 4), XPathLocationStep (Listing 5), XPathResult (Listing 6), and Predicate (Listing 7) form the updated design that includes support for predicates and functions. We have introduced the following enhancements to the classes presented in part 1:
1. XPath can return various types of data. Examples of data types XPath may return are node-sets, strings, numbers, and Booleans. Our earlier XPath engine design supported only XML nodes as return data types. We have now provided a generic class named XPathResult (Listing 6) to support the different data types. Implementations based on our design will need to extend XPathResult for each data type separately.
2. The updated design now includes an architecture to support functions. A function call may occur at the beginning of an XPath query or inside any XPath location step. Therefore, both the XPathExpression (Listing 4) and XPathLocationStep (Listing 5) classes now have added support for function calls.
3. We have provided a separate class for predicates (Listing 7). A predicate may consist of only a logical condition or an entire XPath query. Therefore, the Predicate class constructor will check whether the predicate is a complete query or just a condition. If it is a complete XPath query, the Predicate class will instantiate a new XPathExpression object; otherwise it will just evaluate the logical condition to produce the filtered results.
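The enhancements listed above can be pictured with a small Python sketch. It is not the article's pseudo-code: the subclass names NodeSetResult and NumberResult are assumptions, the only logical condition handled is a bare position number, and ElementTree's findall() stands in for the nested XPathExpression object.

```python
from dataclasses import dataclass
from typing import List
import xml.etree.ElementTree as ET


class XPathResult:
    """Generic base for every kind of value a query may return."""


@dataclass
class NodeSetResult(XPathResult):
    nodes: List[ET.Element]


@dataclass
class NumberResult(XPathResult):
    value: float


class Predicate:
    """Filters a node list by a position number or by a nested query."""

    def __init__(self, text):
        self.text = text.strip()
        # A bare number is treated as a simple condition; anything else is
        # treated as a nested query (a simplification for illustration).
        self.is_position = self.text.isdigit()

    def apply(self, nodes):
        if self.is_position:
            index = int(self.text) - 1  # XPath positions start at 1
            return NodeSetResult(nodes[index:index + 1])
        # Nested query: keep the nodes for which it selects something.
        return NodeSetResult([n for n in nodes if n.findall(self.text)])


root = ET.fromstring(
    "<definitions><message name='A'><part/></message>"
    "<message name='B'/><service name='S'/></definitions>"
)
children = list(root)

second = Predicate("2").apply(children)
with_part = Predicate("part").apply(children)
how_many = NumberResult(float(len(with_part.nodes)))

print([n.get("name") for n in second.nodes])
print([n.get("name") for n in with_part.nodes], how_many.value)
```

Returning a typed result object from every evaluation keeps callers from having to guess whether they received nodes, a number, a string, or a Boolean.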
In this article, we discussed the syntax and use of predicates and functions in XPath. We presented various WSDL and WML processing examples and demonstrated how to form complex XPath queries. Finally, we enhanced the design of the XPath engine introduced in the first article.
XML.com Copyright © 1998-2006 O'Reilly Media, Inc.