Implementing XPath for Wireless Devices

June 5, 2002

Bilal Siddiqui

In this two-part article, I will discuss XPath and its use in Wireless applications. In the first part, we will consider applying XPath to both XML and its Wireless counterpart WBXML (Wireless Binary XML). We will also discuss the design of an XPath processing engine suitable for small wireless devices. This discussion will focus on XPath 1.0, which is a W3C recommendation.

What can XPath do for us?

XPath allows us to process XML and search for specific nodes in an XML document. To a certain extent, XPath is to XML what SQL is to relational databases, although the forthcoming W3C XML Query language takes that further. This section gives a brief overview of XPath.

Listing 1 (opens in a new window) is a Web Services Definition Language file. WSDL is an XML-based grammar to describe Web Service interfaces. The root element in Listing 1 is definitions. It has a name attribute, which indicates the name of the web service that it is defining. In our case it is BillingService.

While processing a WSDL file, we want to extract the name of a web service. The following simple XPath query will extract the name:


You can see the XPath query has three sections, separated by slashes. The first section is only a dot (.). A dot at the start of a query means you want to start your search from the beginning of the XML document.

The second section (child::node()[1]) says: Find the first child element node. The combined effect of the first and second portions (./child::node()[1]) is like saying: Find the first element of this XML document.

The third section (attribute::name) says: Find the value of an attribute name. The combined effect of the query is like saying: Find value of the name attribute of the first element in the XML document. Applying this query to the WSDL file of Listing 1 returns, BillingService.

The following query is similar to the one discussed above and finds the target namespace of this WSDL file:


Now let's take this idea further and see a more complex example. Have a look at the following XPath query and try to translate it into simple English:


In simple English, it says:

Find the first child node of the XML document, then find a service child element within the first child, then find a port element inside the service element, then look inside the port element for an address element that belongs to the SOAP namespace and return the value of its location attribute.

Important XPath definitions

Location step, context node, node-set and location path:

The parts of an XPath query are evaluated sequentially. Each part is referred to as a location step. The results of the first location step are given to the next location step and so on till you reach the end of the query. The result of each location step is a set of nodes referred to as a node-set. Each node in a node-set is known as a context-node during evaluation.

In this way, every location step produces a node-set, which is handed over to the next location step. Each node in the node-set will be treated as a context-node and another node-set will be evaluated by the next location step and so on. A series of location steps forms the complete location path.

Axes and node-test:

Specifying a child after a slash means you want to look for child elements. Specifying attribute after a slash means you are looking for attributes. Child and attribute key words are called axes. An axis specifies the direction of your search. Other XPath axes include descendant (means you are not only interested in immediate child elements but you are also interested in grand children, their children, and so on), parent (immediate parent), ancestor (parent, grand parent, great grand parent, and so on) etc.

You will also notice that every axis is followed by double colon mark (::, referred to as a resolution operator in programming languages), which is followed by the name of the node that we are looking for (targetNamespace in attribute::targetNamespace). The name that appears after the resolution operator is called a node-test (a test that our required nodes will pass).

Abbreviated Syntax

The query ./child::node()[1]/attribute::name that we considered earlier can also be written as ./child::node()[1]/@name. Here we have abbreviated attribute::name as @name. @ is the short name of the attribute axis.

Another common abbreviation is the omission of child axis. You don't need to specify child axis at all because XPath takes child as the default axis. Therefore, the following two queries will produce the same result:


What does an XPath query return?

The previous queries returned values of attributes. XPath can also return XML elements. Let's apply the following XPath query to the WSDL file of Listing 1:


This query looks for the first node (definitions element) in the WSDL file. Then it looks for all children of the definitions element that have the name message. There is a special condition attached with the message element that the query looks for ([@name="MonthNumber"]). The condition states that the message element should have an attribute named MonthNumber.

There are two message elements in the WSDL file of Listing 1. The query returns the message element whose name attribute is MonthNumber.

Multiple XPath queries working together

What if we want to find a message element with a dynamically evaluated name? For example, in WSDL processing applications, we often need to look for a message element, whose name attribute matches the value of message attribute of an input or output element.

In Listing 1, we have one input element and two output elements, each of which has a message attribute. The value of each of the message attributes matches the name attribute of a message element.

All input and output elements of Listing 1 reside inside two operation elements. Each operation element has a name attribute. The WSDL processing scenario provides us with the name of an operation element and asks us to find the message element associated with its input or output.

The following query performs this job for us, broken over two lines for clarity:



Compare this query with the last one. You will find that the only difference is that the hard coded value "MonthNumber" has been replaced with another XPath query:


This example shows that XPath syntax allows us to combine multiple XPath queries into a single complex query. This helps us in many real life applications where the XML processing task requires finding particular information and then, depending on the information read, jumping to another location in the same XML file.

XPath for Wireless

XML works for wireless applications the same way it works for the wired Web. The WSDL processing requirements inside a wireless client are the same as those inside desktop clients.

However, in order to address the bandwidth limitation issues, WAP Forum has proposed a compressed version of XML called Wireless Binary XML (WBXML), which is in use inside WAP-enabled devices.

As we have already seen, XPath is designed to operate on the tree-like XML structure. WBXML maintains the same abstract XML structure, although its concrete expression differs, so there is no conceptual difference in applying an XPath query to an XML file or its WBXML counterpart.

In order to demonstrate this point, let's examine how an XML document is transformed to WBXML.

A sample XML to WBXML transformation

Consider the simple XML file of Listing 2. We have kept it very simple in order to cover only the basics of WBXML.

A WBXML file starts with a WBXML version declaration. We are using WBXML version 1.3, which is represented by a single byte 0x03. The full translation is presented in Listing 3)

The next byte represents the DTD. We are working with an unknown DTD, which is represented by 0x01.

The third byte is for the character set. We are using the US ASCII character set, represented by 0x03 (third byte in WBXML stream of Listing 3).

The fourth byte is the length of the string table, a technique used to reuse strings that occur at multiple places in an XML file. String tables can help in better compression by writing a string in the table only once and then referring to it from various places. We do not have any strings occurring more than once in our simple XML file, so we will not use a string table; we record its length as 0x00 (fourth byte of WBXML file).

Next is the operation element. We are using the following bytes (called tag tokens) to represent our XML elements:

Operation element 0x05
Input element 0x06
Output element 0x07

WBXML assumes that all concerned parties know the meaning of tag tokens, in the same as all concerned understand (or don't) the meaning of XML tags. So we can directly use the tag token and assume that a tag token can be translated back to the name of an element wherever needed.

Therefore, our next byte is 0x05 (operation element) followed by 0x06 (input element).

Next is the string "My input" that needs to appear inline in a WBXML file. 0x03 marks the start of an inline string. A NULL character (0x00) marks the end of the string.

Notice that the same byte value 0x03 was used earlier to mark the WBXML version and character set declarations. This does not produce any ambiguity, because the first three bytes of a WBXML file are reserved for XML, DTD, and charset declarations. The actual WBXML data payload begins with the fourth byte.

0x01 marks all end tags. The combination of start tags (0x05, 0x06 and 0x07) and end tags (0x01) produce the same nested structure as that of the original XML file.

The rest of the WBXML file follows the same logic. Readers can trace Listing 3 to follow the sequence. This example illustrates that WBXML reduces the size of an XML fileq without changing its structure.

Design of a small footprint XPath engine

We will now consider the design of a small footprint XPath engine which operates on top of the DOM. DOM implementations are available for major wireless platforms, such as WinCE and Java 2 Micro Edition (J2ME).

Consider the two pseudo-code classes XPathExpression (Listing 4) and LocationStep (Listing 5). These two classes together implement an XPath engine.

XPathExpression's constructor takes two string type arguments: an XML file and an XPath expression. It loads the XML file into a DOM object. It also tokenizes the XPath expression string into an array of smaller strings, where each small string represents an XPath location step. While tokenizing, it also translates abbreviated XPath syntax into normal unabbreviated expressions.

The next step is to execute a loop in which each location step string is passed to the LocationPath constructor, one after the other. After the creation of a LocationPath object, its getResult method is called.

Every call to the getResult method needs an array of nodes (a node-set) as an argument. The getResult method will carry out the search on each node in the node-set and return another array of nodes (another node-set) that matches the search requirements of the location path.

The resulting node-set from each location step is fed to the next location step, until all location steps in the XPath query are consumed.

Now let's have a look at what's happening inside the getResult method of the LocationPath class. The constructor has already resolved the location step into axis and node-test strings. The getResult method will detect which axis we have to work with and call the appropriate sequence of operations. We have only implemented the child and descendant axes as a sample guideline. Parent and ancestor axes can also be implemented in a similar manner.


In this article I presented various XPath queries and discussed how to apply them to XML and WBXML formats. I also presented a basic design of an XPath engine. Next time, we will take this design further and include support for more XPath features.