DOM for Web Services, Part 1

October 14, 2003

Faheem Khan

In this first article of a three-part series, I offer a tutorial on the W3C Document Object Model (DOM) with particular application to web services. Here we will discuss, explain, and demonstrate what DOM can do for the XML authoring and processing required by web service applications. We will consider each of the three DOM levels and explain why, when, and how to use the various features that W3C DOM offers. Each level provides a certain collection of XML authoring and processing features. Levels 1 and 2 have become W3C recommendations, while the third level is currently a working draft at W3C.

The first article is divided into two sections. The first section discusses the XML authoring and processing requirements in web services, and the second introduces DOM and two of its implementations, namely, Microsoft XML (MSXML) and Xerces.

DOM is not always the only or the best way to process XML or to build web services, but it has the advantage of being ubiquitous and widely deployed.

XML Authoring and Processing Requirements in Web Services

Listing 1 is a simple WSDL file. The purpose of WSDL is to define the syntax for describing web service interfaces. The syntax includes three major components: the name of the request method, the parameters that go along with the request, and the details of the message that the web service will send in response. Further explanation of WSDL can be found in various articles on

A WSDL description can be processed to produce an HTML-based presentation format for human interaction with web services and to produce SOAP messages that will interact with a SOAP server.

Listing 2 is a simple HTML file that we have derived from the WSDL file of Listing 1. At the moment we are only concerned with explaining the XML processing and authoring requirements of transforming a WSDL file into an HTML file for presentation to web service users. We will later demonstrate how this is accomplished using DOM APIs.

Notice the following points about the WSDL to HTML transformation:

  1. The HTML file contains a paragraph (a p element) and a table.

  2. The paragraph contains a heading (an h3 element) and a sentence.

  3. Notice that the string wrapped inside the h3 element is "WeatherService", which has been taken from the name attribute of the service element in Listing 1. Recall from earlier discussion that the name attribute value of the service element is the name of the service. That's why we have used this name as the heading for our web service invocation page.

  4. Similarly, we have copied the description of our web service from the WSDL file and included it immediately after the heading in the HTML file.

  5. The table in the HTML file of Listing 2 contains a form. The form in turn contains a number of rows.

  6. The first and second rows contain the name and description of the GetCityWeatherReport request method. These two bits of information are also copied from the corresponding elements of the WSDL file of Listing 1.

  7. The table also contains an input element, which represents the parameter that this request method takes to invoke the GetCityWeatherReport web service method. The user will need to enter the name of the city for which they want to get the weather report.

The HTML file of Listing 2 is derived from the WSDL file of Listing 1 to enable human interaction with our web service. Similarly, we can also author a SOAP request by reading bits of information from a WSDL file.

Look at Listing 3, which is a SOAP message that carries the GetCityWeatherReport method invocation request to our web service. Two points are of particular interest:

  1. The name of the method (GetCityWeatherReport) defined in the WSDL file becomes the name of the immediate child element of the SOAP Body element.

  2. The user who wants to invoke our web service will need to provide the CityName parameter value (the part element inside the first message element in Listing 1). This value is wrapped inside the CityName element in Listing 3. Notice that the name attribute value (CityName) of the part element in Listing 1 becomes an element name in the SOAP request of Listing 3.

This description of the WSDL-to-HTML transformation and SOAP authoring is meant to give you an idea about the XML processing requirements in web services. This type of transformation can be accomplished in several ways, one of which is by using DOM.
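As a rough illustration of the SOAP authoring described above, the following Java sketch builds the Envelope, Body, method, and parameter elements with the standard org.w3c.dom interfaces. It is not Listing 3: the class name, the m prefix, and the urn:example:weather service namespace are assumptions made for illustration, and the city value is arbitrary.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class SoapRequestSketch {
    // Standard SOAP 1.1 envelope namespace.
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    // Hypothetical service namespace; it is not taken from Listing 1.
    static final String SERVICE_NS = "urn:example:weather";

    // Builds Envelope/Body/method/parameter as a DOM tree and returns the document.
    static Document buildRequest(String methodName, String paramName, String paramValue) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
        Element envelope = doc.createElementNS(SOAP_NS, "SOAP-ENV:Envelope");
        Element body = doc.createElementNS(SOAP_NS, "SOAP-ENV:Body");
        // The WSDL method name becomes the immediate child of the SOAP Body.
        Element method = doc.createElementNS(SERVICE_NS, "m:" + methodName);
        // The WSDL part name becomes the element that wraps the user's value.
        Element param = doc.createElementNS(null, paramName);
        param.appendChild(doc.createTextNode(paramValue));
        method.appendChild(param);
        body.appendChild(method);
        envelope.appendChild(body);
        doc.appendChild(envelope);
        return doc;
    }

    public static void main(String[] args) {
        Document request = buildRequest("GetCityWeatherReport", "CityName", "London");
        Element method = (Element) request.getDocumentElement().getFirstChild().getFirstChild();
        System.out.println(method.getLocalName() + " wraps " + method.getFirstChild().getNodeName());
    }
}
```

The sketch stops at the in-memory tree; serializing it to text for transmission is a separate step that is not shown here.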

When you are using DOM, you have the option of using both client and server side DOM implementations. The client side DOM runs inside a browser, while a server side DOM sits on the web server. If you decide to use client side DOM, you can send your WSDL file as is to a browser along with the necessary script (e.g. JavaScript) that will use the browser's DOM implementation to author HTML and SOAP.

On the other hand, if you decide to use server side DOM, your web server will use its DOM implementation to author HTML and SOAP.

This series of articles will demonstrate the use of both client and server side DOM implementations. It is now time to start looking at the DOM APIs. The following section introduces DOM and two of the most important DOM implementations, MSXML and Xerces.

Document Object Model

DOM represents an XML document as a tree of nodes. DOM defines various types of nodes corresponding to the different XML constructs. For example, an XML element is an element node, an XML attribute-value pair is an attribute node, the textual content of an element is a text node, and so on.

You can load an XML document into a DOM tree of nodes. For example, a partial DOM representation of Listing 1 is shown in Figure 1. We have shown all element nodes as green, and all attribute nodes as blue in Figure 1.

Figure 1: Partial DOM representation of Listing 1

Nodes in a DOM model are related to each other in a parent-child or sibling relationship. For example, note from Figure 1 that message and portType elements are children of the definitions element and are siblings of each other. Notice that the document node in Figure 1 is the parent of the root (definitions) node, which means all nodes in a DOM model are direct or indirect children of the document node.
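The relationships described above can be checked with a few calls on the generic Node interface. The following Java sketch is only an illustration: the WSDL string is a hypothetical, heavily simplified stand-in for Listing 1, and the class and helper names are our own.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomRelationsSketch {
    // Hypothetical stand-in for Listing 1. There is no whitespace between tags,
    // so no whitespace-only text nodes appear among the children.
    static final String WSDL =
        "<definitions name=\"WeatherService\">"
        + "<message name=\"GetCityWeatherReportRequest\"/>"
        + "<portType name=\"WeatherServicePortType\"/>"
        + "</definitions>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Maps the numeric DOM node type to a readable label.
    static String typeName(Node node) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE: return "document";
            case Node.ELEMENT_NODE: return "element";
            case Node.ATTRIBUTE_NODE: return "attribute";
            case Node.TEXT_NODE: return "text";
            default: return "other";
        }
    }

    public static void main(String[] args) {
        Document doc = parse(WSDL);
        Element root = doc.getDocumentElement();
        Node message = root.getFirstChild();
        Node portType = message.getNextSibling();
        System.out.println("The " + typeName(doc) + " node is the parent of " + root.getNodeName());
        System.out.println(message.getNodeName() + " and " + portType.getNodeName()
            + " share the parent " + portType.getParentNode().getNodeName());
        System.out.println("name is an " + typeName(root.getAttributeNode("name")) + " node");
    }
}
```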

DOM interfaces use the concept of inheritance from Object Oriented Programming. The different types of nodes (such as element node, attribute node, text node etc.) inherit from a generic Node interface.

The tree structure of DOM is object oriented, which means that the different nodes shown in Figure 1 are not just data structures. Each node is an object, which contains both data and methods to manipulate the object. For example, you can take the definitions node and call its methods to get a list of its child element nodes. You can then take a particular child element and call its methods to move further down the DOM tree. This way you can traverse the entire DOM document.
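A minimal sketch of such a traversal follows. It asks each node for its children and descends recursively, recording one line per element. The input document is a hypothetical skeleton rather than Listing 1, and the class and method names are ours.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomTraversalSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Returns one line per element, indented by the element's depth in the tree.
    static String outline(Node start) {
        StringBuilder out = new StringBuilder();
        walk(start, 0, out);
        return out.toString();
    }

    private static void walk(Node node, int depth, StringBuilder out) {
        int childDepth = depth;
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(node.getNodeName()).append('\n');
            childDepth = depth + 1;
        }
        // Ask the node for its children, then descend into each one.
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), childDepth, out);
        }
    }

    public static void main(String[] args) {
        String xml = "<definitions><message><part/></message><portType/><service/></definitions>";
        System.out.print(outline(parse(xml)));
    }
}
```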

In order for all this to work, the DOM specification defines the programmatic interfaces for different types of nodes to expose their functionality. Using these programmatic interfaces of DOM, you can fulfill all the XML authoring and processing requirements of web services discussed earlier in this article. You will take your WSDL file, load it into a DOM document, and then use the various DOM interfaces for XML authoring and processing.

Recall from the discussion in the last section that we may need to generate HTML pages from our WSDL files. For such applications, DOM provides an HTML-specific interface, which is especially designed to author or process HTML documents. Therefore, we can load our WSDL file into a DOM document, read the various bits of information from the WSDL file, and author HTML using the HTML-specific DOM interfaces.
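The following Java sketch shows the general shape of this WSDL-to-HTML step. Note the assumptions: it uses only the core DOM interfaces (createElement and createTextNode) rather than the HTML-specific ones, the WSDL string is a hypothetical stand-in for Listing 1, and it produces only the heading and description, not the form and table of Listing 2.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class WsdlToHtmlSketch {
    // Hypothetical stand-in for the service element of Listing 1.
    static final String WSDL =
        "<definitions><service name=\"WeatherService\">"
        + "<documentation>Provides city weather reports.</documentation>"
        + "</service></definitions>";

    static DocumentBuilder builder() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (Exception e) {
            throw new IllegalStateException("No DOM implementation available", e);
        }
    }

    static Document parse(String xml) {
        try {
            return builder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Reads the service name and description from the WSDL tree and authors a new HTML tree.
    static Document toHtml(Document wsdl) {
        Element service = (Element) wsdl.getElementsByTagName("service").item(0);
        Element documentation = (Element) service.getElementsByTagName("documentation").item(0);
        String heading = service.getAttribute("name");
        String description = documentation.getFirstChild().getNodeValue();

        Document html = builder().newDocument();
        Element root = html.createElement("html");
        Element body = html.createElement("body");
        Element p = html.createElement("p");
        Element h3 = html.createElement("h3");
        h3.appendChild(html.createTextNode(heading));
        // The paragraph holds the heading followed by the description sentence.
        p.appendChild(h3);
        p.appendChild(html.createTextNode(description));
        body.appendChild(p);
        root.appendChild(body);
        html.appendChild(root);
        return html;
    }

    public static void main(String[] args) {
        Document html = toHtml(parse(WSDL));
        System.out.println(html.getElementsByTagName("h3").item(0).getFirstChild().getNodeValue());
    }
}
```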

From Level 1 to 3

W3C has developed DOM in levels. Level 1 included the basic features for XML processing such as traversing through the structure of an XML file, getting all children of a particular node, checking the type of a node (whether it is an element node or an attribute node or some other type of node), setting and reading the attributes of a particular element node, jumping to the next or previous sibling of a particular node, adding or removing attributes of a particular element node, appending a new child to a node, getting a list of all elements with a particular name, working with XML processing instructions, etc.
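A few of these Level 1 operations are sketched below in Java: finding elements by name, reading, setting, and removing attributes, appending a child, and stepping to a sibling. The message fragment is a hypothetical stand-in for Listing 1, and the CountryName part is invented purely for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomLevelOneSketch {
    // Hypothetical stand-in for the first message element of Listing 1.
    static final String WSDL =
        "<definitions><message name=\"GetCityWeatherReportRequest\">"
        + "<part name=\"CityName\"/>"
        + "</message></definitions>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Creates a part element, sets its attributes, and appends it to the message.
    static Element addPart(Element message, String partName, String type) {
        Element part = message.getOwnerDocument().createElement("part");
        part.setAttribute("name", partName);
        part.setAttribute("type", type);
        message.appendChild(part);
        return part;
    }

    public static void main(String[] args) {
        Document doc = parse(WSDL);
        Element message = (Element) doc.getElementsByTagName("message").item(0);
        Element first = (Element) message.getFirstChild();
        System.out.println("first part: " + first.getAttribute("name"));

        // CountryName is a made-up parameter used only to show authoring.
        Element added = addPart(message, "CountryName", "xsd:string");
        added.removeAttribute("type");
        System.out.println("previous sibling: " + added.getPreviousSibling().getNodeName());
        System.out.println("parts now: " + doc.getElementsByTagName("part").getLength());
    }
}
```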

In addition to these XML features, DOM Level 1 also contains an HTML-specific set of interfaces, which can handle individual HTML elements. For example, in DOM Level 1, you can work with HTML tables, forms, selection lists, and so on.

Level 2 has added several features to those of DOM Level 1. For example, there was no support for namespaces in DOM Level 1. When working with several namespaces in the same DOM Level 1 document, you have to write your own logic to manage namespace URIs, prefixes, and element names. DOM Level 2 interfaces contain methods to manage namespace-related authoring and processing requirements.
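To make the Level 2 namespace support concrete, here is a small Java sketch. The fragment is a hypothetical WSDL skeleton (not Listing 1) in which two elements share the local name binding but belong to different namespaces: the WSDL 1.1 namespace and its SOAP binding namespace. The class and helper names are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomNamespaceSketch {
    static final String WSDL_NS = "http://schemas.xmlsoap.org/wsdl/";
    static final String SOAP_NS = "http://schemas.xmlsoap.org/wsdl/soap/";
    // Hypothetical skeleton: both elements are named binding, in different namespaces.
    static final String WSDL =
        "<definitions xmlns=\"" + WSDL_NS + "\" xmlns:soap=\"" + SOAP_NS + "\">"
        + "<binding name=\"WeatherBinding\"><soap:binding style=\"rpc\"/></binding>"
        + "</definitions>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace processing is off by default in JAXP and must be enabled.
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Counts elements by namespace URI and local name, ignoring prefixes.
    static int count(Document doc, String namespaceUri, String localName) {
        return doc.getElementsByTagNameNS(namespaceUri, localName).getLength();
    }

    public static void main(String[] args) {
        Document doc = parse(WSDL);
        Element outer = (Element) doc.getElementsByTagNameNS(WSDL_NS, "binding").item(0);
        Element inner = (Element) outer.getFirstChild();
        System.out.println(inner.getPrefix() + " | " + inner.getLocalName() + " | " + inner.getNamespaceURI());
        System.out.println("WSDL bindings: " + count(doc, WSDL_NS, "binding"));
        System.out.println("SOAP bindings: " + count(doc, SOAP_NS, "binding"));
    }
}
```

With Level 1 methods alone, telling these two elements apart would mean inspecting prefixes and xmlns attributes by hand.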

DOM Level 3 is currently a working draft at W3C and adds further to DOM Level 2 functionality. One of the important features that is being added to DOM in its Level 3 is the ability to work with multiple schema-specific DOM extensions.

For example, you may want to define schema-specific programmatic interfaces for your own XML vocabularies, just like the HTML-specific interfaces in DOM Level 1. If you do so, you have developed a high level programmatic interface to manipulate XML according to your own schema. Similarly many companies will come up with their own schema-specific XML authoring and processing interfaces, which can all be implemented using DOM interfaces at a lower level.
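To give a feel for what a schema-specific interface layered over DOM might look like, here is a deliberately small Java sketch. It is a plain wrapper class of our own invention (WsdlServiceView and its methods are hypothetical), not the DOM Level 3 extension mechanism itself, and the XML in main is a stand-in for Listing 1.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical high-level view of a WSDL service element, built on low-level DOM calls.
class WsdlServiceView {
    private final Element service;

    WsdlServiceView(Element service) {
        if (service == null) {
            throw new IllegalArgumentException("No service element found");
        }
        this.service = service;
    }

    // Parses the XML and wraps its first service element.
    static WsdlServiceView fromXml(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return new WsdlServiceView((Element) doc.getElementsByTagName("service").item(0));
    }

    String getName() {
        return service.getAttribute("name");
    }

    void setName(String name) {
        service.setAttribute("name", name);
    }

    // Returns the documentation text, or an empty string if there is none.
    String getDocumentation() {
        NodeList docs = service.getElementsByTagName("documentation");
        if (docs.getLength() == 0) {
            return "";
        }
        Node text = docs.item(0).getFirstChild();
        return text == null ? "" : text.getNodeValue();
    }

    public static void main(String[] args) {
        WsdlServiceView view = fromXml(
            "<definitions><service name=\"WeatherService\">"
            + "<documentation>Provides city weather reports.</documentation>"
            + "</service></definitions>");
        System.out.println(view.getName() + ": " + view.getDocumentation());
    }
}
```

Callers of such a class work in the vocabulary of the schema (service, name, documentation) and never touch nodes directly.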

Sometimes you will need to use such schema-specific interfaces in one application together with standard DOM interfaces. DOM Level 3 provides a mechanism for developing schema-specific interfaces as DOM extensions and using different DOM extensions together in one application.

In web service applications it is very common for different namespaces to be used together in the same WSDL-based web service interface. Therefore, schema-specific DOM interfaces may be developed and used together in a single web service application. The second and third articles of this series will discuss this concept further.


MSXML

MSXML, by Microsoft, implements DOM Level 2 interfaces. MSXML also adds some extended features not included in DOM Level 2.

Some of MSXML's DOM Level 2 interfaces are:

  1. IXMLDOMNode interface is the generic DOM node from which all the different types of nodes extend. You will never use this interface directly. Rather you will use the different types of nodes (e.g. element node, attribute node etc.) that extend from the IXMLDOMNode interface.

  2. IXMLDOMDocument interface exposes the functionality of a document node, which holds the entire DOM document as its direct or indirect children. You will normally start DOM authoring and processing with this interface.

  3. IXMLDOMElement interface defines the functionality of a DOM element node. For example, it has methods to get the list of attributes associated with an element or fetch a particular attribute value, etc.

  4. IXMLDOMNodeList interface helps in manipulating a list of DOM nodes (such as all child element nodes of an XML element). It contains methods to iterate through the child element list and find a particular node in the list.

  5. IXMLDOMText interface exposes methods to manipulate the textual content of an XML element.

  6. IXMLDOMComment interface is used to work with comments in an XML file.

  7. IXMLDOMAttribute interface represents the functionality of an attribute node (for example to edit an attribute value).

In addition, MSXML also implements the HTML-specific DOM interfaces for HTML authoring and processing. The HTML-specific interfaces provide simple set and get methods to populate an HTML body with <a>, <p>, <table>, <form>, <input> and other HTML elements.

You can use MSXML on both client and server sides. On the client side, you will use MSXML inside Microsoft Internet Explorer. For example, if you want to generate the HTML file of Listing 2 from the WSDL file of Listing 1, you can generate JavaScript from the server side that will use MSXML to produce HTML on the client side.

Alternatively, you can generate HTML directly on the client side using the same DOM features of MSXML.

Have a look at Listing 4, which is meant to introduce XML processing with MSXML. Listing 4 is actually an HTML file with a simple JavaScript method named FindServiceDocumentation, which reads the contents of the documentation element within the service element of Listing 1 and displays them to the user in a browser alert.

Following is a brief explanation of how the FindServiceDocumentation method in Listing 4 works using MSXML:

  1. The first line in the FindServiceDocumentation method instantiates an MSXML object.

  2. The second, third, and fourth lines of the method are variable declarations, which we will use to hold different DOM element nodes.

  3. The fifth line (xmlDoc.load) loads a WSDL file into DOM.

  4. The next line is an if statement, which checks whether there were any errors while loading the XML document. The most probable error is perhaps that the XML that you want to load into DOM is not well formed.

  5. If there were no problems while loading XML into DOM, we will read the root element and check if it is definitions. If it is not definitions, we are not interested in further processing.

  6. If the root element is definitions, we will try to find its service element. Note that the line serviceElement = definitionsElement.getElementsByTagName("service").item(0) actually works in two steps. First the getElementsByTagName returns a list of all service elements and then the method item(0) returns the first element from the list.

  7. Similarly, we read the documentation element in the service element.

  8. Next look at the alert line, which contains the expression documentationElement.firstChild.nodeValue. This expression works in two steps. The firstChild property fetches the first child node of the documentation element, which is the text node that we were looking for. The nodeValue property then returns the value of that text node as a string, and the alert statement displays it.
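For readers who prefer Java, the same sequence of steps can be sketched with the standard org.w3c.dom interfaces. This is not Listing 4, which is JavaScript using MSXML: the class name is ours, the method returns the text (or null) instead of raising an alert, and the check on the root name assumes the WSDL uses no namespace prefix on definitions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class FindServiceDocumentationSketch {
    // Returns the text of service/documentation, or null if any step fails.
    static String findServiceDocumentation(String wsdlXml) {
        Document xmlDoc;
        try {
            // Load the WSDL into a DOM document.
            xmlDoc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(wsdlXml)));
        } catch (Exception e) {
            // Most likely the XML is not well formed.
            return null;
        }
        // Read the root element and check that it is definitions.
        Element definitionsElement = xmlDoc.getDocumentElement();
        if (!"definitions".equals(definitionsElement.getNodeName())) {
            return null;
        }
        // Two steps: get the list of service elements, then take the first one.
        Node serviceElement = definitionsElement.getElementsByTagName("service").item(0);
        if (serviceElement == null) {
            return null;
        }
        Node documentationElement =
            ((Element) serviceElement).getElementsByTagName("documentation").item(0);
        if (documentationElement == null || documentationElement.getFirstChild() == null) {
            return null;
        }
        // The first child of documentation is the text node; its value is the string we want.
        return documentationElement.getFirstChild().getNodeValue();
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for Listing 1.
        String wsdl = "<definitions><service name=\"WeatherService\">"
            + "<documentation>Provides city weather reports.</documentation>"
            + "</service></definitions>";
        System.out.println(findServiceDocumentation(wsdl));
    }
}
```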

We have kept Listing 4 and its explanation very simple and brief, as this is only an introduction. Real world XML authoring and processing is never so simple. We will take this example further in the second article of this series and demonstrate the use of MSXML for both XML and HTML processing as well as authoring in detail.


Xerces

Xerces is a Java-based DOM Level 2 implementation by Apache. Xerces also partially supports DOM Level 3. You can create Java-based SOAP client side applications using Xerces; however, it is mostly used on the server side.

Xerces interfaces are similar to the interfaces that you have seen while discussing MSXML, which is natural to expect, as both are DOM implementations.

The third article of this series will demonstrate the use of Xerces in detail. For the moment, just note the following points about the Xerces interfaces:

  1. Almost all interfaces extend from the Node interface.

  2. The Document interface allows you to manipulate the complete XML document. Normally you will use the Document interface to get the root element of the XML document. You can also use this interface to import a node from some other DOM document into this document.

  3. The Element interface allows you to process or author elements (e.g. adding attributes to or removing attributes from an element).

  4. The NodeList interface is used to manipulate a list of nodes.
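As a small taste of these interfaces, the following Java sketch imports a node from one DOM document into another using the Document interface. It relies only on the standard org.w3c.dom and JAXP classes (so it runs with whichever parser the Java runtime supplies, not necessarily a separately installed Xerces), and the two documents and the class name are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomImportSketch {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Copies source (with its subtree) into target and appends the copy to target's root.
    static Node importUnderRoot(Document target, Node source) {
        // importNode makes a copy owned by target; the source document is left unchanged.
        Node copy = target.importNode(source, true);
        target.getDocumentElement().appendChild(copy);
        return copy;
    }

    public static void main(String[] args) {
        Document first = parse(
            "<definitions><message name=\"GetCityWeatherReportRequest\"/></definitions>");
        Document second = parse("<definitions/>");
        Node copy = importUnderRoot(second, first.getDocumentElement().getFirstChild());
        System.out.println("copy belongs to second document: " + (copy.getOwnerDocument() == second));
        System.out.println("messages left in first document: "
            + first.getElementsByTagName("message").getLength());
    }
}
```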

We have provided a brief introduction to MSXML and Xerces. The second article will demonstrate the use of MSXML, while the third article will demonstrate the use of Xerces. Both the second and third articles will use the web services usage models and XML authoring and processing requirements as an example application scenario to demonstrate the use of DOM features.