DOM for Web Services, Part 3
In the first article of this series I discussed the XML authoring and processing requirements in web services, explained the DOM architecture along with the features in the three DOM levels, and introduced MSXML and Xerces, two popular DOM implementations.
In the second article I showed readers how to use MSXML, especially how to process WSDL files and develop web service user interfaces on the client side using MSXML inside JavaScript code. I also showed the use of MSXML on the server via an ASP.NET page.
In this third and final article of this series I demonstrate the use of Xerces, which is the most popular Java-based implementation of DOM. In this article's first section I develop a couple of Java classes that can create and process SOAP messages. This will demonstrate the basic DOM features of Xerces. In the second section I demonstrate the use of some other important features, including:
- Working with multiple XML documents in which you need to import XML nodes from one document into another.
- The use of Xerces to generate DOM events, and writing your own event handlers to handle the events generated.
- The use of DOM range and DOM document fragments. The DOM range specification provides an easy-to-use method for grouped processing of several XML nodes.
The third section contains a discussion of the Load and Save module, an important feature in DOM level 3 which is not yet supported in Xerces.
The last section wraps up this series by explaining the scenarios in which you will most likely use DOM for XML authoring and processing requirements in your web service applications.
W3C DOM and Xerces
Xerces is part of the Apache XML project. It is available for Java and C++. In this article I only cover Xerces for Java, which is commonly called Xerces-J. The most recent version of Xerces-J available at the time of writing is 2.6.
Note that W3C DOM is not the only XML API that Xerces supports. Xerces also supports SAX and a proprietary interface called Xerces Native Interface (XNI). Complete documentation about Xerces is available from the Xerces site. I will discuss only the W3C DOM features of Xerces.
Also note that the Java Web Services Developer Pack (JWSDP) from Sun includes standard XML processing Java APIs, including the Java API for XML Processing (JAXP). The current reference implementation of JAXP uses Xerces as its default XML processing engine. If you download JWSDP from Sun's site, you will get Xerces, and you won't need to download it separately.
However, if you are using JDK1.4, you have a small problem to take care of before you start using Xerces. JDK1.4 ships with its own, older bundled XML parser. Even if you include the Xerces jars in your classpath, the Java runtime will use that bundled parser and not the one that comes with JWSDP. The instructions for solving this problem come with the JWSDP installation. When you install JWSDP under Windows (the latest release at the time of writing is version 1.3), you will see instructions for JDK1.4 users saying "Create the directory: <JAVA_HOME>\jre\lib\endorsed and then copy the files in the following directory to the newly created directory: C:\jwsdp-1.3\jaxp\lib\endorsed".
The files in the C:\jwsdp-1.3\jaxp\lib\endorsed directory of JWSDP include a Xerces-J jar file named xercesImpl.jar. When you create the new <JAVA_HOME>\jre\lib\endorsed directory and copy the files from the C:\jwsdp-1.3\jaxp\lib\endorsed directory to it, you are telling the Java runtime to use the new version of Xerces instead of the parser bundled with JDK1.4.
However, if you don't want to download and install JWSDP, you can
download Xerces
and copy the xercesImpl.jar file into the
<JAVA_HOME>\jre\lib\endorsed directory.
Once you have the xercesImpl.jar file in the correct place, you will not need to include anything in your classpath to compile and run the samples in this article. The source code download contains the source and compiled forms of all the samples that we are going to use for demonstration in this article.
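If you want to confirm which parser the runtime actually picked up, a small check like the following may help. It is only a sketch: the class name ParserCheck is hypothetical, and the name printed depends entirely on your JDK and on what is in the endorsed directory. With Xerces-J in place you would expect a class in the org.apache.xerces package tree.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

// Hypothetical helper, not part of the article's source code download.
public class ParserCheck {

    /** Returns the class name of the DocumentBuilder that JAXP selected. */
    public static String builderClassName() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        // The printed name varies with the JDK and the endorsed directory contents.
        System.out.println("DocumentBuilder in use: " + builderClassName());
    }
}
```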
Xerces for SOAP authoring
Look at Listing
3 of the first article of this series, which was a SOAP message that
we used to describe the usage model of web services. Notice that the SOAP
message contains elements belonging to two XML namespaces. The first is
the SOAP namespace and the second is an application specific namespace
(http://www.cityportal.com).
The use of these namespaces demonstrates that the XML and SOAP
specifications allow building layered applications, where the
application-specific layer works on top of the SOAP layer. The SOAP
specification defines the Envelope, Header, and
Body elements and allows applications to define their own
namespaces to fill in the header and the body of a SOAP envelope.
This layered architecture is a great strength of XML web
services. It allows vendors to develop off-the-shelf standard solutions
(e.g. a SOAP client or a SOAP server) and application developers to add
only the application-specific bit of the layered framework. For example,
if you consider the SOAP message of Listing
3 of the first article, you will see that the only
application-specific elements are GetCityWeatherReport and
CityName. The rest of the markup is standard SOAP.
We are going to use the same idea of layering application bits. We will have two classes in our sample DOM-based SOAP engine:
- The DataWrapper class creates the application-specific data that goes along with the SOAP method call (e.g. the CityName element in Listing 3 of the first article).
- The SOAPMessage class creates the SOAP Envelope along with the SOAP Body. Because a SOAP request usually contains the name of a web service method, the same SOAPMessage class also authors the method element (usually the immediate child of the SOAP Body).
But how do these classes use DOM to create XML?
Look at the add() method in Listing
1, which takes three parameters. The first parameter is the name of
the data element (e.g. CityName in Listing
3 of the first article). The second parameter specifies the namespace
to which the data element belongs. The third parameter specifies the
contents of the data element (e.g. "Karachi" in Listing
3 of the first article). The add() method simply stores
these parameters in a list. An application can call this method any number
of times. Every time an application calls this method, a new set of data
will be added to the items already stored in the list.
The appendAsChildren() method in Listing
1 takes just one parameter named parentElement, which is
a DOM element. The appendAsChildren() method takes all the
entries in the list one by one and adds them as child nodes to the
parentElement.
Notice from Listing
1 that the appendAsChildren() method first calls the
getOwnerDocument() method of the parentElement
object. The getOwnerDocument() method belongs to the DOM
Node interface. It returns the Document object
to which a DOM node belongs. We need to know the owner document whenever
we want to add a child element to an existing element.
After getting the owner Document object, the
appendAsChildren() method performs the following operations
for every entry in the list:
- Create a new element using the createElementNS() method of the ownerDocument object. The createElementNS() method takes two parameters. The first parameter is the namespace URI string for the element that you want to create. The second parameter is the name of the element. The createElementNS() method returns the newly created Element object, which represents the name of a parameter that goes along with a SOAP method invocation request (e.g. CityName in Listing 3 of the first article).
- Append the newly created DOM Element as a child to parentElement by calling the appendChild() method of parentElement.
- Create a new text node by calling the createTextNode() method of the owner document object and append the text node as a child to the newly created Element node. This text node represents the value of the parameter that goes with a SOAP message call (e.g. "Karachi" in Listing 3 of the first article).
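The three steps above can be sketched roughly as follows. This is not the article's Listing 1, only a minimal approximation of it: the class name DataWrapperSketch and the use of a String array per entry are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical approximation of the DataWrapper class described in the text.
public class DataWrapperSketch {

    // Each entry holds {name, namespaceURI, value}.
    private final List<String[]> entries = new ArrayList<>();

    /** Stores one data element; nothing is added to any DOM tree yet. */
    public void add(String name, String namespaceUri, String value) {
        entries.add(new String[] {name, namespaceUri, value});
    }

    /** Appends every stored entry as a child element of parentElement. */
    public void appendAsChildren(Element parentElement) {
        // New nodes must be created by the document that owns the parent.
        Document ownerDocument = parentElement.getOwnerDocument();
        for (String[] entry : entries) {
            // Step 1: create the element (namespace URI first, then name).
            Element dataElement = ownerDocument.createElementNS(entry[1], entry[0]);
            // Step 2: attach it to the parent.
            parentElement.appendChild(dataElement);
            // Step 3: create the text node holding the value and attach it.
            dataElement.appendChild(ownerDocument.createTextNode(entry[2]));
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element method = doc.createElementNS(
                "http://www.cityportal.com", "GetCityWeatherReport");
        doc.appendChild(method);

        DataWrapperSketch data = new DataWrapperSketch();
        data.add("CityName", "http://www.cityportal.com", "Karachi");
        data.appendAsChildren(method);

        Element city = (Element) method.getFirstChild();
        System.out.println(city.getTagName() + " = "
                + city.getFirstChild().getNodeValue());
    }
}
```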
Just for the sake of demonstration, we have written a simple
main() method in Listing
1. The main() method demonstrates how an application will
use the functionality of the add() and
appendAsChildren() methods.
Now have a look at the SOAPMessage constructor in Listing
2. It takes three parameters: methodName,
methodNamespace, and parameters. The
methodName parameter represents the name of the SOAP method
that the SOAP message will invoke on a remote server
(e.g. GetCityWeatherReport in Listing
3 of the first article). The methodNamespace parameter
represents the namespace to which the methodName element belongs
(e.g. "http://www.cityportal.com" in Listing
3 of the first article). The parameters parameter is a
DataWrapper object which wraps all the data that goes with
the SOAP method invocation request.
The SOAPMessage constructor creates a SOAP message, so you first have to create a new empty XML document. Creating a new XML DOM document in Xerces takes three steps. You first instantiate a DocumentBuilderFactory, then you create a DocumentBuilder, and then, using the newDocument() method of the DocumentBuilder, you create a DOM Document object. You will use the newDocument() method whenever you want to create a new empty XML DOM document containing no data. The Document object that newDocument() returns exposes the DOM Document interface.
Once you have the DOM document, you can author the root
Envelope element by using the createElementNS()
method discussed earlier.
After creating the Envelope element, you need to attach the element to its parent. Because Envelope is the root element, the Document object is its parent. Therefore, you will call the appendChild() method of the Document object to attach the Envelope element to the document.
Note that an XML document can have only one root element. That's why you can attach only one element node to a Document object. If you try to attach a second element node, appendChild() will throw a DOMException (HIERARCHY_REQUEST_ERR) at runtime.
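The following sketch walks through creating an empty document, attaching a root element, and then trying to attach a second one. The class name SingleRootSketch is hypothetical, and the SOAP 1.1 envelope namespace is an assumption, since Listing 3 of the first article is not reproduced here and may use a different namespace URI.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;

// Hypothetical demonstration class, not taken from the article's listings.
public class SingleRootSketch {

    // Assumption: the SOAP 1.1 envelope namespace.
    public static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    /** Factory, then builder, then an empty Document with no data. */
    public static Document newEmptyDocument() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.newDocument();
    }

    /**
     * Tries to append another element directly to the Document.
     * Returns true if the DOM rejected it with HIERARCHY_REQUEST_ERR,
     * which is what happens when the document already has a root.
     */
    public static boolean secondRootIsRejected(Document doc) {
        try {
            doc.appendChild(doc.createElementNS(SOAP_NS, "env:Envelope"));
            return false;
        } catch (DOMException e) {
            return e.code == DOMException.HIERARCHY_REQUEST_ERR;
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = newEmptyDocument();
        // The first element appended to the Document becomes the root.
        doc.appendChild(doc.createElementNS(SOAP_NS, "env:Envelope"));
        System.out.println("Root: " + doc.getDocumentElement().getTagName());
        System.out.println("Second root rejected: " + secondRootIsRejected(doc));
    }
}
```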
In a similar manner we have created the Body element (the
bodyElement object), attached it to the Envelope
element, created the SOAP method name element (the
methodElement object), and attached it to the
Body element.
Finally we have to author the elements that represent parameters
associated with the SOAP method invocation request. This is the job of the
appendAsChildren() method of the DataWrapper
class that we have already explained. You will call the
appendAsChildren() method of the parameters object and pass
the methodElement object along with the method call. This
will automatically append the parameters data to the SOAP method call.
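Putting the pieces together, the construction described above might look roughly like this. It is a sketch rather than the article's Listing 2: the class name SoapMessageSketch is hypothetical, a plain Map stands in for the DataWrapper object so that the example is self-contained, and the SOAP 1.1 namespace and the "env" prefix are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical approximation of the SOAPMessage class described in the text.
public class SoapMessageSketch {

    // Assumption: the SOAP 1.1 envelope namespace.
    public static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";

    /** Builds Envelope > Body > method element > one child per parameter. */
    public static Document build(String methodName, String methodNamespace,
            Map<String, String> parameters) throws ParserConfigurationException {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();

        // The Envelope is the root, so its parent is the Document itself.
        Element envelope = doc.createElementNS(SOAP_NS, "env:Envelope");
        doc.appendChild(envelope);

        Element body = doc.createElementNS(SOAP_NS, "env:Body");
        envelope.appendChild(body);

        // The method element lives in the application's namespace.
        Element method = doc.createElementNS(methodNamespace, methodName);
        body.appendChild(method);

        // Stand-in for DataWrapper.appendAsChildren(methodElement).
        for (Map.Entry<String, String> parameter : parameters.entrySet()) {
            Element element = doc.createElementNS(methodNamespace, parameter.getKey());
            element.appendChild(doc.createTextNode(parameter.getValue()));
            method.appendChild(element);
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> parameters = new LinkedHashMap<>();
        parameters.put("CityName", "Karachi");
        Document doc = build("GetCityWeatherReport",
                "http://www.cityportal.com", parameters);
        System.out.println("Root: " + doc.getDocumentElement().getTagName());
    }
}
```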
Also look at the getSOAPRequestText() method in Listing
2, which was written to demonstrate XML processing in Xerces. It takes
a Document object and returns its XML data in string form. It
uses a method called getElementAsText(), which is recursive
and is responsible for creating the XML data corresponding to the root
element and all its children.
The following points are worth noting from the
getElementAsText() method in Listing
2:
- We have used the getTagName() method of the Element object to read the tag name of the element. The tag name consists of both the prefix and the local name (e.g. if the prefix is "env" and the local name is "Envelope", the tag name will be "env:Envelope").
- We have used the getAttributes() method of the Element object to read all the attributes of an element into a NamedNodeMap object. A NamedNodeMap object is used to hold a number of nodes, where each node is accessible by name or index number. We have used the getLength() and item() methods of the NamedNodeMap interface to fetch all attribute nodes. The getLength() method returns the total number of nodes in a NamedNodeMap and the item() method returns the node at a particular index.
- We have used the getNamespaceURI() method to get the namespace URI of each element. Recall from the earlier discussion that the createElementNS() method creates an element with a namespace URI and a tag name. The getNamespaceURI() method returns the same URI.
- We have used the getPrefix() method to fetch the namespace prefix of all elements.
- The Node.getNodeType() method tells the type of a node (e.g. whether a node is a text node or an element node). We have used this method to differentiate text nodes from element nodes.
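A recursive serializer built from the methods listed above could look something like the sketch below. It is an approximation, not the getElementAsText() method of Listing 2: the class name is hypothetical, and the sketch assumes the tree was built with createElementNS() and carries no explicit xmlns attributes. For simplicity it writes a namespace declaration on every namespaced element, which is redundant but well-formed, and it ignores node types other than elements and text.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical sketch of a recursive element-to-text method.
public class ElementAsTextSketch {

    public static String getElementAsText(Element element) {
        StringBuilder xml = new StringBuilder();
        xml.append('<').append(element.getTagName());

        // Declare the element's own namespace (repeated on every element).
        String uri = element.getNamespaceURI();
        if (uri != null) {
            String prefix = element.getPrefix();
            xml.append(prefix == null ? " xmlns" : " xmlns:" + prefix)
               .append("=\"").append(escape(uri)).append('"');
        }

        // NamedNodeMap: walk the attributes by index.
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            xml.append(' ').append(attribute.getNodeName())
               .append("=\"").append(escape(attribute.getNodeValue())).append('"');
        }
        xml.append('>');

        // Recurse into child elements; copy text nodes; skip everything else.
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                xml.append(getElementAsText((Element) child));
            } else if (child.getNodeType() == Node.TEXT_NODE) {
                xml.append(escape(child.getNodeValue()));
            }
        }
        xml.append("</").append(element.getTagName()).append('>');
        return xml.toString();
    }

    /** Minimal escaping of markup characters in text and attribute values. */
    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElementNS("http://www.cityportal.com", "CityName");
        root.appendChild(doc.createTextNode("Karachi"));
        doc.appendChild(root);
        System.out.println(getElementAsText(doc.getDocumentElement()));
    }
}
```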
The main() method in Listing
2 simulates a simple SOAP application. We have instantiated a
DataWrapper class and called its add() method
once to add one parameter. We have then instantiated a
SOAPMessage object and passed the DataWrapper
object to the SOAPMessage constructor. Listing
3 shows the resulting SOAP message.
Some Important DOM Features
This section demonstrates some important DOM features of Xerces that are not covered in the sample SOAP application of the previous section.
Copying DOM Nodes from one document into another
Have a look at Listing
4, which is a simple Java class named DOMCopySample.java. The
main() method of this class demonstrates how to copy DOM
nodes from one document into another.
Notice from Listing
4 that we have used the parse() method of the
DocumentBuilder object to load an XML file into the DOM
Document object named sourceDoc. The name of the
file that the parse() method will parse is "inputXML.xml". We
have shown the "inputXML.xml" file in Listing
5, which contains several invoice elements.
Recall that when we were creating the SOAP message document in Listing
2, we used the newDocument() method of the
DocumentBuilder class to create an empty DOM document with no
XML data. You will use the parse() method (instead of the
newDocument() method) when you want to create a DOM document
from an existing XML file or an input data stream containing XML data. The
parse() method parses the input XML data, loads the data into
a DOM Document object, and returns the Document
object.
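As a rough illustration of parse(), the sketch below loads XML from an input stream. The class name ParseSketch and the inline invoice markup are made up for the example; the actual contents of inputXML.xml are in Listing 5 and are not reproduced here.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demonstration class, not taken from the article's listings.
public class ParseSketch {

    /** Parses XML text into a DOM Document via an input stream. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        // For a file on disk you could instead call
        // builder.parse(new java.io.File("inputXML.xml")).
        return builder.parse(in);
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample data standing in for inputXML.xml.
        Document sourceDoc =
                parse("<invoices><invoice id=\"1\"/><invoice id=\"2\"/></invoices>");
        System.out.println("Root: " + sourceDoc.getDocumentElement().getTagName());
    }
}
```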
After loading the XML file into the sourceDoc object, we have
called the getElementsByTagName() method of the
Document object and passed "invoice" as a parameter. The
getElementsByTagName() method belongs to the DOM
Document interface. It takes the name of an element as a
parameter and returns a NodeList object, which contains a
list of all elements in the DOM document that have names matching the
input parameter to the getElementsByTagName() method call.
NodeList is a DOM interface, which exposes the abstract
functionality of a list of nodes. It contains just two methods,
getLength() and item(int index). The
getLength() method returns the number of nodes in the
NodeList and the item(int index) method returns
the node at a particular index.
Some readers may want to compare the NodeList interface with
the NamedNodeMap interface discussed earlier. The main
difference is that you cannot access individual nodes in a
NodeList by names of nodes, while you can do this in a
NamedNodeMap.
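The difference between the two interfaces is easy to see side by side. In the sketch below, the class name, the sample markup, and the "id" attribute are all assumptions made for illustration. The NodeList is walked by index, while the attribute is fetched from the NamedNodeMap by name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical demonstration class, not taken from the article's listings.
public class NodeListSketch {

    /** Collects the (assumed) "id" attribute of every invoice element. */
    public static List<String> invoiceIds(Document doc) {
        List<String> ids = new ArrayList<>();
        NodeList invoices = doc.getElementsByTagName("invoice");
        for (int i = 0; i < invoices.getLength(); i++) {
            Node invoice = invoices.item(i);                 // NodeList: index only
            NamedNodeMap attributes = invoice.getAttributes();
            Node id = attributes.getNamedItem("id");         // NamedNodeMap: by name
            ids.add(id == null ? null : id.getNodeValue());
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample data.
        String xml = "<invoices><invoice id=\"1\"/><invoice id=\"2\"/></invoices>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(invoiceIds(doc));
    }
}
```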
After getting the NodeList object in Listing
4, we have created a new empty DOM document object named
targetDoc. We have then created an
invoiceWrapper element, which serves as the root element of
the newly created targetDoc object.
Next we have taken each element in the NodeList and passed it
to the importNode() method of the targetDoc
object. The importNode() method imports a node from one
document into another document. It takes two parameters. The first
parameter is a node which you want to import from some other DOM
document. The second parameter is of boolean type. If the second parameter
is true, the importNode() method will import the node along
with all its child nodes (i.e. the complete tree of nodes whose root
starts at the node being imported). If the second parameter is false, the
importNode() method only imports the node without any of its
children.
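To see the effect of the boolean parameter, consider the following sketch. It is not Listing 4: the class name, the sample markup, and the amount child element are assumptions. Note that importNode() only creates a copy owned by the target document; the copy still has to be attached with appendChild(), and the source document is left unchanged. A shallow import of an element still brings its attributes along, only the child nodes are dropped.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical demonstration class, not taken from the article's listings.
public class ImportNodeSketch {

    /** Copies every invoice element of sourceDoc under a new invoiceWrapper root. */
    public static Document copyInvoices(Document sourceDoc, boolean deep) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document targetDoc = builder.newDocument();
        Element wrapper = targetDoc.createElement("invoiceWrapper");
        targetDoc.appendChild(wrapper);

        NodeList invoices = sourceDoc.getElementsByTagName("invoice");
        for (int i = 0; i < invoices.getLength(); i++) {
            // deep == true copies the whole subtree, false copies the node only.
            Node imported = targetDoc.importNode(invoices.item(i), deep);
            wrapper.appendChild(imported);
        }
        return targetDoc;
    }

    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample data.
        Document sourceDoc = parse("<invoices>"
                + "<invoice id=\"1\"><amount>10</amount></invoice>"
                + "<invoice id=\"2\"><amount>20</amount></invoice>"
                + "</invoices>");
        Node deepCopy = copyInvoices(sourceDoc, true)
                .getDocumentElement().getFirstChild();
        Node shallowCopy = copyInvoices(sourceDoc, false)
                .getDocumentElement().getFirstChild();
        System.out.println("Deep copy children: " + deepCopy.getChildNodes().getLength());
        System.out.println("Shallow copy children: " + shallowCopy.getChildNodes().getLength());
    }
}
```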
After importing the invoice elements from
sourceDoc to targetDoc, we have appended the
imported elements as children of the invoiceWrapper
element. Listing
6 shows what targetDoc looks like after importing all the
invoice nodes of the sourceDoc (the inputXML.xml
file of Listing
5).