jQuery and XML

October 15, 2007

Whether you're an admirer of AJAX, or one who can't stand all the hype, if you're a web developer you must admit that it's proven very useful in driving explosive competition among JavaScript utility libraries. And the embarrassment of riches keeps on growing. jQuery emerged a couple of years ago to great acclaim for its performance, elegant design, and handy features, and now it's one of the most popular JavaScript frameworks.

jQuery offers a lot of facilities, but it's best known for offering a cross-browser model for accessing and manipulating web page elements, which means you don't have to deal with the endless pain of the DOM. jQuery can be used for XML processing on the Web as well as HTML processing, and in this article I show some examples of this use. In developing code examples for this article I downloaded the uncompressed bundle of jQuery 1.2.1 and tested on Firefox 2.0.0.7.

The "X" in AJAX

The most cross-platform way to process XML these days is by using XMLHttpRequest. We can hope overall browser support of XML improves, but I'll start by showing how you can use jQuery to load and manipulate XML from an HTML web page. Listing 1 is a page representing a roster created from mailing labels in XML. When the page loads, all the user sees is a link. When the user clicks the link, the XML file is loaded and parsed to generate a list of names and IDs.

Listing 1. HTML with JavaScript to load XML for dynamic update

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"

   "http://www.w3.org/TR/html4/strict.dtd">

 <html>

   <head>

     <meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type">

     <title>Address book</title>

     <script src="jquery.js" type="text/javascript"></script>

     <script type="application/javascript">

     $(function() {

         $('#update-target a').click(function() {

             $.ajax({

                 type: "GET",

                 url: "labels.xml",

                 dataType: "xml",

                 success: function(xml) {

                     $(xml).find('label').each(function(){

                         var id_text = $(this).attr('id')

                         var name_text = $(this).find('name').text()



                         $('<li></li>')

                             .html(name_text + ' (' + id_text + ')')

                             .appendTo('#update-target ol');

                     }); //close each(

                 }

             }); //close $.ajax(

         }); //close click(

     }); //close $(

     </script>

   </head>

   <body>

     <p>

       <div id='update-target'>

         <a href="#">Click here to load addresses</a>

         <ol></ol>

       </div>

     </p>

   </body>

 </html>

Notice the two script elements. The first loads the jQuery library itself, and the second is the page-specific script. I'll briefly explain the outlines of the code, but to understand the example better you'll want to check out the jQuery tutorials. The $(function() {}) construct is a special jQuery construct; the code block within the curly braces will not be executed until the document's DOM is ready. This alone eliminates error-prone code every web developer has reinvented to prevent premature access of the DOM. The contained construct is also emblematic of jQuery: $('#update-target a') is a jQuery selector, much like a CSS selector, specifying anchor elements that are descendants of the element with ID update-target. jQuery selectors return a collection of all matching elements, and the click method sets a click event handler on every element in that collection (in this case there is only one). The event handler is given as a parameter to the click method, in this case a function that makes an AJAX invocation to load the XML document using the special jQuery $.ajax({}) construct.

jQuery of course gets a lot of headlines for making AJAX easier to work with, but that's not the point of this article. I kept the AJAX code bog-simple, and left out even error handling, so you can focus on what happens when the document is successfully loaded. This is the success item of the AJAX parameter structure, and it's another anonymous function. The resulting document is XML, and the function uses the jQuery API to process that XML. The find method applies a jQuery selector to a context, rather than the whole document, and in this case it's applied relative to the loaded XML, selecting all the label elements. The each method specifies an action to be performed over all the selected elements, and this functional approach to iteration is another of jQuery's strengths, especially when dealing with DOM structures. For each label the id attribute's value is kept in a variable, using the attr method, and the text content of the name element is kept in another variable. The $('<li></li>') expression constructs an element on the fly and the html method adds HTML, or in this case plain text, as the element content. Finally, this newly created element is added to the target ol element.

Listing 2 is the XML file that's loaded and parsed by the above code.

Listing 2: Address label XML

<?xml version="1.0" encoding="iso-8859-1"?>

 <labels>

   <label id='ep' added="2003-06-10">

     <name>Ezra Pound</name>

     <address>

       <street>45 Usura Place</street>

       <city>Hailey</city>

       <province>ID</province>

     </address>

   </label>

   <label id='tse' added="2003-06-20">

     <name>Thomas Eliot</name>

     <address>

       <street>3 Prufrock Lane</street>

       <city>Stamford</city>

       <province>CT</province>

     </address>

   </label>

   <label id="lh" added="2004-11-01">

     <name>Langston Hughes</name>

     <address>

       <street>10 Bridge Tunnel</street>

       <city>Harlem</city>

       <province>NY</province>

     </address>

   </label>

   <label id="co" added="2004-11-15">

     <name>Christopher Okigbo</name>

     <address>

       <street>7 Heaven's Gate</street>

       <city>Idoto</city>

       <province>Anambra</province>

     </address>

   </label>

 </labels>
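
As a rough sketch of the error handling left out of Listing 1, the script could grow an error callback alongside success. The helper names labelText and loadErrorText are hypothetical, introduced here only for illustration, and the typeof jQuery guard is an assumption that lets the two plain helpers run outside a browser. The sketch also swaps html for text when filling the list item, so that a name containing markup characters is shown literally rather than parsed.

```javascript
// Hypothetical helper: the text shown for one label in the list.
function labelText(name, id) {
    return name + ' (' + id + ')';
}

// Hypothetical helper: the message shown when the request fails.
function loadErrorText(url, textStatus) {
    return 'Could not load ' + url + ' (' + (textStatus || 'unknown error') + ')';
}

// Wire up the page only where jQuery is present (that is, in the browser).
if (typeof jQuery !== 'undefined') {
    (function($) {
        $(function() {
            $('#update-target a').click(function() {
                $.ajax({
                    type: 'GET',
                    url: 'labels.xml',
                    dataType: 'xml',
                    success: function(xml) {
                        $(xml).find('label').each(function() {
                            var label = $(this);
                            $('<li></li>')
                                // text() inserts the string literally, not as HTML
                                .text(labelText(label.find('name').text(), label.attr('id')))
                                .appendTo('#update-target ol');
                        });
                    },
                    // Called if the request fails or the response is not parseable XML
                    error: function(xhr, textStatus) {
                        $('<p></p>')
                            .text(loadErrorText('labels.xml', textStatus))
                            .appendTo('#update-target');
                    }
                });
                return false; // keep the "#" link from changing the URL
            });
        });
    })(jQuery);
}
```

Keeping the string building in plain functions is a design choice rather than anything jQuery requires; it just makes that part easy to check without a browser.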

Straight up XML

You can escape the safety of XML loaded through AJAX and still get a lot of use from jQuery, but it is certainly less steady ground because browsers are all over the place when dealing with XML directly. Listing 3 (designers.xml) is an example XML file you can load directly into a browser to generate a simple list display. The XML uses CSS to offer a reasonable look, and loads scripts to enable user interaction, in this case to simulate links.

Listing 3 (designers.xml): XML for browser viewing

<?xml version='1.0' encoding='utf-8'?>

 <?xml-stylesheet type="text/css" href="designers.css"?>

 <designers>

   <blurb>

     <designer homepage="http://doria.co.uk">Doria Ltd.</designer>

     of London

   </blurb>

   <blurb>

     <designer homepage="http://samsara.biz">Samsara Saris</designer>

     of Mumbai

   </blurb>

   <blurb>

     <designer homepage="http://pcp.co.uk">Pikeman Camouflage, Plc.</designer>

     of London

   </blurb>

   <blurb>

     <designer homepage="http://mandalay.co.jp">Mandalay</designer>

     of Tokyo

   </blurb>

   <xhtml:script xmlns:xhtml="http://www.w3.org/1999/xhtml"

                 src="jquery.js"

                 type="application/javascript"/>

   <xhtml:script xmlns:xhtml="http://www.w3.org/1999/xhtml"

                 src="designers.js"

                 type="application/javascript"/>

 </designers>

The script elements are in the XHTML namespace, and are processed as such by most modern browsers. The first script element loads the jQuery library and the second the more specific script. Listing 4 is the referenced CSS file (designers.css).

Listing 4 (designers.css): CSS for listing 3 XML

* { display: inherit; }



 designers { display: block; }



 blurb {

   margin: 1em;

   width: 20em;

   display: block;

 }



 designer {

   display: inline;

   text-decoration: none;

   color: green;

   border: thin blue solid;

 }



 script { display: none; }

This is pretty straightforward CSS for XML, including the trick of changing the default display setting to inherit and the top level to block. This keeps the document from displaying as one huge, run-in block of text. Notice also the rule for hiding the script elements. Listing 5 is the relevant bit for this article, designers.js.

Listing 5 (designers.js): JavaScript for listing 3 XML

$(function() {

     $('blurb').each(function() {

         $(this).find('designer').click(function() {

             //document.location.href = ... works in Mozilla but not Safari

             window.location.href = $(this).attr('homepage')

         }); //close click(

     }); //close each(

 }); //close $(



 //append and html don't work for XML in FF2.  replace does

This short and sweet number just simulates basic linking behavior on all the designer elements. $('blurb').each(function() {}) selects all the blurb elements and then iterates over the lot. $(this).find('designer').click(function() {}) sets a mouse click event handler on each designer element, and the handler simply grabs the homepage attribute and redirects to that location.

I've learned that it can become quite unpredictable which jQuery API bits will work once you are handling XML directly in this way. For example, the append and replace methods don't seem to work on XML elements, though the jQuery documentation provides no warning of this. The docs do warn that the html method doesn't work, but I expected that, anyway. But enough of jQuery does work in this scenario to get some useful work done.

Patching up XML Namespaces

jQuery has no selectors that understand XML namespaces. Even prior to version 1.2, when there was an option for XPath-like selectors, there was no namespace support. This doesn't mean you can't use jQuery to process XML with namespaces. It just means you may sometimes have to take the escape hatch to the DOM. Listing 6 is just the first relevant bits of listing 3 modified to use a default namespace.

Listing 6: XML in default namespace (excerpts)

 <?xml version='1.0' encoding='utf-8'?>

 <?xml-stylesheet type="text/css" href="designers.css"?>

 <designers xmlns='http://example.com/designers'>

 ...

   <xhtml:script xmlns:xhtml="http://www.w3.org/1999/xhtml"

                 src="designers2.js"

                 type="application/javascript"/>

 </designers>

My first attempt at a script (designers2.js) that actually did a namespace-aware iteration over the {http://example.com/designers}blurb elements (to use James Clark's notation of namespace names) is listing 7.

Listing 7: Using jQuery to select by XML namespace


 var NS = 'http://example.com/designers';



 $(function() {

     //Iterate over all elements with local name 'blurb'

     $('blurb').each(function() {

         //Check that the element truly is in the right namespace

         if ($(this).get(0).namespaceURI == NS) {

             $(this).find('designer').click(function() {

                 window.location.href = $(this).attr('homepage')

             }); //close click(

         }

     }); //close each(

 }); //close $(

Here within the primary iterator over the selected result, I use an if statement to further check the namespace URI of the matched DOM node. To get the DOM node itself I use the get method, which takes an index and gives you the DOM node with that index, in document order, from the selected results. Once I have the DOM node I can use DOM methods and properties such as namespaceURI. Listing 7 is not quite the jQuery way, though. After a while with the library you quickly get into the habit of simplifying things into the likes of listing 8, which is functionally equivalent to listing 7.

Listing 8: Using jQuery's filter method to select by XML namespace

 var NS = 'http://example.com/designers';



 $(function() {

     //Iterate over all elements with local name 'blurb'

     $('blurb').filter(function(index) {

         //Check that the element truly is in the right namespace

         return $(this).get(0).namespaceURI == NS;

     }).each(function() {

         $(this).find('designer').click(function() {

             window.location.href = $(this).attr('homepage')

         }); //close click(

     }); //close each(

 }); //close $(

The filter method takes an arbitrary function. Clearly this code is not as neat as operations that fit entirely into jQuery's world-view, but it gets the job done. If you do a lot of this sort of thing you might want to define a standalone function, say nsmatch, which does the check and can be passed to filter by name, reducing verbosity.
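
A minimal sketch of what such a standalone function might look like follows. It assumes, as listing 8 does, that jQuery calls the filter function with this set to each matched DOM node; the body of nsmatch and the typeof jQuery guard are illustrative assumptions, not part of the scripts above.

```javascript
var NS = 'http://example.com/designers';

// Hypothetical helper: true when the node jQuery hands us (as `this`)
// is in the designers namespace. Meant to be passed to filter() by name.
function nsmatch() {
    return this.namespaceURI === NS;
}

// Wire up the page only where jQuery is present (that is, in the browser).
if (typeof jQuery !== 'undefined') {
    (function($) {
        $(function() {
            // Keep only blurb elements in the right namespace, then
            // attach the click handler to their designer descendants.
            $('blurb').filter(nsmatch).find('designer').click(function() {
                window.location.href = $(this).attr('homepage');
            });
        });
    })(jQuery);
}
```

Because find and click operate on the whole filtered collection, the explicit each from listing 8 is no longer needed in this variant.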

Wrap Up

JavaScript libraries are a matter of taste, and we can just thank our stars there is one for just about any taste. I came to enjoy jQuery because it made processing mainstream web content so much easier, and when I tried to make it do cool things with XML, I was pleased with how many things did just work, though some of the blind alleys were a bit unexpected. In this article I've given a quick overview of how you can use jQuery to process XML. I hope someone gets around to writing a plug-in that helps even more, providing the likes of namespace matching. jQuery is great work, and especially in the area of XML processing, it can only get greater.