jQuery and XML
October 15, 2007
Whether you're an admirer of AJAX, or one who can't stand all the hype, if you're a web developer you must admit that it's proven very useful in driving explosive competition among JavaScript utility libraries. And the embarrassment of riches keeps on growing. jQuery emerged a couple of years ago to great acclaim for its performance, elegant design, and handy features, and now it's one of the most popular JavaScript frameworks.
jQuery offers a lot of facilities, but it's best known for offering a cross-browser model for accessing and manipulating web page elements that means you don't have to deal with the endless pain of DOM. jQuery can be used for XML processing on the Web as well as HTML processing, and in this article I show some examples of this use. In developing code examples for this article I downloaded the uncompressed bundle of jQuery 1.2.1 and tested on Firefox 2.0.0.7.
The "X" in AJAX
The most cross-platform way to process XML these days is by using XMLHttpRequest. We can hope overall browser support of XML improves, but I'll start by showing how you can use jQuery to load and manipulate XML from an HTML web page. Listing 1 is a page representing a roster created from mailing labels in XML. When loaded all the user sees is a link. When the user clicks the link, the XML file is loaded and parsed to generate a list of names and IDs.
Listing 1. HTML with JavaScript to load XML for dynamic update
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
  <head>
    <meta content="text/html; charset=iso-8859-1" http-equiv="Content-Type">
    <title>Address book</title>
    <script src="jquery.js" type="text/javascript"></script>
    <script type="application/javascript">
    $(function() {
      $('#update-target a').click(function() {
        $.ajax({
          type: "GET",
          url: "labels.xml",
          dataType: "xml",
          success: function(xml) {
            $(xml).find('label').each(function(){
              var id_text = $(this).attr('id');
              var name_text = $(this).find('name').text();
              $('<li></li>')
                .html(name_text + ' (' + id_text + ')')
                .appendTo('#update-target ol');
            }); //close each(
          }
        }); //close $.ajax(
      }); //close click(
    }); //close $(
    </script>
  </head>
  <body>
    <p>
      <div id='update-target'>
        <a href="#">Click here to load addresses</a>
        <ol></ol>
      </div>
    </p>
  </body>
</html>
Notice the two script elements. The first loads the jQuery library itself, and the second is the page-specific script. I'll briefly explain the outlines of the code, but to understand the example better you'll want to check out the jQuery tutorials.
The $(function() {}) construct is a special jQuery construct; the code block within the curly braces will not be executed until the document's DOM is ready to be manipulated. This alone eliminates error-prone code every web developer has reinvented to prevent premature access of the DOM. The contained construct is also emblematic of jQuery: $('#update-target a') is a jQuery selector, much like a CSS selector, specifying anchor descendants of the element with ID update-target. jQuery selectors return a collection of all matching elements, and the click method sets a click event handler on every element in that collection (in this case only one). The event handler is given as a parameter to the click method, in this case an AJAX invocation to load the XML document using the special jQuery $.ajax({}) construct.
jQuery of course gets a lot of headlines for making AJAX easier to work with, but that's not the point of this article. I kept the AJAX code bog-simple, and left out even error handling, so you can focus on what happens when the document is successfully loaded. This is the success item of the AJAX parameter structure, and it's another anonymous function. The resulting document is XML, and the function uses the jQuery API to process that XML. The find method applies a jQuery selector to a context, rather than the whole document, and in this case it's applied relative to the loaded XML, selecting all the label elements. The each method specifies an action to be performed over all the selected elements, and this functional approach to iteration is another of jQuery's strengths, especially when dealing with DOM structures. For each label the id attribute's value is kept in a variable, using the attr method, and the text content of the name element is kept in another variable. The $('<li></li>') expression constructs an element on the fly and the html method adds HTML, or in this case plain text, as the element content. Finally, this newly created element is added to the target ol element.
Listing 2 is the XML file that's loaded and parsed by the above code.
Listing 2: Address label XML
<?xml version="1.0" encoding="iso-8859-1"?>
<labels>
  <label id='ep' added="2003-06-10">
    <name>Ezra Pound</name>
    <address>
      <street>45 Usura Place</street>
      <city>Hailey</city>
      <province>ID</province>
    </address>
  </label>
  <label id='tse' added="2003-06-20">
    <name>Thomas Eliot</name>
    <address>
      <street>3 Prufrock Lane</street>
      <city>Stamford</city>
      <province>CT</province>
    </address>
  </label>
  <label id="lh" added="2004-11-01">
    <name>Langston Hughes</name>
    <address>
      <street>10 Bridge Tunnel</street>
      <city>Harlem</city>
      <province>NY</province>
    </address>
  </label>
  <label id="co" added="2004-11-15">
    <name>Christopher Okigbo</name>
    <address>
      <street>7 Heaven's Gate</street>
      <city>Idoto</city>
      <province>Anambra</province>
    </address>
  </label>
</labels>
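Listing 1 deliberately omits error handling. As a minimal sketch of where it would go, the following builds the same $.ajax settings plus an error callback. The helper name makeLabelsRequest and the wording of the message are hypothetical, not part of the original example; the error option itself is a standard $.ajax setting that receives the XMLHttpRequest object and a status string.

```javascript
// Hypothetical helper: build the settings object for $.ajax, adding the
// error callback that listing 1 leaves out.
function makeLabelsRequest(onXml, onFailure) {
  return {
    type: 'GET',
    url: 'labels.xml',
    dataType: 'xml',
    success: onXml,
    // jQuery calls this with the XMLHttpRequest object and a status string
    error: function(xhr, textStatus) {
      onFailure('Could not load labels.xml (' + textStatus +
                ', HTTP ' + xhr.status + ')');
    }
  };
}

// Browser-only wiring; skipped when jQuery is not present.
if (typeof jQuery !== 'undefined') {
  jQuery(function() {
    jQuery('#update-target a').click(function() {
      jQuery.ajax(makeLabelsRequest(
        function(xml) {
          // process the XML document exactly as in listing 1
        },
        function(message) {
          // show the failure in the same list the labels would go in
          jQuery('<li></li>').text(message).appendTo('#update-target ol');
        }
      ));
    });
  });
}
```

Keeping the settings in a function of their own is only a convenience for this sketch; the error item could just as well be written inline next to success in listing 1.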
Straight up XML
You can escape the safety of XML loaded through AJAX and still get a lot of use from jQuery, but it is certainly less steady ground because browsers are all over the place when dealing with XML directly. Listing 3 (designers.xml) is an example XML file you can load directly into a browser to generate a simple list display. The XML uses CSS to offer a reasonable look, and loads scripts to enable user interaction, in this case to simulate links.
Listing 3 (designers.xml): XML for browser viewing
<?xml version='1.0' encoding='utf-8'?>
<?xml-stylesheet type="text/css" href="designers.css"?>
<designers>
  <blurb>
    <designer homepage="http://doria.co.uk">Doria Ltd.</designer>
    of London
  </blurb>
  <blurb>
    <designer homepage="http://samsara.biz">Samsara Saris</designer>
    of Mumbai
  </blurb>
  <blurb>
    <designer homepage="http://pcp.co.uk">Pikeman Camouflage, Plc.</designer>
    of London
  </blurb>
  <blurb>
    <designer homepage="http://mandalay.co.jp">Mandalay</designer>
    of Tokyo
  </blurb>
  <xhtml:script xmlns:xhtml="http://www.w3.org/1999/xhtml" src="jquery.js" type="application/javascript"/>
  <xhtml:script xmlns:xhtml="http://www.w3.org/1999/xhtml" src="designers.js" type="application/javascript"/>
</designers>
The script elements are in the XHTML namespace, and are processed as such by most modern browsers. The first script element loads the jQuery library and the second the more specific script. Listing 4 is the referenced CSS file (designers.css).
Listing 4 (designers.css): CSS for listing 3 XML
* { display: inherit; }
designers { display: block; }
blurb {
  margin: 1em;
  width: 20em;
  display: block;
}
designer {
  display: inline;
  text-decoration: none;
  color: green;
  border: thin blue solid;
}
script { display: none; }
This is pretty straightforward CSS for XML, including the trick of changing the default display setting to inherit and the top level to block. This minimizes the huge, run-in block of text display effect. Notice also the rule for hiding the script elements. Listing 5 is the relevant bit for this article, designers.js.
Listing 5 (designers.js): JavaScript for listing 3 XML
$(function() {
  $('blurb').each(function() {
    $(this).find('designer').click(function() {
      //document.location.href = ... works in Mozilla but not Safari
      window.location.href = $(this).attr('homepage');
    }); //close click(
  }); //close each(
}); //close $(
//append and html don't work for XML in FF2. replace does
This short and sweet number just simulates basic linking behavior on all the designer elements. $('blurb').each(function() {}) selects all the blurb elements and then iterates over the lot. $(this).find('designer').click(function() {}) sets a mouse click event handler on each designer element, and the handler simply grabs the homepage attribute and redirects to that location.
I've learned that it can become quite unpredictable which jQuery API bits will work once you are handling XML directly in this way. For example, the append and replace methods don't seem to work on XML elements, though the jQuery documentation provides no warning of this. The docs do warn that the html method doesn't work, but I expected that anyway. Still, enough of jQuery does work in this scenario to get some useful work done.
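When a jQuery method misbehaves on XML nodes, one possible fallback is to drop to the DOM for just that step. The following is a minimal sketch under that assumption and has not been tested against the browsers mentioned above. The helper name appendXmlElement and the note element it adds are hypothetical; the sketch relies only on the standard DOM methods createElementNS, createTextNode, and appendChild.

```javascript
// Hypothetical helper: append a new child element with text content to an
// XML node using plain DOM calls instead of jQuery's append().
function appendXmlElement(parent, namespaceURI, name, text) {
  var doc = parent.ownerDocument;
  var child = doc.createElementNS(namespaceURI, name);
  child.appendChild(doc.createTextNode(text));
  parent.appendChild(child);
  return child;
}

// Browser-only usage; skipped where jQuery or a document is unavailable.
if (typeof jQuery !== 'undefined' && typeof document !== 'undefined') {
  jQuery(function() {
    jQuery('blurb').each(function() {
      // Inside each(), 'this' is the DOM node itself. The elements in
      // listing 3 are in no namespace, hence the null namespace URI.
      appendXmlElement(this, null, 'note', ' (click the name to visit)');
    });
  });
}
```

jQuery still does the selecting and iterating here; only the node creation is handed to the DOM.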
Patching up XML Namespaces
jQuery has no selectors that understand XML namespaces. Even prior to version 1.2, when there was an option for XPath-like selectors, there was no namespace support. This doesn't mean you can't use jQuery to process XML with namespaces. It just means you may sometimes have to take the escape hatch to DOM. Listing 6 shows just the first relevant bits of listing 3, modified to use a default namespace.
Listing 6: XML in default namespace (excerpts)
<?xml version='1.0' encoding='utf-8'?>
<?xml-stylesheet type="text/css" href="designers.css"?>
<designers xmlns='http://example.com/designers'>
  ...
  <xhtml:script xmlns:xhtml="http://www.w3.org/1999/xhtml" src="designers2.js" type="application/javascript"/>
</designers>
My first attempt at a script (designers2.js) that actually did a namespace-aware iteration over the {http://example.com/designers}blurb elements (to use James Clark's notation for namespace names) is listing 7.
Listing 7: Using jQuery to select by XML namespace
var NS = 'http://example.com/designers';
$(function() {
  //Iterate over all elements with local name 'blurb'
  $('blurb').each(function() {
    //Check that the element truly is in the right namespace
    if ($(this).get(0).namespaceURI == NS) {
      $(this).find('designer').click(function() {
        window.location.href = $(this).attr('homepage');
      }); //close click(
    }
  }); //close each(
}); //close $(
Here within the primary iterator over the selected result, I use an if statement to further check the namespace URI of the matched DOM node. To get the DOM node itself I use the get method, which takes an index and gives you the DOM node with that index, in document order, from the selected results. Once I have the DOM node I can use DOM methods and properties such as namespaceURI. Listing 7 is not quite the jQuery way, though. After a while with the library you quickly get into the habit of simplifying things into the likes of listing 8, which is functionally equivalent to listing 7.
Listing 8: Using jQuery's filter method to select by XML namespace
var NS = 'http://example.com/designers';
$(function() {
  //Iterate over all elements with local name 'blurb'
  $('blurb').filter(function(index) {
    //Check that the element truly is in the right namespace
    return $(this).get(0).namespaceURI == NS;
  }).each(function() {
    $(this).find('designer').click(function() {
      window.location.href = $(this).attr('homepage');
    }); //close click(
  }); //close each(
}); //close $(
The filter method takes an arbitrary function. Clearly this code is not as neat as operations that fit entirely into jQuery's world-view, but it gets the job done. If you do a lot of this sort of thing you might want to define a standalone function, say nsmatch, which does the check and can be passed to filter by name, reducing verbosity.
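A minimal sketch of that nsmatch idea might look like the following. The function body is an assumption about what such a helper would contain, not code from a tested script. Because filter invokes its callback with this already set to the DOM node, the sketch reads namespaceURI directly instead of going through $(this).get(0).

```javascript
var NS = 'http://example.com/designers';

// Standalone check that can be handed to filter() by name.
// jQuery calls it with 'this' set to each matched DOM node.
function nsmatch() {
  return this.namespaceURI == NS;
}

// Browser-only wiring, equivalent in intent to listing 8.
if (typeof jQuery !== 'undefined') {
  jQuery(function() {
    jQuery('blurb').filter(nsmatch).find('designer').click(function() {
      window.location.href = jQuery(this).attr('homepage');
    });
  });
}
```

The each call from listing 8 is dropped here because find and click already operate on every element in the filtered set.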
Wrap Up
JavaScript libraries are a matter of taste, and we can just thank our stars there is one for just about any taste. I came to enjoy jQuery because it made processing mainstream web content so much easier, and when I tried to make it do cool things with XML, I was pleased with how many things did just work, though some of the blind alleys were a bit unexpected. In this article I've given a quick overview of how you can use jQuery to process XML. I hope someone gets around to writing a plug-in that helps even more, providing the likes of namespace matching. jQuery is great work, and especially in the area of XML processing, it can only get greater.