DOM and SAX Are Dead, Long Live DOM and SAX
November 14, 2001
Most serious books, tutorials, or discussions of XML processing for programmers mention, even if only in passing, DOM and SAX, the two dominant APIs for handling XML. A general discussion of XML programming which failed to mention DOM and SAX would be as neglectful of its duty as would a monarchical subject who, upon entering the royal chambers, failed to acknowledge the presence of the King and Queen. Failing to acknowledge the potentate is simply one of the things one must never do.
Just as the permissible forms of obligatory acknowledgment of the royal personage are both highly ritualized and customary, so, too, are the forms with which DOM and SAX are customarily introduced. We will be told, one may rest assured, that DOM is a tree-based API, one which builds an in-memory representation of an XML instance. Likewise, we will be told that SAX is an event-driven API, one which, rather than building an in-memory representation of an XML instance, calls event handlers as it encounters, serially, particular features of that instance. The moral of this highly ritualized story is invariably: one may often wish to use DOM parsing, but SAX is helpful when the size of XML instances exceeds available system memory.
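For readers who have not seen the two styles side by side, here is a minimal sketch in Python, using only the standard library's minidom and xml.sax modules. The sample document and the function and class names are invented for illustration; nothing here comes from the XML-DEV thread.

```python
import xml.sax
from xml.dom.minidom import parseString

# Hypothetical sample document, made up for this sketch.
SAMPLE = "<order><item>pen</item><item>ink</item></order>"


def count_items_dom(text):
    # Tree style: parse the whole document into memory, then query the tree.
    document = parseString(text)
    return len(document.getElementsByTagName("item"))


class ItemCounter(xml.sax.ContentHandler):
    # Event style: the parser calls startElement once per opening tag.
    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.count += 1


def count_items_sax(text):
    # No tree is built; the handler only ever sees one event at a time.
    handler = ItemCounter()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.count
```

In the DOM version the program asks the questions; in the SAX version the parser drives and the program merely reacts.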
One might suppose that every competent programmer with real-world XML experience has a firm grasp of, if not full mastery of, both DOM and SAX, and that in consequence of such widespread competence there is very little of interest left to be said about them by and to an expert audience. And yet, as a recent XML-DEV discussion seems to have proved, things are neither always as one might reasonably suppose, nor always as purely technical as one might wish.
DOM and SAX In the Real World
The discussion commenced with a question posed by Len Bullard.
How often do you as experienced XML developers find people in your shop using DOM for work more appropriate to SAX? Have you asked them why and what do they say? What are the costs of picking the wrong API?
Innocent and simple enough. But Bullard is asking a very particular, rather interesting question: do the programmers you work with rely too much on DOM, using it at times when SAX is more appropriate?
The aggregate response to Bullard's question suggests that he was spot-on to ask it. Most respondents have seen less experienced programmers overusing DOM, and no one seemed to represent the contrary position, that of less experienced programmers overusing SAX.
The interesting follow-ons to Bullard's question are to think about why DOM is overused, why SAX isn't, and what the future of XML processing APIs might look like.
The Psychology of SAX
David Hunter's answer to Bullard was suggestive of an important reality -- namely, SAX is hard for some programmers.
A lot of programmers are not really used to event-based programming, as used by SAX. They're more comfortable working with an in-memory object model than in keeping track of context as events are passed in, etc.
Hunter voices here a theme that was repeated, with some variation, through the discussion. It seems that the event-driven nature of SAX processing discomforts some programmers enough that they use DOM processing when possible.
Many other diagnoses of this discomfort were offered. Mike Champion said,
The event processing paradigm is just plain foreign to most people who haven't dealt with low-level grammars/parsers since college, which describes the overwhelming majority of professional programmers, I suspect. (Hmmm, maybe I'm wrong ... the low-level GUI APIs are event driven ... but I'll bet lots of people can handle "OK/Cancel" button event handlers but would be overwhelmed by the detailed thought required to write a SAX application).
Michael Brennan extended this point by pointing out that the event-driven nature of SAX leads to programming by state-machine, with which some programmers also have difficulty.
...[T]he difficulty, here, is not just with the notion of event-based programming, but with conceptualizing the design of a component as a state-machine. I've found many developers have difficulty thinking in those terms for whatever reason.
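To make the state-machine point concrete, here is a hedged sketch in Python's standard xml.sax module. The catalog document and the handler name are hypothetical. The task is to collect titles of books but not of magazines, so the handler has to remember where it is in the document between events.

```python
import xml.sax

# Hypothetical sample document, made up for this sketch.
SAMPLE = (
    "<catalog>"
    "<book><title>XML Basics</title></book>"
    "<magazine><title>Markup Weekly</title></magazine>"
    "</catalog>"
)

TARGET = ["catalog", "book", "title"]


class BookTitleHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.path = []    # stack of open element names: the handler's state
        self.titles = []
        self._text = []

    def startElement(self, name, attrs):
        self.path.append(name)
        if self.path == TARGET:
            self._text = []

    def characters(self, content):
        # characters() may be called more than once per text run, so accumulate.
        if self.path == TARGET:
            self._text.append(content)

    def endElement(self, name):
        if self.path == TARGET:
            self.titles.append("".join(self._text))
        self.path.pop()


def book_titles(text):
    handler = BookTitleHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.titles
```

Even for so small a job, the logic is spread across three callbacks held together by shared state, which is just the kind of design Brennan says many developers find unnatural.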
It seems, then, that for some programmers, SAX's event-driven nature, and the kind of conceptual moves that it requires, are off-putting. And, of course, the flip side, which Mike Kay expressed nicely, is that DOM processing fits ways of conceptualizing program flow that many programmers find far more comfortable. As Kay says,
I think for many inexperienced programmers, the imperative navigational style, where their own program is in control and issues requests to other subsystems, is the only model they really feel comfortable with. It's a control thing, a perception that the job of the programmer is to tell the computer what to do next.
Perhaps it's not just that some programmers are uncomfortable with events and state machines but that SAX is, despite its many virtues and charms, a fairly spartan interface. Bob Hutchinson suggests as much.
I've found that SAX is anything but obvious to the programmers I've worked with, even programmers with extensive GUI experience...And even after being pointed to SAX they don't always have much of an idea of how to proceed. This isn't entirely their fault. We have nice frameworks for dealing with events generated by GUIs. With SAX there is no such thing, that I'm aware of. The developer is faced with a stream of events and no framework for dealing with them. Yes, I know that you can quickly put something together but I've been doing that for years, not every one has.
Two explanations for SAX discomfort ran through the discussion: inexperience and, well, psychology. Many people find SAX hard because it's new and odd. Others find it hard, even after it's no longer new, because it requires a kind of conceptualization which they simply do not favor. There's no right or wrong here, merely the well-known fact that people's brains are wired differently. People understand this pretty well in other fields, and it should not be surprising that it's relevant to computer programming too. No one who's struggled with, or helped others struggle through, for example, Scheme and recursive functions can deny that different programming paradigms often imply the need for different kinds of conceptual ability or propensity.
Michel Rodriguez summed this point nicely by saying,
I guess I am in the vast majority of programmers that find DOM-type (tree-oriented) processing much easier to grasp than SAX processing. It feels much easier to "be in control" of the document and to act on it than to let it drive my code.
The Social Dominance of DOM
Len Bullard, in a follow-up to his own question, gave a bit more insight into why he'd asked it.
I'm curious about this subject because I keep seeing people who learn just the barebones, get to DOM, then use it for everything...Is XML just that hard to learn, too obscure, too different, or is this just ossification brought on by years of copying code and not looking beneath?
Which offers another way of looking at this issue. In addition to the (for some) psychological oddness of SAX, there's also a kind of social dominance of the DOM. It is, after all, the W3C's blessed XML processing API. It has vastly more corporate marketing and sales and training resources behind it.
Some people who answered Bullard's question said that they knew XML programmers who were competent with DOM but simply did not know about SAX at all.
Several others mentioned the issue of programming platform dominance, which, in the XML world, comes down to Microsoft and Java, both of which are rather DOM-friendly, even if for different reasons.
Michael Brennan identified a crucial reason for DOM's wide use:
The case of MSXML offers another good example of why many programmers use the DOM. Microsoft's DOM includes integrated XPath support. Developers can load XML into a DOM, then easily query the structure with XPath to extract the data they are after. Switching to SAX adds substantial complexity to the code, which now has to deal with state management strategies to keep track of where it is in the document at any moment.
I've found many developers in the Microsoft world take the integrated XPath support for granted and don't realize that it is not a standard DOM feature (yet).
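The convenience Brennan describes is easy to illustrate. The sketch below uses Python's standard ElementTree rather than MSXML, and ElementTree is neither a W3C DOM nor a full XPath engine (it supports only a subset of XPath), so treat it as an analogy. The sample document and function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, made up for this sketch.
SAMPLE = (
    "<catalog>"
    "<book><title>XML Basics</title></book>"
    "<magazine><title>Markup Weekly</title></magazine>"
    "</catalog>"
)


def book_titles(text):
    # Load the whole document into memory, then let a path expression
    # do the navigation. The path is relative to the root <catalog> element.
    root = ET.fromstring(text)
    return [title.text for title in root.findall("./book/title")]
```

One path expression replaces all of the bookkeeping an event handler would need to know whether a given title belongs to a book or to a magazine.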
Which actually suggests that sometimes, perhaps often, XPath is just the right tool for the job, and its association -- whether formal or informal -- with DOM matters. But in looking for reasons for DOM's overuse at the expense of wider SAX usage, it's hard to overstate the importance of a simple fact: the dominant client-side computing platform, the dominant web browser, and one of the dominant server platforms are all Microsoft products, MSXML is widely used in all three, and MSXML is DOM- and XPath-friendly.
The utility of XPath suggests that its role in the repertoire of XML processing may well expand in the future. Again, Michael Brennan made this point as clearly as anyone.
I've been eyeing the dom4j, SAXPath, and Jaxen stuff with great interest, lately...[T]his notion of registering a handler to match subtrees based on XPath is very interesting. Using XPath as the glue between object models that can support an infoset abstraction is also very interesting. We commonly load XML into a DOM just so we can leverage XPath. We use tools for mapping XML elements/attributes to internal data structures and functions using XPath expressions. It would be great to have that same abstraction and ease of implementation without having to load a DOM to do it...
I hope this sort of approach gains wider acceptance and adoption. I think having the sort of abstraction that Jaxen affords offers far greater potential in the long run than looking toward the DOM (or even SAX) as the glue between XML and other object models.
In the open source community, XML programming often means Java programming. Of course open source languages, like Perl and Python and many others, have good or even excellent XML support. But Java is at the top of the heap in terms of number of tools, corporate support for those tools, number of training materials, and so on. While SAX support is very good in Java, there is an embarrassment of riches from which to choose when considering Java DOM programming, as Bob McWhirter suggested.
I think that in the open-source Java world, focus has been more on the infoset than on any given object-model. Since we have JDOM, dom4j, EXML, along with normal DOM, and only certain utilities are supported under certain models (ie, Xalan won't work with dom4j Documents directly), there's been a lot of work on translating one model to another.
Then, you have things like dom4j's ElementHandler interfaces, which allow folks who are used to processing object trees deal appropriately with very large datasets. You can register a handler to match particular subtrees. Do whatever processing you need (including XPath expressions), and then detach the sub-tree, freeing up memory for the rest of the parse...
In my experience, it's not just DOM vs SAX, but competition between the DOMs (sometimes mixing several in the same application) and SAX. And typically, dom4j's sub-tree mechanisms have kept me from having to venture into hard-to-maintain SAX code.
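The subtree-at-a-time pattern McWhirter describes is not unique to dom4j. As a rough analogue only, and not a rendering of dom4j's ElementHandler API, here is a sketch using iterparse from Python's standard ElementTree. The log document and function name are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, made up for this sketch.
SAMPLE = (
    b"<log>"
    b"<entry level='error'>disk full</entry>"
    b"<entry level='info'>started</entry>"
    b"<entry level='error'>timeout</entry>"
    b"</log>"
)


def error_messages(stream):
    messages = []
    # An "end" event fires once an element's subtree is complete, so each
    # <entry> can be treated as a small in-memory tree of its own.
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "entry":
            if elem.get("level") == "error":
                messages.append(elem.text)
            # Drop the entry's text, attributes, and children. The emptied
            # element itself stays attached to the root.
            elem.clear()
    return messages
```

Calling error_messages(io.BytesIO(SAMPLE)) processes one entry at a time, so memory use is governed mostly by the size of a single subtree rather than of the whole document.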
In short, when processing XML in Java, there are often good technical reasons why the DOM will work and the additional (for some) burdens of doing it the SAX way can be avoided. It's not at all clear that this richness of DOM implementations owes to a technical feature of Java itself; it's rather more likely that it's a function of Java's market dominance in server-side Internet projects, which was one of the first areas of obvious XML utility.
The Way Forward
Given this confluence of technical and social and psychological factors, the decision of whether to use DOM or SAX or both or neither is a lot more complicated than the standard account often suggests. The issues are much more complex than simply memory usage.
And it may be a strategically crucial decision, as Bob Hutchinson reminded us.
What's the consequence of getting it wrong? Serious trouble. You end up with slow, ugly, unmaintainable code. Worse, I've seen developers using the resultant mess to avoid using XML altogether (we really are still in the early days of XML).
So what, as Mike Champion said, is the way forward? Well, as with any other monarchy, there are always anti-monarchists hanging about, waiting to depose the King and Queen, desperate to offer the masses an alternative. As go kingdoms, so goes the XML world. Several alternatives to DOM and SAX were mentioned, including XML data binding, XML pull parsers, and other combinations of tree and event, in-memory and seriatim processing.
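One such combination of tree and event can be sketched with Python's standard xml.dom.pulldom module, offered here as an illustration and not as one of the specific tools named on the list. The program pulls events from the parser in its own loop and expands a node into a small DOM subtree only when it cares to. The sample document and function name are hypothetical.

```python
from xml.dom import pulldom

# Hypothetical sample document, made up for this sketch.
SAMPLE = (
    "<order>"
    "<item price='2'>pen</item>"
    "<item price='40'>ink</item>"
    "</order>"
)


def expensive_items(text, threshold):
    found = []
    stream = pulldom.parseString(text)
    # Pull style: the program asks for the next event; no callbacks involved.
    for event, node in stream:
        if event == pulldom.START_ELEMENT and node.tagName == "item":
            if int(node.getAttribute("price")) > threshold:
                # Build a DOM subtree for just this element, on demand.
                stream.expandNode(node)
                found.append(node.firstChild.data)
    return found
```

The loop keeps the programmer "in control," in Kay's sense, while never holding more than one expanded item in memory at a time.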
The XML development community doesn't need to be told that technical alternatives to dominant paradigms are important, but sometimes it may need to be reminded. That, I think, may be one of the virtues of the xml-dev list, which tends to hash and rehash and re-rehash the same technical issues. While that can often look like, and be, just wasted motion, it can also be the spark that ignites alternative approaches and ideas, and that's a very good thing.
Finally, it should be remembered, as Bob Hutchinson wisely pointed out, that despite the occasional bout of technical ennui, these really are the early days of XML. Despite their dominance, DOM and SAX each have their own warts and vices. It is fully to be expected that XML developers will eventually depose DOM and SAX from their high perch, but probably only by making them the foundation of every other high-level XML processing API of the future.
Two participants in the recent discussion suggested precisely that, and I will give them the last word. Paul Tchistopolskii predicted that new, high-level APIs are coming.
My prediction is that the era of low-level lexer (called SAX) and low-level model (called DOM) is over and there will be soon more high-level bindings on top of these low-level APIs (or not on top of them).
I think that asking developers to write all the code in terms of SAX or DOM APIs is like asking them to write programs in assembly language.
Michael Brennan concurred in principle.
I agree entirely. DOM and SAX will be the domain for applications doing generic XML processing. Developers trying to solve business problems will be using tools that abstract away all XML-specific APIs -- either using transformation technologies with high-level modeling tools or declarative mapping/transformation languages, or using data-binding technologies that hide XML beneath simpler and more familiar object models.
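What Brennan means by hiding XML beneath a more familiar object model can be suggested with a hand-written sketch in Python. Real data-binding tools typically generate this kind of mapping from a schema; the Customer class, the sample document, and the function name below are all hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical sample document, made up for this sketch.
SAMPLE = (
    "<customers>"
    "<customer><name>Ada</name><city>London</city></customer>"
    "<customer><name>Linus</name><city>Helsinki</city></customer>"
    "</customers>"
)


@dataclass
class Customer:
    name: str
    city: str


def load_customers(text):
    # The only code that knows the data is XML lives in this function.
    root = ET.fromstring(text)
    return [
        Customer(name=c.findtext("name"), city=c.findtext("city"))
        for c in root.findall("customer")
    ]
```

Code that calls load_customers works with plain Customer objects and never touches an element, an event, or a node.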
DOM and SAX have reigned for about as long as there have been XML documents to process. Let's hope they endure long enough to spawn more powerful and more graceful heirs.