Architectural Style

August 15, 2001

Leigh Dodds

This week the XML-Deviant summarizes an XML-DEV discussion concerning the utility and shortcomings of XSLT.

Design Choices

Fred Gomez prompted an interesting discussion this week by describing a real-world scenario of a simple web-based XML document management system, asking some key questions of the XML-DEV members. What database should be used? What presentation technology, with supporting tools, could be considered best of breed?

The best place to seek answers to the first of these questions is in Ronald Bourret's XML Databases paper and in his related XML Database product review. As the Deviant has explored in the past, there are many different opinions on how best to mix XML with a database. However, it was the second half of Gomez's inquiry that prompted the most response, much of it criticizing XSLT as a presentation tool and raising strong echoes of the XSL versus CSS debates that have been prevalent over the last two years. See "XSL and CSS: One Year Later" for the most recent coverage of this issue.

Alaric Snell questioned the selection of XSLT over a programming language, like PHP, to generate presentation content.

...What will it gain you as opposed to a set of PHP pages that talk directly to the database? Unless you have particular programming experience with XSLT and not PHP/ASP/CGIs/whatever, I find it easier to set up PHP under Apache than to set up an XSLT engine and put things in place to make a web server invoke it against XML pulled from an SQL database.

Snell suggested managing the separation of content and presentation through use of appropriate function libraries that can be invoked as required by a web developer, but written by an application developer.

Michael Champion picked up the ball here and produced a list of benefits of using XML/XSLT in the presentation layer, citing strong separation of content from presentation, platform-independence, interoperability, and a declarative approach as some obvious wins. But not everyone agreed.

The XSLT Hammer

Sean McGrath was particularly critical of the usefulness of XSLT. He claimed that its limitations are quickly reached in even simple tasks:

In my experience, be it styling for WML or schema-to-schema transformations you hit the limits of any sandboxed, declarative syntax such as XSLT really, really fast.

XSLT is an example of a 20/80 point technology :-) It gets you 80% towards a solution quickly but makes the remaining 20% either impossible or so hard that the time it takes to get that last 20% done, wipes out the gains you made on the first 80%!

In a subsequent message, McGrath cited both the XSLT specification's statement that XSLT is not a "general-purpose" XML transformation tool and recent comments from James Clark that outline some of XSLT's limitations. Outlining his own experiences, McGrath also made a dire forecast for those who rely too much on XSLT.

Unless the work I do with XML (both internally and consulting with some thumpingly big international corporations) is very unrepresentative, the limits of XSLT hit hard and very fast in the development cycle.

Most non-textbook XML transformations I am involved with require either a) PCDATA based manipulations and b) external integration e.g. dbms, web services etc. Insofar as they are possible with XSLT they are complex and read-only at best. In many cases they are just not possible at all.

I repeat my assertion made in an earlier post that the irrational exuberance about XSLT will be the cause of wholesale server-side re-writes of XML transformation systems in the medium term.

Paul Tchistopolskii believed that while there may be problems with XSLT, the real issue is that its capabilities have been over-sold.

I think that the problem with XSLT is that XSLT is often misleading, pretending to be more "powerful" and "portable" than it actually is.

This statement does *not* equal to "XSLT is a us[ele]ess crap, don't use it".

In the same message Tchistopolskii expanded on his comments, claiming that there is a lot of unneeded complexity in XSLT and suggesting some ways to approach the learning curve when using the technology.

The difficulties of learning XSLT were mentioned by several contributors, particularly getting to grips with a template-driven rather than a procedural approach. However, Don Park asserted that not only is XSLT low on the list of things that most developers are keen to learn, but there's no real need to learn it at all.

The problem is that there is no real need to learn XSLT. Observe that:

  1. People had to learn HTML and JavaScript because there were no alternatives.
  2. Learning HTML and JavaScript lead directly to XML and DOM.
  3. Learning an API like DOM is far easier [than learning] a new declarative language like XSLT.
  4. Once you learned DOM, there is no real need to learn XSLT.

While DOM-based solutions are harder to implement and maintain, there are plenty of people available with DOM expertise.

While there are no doubt many more people with expertise in DOM than in XSLT, the sheer volume of traffic on XSL-List demonstrates that there is significant interest in XSLT as an XML transformation tool. But, as several people observed, judging from the questions being asked in that forum, XSLT is being pressed into uses for which it wasn't designed. It was never meant as a programming language in its own right. Problems encountered when attempting to use it as such can't reasonably be leveled against XSLT. They should instead be attributed to misunderstanding (or miscommunication) of the purpose of the technology or, perhaps, to the exuberance that accompanies a developer getting to grips with a new tool. What's that old adage about the hammer and the nail?

Common XSLT

Not everyone had negative comments about XSLT. Francis Norton acknowledged that there is a learning curve involved but noted that learning XPath gets you most of the way there.

There is a learning process, but everyone using XML needs to learn XPath anyway (*please* don't tell me anyone is seriously programming complex transformations by using pure DOM navigation) and once you've got that, the rest of XSLT isn't that indigestible -- certainly no more of a leap than going from sequential to event-based programming.

Norton cited the success of SQL as another declarative technology that has reached wide acceptance after initial criticisms about complexity. James Strachan made the point that there's a great deal to be gained from XPath alone.
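To make the XPath point concrete, here is a minimal sketch, not taken from the thread, that answers the same query twice: once by hand-walking a DOM tree and once with a single path expression. It uses Python's standard library, whose ElementTree module supports only a subset of XPath. The sample document and the function names are invented for illustration.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical sample document, invented for this illustration.
SAMPLE = """<library>
  <book year="1999"><title>XSLT Basics</title></book>
  <book year="2001"><title>XPath in Practice</title></book>
  <magazine year="2001"><title>Markup Monthly</title></magazine>
</library>"""


def titles_with_dom(xml_text, year):
    """Collect titles of <book> elements for a given year by walking the DOM."""
    doc = minidom.parseString(xml_text)
    titles = []
    for node in doc.documentElement.childNodes:
        # Skip whitespace text nodes and anything that is not a <book>.
        if node.nodeType != node.ELEMENT_NODE or node.tagName != "book":
            continue
        if node.getAttribute("year") != year:
            continue
        for child in node.childNodes:
            if child.nodeType == child.ELEMENT_NODE and child.tagName == "title":
                titles.append(child.firstChild.data)
    return titles


def titles_with_path(xml_text, year):
    """Same query as one path expression (ElementTree's XPath subset)."""
    root = ET.fromstring(xml_text)
    return [t.text for t in root.findall(f"./book[@year='{year}']/title")]


if __name__ == "__main__":
    print(titles_with_dom(SAMPLE, "2001"))
    print(titles_with_path(SAMPLE, "2001"))
```

Both functions return the same list; the difference lies in how much navigation code has to be written and maintained to get there.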

Michael Champion believes that XSLT is one of the core XML technologies that actually meets its stated goals, and he noted that if you stay within sight of that goal then you can't go too far wrong.

...I'm predisposed to consider XSLT (along with "common" XML 1.0, DOM, and XPath) as part of the solid core of XML technologies that really more or less do what they are advertised to do and have a real track record of success.

I suspect that XSLT is indeed a "20/80 point technology" if you have to use it as a programming language, to access non-XML data via extensions, and generally explore the dark corners of the spec. My experience (admittedly with simple applications) is that there is a straightforward "Common XSLT" functionality in there that does hit the 80/20 point for reformatting XML into a conceptually similar but syntactically different XML or simple HTML format.
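The kind of reformatting Champion describes can be pictured as a small table of rules, one per element name, applied recursively to the input tree. The sketch below imitates that rule-driven style in Python rather than in XSLT itself; the element names, the RULES table, and the apply_templates function are all assumptions made for this example, not anything proposed in the discussion.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical rules: input element name -> (opening markup, closing markup).
RULES = {
    "article": ("<div>", "</div>"),
    "heading": ("<h1>", "</h1>"),
    "para": ("<p>", "</p>"),
    "em": ("<em>", "</em>"),
}


def apply_templates(elem):
    """Render one element with its rule, then recurse into its children.

    Elements without a rule contribute no markup but their text still
    passes through, loosely mirroring XSLT's built-in template rules.
    """
    open_tag, close_tag = RULES.get(elem.tag, ("", ""))
    parts = [open_tag, escape(elem.text or "")]
    for child in elem:
        parts.append(apply_templates(child))
        # Text that follows a child element belongs to the parent's content.
        parts.append(escape(child.tail or ""))
    parts.append(close_tag)
    return "".join(parts)


def to_html(xml_text):
    """Parse the input document and render it from the root down."""
    return apply_templates(ET.fromstring(xml_text))


if __name__ == "__main__":
    doc = "<article><heading>Style</heading><para>Keep it <em>simple</em>.</para></article>"
    print(to_html(doc))
```

The output has the same shape as the input, only with different element names, which is roughly the territory Champion's "Common XSLT" covers.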

Champion invited others to share their experiences with building XSLT-based applications. Soumitra Sengupta agreed that for its intended use, XSLT does fall within the 80/20 realm. Sengupta offered a brief list of guidelines to help produce simple XSLT transforms. He also cited Norm Walsh's DocBook XSLT stylesheets as a useful learning aid.

  1. Do not try to use it as a programming language
  2. Break up the transform into steps and concentrate on getting the input XML into closer and closer to the output
  3. Learn how to avoid using for-each, choose etc. Once you start using these frequently, you would be tempted to suddenly switch into a general purpose programming language mode. Use these as a last resort.
  4. If you are familiar with Lisp, you have a better chance
  5. Try not to take a flat XML and convert it into very deeply nested structures...Similarly it is hard to generate nested lists.

Many of these comments were reiterated elsewhere in the discussion, particularly the pipeline-based approach, which is a powerful architectural design pattern in its own right. Robin Berjon noted that this technique is extremely useful.

...I use the pipe model as much as possible. Most of the transforms I get to do become much simpler with at least two style sheets in a row rather than a single hairy one. This also makes it easy to insert a non-XSLT processor in the middle, for instance any kind of SAX processors, that will do transforms on some parts that XSLT isn't good at (eg PCDATA).
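A rough sketch of the pipe model Berjon describes might look like the following. Each stage is a plain Python function standing in for one step of the chain; in a real system the structural stages could be XSLT stylesheets and the text stage a SAX filter. The stage names, the element vocabulary, and the run_pipeline helper are hypothetical, chosen only to show the shape of the design.

```python
import xml.etree.ElementTree as ET
from functools import reduce


def drop_drafts(root):
    """Structural step: remove <item> children marked status="draft"."""
    for item in list(root.findall("item")):
        if item.get("status") == "draft":
            root.remove(item)
    return root


def normalize_text(root):
    """Text step: collapse runs of whitespace in each element's character data.

    This is the sort of PCDATA manipulation the thread says XSLT handles poorly.
    """
    for elem in root.iter():
        if elem.text and elem.text.strip():
            elem.text = " ".join(elem.text.split())
    return root


def rename_items(root):
    """Structural step: rename <item> to <entry> for the output vocabulary."""
    for item in list(root.iter("item")):
        item.tag = "entry"
    return root


def run_pipeline(xml_text, stages):
    """Parse once, pass the tree through each stage in order, then serialize."""
    root = reduce(lambda tree, stage: stage(tree), stages, ET.fromstring(xml_text))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    source = ('<items><item status="draft">skip me</item>'
              '<item status="final">  hello   world </item></items>')
    print(run_pipeline(source, [drop_drafts, normalize_text, rename_items]))
```

Because every stage takes a tree and returns a tree, a stage can be swapped, reordered, or replaced by a different kind of processor without touching its neighbors.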

In a separate lengthy posting, Berjon outlined some other lessons learned from working with XSLT.

...using XSLT to produce similarly or less complex documents than the original is easy, trying to produce a document that's richer than the original will likely lead to problems. That's to be expected, after all the T does stand for transformation, and not for destruction or for production. Whatever information you want in the output should be present more or less as-is in the input. This is obvious but a lot of people tend to forget it.

To summarize, then, it seems that XSLT may be a victim of its own success and could be in danger of being pressed into service in areas where it simply isn't a good fit. Luckily there is a rapidly growing body of experience which is beginning to show exactly where XSLT should and shouldn't be used. Developers are wise to draw on this experience, particularly if they want to avoid the 'large-scale rewrites' forecast by Sean McGrath. The message about using the right tool for the job may be an old one, but it's obviously one that can't be stressed too often.

The other side of this debate is that it's only now, roughly two years after the publication of the XSLT Recommendation, that we're beginning to see useful experience-based feedback on the utility of the technology. Makes you wary of tackling another new XML technology so readily, doesn't it? Tread carefully. And let's all look forward to the XML Schema debates of 2003.