This week the XML-Deviant summarizes an XML-DEV discussion concerning the utility and shortcomings of XSLT.
Fred Gomez prompted an interesting discussion this week by describing a real-world scenario of a simple web-based XML document management system, asking some key questions of the XML-DEV members. What database should be used? What presentation technology, with supporting tools, could be considered best of breed?
The best place to seek answers to the first of these questions is in Ronald Bourret's XML Databases paper and in his related XML Database product review. As the Deviant has explored in the past, there are many different opinions on how best to mix XML with a database. However, it was the second half of Gomez's inquiry that prompted the most response, much of it criticizing XSLT as a presentation tool, raising strong echoes of the XSL versus CSS debates that have been prevalent over the last two years. See "XSL and CSS: One Year Later" for the most recent coverage of this issue.
Alaric Snell questioned the selection of XSLT over a programming language, like PHP, to generate presentation content.
...What will it gain you as opposed to a set of PHP pages that talk directly to the database? Unless you have particular programming experience with XSLT and not PHP/ASP/CGIs/whatever, I find it easier to set up PHP under Apache than to set up an XSLT engine and put things in place to make a web server invoke it against XML pulled from an SQL database.
Snell suggested managing the separation of content and presentation through use of appropriate function libraries that can be invoked as required by a web developer, but written by an application developer.
Michael Champion picked up the ball here and produced a list of benefits of using XML/XSLT in the presentation layer, citing strong separation of content from presentation, platform-independence, interoperability, and a declarative approach as some obvious wins. But not everyone agreed.
The XSLT Hammer
Sean McGrath was particularly critical of the usefulness of XSLT. He claimed that its limitations are quickly reached in even simple tasks:
In my experience, be it styling for WML or schema-to-schema transformations you hit the limits of any sandboxed, declarative syntax such as XSLT really, really fast.
XSLT is an example of a 20/80 point technology :-) It gets you 80% towards a solution quickly but makes the remaining 20% either impossible or so hard that the time it takes to get that last 20% done, wipes out the gains you made on the first 80%!
In a subsequent message, McGrath cited both the XSLT specification's statement that XSLT is not a "general-purpose" XML transformation tool and recent comments from James Clark that outline some of XSLT's limitations. Outlining his own experiences, McGrath also made a dire forecast for those who rely too much on XSLT.
Unless the work I do with XML (both internally and consulting with some thumpingly big international corporations) is very unrepresentative, the limits of XSLT hit hard and very fast in the development cycle.
Most non-textbook XML transformations I am involved with require either a) PCDATA based manipulations and b) external integration e.g. dbms, web services etc. Insofar as they are possible with XSLT they are complex and read-only at best. In many cases they are just not possible at all.
I repeat my assertion made in an earlier post that the irrational exuberance about XSLT will be the cause of wholesale server-side re-writes of XML transformation systems in the medium term.
Paul Tchistopolskii believed that while there may be problems with XSLT, the real issue is that its capabilities have been over-sold.
I think that the problem with XSLT is that XSLT is often misleading, pretending to be more "powerful" and "portable" than it actually is.
This statement does *not* equal to "XSLT is a us[ele]ess crap, don't use it".
In the same message Tchistopolskii expanded on his comments, claiming that there is a lot of un-needed complexity in XSLT, suggesting some ways to approach the learning curve when using the technology.
The difficulties of learning XSLT were mentioned by several contributors, particularly getting to grips with a template-driven rather than a procedural approach. However Don Park asserted that not only is XSLT low on the list of things that most developers are keen to learn, but there's no real need to learn it at all.
The problem is that there is no real need to learn XSLT. Observe that:
- While DOM-based solutions are harder to implement and maintain, there are plenty of people available with DOM expertise.
- Learning an API like DOM is far easier [than learning] a new declarative language like XSLT.
- Once you learned DOM, there is no real need to learn XSLT.
While there are no doubt many more people with expertise in DOM than in XSLT, the sheer volume of traffic on XSL-List demonstrates that there is significant interest in XSLT as an XML transformation tool. But, as several people observed, judging from the questions being asked in that forum, XSLT is being pressed into uses for which it wasn't designed. It was never meant as a programming language in its own right. Problems encountered when attempting to use it as such can't reasonably be leveled against XSLT. They should instead be attributed to misunderstanding (or miscommunication of) the purpose of the technology; or, perhaps, to the exuberance that accompanies a developer getting to grips with a new tool. What's that old adage about the hammer and the nail?
Not everyone had negative comments about XSLT. Francis Norton acknowledged that there is a learning curve involved but noted that learning XPath gets you most of the way there.
There is a learning process, but everyone using XML needs to learn XPath anyway (*please* don't tell me anyone is seriously programming complex transformations by using pure DOM navigation) and once you've got that, the rest of XSLT isn't that indigestible -- certainly no more of a leap than going from sequential to event-based programming.
Norton cited the success of SQL as another declarative technology that has reached wide acceptance after initial criticisms about complexity. James Strachan made the point that there's a great deal to be gained from XPath alone.
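To make the contrast between path expressions and "pure DOM navigation" concrete, here is a minimal sketch in Python. It is not taken from the discussion: the sample document and both function names are invented for illustration, and the standard library's ElementTree module supports only a limited subset of XPath, so treat it as an approximation of the idea rather than a demonstration of full XPath.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical sample document, invented for this illustration.
DOC = """<library>
  <shelf topic="xml">
    <book year="2000"><title>XSLT Basics</title></book>
    <book year="2001"><title>XPath in Practice</title></book>
  </shelf>
  <shelf topic="perl">
    <book year="2001"><title>Regex Recipes</title></book>
  </shelf>
</library>"""


def titles_with_dom(xml_text, year):
    """Walk the tree by hand using DOM calls only."""
    dom = minidom.parseString(xml_text)
    found = []
    for shelf in dom.documentElement.childNodes:
        # Skip whitespace text nodes and anything that is not a <shelf>.
        if shelf.nodeType != shelf.ELEMENT_NODE or shelf.tagName != "shelf":
            continue
        for book in shelf.childNodes:
            if book.nodeType != book.ELEMENT_NODE or book.tagName != "book":
                continue
            if book.getAttribute("year") != year:
                continue
            for title in book.getElementsByTagName("title"):
                found.append(title.firstChild.data)
    return found


def titles_with_path(xml_text, year):
    """Select the same nodes with a single path expression."""
    root = ET.fromstring(xml_text)
    path = f"./shelf/book[@year='{year}']/title"
    return [title.text for title in root.findall(path)]
```

Both functions return the same titles for a given year; the difference is how much of the tree walking the caller has to spell out.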
Michael Champion believes that XSLT is one of the core XML technologies that actually meets its stated goals, and he noted that if you stay within sight of that goal then you can't go too far wrong.
...I'm predisposed to consider XSLT (along with "common" XML 1.0, DOM, and XPath) as part of the solid core of XML technologies that really more or less do what they are advertised to do and have a real track record of success.
I suspect that XSLT is indeed a "20/80 point technology" if you have to use it as a programming language, to access non-XML data via extensions, and generally explore the dark corners of the spec. My experience (admittedly with simple applications) is that there is a straightforward "Common XSLT" functionality in there that does hit the 80/20 point for reformatting XML into a conceptually similar but syntactically different XML or simple HTML format.
Champion invited others to share their experiences with building XSLT-based applications. Soumitra Sengupta agreed that for its intended use, XSLT does fall within the 80/20 realm. Sengupta offered a brief list of guidelines to help produce simple XSLT transforms. He also cited Norm Walsh's DocBook XSLT stylesheets as a useful learning aid.
- Do not try to use it as a programming language
- Break up the transform into steps and concentrate on getting the input XML into closer and closer to the output
- Learn how to avoid using for-each, choose etc. Once you start using these frequently, you would be tempted to suddenly switch into a general purpose programming language mode. Use these as a last resort.
- If you are familiar with Lisp, you have a better chance
- Try not to take a flat XML and convert it into very deeply nested structures...Similarly it is hard to generate nested lists.
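The last guideline, about flat input and nested output, is the kind of step that could be handed to a general-purpose language before or after a stylesheet runs. The following is a rough sketch in Python under invented assumptions: the element names (h, i, section, item) and the function name are hypothetical, and nothing here comes from Sengupta's message.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat input: each <h> is followed by the <i> items it owns.
FLAT = "<doc><h>Fruit</h><i>apple</i><i>pear</i><h>Veg</h><i>leek</i></doc>"


def nest(flat_xml):
    """Wrap each <h> and the <i> elements that follow it in a <section>."""
    source = ET.fromstring(flat_xml)
    result = ET.Element(source.tag)
    section = None
    for child in source:
        if child.tag == "h":
            # A heading opens a new section in the output tree.
            section = ET.SubElement(result, "section", title=child.text)
        elif section is not None:
            # Everything else is attached to the most recent section.
            ET.SubElement(section, "item").text = child.text
    return result
```

Serializing the result with ET.tostring(nest(FLAT), encoding="unicode") yields a document with one section element per heading, which a later stage can then treat as ordinary nested input.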
Many of these comments were reiterated elsewhere in the discussion; particularly the pipeline-based approach, which is a powerful architectural design pattern in its own right. Robin Berjon noted that this technique is extremely useful.
...I use the pipe model as much as possible. Most of the transforms I get to do become much simpler with at least two style sheets in a row rather than a single hairy one. This also makes it easy to insert a non-XSLT processor in the middle, for instance any kind of SAX processors, that will do transforms on some parts that XSLT isn't good at (eg PCDATA).
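As an illustration of the kind of non-XSLT stage Berjon describes, here is a minimal sketch of a SAX filter in Python that rewrites character data and passes all markup through unchanged. The class name, the run_stage helper, and the choice of upper-casing as the text manipulation are all assumptions made for the example; in a real pipeline the stylesheets would sit on either side of a stage like this.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class UppercaseText(XMLFilterBase):
    """Hypothetical middle stage: rewrites PCDATA, leaves markup alone."""

    def __init__(self, parent, target):
        super().__init__(parent)
        self._target = target  # name of the element whose text is rewritten
        self._depth = 0        # greater than zero while inside a target element

    def startElement(self, name, attrs):
        if name == self._target:
            self._depth += 1
        super().startElement(name, attrs)

    def endElement(self, name):
        if name == self._target:
            self._depth -= 1
        super().endElement(name)

    def characters(self, content):
        # Only text inside the target element is changed.
        super().characters(content.upper() if self._depth else content)


def run_stage(xml_text, target):
    """Run one document through the filter and return the serialized result."""
    out = io.StringIO()
    stage = UppercaseText(xml.sax.make_parser(), target)
    stage.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    stage.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()
```

Because the filter both consumes and emits SAX events, several such stages can be chained, each one handling a small job that would be awkward to express in a stylesheet.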
In a separate lengthy posting, Berjon outlined some other lessons learned from working with XSLT.
...using XSLT to produce similarly or less complex documents than the original is easy, trying to produce a document that's richer than the original will likely lead to problems. That's to be expected, after all the T does stand for transformation, and not for destruction or for production. Whatever information you want in the output should be present more or less as-is in the input. This is obvious but a lot of people tend to forget it.
To summarize, then, it seems that XSLT may be a victim of its own success and could be in danger of being pressed into service in areas where it simply isn't a good fit. Luckily there is a rapidly growing body of experience which is beginning to show exactly where XSLT should and shouldn't be used. Developers are wise to draw on this experience, particularly if they want to avoid the 'large scale rewrites' forecast by Sean McGrath. The message about using the right tool for the job may be an old one, but it's obviously one that can't be stressed too often.
The other side of this debate is that it's only now, roughly two years after the publication of the XSLT Recommendation, that we're beginning to see useful experience-based feedback on the utility of the technology. Makes you wary of tackling another new XML technology so readily, doesn't it? Tread carefully. And let's all look forward to the XML Schema debates of 2003.
X + S + L + T equals 14 grand for me!
2001-08-23 03:17:57 N Pearse
by touting my cv with this (amongst other new acronyms), I have managed to find a new job paying 14000 GBP more than my current one. What's to hate!?
On a more serious note the world is moving in 2 directions:
just as everything in the atomic world is built from a handful (okay maybe a GM handful) of chemical elements (standardisation) into many different things (diversity), so the world of IT will form another layer of standardised yet diverse technologies
it is just another 'tool' for certain types of application, people will use anything new for a variety of different purposes, it will settle down when it finds its 'niche'
the key to success?
make something 'evolveable'
split it into the smallest number of stable parts
then recreate in as many diverse ways as possible
XSLT - DOA?: An Opposing Viewpoint
2001-08-18 18:40:37 Kurt Cagle
I'm going to go out on a limb, and try to argue against the naysayers.
I've worked with XSLT extensively, and it has featured prominently in most books that I've written since 1999 (seven in all). I've heard this criticism come up quite frequently about the utility of XSLT versus its complexity, and inevitably there are those people who cite it as being 1) too complicated to write, 2) too limited in scope as an integration language, and 3) over-utilized to solve common problems.
I think all three of these points are in fact straw-men arguments. In the first case, the single biggest problem that XSLT faces is the fact that it is not a language for manipulating scalars but rather for manipulating sets. This point is frequently lost in articles by people who are attempting to work with it in the same way they work with procedural code, because the two paradigms are very different. Indeed, this is one of the great unifying factors between SQL and XSLT -- while in both cases you can do scalar operations with the languages, what you will have is a very bastardized format that cannot even hope to compare to the same operations done in binary compiled form, yet if you use either SQL or XSLT as a set manipulation language, you can create code of both astonishing simplicity and great power. Is such code more complicated to write? Sure -- you're dealing with data at a higher level, and you're working with a pattern recognition technology rather than a simple assignment language. That's the tradeoff you face.
Along those same lines, if you use XSLT using comparable technologies, you will get comparable results. Both Java and MS have the ability to create compiled stylesheets that work at speeds comparable to other compiled entities. They have the advantage that, because the source code can be referenced externally when needed, updating such XSLT source makes it easier to distribute and update changes to the component.
The second case strikes me as true, but vaguely silly. There are a number of purists who argue that XSLT needs to be kept as simple and pure as possible, because it is only a stylesheet language, and so have deliberately hobbled it. There are many others, myself included, who see it as an ideal language for integration, recognize that XSLT stylesheets, of all XML technologies, are the least likely to need to maintain device independence, and feel that it works very effectively as a binding or integration language. I remember back in the early 1990s when VB first appeared -- its detractors argued strongly that it was not a "pure" language and would in fact be little more than a toy. Yet it was the ability of this language to act as an easy to use binding agent that made it so immensely popular, even if it wasn't rigorously object oriented or pointer-centric.
XSLT has already spawned a number of variants that are nearly identical in most respects, save that they permit the use of extensions. If you bind a regular expression engine into a compiled translet, for instance, you are working not at interpreted speeds but at compiled ones, just as you would with any Java bean or COM object. The XSLT 1.1 working draft proposal was a significant move in the right direction, and it would have effectively solved many of the same issues that are repeatedly (and to a certain extent validly) brought up when criticising 1.0 - creating an output mechanism, creating a standard for extensions, even streamlining the language a bit so that the result of called templates could be invoked from within an XPath expression, rather than using the very roundabout mechanism that exists now (and then only by extension). This work is carried on in the EXSLT site and the people that participate in it, and by and large the solutions that they are coming up with work VERY effectively in creating integrated components.
However, XSLT 1.1 was basically dropped from the W3C agenda precisely because there are many people who fear that it gives too much power into the hands of programmers, which is another way of looking at James Clark's statement that XSLT 1.0 was only intended as a stylesheet language. A large number of languages that were intended for very limited applications tend to find utility beyond those applications, and they grow in strength and power accordingly. I think it is a mistake to argue that simply because the original designers never saw the use of this language in this way that it should be deliberately hobbled because of that.
There are always applications that XSLT is not useful for, or that it's more effort to write than it's worth -- but I'm finding, so long as I'm willing to cut a few corners on how pure I make the language, that there are fewer and fewer apps that don't fall into that category, especially now that more applications work via the transmission or receipt of XML streams. I think this is worrisome to companies like Microsoft that first endorsed XSLT but are now becoming less emphatic about it as they realize that XSLT can obviate a significant amount of the advantages that their own, object based technologies offer.
I am definitely in the minority in my belief of this, but I continue to press it because I think that it is a valid and important viewpoint. Rather than limiting the utility of XSLT, look at where people are using it and expand it accordingly. Especially in comparison to XML Query, which I see as in several respects a step back, XSLT has the potential to be an integral component of XML development in the future.
-- Kurt Cagle
-- Co-author, Professional XSLT
XSLT - DOA?: An Opposing Viewpoint
2001-08-24 12:35:39 Michael Maron
Agreed, I just want to add more on XML/XSLT vs SQL comparison.
First, they are in the same boat as far as end users are concerned. They never deal with XML documents, XSLT stylesheets or SQL statements directly. All they see through the browser is Web site served by - hopefully - XML/SQL-enabled server-end application.
Next, some history. I remember when SQL was an emerging technology, very complicated SQL statements were considered in literature to show its power: unions, subqueries, outer joins - all in one piece. A few computer eras later it appeared that with stored procedures it is more convenient to keep SQL statements reasonably simple and testable, gluing them together with 3G code. My guess is, same thing is going to happen with XSLT. DOM-enabled apps can take proper care of reasonably straightforward stylesheets.
XSLT - DOA?: An Opposing Viewpoint
2001-08-21 07:17:53 Henry Naylor
I agree strongly. XSLT is an exciting and fun language to work in. When I first got the call to become our resident XML guru I found XSLT to be my worst nightmare. But, as was said in the posting, as I used it for the first few weeks and found a good book on the subject, I fell in love.
XSLT - DOA?: An Opposing Viewpoint
2001-08-20 08:29:44 Brad Clawsie
An interesting post and a good read, although I think it is useful to point out that the main proponents for XSLT posting to this board have written books on the technology, so I consider it in their vested interest to promote it (although obviously both posters saw some value in XSLT prior to writing the books).
I think one thing that is generally overlooked in such debates is the general trends of technology adoption in IT. People are still integrating interpreted/memory-managed languages (Java/Perl/Python) into their codebases. Many shops are trying modelling tools for the first time. Some shops are still migrating to relational databases.
Adoption of XML technologies in most shops is probably limited to pilot projects for document interchange needs. XML is good for this, but I would be surprised if XML adoption ever gets past this level. I have not heard one compelling argument why an XML transformation must itself be an XML document. True, you can reuse parts of your original XML parser to process the tree transformation, but then is this engine better suited for this task than mature VMs and compilers?
Although XSLT allows you to separate document transforms from the rest of your business logic, is it ever really the case that real world apps decompose functionality that well? If a document transform is part of a larger function, why not just complete all tasks related to the function (arithmetic, database access, network access, for example) instead of breaking part of it out into a new syntax?
Added to which, does the functionality of XSLT approach the data manipulation facilities of traditional languages? And by this I mean can similar functionality be expressed in comparable levels of verbosity? Every XSLT example I have looked at is utterly verbose compared to what could be cranked out in Perl.
2001-08-17 06:47:50 jim fuller
there are many scopes of problems assoc with manipulating data, most of the solutions or tools assoc with this problem space tend to be overblown in tactical situations.
when it comes to the simple transforms from one data format to another xslt can't be beat.
as someone who has just ended a 2 year effort in developing a complicated framework that delivers complicated software solutions using xslt, i realise the boundaries of xslt.
some key ideas when using xslt;
a) to poorly paraphrase steve muench, 'get your data to its smallest slice, then hand off to xslt'
b) always incorp. metadata into your programming idioms, these dormant hooks will serve you well for things u never thought of, and u can use DOM or XSLT to manipulate when the time comes
c)it is true that XSLT is doing too much, but at least it allows for pure data centric type programming to occur, which has impressive gains elsewhere in the development workflow
i personally (not because i have an other worldly grasp of XSLT) do not find XSLT difficult or involved, and i think that tools and editors built on top of xslt have to be easier to describe and use for non-techies; which of course is the goal; put the data into the hands of the people.
i personally believe the success of any technology is directly related to the amount of discourse against its use.
cheers, jim fuller
xslt as glue
2001-08-17 07:42:04 Michael Maron
I agree completely!
IMHO, following the reaction of those who are disappointed with XSLT is very useful.
-- Working on real XSLT-related projects and going through S. Muench's BUILDING ORACLE XML APPLICATIONS, it is easy to see that XSLT is by no means the only language involved: PL/SQL is necessary to work with database, C/C++/Java are necessary to work outside of database.
-- XSLT developer takes care of portability as with any other language.
-- DOM does have transformation features. However, I use it for parsing to data structures only. Then these data structures are processed by PL/SQL(in my case).
-- A good way to develop in XSLT is to do this like with SQL. Prepare XSLT stylesheet outside of application code first, build it into application next. In fact, it is better to work with XML/XSLT with certain database experience...
2001-08-17 06:18:16 Michael Maron
XSLT is not an easy language to learn - mostly because it is close to Lisp rather than to well-known string processing languages like Perl.
XSLT QUICKLY covers XPath, XML elements and attributes manipulation, programming issues like named templates (a.k.a. functions), variables, parameters, XSLT-specific constructs like key lookups, number and string manipulation.
Readers will find _good_ ways to generate HTML, other markup and plain text from XML documents.
I think this book is a must for software developers who want to write and test robust portable XSLT scripts.
Simple, understandable and informative sample code is a true challenge for any computer book. I really appreciate samples from XSLT QUICKLY, they are easy for recycling in real-life applications. Also, like Oracle code samples, they are convenient to communicate development issues.
Last, but not least, just in the preface we find an important clarification of XML/DTD/XSLT relationship, so readers will avoid a good deal of painful confusion.
(That's my XSLT QUICKLY review at amazon.com:
2001-08-16 23:48:00 Eli Lato
I'm glad to see that I'm not the only one wondering if I was right to try XSLT to generate HTML. I'll have to persuade my customer to go with a technology that:
1 Is far, far slower than JSP.
2 Necessitates learning a new, quirky language.
3 Generates HTML with Hebrew characters as &entities, so that the HTML is unreadable.
If my customer tells me, "Trash it. We'll go with JSP.", I couldn't blame him.
Usefulness of XSLT
2001-08-17 06:42:02 Michael Maron
Regarding Eli's remark on using XSLT for nationalization, I believe there is a _simple_ way to generate XML documents from proprietary documents: build proprietary2XML converter in any programming language like C with printf(). Once we have XML, use standard XML techniques like XSLT and DOM to process it.
As for Hebrew and other non-English specifics, I don't think this has much to do with XSLT as such. That's about encodings and Unicode, see W3C guidelines.
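The point about encodings can be illustrated without any XSLT at all. The sketch below uses Python's standard ElementTree serializer, not an XSLT processor, so it only demonstrates the general mechanism: when the output encoding cannot represent a character, a serializer falls back to a numeric character reference. The sample paragraph is invented. In XSLT the corresponding control is the encoding attribute of xsl:output, though whether that would resolve the specific problem Eli describes is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment containing the Hebrew word "shalom".
para = ET.Element("p")
para.text = "\u05e9\u05dc\u05d5\u05dd"

# An ASCII-only output encoding cannot hold Hebrew letters, so the
# serializer writes numeric character references instead.
as_ascii = ET.tostring(para, encoding="us-ascii", method="html")

# With UTF-8 as the output encoding the characters are written directly.
as_utf8 = ET.tostring(para, encoding="utf-8", method="html")
```

Both byte strings describe the same document to a browser; only the second is readable to a person looking at the HTML source.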
2001-08-16 09:51:56 Brad Clawsie
I still don't see any compelling argument for XSLT as opposed to a "real" programming language that provides strong XML support.
XSLT is one of many XML-related specs from the W3 that don't appear to respond to any real market demand.