XSL Considered Harmful
Michael Leventhal
CiTEC
Declaration of War
Two weeks ago XML.com published an article by G. Ken Holman,
"What's the Big Deal With XSL?", in which the author expressed
his "perplexity about the perceived controversy over XSL".
I expressed my surprise and
disappointment to XML.com that one point
of view in this "perceived controversy" had been given such
a full airing while nothing had been heard from the other side.
I asked for equal time and I was given it.
I am going to start out with a brief and perhaps stark statement on
the subject, my "declaration of war". It is high time
we get to the heart of the matter.
1. XSL, a "sometime in the future" technology, full of
beautiful (if vague) prognostications about its "power" and
"richness", offers no useful improvement in capability over
current and implemented full W3C Recommendations for stylesheets
and transformation.
2. XSL has no role to play in the evolution of web technology
into the "new desktop" as it does not support interactive documents.
3. XSL is a great danger to a major objective of XML, the inclusion
of semantic information in Web pages, as it replaces
XML elements annotated with formatting information by XSL
formatting objects.
4. XSL is the most hideous and unwieldy language imaginable and
stands absolutely no chance of acceptance by the web community.
5. XSL advocacy has blurred the focus of the W3C by
introducing competing standards for styling and transformation
and set back the goal of a vendor-independent, semantically rich
"open information highway" by at least two years by undermining
support for existing standards such as CSS and the
DOM.
This is really all I have to say, but if any one of these points
is true, that is a damning case against
XSL. I have invited and continue to invite, in good faith and with
an open mind, XSL advocates to counter these arguments.
In the remainder of this article I will elaborate on these five fatal
flaws of XSL and present
an application which will give you the chance to directly compare XSL
to the transformation and styling technologies supported by current
W3C Recommendations. But first, The Challenge:
The Challenge
Anything meaningful XSL can do in the Web environment, I can do better using
technologies supported by current W3C Recommendations.
Of course, what is "meaningful" in the Web environment is open
to a variety of interpretations. Therefore, the subject of the challenge should be
one that the XSL camp and I agree is meaningful.
I am also ready to make this bet a little bit more than an academic
exercise. If I lose I will pledge that I and my crack mozilla
development team will assist in implementing XSL in the mozilla
open source project. If my opponents lose they will agree to
desist from XSL advocacy, vote against an XSL Recommendation if
they are members of the W3C, and will join me in calling for
full, flawless, and unequivocal vendor support of CSS1 and CSS2,
DOM Level 1, and XML 1.0 as the very first and top priority of
the web community.
XSL Has Nothing New for the Web
The major justification for XSL is that it will support
better specification of printed page layout than is now
possible with CSS. In other words, XSL has little or no
relevance for virtually all types of documents on the
web today. The web is not about paper documents. Page
composition will not even occur on the Web as it is a
notoriously computationally-intensive process. There are
many solutions to page composition available today; the
sole advantage of XSL would be that it allows the handful
of people that design page layout processes to use the
same language both for producing printed pages and for
producing web pages. However, the nature of the layout
discipline is dramatically different. Frankly, people that
design page layout processes will have no difficulty creating
a CSS stylesheet for the Web and a DSSSL stylesheet for
paper, one of the solutions which is already in place
today.
A more important point, however, is that XSL as a web
page style language is an enormous unknown in terms of processing
efficiency. It is relatively straightforward to write
down on paper a specification that not only slices and
dices but also cooks the turkey and sends out your dinner
invitations. But ... will it work, can it be cleanly implemented and
maintained, how much memory is it going to use, how fast
will it be? These things are not known for XSL and good
professional opinion says it is going to be a dog. CSS
is trim and has been shown to be efficient.
Other than printed page layout, XSL has very little to add
to CSS formatting capabilities, perhaps a point or two
which would justify a few modest additions to the next
revision of CSS, but surely not an entirely new language.
Improved selectors on the element hierarchy and attribute
values are the major item in this area.
The fact is that the powerful styling capabilities of
CSS, including hierarchical and element selectors, floating,
absolute, and relative positioning, generated content,
counters and autonumbering, and table formatting, have
not been experienced either by the general web community
or the XML community. I know this because some of these
features have only appeared in the mozilla browsers as
recently as two weeks ago. We just now have the ability
to see what CSS can do - and it is going to change the world view
of a whole lot of people. May I be so bold as to ask how much
real world experience with CSS2 the working group designing
XSL has? Is this not a reasonable prerequisite for designing
a new language which better addresses the needs of web developers?
XSL is not, of course, just a style language. It also supports
tree transformations, the mantra of the XSL community having been
that transformation and styling were inseparable processes.
In fact, they are not inseparable and even some XSL advocates
have begun to call for a clean separation between the proposed
transformation language and the formatting language. The
fact that interests us here however is that there already exists
a way to do tree transformation that adheres to a W3C Recommendation
today. It is called the Document Object Model (DOM) and can
be used in browsers now through its binding to JavaScript and
other languages (Java, C++, Perl, Python).
In effect the XSL advocates are setting up a straw man when they
compare XSL to CSS. The valid comparison is between XSL and
CSS+DOM, the environment in W3C Recommendation compliant
browsers today. The DOM allows full access to and manipulation
of the document tree. There is no operation which can be
accomplished in XSL which cannot be accomplished in the DOM.
This is a fact which no XSL advocate can deny. XSL advocates,
of course, will state that they "like" their language better.
But the critical fact to retain is that XSL transformation does
not add a single capability beyond what can be accomplished
without XSL in DOM-capable browsers.
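To make the claim concrete, here is a minimal sketch of a template-style transformation written against the DOM alone. It assumes Python's standard xml.dom.minidom binding as a stand-in for a browser's DOM, and the element names (portfolio, stock, symbol, price) are invented for illustration; nothing here is taken from an actual application.

```python
from xml.dom.minidom import parseString

# Hypothetical source document; the element names are illustrative only.
SOURCE = """<portfolio>
  <stock><symbol>ACME</symbol><price>12.50</price></stock>
  <stock><symbol>ZILA</symbol><price>48.00</price></stock>
</portfolio>"""


def text_of(element, tag):
    """Return the text of the first <tag> descendant of element."""
    node = element.getElementsByTagName(tag)[0]
    return "".join(c.data for c in node.childNodes if c.nodeType == c.TEXT_NODE)


def to_table(source_doc):
    """Build a new <table> document from the <stock> elements of source_doc."""
    result = parseString("<table/>")
    table = result.documentElement
    for stock in source_doc.getElementsByTagName("stock"):
        row = result.createElement("tr")
        for tag in ("symbol", "price"):
            cell = result.createElement("td")
            # Record the source element name so its meaning is not thrown away.
            cell.setAttribute("class", tag)
            cell.appendChild(result.createTextNode(text_of(stock, tag)))
            row.appendChild(cell)
        table.appendChild(row)
    return result


table_doc = to_table(parseString(SOURCE))
table_xml = table_doc.documentElement.toxml()
print(table_xml)
```

The loop over stock elements plays the part an XSL template rule would play, and the class attribute is one possible way to carry the original element name into the result so that a CSS selector can still reach it.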
When compared to the DOM+CSS, XSL does not solve any Web-related
problem that the current W3C Recommendations do not adequately
handle. What is the Big Deal indeed?
XSL Does Not Support Interactive Web Documents
With HTML forms included
through the HTML namespace, but without CGI, and definitely without XSL,
we at CiTEC are building the following Web browser-based
semantically rich XML applications:
DocZilla: Online Books/Manuals with Hypertext TOCs and structured search
CarZilla: Web browser designed for a car with simulated feeds to diagnostic and maintenance systems
LinkManagerZilla: Manages hyperlinks in document collections, providing interface, maintenance, editing and visualization of a link database and link targets and sources in documents
StockZilla: Simulated stock reporting feed, shown in this article
EmailZilla: Interface to email system
IETMZilla: Interactive Electronic Technical Manual Interface with special safety features
AnnotationZilla: Allows insertion of annotations into documents
HelpZilla: WindowsHelp-like system
SlideZilla: Does slide shows
EditorZilla: DTD-driven XML editor
I showed some of these applications at the XML Europe conference
and also delivered my now-standard denunciation of the "XSL
conspiracy". The speaker who had the misfortune to follow me
was a very knowledgeable gentleman whom I respect very much,
Neil Bradley (author of The Concise SGML Companion and
The XML Companion), and the topic he had the misfortune to
cover was,
of course, XSL. And he began by saying that, of course, he
would never attempt to create the applications he had just
seen with XSL because "XSL is not for interactive documents".
Is there really anything more to say? If XSL is
no good for interactive documents what do I need XSL for? Is
there anyone out there that is not interested in
enabling
interactive behavior in a clean, structured, standard, and
maintainable way in their web documents? Isn't that what XML
on the Web is for, to enable such applications to be built
using semantic information from element names, attributes and
the document structure? This is what we thought and this is
what we have been doing. We think our 'Zillas are the proof
that XML on the Web does everything XML on the Web was promised
to do. And we do it with XML, CSS, and the DOM and really
cannot understand why anyone could, would or should give a damn
about XSL.
The whole XSL process, described admirably enough in Mr. Holman's
paper, is a static process.
Interactive processes are designed
around events which have event handlers associated with them.
An event could, in principle, occur on any XML element (in fact
we have done exactly this in many of our 'Zilla applications)
and can invoke a myriad of behaviors which most often involve
the modification of some element styling or perhaps the update
of an element value. Occasionally, some reordering of some
part of one or more document trees takes place. Much less
commonly, all or a large portion of the document is reordered,
as in an application which updates a sorted table.
The CSS model of attaching formatting to the
elements is perfectly adapted to handling events
which cause stylistic changes, and the DOM model is very efficient
for updates to the document tree contents and local reordering. It
is adequate for global reordering. In addition, the
document trees often reside in different frames or
windows and one must have communication between them. JavaScript
provides these kinds of facilities as it is an environment
native to the browser environment. When an
event occurs you typically want to know what element it occurred on,
what its name and attributes are, and most often you will be doing things
that will affect its immediate children or parent. This is
all elementary stuff if you're using the DOM and JavaScript. Finally, it all
has to be very fast.
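A rough sketch of that pattern, under some loud assumptions: Python's standard xml.dom.minidom stands in for the browser DOM, a plain function call stands in for the browser delivering an event (minidom has no event model), and the quotes/stock/price vocabulary and the on_price_change handler are hypothetical names made up for this example.

```python
from xml.dom.minidom import parseString

# Hypothetical document; element and attribute names are illustrative only.
doc = parseString(
    "<quotes>"
    "<stock symbol='ACME'><price>12.50</price></stock>"
    "<stock symbol='ZILA'><price>48.00</price></stock>"
    "<stock symbol='MOZ'><price>30.25</price></stock>"
    "</quotes>"
)


def price_of(stock):
    """Read the numeric value of a stock's <price> child."""
    return float(stock.getElementsByTagName("price")[0].firstChild.data)


def on_price_change(stock, new_price):
    """Hypothetical handler: update one element, then re-sort its siblings."""
    # Local update: change the text of the element the event occurred on.
    stock.getElementsByTagName("price")[0].firstChild.data = "%.2f" % new_price
    # Global reordering: appendChild moves a node that is already in the tree,
    # so appending in sorted order re-sorts the parent in place.
    parent = stock.parentNode
    stocks = [n for n in parent.childNodes if n.nodeType == n.ELEMENT_NODE]
    for node in sorted(stocks, key=price_of, reverse=True):
        parent.appendChild(node)


# Simulate an event arriving on the ACME element.
acme = doc.documentElement.firstChild
on_price_change(acme, 55.0)
order = [s.getAttribute("symbol") for s in doc.getElementsByTagName("stock")]
print(order)
```

The handler starts from the element the event occurred on, reads its children, and reaches its parent, which is the access pattern described above. The final loop is the less common case of reordering a whole sorted table.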
The XSL folks seem to be about a million miles from this, proposing
a horse-and-buggy model in a laser-guided precision missile world.
Semantic Information Threatened by XSL
I can understand why overworked undergraduates think
FONT is cool, but I'm very disappointed when a group of
highly skilled adults tell kids to stop playing, form a
committee - and then come out with a set of supercharged
FONT tags.
Håkon Lie on XSL Formatting Objects, xsl-list, 28 April 1999
Dear Mr. Lie (co-inventor of CSS) has penetrated once again to the
essence of a matter ... and
in his inimitable style. If I could only win friends and influence people
like that! I highly recommend that you go and get the full story from his paper
"Formatting Objects considered harmful" at
http://www.operasoftware.com/people/howcome/1999/foch.html. A
short version of the story is this: XSL proposes the existence of something
called a formatting object, XML elements which indeed resemble what Mr. Lie calls
a supercharged FONT tag. It is the final result of an XSL transformation which
is sent to the rendering engine. If the rendering engine reads formatting
objects directly, all the semantic information in the original document may
be lost. This is exactly the information we get through element events
and the DOM to make all of our 'Zilla applications work.
XSL, in this scenario, would destroy XML on the Web.
Among the solutions to this problem is stipulating in the standard that
formatting objects must preserve a mapping back to the original XML element.
Technically not very elegant and probably unenforceable.
Another solution is to legislate that formatting objects not be allowed to
ever physically exist as XML. Not bloody likely, but there is a logical
candidate to replace XML formatting objects: CSS! One must ask why don't we just
forget about formatting objects and just use CSS in the first place?
XSL is an Ugly, Difficult Language
XSL is DSSSL in sheep's clothing, an SGML transformation and formatting
language that an ISO committee worked on for eight years, producing finally
a language specification and an utterly heroic and breathtaking but incomplete
implementation by James
Clark that years of real world experience has shown to be
incomprehensible and unusable to nearly everyone in the document
transformation and formatting business. XSL is DSSSL, it is 100% DSSSL
concepts and processing model, with some syntax changes and some attempt
to merge with CSS.
I can't back up my assertion with numbers but I personally have
written hundreds of transformation applications and have used nearly
every tool and language in this area, DSSSL included. It is truly a
tiny handful of people who have been able to master DSSSL concepts and
fewer still that have been able to get it to work on a real world project.
I am one of those people and I happened to have the opportunity to
write such a real world application in both DSSSL and one of the popular
commercial transformation languages, Balise. The DSSSL implementation
took me two full weeks, Balise two days. I am qualified to give an
expert opinion in this area and my opinion is that DSSSL and XSL are
hard!
This is important because of
the following assertion, which comes repeatedly from the XSL camp: XSL
will make life easier for web developers because they will not have
to write programs to perform complex document transformations. You see,
XSL is not a "programming" language but a "declarative" language.
XSL declarations don't do anything, they describe the state of the
transformed document in relation to the original document.
It is a fascinating and lovely idea. The problem is that
real-world transformations are hard, full of tricky stuff and
zillions of little details such that an overly simple model
of how to do or declare it quickly breaks down and becomes infinitely
harder than the programming model with its many features for
managing complex tasks.
I know I am not going to convince anyone that doesn't want to
program to become a programmer, but I hope to convince at least
some of you that we are very close to defining the ideal
intersection between the talents of the programmer, the data/document designer,
and the style expert with XML, CSS and the
DOM. XML puts control of the design process solidly back in the
hands of the subject matter expert who designs the document types.
CSS allows the designer to isolate the style issues cleanly.
The DOM provides the programmer with an orderly approach to
interactive application programming (including transformation) and prevents the
web document from becoming an incomprehensible mixture of code and
data. XSL is a big step back: it mixes everything up again and
puts everything in the hands of the few people who can understand
this weird declaration thing which is simultaneously both past
and future and in which nothing really ever "happens".
I mentioned the support in programming languages for
managing complex tasks. I am talking about things that engineering
scientists have studied and developed over many years of research
into program design. Things like program modules, argument passing,
scope, state, basic and complex data structures, naming conventions,
debugging techniques. These are admittedly unfamiliar to people not
trained to write programs and they may seem like an unnecessary
complication. They are not: the proper use of such techniques
will ensure that your transformations are readable and maintainable
to people who can read and maintain programs and that the results
will be reliable.
I'd like to offer another little challenge to those that
are still unconvinced - go and read the XSL script used to
produce Mr. Holman's article (http://www.xml.com/1999/04/holman/xmlcom.msxsl).
Can you understand it? Is the fact that it is a declaration really making
life any easier for me? He is doing basically really simple transformations,
just gathering some element data to stick in tables at the end of the document -
I gave the same assignment to a junior programmer on my team for him to learn
the DOM. But it is enough to push XSL to the wall.
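For a sense of scale, here is a hedged sketch of that kind of assignment, written with Python's standard xml.dom.minidom. It is not a reconstruction of Mr. Holman's stylesheet: the article/para/link vocabulary, the href values, the table/row/cell output names, and the append_link_table function are all assumptions chosen only to illustrate "gather some element data and put it in a table at the end".

```python
from xml.dom.minidom import parseString

# Hypothetical input; element names and href values are illustrative only.
doc = parseString(
    "<article>"
    "<para>See <link href='dom.html'>the DOM</link>"
    " and <link href='css2.html'>CSS2</link>.</para>"
    "</article>"
)


def append_link_table(document):
    """Collect every <link> and append a summary <table> to the root element."""
    table = document.createElement("table")
    for link in document.getElementsByTagName("link"):
        row = document.createElement("row")
        # One cell for the link text, one for its target.
        for value in (link.firstChild.data, link.getAttribute("href")):
            cell = document.createElement("cell")
            cell.appendChild(document.createTextNode(value))
            row.appendChild(cell)
        table.appendChild(row)
    document.documentElement.appendChild(table)
    return table


link_table = append_link_table(doc)
link_table_xml = link_table.toxml()
print(link_table_xml)
```

The body of the document is left untouched; the table is simply appended as the last child of the root, so the original elements remain available to stylesheets and scripts.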
XSL has Set Back the Web at least 2 Years
When I have criticized Microsoft for the lack of
complete and correct CSS support in IE5, I have found that many
are quick to class me among the irrational
Microsoft bashers. In
fact I am keenly interested in seeing Microsoft create
a browser which adheres to Web standards. I want to
see this because I believe that to a significant extent
it is the character of our new information-based civilization
which is at stake. I also happen to believe that we will
sell many more of our 'Zillas if the technology we use
is also supported by Microsoft and by other browser
vendors. Finally, I think that the vendors can be
persuaded that it is in their interest to support
web standards if consistent pressure is applied to
them from the marketplace and from those people who
are the catalysts for new technological developments.
Today there is one and only one W3C Recommendation for the
formatting of XML documents on the Web, CSS. Today there is
one and only one W3C Recommendation for transformation and manipulation
of XML documents, DOM. This year we could have had semantic
markup on the web, in the major browsers, with support for XML 1.0,
CSS and the DOM. This did not happen and I am sorry to say
that XSL advocates who did not clearly articulate their
support for current Web Recommendations contributed heavily
to that state of affairs.
I think it is clear to all that the early implementation
of XSL by Microsoft was a disaster. As predicted,
the next version of the W3C XSL draft created two XSLs, the
W3C's and Microsoft's. But more important than this is the
fact that XSL has been a major marketing coup for Microsoft
in enabling them to "hide" their deficient implementations of
web Recommendations and to force the entire marketplace to
postpone XML implementations for another couple of years while
we wait for common standards that work in all the major
vendors' products.
The corrective action at this point is to do what we
as a community should have done six months ago,
to insist on full implementation of CSS, the DOM, and XML.
If we clearly articulate this message now we may have XML
on the Web in one year instead of two. If we wait for XSL
to either flop or succeed it will definitely be two and if
it does flop perhaps it will be three. There are also
issues far more basic and critical than XSL which the W3C should
be addressing with greater attention and priority. These
include linking, row and column spanning in CSS-specified
XML tables, and inclusion of scripts in XML files, possibly
followed by query and schema issues. Our
first order of business is to create a workable and standard
XML environment for the Web. XSL, a standard designed to
improve the layout composition of printed documents, should
not be in our first order of business.
XSL vs. the DOM+CSS toe-to-toe
The second part of this article consists of a comparison
of the well-known Microsoft stock-sorting XSL and an
enhanced application which uses the DOM and CSS instead
of XSL.