The XSL Debate: One Expert's View

June 8, 1999

Norman Walsh


This article addresses the recent debate about the merits of XSL. It is the personal opinion of the author. It was written in preparation for an online chat forum in which the author participated as one of the "experts", hence the title.

Recently, there has been some debate about the merits of XSL (the Extensible Stylesheet Language, one of the original family of XML standards). This debate has taken place on several mailing lists and, most recently, was the featured topic on XML.com.

My own experience with structured markup and stylesheets is extensive. I've written stylesheets using FOSIs, CSS, DSSSL, and XSL. Before my SGML days, I wrote them in TeX and LaTeX; I also dabbled in Troff, in my own home-grown formatters that generated raw PostScript, and even in an old IBM mainframe system called simply "Script". Several people have asked my opinion about the XSL debate, so here it is. What follows are my own observations and my personal feelings about the issue.

Mostly, I find the whole debate rather disappointing. Disappointing because much of the debate seems to be framed in the most polarizing language possible and hardly seems designed to engender serious discussion.

In any discussion of this nature, it's important to consider the motivations of the various speakers, so let me make mine perfectly plain. I've been a member of the XSL committee since its inception and I've invested a lot of time and effort attempting to make sure that it will solve the problems that I plan to attack with it. It's difficult for me to be completely dispassionate about the issues, although I will try. I also have a document-centric view of the world. While I fully understand the benefits of using XML for data, and I expect to reap rewards from it, at the end of the day, I need to render most of my XML documents so that they can be read (one way or another) by a human being.

I find the debate about XSL difficult to take seriously because XSL seems so obviously useful and necessary to me. One of XML's great strengths is the fact that it gives authors and document creators the ability to use semantically meaningful tags. This strength is a key component of one of XML's greatest selling points: reuse. Tagging something as a <product-number> means that it can be used to drive invoices, product catalogs, and bills of sale. It allows intelligent searching of document databases, and it allows a rendering system to present it in a meaningful way to the end user (be that print, online, speech, braille, or some distant smell-o-vision application).

If the product number was tagged as <bold-blue-italic>, because that was the presentation that was desired for (one of) the rendered versions, it would be impossible to tell product numbers from other things presented in bold, blue italics. This would eliminate whole classes of applications, make searching less accurate, and make rendering in other media more difficult.

The use of semantic tagging makes a powerful stylesheet language an absolute necessity. There has to be some way to render semantic markup in presentational terms. In practice this requires the ability to:

  • Style text (bold, red, on a new page).

  • Generate content (add words like "Chapter" and "Table of Contents" to the output).

  • Suppress content (remove editorial notes, or print only small parts of a document, like abstracts or executive summaries).

  • Reorder and repeat content (printing chapter titles in different styles on the chapter opening page, in running headers and footers, in the table of contents, and in cross references).

  • Sort and collate content (creating indexes, collapsing page ranges).
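To make these five requirements concrete, here is a minimal sketch of a stylesheet that touches each of them. It is an illustration under stated assumptions, not a reference implementation: the source vocabulary (book, chapter, title, para, emphasis, editorial-note, index-term) is hypothetical, the output is plain HTML rather than formatting objects, and the syntax is that of the later XSLT 1.0 Recommendation, which differs from the Working Drafts current when this article was written.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Hypothetical source: book > chapter > (title, para | editorial-note)* -->
  <xsl:template match="book">
    <html>
      <body>
        <!-- Generate content: this heading does not exist in the source. -->
        <h1>Table of Contents</h1>
        <!-- Reorder and repeat: each title is used here and again
             in the chapter template below. -->
        <ul>
          <xsl:for-each select="chapter">
            <li><xsl:value-of select="title"/></li>
          </xsl:for-each>
        </ul>

        <xsl:apply-templates select="chapter"/>

        <!-- Sort: index terms from anywhere in the book, in
             alphabetical order. Merging duplicate terms is omitted. -->
        <h1>Index</h1>
        <ul>
          <xsl:for-each select="//index-term">
            <xsl:sort select="."/>
            <li><xsl:value-of select="."/></li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="chapter">
    <!-- Generate content: the word "Chapter" and a computed number. -->
    <h2>
      <xsl:text>Chapter </xsl:text>
      <xsl:number/>
      <xsl:text>. </xsl:text>
      <xsl:value-of select="title"/>
    </h2>
    <!-- Process everything except the title, which was used above. -->
    <xsl:apply-templates select="*[not(self::title)]"/>
  </xsl:template>

  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- Style text: render emphasis as bold. -->
  <xsl:template match="emphasis">
    <b><xsl:apply-templates/></b>
  </xsl:template>

  <!-- Suppress content: empty templates drop editorial notes and keep
       index terms out of the running text. -->
  <xsl:template match="editorial-note"/>
  <xsl:template match="index-term"/>

</xsl:stylesheet>
```

Nothing in the source document says "Chapter 1" or "Index"; that text, the ordering, and the omissions all come from the stylesheet, which is why the output is a different document from the source.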

Almost all of the arguments against XSL seem to neglect the fact that transforming semantic markup into presentational markup is just that, a transformation. The document that is displayed is not the same as the source document. The extent of the transformation varies, with styled web content at one end of the spectrum, perhaps, and high-quality, high-resolution, layout-driven, four-color printing at the other.

In order to effectively render content, you need two things: a standard language for expressing the layout that is desired (in XSL, the formatting objects), and a facility for transforming the source document into the layout document (in XSL, the transformation language). Decorating the source tree is not sufficient for most applications.

It happens that one common formatting language is HTML, so it's useful to render semantic content by translating it into HTML and treating HTML as a sort of "RTF for the web." But in the long run XSL will offer much more than that. XSL includes a standard vocabulary of formatting objects, with well-defined CSS properties to control them. These formatting objects allow stylesheet authors to produce high-quality print output. Future versions of XSL will extend the vocabulary, enabling applications to produce the highest quality print rendering in a standard way. This will also allow content publishers to have a single set of stylesheets that will work in multiple media with different applications. And as the rendering capabilities of monitors improve, and the sophistication of browsers increases, the difference between online and print applications will decrease, reducing the need for widely different stylesheets for them.

As I've watched the debate, I've heard a number of arguments against XSL. Here are my reactions to a few of them.

  • XSL is too complex.

    Complexity is a difficult thing to measure. While it is true that it's possible to write very complex stylesheets in XSL, it's also true that you can accomplish a lot without delving deeply into the complexity.

    Consider a simple XSL task, formatting <emphasis> elements in a <synopsis> as bold, inline text:

    <xsl:template match="synopsis//emphasis">
      <fo:inline-sequence font-weight="bold">
        <xsl:apply-templates/>
      </fo:inline-sequence>
    </xsl:template>

    If I explain that XSL processes a document by finding templates that match each context and applying them, I don't think that fragment is really very complex. In particular, I don't think it's much more complicated than the equivalent CSS:

    synopsis emphasis { display: inline; font-weight: bold }

    Both of these systems require you to understand a set of concepts and a vocabulary of terms. To do complex processing in either system requires deeper knowledge than is required to do simple processing, but in both systems a lot can be accomplished with the simple rules. CSS's weakness is that it can't reasonably be extended to handle the more complex cases. (For example, reformatting XML data extracted from a database system into a table, sorting the entries in the table based on some user-specified criteria.)

    One of the factors that I think has contributed to the apparent complexity of XSL is the fact that it has changed significantly, and sometimes dramatically, between Working Drafts. So if you were familiar with an early draft, it now looks totally different. But that's the nature of Working Drafts.

  • XSL is unnecessary because you can do everything with CSS and the DOM.

    We could extend this argument and say that anything you can do with an XML document, you could do without the DOM, so we don't need the DOM. And while we're at it, anything you can do with XML you could do with SGML, so we don't need XML either.

    This argument reminds me of one of my favorite quotes about computer science: "There's nothing you can do with a computer that you couldn't do with enough squares of toilet paper and pebbles, if you had the time." The point of these technologies is that they make a certain class of applications much easier and more standard. The same is true of XSL. XSL makes it easier for applications to transform semantic markup into presentational markup. While I'll warrant that not every web user on the planet will be able to write XSL stylesheets, I'll bet in a year that more of them will be able to do that than will be able to write the equivalent transformation using the DOM and CSS.

    This argument also fails to address any of the features of XSL that are designed to produce output in environments other than a web browser. To suggest that content providers should be expected to use CSS and the DOM to make web documents and something else, such as DSSSL, to make print documents is ludicrous.

    No content provider wants to support multiple stylesheet languages for multiple outputs any more than they want to have multiple source documents. Reuse is one of the key benefits of XML.

  • XSL stands no chance of acceptance by the web community.

    XSL hasn't even reached its first Proposed Recommendation (PR) yet and already there are significant implementations by IBM and Microsoft. In fact, Microsoft has put XSL support into msxsl.dll, effectively making XSL support part of the operating system. Not the browser, the operating system. There are additional implementations by independent developers such as James Clark and James Tauber; Tauber has published the first application (that I'm aware of) that supports the XSL formatting objects.

    And it's hardly fair to argue that these applications don't represent acceptance because they don't support the full standard. There are parts of CSS1 and CSS2 that still haven't been implemented in a significant commercial product, but there seems little fear that they won't be accepted by the web community.

    As far as I can see, XSL shows as much promise for acceptance by the web community as any other developing standard.

  • XSL doesn't support interactive documents.

    That's true. It also doesn't support bidirectional printing, complex layout-driven formatting, or a host of other advanced features that are clearly in scope. The XSL requirements document (http://www.w3.org/TR/WD-XSLReq) addresses interactivity, which makes it reasonable to argue that the XSL WG will address interactivity in the future, though perhaps not in the first PR.

    The fact remains that there may be some interactive applications for which XSL is an inappropriate tool. That doesn't bother me. Just because I expect my browser to be able to display my financial statement doesn't mean I expect it to be able to calculate my net worth.

  • XSL is "dangerous" because it allows one to replace content-rich XML with XSL formatting objects.

    This argument seems to be little more than an attempt to spread fear, uncertainty, and doubt. This situation already exists with HTML and CSS. Instead of using <H1> and <STRONG>, I could do everything with <DIV> tags (and nothing else) by decorating each <DIV> with the appropriate inline style. But no one is arguing that CSS is "dangerous" for this reason. In practice, people just don't do that. And, you know, if they did, that'd hardly be XSL's fault any more than CSS's. As one of my colleagues likes to say, "you can't legislate morality."

    In fact, I think that XSL and CSS complement each other nicely:

    • XSL flow objects use CSS properties, so CSS is an integral part of the XSL formatting story. In addition, there's been considerable effort expended to make sure that a common formatting model is developed. This will allow people to learn a single display semantic and apply it equally well to CSS and XSL in the projects where each is appropriate.

    • One common use of the transformation language is to transform XML documents into HTML for rendering in a browser. When performing this kind of conversion, I almost always generate nice, structured HTML, with as much semantic content as I can and then attach class attributes so that I can style the result with CSS. This makes my web documents accessible today, just as publishing them as XML with XSL stylesheets will make them accessible tomorrow.

  • XSL is bad because the W3C shouldn't publish competing standards.

    XSL and CSS are not competitors. As I outlined above, they are in fact complementary.

    I think XSL actually strengthens the position of both CSS and the DOM. The transformation process is a tree-to-tree transformation; the DOM is a natural way for implementations to represent these trees. XSL flow object result trees are heavily decorated with CSS properties. And the HTML pages that I produce with XSL transformations all have structure and class attributes so that I can apply CSS to them.
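Two points from the list above, sorting extracted data into a table and generating structured HTML with class attributes for CSS, can be sketched together. The following is only an illustration: the products and product elements, their name child, and the sort-key parameter are hypothetical names invented for this example, and the syntax is that of the later XSLT 1.0 Recommendation rather than the Working Draft discussed in this article.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Hypothetical parameter: the name of the child element to sort by.
       A processor can override it at run time; the default is 'name'. -->
  <xsl:param name="sort-key" select="'name'"/>

  <!-- Hypothetical source: products > product > (product-number, name) -->
  <xsl:template match="products">
    <table class="product-list">
      <tr>
        <th>Number</th>
        <th>Name</th>
      </tr>
      <!-- Sort the rows on whichever child element sort-key names. -->
      <xsl:apply-templates select="product">
        <xsl:sort select="*[name() = $sort-key]"/>
      </xsl:apply-templates>
    </table>
  </xsl:template>

  <xsl:template match="product">
    <!-- The class attributes carry the source semantics into the HTML
         so that a CSS stylesheet can style each kind of cell. -->
    <tr class="product">
      <td class="product-number">
        <xsl:value-of select="product-number"/>
      </td>
      <td class="name">
        <xsl:value-of select="name"/>
      </td>
    </tr>
  </xsl:template>

</xsl:stylesheet>
```

The division of labor is the one described above: the transformation decides what appears and in what order, while a CSS rule attached to a class such as product-number decides how it looks.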

The future of richly structured, accessible web documents is XML and a stylesheet language with enough richness to present them well in a variety of media. In an acronym, XSL.

Copyright © 1999 by Norman Walsh, nwalsh@arbortext.com