XML.com: XML From the Inside Out

Style-free XSLT Style Sheets

July 26, 2000



Introduction

One of the most frequently marketed advantages of XML is the separation of content from layout, achieved by applying external CSS or XSL style sheets to XML documents. However, since work started on XSL, the focus has shifted from presentation to transformation. This has given birth to a transformation-only language, XSLT, which is much more widely used than its formatting counterpart, XSL formatting objects.

This shift from presentation to transformation is leading to a massive injection of logic into style sheets, and this mixing of presentation and logic is becoming questionable. In this article, I present a simple technique to isolate most of the presentation outside of XSLT, leading to "style-free style sheets."

As more logic is embedded in the XSLT style sheets, they become more similar to programs than data. Keeping style within the style sheets causes the same kinds of problems as keeping data within programs--loss of flexibility, and lack of maintainability by anyone but programmers.

The fact that XSLT style sheets are generally pseudo-compiled in servlet environments, and that new techniques are being announced to compile them into Java byte code, reinforces this tendency of XSLT style sheets to be more programs than style. The corollary of this trend is to remove as much style as possible, unless you are ready to maintain multiple compiled style sheets.

From an XSLT user's perspective, you can't expect web designers to develop XSLT style sheets, which their favorite tools lack support for. Wouldn't it be nice to let designers design the layout of a page and to add special "tags" to include its content?

A Simple Example

At this point, let's look at a simple example.

For our new portal web site, we want to produce a home page with three parts: a header and headlines, which are common to all the pages of the site, and a body, which is specific to each page and is itself subdivided into a header and two columns.

The document we want to produce is basically a set of embedded tables whose outer content is site generic, while the inner "body" cell is page specific.

The Traditional Solution

The traditional way of achieving this through an XSLT transformation is to build the generic structure of the page into the XSLT style sheet itself:

<xsl:template match="/">

<html>
  <head>
    <!-- Get the head values here... -->
    <xsl:apply-templates select="/html/head/*" />
  </head>
  <body>
    <table width="100%" border="1" bgcolor="LemonChiffon">
      <tr>
        <td colspan="2" width="100%">
          <h1>Header</h1>
          (common to every page of the site)
        </td>
      </tr>
      <tr height="300" valign="top">
        <td width="75%" bgcolor="Aqua">
          <!-- Get the body here -->
          <xsl:apply-templates select="/html/body/*"/>
        </td>
        <td width="25%">
          <h1>Headlines</h1>
          (common to every page of the site)
        </td>
      </tr>
    </table>
  </body>
</html>
</xsl:template>

Download the full style sheet: style.xslt

For readability, I've not included the whole style sheet above, but you get the idea--the presentation of the generic part of the page is embedded directly within a template, while the source document contains the data and often some presentation to be used for the page-specific elements.

We are in a situation that can be represented as (download the source here):

source.xml + style.xslt -> page.html

XML file (50% data + 50% presentation) + XSLT file (50% logic + 50% presentation) -> (X)HTML page.
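
The source document is only offered as a download, so here is a rough sketch of what it might look like. It assumes nothing more than a structure that the /html/head/* and /html/body/* selections above would match; the title, heading, and cell text are invented for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical source.xml: not the author's file, only a shape that
     the select expressions in the template above would match. -->
<html>
  <head>
    <title>My page</title>
  </head>
  <body>
    <!-- Data (the text) and presentation (heading, table) are mixed here -->
    <h2>Welcome</h2>
    <table width="100%" border="1">
      <tr valign="top">
        <td>First column text</td>
        <td>Second column text</td>
      </tr>
    </table>
  </body>
</html>
```

A file of this kind shows where the "50% data + 50% presentation" estimate comes from: the table markup describes layout, while only the text inside it is data.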

First Step: Removing (Most of) the Presentation from the Style Sheet

As a first step, let's focus on the XSLT sheet and see how we can separate most of the presentation from the logic.

An easy way to do this is to isolate the template above in an XHTML document. In this document we'll add some control elements (possibly using a separate namespace--not an absolute requirement, but rather a matter of style) which will trigger templates in the style sheet.

In our case, the "template" XHTML file, layout.xml, created largely by a designer, can be:

<html>
<head>
    <insert-head/>
</head>
<body>
    <table width="100%" border="1" bgcolor="LemonChiffon">
        <tr>
            <td colspan="2" width="100%">
                <h1>Header</h1>
                (common to every page of the site)
            </td>
        </tr>
        <tr height="300" valign="top">
            <td width="75%" bgcolor="Aqua">
                <insert-body/>
            </td>
            <td width="25%">
                <h1>Headlines</h1>
                (common to every page of the site)
            </td>
        </tr>
    </table>
</body>
</html>

The style sheet for processing this layout file, logic.xslt, becomes something like:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:variable name="page" select="/"/>
<xsl:variable name="layout" select="document('layout.xml')"/>

<xsl:template match="/">
    <xsl:apply-templates select="$layout/html"/>
</xsl:template>

<xsl:template match="insert-head">
    <!-- Get the head values here... -->
    <xsl:apply-templates select="$page/html/head/*"/>
</xsl:template>

<xsl:template match="insert-body">
    <!-- Get the body here -->
    <xsl:apply-templates select="$page/html/body/*"/>
</xsl:template>

<!-- Identity transformation -->
<xsl:template match="@*|*">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>

Isn't it simple? Existing style sheets can easily be converted to this technique (by just moving around the full blocks of logic like we've done above with our comments) and it does make a big difference in the way web sites are maintained.

The real "trick" here is the usage of predefined elements to switch between the template document and the XSLT transformation. Since the context node will be the matching element, you can use these elements to pass variable information to the XSLT transformation.
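
As a minimal sketch of that idea, suppose the designer writes a control element such as <insert-body class="main"/> in layout.xml. The class attribute and the wrapping div are assumptions made for illustration, not part of the layout shown above. The matching template can read the attribute from the context node and still pull the content from the stored $page tree:

```xml
<!-- Hypothetical variant of the insert-body template in logic.xslt.
     Assumes layout.xml contains: <insert-body class="main"/> -->
<xsl:template match="insert-body">
    <!-- The context node is the insert-body element of the layout tree,
         so @class is the value chosen by the designer -->
    <div class="{@class}">
        <!-- $page still points at the root of the source document -->
        <xsl:apply-templates select="$page/html/body/*"/>
    </div>
</xsl:template>
```

If the attribute is missing, this version writes an empty class attribute, so a real style sheet would probably test for it first.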

We now have a document (layout.xml) that is based on what your designers produce, and includes the complete layout of the page.

Note also the two variable declarations at the top of the style sheet, which store both the current XML document and the template file to make switching between the two trees easier. Storing the current document is necessary, since applying templates on the layout document changes the context, and "/" then designates the root of the layout tree in the corresponding templates--leaving no way to get back to the current document if you hadn't stored it.

In real life, you will probably keep some presentation in the style sheet, as repeating the templating process for every single XHTML element would be quite a challenge, but our (improved) situation is now something like:

source.xml + logic.xslt + layout.xml (read by logic.xslt) -> page.html
(Download source code here).

XML file (50% data + 50% presentation) + XML generic layout (100% presentation) + XSLT file (90% logic + 10% presentation) -> (X)HTML

Next Step: Taking Care of the Main Source Document

The problem with the above solution is that our XML source document still contains a mix of presentation and data. In almost every case we want to separate the two. How do we go about doing that?

We repeat the same technique with the XML source file (source.xml) as we did with the style sheet: we separate the content from the presentation in this file, and obtain an XML description of the layout for this particular page, separate from the description of the content. This mirrors our separation of the general layout of the site from the logic.

This kind of multi-level layout is very convenient when designing portal sites, where there is an aggregation of data from different sources in order to create the "body" part of the page. Using the main document source file as a layout allows us to keep most of the (normally dynamic) data in separate files or data sources where they belong.

The same "switching" technique can then be used with high flexibility. To include headlines you might, for instance, introduce the following element into one of your layout files:

<document-rss src="decid/channel.xml" max-size="140"
    max-items="10" class="highlight"/>

This imaginary tag says that you want to include an RSS document located at decid/channel.xml, cutting the headlines after their 140th character, keeping only the ten most recent, and using the CSS class "highlight." Tags like this can be dealt with relatively easily by non-programmers.
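
A template handling this tag could be sketched as follows. The sketch assumes an RSS 0.91-style file (rss/channel/item, each item with a title and a link), assumes that the items are already listed most recent first, and renders the headlines as an HTML list; none of these choices comes from the tag itself.

```xml
<!-- Hypothetical template for the imaginary document-rss element -->
<xsl:template match="document-rss">
    <!-- Read the limits now: inside xsl:for-each the context node is an
         item, and the attributes of document-rss are out of reach -->
    <xsl:variable name="maxItems" select="number(@max-items)"/>
    <xsl:variable name="maxSize" select="number(@max-size)"/>
    <ul class="{@class}">
        <!-- Keep only the first $maxItems items of each channel -->
        <xsl:for-each
            select="document(@src)/rss/channel/item[position() &lt;= $maxItems]">
            <li>
                <a href="{link}">
                    <!-- Cut the headline after $maxSize characters -->
                    <xsl:value-of select="substring(title, 1, $maxSize)"/>
                </a>
            </li>
        </xsl:for-each>
    </ul>
</xsl:template>
```

Because document() receives the src attribute node rather than a string, the relative URI should be resolved against the layout file that contains the tag, not against the style sheet. If max-items is omitted, the comparison is made against NaN and no headline is kept, so a default value would be worth adding in practice.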

In our example, this approach of creating elements that implement component functionality would give us a main source file, spec_layout.xml, that looks like this:

<?xml version="1.0" encoding="utf-8"?>
<!-- This file is describing the layout of the specific
     part of the page.-->
<html>
  <head>
    <title>My page</title>
    <meta name="title" content="My page" />
    <meta name="description" content="Welcome on my new page" />
  </head>
  <body>
    <insert-body-header href="content.xml"/>
    <table width="100%" border="1">
      <tr valign="top">
        <td height="250">
          <insert-first-part href="content.xml"/>
        </td>
        <td>
          <insert-second-part href="content.xml"/>
        </td>
      </tr>
    </table>
  </body>
</html>

In this example, the content for the "first part" and so on is fetched from another file, content.xml (which may be under the control of an editorial system, for instance), and included where the designer has used the instruction "insert-first-part." Our XSLT sheet, logic2.xslt, which contains the logic, has additional code to provide these page-specific functions.

The style sheet templates to perform these actions are straightforward to write and, most of the styling being defined in the layout.xml file, these templates will be almost "style free."
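
As a sketch only, since content.xml is not reproduced here, the additional templates might look like the following. The element names content, body-header, first-part, and second-part are assumptions about the structure of content.xml.

```xml
<!-- Sketch of page-specific templates for logic2.xslt.
     Assumes a hypothetical content.xml shaped like:
       <content>
         <body-header>Some header</body-header>
         <first-part>Some text and markup</first-part>
         <second-part>Some text and markup</second-part>
       </content> -->
<xsl:template match="insert-body-header">
    <xsl:apply-templates select="document(@href)/content/body-header/node()"/>
</xsl:template>

<xsl:template match="insert-first-part">
    <xsl:apply-templates select="document(@href)/content/first-part/node()"/>
</xsl:template>

<xsl:template match="insert-second-part">
    <xsl:apply-templates select="document(@href)/content/second-part/node()"/>
</xsl:template>
```

These templates contain no markup of their own. Any elements found in the content are copied by the identity template of the style sheet, and text is copied by the built-in rule for text nodes.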

Here's a summary of our final process (download source code here):

spec_layout.xml + content.xml + layout.xml + logic2.xslt -> page.html

XML specific layout (10% data + 90% presentation) + XML content files (10% presentation + 90% data) + XML generic layout (100% presentation) + XSLT file (90% logic + 10% presentation) -> (X)HTML.

We have separated our material into:

  • presentation, stored in generic and specific layout files usable by web designers' (X)HTML tools;
  • logic, stored in XSLT style sheets; and
  • content, stored in XML data sources.

Downsides

What we have lost in doing this is the "inheritance" feature of XSLT (where a specific style sheet can import and override templates from more generic style sheets), which can be very powerful when using layouts "derived" from other layouts.

We can work around this limitation though, using XML include or linking techniques, or at an XSLT level.
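
One possible reading of the include-based workaround, sketched under the assumption of a hypothetical include-layout control element (neither the element nor the file name below appears in the downloadable sources), is to let a layout pull in fragments of a more generic layout:

```xml
<!-- Hypothetical control element, written in a "derived" layout file as:
       <include-layout href="common-header.xml"/> -->
<xsl:template match="include-layout">
    <!-- Process the children of the root element of the referenced file
         like any other layout content; control elements found there are
         handled by their own templates -->
    <xsl:apply-templates select="document(@href)/*/node()"/>
</xsl:template>
```

This gives reuse rather than true overriding: a derived layout can share pieces of a generic one, but it cannot replace a single part of it the way an importing style sheet overrides a template.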

Other Techniques

Our approach is XSLT-centric (the XSLT transformation is responsible for including the right information at the right place), and a more pipe-oriented approach could be considered with the same kind of layout files.

At XML Europe 2000, John McKeown from Trinity College Dublin presented, in his session "SVG: putting XML in the picture", an implementation where SAX filters performed a similar merging of template and content before feeding the document directly into the SAX input flow to be processed by the XSLT processor. Although it requires more Java programming, this second approach is probably more scalable.

The separation between logic, presentation, and data is also a key driver for XML-based frameworks such as Cocoon.

Working Examples

<XML>fr and 4xt are both built using this XSLT templating technique. The full source code of 4xt can be downloaded (gzipped tar file) if you want to have a closer look at this working example.