XML.com: XML From the Inside Out

A Class Act

February 02, 2000

"Style Matters" is a new column on XML.com, covering topics related to XML style and transformation languages. Every two weeks Didier Martin will be tackling a different subject, covering XSLT, CSS, and perhaps even a smattering of DSSSL. To kick off, Didier demonstrates how to preserve the semantics of an XML document when converting it to HTML.

--E.D.

Creating HTML Documents With a Zest of XML

Example Files

catalog.xml
catalog.xsl
catalog.htm
catalog.css (must choose "save as" to view)
catalog2.xsl
catalog2.htm
reverse.xsl

If you transform an XML document into HTML with XSLT on the client side, as is possible with Microsoft's IE5 browser, the user has no access to the resulting HTML document. Instead, the "View Source" option displays the original XML source, so the original semantic information remains visible. However, if the transformation is carried out on the server side, the browser displays the HTML document sent by the server, and that HTML is the only "source" the user can access.

Most of the time, when a document is transformed from XML to HTML using an XSLT style sheet, the semantics of the original document are lost. It is very hard to know that the <P> element was in fact a <news-section> element in the original XML document. Is there any way to preserve semantic information from the original XML document? There is. (Note that the solution I propose may not be suitable for all kinds of XML documents.)

Collecting Styles in Classes

Cascading style sheets (CSS) allow you to specify a set of properties to be applied to any kind of element in your document. These properties can be collected in a class. For instance, consider the following class:

.topic {
    font-family: Times Roman;
    font-size: 12pt;
    color: #000000;
}

This class may be applied to different elements as long as these elements include a class attribute with the value "topic," as demonstrated in the following HTML document fragment.

<p class="topic">this is a paragraph rendered with
the topic class</p>

<div class="topic">this is a div rendered with a topic 
class</div>

A class rule is not tied to any particular HTML element. It can be applied to any HTML element where the style rules make sense. Is the sole use of the class attribute for CSS styling? No, it can be used for other purposes, as we'll see in a moment.

And the Trick is...

Use the class attribute to link the transformed element in HTML to the original XML element.

To illustrate this most efficiently, an example is in order. Our original XML document to be transformed into HTML is a list of items that an XML server returned from a request to an e-commerce catalog.

<?xml version="1.0"?>
<catalog>
  <item>
    <product-number>123-46-465</product-number>
    <description>Our best mountain bike</description>
    <price>4 300.00$</price>
  </item>
  <item>
    <product-number>4635-54-348</product-number>
    <description>Coyote's bike: faster than road runner</description>
    <price>6 500.00$</price>
  </item>
</catalog>

On the server side, the XML document is transformed with the following XSLT stylesheet:


<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- In XSLT 1.0, indentation is requested on xsl:output,
       not on xsl:stylesheet -->
  <xsl:output method="html" indent="yes"/>

  <!-- The document root becomes the HTML page; the catalog becomes a table -->
  <xsl:template match="/">
    <html>
      <head>
        <title>The Coyote's Bike shop</title>
      </head>
      <body>
        <table border="1" cellspacing="0" cellpadding="5" class="catalog">
          <thead>
            <tr>
              <th>product-number</th>
              <th>description</th>
              <th>price</th>
            </tr>
          </thead>
          <tbody>
            <xsl:apply-templates select="catalog/item"/>
          </tbody>
        </table>
      </body>
    </html>
  </xsl:template>

  <!-- Each item becomes a row; every class value repeats the
       name of the XML element it came from -->
  <xsl:template match="item">
    <tr class="item">
      <td class="product-number"><xsl:value-of select="product-number"/></td>
      <td class="description"><xsl:value-of select="description"/></td>
      <td class="price"><xsl:value-of select="price"/></td>
    </tr>
  </xsl:template>
</xsl:stylesheet>

The resultant HTML document is here: catalog.htm. If you take a close look at this document, which is the result of the XSLT transformation that occurred on the server, you'll notice that we transformed each XML element into a particular HTML object:

  • the <catalog> element into a table
  • the <item> element into a row
  • the <product-number>, <description> and <price> elements into table cells

Also notice that we have not included any CSS stylesheet link or style element in the HTML document. We used the class attribute simply to relate the XML elements to the rendition objects. Figure 1 shows the correspondences between the XML and HTML documents.

Figure 1: Correspondence of XML source to output HTML document

In our example, we used HTML as the result language. The XSLT transformation did the translation from our domain language to the result language. Of course, HTML itself can convey certain semantics, but they are totally different from those conveyed by our catalog domain language. HTML deals with paragraphs, headers, tables, objects, etc., while our domain language deals with catalog, item, and product-number objects. By including the class attribute, we still keep some of the original semantics. Using this trick, it is possible to relate each displayed object to the corresponding object of our domain language.
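The class values do not have to be typed by hand for every element. As a minimal sketch, and not the stylesheet used for the example files, the two templates below derive each class value from the name of the source element with the XPath name() function inside an attribute value template. They assume the same catalog structure as above and could stand in for the item template of the server-side stylesheet.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Each item becomes a row whose class is the element name ("item") -->
  <xsl:template match="item">
    <tr class="{name()}">
      <xsl:apply-templates select="*"/>
    </tr>
  </xsl:template>

  <!-- Each child of an item becomes a cell whose class is the name
       of the source element (product-number, description, price) -->
  <xsl:template match="item/*">
    <td class="{name()}">
      <xsl:value-of select="."/>
    </td>
  </xsl:template>

</xsl:stylesheet>
```

The trade-off is that cells now appear in source document order rather than in an order fixed by the stylesheet, so this variant only fits documents whose children always come in the same sequence as the table header.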

This technique also has a secondary advantage. For instance, by inserting the following construct in the HTML header, you can modify the style of the elements containing the class attribute.


<head> 
<title>The Coyote's Bike shop</title>
<link href="catalog.css" type="text/css" rel="stylesheet">
</head>
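Since the HTML is generated on the server, the link would normally be added in the stylesheet rather than in the output. The following is a sketch of how the head might look inside the root template, assuming the html output method; the body is left as a placeholder comment because it does not change.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <head>
        <title>The Coyote's Bike shop</title>
        <!-- Written as an empty element so that the stylesheet itself
             stays well-formed XML; the html output method serializes
             it without an end tag -->
        <link href="catalog.css" type="text/css" rel="stylesheet"/>
      </head>
      <body>
        <!-- table markup exactly as in the stylesheet shown earlier -->
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```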

Round-tripping

Conversely, can we recreate the original XML document from the HTML document? You bet we can, as long as we transform our XML document into XHTML rather than HTML. XHTML is a reformulation of HTML 4.0 in XML syntax, so, unlike ordinary HTML, it is well-formed XML that an XSLT processor can read as input. This XML-XHTML-XML transformation is an example of "round-tripping."
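The catalog2.xsl file from the example list is not reproduced in this article, so what follows is only a hypothetical sketch of a stylesheet that emits XHTML. The choices that differ from the HTML version are assumptions of mine: the xml output method, the XHTML 1.0 Transitional doctype, and the default XHTML namespace declared on the stylesheet element. The table header is omitted for brevity.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns="http://www.w3.org/1999/xhtml">
  <!-- The xml output method guarantees a well-formed result -->
  <xsl:output method="xml" indent="yes"
      doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
      doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>

  <xsl:template match="/">
    <html>
      <head>
        <title>The Coyote's Bike shop</title>
      </head>
      <body>
        <table border="1" cellspacing="0" cellpadding="5" class="catalog">
          <tbody>
            <xsl:apply-templates select="catalog/item"/>
          </tbody>
        </table>
      </body>
    </html>
  </xsl:template>

  <!-- Same class-per-element mapping as in the HTML version -->
  <xsl:template match="item">
    <tr class="item">
      <td class="product-number"><xsl:value-of select="product-number"/></td>
      <td class="description"><xsl:value-of select="description"/></td>
      <td class="price"><xsl:value-of select="price"/></td>
    </tr>
  </xsl:template>
</xsl:stylesheet>
```

The default namespace only affects the literal result elements. Unprefixed names in match patterns and select expressions still refer to elements in no namespace, so the source catalog is matched exactly as before.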

The catalog.xml document should now be transformed with the catalog2.xsl stylesheet (a revision of our original stylesheet that now produces XHTML). We obtain, this time, an XHTML document that can still be rendered in most browsers (version 4 and up). To recreate the original XML document, we apply the reverse.xsl stylesheet to the XHTML document. The reverse.xsl stylesheet matches the class attributes from the XHTML document, and reconstitutes the original XML document. Here's an example rule that creates an item element:

<xsl:template match="*[@class='item']">
  <item>
    <xsl:apply-templates select=".//*[@class='product-number']"/>
    <xsl:apply-templates select=".//*[@class='description']"/>
    <xsl:apply-templates select=".//*[@class='price']"/>
  </item>
</xsl:template>
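The item rule needs companions for the catalog itself and for the three leaf elements. The complete reverse.xsl is not shown here, so the stylesheet below is a sketch of how the remaining rules might look, not the actual file. It assumes that the class values are exactly the original element names, which lets xsl:element rebuild each leaf from its class attribute.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Jump straight to the element marked as the catalog and
       ignore the rest of the page (head, title, table header) -->
  <xsl:template match="/">
    <xsl:apply-templates select="//*[@class='catalog']"/>
  </xsl:template>

  <xsl:template match="*[@class='catalog']">
    <catalog>
      <xsl:apply-templates select=".//*[@class='item']"/>
    </catalog>
  </xsl:template>

  <!-- The item rule, repeated from the article -->
  <xsl:template match="*[@class='item']">
    <item>
      <xsl:apply-templates select=".//*[@class='product-number']"/>
      <xsl:apply-templates select=".//*[@class='description']"/>
      <xsl:apply-templates select=".//*[@class='price']"/>
    </item>
  </xsl:template>

  <!-- Leaf cells: create an element named after the class value
       and copy the text of the cell into it -->
  <xsl:template match="*[@class='product-number' or
                         @class='description' or
                         @class='price']">
    <xsl:element name="{@class}">
      <xsl:value-of select="."/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

Because every pattern starts with the * wildcard, the rules match on the class attribute alone and work whether or not the XHTML elements carry a namespace.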

That's it—we closed the loop!