XML.com: XML From the Inside Out

XSLT, Comments and Processing Instructions

September 13, 2000

XSLT includes built-in template rules to copy the text element content from a source tree to a result tree. Comments and processing instructions, however, get left out of this; unless you specify otherwise, an XSLT processor ignores them. You can specify otherwise, and this ability gives you access to the potentially valuable information they store.

You can also add comments and processing instructions to your output, which lets you store information that wouldn't otherwise fit into your document. Being able to both read and write them lets you copy comments and processing instructions from your source tree to your result tree, and even convert them to elements.

Comments in Your Output

Add comment nodes to your result tree with the xsl:comment element. Using it is simple: put the text of your comment between its start- and end-tags. For example, an XSLT processor that applies the following template to a poem element

<xsl:template match="poem">
  <html>
    <xsl:comment> Created by FabAutoDocGen release 3 </xsl:comment>
    <xsl:apply-templates/>
  </html>
</xsl:template>

will add the following comment after the html start-tag and before its contents in the result tree:

<!-- Created by FabAutoDocGen release 3 -->

Note the space after the xsl:comment start-tag and the one before its end-tag to keep the actual comment being output from bumping into the hyphens that start and end the comment.

Speaking of hyphens, a pair of hyphens within your output comment or a single hyphen at the end of your comment, as shown in the next example, are technically errors.

<xsl:template match="poem">
  <html>
    <xsl:comment> Created by FabAutoDocGen -- rel. 3 -</xsl:comment>
    <xsl:apply-templates/>
  </html>
</xsl:template>

It's easy enough for an XSLT processor to recover from this error. According to the XSLT spec, it can insert a space between two contiguous hyphens and a space after a hyphen that ends a comment to make the comment legal, like this:

<html>
<!-- Created by FabAutoDocGen - - rel. 3 - -->

While some processors may correct this for you, it's not a good idea to count on it; it's better to specify valid comment text as the content of your stylesheet's xsl:comment element.
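When the comment text comes from the source tree rather than from the stylesheet, you can't check it for hyphens by eye. One blunt way to keep it legal is to replace every hyphen before writing the comment. The following is only a sketch of that idea: the note element is a hypothetical source element, and swapping hyphens for underscores is an arbitrary choice that changes the text.

```xml
<xsl:template match="note">
  <!-- translate() replaces each hyphen with an underscore, so the
       comment text can contain neither a double hyphen nor a
       trailing hyphen -->
  <xsl:comment>
    <xsl:value-of select="translate(., '-', '_')"/>
  </xsl:comment>
</xsl:template>
```

Because the whitespace around the xsl:value-of element is stripped from the stylesheet, the comment contains only the translated text.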

Note that if an xsl:comment element is a top-level element -- that is, if it's a child of an xsl:stylesheet element -- an XSLT processor will ignore it. It must be a child of an element that can add something to the result tree. In the example above, the xsl:comment element is a child of an xsl:template element.

By putting a template rule's xsl:apply-templates or xsl:value-of elements between the xsl:comment start- and end-tags, a stylesheet can convert the source tree content represented by these elements into an output comment. For example, the following template rule converts a documentation element like the ones found in W3C Schemas into XML 1.0 comments in the output.

<xsl:template match="documentation">
  <xsl:comment><xsl:apply-templates/></xsl:comment>
</xsl:template>

It would convert this

<documentation>The following is a revision.</documentation>

into this:

<!--The following is a revision.-->

Reading and Using Comments from Your Source Tree

A stylesheet template with the comment() node test as the value of its match attribute matches every comment in the source tree. The following template copies comments to the result tree:

<xsl:template match="comment()">
  <xsl:comment><xsl:value-of select="."/></xsl:comment>
</xsl:template>

(There is a small chance that the comments won't even make it to the source tree. The XML parser isn't required to pass comments along to the application -- in this case, the XSLT processor -- so if you don't see them showing up, try using a different XML parser with your XSLT processor.)
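If the goal is to carry everything across unchanged, comments included, a common alternative is the identity transform. The stylesheet below is a minimal sketch of it rather than anything specific to this column's examples; more specific templates, such as the comment() rule above, can be added alongside it to override the copying for particular nodes.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

  <!-- Identity rule: node() matches elements, text, comments, and
       processing instructions; @* matches attributes -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```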

Once you've read comments from the source tree, you don't have to output them as comments. Wrapping the comment() template's xsl:value-of element with a literal result element (or putting it inside an xsl:element element) lets you convert comments into elements. For example, the first template in the following stylesheet

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

  <xsl:template match="comment()">
    <doc><xsl:value-of select="."/></doc>
  </xsl:template>

  <xsl:template match="verse">
   <p><xsl:apply-templates/></p>
  </xsl:template>

</xsl:stylesheet>

converts the comments in this document

<!-- Poem starts here. -->
<verse>Of Man's First Disobedience, and the Fruit</verse>
<!-- Poem ends here. -->

to the doc elements shown here:

<doc> Poem starts here. </doc>
<p>Of Man's First Disobedience, and the Fruit</p>
<doc> Poem ends here. </doc>
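The xsl:element alternative mentioned earlier would look something like the following sketch. For a fixed name such as doc it should produce the same result as the literal result element; it becomes more useful when the element name has to be computed, because the name attribute is treated as an attribute value template.

```xml
<xsl:template match="comment()">
  <!-- xsl:element builds the same doc element as the literal
       result element in the stylesheet above -->
  <xsl:element name="doc">
    <xsl:value-of select="."/>
  </xsl:element>
</xsl:template>
```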

Processing Instructions

The designation of a stylesheet to use with a particular XML or HTML document is the most popular use of XML processing instructions. It's just one example of their ability to pass along information that doesn't fit into a document's regular structure. Being able to read and write them with an XSLT stylesheet gives your application more power to communicate with other applications.

Outputting Processing Instructions

Add processing instruction nodes to your result tree with the xsl:processing-instruction element. Specify the processing instruction target (which, in the words of the XML spec, is "used to identify the application to which the instruction is directed") in a name attribute, and put any other contents of the processing instruction between the xsl:processing-instruction element's start- and end-tags.

Because an XML processing instruction ends with the two characters ?>, the content of your processing instruction cannot include a question mark immediately followed by a greater-than sign.

The following example

<xsl:template match="article">
  <xsl:processing-instruction name="xml-stylesheet">
    <xsl:text>href="headlines.css" type="text/css"</xsl:text>
  </xsl:processing-instruction>
  <html>
    <xsl:apply-templates/>
  </html>
</xsl:template>

will add this processing instruction

<?xml-stylesheet href="headlines.css" type="text/css">

before the html element added to the result tree when an XSLT processor finds an article element in the source tree.

There are two special things to note about this example:

  • The example would still work without the xsl:text element surrounding the new processing instruction's contents, but then the carriage returns on either side of that text would have been included in the output, splitting the processing instruction over three lines. (Carriage returns that are next to character data get included in the result tree; those that aren't, don't.) A processing instruction with carriage returns in it would still be perfectly valid.

  • The processing instruction added to the result tree ends with ">" and not "?>" like most XML processing instructions. The XSLT processor knows that the stylesheet is creating an HTML document because the result tree's document element is called "html", so it creates an HTML-style processing instruction. If the result tree's document element isn't "html" (and if you don't specifically tell it to create HTML-style output with an "html" value for an xsl:output element's method attribute) then the new processing instruction will end with "?>".
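To get the XML-style "?>" ending even though the document element is html, you can name the output method explicitly. The stylesheet below is a sketch that reuses the article template from above and adds only an xsl:output element; depending on the processor, the output may also begin with an XML declaration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">

  <!-- Ask for XML serialization explicitly instead of relying on
       the html default -->
  <xsl:output method="xml"/>

  <xsl:template match="article">
    <xsl:processing-instruction name="xml-stylesheet">
      <xsl:text>href="headlines.css" type="text/css"</xsl:text>
    </xsl:processing-instruction>
    <html>
      <xsl:apply-templates/>
    </html>
  </xsl:template>

</xsl:stylesheet>
```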

By using elements such as xsl:apply-templates and xsl:value-of between the xsl:processing-instruction element's start- and end-tags, you can insert the contents and attribute values of elements from the source tree inside a processing instruction being added to the result tree. For example, the following template rule

<xsl:template match="stylesheetFile">
  <xsl:processing-instruction name="xml-stylesheet">
       href="<xsl:value-of select='.'/>" 
       type="<xsl:value-of select='@type'/>"
  </xsl:processing-instruction>
</xsl:template>

will turn this stylesheetFile element

<stylesheetFile type="text/css">headlines.css</stylesheetFile>

into this processing instruction:

<?xml-stylesheet 
  href="headlines.css" type="text/css"
  ?>

It uses the contents of the matched stylesheetFile element node as the href parameter in the result tree's processing instruction and the value of the stylesheetFile element's type attribute as the processing instruction's type parameter. (The template doesn't wrap its content in a single xsl:text element to prevent line breaks in the resulting processing instruction because it can't: the content includes two xsl:value-of elements, and an xsl:text element cannot have any child elements. The resulting processing instruction is still perfectly legal XML.)
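If the line breaks do matter to you, one possible workaround is to give each literal piece of text its own xsl:text element and leave the xsl:value-of elements as siblings between them. The following is a sketch of that approach, not part of the original example:

```xml
<xsl:template match="stylesheetFile">
  <xsl:processing-instruction name="xml-stylesheet">
    <!-- Each literal fragment sits in its own xsl:text element; the
         whitespace between these elements is stripped -->
    <xsl:text>href="</xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>" type="</xsl:text>
    <xsl:value-of select="@type"/>
    <xsl:text>"</xsl:text>
  </xsl:processing-instruction>
</xsl:template>
```

Applied to the stylesheetFile element shown earlier, this should put the href and type parameters on a single line inside the processing instruction.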

As with xsl:comment elements, an xsl:processing-instruction element cannot be a top-level element -- if it's a child of an xsl:stylesheet element, an XSLT processor will ignore it. Like the one shown in the example above, it should be a child of an element (in this case, xsl:template) that can add nodes to the result tree.

Reading and Using Processing Instructions from Your Source Tree

An XSLT processor's default treatment of processing instructions in the source tree is to ignore them. Using the processing-instruction() node test, your stylesheet can find processing instructions and add them to the result tree. For example, this template copies all processing instructions to the output with no changes.

<xsl:template match="processing-instruction()">
  <xsl:copy/>
</xsl:template>
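Inside such a template, the name() function returns the target of the matched processing instruction, and xsl:value-of select="." returns the rest of its content. As a sketch, the following variation records both in an element instead of copying the processing instruction; the pi element and its target attribute are hypothetical names chosen for illustration.

```xml
<xsl:template match="processing-instruction()">
  <!-- name() gives the processing instruction target; the string
       value of the node is everything after the target -->
  <pi target="{name()}">
    <xsl:value-of select="."/>
  </pi>
</xsl:template>
```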

XSLT also lets you select processing instructions by the value of the processing instruction target that must begin each one. Together with XSLT's ability to pull out processing instruction content by using the xsl:value-of element, you can use this to convert processing instructions with specific processing instruction targets into their own elements.

For example, the following XML document excerpt has two processing instructions with different PI targets: xml-stylesheet and smellPlugIn.

<?xml-stylesheet href="headlines.css" type="text/css"?>
<verse>And hazard in the Glorious Enterprise</verse>
<?smellPlugIn scent="newCar" duration="12secs"?>

In addition to converting the verse element above to a p element, the following template rules will convert the xml-stylesheet processing instruction to a stylesheet element and the smellPlugIn processing instruction to a smellData element.

<xsl:template match="processing-instruction('xml-stylesheet')">
  <stylesheet><xsl:value-of select="."/></stylesheet>
</xsl:template>

<xsl:template match="processing-instruction('smellPlugIn')">
  <smellData><xsl:value-of select="."/></smellData>
</xsl:template>

<xsl:template match="verse">
  <p><xsl:apply-templates/></p>
</xsl:template>

The templates above, applied to the example input, add the following to the result tree:

<stylesheet>href="headlines.css" type="text/css"</stylesheet>
<p>And hazard in the Glorious Enterprise</p>
<smellData>scent="newCar" duration="12secs"</smellData>
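As far as XSLT is concerned, the content of a processing instruction is a single string, so parameters such as href are not available as attributes. If you need one of them on its own, XPath string functions can pull it out. The sketch below assumes that the parameter is written exactly as href="..." with double quotes and no spaces around the equals sign; the href attribute on the stylesheet element is an illustrative choice.

```xml
<xsl:template match="processing-instruction('xml-stylesheet')">
  <stylesheet>
    <!-- Take the text after href=" and keep everything up to the
         next double quote -->
    <xsl:attribute name="href">
      <xsl:value-of
        select="substring-before(substring-after(., 'href=&quot;'), '&quot;')"/>
    </xsl:attribute>
  </stylesheet>
</xsl:template>
```

For the xml-stylesheet processing instruction in the example input, this should yield a stylesheet element whose href attribute is headlines.css.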

Using the specialized XSLT elements and functions that we've looked at in this column, you can take full advantage of the flexibility that comments and processing instructions can add to your input and output documents. After all, sometimes XML documents can be more than elements and attributes.