XML.com: XML From the Inside Out

Self-Enhancing Stylesheets
by Manfred Knobloch

Outputting the XSLT Namespace

Before we look at the core functionality of our tag-trace template, we must think about namespace handling. Because we want to generate XSLT, we need to distinguish the XSLT elements that should be interpreted by the processor from those that are only written to the output tree. This is exactly what namespaces are made for. We will use the namespace prefix genxsl for the XSLT element names we want to output, with the following namespace declaration:

xmlns:genxsl="http://www.xml-web.de/genxsl"

This enables us to write <genxsl:template match="."> to our trace document without the processor treating it as an instruction. To have the genxsl prefix replaced with the xsl prefix in the output, we use the xsl:namespace-alias declaration:

<xsl:namespace-alias stylesheet-prefix="genxsl" result-prefix="xsl"/>

Exactly what happens as a result of this declaration is processor dependent (XSLT 2.0 §11.1.4), but Saxon produces <xsl:template match="."> from <genxsl:template match=".">, and that's what we want.
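The aliasing mechanism can be tried in isolation with a small stylesheet. The following is a minimal sketch, not part of nursery_sheet.xslt; the indent setting and the single generated template are assumptions chosen for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:genxsl="http://www.xml-web.de/genxsl">

  <!-- Elements written with the genxsl prefix end up in the XSLT namespace -->
  <xsl:namespace-alias stylesheet-prefix="genxsl" result-prefix="xsl"/>

  <!-- Assumption: indented XML output, only to make the result readable -->
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- Literal result elements: copied to the output, not executed here -->
    <genxsl:stylesheet version="2.0">
      <genxsl:template match="/">
        <genxsl:apply-templates/>
      </genxsl:template>
    </genxsl:stylesheet>
  </xsl:template>

</xsl:stylesheet>
```

Run against any XML input, this should write a small stylesheet whose elements are in the XSLT namespace. Whether they are serialized with the xsl prefix depends on the processor, as noted above.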

Now let's use some new XSLT 2.0 features to collect and compare the tags occurring in our XML input document with the names we keep in the variable $handled-tags.

The tag-trace Template

The main task inside the tag-trace template is to collect all tag names that can be found in the XML input document. To get a list of unique names we use the new xsl:for-each-group element. We select all element nodes by passing the expression '//*' to the select attribute. Then we use the group-by attribute to divide the nodes into groups, each a sequence of elements with the same name.

<xsl:result-document href="not_processed.xml" format="nursery">
  <genxsl:stylesheet version="2.0">
    <!-- get the tag names of the input file -->
    <xsl:for-each-group select="//*" group-by="name()">
      <xsl:sort select="name()" case-order="lower-first"/>
      <!-- keep name unique -->
      <xsl:variable name="cname" select="name(current-group()[1])"/>
      <xsl:if test="not(contains($handled-tags, $cname))">
        <!-- write template for name found -->
        <genxsl:template match="{$cname}">
          <!-- attribute code added later -->
          <genxsl:apply-templates/>
        </genxsl:template>
      </xsl:if>
    </xsl:for-each-group>
  </genxsl:stylesheet>
</xsl:result-document>

Inside the for-each-group element the current-group() function accesses the sequence currently processed. To achieve uniqueness we explicitly take the first item of the sequence and keep its name in the variable $cname. Now it is easy to test whether the current name is contained in the list of handled tags.

<xsl:if test="not(contains($handled-tags, $cname))">

If the condition is true, a template fragment matching the current name is created by <genxsl:template match="{$cname}">.
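One thing to keep in mind is that contains() performs a substring test, so an unhandled name such as para would be treated as handled if $handled-tags contains paragraph. The sketch below shows one possible whole-name comparison. It assumes that $handled-tags is a single string of whitespace-separated names; the sample value and the plain-text output are hypothetical and only serve to make the example runnable on its own.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Assumption: handled names kept as one whitespace-separated string -->
  <xsl:variable name="handled-tags" select="'glossary paragraph title'"/>

  <xsl:template match="/">
    <xsl:for-each-group select="//*" group-by="name()">
      <xsl:sort select="name()"/>
      <!-- The grouping key is the name shared by all items of the group -->
      <xsl:variable name="cname" select="current-grouping-key()"/>
      <!-- Compare against whole tokens, so 'para' is not hidden by 'paragraph' -->
      <xsl:if test="not($cname = tokenize($handled-tags, '\s+'))">
        <xsl:value-of select="$cname"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

This variant prints one unhandled element name per line instead of generating templates. It only helps when the handled entries are simple element names; match patterns such as glossary/term would need different treatment.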

Something like this is generated for each tag name:

<xsl:template match="para">
  <xsl:apply-templates/>
</xsl:template>

Now let's add some information about possible attributes and generate an xsl:value-of instruction for each one inside the template definition. We can use the same logic as with the tag names.

<xsl:for-each-group select="//*[name() = $cname]/@*" 
                  group-by="name()">
  <xsl:sort select="name()"/>
  <genxsl:value-of select="{concat('@',name(current-group()[1]))}"/>
</xsl:for-each-group>

To get all possible attributes of a specific element we select all attributes from all elements with the same name, using the XPath expression "//*[name() = $cname]/@*". As we did with the element names, the attribute names are grouped and sorted by name, and an xsl:value-of instruction is created for each one.

Note that the select attribute of <genxsl:value-of select...> belongs to an element that is not in the XSLT namespace, so the XSLT processor does not evaluate it as an expression. That's the reason why we have to tell the processor explicitly to evaluate the concat() function here. This is done by the attribute value template (AVT) inside the curly brackets. If we do not use an AVT here, the complete string 'concat('@',name(current-group()[1]))' will appear in the output tree, because the processor copies the attribute value of a literal result element as it stands. The nested processing looks like this (line breaks are for formatting reasons):

<xsl:result-document href="not_processed.xml" format="nursery">
  <genxsl:stylesheet version="2.0">
    <!-- get the tag names of the input file -->
    <xsl:for-each-group select="//*" group-by="name()">
      <xsl:sort select="name()" case-order="lower-first"/>
      <!-- keep name unique -->
      <xsl:variable name="cname" select="name(current-group()[1])"/>
      <xsl:if test="not(contains($handled-tags, $cname))">
        <!-- write template for name found -->
        <genxsl:template match="{$cname}">
          <xsl:for-each-group
            select="//*[name() = $cname]/@*"
            group-by="name()">
            <xsl:sort select="name()"/>
            <genxsl:value-of
              select="{concat('@',name(current-group()[1]))}"/>
          </xsl:for-each-group>
          <genxsl:apply-templates/>
        </genxsl:template>
      </xsl:if>
    </xsl:for-each-group>
  </genxsl:stylesheet>
</xsl:result-document>
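To make the role of the attribute value template concrete, the fragment below contrasts the two spellings and shows the kind of template that results. It is an illustrative fragment rather than a complete stylesheet, and the element name term with its attributes id and lang is hypothetical; the actual names depend on the input document.

```xml
<!-- With an AVT: the expression in braces is evaluated while the
     trace stylesheet is being generated. For an attribute named id
     this writes <xsl:value-of select="@id"/> to the output. -->
<genxsl:value-of select="{concat('@', name(current-group()[1]))}"/>

<!-- Without the braces the attribute value is copied literally, so
     the generated stylesheet would contain the unevaluated expression:
     <xsl:value-of select="concat('@', name(current-group()[1]))"/> -->
<genxsl:value-of select="concat('@', name(current-group()[1]))"/>

<!-- Generated template for a hypothetical element term that occurs
     with the attributes id and lang somewhere in the input -->
<xsl:template match="term">
  <xsl:value-of select="@id"/>
  <xsl:value-of select="@lang"/>
  <xsl:apply-templates/>
</xsl:template>
```

Because the attribute groups are sorted by name, the generated xsl:value-of instructions appear in alphabetical order of the attribute names, regardless of the order in which the attributes occur in the input.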

You will find the code in nursery_sheet.xslt a bit longer than what is discussed here. It contains some additional features: checking the processor version, writing out comments to make the target file more readable, and a user-defined function already described in my previous article.

Usage Notes

These stylesheets can be invoked from a command line window. You can call saxon7.jar directly:

java -jar saxon7.jar -o test.html glossary.xml new_sheet.xslt

If you don't like hard-coding the name of the calling stylesheet into nursery_sheet.xslt, you can pass the name in the analyze parameter:

java -jar saxon7.jar -o test.html glossary.xml new_sheet.xslt "analyze=new_sheet.xslt"
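How such a parameter might be declared and used is sketched below. This is an assumption about the stylesheet's internals, not a copy of nursery_sheet.xslt: the default value, the construction of $handled-tags with string-join(), and the plain-text output are illustrative only.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Global parameter; the default is used when no value is passed
       on the command line -->
  <xsl:param name="analyze" select="'new_sheet.xslt'"/>

  <!-- Assumed construction: the match patterns of all templates in the
       analyzed stylesheet, joined into one space-separated string.
       A relative file name is resolved against this stylesheet's location. -->
  <xsl:variable name="handled-tags"
      select="string-join(document($analyze)//xsl:template/@match, ' ')"/>

  <xsl:template match="/">
    <xsl:value-of select="$handled-tags"/>
  </xsl:template>

</xsl:stylesheet>
```

Passing "analyze=new_sheet.xslt" on the command line, as in the call above, would override the default value of the parameter.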

It is easy to edit nursery_sheet.xslt so that it can be called once on an arbitrary XML input file to generate template fragments for all the tags it contains. You can find an older version of this approach on my website (generate-xslt.zip).

Conclusion

This stylesheet is neither a complete solution for developing stylesheets, nor does it replace the need for good development environments. You have to decide on your own if a specific template fragment is useful in your document processing or whether a simple <xsl:value-of select='tagname'/> would be enough. But this solution keeps us informed about the tags of an unknown document, especially when there is no DTD available. And it shows some of the promising features of XSLT 2.0.


