Automating Stylesheet Creation
Since the early days of XSLT, many have asked whether it was possible to automate the creation of XSLT stylesheets. The general idea of filling out a form or dragging some icons around, then clicking a button and seeing a productive stylesheet generated from your input has always appealed to people. However, the problem of generating working XSLT syntax from the result of someone clicking on pull-down menus and radio buttons has not attracted many takers.
Breaking the problem down into two parts, though, makes it much easier. Step one, the generation of a file like the following from XForms, Tcl/TK, scripting plus an HTML forms interface, or Visual Basic, sounds much more tractable:
<conversion>
  <delete>
    <name>foobarNum</name>
    <name>weather</name>
  </delete>
  <strip>
    <name>email</name>
  </strip>
  <reorder>
    <parent>purchaseOrder</parent>
    <children>
      <name>contact</name>
      <name>shippingAddress</name>
      <name>items</name>
    </children>
  </reorder>
  <wrap>
    <name>shippingAddress</name>
    <wrapper>addresses</wrapper>
  </wrap>
  <attribute2element>
    <name>purchaseOrder/@date</name>
    <elName>orderDate</elName>
  </attribute2element>
  <custom>
    <name>item/@img</name>
  </custom>
</conversion>
If anyone takes a successful crack at developing a graphical user interface that creates such a file (and, ideally, reads one, lets the user edit it, and writes out the edited version), let me know and I'll mention it in a future column. Meanwhile, I'll show you step two of the XSLT-generation app: how one stylesheet can create another, using the file above (which I've named conversionSpecs.xml) as the source document for the generating stylesheet. The tasks that conversionSpecs.xml describes are fairly general-purpose XML transformations and certainly won't cover all your needs, but the stylesheet-generation stylesheet below provides a model for other tasks that you might want to list in a file like this one and then convert to working XSLT code.
The key trick when writing stylesheets that generate other
stylesheets is the use of the xsl:namespace-alias element,
which I described in further detail in the "Using XSLT to Output XSLT"
section of an earlier column titled "Namespaces and XSLT Stylesheets."
The following shows the beginning of my
stylesheet-generation stylesheet. (The complete stylesheet and sample
files are available here.)
The template rules for the generated stylesheet will have the
namespace prefix "wh" in the generating stylesheet, where they're
mapped to the dummy namespace
"whatever." The xsl:namespace-alias element below tells the
XSLT processor to map the "wh" elements to the same namespace as the
"xsl" prefix in the generated stylesheet. Because this is the
http://www.w3.org/1999/XSL/Transform namespace, this will make the
"wh" elements in the generated stylesheet proper XSLT
instructions.
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:wh="whatever"
    xmlns:exsl="http://exslt.org/common"
    extension-element-prefixes="exsl"
    version="1.0">
  <xsl:strip-space elements="*"/>
  <xsl:output indent="yes"/>
  <xsl:namespace-alias result-prefix="xsl"
                       stylesheet-prefix="wh"/>
Other parts of the stylesheet setup above include the
declaration of a namespace used for a stylesheet extension element,
which I'll cover below, and xsl:strip-space
and xsl:output elements to make the generated stylesheet a
little easier to read, because generated code is often tough on the
eyes.
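To see the aliasing in isolation, here is a minimal sketch of a complete generating stylesheet. It is not part of the sample files, and the "Hello" text is only a placeholder. Run against any XML document, it should produce a one-template stylesheet whose elements are in the XSLT namespace. Whether the processor writes them with the "wh" or the "xsl" prefix can vary, and either is fine as long as the namespace URI is the XSLT one.

```xml
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:wh="whatever"
    version="1.0">
  <xsl:output indent="yes"/>
  <!-- Map the dummy "wh" namespace to the XSLT namespace in the result. -->
  <xsl:namespace-alias stylesheet-prefix="wh" result-prefix="xsl"/>
  <xsl:template match="/">
    <!-- These literal result elements become XSLT instructions
         in the generated stylesheet. -->
    <wh:stylesheet version="1.0">
      <wh:template match="/">
        <wh:text>Hello</wh:text>
      </wh:template>
    </wh:stylesheet>
  </xsl:template>
</xsl:stylesheet>
```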
The generating stylesheet's first template rule creates a
template rule in the generated stylesheet that will delete all
elements named by name elements in the conversionSpecs.xml file's delete element.
With the version of conversionSpecs.xml shown above, we want it to generate this:
<wh:template match="foobarNum|weather"/>
The template rule that creates this uses
the xsl:element element to create an element named
"wh:template" and an xsl:attribute element to create
a match attribute for wh:template. It creates the
attribute's value by iterating through the delete
element's name children and adding each name
element's value and a pipe delimiter if another name is
coming up.
<!-- Create template rule to delete elements. -->
<xsl:template match="delete">
  <xsl:element name="wh:template">
    <xsl:attribute name="match">
      <xsl:for-each select="name">
        <xsl:value-of select="."/>
        <xsl:if test="following-sibling::name">
          <xsl:text>|</xsl:text>
        </xsl:if>
      </xsl:for-each>
    </xsl:attribute>
    <!-- Created element is empty. -->
  </xsl:element>
</xsl:template>
Once the attribute is created, nothing else is necessary,
because as the sample wh:template element above shows, we
want to create an empty template rule that does nothing when it finds
elements described by the match condition.
The basic pattern of this template rule works for other stylesheet tasks in which one template rule in the generated stylesheet handles multiple elements. The generating stylesheet's next template rule, which creates a template rule that will output the contents of any matched elements without their tags, resembles the one above. The only new part is the wh:apply-templates element near the end.
<!-- Create template rule to strip tags. -->
<xsl:template match="strip">
  <xsl:element name="wh:template">
    <xsl:attribute name="match">
      <xsl:for-each select="name">
        <xsl:value-of select="."/>
        <xsl:if test="following-sibling::name">
          <xsl:text>|</xsl:text>
        </xsl:if>
      </xsl:for-each>
    </xsl:attribute>
    <wh:apply-templates/>
  </xsl:element>
</xsl:template>
Once this template rule finishes the match condition
listing the relevant elements, it adds a wh:apply-templates
element to the result, creating an instruction for the generated
stylesheet that looks like this (remember, conversionSpecs.xml only
had one name child of its strip element):
<wh:template match="email">
  <wh:apply-templates/>
</wh:template>
The two template rules we've seen so far from the generating stylesheet follow a common pattern that will be useful in any stylesheet that generates other stylesheets. The next three template rules follow another pattern that is handy when you need to generate a separate template rule for each element named in the conversion instructions as the object of a specific task.
The template rule below creates a template rule in the
generated stylesheet for each reorder element in the
conversion instructions. The generated template rule will name the
element to reorder in its match condition, copy all the attributes
verbatim, and then use a series of wh:apply-templates
instructions to output the parent element's children in the order
specified in the name children of the reorder
element's children element:
<!-- Create a template rule for each element
     that needs to be reordered. -->
<xsl:template match="reorder">
  <wh:template match="{parent}">
    <wh:copy>
      <wh:apply-templates select="@*"/>
      <xsl:for-each select="children/name">
        <wh:apply-templates select="{.}"/>
      </xsl:for-each>
    </wh:copy>
  </wh:template>
</xsl:template>
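For the reorder element in conversionSpecs.xml, working through the template rule above by hand suggests generated output along these lines. The namespace declaration is omitted, as in the earlier output examples, and the prefix and indentation you actually see will depend on your processor.

```xml
<wh:template match="purchaseOrder">
  <wh:copy>
    <!-- Attributes first, then the children in the requested order. -->
    <wh:apply-templates select="@*"/>
    <wh:apply-templates select="contact"/>
    <wh:apply-templates select="shippingAddress"/>
    <wh:apply-templates select="items"/>
  </wh:copy>
</wh:template>
```

Note that any purchaseOrder children not listed in the children element are never selected, so they would not appear in the result.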
The second of these three template rules, each of which creates a
generated template rule for an element that needs its own match condition,
finds each wrap element in the conversion instructions and uses its
two child elements to create a generated template rule that will wrap
the named element, when found, in an element named by
the wrapper element:
<!-- Create a template rule for each element that
     needs to be wrapped in another element. -->
<xsl:template match="wrap">
  <wh:template match="{name}">
    <xsl:element name="{wrapper}">
      <wh:copy>
        <wh:apply-templates select="@*|node()"/>
      </wh:copy>
    </xsl:element>
  </wh:template>
</xsl:template>
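With the wrap element from conversionSpecs.xml, the rule above should generate something like the following (worked out by hand, namespace declaration omitted). Because xsl:element runs in the generating stylesheet, the wrapper shows up in the generated stylesheet as an ordinary literal result element.

```xml
<wh:template match="shippingAddress">
  <!-- Literal result element created at generation time. -->
  <addresses>
    <wh:copy>
      <wh:apply-templates select="@*|node()"/>
    </wh:copy>
  </addresses>
</wh:template>
```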
The last of these three template rules creates a template
rule that will convert attributes to subelements when it finds
an attribute2element element in the conversion
instructions:
<!-- Create a template rule for each attribute that
     gets converted to a subelement. -->
<xsl:template match="attribute2element">
  <wh:template match="{name}">
    <wh:element name="{elName}">
      <wh:value-of select="."/>
    </wh:element>
  </wh:template>
</xsl:template>
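For the attribute2element entry in conversionSpecs.xml, the generated template rule should look roughly like this (again worked out by hand, with the namespace declaration omitted). The curly-brace expressions are evaluated by the generating stylesheet, so the generated rule contains the plain strings.

```xml
<wh:template match="purchaseOrder/@date">
  <!-- The attribute's value becomes the content of a new element. -->
  <wh:element name="orderDate">
    <wh:value-of select="."/>
  </wh:element>
</wh:template>
```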
The generating stylesheet's next template rule tells the
XSLT processor not to apply any template rules to name
children of custom elements, because the generating
stylesheet's final template rule will grab them when it needs
them.
<xsl:template match="custom/name"/>
The last template rule is the setup one that gets
triggered upon finding the root of the conversion instructions
document. It creates the document element of the generated stylesheet
(the wh:stylesheet element), calls any other necessary
template rules in the generating stylesheet, and then does two
things. First, it adds a default template rule to the generated
stylesheet that will copy any nodes that don't have other template
rules specified for them. If your generated stylesheet will copy most
of the nodes it finds without changing them, this is a good default
template rule to add to your generated stylesheet. But you may want
your generated stylesheet to use a different default template rule. If
so, this is a good place for it. Second, if the conversion instructions
include a custom element, it generates a separate included stylesheet
for hand-written custom code.
<xsl:template match="/">
  <wh:stylesheet version="1.0">
    <xsl:apply-templates/>
    <!-- Default template rule in generated stylesheet. -->
    <wh:template match="@*|node()">
      <wh:copy>
        <wh:apply-templates select="@*|node()"/>
      </wh:copy>
    </wh:template>
    <!-- If needed, generate an included
         stylesheet for custom code. -->
    <xsl:if test="conversion/custom">
      <wh:include href="custElements.xsl"/>
      <exsl:document href="custElements.xsl" indent="yes">
        <wh:stylesheet version="1.0">
          <xsl:for-each select="conversion/custom/name">
            <wh:template match="{.}">
              <xsl:comment> Add custom code here </xsl:comment>
            </wh:template>
          </xsl:for-each>
        </wh:stylesheet>
      </exsl:document>
    </xsl:if>
  </wh:stylesheet>
</xsl:template>
The most complex job of the generating stylesheet's final template rule is the generation of template rules for elements and other source nodes that require custom handling. You're lucky if your stylesheet generation system can automatically generate 90 percent of what you need, but you still need to account for the other 10 percent.
The conversionSpecs.xml file has a custom element
with only one name child to indicate a source tree node that
requires this special handling: img attributes
of item elements. The generating stylesheet generates a
separate stylesheet named custElements.xsl, where the custom code for
such nodes can be written by hand to process any nodes named
in conversion/custom elements. The generating stylesheet also
adds a wh:include instruction to the main generated
stylesheet so that an XSLT processor that executes it uses the
template rules from the custom code as well. (Once you've added your
customized code to custElements.xsl, keep a copy in a separate
directory, because this generating stylesheet overwrites any existing
custElements.xsl stylesheets each time it's run.)
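Worked out by hand from the conversionSpecs.xml shown earlier, the generated custElements.xsl should look roughly like the following. The prefix and the exact namespace declarations written on the root element may differ from processor to processor.

```xml
<wh:stylesheet xmlns:wh="http://www.w3.org/1999/XSL/Transform"
               version="1.0">
  <wh:template match="item/@img">
    <!-- Add custom code here -->
  </wh:template>
</wh:stylesheet>
```

Until you replace the comment with real logic, this rule overrides the default copying rule for img attributes of item elements and outputs nothing for them.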
The generating stylesheet creates custElements.xsl using
the exsl:document extension element, one of the EXSLT
extensions to the XSLT 1.0 specification that I recently described in the
column titled "Extending XSLT with EXSLT." (Libxslt
supports exsl:document, but for Saxon, use the XSLT
2.0 xsl:result-document instruction instead.) Inside
this exsl:document element is a wh:stylesheet
element that does the basic setup for the custElements.xsl stylesheet
being created. That wh:stylesheet element will have
a wh:template element for each custom/name element
found in the document. To prevent these template rules from showing up
as single-tag empty elements in the custElements.xsl stylesheet,
an xsl:comment element creates a stub to be replaced by the
actual XSLT logic for those elements.
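For Saxon or another XSLT 2.0 processor, the custom-code step might be sketched as follows. This is an illustration under assumptions, not the column's tested code: it isolates the step in a standalone stylesheet, declares version 2.0 so that xsl:result-document is available, and drops the EXSLT namespace because it is no longer needed.

```xml
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:wh="whatever"
    version="2.0">
  <xsl:namespace-alias stylesheet-prefix="wh" result-prefix="xsl"/>
  <xsl:template match="/">
    <xsl:if test="conversion/custom">
      <!-- XSLT 2.0 replacement for exsl:document: writes a
           secondary result document named custElements.xsl. -->
      <xsl:result-document href="custElements.xsl" indent="yes">
        <wh:stylesheet version="1.0">
          <xsl:for-each select="conversion/custom/name">
            <wh:template match="{.}">
              <xsl:comment> Add custom code here </xsl:comment>
            </wh:template>
          </xsl:for-each>
        </wh:stylesheet>
      </xsl:result-document>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

The href value is resolved relative to the location of the principal output, so where custElements.xsl lands depends on how you invoke the processor.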
If all this talk of generating/generated stylesheets is confusing, an overview and diagram of the script that runs them all should make it clearer. (In the sample documents zip file, there is a gentest.bat driver file for Windows machines and a gentest.sh one for Linux machines. Both assume that the free libxslt command-line utility xsltproc is in your path.) The driver script has two lines, not counting the line that shows you the result:
1. generating.xsl reads conversionSpecs.xml and creates the generated.xsl and custElements.xsl stylesheets based on the information found in conversionSpecs.xml.
2. The generated stylesheet is tested: generated.xsl is applied to sampleDoc.xml, and the result is stored in sampleOut.xml. It deletes the foobarNum and weather elements, strips the tags from any email elements, and executes the other tasks originally listed in conversionSpecs.xml.
How does all this reduce the workload in your office? In the ideal scenario, when a new content type shows up, someone who doesn't necessarily know XSLT creates a new version of conversionSpecs.xml for that content type (perhaps using a GUI tool, as I described above) and then uses generating.xsl to create the stylesheet necessary to convert the new content type to the XML that your system needs. The difficult part is determining the typical transformation tasks that your shop needs and then writing the XSL code in generating.xsl to automate these tasks. The generating.xsl stylesheet shown in this article demonstrates a few automation patterns that should give you a good head start.
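As one example of extending these patterns to a task of your own, here is a sketch of a rule you might add to generating.xsl to rename elements. The rename instruction and its name and newName children are hypothetical; they are not part of conversionSpecs.xml or the sample files, and the rule follows the same one-rule-per-instruction pattern as attribute2element.

```xml
<!-- Hypothetical conversion instruction this rule would handle:
     <rename>
       <name>contact</name>
       <newName>customerContact</newName>
     </rename>
-->
<!-- Create a template rule for each element that gets renamed.
     Goes inside generating.xsl with the other template rules. -->
<xsl:template match="rename">
  <wh:template match="{name}">
    <wh:element name="{newName}">
      <!-- Carry over the original attributes and children. -->
      <wh:apply-templates select="@*|node()"/>
    </wh:element>
  </wh:template>
</xsl:template>
```

Avoid naming the same element in two instructions that each generate their own template rule (for example, both rename and reorder), because the generated rules would then have conflicting match conditions.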
XML.com Copyright © 1998-2006 O'Reilly Media, Inc.