Comparing XSLT and XQuery
XSLT has been the main XML technology for transformations for some time now, but it’s not the only player in the game. Although XQuery is designed for retrieving and interpreting information, it is also, according to the specification, “flexible enough to query a broad spectrum of XML information sources, including both databases and documents.”
In this article, we’ll transform XML source data from Cathy Kost, a beginning XML student who helps with a pot-bellied pig rescue organization. The document’s root <pig-rescue> element contains a series of <animal> elements like this one:
<animal>
  <species>pot belly pig</species>
  <name>Molly II</name>
  <birth>February, 1998</birth>
  <in-date>January, 2000</in-date>
  <from>Middle Ave.</from>
  <gender spay-neuter="yes">F</gender>
  <info>
    She is a sweet, friendly pig who likes to hang
    out on Cathy’s porch on the lounge pad.
  </info>
  <picture>
    <file>images/molly_th.jpg</file>
    <description>Black pig</description>
    <caption>Molly in the Pasture</caption>
  </picture>
</animal>
We will develop a transformation in both XSLT and XQuery. The transformations will change the XML into several HTML pages with four pigs per page, and an index page with links to the pig description pages. Both transformations will use built-in extensions to create multiple output files.
Each pig’s <picture> element will become an
<img> element in the resulting file.
It’s good practice to put width and
height attributes on image elements, but looking up
each image’s dimensions by hand is a lot of work.
This is a perfect place for a user-defined extension function that, given an
image’s file name, returns the image’s width and height.
Which Tools to Use?
For the XSLT transformation, we use the Apache Xalan XSLT processor. For XQuery, we use Qizx/open, which implements all features of the language except Schema import and validation.
The Main Differences
XSLT has a “processing engine” that automatically goes through the document tree and applies templates as it finds nodes; with XQuery, the programmer has to direct the process. It’s almost like the difference between RPG (the business programming language, not role-playing games) and procedural programming languages like C. In RPG, there is an implicit processing cycle, and you just set up the actions that you want to occur when certain conditions are met; in C, you are responsible for directing the algorithm.
When it comes to data types, XSLT is to XQuery as JavaScript is to Java. XSLT is loosely typed; conversions among nodes, strings, and numbers are handled pretty much transparently. XQuery is a strongly typed language that uses the types defined by XML Schema, and it will complain when it gets input that isn’t of the appropriate type.
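To make the typing difference concrete, here is a minimal sketch written against XQuery 1.0 syntax. The variable name $perPageText is made up for illustration. In XSLT 1.0 the expression '4' mod 3 silently converts the string to a number; in XQuery the conversion has to be spelled out.

```xquery
(: Hypothetical variable holding a number that arrived as text. :)
declare variable $perPageText as xs:string := "4";

(: An explicit cast is needed before doing arithmetic on a string. :)
xs:integer($perPageText) mod 3

(: Writing  $perPageText mod 3  instead would be reported as a type
   error, because "mod" is not defined for xs:string operands. :)
```

The query as written evaluates to 1.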
Global Setup
We want the number of pigs per page to be a global, user-settable parameter with a default value of four. In XSLT, we define this outside of any templates:
<xsl:param name="perPage" select="'4'"/>
In XQuery the following declaration appears as the first line in our query file:
declare variable $perPage as xs:integer := 4;
Both of these can be overridden by options on the command line. However, here is
our first difference between XSLT and XQuery: any XSLT template may contain an
<xsl:param>; that is how information gets passed among
templates. XQuery’s declare variable defines global
variables only, and cannot appear within a user-defined function.
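As a rough illustration of how this plays out in XQuery, the sketch below declares the global once and passes per-call information through function parameters, which is the counterpart of passing an <xsl:param> to a template. The function local:page-count is a hypothetical helper, not part of the query developed in this article.

```xquery
declare variable $perPage as xs:integer := 4;

(: Hypothetical helper: how many pages are needed for $count animals.
   The global $perPage is visible here; $count arrives as a parameter. :)
declare function local:page-count($count as xs:integer) as xs:integer
{
  ($count + $perPage - 1) idiv $perPage
};

local:page-count(10)
```

With ten animals and four per page, the call returns 3.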
We also want the output file to be XHTML 1.0 Transitional. In XSLT we accomplish this with the following element:
<xsl:output
  method="xml"
  indent="yes"
  omit-xml-declaration="yes"
  doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
  doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" />
In XQuery, we add these options to the UNIX shell script that will run Qizx/open:
-Xindent='yes' \
-Xmethod='XHTML' \
-X'doctype-public'='-//W3C//DTD XHTML 1.0 Transitional//EN' \
-X'doctype-system'=\
'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd' \
Creating the Index Page
Here is what the index page looks like when we have four pigs per page:
Figure 1: Logo with list items
In XSLT, we provide a template to match the root <pig-rescue>
element. (To save space, we’re not showing the code that generates
the logo image on the index page.) Since we have to process the
<animal> elements in two different ways, once for
the index page and once for the display pages, we need to use a
mode. The indexList template will be applied only to the first
<animal> of each page (positions 1, 5, 9, and so on when perPage
is four); this ensures that we get the correct number of list items
in the unordered list, one per output page.
<xsl:template match="pig-rescue">
  <html>
    <head>
      <link rel="stylesheet" type="text/css" href="bdr.css" />
      <title>The Pigs</title>
    </head>
    <body>
      <div align="center">
        <h1>The Pigs At Belly Draggers Ranch</h1>
      </div>
      <ul>
        <xsl:apply-templates
          select="animal[position() mod $perPage = 1]"
          mode="indexList" />
      </ul>
    </body>
  </html>
</xsl:template>
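The position test in the select attribute can be tried on its own. The following sketch, in XQuery rather than XSLT, uses the integers 1 through 10 as stand-ins for the positions of ten <animal> elements; the range and the variable name $position are assumptions made for illustration.

```xquery
declare variable $perPage as xs:integer := 4;

(: Keep only the positions that start a new page. :)
for $position in 1 to 10
where $position mod $perPage = 1
return $position
```

This returns 1, 5, and 9, the first entry of each of the three pages. Note that the test never succeeds when $perPage is 1, since any integer mod 1 is 0, so a test of this form assumes at least two entries per page.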
In XQuery, the body of our query is a single expression; in this case, a FLWOR expression. The acronym stands for the clauses in the expression:
for, which allows you to step through a sequence of items or nodes.
let, which allows you to declare and initialize variables.
where (optional), which allows you to specify the conditions under which an item or node should be chosen.
order by (optional), which sorts the selected items.
return, which returns the specified values for each of the selected items.
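For orientation, here is a small sketch that uses all five clauses at once. It is written against XQuery 1.0 syntax, and apart from Molly II the names are invented for the example.

```xquery
(: Step through a sequence of names; only "Molly II" is from the source data. :)
for $name in ("Molly II", "Hamlet", "Petunia")
let $length := fn:string-length($name)
where $length gt 6          (: drops "Hamlet", which has six characters :)
order by $name
return <li>{ $name }</li>
```

The result is two <li> elements, one for Molly II followed by one for Petunia.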
A FLWOR expression must have at least one for or
let; ours has just a let which assigns
the root element from the input document to the $doc
variable. The return returns an
<html> element. The parentheses aren’t really
necessary as only one item is being returned, but we want to use them for
the sake of consistency.
let $doc := fn:input()/pig-rescue
return
(
  <html>
    <head>
      <link rel="stylesheet" type="text/css" href="bdr.css" />
      <title>The Pigs</title>
    </head>
    <body>
      <div align="center">
        <h1>The Pigs At Belly Draggers Ranch</h1>
      </div>
      <ul>
      {
        local:make-name-list( $doc/animal )
      }
      </ul>
    </body>
  </html>
)
The fn:input() in the preceding code
is a Qizx/open extension that supplies the document
whose file name is given on the command line.
The text starting with the <html> tag
is called a Direct Element Constructor, and it must
be well-formed. Within one of these constructors,
you may embed XQuery expressions by enclosing them in braces. In this
case, we switch back to XQuery to call the
local:make-name-list function, passing it all the
<animal> nodes within the document.
If the function name
looks like it has a namespace prefix, that’s because it does.
XQuery predefines the namespace prefix local and reserves
it for use in defining local functions.
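The switch between markup and XQuery expressions, together with a call to a function in the local namespace, can be sketched in a self-contained query. Everything here is illustrative: local:thumbnail is a hypothetical helper, and the <animal> element is built inline from the sample data rather than read from the input document.

```xquery
(: Hypothetical helper, not part of the article's query. :)
declare function local:thumbnail($picture as element()) as element()
{
  (: Braces inside attribute values also switch to XQuery. :)
  <img src="{ fn:string($picture/file) }"
       alt="{ fn:string($picture/description) }" />
};

let $animal :=
  <animal>
    <name>Molly II</name>
    <picture>
      <file>images/molly_th.jpg</file>
      <description>Black pig</description>
    </picture>
  </animal>
return
  <li>{ fn:string($animal/name) }: { local:thumbnail($animal/picture) }</li>
```

Each pair of braces is evaluated as an expression, and its result is placed into the surrounding element or attribute; the text outside the braces is copied through as literal content.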