XML.com: XML From the Inside Out

Comparing XSLT and XQuery

March 09, 2005

XSLT has been the main XML technology for transformations for some time now, but it’s not the only player in the game. Although XQuery is designed for retrieving and interpreting information, it is also, according to the specification, “flexible enough to query a broad spectrum of XML information sources, including both databases and documents.”

In this article, we’ll be transforming the following XML source information from Cathy Kost, a beginning XML student who helps with a pot-bellied pig rescue organization.

<animal>
    <species>pot belly pig</species>
    <name>Molly II</name>
    <birth>February, 1998</birth>
    <in-date>January, 2000</in-date>
    <from>Middle Ave.</from>
    <gender spay-neuter="yes">F</gender>
    <info>
    She is a sweet, friendly pig who likes to hang
    out on Cathy&#8217;s porch on the lounge pad.
    </info>
    <picture>
        <file>images/molly_th.jpg</file>
        <description>Black pig</description>
        <caption>Molly in the Pasture</caption>
    </picture>
</animal>

We will develop a transformation in both XSLT and XQuery. The transformations will change the XML into several HTML pages with four pigs per page, and an index page with links to the pig description pages. Both transformations will use built-in extensions to create multiple output files.

Each pig’s <picture> element will become an <img> element in the resulting file. It’s good practice to put a width and height attribute into image elements, but it’s a lot of work to have to look up each image’s dimensions. This is a perfect place for a user-defined extension function that, given an image’s file name, returns the image’s width and height.
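
To make the job of such a function concrete, here is a minimal sketch in Python rather than in either transformation language. It is not the extension function used in this article; the names jpeg_size and image_size are hypothetical, and the sketch assumes the image is a JPEG whose dimensions sit in a start-of-frame segment.

```python
import struct

# Start-of-frame markers carry the image dimensions. 0xC4, 0xC8 and 0xCC
# fall in the same numeric range but mean something else, so they are excluded.
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def jpeg_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) read from the bytes of a JPEG file."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing JPEG start-of-image marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a segment marker")
        marker = data[pos + 1]
        if marker == 0xFF:  # padding byte before the real marker
            pos += 1
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            # Payload layout: precision (1 byte), height (2), width (2).
            _precision, height, width = struct.unpack(
                ">BHH", data[pos + 4:pos + 9]
            )
            return width, height
        pos += 2 + length  # skip the marker plus the segment it introduces
    raise ValueError("no start-of-frame segment found")


def image_size(file_name: str) -> tuple[int, int]:
    """Hypothetical wrapper: read the file, then delegate to jpeg_size."""
    with open(file_name, "rb") as handle:
        return jpeg_size(handle.read())
```

Called as image_size("images/molly_th.jpg"), a function like this would supply the two numbers needed for the width and height attributes. The sketch does not handle other image formats or unusual JPEG layouts.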

Which Tools to Use?

For the XSLT transformation, we use the Apache Xalan XSLT processor. For XQuery, we use Qizx/open, which implements all features of the language except Schema import and validation.

The Main Differences

XSLT has a “processing engine” that automatically goes through the document tree and applies templates as it finds nodes; with XQuery the programmer has to direct the process. It’s almost like the difference between RPG (the business programming language, not role playing games) and procedure-oriented programming languages like C. In RPG, there is an implicit processing cycle, and you just set up the actions that you want to occur when certain conditions are met; in C, you are responsible for directing the algorithm.
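
The contrast can be sketched outside either language. The Python below is only an analogy, not how Xalan or Qizx/open work internally: rule_driven registers a rule and lets a small walker decide when to fire it, while explicit spells out the traversal by hand. The second pig's name and all function names are made up for illustration, and the walker imitates only the "descend into children" part of XSLT's built-in rules.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in document; "Hamlet" is a hypothetical second entry.
SOURCE = (
    "<pig-rescue>"
    "<animal><name>Molly II</name></animal>"
    "<animal><name>Hamlet</name></animal>"
    "</pig-rescue>"
)

RULES = {}  # element name -> function that handles that element


def template(tag):
    """Register a rule for one element name, loosely like match="..."."""
    def register(func):
        RULES[tag] = func
        return func
    return register


def apply_templates(node):
    """Walk the children; fire a rule where one exists, else keep descending."""
    results = []
    for child in node:
        rule = RULES.get(child.tag)
        if rule is not None:
            results.extend(rule(child))
        else:
            results.extend(apply_templates(child))
    return results


@template("name")
def name_rule(node):
    return [node.text]


def rule_driven(root):
    # The walker, not the caller, decides when name_rule runs.
    return apply_templates(root)


def explicit(root):
    # The caller directs every step of the traversal.
    return [
        name.text
        for animal in root.findall("animal")
        for name in animal.findall("name")
    ]


root = ET.fromstring(SOURCE)
print(rule_driven(root))
print(explicit(root))
```

Both functions produce the same list of names; the difference is who controls the traversal.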

XSLT is to XQuery as JavaScript is to Java. XSLT 1.0 is, for practical purposes, untyped; conversions between nodes, strings, and numbers are handled pretty much transparently. XQuery is a typed language that uses the types defined by XML Schema, and it will complain when it gets input that isn’t of the appropriate type.

Global Setup

We want the number of pigs per page to be a global, user-settable parameter with a default value of four. In XSLT, we define this outside of any templates:

<xsl:param name="perPage" select="'4'"/>

In XQuery the following declaration appears as the first line in our query file:

declare variable $perPage as xs:integer := 4;

Both of these can be overridden by options on the command line. However, here is our first difference between XSLT and XQuery: any XSLT template may contain an <xsl:param>; that is how information gets passed among templates. XQuery’s declare variable defines global variables only, and cannot appear within a user-defined function.

We also want the output file to be XHTML transitional. In XSLT we accomplish this with the following element:

<xsl:output
 method="xml"
 indent="yes"
 omit-xml-declaration="yes"
 doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
 doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd" />

In XQuery, we add these options to the UNIX shell script that will run Qizx/open:

-Xindent='yes' \
-Xmethod='XHTML' \
-X'doctype-public'='-//W3C//DTD XHTML 1.0 Transitional//EN' \
-X'doctype-system'=\
'http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd' \

Creating the Index Page

Here is what the index page looks like when we have four pigs per page:

Figure 1: Logo with list items

In XSLT, we provide a template to match the root <pig-rescue> element. (To save space, we’re not showing the code that generates the logo image on the index page.) Since we have to process the <animal> elements in two different ways, once for the index page and once for the display pages, we need to use a mode. The template will be applied only to the entry that starts each page (with four per page, that is the first, fifth, ninth, and so on); this ensures that we get the correct number of list items in the unordered list: one per output page.

<xsl:template match="pig-rescue">
<html>
<head>
    <link rel="stylesheet" type="text/css" href="bdr.css" />
    <title>The Pigs</title>
</head>
<body>
<div align="center">
<h1>The Pigs At Belly Draggers Ranch</h1>
</div>

<ul>
    <xsl:apply-templates
        select="animal[position() mod $perPage = 1]"
        mode="indexList" />
</ul>

</body>
</html>
</xsl:template>
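
To see which entries the predicate position() mod $perPage = 1 picks out, the arithmetic can be replayed in a few lines of Python. This is only a sketch of the arithmetic, not part of the stylesheet; the function name index_positions is hypothetical, and positions are 1-based as they are in XPath.

```python
def index_positions(animal_count: int, per_page: int = 4) -> list[int]:
    """Positions (1-based) that satisfy 'position() mod per_page = 1'."""
    return [
        pos
        for pos in range(1, animal_count + 1)
        if pos % per_page == 1
    ]


# Ten animals, four per page: one index entry per page.
print(index_positions(10))      # [1, 5, 9]

# Edge case: any number mod 1 is 0, so nothing equals 1.
print(index_positions(10, 1))   # []
```

The second call shows a consequence of the arithmetic: with one pig per page the predicate never matches, so that setting would need a different test.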

In XQuery, processing the document becomes our single XQuery statement; in this case, an XQuery FLWOR expression. This acronym stands for the clauses in the expression:

  • for, which allows you to step through a sequence of items or nodes.
  • let, which allows you to declare and initialize variables.
  • where (optional), which allows you to specify under which conditions an item or node should be chosen.
  • order by (optional), which sorts the selected items.
  • return, which returns the specified values for each of the selected items.
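
For readers who think more readily in a procedural language, the five clauses can be lined up against ordinary Python constructs. This is a loose analogy rather than XQuery semantics; the function name flwor_like and every record other than Molly II are invented for illustration.

```python
def flwor_like(animals):
    """Mimic for / let / where / order by / return over a list of dicts."""
    selected = []
    for animal in animals:                 # for: step through the sequence
        name = animal["name"]              # let: bind a variable
        if animal["gender"] == "F":        # where: keep matching items only
            selected.append(name)
    selected.sort()                        # order by: sort what was kept
    return [f"<li>{name}</li>" for name in selected]  # return: build output


# Hypothetical sample data; only Molly II comes from the article's source.
ANIMALS = [
    {"name": "Molly II", "gender": "F"},
    {"name": "Hamlet", "gender": "M"},
    {"name": "Daisy", "gender": "F"},
]

print(flwor_like(ANIMALS))
```

As in a FLWOR expression, filtering happens before sorting, and the output is built once per surviving item.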

A FLWOR expression must have at least one for or let; ours has just a let which assigns the root element from the input document to the $doc variable. The return returns an <html> element. The parentheses aren’t really necessary as only one item is being returned, but we want to use them for the sake of consistency.

let $doc := fn:input()/pig-rescue
return
(
<html>
  <head>
    <link rel="stylesheet" type="text/css" href="bdr.css" />
    <title>The Pigs</title>
  </head>
  <body>
    <div align="center">
      <h1>The Pigs At Belly Draggers Ranch</h1>
    </div>
    
    <ul>
    {
        local:make-name-list( $doc/animal )
    }
    </ul>
    
  </body>
</html>
)

The fn:input() in the preceding code is a Qizx/open extension that takes the input file name from the command line.

The text starting with the <html> tag is called a Direct Element Constructor, and it must be well-formed. Within one of these constructors, you may embed XQuery expressions by enclosing them in braces. In this case, we switch back to XQuery to call the local:make-name-list function, passing it all the <animal> nodes within the document. If the function name looks like it has a namespace prefix, that’s because it does. XQuery predefines the namespace prefix local and reserves it for use in defining local functions.
