
Comparing XSLT and XQuery
by J. David Eisenberg

Writing a Xalan Extension

In order to retrieve the width and height of an image given its file name, we will write a Xalan extension function in Java. It will be in a class named XImageSize (X for Xalan). This function, named getDimensions, will take a file name as string input and return an empty XML element with attributes containing the file name and the image’s width and height. The return value “cleans up” the file name by removing leading and trailing whitespace. The general model for this element is:

<imageSize fileName="fileName"
    width="width" height="height" />

In order to use this extension, we need to add some information to the XSL stylesheet. We need to establish a namespace for the extension and register that prefix as one belonging to an extension. We also want to make sure that this prefix never makes it into the output document.

<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0"
    xmlns:img="info.evccit.utils"
    extension-element-prefixes="img"
    exclude-result-prefixes="img">

Once this is set up, the XSL stylesheet can call the function and extract the width and height directly from the returned <imageSize> element as follows:

<xsl:template match="picture">
    <xsl:variable name="dimensions"
        select="img:XImageSize.getDimensions(string(file))"/>
    <img src="{$dimensions/@fileName}"
        width="{$dimensions/@width}"
        height="{$dimensions/@height}"
        alt="{description}"
        title="{caption}" />
</xsl:template>

In order to return an element, the function must have access to a Document and its createElement() method. We know that it is possible for an extension function to do this; the tokenize() extension in org.apache.xalan.lib.Extensions does it. Because Xalan is Open Source, we can look at the code and copy it wholesale into ours. We also need to put in the appropriate attribution and include a copy of the Apache License information along with the source code.

/**
 * This class is not loaded until first referenced
 * (see Java Language Specification by Gosling/Joy/Steele,
 * section 12.4.1)
 *
 * The static members are created when this class is
 * first referenced, as a lazy initialization not needing
 * checking against null or any synchronization.
 *
 * This function Copyright 1999-2004
 * The Apache Software Foundation.
 */
private static class DocumentHolder 
{
    // Reuse the Document object to reduce memory usage.
    private static final Document m_doc;
    static {
        try
        {
            m_doc =
            DocumentBuilderFactory.newInstance().
                newDocumentBuilder().newDocument();
        }
       
        catch(ParserConfigurationException pce)
        {
              throw new org.apache.xml.utils.
                WrappedRuntimeException(pce);
        }

    }
}
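The holder idiom used by DocumentHolder can be tried in isolation. What follows is a minimal sketch, not the article's code: the class LazyHolderDemo, the initialized flag, and the makeImageSize() method are hypothetical names introduced only for illustration. It shows that the nested holder class is not initialized until its static field is first read, and that the shared Document can then be used to create an <imageSize> element.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; only the nested holder mirrors the idiom above.
class LazyHolderDemo {
    // Set by the holder's static initializer so the laziness is observable.
    static boolean initialized = false;

    private static class DocumentHolder {
        static final Document DOC;
        static {
            try {
                DOC = DocumentBuilderFactory.newInstance()
                        .newDocumentBuilder().newDocument();
            } catch (ParserConfigurationException pce) {
                // Assumption: a plain unchecked exception stands in for
                // Xalan's WrappedRuntimeException here.
                throw new IllegalStateException(pce);
            }
            initialized = true;
        }
    }

    // Builds an <imageSize> element from the shared Document.
    // The first call triggers initialization of DocumentHolder.
    static Element makeImageSize(String fileName, int width, int height) {
        Element result = DocumentHolder.DOC.createElement("imageSize");
        result.setAttribute("fileName", fileName);
        result.setAttribute("width", Integer.toString(width));
        result.setAttribute("height", Integer.toString(height));
        return result;
    }
}
```

Reading LazyHolderDemo.initialized before the first call to makeImageSize() yields false, because touching the outer class does not initialize the nested one.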

The DocumentHolder class goes inside the main XImageSize class, which looks like this:

package info.evccit.utils;

import java.awt.Dimension;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class XImageSize
{
    static char fileSep;                                // note 1

    static {
        char[] carr =
            System.getProperty("file.separator").toCharArray();
        fileSep = carr[0];
    }

    // The DocumentHolder class shown above goes here.

    public static Node getDimensions( String fileName )
    {
        Document doc = DocumentHolder.m_doc;
        Element result = null;
        fileName = fileName.trim();
        try
        {
            Dimension d =
                ImageFileDimensions.getFileDimensions( fileName ); // note 2
            result = doc.createElement("imageSize");
            result.setAttribute( "fileName",
                fileName.replace( fileSep, '/' ) );     // note 3
            result.setAttribute( "width",
                Integer.toString((int) d.getWidth() ));
            result.setAttribute( "height",
                Integer.toString((int) d.getHeight() ));
        }
        catch (Exception e)
        {
            // Any failure (missing file, unknown format) yields no element.
            result = null;
        }
        return result;
    }
}
  1. The static initialization of the class saves the system’s file separator character.
  2. The call to ImageFileDimensions.getFileDimensions() opens up the file and reads the first few bytes to determine whether it is a PNG, JPG, or GIF file. Depending upon the file type, it does the appropriate work to extract the width and height and returns them in a Dimension object. The exact details aren’t relevant to this article, so the code isn’t shown here.
  3. We have to replace the file separator character with a slash, which is the standard separator for URLs.
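For readers curious about what reading the first few bytes can look like, here is a rough sketch for two common formats, GIF and PNG. It is not the article's ImageFileDimensions class, whose source is in the download; the class name HeaderSniffer and its sniff() method are hypothetical. It relies only on the published header layouts: a GIF stores width and height as little-endian 16-bit values at byte offsets 6 and 8, and a PNG stores them as big-endian 32-bit values at offsets 16 and 20, inside the leading IHDR chunk. JPEG is left out because finding its dimensions means scanning marker segments.

```java
// Hypothetical helper, not the article's ImageFileDimensions class.
class HeaderSniffer {
    private static final byte[] PNG_SIGNATURE = {
        (byte) 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'
    };

    // Returns {width, height}, or null if the header is not a recognized
    // GIF or PNG. (The article's method returns a java.awt.Dimension.)
    static int[] sniff(byte[] header) {
        if (isGif(header)) {
            int width = u8(header, 6) | (u8(header, 7) << 8);
            int height = u8(header, 8) | (u8(header, 9) << 8);
            return new int[] { width, height };
        }
        if (isPng(header)) {
            return new int[] { int32BE(header, 16), int32BE(header, 20) };
        }
        return null;
    }

    // A GIF starts with the ASCII text "GIF87a" or "GIF89a".
    private static boolean isGif(byte[] h) {
        return h.length >= 10
            && h[0] == 'G' && h[1] == 'I' && h[2] == 'F'
            && h[3] == '8' && (h[4] == '7' || h[4] == '9') && h[5] == 'a';
    }

    // A PNG starts with an 8-byte signature; the first chunk must be IHDR.
    private static boolean isPng(byte[] h) {
        if (h.length < 24) {
            return false;
        }
        for (int i = 0; i < PNG_SIGNATURE.length; i++) {
            if (h[i] != PNG_SIGNATURE[i]) {
                return false;
            }
        }
        return h[12] == 'I' && h[13] == 'H' && h[14] == 'D' && h[15] == 'R';
    }

    // Reads one byte as an unsigned value in the range 0..255.
    private static int u8(byte[] b, int offset) {
        return b[offset] & 0xFF;
    }

    // Reads a big-endian 32-bit integer.
    private static int int32BE(byte[] b, int offset) {
        return (u8(b, offset) << 24) | (u8(b, offset + 1) << 16)
             | (u8(b, offset + 2) << 8) | u8(b, offset + 3);
    }
}
```

A caller would read the first 24 bytes of the file into an array and pass it to sniff(); working on a byte array rather than a file keeps the sketch easy to test.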

The source XML file sets the base path for all the images with the <image-base> element. Rather than do a complicated normalize-space() and concat() to join the base path to the image file name in the XSLT, we create a second version of getDimensions() that accepts two strings and does the heavy lifting:

public static Node 
           getDimensions( String pathName, String fileName )
{
    String fileSeparator = System.getProperty("file.separator");
    String combinedName;
    
    pathName = pathName.trim();
    fileName = fileName.trim();

    if (pathName.endsWith( fileSeparator ))
    {
        combinedName = pathName + fileName;
    }
    else
    {
        combinedName = pathName + fileSeparator + fileName;
    }

    return getDimensions( combinedName );
}
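As an aside, the same joining logic can be delegated to java.io.File, whose two-argument constructor inserts the separator only when it is needed. This is an alternative sketch rather than the code in the download; the class name PathJoin and the method join() are made up for illustration.

```java
import java.io.File;

// Hypothetical helper showing an alternative to the manual endsWith() test.
class PathJoin {
    // Trims both parts and lets java.io.File combine them with the
    // platform's separator. Note that an empty pathName is resolved
    // against a system-dependent default directory, so callers should
    // pass a real base path.
    static String join(String pathName, String fileName) {
        return new File(pathName.trim(), fileName.trim()).getPath();
    }
}
```

With this helper, the two-string getDimensions() would reduce to a call such as getDimensions( PathJoin.join( pathName, fileName ) ).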

If you download the code, you will see that we have heavily overloaded the getDimensions() function by allowing it to accept a Node or NodeList for either or both parameters, but that isn’t the point of this article. Onward to...

Writing an XQuery Extension

The code for this extension is almost identical to the Xalan extension. Instead of returning an <imageSize> element, however, we will return a vector of three items: the filename, the width, and the height. Qizx/open will interpret this as a sequence of items.

The XQuery file must connect the class, which is named QImageSize, with a namespace. This statement goes at the head of the XQuery file. Note carefully! This assignment uses a single equal sign, not the := used for a let clause. We will also have to pass the class name to Qizx/open on the command line when we run the query; this lets Qizx/open know that this is an authorized extension and no security exception needs to be raised.

declare namespace imgsize =
                        "java:info.evccit.utils.QImageSize";

Once the namespace is established, XQuery can extract the information as part of the pig display code:

for $animal at $pos in $animalList
    let
        $basePath := $animal/../image-base,
        $dimensions := imgsize:getDimensions($basePath,
            $animal/picture/file/text() )
    return
    (
        <img
            src="{$dimensions[1]}" 
            width="{$dimensions[2]}"
            height="{$dimensions[3]}"
            alt="{$animal/picture/description/text()}"
            title="{$animal/picture/caption/node()}"
            hspace="4" />
    )

Here’s the code for the function that takes the entire filename as one string parameter:

package info.evccit.utils;

import net.xfra.qizxopen.xquery.dm.Node;

import java.awt.Dimension;
import java.util.Vector;

public class QImageSize
{
    static char fileSeparator;
    
    static {
        char[] carr =
         System.getProperty("file.separator").toCharArray();
        fileSeparator = carr[0];
    }
    
    public static Vector getDimensions( String fileName )
    {
        Vector result = new Vector(3);
        fileName = fileName.trim();
        try
        {
            Dimension d =
                ImageFileDimensions.getFileDimensions( fileName );
            result.add( fileName.replace( fileSeparator, '/' ) );
            result.add( new Integer( (int) d.getWidth() ) );
            result.add( new Integer( (int) d.getHeight() ) );
        }
        catch (Exception e)
        {
            result = null;
        }
        return result;
    }
}

The two-string version of getDimensions() is exactly the same as the Xalan version, except that it returns a Vector instead of a Node. This function has also been heavily overloaded to accept a Qizx/open Node (which is not the same as a DOM node) so that the caller doesn’t have to dig down to the text() step in the path.

Getting the Code

You can download the sample pig rescue file, XSLT stylesheet, XQuery file, extension sources, and Apache License here. The Java source files are in the info directory, and the API documentation is in the doc directory. Make sure you put the ImageSize.jar file in your classpath when invoking Xalan and/or Qizx/open.

The shell files xcompile.sh and qcompile.sh will compile the Xalan and Qizx/open extensions. Files make_javadoc.sh and make_jar.sh create the Javadoc and ImageSize.jar files. Files run_xalan.sh and run_qizx.sh run the transformation and XQuery.

Thanks to Xavier Franc, author of Qizx/open, for his advice and information on using XQuery.


  • Complete junk!
    2005-03-11 03:40:01 M.David.Peterson

    Please rewrite this using a modern day to modern day spec comparison... Please see my blog post for more specific details > http://www.xsltblog.com/archives/2005/03/dear_j_david_ei.html


    To other readers: This article is a complete joke. You can waste your time reading it, but you'll learn more useful and valuable information watching The Simpsons, with the added benefit of getting to watch The Simpsons instead of reading this junk.

  • Off Target
    2005-03-10 23:29:34 dnovatchev

    > XSLT is to XQuery as JavaScript is to Java.
    > XSLT is untyped; conversions between nodes and
    > strings and numbers are handled pretty much
    > transparently.


    Simply not true. Both XQuery and XSLT 2.0 use the same common subset expression language (XPath 2.0) and its data model. Both are typed. While both have approximately the same power, there are tasks, which are more easily accomplished using XSLT 2.0.


    This article would have been more useful were it written in 2001.


    What is strange is that XSLT 1.0 (yes, the author does not specify what he means by "XSLT") is compared to XQuery.


    Why should someone compare two products using the 1999 version of the first and the 2005 working draft of the second?


    I would definitely *not* recommend this article as an authoritative text on the subject.


    This time xml.com wasn't too-useful either...


    Best regards,
    Dimitre Novatchev

    • Off Target
      2005-03-11 16:13:14 J David Eisenberg

      I should get an award for "worst. simile. ever." Let me put it this way: XSLT (any version) does have an underlying type system, but your XSLT stylesheet doesn't have to specify the datatype of everything; you can even skip the as="" attribute on xsl:param if you so desire. XSLT tends to handle string/number/node-to-text conversions without a lot of fuss--it almost always delivers a reasonable result, even if it's not what you'd ideally like. XQuery tends to be more vocal if you don't specify types. I'm not saying that one way is "better" or "worse" - that depends on your philosophy and the task at hand.


      But yeah, that simile was really awful.




      • Off Target
        2005-03-11 20:35:09 dnovatchev

        > Let me put it this way:


        > XSLT (any version) does have an underlying
        > type system, but your XSLT stylesheet doesn't
        > have to specify the datatype of everything;
        > you can even skip the as="" attribute on
        > xsl:param if you so desire. XSLT tends to
        > handle string/number/node-to-text conversions
        > without a lot of fuss--it almost always
        > delivers a reasonable result, even if it's not
        > what you'd ideally like.


        Quite *not* so. I don't think anyone who has actually written XSLT 2.0 code will make the quoted statement.


        XSLT 2.0 uses XPath 2.0 which has stringent type-checking rules.


        For example, even something as simple as the following transformation raises type errors, as evidenced using Saxon 8.3B:


        <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        >

        <xsl:template match="/">
        <xsl:copy-of select="*/*
        [3 < substring(@n,2)]"/>
        </xsl:template>
        </xsl:stylesheet>


        when applied on this source xml document:


        <t>
        <x n="n1"/>
        <x n="n2"/>
        <x n="n3"/>
        <x n="n4"/>
        <x n="n5"/>
        </t>


        the result is:


        Saxon 8.3 from Saxonica
        Java version 1.5.0_01
        Error at xsl:copy-of on line 6 of file:/C:/Program%20Files/Java/jdk1.5.0_01/bin/marrowtr.xsl:
        XP0006: Cannot compare xs:integer to xs:string
        Failed to compile stylesheet. 1 error detected.



        Best regards,
        Dimitre Novatchev.

        • Off Target
          2005-03-11 21:58:12 J David Eisenberg

          A quick further note: I also have been testing some XSLT files I wrote for transforming XML to OpenOffice.org files; I just changed the version to 2.0 and everything continued to work. I guess I was unlucky enough to write transformations that didn't hit any of the type problems.

          • Off Target
            2005-03-11 22:23:59 dnovatchev

            Yes, this may happen and is indeed a question of luck.


            Specifying version="1.0" causes the XSLT processor to work in "backwards compatibility mode" but even in this case the XSLT processor uses the XPath 2.0 data model -- therefore it does not completely behave as an XSLT 1.0 processor.


            Best regards,
            Dimitre Novatchev

        • Off Target
          2005-03-11 21:15:58 J David Eisenberg

          You're right; my experience with 2.0 is fairly limited, and that's what I get for running a test transformation in Saxon without changing the version from 1.0 to 2.0.

        • Off Target
          2005-03-11 20:40:13 dnovatchev

          I wrote:


          <xsl:copy-of select="*/*
          [3 < substring(@n,2)]"/>
          </xsl:template>


          Of course, the "less-than" operator was written escaped; however, it is displayed as "<". I just don't know how to enter XSLT code so that it will be displayed properly on this site. Anyway, the example is very simple and should be clear.


          Dimitre Novatchev

    • Off Target
      2005-03-11 05:26:25 bryan rasmussen

      if we're gonna get a language war going I think XSLT 1.0 is still preferable to XQuery 1.0


      whenever XQuery 1.0 comes through.