XML.com: XML From the Inside Out

Comparing XSLT and XQuery
by J. David Eisenberg

Writing a Xalan Extension

In order to retrieve the width and height of an image given its file name, we will write a Xalan extension function in Java. It will be in a class named XImageSize (X for Xalan). This function, named getDimensions, will take a file name as string input and return an empty XML element with attributes containing the file name and the image’s width and height. The return value “cleans up” the file name by removing leading and trailing whitespace. The general model for this element is:

<imageSize fileName="fileName"
    width="width" height="height" />
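As a minimal sketch of that element model, the following stand-alone class builds such an element with the standard DOM API and reads the attributes back, which is roughly what the stylesheet does with @width and @height. The class name ImageSizeElementSketch, the method name makeImageSize, and the sample values are assumptions made for illustration; they are not part of the downloadable code.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class; not part of the article's download.
class ImageSizeElementSketch
{
    // Builds <imageSize fileName="..." width="..." height="..."/>.
    static Element makeImageSize( String fileName, int width, int height )
    {
        Document doc;
        try
        {
            doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        }
        catch (ParserConfigurationException pce)
        {
            throw new IllegalStateException( pce );
        }
        Element result = doc.createElement( "imageSize" );
        // Clean up the file name, as the extension function does.
        result.setAttribute( "fileName", fileName.trim() );
        result.setAttribute( "width", Integer.toString( width ) );
        result.setAttribute( "height", Integer.toString( height ) );
        return result;
    }

    public static void main( String[] args )
    {
        Element e = makeImageSize( "  images/pig.jpg ", 320, 240 );
        System.out.println( e.getAttribute( "fileName" ) + " "
            + e.getAttribute( "width" ) + "x"
            + e.getAttribute( "height" ) );
    }
}
```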

In order to use this extension, we need to add some information to the XSL stylesheet. We need to establish a namespace for the extension and register that prefix as one belonging to an extension. We also want to make sure that this prefix never makes it into the output document.

<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0"
    xmlns:img="info.evccit.utils"
    extension-element-prefixes="img"
    exclude-result-prefixes="img">

Once this is set up, the XSL stylesheet can call the function and extract the width and height directly from the returned <imageSize> element as follows:

<xsl:template match="picture">
    <xsl:variable name="dimensions"
        select="img:XImageSize.getDimensions(string(file))"/>
    <img src="{$dimensions/@fileName}"
        width="{$dimensions/@width}"
        height="{$dimensions/@height}"
        alt="{description}"
        title="{caption}" />
</xsl:template>

In order to return an element, the function must have access to a Document and its createElement() method. We know that it is possible for an extension function to do this; the tokenize() extension in org.apache.xalan.lib.Extensions does it. Because Xalan is Open Source, we can look at the code and copy it wholesale into ours. We also need to put in the appropriate attribution and include a copy of the Apache License information along with the source code.

/**
 * This class is not loaded until first referenced
 * (see Java Language Specification by Gosling/Joy/Steele,
 * section 12.4.1)
 *
 * The static members are created when this class is
 * first referenced, as a lazy initialization not needing
 * checking against null or any synchronization.
 *
 * This function Copyright 1999-2004
 * The Apache Software Foundation.
 */
private static class DocumentHolder 
{
    // Reuse the Document object to reduce memory usage.
    private static final Document m_doc;
    static {
        try
        {
            m_doc =
            DocumentBuilderFactory.newInstance().
                newDocumentBuilder().newDocument();
        }
       
        catch(ParserConfigurationException pce)
        {
              throw new org.apache.xml.utils.
                WrappedRuntimeException(pce);
        }

    }
}

This class will go into the main XImageSize class, which looks like this.

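package info.evccit.utils;

import java.awt.Dimension;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class XImageSize
{
    // The DocumentHolder class shown above is nested here.

    static char fileSep; // note 1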
    
    static {
        char[] carr =
         System.getProperty("file.separator").toCharArray();
        fileSep = carr[0];
    }

    public static Node getDimensions( String fileName )
    {
        Document doc = DocumentHolder.m_doc;
        Element result = null;                                    
        fileName = fileName.trim();
        try
        {
            Dimension d = // note 2
                ImageFileDimensions.getFileDimensions( fileName );
            result = doc.createElement("imageSize");
            result.setAttribute( "fileName",
                fileName.replace( fileSep, '/' ) ); // note 3
            result.setAttribute( "width",
                Integer.toString((int) d.getWidth() ));
            result.setAttribute( "height",
                Integer.toString((int) d.getHeight() ));
        }
        catch (Exception e)
        {
            result = null;
        }
        return result;
    }
}
  1. The static initialization of the class saves the system’s file separator character.
  2. The call to ImageFileDimensions.getFileDimensions() opens the file and reads the first few bytes to determine whether it is a PNG, JPG, or GIF file. Depending upon the file type, it does the appropriate work to extract the width and height and returns them in a Dimension object. The exact details aren’t relevant to this article, so the code isn’t shown here.
  3. We have to replace the file separator character with a slash, which is the standard separator for URLs.
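
Since the ImageFileDimensions source isn't shown, here is a rough sketch of what reading dimensions from the first few bytes can look like for one format only. A GIF file begins with a six-byte signature (GIF87a or GIF89a) followed by the width and height as little-endian 16-bit values. The class GifHeaderSketch and its method names are hypothetical, and this is not the author's implementation.

```java
import java.awt.Dimension;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the GIF branch of ImageFileDimensions.
class GifHeaderSketch
{
    // Returns the dimensions stored in a GIF header,
    // or null if the bytes do not start a GIF file.
    static Dimension gifDimensions( byte[] header )
    {
        if (header == null || header.length < 10)
        {
            return null;
        }
        String signature =
            new String( header, 0, 6, StandardCharsets.US_ASCII );
        if (!signature.equals( "GIF87a" )
            && !signature.equals( "GIF89a" ))
        {
            return null;
        }
        // Width and height are little-endian unsigned 16-bit values.
        int width  = (header[6] & 0xFF) | ((header[7] & 0xFF) << 8);
        int height = (header[8] & 0xFF) | ((header[9] & 0xFF) << 8);
        return new Dimension( width, height );
    }

    // Reads only the first ten bytes of the named file.
    static Dimension readGifDimensions( String fileName )
        throws IOException
    {
        byte[] header = new byte[10];
        try (DataInputStream in =
                new DataInputStream( new FileInputStream( fileName ) ))
        {
            in.readFully( header );
        }
        return gifDimensions( header );
    }

    public static void main( String[] args )
    {
        // An in-memory header describing a 320 by 240 image.
        byte[] sample = { 'G', 'I', 'F', '8', '9', 'a',
            0x40, 0x01, (byte) 0xF0, 0x00 };
        Dimension d = gifDimensions( sample );
        System.out.println( d.width + " x " + d.height );
    }
}
```

Returning null for an unrecognized header is a choice made for this sketch; the real class may signal failure differently.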

The source XML file sets the base path for all the images with the <image-base> element. Rather than do a complicated normalize-space() and concat() to join the base path to the image file name in the XSLT, we create a second version of getDimensions() that accepts two strings and does the heavy lifting:

public static Node 
           getDimensions( String pathName, String fileName )
{
    String fileSeparator = System.getProperty("file.separator");
    String combinedName;
    
    pathName = pathName.trim();
    fileName = fileName.trim();

    if (pathName.endsWith( fileSeparator ))
    {
        combinedName = pathName + fileName;
    }
    else
    {
        combinedName = pathName + fileSeparator + fileName;
    }

    return getDimensions( combinedName );
}
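
To see the joining rule on its own, without touching any image files, here is a small sketch that takes the separator as a parameter so the behavior is the same on every platform. The class PathJoinSketch and its methods join and toUrlPath are hypothetical helpers written for this illustration; the real extension reads the separator from the file.separator system property.

```java
// Hypothetical demo class; isolates the path handling shown above.
class PathJoinSketch
{
    // Joins a base path and a file name, adding the separator
    // only when the base path does not already end with it.
    static String join( String pathName, String fileName,
        String separator )
    {
        pathName = pathName.trim();
        fileName = fileName.trim();
        if (pathName.endsWith( separator ))
        {
            return pathName + fileName;
        }
        return pathName + separator + fileName;
    }

    // Converts platform separators to the slash used in URLs.
    static String toUrlPath( String path, char separator )
    {
        return path.replace( separator, '/' );
    }

    public static void main( String[] args )
    {
        String combined = join( " images ", " pig.jpg ", "\\" );
        System.out.println( combined );
        System.out.println( toUrlPath( combined, '\\' ) );
    }
}
```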

If you download the code, you will see that we have heavily overloaded the getDimensions() function by allowing it to accept a Node or NodeList for either or both parameters, but that isn’t the point of this article. Onward to...

Writing an XQuery Extension

The code for this extension is almost identical to the Xalan extension. Instead of returning an <imageSize> element, however, we will return a vector of three items: the filename, the width, and the height. Qizx/open will interpret this as a sequence of items.

The XQuery file must connect the class, which is named QImageSize, with a namespace. This statement goes at the head of the XQuery file. Note carefully! This assignment uses a single equal sign, not the := used for a let clause. We will also have to pass the class name to Qizx/open on the command line when we run the query; this lets Qizx/open know that this is an authorized extension and no security exception needs to be raised.

declare namespace imgsize =
                        "java:info.evccit.utils.QImageSize";

Once the namespace is established, XQuery can extract the information as part of the pig display code:

for $animal at $pos in $animalList
    let
        $basePath := $animal/../image-base,
        $dimensions := imgsize:getDimensions($basePath,
            $animal/picture/file/text() )
    return
    (
        <img
            src="{$dimensions[1]}" 
            width="{$dimensions[2]}"
            height="{$dimensions[3]}"
            alt="{$animal/picture/description/text()}"
            title="{$animal/picture/caption/node()}"
            hspace="4" />
    )

Here’s the code for the function that takes the entire filename as one string parameter:

package info.evccit.utils;

import net.xfra.qizxopen.xquery.dm.Node;

import java.awt.Dimension;
import java.util.Vector;

public class QImageSize
{
    static char fileSeparator;
    
    static {
        char[] carr =
         System.getProperty("file.separator").toCharArray();
        fileSeparator = carr[0];
    }
    
    public static Vector getDimensions( String fileName )
    {
        Vector result = new Vector(3);
        fileName = fileName.trim();
        try
        {
            Dimension d =
                ImageFileDimensions.getFileDimensions( fileName );
            result.add( fileName.replace( fileSeparator, '/' ) );
            result.add( new Integer( (int) d.getWidth() ) );
            result.add( new Integer( (int) d.getHeight() ) );
        }
        catch (Exception e)
        {
            result = null;
        }
        return result;
    }
}

The two-string version of getDimensions() is exactly the same as the Xalan version, except that it returns a Vector instead of a Node. This function has also been heavily overloaded to accept a Qizx/open Node (which is not the same as a DOM node) so that the caller doesn’t have to dig down to the text() step in the path.

Getting the Code

You can download the sample pig rescue file, XSLT stylesheet, XQuery file, extension sources, and Apache License here. The Java source files are in the info directory, and the API documentation is in the doc directory. Make sure you put the ImageSize.jar file in your classpath when invoking Xalan and/or Qizx/open.

The shell files xcompile.sh and qcompile.sh will compile the Xalan and Qizx/open extensions. Files make_javadoc.sh and make_jar.sh create the Javadoc and ImageSize.jar files. Files run_xalan.sh and run_qizx.sh run the transformation and XQuery.

Thanks to Xavier Franc, author of Qizx/open, for his advice and information on using XQuery.


