Reading Multiple Input Documents

March 6, 2002

Bob DuCharme

When you run an XSLT processor, you tell it where to find the source tree document -- probably in a disk file on a local or remote computer -- and the stylesheet to apply to it. You can't tell the processor to apply the stylesheet to multiple input documents at once. The document() function, however, lets the stylesheet name an additional document to read in. You can insert the whole document into the result tree or insert part of it based on a condition described by an XPath expression. You can even use this function with the xsl:key instruction and key() function to look up a key value in a document outside your source document.

To start with a simple example, we'll look at a stylesheet that copies one document and inserts another into the result document. It will read this document

<shirts>
  <shirt colorCode="c4">oxford button-down</shirt>
  <shirt colorCode="c1">poly blend, straight collar</shirt>
  <shirt colorCode="c6">monogrammed, tab collar</shirt>
</shirts>

and copy it to the result tree, inserting this xq485.xml document after the result version's shirts start-tag:

<!-- xq485.xml -->
<colors>
  <color cid="c1">yellow</color>
  <color cid="c2">black</color>
  <color cid="c3">red</color>
  <color cid="c4">blue</color>
  <color cid="c5">purple</color>
  <color cid="c6">white</color>
  <color cid="c7">orange</color>
  <color cid="c7">green</color>
</colors>

(Complete stylesheets, sample input files, and sample output files are available in this zip file.) The stylesheet that does this has just two template rules. The second copies all the source tree nodes except the one for the shirts element, which is covered by the first template rule, to the result tree.

<!-- xq486.xsl: converts xq484.xml into xq491.xml -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:template match="shirts">
    <shirts>
      <xsl:apply-templates select="document('xq485.xml')"/>
      <xsl:apply-templates/>
    </shirts>
  </xsl:template>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

The first template rule's second xsl:apply-templates instruction copies the contents of the shirts element to the result tree between two shirts tags. Before it, however, is another xsl:apply-templates instruction with a select attribute. This attribute's value calls the document() function with the name of the xq485.xml document as its one argument. The function reads the file and parses it as an XML document, and the xsl:apply-templates instruction tells the XSLT processor to apply any relevant template rules to it. The stylesheet's second template rule is the relevant one, and it processes the xq485.xml document's contents the same way that it processes the source tree document's contents: by copying it all to the result tree.

Tip When a stylesheet uses the document() function to read in another document, that stylesheet can include template rules to process the other document's nodes as easily as it can include template rules to process the source tree's nodes.

Because the xsl:apply-templates instruction that uses the document() function comes after the shirts start-tag and before the xsl:apply-templates instruction that processes the content of the source document's shirts element, the contents of the xq485.xml document show up in the result after the shirts start-tag and before the shirts element's contents:

<shirts><!-- xq485.xml --><colors>
  <color cid="c1">yellow</color>
  <color cid="c2">black</color>
  <color cid="c3">red</color>
  <color cid="c4">blue</color>
  <color cid="c5">purple</color>
  <color cid="c6">white</color>
  <color cid="c7">orange</color>
  <color cid="c7">green</color>
</colors>
  <shirt colorCode="c4">oxford button-down</shirt>
  <shirt colorCode="c1">poly blend, straight collar</shirt>
  <shirt colorCode="c6">monogrammed, tab collar</shirt>
</shirts>

Warning Don't confuse the document() function with the use of xsl:include and xsl:import. Those top-level XSLT elements let you incorporate one stylesheet into another; the document() function lets you access other documents to combine with your source document.
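
To illustrate the earlier tip about template rules for the other document's nodes, here is a minimal sketch of a variation on the first stylesheet. It adds a template rule written specifically for the color elements read from xq485.xml. The colorName element name is an assumption made for this illustration and appears nowhere in the sample files; everything else follows the first stylesheet.

<!-- sketch only: not one of the stylesheets in the zip file -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:template match="shirts">
    <shirts>
      <xsl:apply-templates select="document('xq485.xml')/colors/color"/>
      <xsl:apply-templates/>
    </shirts>
  </xsl:template>

  <!-- Applies to color elements from xq485.xml; colorName is a
       hypothetical element name chosen for this example. -->
  <xsl:template match="color">
    <colorName><xsl:value-of select="."/></colorName>
  </xsl:template>

  <!-- Copies everything else unchanged, as in the first stylesheet. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

The match pattern for the color rule is more specific than the node() pattern in the copying rule, so the processor should prefer it for the color elements, whichever document they come from.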

You don't need to insert the entire document read by the document() function into your result document. This next stylesheet is just like the last one except that the xsl:apply-templates element's select attribute only selects the elements in that document whose cid attribute value equals "c7".

<!-- xq488.xsl: converts xq484.xml into xq492.xml -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:template match="shirts">
    <shirts>
      <xsl:apply-templates select="document('xq485.xml')//*[@cid='c7']"/>
      <xsl:apply-templates/>
    </shirts>
  </xsl:template>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

The result has only those elements from the xq485.xml document inserted:

<shirts><color cid="c7">orange</color><color cid="c7">green</color>
  <shirt colorCode="c4">oxford button-down</shirt>
  <shirt colorCode="c1">poly blend, straight collar</shirt>
  <shirt colorCode="c6">monogrammed, tab collar</shirt>
</shirts>
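
The condition in the predicate can also depend on the source document. As a sketch, assuming the same two sample documents, the following variation selects only the color elements whose cid value matches the colorCode of some shirt. It relies on the XSLT current() function to refer back to the shirts element being processed from inside the predicate.

<!-- sketch only: not one of the stylesheets in the zip file -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:template match="shirts">
    <shirts>
      <!-- current() is the shirts element matched by this rule;
           the comparison is true if any shirt has that colorCode. -->
      <xsl:apply-templates
        select="document('xq485.xml')/colors/color[@cid = current()/shirt/@colorCode]"/>
      <xsl:apply-templates/>
    </shirts>
  </xsl:template>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

With the sample input, this should pull in only the yellow, blue, and white color elements, because c1, c4, and c6 are the codes that the shirts use.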

One valuable use of the document() function is to read in a document that stores elements to use for lookups. (For an introduction to the declaration and use of keys, see last month's column.) For example, let's say we want to add the same source document's list of shirts to the result tree, but we want each shirt listed with its color name spelled out instead of its color code. We need to take the value of the colorCode attribute in each shirt element (for example, "c4" or "c1"), find the color element in the xq485.xml document that has that value in its cid attribute, and then output that color element's contents -- the actual name of the color such as "yellow" or "blue". The result should look like this:

  blue oxford button-down
  yellow poly blend, straight collar
  white monogrammed, tab collar

The following stylesheet does this. Because it references the xq485.xml document twice, it first declares a variable named colorLookupDoc whose value uses the document() function to read the document into a tree that can be referenced elsewhere in the stylesheet. (This is more efficient than making the document() function call twice.)

<!-- xq487.xsl: converts xq484.xml into xq493.xml -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
  <xsl:output method="text"/>

  <xsl:variable name="colorLookupDoc" select="document('xq485.xml')"/>

  <xsl:key name="colorNumKey" match="color" use="@cid"/>

  <xsl:template match="shirts">
    <xsl:apply-templates select="$colorLookupDoc"/>
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="colors"/>

  <xsl:template match="shirt">
    <xsl:variable name="shirtColor" select="@colorCode"/>
    <xsl:for-each select="$colorLookupDoc">
      <xsl:value-of select="key('colorNumKey',$shirtColor)"/>
    </xsl:for-each>
    <xsl:text> </xsl:text><xsl:apply-templates/><xsl:text>
</xsl:text>
  </xsl:template>

</xsl:stylesheet>

The xsl:key instruction names a colorNumKey key as a group of color elements whose cid attribute will be used as an index to look up specific color elements. When an efficient XSLT processor sees this instruction, it should create a hash table in memory or some other data structure to speed these lookups.
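
To see the key on its own, here is a minimal sketch that assumes the stylesheet is run with xq485.xml itself as the source document, so that the context node and the color elements are in the same document. The hard-coded 'c4' value is only for illustration.

<!-- sketch only: assumes xq485.xml is the source document -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
  <xsl:output method="text"/>

  <xsl:key name="colorNumKey" match="color" use="@cid"/>

  <xsl:template match="/">
    <!-- Looks up the color element whose cid attribute is "c4". -->
    <xsl:value-of select="key('colorNumKey','c4')"/>
  </xsl:template>

</xsl:stylesheet>

Run against xq485.xml, this should output the word blue. Run against the shirts document, the same key() call finds nothing, because that document has no color elements for the key to index. Note also that two color elements in xq485.xml share the cid value "c7"; a lookup of 'c7' returns both, and in XSLT 1.0 xsl:value-of outputs only the first one in document order.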

The template rule for the shirts element resembles the one in the earlier examples. It has two xsl:apply-templates instructions: one to read in the external xq485.xml document (referring to this document in this example using the colorLookupDoc variable instead of the document's filename) and another to process the shirts element's contents.

A brief, empty template rule for the colors element keeps that element's contents from being added to the result tree. Another template rule uses the key() function to look up the color names within the xq485.xml document's colors element.

    

The template rule for the shirt element looks up the color name and adds it to the result tree, followed by a single space (added by an xsl:text element) and the contents of the shirt element. The lookup is performed using a key() function whose arguments are the name of the colorNumKey key declared near the beginning of the stylesheet and the color code of the shirt element being processed, which is the value to look for in the key. (The color code is stored in a shirtColor variable declared at the beginning of the template.)

Wrapping an xsl:for-each element around the xsl:value-of instruction that calls the key() function solves a small problem with using the key() function to look something up in another document. This function looks for key nodes in the same document as the context node, and without that xsl:for-each instruction, the context node for this xsl:value-of element is the shirt element being processed by the template rule. We're looking for a color element in the xq485.xml document, not in the same document as the shirt node, so we need the context node for the xsl:value-of instruction to be a node in xq485.xml. Wrapping it with an xsl:for-each instruction that selects the root of xq485.xml (again, referenced using the variable colorLookupDoc) works nicely. This is also why the shirt's color code is saved in a variable first: inside the xsl:for-each, the context node is no longer the shirt element.
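
For a small lookup document, a key is not strictly necessary. As an alternative sketch (not the approach used in the stylesheet above), the shirt template rule can select the matching color element with an ordinary XPath predicate that starts from the colorLookupDoc variable. Because the path begins at the other document's root, the context-document issue does not come up, and current() supplies the shirt element being processed.

<!-- sketch only: a key-free variation, not one of the stylesheets in the zip file -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     version="1.0">
  <xsl:output method="text"/>

  <xsl:variable name="colorLookupDoc" select="document('xq485.xml')"/>

  <xsl:template match="shirt">
    <!-- Find the color whose cid equals this shirt's colorCode. -->
    <xsl:value-of
      select="$colorLookupDoc/colors/color[@cid = current()/@colorCode]"/>
    <xsl:text> </xsl:text><xsl:apply-templates/><xsl:text>
</xsl:text>
  </xsl:template>

</xsl:stylesheet>

The trade-off is that the predicate may be evaluated against every color element for each shirt, whereas a processor that indexes keys can avoid that repeated scan. For a lookup document with only eight entries, the difference is unlikely to matter.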