XInclude Processing in XSLT
March 28, 2007
Assembling various parts of a document before processing the assembled document is a recurring theme in document processing. XML Inclusions (XInclude) is the W3C standard created to support this scenario, but since it is a standalone specification, it needs to be supported by a piece of software implementing this functionality. The XInclude Processor (XIPr), written in XSLT 2.0, implements XInclude and thus may help to reduce the dependency on numerous software packages if XInclude is used in an environment where XSLT 2.0 is used anyway. XIPr is implemented as a single XSLT 2.0 stylesheet. It can be used standalone in a publishing pipeline or as an imported module in some other XSLT code for integrated XInclude processing.
Compound Documents in XML
XML DTDs introduced the concept of entities, which could be used for assembling distributed physical structures of an XML document into one logical XML document. The XML processor has the task of assembling the various entities. Entities, however, were never very popular in the XML community (except among the SGML traditionalists) and thus were completely removed in XML Schema. As a replacement, the W3C came up with XML Inclusions (XInclude), which is defined as a process of merging XML Infosets. An increasing number of XML processors support XInclude, but it is important to realize that XInclude is a separate step of an XML processing pipeline, not an integral part of XML parsing or transformation.
The new 2.0 version of XSLT finally allows you to implement XInclude in XSLT itself, something that could not be done in version 1.0 of the language because it offered no way to access plain-text files. (Some XSLT processors, such as libxslt, support XInclude, but that is not a mandatory part of an XSLT processor.) The XInclude Processor (XIPr) is an implementation of XInclude in XSLT 2.0. It was created as part of an XSLT-only tool that requires an inclusion facility: the XSLidy presentation package, which uses XSLT to generate a set of Slidy presentations out of an XML document. Instead of defining and implementing a proprietary solution, XSLidy is now based on XIPr, which is available as a standalone XSLT stylesheet.
XIPr provides XInclude processing in XSLT-only environments, which can be useful if XSLT is already a required component of a processing environment and the goal is to minimize the number of technologies needed to support an application.
XInclude Processing Model
XInclude's processing model is pretty straightforward: it takes as input an Infoset (which includes XInclude elements) and produces as result an Infoset where all XInclude elements have been processed (i.e., expanded). This maps well to XSLT's model of transforming trees, so in the XSLT implementation of XInclude, the XInclude process starts with the tree of some input document and transforms it into a tree where all XInclude elements have been processed. XSLT's templates provide excellent support for this kind of processing, so the XSLT implementation essentially is an identity transform that contains templates for processing XInclude elements.
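The identity-transform idea can be sketched as follows. This is a minimal illustration under stated assumptions, not XIPr's actual code: the mode name xi-sketch is made up, only whole XML documents are included, and xpointer, parse="text", fallbacks, and inclusion-loop detection are left out.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xi="http://www.w3.org/2001/XInclude">

  <!-- Start at the document node and process everything in the sketch's mode
       (the mode name "xi-sketch" is hypothetical). -->
  <xsl:template match="/">
    <xsl:apply-templates select="node()" mode="xi-sketch"/>
  </xsl:template>

  <!-- Identity rule: copy every node that no more specific template handles. -->
  <xsl:template match="@* | node()" mode="xi-sketch">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="xi-sketch"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace an xi:include element with the top-level nodes of the referenced
       XML document. Because href is passed as an attribute node, a relative
       URI is resolved against the including document. The included nodes are
       processed in the same mode, so nested inclusions are expanded too. -->
  <xsl:template match="xi:include" mode="xi-sketch">
    <xsl:apply-templates select="document(@href)/node()" mode="xi-sketch"/>
  </xsl:template>

</xsl:stylesheet>
```

The xi:include template wins over the identity rule because a name test has a higher default priority than node().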
Since XInclude is defined on the Infoset, it needs to address all information items defined by the Infoset. This includes unparsed entities and notations, which are also handled by XInclude. However, since XIPr is implemented in XSLT and thus based on XSLT's stripped-down data model (which, in terms of node kinds, is a subset of the Infoset), the implementation does not have to deal with unparsed entities and notations, which are stripped from the input document before processing begins. (Strictly speaking, unparsed entities are available from the input tree, but XSLT does not provide any facilities to produce unparsed entities in the result tree.)
XInclude handles two types of resources: XML and plain-text documents. Each of these resources can be included in a document being processed. XML documents are included as a new fragment of the result tree, whereas plain-text documents are included as a text node. The type of the resource to be included is indicated using a parse attribute on an XInclude element, and permitted values are xml and text. This is the area where XSLT 1.0 is not able to support XInclude, because XSLT 1.0 is only able to access XML documents. XSLT 2.0 adds the unparsed-text() function, which provides access to plain-text files out of an XSLT stylesheet.
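As a rough sketch of how the parse attribute might be dispatched in XSLT 2.0 (an illustration only, not taken from XIPr), the following stylesheet uses document() for XML resources and unparsed-text() for plain text. It assumes that the including document has a known base URI and ignores xpointer and fallback handling.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xi="http://www.w3.org/2001/XInclude">

  <!-- Identity rule for everything that is not an XInclude element. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- parse="xml" is the default, so a missing attribute is treated the same. -->
  <xsl:template match="xi:include[not(@parse) or @parse = 'xml']">
    <xsl:apply-templates select="document(@href)/node()"/>
  </xsl:template>

  <!-- parse="text": read the resource as a string and emit one text node.
       unparsed-text() resolves relative URIs against the stylesheet, so the
       URI is first resolved against the including element's base URI. -->
  <xsl:template match="xi:include[@parse = 'text']">
    <xsl:variable name="uri" select="resolve-uri(@href, base-uri(.))"/>
    <xsl:value-of select="if (@encoding)
                          then unparsed-text($uri, @encoding)
                          else unparsed-text($uri)"/>
  </xsl:template>

</xsl:stylesheet>
```

Any characters in the included text that are special in XML, such as < and &, are escaped by the serializer, which is the behavior one would expect for text inclusion.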
XInclude not only supports the inclusion of XML and plain-text documents, it also supports the inclusion of fragments of XML documents. This is very useful when assembling documents from parts of other documents.
While the identification of fragments in XML documents is a useful facility (and must be supported by every XInclude implementation), the history and current status of the language for doing this is less than perfect. The XML Pointer Language (XPointer) was created in an effort to create a hypertext-friendly environment of XML technologies, using the XML Linking Language (XLink) for an XML-based hyperlink notation and XPointer as the counterpart for addressing fragments within XML documents. XPointer's goal was to identify arbitrary ranges within XML documents (basically, everything users might mark with a mouse selection). This turned out to be a hard problem. Finally, the XPointer language was split into multiple parts and the basic functionality was finalized; the more advanced range locations were never finished.
The XPointer framework itself specifies shorthand pointers, which are equivalent to HTML's fragment identifiers. They consist of a single name after the # character separating the resource name from the fragment identifier, and they are resolved to the element with the ID with that name (the IDness of an attribute can be inferred from a DTD, an XML Schema, or some other source of information, for example, if it is specified [...]).
A more advanced form of fragment identification is specified in the XPointer element() scheme and must be supported by an XInclude implementation. First of all, the shorthand notation of the XPointer framework is also supported, but with a different syntax, in which the name is given as the argument of element().
The interesting concept of the element() scheme, though, is that of child sequences. They allow the identification of elements that have not been assigned an explicit ID, by navigating to them as a path expression (similar to XPath, but more limited and with a different syntax) of child steps. For example, the named fragment used in the previous examples could also be identified by an XPointer child sequence.
This XPointer child sequence is equivalent to an XPath made up of positional child steps and identifies the sixth child (the xml-included-items div2) of the fourth child (the div1) of the second child (the body) of the second child (the spec) of the root node. The advantage of child sequences is that they work for elements that do not have an ID. The obvious disadvantage is that they are rather brittle and break easily when the document is modified. For somewhat improved stability of XPointers using child sequences, the ID mechanism and the child sequences can be combined, resulting in XPointers that start with an ID and continue with a child sequence.
In this case, the XPointer selects the fragment, which is located by first finding the element with the specified ID and then navigating the child sequence relative to that element (in this example, this fragment consists of the head of the identified part of the XML document).
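To make the three forms of the element() scheme concrete, here is a hedged XSLT 2.0 sketch that resolves the data inside element(...): a bare name such as intro, a child sequence such as /1/2, or a combination such as intro/1. The function names, the ex namespace, and the sample pointers are hypothetical. The sketch relies on the id() function, so it only finds named elements if the processor knows which attributes are IDs (for example, through a DTD or xml:id support).

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:ex="http://example.org/xptr-sketch"
    exclude-result-prefixes="xs ex">

  <!-- Follow a list of 1-based child steps, counting element children only. -->
  <xsl:function name="ex:walk" as="node()?">
    <xsl:param name="start" as="node()?"/>
    <xsl:param name="steps" as="xs:integer*"/>
    <xsl:sequence select="
      if (empty($start) or empty($steps)) then $start
      else ex:walk($start/*[$steps[1]], subsequence($steps, 2))"/>
  </xsl:function>

  <!-- Resolve pointer data of the form "name", "/1/2" or "name/1/2". -->
  <xsl:function name="ex:resolve" as="node()?">
    <xsl:param name="doc" as="document-node()"/>
    <xsl:param name="data" as="xs:string"/>
    <!-- Everything before the first "/" is the optional ID. -->
    <xsl:variable name="name" as="xs:string" select="
      if (contains($data, '/')) then substring-before($data, '/') else $data"/>
    <!-- Everything after the first "/" is the child sequence (possibly empty). -->
    <xsl:variable name="steps" as="xs:integer*" select="
      for $s in tokenize(substring-after($data, '/'), '/')
      return xs:integer($s)"/>
    <!-- Without an ID, navigation starts at the document node. -->
    <xsl:sequence select="
      ex:walk(if ($name = '') then $doc else id($name, $doc)[1], $steps)"/>
  </xsl:function>

  <!-- Example use: copy the second element child of the document element. -->
  <xsl:template match="/">
    <xsl:copy-of select="ex:resolve(., '/1/2')"/>
  </xsl:template>

</xsl:stylesheet>
```

The sketch does no syntax checking; a real implementation would have to reject malformed pointers and report a pointer that selects nothing as an error.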
While a longer article about XInclude has already looked at how to use XInclude, the following examples briefly illustrate its main usages:
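<xi:include href="example.xml"/> | Include the XML document example.xml.
<xi:include href="example.xml" xpointer="element(id754)"/> | Include the element with the ID id754 from the XML document example.xml.
<xi:include href="example.txt" parse="text" [ encoding="US-ASCII" ] /> | Include the text document example.txt as text; the optional encoding attribute (shown in brackets) specifies the encoding to assume for the text resource.
<xi:include href="example.xml"> <xi:fallback>could not include "example.xml"</xi:fallback> </xi:include> | Include the XML document example.xml; if it cannot be retrieved, use the content of the xi:fallback element instead.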
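The usages above omit the namespace declaration that makes the xi prefix meaningful. The following host document is a small, made-up example (the element names, file names, and the ID summary are assumptions) that shows the usages in context:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical host document; file names and the ID are made up. -->
<report xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Quarterly report</title>

  <!-- Include a whole XML document. -->
  <xi:include href="chapter1.xml"/>

  <!-- Include a single element, selected by its ID. -->
  <xi:include href="chapter2.xml" xpointer="element(summary)"/>

  <!-- Include plain text, with a fallback if the file cannot be read. -->
  <pre><xi:include href="listing.txt" parse="text">
    <xi:fallback>listing.txt is not available</xi:fallback>
  </xi:include></pre>
</report>
```

After XInclude processing, each xi:include element is replaced by the included content, so no elements in the XInclude namespace should remain in the result.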
XIPr is implemented as a single standalone XSLT 2.0 stylesheet. It can be used as a standalone XSLT implementation of XInclude, or it can be integrated into XSLT code for integrated XInclude processing. XIPr contains a template that by default initiates XInclude processing starting at the document element of the input document:
<xsl:template match="/*"> <xsl:apply-templates select="." mode="xipr"/> </xsl:template>
When using XIPr within other stylesheets, it should be imported (rather than included) so that XSLT's conflict resolution assigns a lower import precedence to XIPr's templates. XInclude processing can then be initiated at any node at any time, by following this pattern:
<xsl:template match="/*"> <!-- do something ... --> <xsl:apply-templates select="$xinclude-candidates" mode="xipr"/> <!-- do something else ... --> </xsl:template>
XIPr produces messages using XSLT's message instruction, which produces messages on the console or some similar output device (not in the result document). When encountering fatal errors (as defined by the XInclude specification), XIPr terminates processing and produces a message as well.
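The interplay of fallbacks, messages, and fatal errors could look roughly like the following sketch. It is an illustration under stated assumptions rather than XIPr's actual code: it handles only XML inclusion without xpointer, and the message texts are invented.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xi="http://www.w3.org/2001/XInclude">

  <!-- Identity rule for everything that is not an XInclude element. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="xi:include">
    <xsl:variable name="uri" select="resolve-uri(@href, base-uri(.))"/>
    <xsl:choose>
      <!-- The resource can be read and parsed: include it. -->
      <xsl:when test="doc-available($uri)">
        <xsl:apply-templates select="doc($uri)/node()"/>
      </xsl:when>
      <!-- Resource error with a fallback: report it and use the fallback. -->
      <xsl:when test="xi:fallback">
        <xsl:message>Could not include <xsl:value-of select="$uri"/>, using fallback</xsl:message>
        <xsl:apply-templates select="xi:fallback/node()"/>
      </xsl:when>
      <!-- Resource error without a fallback is fatal: stop the transformation. -->
      <xsl:otherwise>
        <xsl:message terminate="yes">Fatal error: could not include <xsl:value-of select="$uri"/> and no fallback is present</xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

The terminate="yes" attribute is what turns an ordinary message into one that aborts the transformation; without it, processing would continue after the message is written.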
The important thing to notice is that XIPr processing has to be initiated using the xipr mode so that the templates in the XIPr stylesheet are used for processing the selected nodes.
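A two-pass arrangement is one way to combine XIPr with other processing. The sketch below assumes that the XIPr stylesheet is available as xipr.xsl next to the importing stylesheet, and it uses a hypothetical mode named publish for the second pass; both names are assumptions to be adjusted to the actual setup.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed file name and location of the XIPr stylesheet. -->
  <xsl:import href="xipr.xsl"/>

  <!-- This template has higher import precedence than XIPr's own
       match="/*" template and therefore replaces it. -->
  <xsl:template match="/*">
    <!-- Pass 1: expand all XInclude elements into a temporary tree. -->
    <xsl:variable name="expanded">
      <xsl:apply-templates select="." mode="xipr"/>
    </xsl:variable>
    <!-- Pass 2: process the expanded tree in a separate mode. -->
    <xsl:apply-templates select="$expanded/*" mode="publish"/>
  </xsl:template>

  <!-- Placeholder for the real second pass: here simply an identity copy. -->
  <xsl:template match="@* | node()" mode="publish">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="publish"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The second pass runs in its own mode on purpose: the document element of the temporary tree also matches /*, so applying templates to it in the default mode would invoke the same template again and never terminate.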
References
Jonathan Marsh, David Orchard, and Daniel Veillard, XML Inclusions (XInclude) Version 1.0 (Second Edition), World Wide Web Consortium, Recommendation REC-xinclude-20061115, November 2006
Elliotte Rusty Harold, Using XInclude, xml.com, July 2002
XSL Transformations (XSLT) Version 2.0, World Wide Web Consortium, Recommendation REC-xslt20-20070123, January 2007
Appreciating Libxslt, xml.com, August 2005
Eric van der Vlist, Processing Inclusions with XSLT, xml.com, August 2000
XIPr Home Page
XSLidy Home Page
Paul Grosso, Eve Maler, Jonathan Marsh, and Norman Walsh, XPointer Framework, World Wide Web Consortium, Recommendation REC-xptr-framework-20030325, March 2003
Paul Grosso, Eve Maler, Jonathan Marsh, and Norman Walsh, XPointer element() Scheme, World Wide Web Consortium, Recommendation REC-xptr-element-20030325, March 2003