XML.com: XML From the Inside Out


Using XInclude

July 31, 2002

Elliotte Rusty Harold is the coauthor of XML in a Nutshell, 2nd edition

It's often convenient to divide long documents into multiple files. The classic example is a book, which is customarily divided into chapters, each of which may be further subdivided into sections. Traditionally, XML external entity references have been used to support this kind of division. For example, this book has three chapters, each stored in a separate file:

<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "book.dtd" [
  <!ENTITY chapter1 SYSTEM "malapropisms.xml">
  <!ENTITY chapter2 SYSTEM "mispronunciations.xml">
  <!ENTITY chapter3 SYSTEM "madeupwords.xml">
]>
<book>
  <title>The Wit and Wisdom of George W. Bush</title>
  &chapter1;
  &chapter2;
  &chapter3;
</book>

However, external entity references have a number of limitations. Among them:


  • The individual component files cannot be used independently of the master document. They are not themselves complete, well-formed XML documents. For instance, they cannot have XML declarations or document type declarations and often do not have a single root element.

  • If any piece is missing, the entire document is malformed. There's no option for error recovery.

  • An entity reference cannot point to a plain text file such as an example Java program or HTML document. Only well-formed XML can be included.

XInclude is an emerging W3C specification for building large XML documents out of multiple well-formed XML documents, independently of validation. Each piece can be a complete XML document, a fragmentary XML document, or a non-XML text document like a Java program or an e-mail message.
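The non-XML case can be sketched ahead of the syntax discussion that follows. This is a minimal illustration, not a definitive implementation: it uses Python's standard xml.etree.ElementInclude module, it assumes the parse="text" attribute that the XInclude specification defines for text inclusion (not otherwise shown in this section), and the file name Hello.java, its contents, and the in-memory loader are all hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical non-XML piece: a tiny Java program held in memory.
JAVA_SOURCE = (
    "import java.util.List;\n"
    "public class Hello {\n"
    "  List<String> names;\n"
    "}\n"
)

# Host document; parse="text" asks for the target to be included as plain text.
HOST = (
    '<example xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include href="Hello.java" parse="text"/>'
    "</example>"
)


def load_text(href, parse, encoding=None):
    """Stand-in loader (assumption): only knows the one hypothetical file."""
    if href == "Hello.java" and parse == "text":
        return JAVA_SOURCE
    return None  # ElementInclude treats None as a failed load


def include_java_listing():
    """Parse HOST and replace its xi:include element with the Java text."""
    root = ET.fromstring(HOST)
    ElementInclude.include(root, loader=load_text)
    return root


if __name__ == "__main__":
    print(ET.tostring(include_java_listing(), encoding="unicode"))
```

Because the Java source arrives as character data rather than markup, characters such as < are escaped when the merged document is serialized, so the included program does not need to be well-formed XML.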


An XInclude document refers to the external documents to be included using include elements in the http://www.w3.org/2001/XInclude namespace. The prefix xi is customary, though not required. Each xi:include element has an href attribute that contains a URL pointing to the file to include. For example, the earlier book can be rewritten like this:

<?xml version="1.0"?>
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>The Wit and Wisdom of George W. Bush</title>
  <xi:include href="malapropisms.xml"/>
  <xi:include href="mispronunciations.xml"/>
  <xi:include href="madeupwords.xml"/>
</book>

Of course you can also use absolute URLs where appropriate:

<?xml version="1.0"?>
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>The Wit and Wisdom of George W. Bush</title>
  <xi:include href="http://www.whitehouse.gov/malapropisms.xml"/>
  <xi:include href="http://www.whitehouse.gov/mispronunciations.xml"/>
  <xi:include href="http://www.whitehouse.gov/madeupwords.xml"/>
</book>
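To see what a processor does with such a document, here is a rough sketch using Python's standard xml.etree.ElementInclude module, which recognizes include elements in the http://www.w3.org/2001/XInclude namespace. The chapter contents and the in-memory loader are assumptions made so the example runs without any files on disk; a real processor would fetch each href from the file system or the network.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical chapter contents, keyed by the relative hrefs from the article.
CHAPTERS = {
    "malapropisms.xml": "<chapter><title>Malapropisms</title></chapter>",
    "mispronunciations.xml": "<chapter><title>Mispronunciations</title></chapter>",
    "madeupwords.xml": "<chapter><title>Made-Up Words</title></chapter>",
}

BOOK = """<?xml version="1.0"?>
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>The Wit and Wisdom of George W. Bush</title>
  <xi:include href="malapropisms.xml"/>
  <xi:include href="mispronunciations.xml"/>
  <xi:include href="madeupwords.xml"/>
</book>"""


def load_from_memory(href, parse, encoding=None):
    """Stand-in loader (assumption): return the parsed root for an href."""
    return ET.fromstring(CHAPTERS[href])


def resolve_book(source=BOOK):
    """Parse the book and replace each xi:include with the chapter it names."""
    root = ET.fromstring(source)
    ElementInclude.include(root, loader=load_from_memory)
    return root


if __name__ == "__main__":
    book = resolve_book()
    # Lists the child element names of the merged book.
    print([child.tag for child in book])
```

Each xi:include element is replaced in place by the root element of the document it points to, so the merged book holds its title followed by three chapter elements.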

XInclude processing is recursive. That is, an included document can itself include another document. For example, a book might be divided into front matter, back matter, and several parts:

<?xml version="1.0"?>
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="frontmatter.xml"/>
  <xi:include href="part1.xml"/>
  <xi:include href="part2.xml"/>
  <xi:include href="part3.xml"/>
  <xi:include href="backmatter.xml"/>
</book>

Each part might be further divided into a part intro and several chapters:

<?xml version="1.0"?>
<part xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="intro1.xml"/>
  <xi:include href="ch01.xml"/>
  <xi:include href="ch02.xml"/>
  <xi:include href="ch03.xml"/>
  <xi:include href="ch04.xml"/>
</part>

There's no limit to how deep this can go. Only circular inclusion (Document A includes Document B which includes, directly or indirectly, Document A) is forbidden. When an XInclude processor reads an XML document it resolves all references and returns a document that contains no XInclude elements.
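The recursion and the ban on circular inclusion can be sketched in a few lines of Python. This is an illustrative resolver written for this purpose, not the algorithm of any particular XInclude processor. The names resolve, CircularInclusionError, DOCS, and LOOP are hypothetical, documents live in a dictionary instead of on disk, and each href is treated as a plain key rather than resolved as a relative URL.

```python
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"
XI_NS = 'xmlns:xi="http://www.w3.org/2001/XInclude"'


class CircularInclusionError(Exception):
    """Raised when a document includes itself, directly or indirectly."""


def resolve(href, documents, ancestors=()):
    """Return the root of `href` with every xi:include replaced, recursively."""
    if href in ancestors:
        # The document is already being processed further up the chain.
        raise CircularInclusionError(" -> ".join(ancestors + (href,)))
    root = ET.fromstring(documents[href])
    _expand(root, documents, ancestors + (href,))
    return root


def _expand(element, documents, ancestors):
    """Replace xi:include children in place; descend into everything else."""
    for index, child in enumerate(list(element)):
        if child.tag == XI_INCLUDE:
            included = resolve(child.get("href"), documents, ancestors)
            included.tail = child.tail  # keep whitespace after the include
            element[index] = included
        else:
            _expand(child, documents, ancestors)


# Hypothetical three-level book: book -> part -> chapter.
DOCS = {
    "book.xml": f'<book {XI_NS}><xi:include href="part1.xml"/></book>',
    "part1.xml": f'<part {XI_NS}><xi:include href="ch01.xml"/></part>',
    "ch01.xml": "<chapter>Chapter 1</chapter>",
}

# Hypothetical circular pair: a.xml includes b.xml, which includes a.xml.
LOOP = {
    "a.xml": f'<a {XI_NS}><xi:include href="b.xml"/></a>',
    "b.xml": f'<b {XI_NS}><xi:include href="a.xml"/></b>',
}

if __name__ == "__main__":
    print(ET.tostring(resolve("book.xml", DOCS), encoding="unicode"))
    try:
        resolve("a.xml", LOOP)
    except CircularInclusionError as error:
        print("circular inclusion:", error)
```

Tracking the chain of documents currently being processed, rather than every document seen so far, lets the same file be included twice in different places while still rejecting a true cycle.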
