How the Annotated XML Specification Works

September 12, 1998

Tim Bray

The architecture of the system, illustrated below, is simple enough:

[Figure: graphic representation of the document/creation structure.]

The XML 1.0 specification is accessed in read-only mode. For convenience, I keep a local copy in a file called xml.xml. All the annotations live in another single file (about 25% larger than the XML specification itself) named notes.xml. A Java program, based on my Lark processor, reads both xml.xml and notes.xml and builds in-memory tree-structured representations of both. After processing all the links in notes.xml, the program writes out a file called target.html, which is the annotated version of the spec, and a large number of small files, each containing one of the annotations. The rest of this article outlines what's in those files and what the program does.
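
The flow just described can be sketched roughly as follows. This is a minimal Python illustration, not the Java/Lark program itself: the sample documents, the function name build_outputs, and the naming of the per-annotation files are all assumptions made for the example, and the link-resolution step is reduced to collecting each annotation's targets.

```python
import xml.etree.ElementTree as ET

# Tiny stand-ins for xml.xml and notes.xml (hypothetical content).
SPEC_XML = "<document><div id='sec-intro'><p>XML is ...</p></div></document>"
NOTES_XML = """<Annotations>
  <x id='note-1'>
    <spec href='xml.xml#id(sec-intro)'/>
    <here title='About the introduction'><p>Some commentary.</p></here>
  </x>
</Annotations>"""


def build_outputs(spec_text, notes_text):
    """Return a dict mapping output file names to their contents."""
    spec_tree = ET.fromstring(spec_text)    # in-memory tree of the spec
    notes_tree = ET.fromstring(notes_text)  # in-memory tree of the notes
    outputs = {}
    links = []
    for x in notes_tree.iter("x"):
        here = x.find("here")
        # One small file per annotation (the file naming is an assumption).
        name = x.get("id") + ".html"
        outputs[name] = ET.tostring(here, encoding="unicode")
        # Remember which spec locations this annotation attaches to.
        for spec in x.findall("spec"):
            links.append((spec.get("href"), name))
    # A real implementation would weave the links into the spec text;
    # here target.html simply lists them after the serialized spec.
    body = ET.tostring(spec_tree, encoding="unicode")
    listing = "".join(f"<a href='{name}'>{href}</a>" for href, name in links)
    outputs["target.html"] = body + listing
    return outputs


outputs = build_outputs(SPEC_XML, NOTES_XML)
print(sorted(outputs))
```

Returning a dictionary instead of writing files keeps the sketch free of side effects; a real run would write each entry to disk.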

Design Choices for the Annotation

The first question I had to face in constructing the annotation was whether or not to use the highly-unfinished XLink and XPointer technologies, currently under development in the World Wide Web Consortium (W3C) XML Activity. XLink and XPointer have two large advantages: they are built for XML and can point at arbitrary locations inside a document.

On the other hand, neither spec is nearly finished, and they have changed (in syntax, if not at a conceptual level) from draft to draft. Furthermore, there were (in early 1998) neither commercial nor freeware implementations available. So if I were going to use this technology, I was going to have to write all the software myself. I decided to go ahead with XLink and XPointer, and while the syntax described here is about a year behind the latest drafts, I believe that what I've done is conceptually in tune with the current thinking, so the syntax will be easy to upgrade once the spec settles down.

XLink: A Quick Review

The XLink spec is concerned with recognizing which elements are being used as links, and with giving those links some useful structure and properties. HTML has a very simple solution to this set of problems: all linking elements have to be named A, and they have to point at one "resource", and the resource's address is found in the HREF= attribute.

<a href="resource_address">

XLink tries to be more general than HTML: any element can serve as a link, and is identified as such by using a magic reserved attribute, xml:link=. There are two kinds of XLinks, identified by the values xml:link="simple" and xml:link="extended". In the annotation, I only used the extended flavor, so I won't discuss simple XLinks here. Extended XLinks can have a bunch of useful associated information:

If a linking element is "in-line", the element itself is one of the ends of the link. All HTML links are in-line, as are all the links in the Annotated Spec. Out-of-line links are really the Wild Blue Yonder of hypertext theory and practice.
XLinks can have labels, both machine-readable (provided in the role= attribute) and human-readable (in the title= attribute).
XLink includes some tools for controlling the behavior of a link when it's being followed. In HTML, links really have only one behavior; you are at one page, you follow a link, then you're somewhere else. Since I had to use the Web as it is today to deliver the annotations, I wasn't able to get fancy with behaviors.

The x Element

In the Annotated Spec, I used an element named x as the chief linking element. If you were to dress it up with all the necessary attributes, one of these elements would look like this:

<x xml:link="extended" inline="true" content-role="commentary"
  content-title="Annotation" >
  ... contents of the linking element go here ...
</x>

If every one of the 312 annotations had to carry around all those attributes, the Annotated Spec would be hard to write and to work with. Fortunately, XML has attribute defaulting, so I could provide all these attributes just once, in the document header:

<!DOCTYPE Annotations [
 <!ELEMENT x (here|spec)+>
 <!ATTLIST x
   xml:link       CDATA  #FIXED "extended"
   inline         CDATA  #FIXED "true"
   content-role   CDATA  #FIXED "commentary"
   content-title  CDATA  #FIXED "Annotation" 
   id             ID     #REQUIRED> ]>
<x id="first-link-id"> ... content of first link ... </x>
<x id="second-link-id"> ... content of second link ... </x>
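
To see attribute defaulting in action, here is a small Python sketch. It assumes that the standard library's ElementTree parser (which is built on expat) reports defaults declared in the internal DTD subset, which is its usual behavior; the Annotations wrapper element and the empty here child are placeholders added only to make a complete document.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE Annotations [
 <!ELEMENT x (here|spec)+>
 <!ATTLIST x
   xml:link       CDATA  #FIXED "extended"
   inline         CDATA  #FIXED "true"
   content-role   CDATA  #FIXED "commentary"
   content-title  CDATA  #FIXED "Annotation"
   id             ID     #REQUIRED> ]>
<Annotations>
<x id="first-link-id"><here title="placeholder"/></x>
</Annotations>"""

root = ET.fromstring(DOC)
first = root.find("x")
# Only id= was written in the instance; the other attributes are
# supplied by the parser from the ATTLIST declaration.
for name, value in sorted(first.attrib.items()):
    print(name, "=", value)
```

Note that ElementTree is namespace-aware, so it reports the xml:link attribute under the expanded name for the reserved xml prefix rather than the literal string "xml:link".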

The real role of the x element is to hold one here element and a bunch of spec elements. The here element actually contains the text of the annotation, while each of the spec elements points at a location in the XML spec where the annotation applies. Most of the annotations apply to only one location, but there are a few that attach to many places. For example, there is an annotation saying that DTD keywords (such as DOCTYPE, SYSTEM, ELEMENT, and ATTLIST) must be in upper-case; this annotation is attached to the first definition of each of these keywords.

The here Element

This element is what XLink calls a "Locator" - it serves as one end of the extended link, and contains the annotation. It has a lot of attributes, most of which are defaulted and don't actually appear in the body of the document. Here's the declaration, with all those attributes:

 <!ELEMENT here ANY>
 <!ATTLIST here
  xml:link CDATA #FIXED "locator"
  actuate  CDATA #FIXED "auto"
  show     CDATA #FIXED "replace"
  role     CDATA #FIXED "annotation"
  title    CDATA #REQUIRED
  href     CDATA #FIXED "here()"
  index    CDATA #IMPLIED>

Here's what all those attributes mean:

xml:link - tells the processing program that this is a locator.
actuate - specifies that the link should be processed as soon as it's found; this differs from the Web browser behavior of just displaying the link (as underlined blue text) and then waiting for the user to activate it.
show - tells the processor that the result of following this link should replace the target of the previous one that was followed.
role - tells the processor that this here element contains the annotation, not a pointer into the spec.
title - provides a human-readable label for this link; it is required because it is used to generate the title in the Web implementation.
href - points to the annotation, which in the Annotated Spec is just the content of this element.
index - not part of the XLink apparatus; used to build an index of all the annotations. If it's not provided, the title value is used.
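
The fallback from index to title is easy to express in code. The following Python sketch is only an illustration: the notes wrapper element, the second here element, and the function name index_key are invented for the example.

```python
import xml.etree.ElementTree as ET

# The second <here> is a made-up entry with no index= attribute.
NOTES = """<notes>
  <here title='The Document Entity is Special'
        index='Document Entity, Special Status Of'/>
  <here title='Case of DTD Keywords'/>
</notes>"""


def index_key(here):
    """Use index= when present, otherwise fall back to title=."""
    return here.get("index") or here.get("title")


entries = sorted(index_key(h) for h in ET.fromstring(NOTES).iter("here"))
print(entries)
```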

The content is declared as ANY and contains text marked up with HTML tags. Since this annotation was designed for Web delivery, and this content was designed to be read by humans, I felt that HTML was adequate to meet my formatting needs. HTML also had the advantage that I didn't have to write code to convert it for delivery. So far, I've found HTML perfectly satisfactory for this particular application. However, I do not draw the conclusion that HTML is going to be the right presentation solution for every, or even most, hypertext applications. Since the annotation is an XML document, the HTML has to be well-formed, to allow processing with XML-processor based tools.

Here's one of the here elements:

<here title='The Document Entity is Special'
 index='Document Entity, Special Status Of'>
<p>The differences between the document entity and 
any other external parsed entity are:</p>
<ol><li>The document entity can begin with an
<Sref href='&h;dt-xmldecl'>XML declaration</Sref>,
other external parsed entities with a 
<Sref href='&h;NT-TextDecl'>text declaration</Sref>.</li>
<li>The document entity can contain a
<Sref href='&h;dt-doctype'>document type 

Note that there are a few magic non-HTML elements mixed in; in this case, the Sref element, which is used to encode a pointer back into the XML specification. The Annotated Spec also uses Xref (external reference) and Nref (reference to another annotation) elements. That pointer (in the href attribute) uses an entity reference, &h;, which contains the URL for the XML spec; this is a good idea since there are hundreds of these URLs in the Annotated Spec, and the location I read the XML spec from might change.
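
The entity trick can be demonstrated with a short Python sketch. The URL given to the h entity below is a placeholder rather than the real location of the spec, and the assumption that the entity's value ends in the # separator is inferred from how &h; is used in the href values above.

```python
import xml.etree.ElementTree as ET

# Changing the spec's location means editing only the ENTITY declaration.
DOC = """<!DOCTYPE here [
 <!ENTITY h "http://example.org/xml-spec.html#">
]>
<here title='The Document Entity is Special'>
<p>See the <Sref href='&h;dt-xmldecl'>XML declaration</Sref>.</p>
</here>"""

root = ET.fromstring(DOC)
sref = root.find(".//Sref")
# The parser expands &h; inside the attribute value.
print(sref.get("href"))
```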

The spec Element

This is another XLink "Locator", which contains a pointer into the XML spec, indicating what part of the spec the annotation is there to annotate. Here's its declaration:

<!ATTLIST spec
  xml:link CDATA #FIXED "locator"
  actuate  CDATA "user"
  show     CDATA "replace"
  role     (Using|History|Tech|Misc|Example) "Misc"
  title    CDATA "Into XML Specification" 
  href     CDATA #REQUIRED>

The only really interesting attributes are role, saying which kind of annotation it is, and href, which contains the URL pointing into the spec. The allowed values of role correspond to the (U), (H), (T), (M), and (E) symbols that mark the annotations in the spec.
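
As a small illustration, the correspondence between role values and markers could be held in a lookup table. This Python sketch assumes each marker is simply the first letter of the role name, with the declared default of Misc applying when no role is given; the names ROLE_MARKERS and marker_for are hypothetical.

```python
# Role values from the ATTLIST, mapped to the symbols shown in the spec.
ROLE_MARKERS = {
    "Using": "(U)",
    "History": "(H)",
    "Tech": "(T)",
    "Misc": "(M)",
    "Example": "(E)",
}


def marker_for(role=None):
    """Return the marker for a role, using the declared default of Misc."""
    return ROLE_MARKERS[role or "Misc"]


print(marker_for("Using"), marker_for())
```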

Here's an example, not just of a spec element, but of a whole x element with its spec and here children:

<x id='RfC1808URI'>
<spec role='Using' href='&s;id(RFC1808)'/>
<here title='RFC 1808 URL'>
<p><Xref href=''></Xref></p>

In this example, the target of the reference is easily identified, since it is just a bibliographic entry that has an id attribute.

XPointer: A Quick Review

In an XLink Locator, there is an href= attribute that gives the URI identifying an end of the link. An XPointer is a string of characters that is used after the # "fragment separator" character in that href= value. It points into the XML document by treating it as a tree structure and identifying numbered child and descendant nodes.
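
Separating the document address from the XPointer is a one-step operation in most URL libraries. Here is a minimal Python sketch using the standard library's urldefrag; the example.org address is a placeholder.

```python
from urllib.parse import urldefrag

# Hypothetical href value after entity expansion.
href = "http://example.org/xml.xml#id(RFC1808)"

# Everything after the "#" fragment separator is the XPointer.
document_url, xpointer = urldefrag(href)
print(document_url)
print(xpointer)
```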

The best XPointers are those that are based on "ID Attributes", that is to say attributes that have been declared to have a unique value. These are easy for an XML Processor to find and traverse to, but there is a problem in that you don't know which attributes are so declared unless you are prepared to read the whole DTD. This means, if you read the XML spec carefully, that to be sure, you have to use a validating XML processor. In my case, I was able to get away with using Lark, my non-validating processor, simply by assuming that any attribute whose name was id was an ID attribute.
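
The "anything named id is an ID" shortcut amounts to one pass over the tree. The Python sketch below shows the idea only; it is not how Lark does it, and the sample document and the name build_id_index are invented for the example.

```python
import xml.etree.ElementTree as ET

# A made-up fragment standing in for the XML spec.
SPEC = """<spec>
  <div1 id='sec-intro'><head>Introduction</head></div1>
  <blist><bibl id='RFC1808' key='IETF RFC 1808'>Relative URLs</bibl></blist>
</spec>"""


def build_id_index(root):
    """Map each id= value to its element, without consulting any DTD."""
    index = {}
    for element in root.iter():
        value = element.get("id")
        if value is not None:
            index.setdefault(value, element)  # first occurrence wins
    return index


ids = build_id_index(ET.fromstring(SPEC))
print(ids["RFC1808"].tag)
```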

XPointer provides quite a few different verbs for selecting objects inside the document tree; in the Annotated Spec I was able to get away with using only the id, descendant, child, and string verbs. Furthermore, I could have got by without using the string operator. This raises the question: if something as complex as the Annotated Spec can be constructed with just these operators, do we really need all the others?
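
To give a feel for how little machinery these verbs need, here is a rough Python sketch of a resolver for id, child, and descendant (string is left out). The pointer syntax it accepts, terms such as id(name), child(n,type), and descendant(n,type) joined by dots, is an assumption modeled loosely on the early-1998 drafts and may not match them in detail; the function name resolve and the sample document are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# One term looks like verb(arg, arg); dots between terms are ignored.
TERM = re.compile(r"(id|child|descendant)\(([^)]*)\)")


def resolve(root, pointer):
    """Walk the tree term by term; return the element found, or None."""
    node = root
    for verb, raw_args in TERM.findall(pointer):
        args = [a.strip() for a in raw_args.split(",")]
        if verb == "id":
            # Assume any attribute named id is an ID attribute.
            matches = [e for e in root.iter() if e.get("id") == args[0]]
        else:
            # child searches direct children; descendant the whole subtree.
            pool = list(node) if verb == "child" else list(node.iter())[1:]
            wanted = args[1] if len(args) > 1 else None
            matches = [e for e in pool if wanted in (None, e.tag)]
            index = int(args[0]) - 1  # pointers count from 1
            matches = matches[index:index + 1]
        if not matches:
            return None
        node = matches[0]
    return node


SPEC = """<spec>
  <div1 id='sec-intro'>
    <head>Introduction</head>
    <p>First paragraph.</p>
    <p>Second <term>paragraph</term>.</p>
  </div1>
</spec>"""

root = ET.fromstring(SPEC)
second_p = resolve(root, "id(sec-intro).child(2,p)")
first_term = resolve(root, "id(sec-intro).descendant(1,term)")
print(second_p.text, first_term.text)
```

Each term narrows the current node, so a pointer reads left to right as a path from a well-known anchor (usually an ID) down to the target.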

Here are some interesting examples of XPointers from the Annotated Spec; they have been set up to point into the indicated part of the HTML version of the XML spec.


I actually authored each of the 312 XPointers in the Annotated Spec by hand, finding IDs, counting children, and matching strings. At the summer 1998 XML Developers' Day conference, David Megginson showed how I could have programmed GNU Emacs (which is what I use for editing anyhow) to construct these automatically at the touch of a key. The lesson is that any sensible XML editing environment ought to make it easy to construct this kind of hyperlink.