Before getting into this month's questions, I wanted to discuss a change in this column's standard operating procedures. In the nearly three years that I've been writing "XML Q&A," all questions I've answered have been drawn from a single source: the O'Reilly Network XML Forum. (Annual exceptions -- the August columns -- have been the "Nobody Asked Me, But..." pieces, based on questions that no one in particular asked but that I wanted to tackle anyway.) In May, the XML Forum was discontinued by O'Reilly; its subject matter, after all, overlapped with numerous other online resources.
Starting this month, I'll be perusing those other online resources for questions to answer here. Included are the following mailing lists and newsgroups, in addition to others (links are to archives, subscription pages, or the Google Groups root for the given resource):
In all cases, I'll focus on questions which haven't yet been answered and continue to focus on questions of broad interest.
Now on to this month's items.
Q: XPath to IDs?
I want something like this:
<!-- a.xml -->
<a>
  <elt id="a" value="1"/>
  <elt id="b" value="2"/>
  <elt id="c" value="3"/>
</a>

<!-- a.xsl -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <output>
      <xsl:value-of select="#b/@value"/>
    </output>
  </xsl:template>
</xsl:stylesheet>
and to get the output:
<output>2</output>
Is there a syntax for this in XPath 1 or one being considered for XPath 2.0?
A: First, I assume you know that simply naming an attribute "id" doesn't make it an ID-type attribute. The only way to ensure that an attribute is of the ID type is to declare it so, via either a DTD or XML Schema. So most of my answer takes it for granted that the id attributes in your document are formally declared as ID-type attributes.
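As a minimal sketch of what such a declaration could look like, here is the question's a.xml with an internal DTD subset. The content models for a and elt are assumptions made for illustration; only the ID declaration on the id attribute matters for what follows.

```xml
<?xml version="1.0"?>
<!DOCTYPE a [
  <!-- Assumed content models, for illustration only -->
  <!ELEMENT a (elt*)>
  <!ELEMENT elt EMPTY>
  <!-- Declaring id with the ID type is what makes it visible to id() -->
  <!ATTLIST elt
    id    ID    #REQUIRED
    value CDATA #REQUIRED>
]>
<a>
  <elt id="a" value="1"/>
  <elt id="b" value="2"/>
  <elt id="c" value="3"/>
</a>
```

Whether the declaration takes effect depends on the parser actually reading the DTD, so it's worth checking how your XSLT processor's parser is configured.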
Just to highlight the portion of your stylesheet which you're proposing will locate the correct element:
<xsl:value-of select="#b/@value"/>
It's not quite that easy, but it's not much harder, either. (Aside from its simplicity, though, the above is syntactically incorrect. An XPath-aware processor, such as an XSLT engine, will complain about the # character.) Instead of a simple "named anchor"-style selection, use the id() function to locate the element in question. It takes one argument, the ID value(s) you're looking for. Replace your xsl:value-of element with this:
<xsl:value-of select="id('b')/@value"/>
And what if the id attributes are undeclared? You can still locate the right element (assuming no two elements share the same id value) by testing the attribute's value directly in a predicate rather than calling id().
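A minimal sketch of that kind of lookup, assuming the elt elements from the question's a.xml (substituting * for elt would match an element of any name):

```xml
<!-- Compares the attribute's value as a plain string;
     no DTD or schema declaration is needed for this to work -->
<xsl:value-of select="//elt[@id='b']/@value"/>
```

Because this scans the document rather than consulting the ID declarations, it may be slower on large documents, but it makes no assumptions about how the attribute is declared.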
As an aside, note that you needn't pass the id() function just a single value. You can pass it multiple values in a whitespace-delimited list, as here:
<xsl:value-of select="id('b c')/@value"/>
The id() call here locates both elements matching the two ID values; because xsl:value-of in XSLT 1.0 outputs only the first node of a node-set (in document order), the result is the value attribute of the first matching element. Furthermore, the argument needn't be a string. If it's a number or Boolean, the argument will be converted to a string. This behavior is consistent with that of other XPath functions. But, and this is the interesting part, if the argument is a node-set, the id() function behaves quite differently. Rather than converting the node-set to a single string (which would use only its first node), it returns a node-set containing all element nodes whose ID-type attributes match any of the string-values of nodes in the passed node-set. So the id() function can actually locate more than one node, which seems to be a contradiction, given that each ID value must be unique within a document.
The notion is hard to visualize with the sample document the questioner has provided, since there's no correspondence between any string-values in the document and the values of the id attributes.
But consider a common scenario: You control your own XML vocabulary, but not some other XML-based resource whose contents you want to use to locate ID-based values in one of your own documents. For instance, say you've got a document listing book titles (call it, say, books_details.xml):
<books_details>
  <book isbn="b0684833395">Catch-22</book>
  <book isbn="b0440180295">Slaughterhouse 5</book>
  <book isbn="b0764547771">XML: A Primer</book>
  <book isbn="b0446670251">The Virgin Suicides</book>
  <book isbn="b0440215625">Dragonfly in Amber</book>
  <book isbn="b088184800X">Crossed Wires</book>
  <book isbn="b0679736379">Sophie's Choice</book>
  <book isbn="b0596002521">XML Schema</book>
</books_details>
(Assume that isbn is declared as an ID-type attribute. Also note a side effect of this declaration: the attribute's value may not start with a digit, which is why each value above begins with the letter "b".)
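For concreteness, the declaration assumed here might be sketched as an internal DTD subset at the top of books_details.xml. The two content models are assumptions inferred from the sample document, not something the original vocabulary is known to specify.

```xml
<!DOCTYPE books_details [
  <!-- Assumed content models, inferred from the sample above -->
  <!ELEMENT books_details (book*)>
  <!ELEMENT book (#PCDATA)>
  <!-- ID-type values must be XML names, so they cannot begin with a digit -->
  <!ATTLIST book isbn ID #REQUIRED>
]>
```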
Elsewhere, in some other document, you have a list of books arranged by subject (as they might be shelved in a bookstore, for example). This document (books_shelves.xml) might look something like this:
<books_details>
  <category shelf="fiction">
    <isbn>b0684833395</isbn>
    <isbn>b0440180295</isbn>
    <isbn>b0446670251</isbn>
    <isbn>b0679736379</isbn>
  </category>
  <category shelf="tech">
    <isbn>b0764547771</isbn>
    <isbn>b0596002521</isbn>
  </category>
  <category shelf="romance">
    <isbn>b0440215625</isbn>
  </category>
  <category shelf="mystery">
    <isbn>b088184800X</isbn>
  </category>
</books_details>
Obviously, if you controlled both of these vocabularies, a simple solution would be to merge the two documents into one. But if you can't do so, for any of a thousand reasons, you can still use the second document to locate in the first all books which are shelved as, say, fiction. A stylesheet template to achieve this, by transforming books_details.xml, might look like the following:
<xsl:template match="/">
  <xsl:for-each select="id(document('books_shelves.xml')//isbn[../@shelf='fiction'])">
    <output>
      <xsl:value-of select="."/>
    </output>
  </xsl:for-each>
</xsl:template>
The operative portion of this template, the value of the xsl:for-each element's select attribute, uses the id() function, in conjunction with the document() function, to locate multiple nodes in the first document (books_details.xml) based on the string-values of nodes in the second (books_shelves.xml). Translated into English, the value of the select attribute might read something like this:
- The inner call to the document() function locates, in books_shelves.xml, a node-set consisting of all isbn elements whose parents have a shelf attribute with a value of "fiction."
- The outer call to id() locates, in books_details.xml, each element with an ID-type attribute equal to the string-value of one of the nodes in the node-set located in the preceding step.
The result tree from this transformation is:
<output>Catch-22</output>
<output>Slaughterhouse 5</output>
<output>The Virgin Suicides</output>
<output>Sophie's Choice</output>
By the way, note that this result tree isn't well-formed on its own, consisting as it does of more than one root element.
There's more than one way to obtain these results. Instead of using the id() function, for instance, you could use keys to locate the desired nodes. (This is absolutely the way to go if the attributes in question aren't ID-type attributes in the first place. See Bob DuCharme's "Declaring Keys and Performing Lookups" here on XML.com for more details.) Still, if you've got ID-type attributes you might as well take advantage of them.
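As a rough sketch of the key-based alternative, the bookshelf lookup might be rewritten along these lines. The key name book-by-isbn is made up for this example, and the stylesheet is again assumed to be applied to books_details.xml.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Index every book element by the value of its isbn attribute.
       Unlike id(), this works whether or not isbn is declared as an ID. -->
  <xsl:key name="book-by-isbn" match="book" use="@isbn"/>

  <xsl:template match="/">
    <!-- key() searches the document containing the context node
         (books_details.xml), once for each isbn node's string-value -->
    <xsl:for-each select="key('book-by-isbn',
        document('books_shelves.xml')//isbn[../@shelf='fiction'])">
      <output>
        <xsl:value-of select="."/>
      </output>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Like id(), the key() function accepts a node-set as its second argument and returns the union of the matches for each node, so this should select the same four fiction titles as the id()-based template.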
Follow-up: XML-based résumés
In last month's column, I reported on the XML Résumé Library for capturing curriculum vitae information. Shortly after that column appeared, I was contacted by Aaron Straup Cope, who has taken it upon himself to extend the XML Résumé Library with some (IMO) notable improvements.
At a minimum, Cope's extensions add to the XML Résumé Library's DTD a new element, activities, and several offspring elements. The activities element, says Cope, identifies "personal, or group, projects that are not directly 'work' related." For instance, you could include memberships in civic organizations under this category. Much more interesting is the set of stylesheets which Cope has prepared; these provide you with the ability to exclude certain information (address and phone number, for example) from the output, to define more than one CSS stylesheet depending on output device, and so on.
If you found the XML Résumé Library interesting, by all means head over to Cope's aaronland.info XSLT tools page.