"If you use inheritance where composition will work, your designs will become needlessly complicated." —Bruce Eckel, Thinking in Java
This week's column takes a look at two new specifications that are winding through the W3C Recommendation Track, xml:id as a Proposed Recommendation and XLink 1.1 as a Last Call Working Draft. These two specifications share an important common trait: neither is a standalone vocabulary, but rather they are intended to be combined into larger vocabularies. The formal name for "combining stuff" this way is "composition," this week's topic. First, a thirty-second review of some basic terms for any nonprogrammers.
Code written by someone who has just learned object-oriented techniques is
usually pretty easy to spot. One of the main tells is enthusiastic overuse
of derivation, that is, writing new code (called a "class") explicitly
based on existing code, modulo specific changes. Derivation is an important
tool for expressing clear is-a relationships. It would make sense to derive
a new class, say Circle or Polygon, from a more fundamental Shape class,
because a polygon or circle really is a kind of shape. More to the point,
a sign that derivation makes sense is when circles and polygons can be treated
more generally as shapes; for example, the code might say, "Attention all
shapes: render yourselves to SVG." Other cases, however, are not so clear
cut. For example, is an
XMLDom class really a derivation of
Probably not. In designing software, many such situations arise; experts such
as Bruce Eckel recommend sticking with composition in those cases.
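For readers who prefer to see the distinction in code, here is a minimal sketch in Python. The Shape, Circle, and Polygon classes follow the example in the text; the Drawing class and every method name are hypothetical, invented purely for illustration. Circle and Polygon derive from Shape because each one is a shape, while Drawing merely holds shapes, which is composition.

```python
from dataclasses import dataclass, field


class Shape:
    """Base class: anything that can render itself as an SVG fragment."""

    def to_svg(self) -> str:
        raise NotImplementedError


@dataclass
class Circle(Shape):
    # Derivation: a circle is a kind of shape.
    cx: float
    cy: float
    r: float

    def to_svg(self) -> str:
        return f'<circle cx="{self.cx}" cy="{self.cy}" r="{self.r}"/>'


@dataclass
class Polygon(Shape):
    # Derivation: a polygon is a kind of shape.
    points: list

    def to_svg(self) -> str:
        pts = " ".join(f"{x},{y}" for x, y in self.points)
        return f'<polygon points="{pts}"/>'


@dataclass
class Drawing:
    # Composition: a drawing has shapes. It is not a kind of list,
    # so it does not derive from list (hypothetical class).
    shapes: list = field(default_factory=list)

    def add(self, shape: Shape) -> None:
        self.shapes.append(shape)

    def to_svg(self) -> str:
        # "Attention all shapes: render yourselves to SVG."
        body = "".join(shape.to_svg() for shape in self.shapes)
        return f'<svg xmlns="http://www.w3.org/2000/svg">{body}</svg>'
```

Had Drawing been derived from list instead, every list operation (sorting, slicing, and so on) would have leaked into its interface whether or not it made sense for a drawing.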
In the realm of standards, both derivation and composition take place. Derivation might occur when one specification forms an extended subset of another—almost always a sign that something has gone horribly wrong. At a basic level, though, composition happens all the time in the normal course of producing specifications. A look at the XML 1.1 specification shows eight "normative" references, including the Unicode specification, which forms the basis for much of XML's interoperability.
Designing an XML vocabulary is a special case of producing a specification. Although XML Namespaces was finalized more than five years ago, architects are still uncovering trouble in applying composition in XML, more commonly referred to under the banner "compound documents." The first specification to look at this week is a revision to XLink.
XLink 1.0 became a W3C Recommendation in June 2001, amid a fair amount of controversy. Support for the standard has been slow to come; in a March 2002 article titled "XLink: Who Cares?" Bob DuCharme noted that only seven partial implementations were available. In fact, the death of XLink has become one of several permathreads on the xml-dev mailing list.
After the thread a while back about the death of XLink, I keep finding examples of it being cited, particularly in the OGC [Open GIS Consortium] literature...I wonder if there is a term for this phenomenon: polite inclusion of other works that are not successful in terms of ground support
One of the most prominent examples of politely including XLink has been inside
SVG (scalable vector graphics), but even there, DTD (document type definition) tricks were used to remove the need for some of the
explicit markup. XLink 1.1 legitimizes the use of a sole xlink:href attribute,
without need of any further tricks, which is, of course, a pleasant development.
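As a rough illustration of what that change means for a consumer, the following Python sketch treats an element as a simple link either when it says so explicitly with xlink:type="simple" or when it carries xlink:href and no xlink:type at all. That is my reading of the XLink 1.1 draft; the function names are hypothetical, and the sketch ignores every other XLink attribute.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
XLINK_TYPE = f"{{{XLINK_NS}}}type"
XLINK_HREF = f"{{{XLINK_NS}}}href"


def is_simple_link(elem):
    """Return True if elem should be treated as a simple link (assumed 1.1 rule)."""
    link_type = elem.get(XLINK_TYPE)
    if link_type is not None:
        # An explicit xlink:type always wins.
        return link_type == "simple"
    # No xlink:type: a bare xlink:href is enough under the assumed 1.1 rule.
    return elem.get(XLINK_HREF) is not None


def simple_links(xml_text):
    """List (element name, href) pairs for every simple link in the document."""
    root = ET.fromstring(xml_text)
    return [(e.tag, e.get(XLINK_HREF)) for e in root.iter() if is_simple_link(e)]
```

Under XLink 1.0 rules only the first branch would apply, which is why vocabularies such as SVG leaned on DTD attribute defaults to supply xlink:type behind the scenes.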
Even so, the changes in version 1.1 don't address the more fundamental complaints with XLink. For example, XLink 1.1 is still incompatible with any version of XHTML, but especially XHTML 2.0, which uses two distinct attribute names for different kinds of links.
However, the remainder of the thread on xml-dev consisted largely of folks pointing out increasing usage of XLink: Geoffrey Shuetrim pointed out use in XBRL, Alexander Johannesen pointed out Topic Maps XTM, and Rick Jelliffe highlighted his company's Schema Management Tool, which uses XLink extensively under the hood.
Does this indicate a resurgence in XLink's fortunes? Bullard concludes by saying that "the usefulness of architectural forms is apparent," a sensible position.
Prediction: XLink 1.1 isn't going to do much beyond encouraging uses where it has historically been politely included.
Proposals for something like
xml:id go back a long stretch of
time. Even in November of 2001, Leigh Dodds, in this very column, outlined
several proposals including the one we see today substantially as a Proposed
Recommendation. Tim Bray (famously) wrote:
This is in danger of tripping over what is maybe the #1 gaping architectural hole as regards XML & the Web. The problem is that at the moment, given some arbitrary XML, there is no good way to determine what's an ID without recourse to some external resource like a DTD or schema, and that, to use a technical term, sucks.
Who's to say that long-standing problems with XML never get solved? But the
tricky thing about the way specifications interact under composition is that
solving one problem often uncovers another. In this case, the use of the reserved
xml: prefix, originally justified for this architectural hole as regards XML, learned a
new trick, or at least unlearned an old one.
The attributes xml:lang and xml:space, both defined
as part of XML itself, as well as the later addition xml:base, all apply
over a scope that includes child elements. For example, defining any of these
attributes once on the root element would have an effect on every element in
the document. But xml:id, in contrast, doesn't follow this assumption: it
applies to a single element. Specifications that made assumptions about scoping
xml-namespaced attributes will run into problems if documents using
xml:id become common. Canonical XML (including its influence
on XML Signature) is the most often cited example here.
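The difference in scoping is easy to see in a few lines of Python. The sketch below, with hypothetical helper names, resolves xml:lang by walking up through ancestors, as an inherited attribute requires, but reads xml:id from the element itself and nowhere else. It also shows IDs being found with no DTD or schema in sight. It skips the value normalization and uniqueness checks that a real xml:id processor would be expected to perform.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
XML_LANG = f"{{{XML_NS}}}lang"
XML_ID = f"{{{XML_NS}}}id"


def parent_map(root):
    """ElementTree keeps no parent pointers, so build a child -> parent map."""
    return {child: parent for parent in root.iter() for child in parent}


def effective_lang(elem, parents):
    """xml:lang is scoped: use the nearest value on the element or an ancestor."""
    node = elem
    while node is not None:
        lang = node.get(XML_LANG)
        if lang is not None:
            return lang
        node = parents.get(node)
    return None


def index_by_xml_id(root):
    """xml:id is not scoped: each value identifies only the element carrying it."""
    return {e.get(XML_ID): e for e in root.iter() if e.get(XML_ID) is not None}
```

A processor that copied every xml-namespaced attribute down from ancestors, on the assumption that they all behave like xml:lang, would hand the same ID to a whole subtree, which is exactly the kind of conflict described above.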
Prediction: xml:id will rapidly find its way into vocabularies that need
well-known IDs, especially without DTD processing. Other specifications that
over assumed about
xml-attribute scoping will quickly come into
In Java programming, composition is straightforward and almost foolproof. With specifications, however, it's still possible for all kinds of unexpected interactions to take place. Further, the number of technical specifications seems to be steadily increasing with time, so that chances of conflicts continue to rise. The xml:id example showed that sometimes even a nearly-unrelated spec can come along and throw chaos into your world by dismantling fundamental assumptions.
Prediction: more problems are yet to be uncovered in the various combinations of existing XML specifications, to say nothing of the new ones. Somehow, we'll keep muddling through.
Births, Deaths, and Marriages
Versions 1.6 and 2.0Beta1 of nxslt, a free .NET XSLT command-line utility from Oleg Tkachenko.
Documents and Data
An identifier by any other name...
Rick Jelliffe on untangling the Schema spaghetti
If you're into the podcast thing, this one looks interesting.