July 20, 2005
"If you use inheritance where composition will work, your designs will become needlessly complicated." —Bruce Eckel, Thinking in Java
This week's column takes a look at two new specifications that are winding through the W3C Recommendation Track, xml:id as a Proposed Recommendation and XLink 1.1 as a Last Call Working Draft. These two specifications share an important common trait: neither is a standalone vocabulary, but rather they are intended to be combined into larger vocabularies. The formal name for "combining stuff" this way is "composition," this week's topic. First, a thirty-second review of some basic terms for any nonprogrammers.
Code written by someone who has just learned object-oriented techniques is usually
easy to spot. One of the main tells is enthusiastic overuse of derivation, that is,
new code (called a "class") explicitly based on existing code, modulo specific
changes. Derivation is an important tool for expressing clear is-a relationships. It
might make sense to derive a new class, say Circle or Polygon, from a Shape class,
because a polygon or circle really is a kind of shape. More to the point, a sign that
derivation makes sense is when circles and polygons can be treated more generally as
shapes; for example, the code might say, "Attention shapes: render yourselves to
SVG." Other cases, however, are not so clear-cut. For
example, is an
XMLDom class really a derivation of
Probably not. In designing software, many such situations arise; experts such as Bruce Eckel recommend sticking with composition in those cases.
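For readers who would like to see the distinction in code, here is a minimal Java sketch. All class names (Shape, Circle, Polygon, Drawing, ShapesDemo) are hypothetical and chosen only to mirror the example above. Circle and Polygon are derived from Shape because each really is a shape, while Drawing merely has a list of shapes and so is built by composition rather than by extending a list class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo entry point: "Attention shapes: render yourselves to SVG."
class ShapesDemo {
    public static void main(String[] args) {
        Drawing drawing = new Drawing()
                .add(new Circle(10, 10, 5))
                .add(new Polygon("0,0 10,0 5,8"));
        System.out.println(drawing.toSvg());
    }
}

// Derivation: every concrete shape is-a Shape and can render itself.
abstract class Shape {
    abstract String toSvg();
}

class Circle extends Shape {
    private final int cx, cy, r;

    Circle(int cx, int cy, int r) {
        this.cx = cx;
        this.cy = cy;
        this.r = r;
    }

    @Override
    String toSvg() {
        return "<circle cx=\"" + cx + "\" cy=\"" + cy + "\" r=\"" + r + "\"/>";
    }
}

class Polygon extends Shape {
    private final String points;

    Polygon(String points) {
        this.points = points;
    }

    @Override
    String toSvg() {
        return "<polygon points=\"" + points + "\"/>";
    }
}

// Composition: a Drawing has-a list of shapes; it is not itself a kind of list.
class Drawing {
    private final List<Shape> shapes = new ArrayList<>();

    Drawing add(Shape shape) {
        shapes.add(shape);
        return this;
    }

    String toSvg() {
        StringBuilder sb = new StringBuilder("<svg xmlns=\"http://www.w3.org/2000/svg\">");
        for (Shape shape : shapes) {
            sb.append(shape.toSvg()); // each shape is treated generally as a Shape
        }
        return sb.append("</svg>").toString();
    }
}
```

Had Drawing extended ArrayList instead, every list operation would have become part of its public interface, which is the kind of needless complication the opening quote warns about.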
In the realm of standards, both derivation and composition take place. Derivation might occur when one specification forms an extended subset of another—almost always a sign that something has gone horribly wrong. At a basic level, though, composition happens all the time in the normal course of producing specifications. A look at the XML 1.1 specification shows eight "normative" references, including the Unicode specification, which forms the basis for much of XML's interoperability.
Designing an XML vocabulary is a special case of producing a specification. Although XML Namespaces was finalized more than five years ago, architects are still uncovering trouble in applying composition in XML, more commonly referred to under the banner "compound documents." The first specification to look at this week is a revision to XLink.
XLink 1.0 became a W3C Recommendation in June 2001, amid a fair amount of controversy. Support for the standard has been slow to come; in a March 2002 article titled "XLink: Who Cares?" Bob DuCharme noted that only seven partial implementations were available. In fact, the death of XLink has become one of several permathreads on the xml-dev mailing list.
Len Bullard, in one such thread:
After the thread a while back about the death of XLink, I keep finding examples of it being cited, particularly in the OGC [Open GIS Consortium] literature... I wonder if there is a term for this phenomenon: polite inclusion of other works that are not successful in terms of ground support.
One of the most prominent examples of politely including XLink has been inside SVG (Scalable Vector Graphics), but even there, DTD (document type definition) tricks are used to remove the need for some of the explicit markup. XLink 1.1 legitimizes the use of xlink:href attributes without need of any further tricks, which is, of course, a pleasant development.
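One way to picture the relaxed rule is as a check on a parsed element. The sketch below assumes the XLink 1.1 draft's treatment of an element that carries xlink:href but no xlink:type as a simple link; the class and method names (XLinkCheck, isSimpleLink10, isSimpleLink11) are hypothetical and serve only as illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper contrasting XLink 1.0-style and 1.1-style simple-link detection.
class XLinkCheck {
    static final String XLINK_NS = "http://www.w3.org/1999/xlink";

    public static void main(String[] args) {
        Element a = parseRoot(
                "<a xmlns:xlink=\"http://www.w3.org/1999/xlink\" xlink:href=\"#target\"/>");
        System.out.println("1.0-style simple link: " + isSimpleLink10(a));
        System.out.println("1.1-style simple link: " + isSimpleLink11(a));
    }

    // Parse a string with a namespace-aware parser and return the root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // 1.0 style: the element must say xlink:type="simple" explicitly
    // (in practice often supplied through a DTD default).
    static boolean isSimpleLink10(Element e) {
        return "simple".equals(e.getAttributeNS(XLINK_NS, "type"));
    }

    // 1.1 style (assumed rule): an xlink:href with no xlink:type is enough.
    static boolean isSimpleLink11(Element e) {
        if (e.hasAttributeNS(XLINK_NS, "type")) {
            return "simple".equals(e.getAttributeNS(XLINK_NS, "type"));
        }
        return e.hasAttributeNS(XLINK_NS, "href");
    }
}
```

The DTD tricks mentioned above amount to having the DTD supply the xlink:type value that the first check demands; the second check shows why they are no longer needed.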
Even so, the changes in version 1.1 don't address the more fundamental complaints about XLink. For example, XLink 1.1 is still incompatible with any version of XHTML, but especially XHTML 2.0, which uses two distinct attribute names for different kinds of links.
However, the remainder of the thread on xml-dev consisted largely of folks pointing out increasing usage of XLink: Geoffrey Shuetrim pointed out use in XBRL, Alexander Johannesen pointed out Topic Maps XTM, and Rick Jelliffe highlighted his company's Schema Management Tool, which uses XLink extensively under the hood.
Does this indicate a resurgence in XLink's fortunes? Bullard concludes by saying that "the usefulness of architectural forms is apparent," a sensible position.
Prediction: XLink 1.1 isn't going to do much beyond encouraging uses where it has historically been politely included.
Proposals for something like xml:id go back a long stretch of time. Even in November of 2001, Leigh Dodds, in this very column, outlined several proposals, including one we see today substantially as a Proposed Recommendation. Tim Bray (famously) wrote:
This is in danger of tripping over what is maybe the #1 gaping architectural hole as regards XML & the Web. The problem is that at the moment, given some arbitrary XML, there is no good way to determine what's an ID without recourse to some external resource like a DTD or schema, and that, to use a technical term, sucks.
Who's to say that long-standing problems with XML never get solved? But the tricky thing about the way specifications interact under composition is that solving one problem uncovers another. In this case, the use of the reserved xml prefix, originally justified for this architectural hole as regards XML, learned a new trick, or at least unlearned an old one.
The attributes xml:lang and xml:space, both defined as part of XML itself, as well as the later addition xml:base, function over a scope that includes child elements. For example, defining any of these attributes once on the root element would have an effect on every element in the document. But xml:id, by contrast, doesn't follow this assumption: it applies to a single element. Specifications that made assumptions about the scoping of xml-namespaced attributes will run into problems if documents with xml:id become common. Canonical XML (including its influence on XML Signature) is the most often cited example here.
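The scoping difference can be sketched in a few lines of Java using the standard DOM API. The helper names here (XmlAttrScope, inherited, own) are hypothetical: the first walks up the ancestors the way a processor would to find the xml:lang value in effect, while the second looks only at the element itself, which is all that makes sense for xml:id.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper contrasting inherited and element-only xml: attributes.
class XmlAttrScope {
    public static void main(String[] args) {
        Element root = parseRoot("<doc xml:lang=\"en\" xml:id=\"top\"><p>text</p></doc>");
        Element p = (Element) root.getFirstChild();
        System.out.println("xml:lang in effect on p: " + inherited(p, "lang"));
        System.out.println("xml:id on p: " + own(p, "id"));
        System.out.println("xml:id on doc: " + own(root, "id"));
    }

    // Parse a string with a namespace-aware parser and return the root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Scoped lookup (xml:lang, xml:space style): the nearest ancestor-or-self wins.
    static String inherited(Element start, String localName) {
        for (Node n = start; n instanceof Element; n = n.getParentNode()) {
            Element e = (Element) n;
            if (e.hasAttributeNS(XMLConstants.XML_NS_URI, localName)) {
                return e.getAttributeNS(XMLConstants.XML_NS_URI, localName);
            }
        }
        return null; // not set anywhere up the tree
    }

    // Element-only lookup (xml:id style): ancestors are never consulted.
    static String own(Element e, String localName) {
        return e.hasAttributeNS(XMLConstants.XML_NS_URI, localName)
                ? e.getAttributeNS(XMLConstants.XML_NS_URI, localName)
                : null;
    }
}
```

Code that applies the inherited lookup to every xml-namespaced attribute would wrongly hand the root element's ID to each of its descendants, which gives a feel for how a scoping assumption baked into another specification can go wrong.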
Prediction: xml:id will rapidly find its way into vocabularies that need well-known IDs, especially without DTD processing. Other specifications that assumed too much about xml-attribute scoping will quickly come into line.
In Java programming, composition is straightforward and almost foolproof. With specifications, however, it's still possible for all kinds of unexpected interactions to take place. Further, the number of technical specifications seems to be steadily increasing with time, so the chances of conflicts continue to rise. The xml:id example showed that sometimes even a nearly unrelated spec can come along and throw chaos into your world by dismantling fundamental assumptions.
Prediction: more problems are yet to be uncovered in the various combinations of existing XML specifications, to say nothing of the new ones. Somehow, we'll keep muddling through.
Births, Deaths, and Marriages
Versions 1.6 and 2.0Beta1 of nxslt, a free .NET XSLT command-line utility from Oleg Tkachenko.
Documents and Data
An identifier by any other name...
Rick Jelliffe on untangling the Schema spaghetti
If you're into the podcast thing, this one looks interesting.