A New Kind of Namespace

August 22, 2001

Followers of this column will be more than aware that when a technology emerges from the W3C there are always teething pains. W3C Recommendations aren't quite the shrink-wrapped parcels of authority they're often taken to be. Perhaps the canonical example of this has been the Namespaces in XML Recommendation, the somewhat unpredictable repercussions of which are still being felt; all the while, typically, the specification is heralded as a fine piece of work by some of its major supporters.

This summer's problem child has undoubtedly been W3C XML Schema. One Schema issue in particular has already figured strongly in these pages: the use of unqualified elements within namespace-qualified types. Last week's discussion on XML-DEV, while not resolving this issue, has produced a useful crystallization of the motivation behind this technique. This article describes that motivation, the import of which seems to be that XML Schema has added another form of namespaces to XML.

To get up to speed on the issue, you may want to review the debate thus far, as summarized by the XML-Deviant. The issue was introduced in Schema Scuffles and Namespace Pains. The problem is caused by structures such as

<foo:person xmlns:foo="http://best.practice.com">

  <familyName> KAWAGUCHI </familyName>

  <lastName> Kohsuke </lastName>

</foo:person>

W3C XML Schema encourages the creation of such structures. The major complaint against them is that a reader or processor needs the corresponding XML Schema in order to understand that the familyName and lastName element types belong to the logical unit foo:person, whereas if they were in the same namespace it's a fair bet that they belong to the same vocabulary. Simon St. Laurent even went so far as to write some software that would normalize such structures, moving familyName and lastName into the same namespace as person.

The debate progressed and the view was taken that this is largely an issue of best practices. Yes, W3C XML Schema lets you do this, but you're probably better off not doing it. In fact, W3C XML Schema encourages this practice, by defaulting child elements in a type to be without a namespace. Leigh Dodds described a debate that illuminated this odd situation a little more in Opening Old Wounds, but he bumped up against W3C confidentiality rules, meaning that the general public didn't really get an explanation for the Schema Working Group's decision to make child elements unqualified by default.
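To make the default concrete, here is a minimal sketch of a schema that would accept the foo:person instance shown earlier. The discussion itself did not include a schema, so the content model (a sequence of two strings) is an assumption made purely for illustration. The point to notice is that nothing special has been written to get unqualified children: leaving elementFormDefault off the xs:schema element is enough.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://best.practice.com">

  <!-- elementFormDefault is not given, so it takes its default
       value, "unqualified" -->

  <!-- Global declaration: person is in the target namespace -->
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <!-- Local declarations: in no namespace by default.
             The xs:string types are an illustrative assumption. -->
        <xs:element name="familyName" type="xs:string"/>
        <xs:element name="lastName" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Adding elementFormDefault="qualified" to the xs:schema element would reverse the behavior: the locally declared familyName and lastName would then have to appear in the http://best.practice.com namespace, just like person.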

Over the last week more information came to light in an exchange between Tim Bray (erstwhile editor of the XML Namespaces specification) and Matthew Fuchs, who took part in the W3C XML Schema Working Group.

After St. Laurent released his software to fix up document instances to fit the fully-qualified idea of the world, there was considerable furor on XML-DEV. In response, Bray said that, "having spent some time reading this thread, I realized I didn't understand either local types or Simon's filters." Bray went on to sum up what he saw as the motivation for the situation.

[P]eople want to use schema X to validate element Y, but they don't want element Y to be in a namespace (even a defaulted namespace), they want the connection picked up from the namespace of an ancestor element, and "local element types" allow this.

At one level this seems like a reasonable thing to want to do: "please use the following rules to validate no-namespace elements whose type is Y and whose ancestor is myNS:Z."

Having identified the motivation, Bray then noted the opposing case, namely, that it sits awkwardly with some of the basic ideas behind using XML Namespaces in the first place.

On the other hand, it does contravene the achingly-simple procedure for linking markup to software provided by XML+namespaces: identify markup to software by putting it in a namespace. Which feels pretty serious. The default, simple, obvious way of arranging for software module X to process element Y is to advertise that X processes elements which are in the NSx namespace. It's kind of troubling that schema supports a non-interoperable, more complex, less robust way of tying markup to software. That this is its default behavior is simply outrageous.

In response to Bray, Matthew Fuchs replied, taking as his theme the idea that the motivation behind the local types mechanism in W3C XML Schema is the same as that which led to XML Namespaces. Fuchs first countered Bray's perception of why people want locally scoped names in their schemas.

The motivation is the same as for any data definition language that's arisen in the last 40 years that has locally scoped names -- the use of a particular name in a particular context should not pollute the global context, but allow the same name to be used for different things in different contexts. Which, at the semantic level, is exactly what namespaces were introduced for -- let's not confuse the goal with the particular syntax introduced to handle it.

Fuchs further explained the desire to use locally scoped elements with a simple example. In an XML vocabulary to coordinate music, graphics, and text, a musician, a designer, and an editor all have the notion of the "line", and each wishes to have that notion differentiated from the others. If local scoping were not possible, then the unqualified line element type would have global scope in the schema, and the three concepts would be indistinguishable. Giving each one's idea of a line its own namespace would solve the problem, but lead to instances like

<music><music:line>

... </music:line></music>

Fuchs says avoiding such fragmentation is the very purpose of XML Schema's use of local names.

The goal of local elements is to support this degree of differentiation among <line> elements without needing to explicitly break a schema into a huge number of little schemas, each establishing individual namespaces. Yes it can be misconstrued as "syntactic sugar", but everything in programming is syntactic sugar on machine language, anyway. ... As I hope you can see by now, the goal is "please recognize which elements are locally scoped and process them within the semantics provided by their enclosing global element"

Countering Bray's accusation of poor interoperability, Fuchs saw the default behavior of XML Schema as more interoperable than if child elements were namespace qualified by default. He claimed that it's as compatible as possible with current document practice, and he suggested as a comparison the use of attributes: "local attributes are not in a namespace, so local elements should not be either." (A miniature version of this issue has recently plagued the RDF folks. In the end, they opted for a different solution: to insist on namespace qualified attributes in their vocabulary.)

So, in a sense, the mystery is at an end. The inclusion of locally-scoped types in XML Schema is not an anomaly. It does come as a surprise, however, that it's taken this long to find a reasonable justification of the functionality. Tim Bray seemed happy with Fuchs' explanation:

OK, I think I get it. Local element types allow the <line> element to have different validation rules depending on whether it's a child of <matt:music>, <matt:graphics> or <matt:text>. Clearly something that DTD's can't do but is desirable.
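Bray's summary can be sketched as a single schema in which line is declared three times, each time locally and each time with a different type. This is only an illustration: the namespace URI, the note element, and the coordinate attributes are all hypothetical, and neither Fuchs nor Bray posted a schema like this.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org/matt">

  <!-- The namespace URI above is a placeholder. Local elements
       are unqualified, since elementFormDefault is left at its default. -->

  <!-- A musician's line: a sequence of notes (hypothetical model) -->
  <xs:element name="music">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="line" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="note" type="xs:string"
                          maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- A designer's line: two end points (hypothetical model) -->
  <xs:element name="graphics">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="line" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="x1" type="xs:decimal" use="required"/>
            <xs:attribute name="y1" type="xs:decimal" use="required"/>
            <xs:attribute name="x2" type="xs:decimal" use="required"/>
            <xs:attribute name="y2" type="xs:decimal" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- An editor's line: plain character data -->
  <xs:element name="text">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="line" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

With the prefix matt bound to the target namespace, an instance would then write an unprefixed line inside matt:music, matt:graphics, or matt:text, and a validator would pick the content model from the enclosing element. A DTD cannot express this, because it allows only one declaration per element name.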

It's unlikely this will be the last word on the topic. Locally scoped names seem destined to be a topic, like XML namespaces, in need of an eventual myth-busting analysis. But, for now, at least we know why locally scoped names were included in W3C XML Schema, something that will help the expert community develop best practices.