
Schema Scuffles and Namespace Pains

May 30, 2001

Edd Dumbill

Leigh Dodds, after a well-deserved vacation, will be returning next week. Meanwhile, after the latest bout of XML conferences, I have the pleasure of casting the XML-Deviant's eye over some of May's developments in the XML developer community.

Schema Scuffles

The W3C XML Schema Definition Language was released as a Recommendation at the beginning of May, at the Tenth International World Wide Web Conference. Once the collective sighs of relief had been heaved, though, it would have been unrealistic to expect silence to descend on the topic.

Kohsuke Kawaguchi posted a reference to an article, XML Schema Dos and Don'ts, which sets out his recommended practices for keeping XML Schemas simple. His rather direct introduction states:

Several similar documents are already available on the web. But I discovered that those documents are written by some special people; they are brilliant people who always drive things to the limit. They simply can't stop inventing cool tricks that even the working group member can't imagine.

For them, XML Schema is a new favorite toy.

There has to be a different document. A document for those who use XML Schema for business --- for those who are at a loss how to use it.

Kawaguchi's document provoked a considered response from Martin Gudgin, a member of the Schema Working Group, who disagreed over which features from XML Schema were disposable. One note of dissent in particular concerned the use of namespaces (what else?).

Regarding local declarations. I have to take issue with your assertion that


<foo:person xmlns:foo="http://best.practice.com">
  <familyName> KAWAGUCHI </familyName>
  <lastName> Kohsuke </lastName>
</foo:person>

is 'bad use of XML namespaces'. I have lots and lots of places where I use *exactly* that approach and it works very nicely for what I do. I don't think you can call this one way or the other. Neither approach is *wrong*, they're just different

This complaint was the start of an extended debate on proper namespace usage. Kawaguchi's assertion is that the child elements inside the foo:person element should be in the same namespace as their parent. Gudgin countered that there is more than one way to use namespaces, that his was equally valid, and that it is indeed the approach used in the SOAP specification.
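To see what the disagreement is about in practice, here is a minimal sketch using Python's standard xml.etree.ElementTree module to parse the instance quoted above and report the namespace of each element. The helper split_tag is a hypothetical name introduced for this illustration; it is not part of anyone's posting.

```python
import xml.etree.ElementTree as ET

# The instance document quoted above: only the parent carries a prefix.
DOC = """
<foo:person xmlns:foo="http://best.practice.com">
  <familyName> KAWAGUCHI </familyName>
  <lastName> Kohsuke </lastName>
</foo:person>
"""

def split_tag(tag):
    """Split ElementTree's '{uri}local' form into (uri, local).

    Hypothetical helper; uri is None for an element in no namespace.
    """
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag

root = ET.fromstring(DOC.strip())
names = [split_tag(el.tag) for el in root.iter()]

for uri, local in names:
    print(f"{local}: {uri}")
# person: http://best.practice.com
# familyName: None
# lastName: None
```

A namespace-aware parser reports the parent as qualified and both children as being in no namespace, which is exactly the arrangement Kawaguchi objects to and Gudgin defends.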

Gudgin appealed to the notion that his way was more natural for data-oriented XML applications and more directly mirrored the approach a programming language takes to scoping names.


public class Person
{
    String lastName;
    String firstName;
}

I don't think anyone would claim that the fields of the Person class were in the best.practice.com package. They are local to the Person class. Another class in the same package could have the same field names and there would be no confusion on the part of the Java compiler/programmer/VM

So, the most natural mapping, to me, is to take the same approach when serializing the class as XML...

The debate bounced back and forth between the two protagonists, until Jonathan Borden intervened with a simpler solution.

This argument seems hopelessly complicated. The most reasonable way to define a person name structure is:


<person.name xmlns="http://example.org/person">
<given>Martin</given>
<family>Gudgin</family>
</person.name>

why would anyone want to complicate this with different namespaces for each element of the structure?
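Borden's version relies on a default namespace declaration, which applies to the element that carries it and to every unprefixed element inside it. A short sketch, again assuming Python's standard xml.etree.ElementTree as the parser, shows the effect:

```python
import xml.etree.ElementTree as ET

# Borden's example: a default namespace declared on the parent element.
BORDEN = """
<person.name xmlns="http://example.org/person">
<given>Martin</given>
<family>Gudgin</family>
</person.name>
"""

root = ET.fromstring(BORDEN.strip())

# ElementTree writes qualified names as '{namespace-uri}local-name'.
tags = [el.tag for el in root.iter()]
for tag in tags:
    print(tag)
# {http://example.org/person}person.name
# {http://example.org/person}given
# {http://example.org/person}family
```

All three elements end up in the same namespace without a single prefix appearing in the markup.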

The central point of the debate is that unprefixed child elements do not inherit a namespace from a prefixed parent element; unless a default namespace declaration is in scope, they are in no namespace at all. This seems relatively straightforward, but Gudgin pointed out that XML Schema allows types like


<complexType name='person'>
  <sequence>
    <element name='given' type='string' />
    <element name='family' type='string' />
  </sequence>
</complexType>

Given an XML Schema for a document, then, it is possible to know that the child elements do in some sense belong to their parent, in a way which is not apparent simply from the XML instance.
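As a rough illustration of how that knowledge can be recovered from the schema rather than the instance, the sketch below reads the type definition and lists the elements declared locally inside it. It assumes the fragment sits inside a schema element whose default namespace is the XML Schema namespace; the targetNamespace value and the function name local_elements are hypothetical, chosen only for this example. It does not validate anything, it merely reads the declarations.

```python
import xml.etree.ElementTree as ET

XSD = "http://www.w3.org/2001/XMLSchema"

# The complexType from the text, wrapped in an assumed <schema> element.
SCHEMA = """
<schema xmlns="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://example.org/person">
  <complexType name='person'>
    <sequence>
      <element name='given' type='string' />
      <element name='family' type='string' />
    </sequence>
  </complexType>
</schema>
"""

def local_elements(schema_text):
    """Map each named complexType to the element names declared inside it."""
    root = ET.fromstring(schema_text.strip())
    owners = {}
    for ctype in root.findall(f"{{{XSD}}}complexType"):
        declared = [el.get("name") for el in ctype.iter(f"{{{XSD}}}element")]
        owners[ctype.get("name")] = declared
    return owners

owners = local_elements(SCHEMA)
print(owners)
# {'person': ['given', 'family']}
```

A processor holding this table can tell that given and family belong to the person type even when the instance leaves them unqualified; a processor without the schema has no such information.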

Observing this, Simon St. Laurent offered a comment.

Unfortunately, I think the way you're going about using namespaces makes a lot of potentially unfounded assumptions about how the document will be processed. In an environment you completely control, that's fine, but if your documents ever leak out to the rest of the world, you may well find that no one else shares your assumptions and that therefore the document is interpreted quite differently.

As the debate continued, Tim Bray, one of the editors of the Namespaces in XML specification, weighed in on the side of fully qualifying child elements.

I was reading through this thread with interest, and could see both sides of the argument, but find myself swinging more and more strongly one way.

At one level, what's going on in the following is pretty obvious and natural ... I'm sure there are large classes of applications that will Do The Right Thing. But the more I think about this, the more I believe that this is probably pretty bad practice.

Bray proceeded to explain that Gudgin's style of markup breaks down when it is mixed with names from another namespace. He concluded that the "admirable simplicity" of Gudgin's example wasn't quite "pleasing enough to justify the cost, which is considerable."
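The kind of breakage at issue can be sketched as follows. This is an illustration of the general point, not Bray's own example: the order, person, company and name elements and both namespace URIs are hypothetical. Two vocabularies each use an unqualified name child, and a lookup by name alone can no longer tell them apart.

```python
import xml.etree.ElementTree as ET

PERSON_NS = "http://example.org/person"    # hypothetical vocabulary
COMPANY_NS = "http://example.org/company"  # hypothetical vocabulary

# Locally scoped style: only the parents are qualified.
LOCAL_STYLE = """
<order xmlns:p="http://example.org/person" xmlns:c="http://example.org/company">
  <p:person><name>Martin</name></p:person>
  <c:company><name>Example Ltd</name></c:company>
</order>
"""

# Fully qualified style: children share their parent's namespace.
QUALIFIED_STYLE = """
<order xmlns:p="http://example.org/person" xmlns:c="http://example.org/company">
  <p:person><p:name>Martin</p:name></p:person>
  <c:company><c:name>Example Ltd</c:name></c:company>
</order>
"""

local_root = ET.fromstring(LOCAL_STYLE.strip())
# Looking up the unqualified name matches both vocabularies at once.
ambiguous = [el.text for el in local_root.findall(".//name")]
# Disambiguating requires knowledge of the parent element.
by_parent = [el.text for el in local_root.findall(f".//{{{PERSON_NS}}}person/name")]

qualified_root = ET.fromstring(QUALIFIED_STYLE.strip())
# With qualified children the name alone is sufficient.
person_names = [el.text for el in qualified_root.findall(f".//{{{PERSON_NS}}}name")]
company_names = [el.text for el in qualified_root.findall(f".//{{{COMPANY_NS}}}name")]

print(ambiguous)      # ['Martin', 'Example Ltd']
print(by_parent)      # ['Martin']
print(person_names)   # ['Martin']
print(company_names)  # ['Example Ltd']
```

Software that dispatches on element names alone works on the qualified version but needs extra context, or a schema, to handle the locally scoped one.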

There's only been room for a small section of the entire debate here. If you have the time, read the thread in the XML-DEV archives. Not only is it instructive, but it also exemplifies the lively flavor of XML-DEV debates.

Expectations for XML Schema 1.1

Responding to a message implying that "co-constraints" would be introduced as a feature in XML Schema 1.1 or 2.0, Rick Jelliffe doubted that such functionality would appear in XML Schema 1.1. Co-constraints are constraints on instances in which the permissible values of one element depend on the value of a different element.
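For readers who have not met the term, a co-constraint can be sketched in ordinary procedural code. The measurement, unit and value elements and the rules attached to them are hypothetical, and this is neither XML Schema nor Schematron syntax; it only shows the shape of a rule in which one element's permitted values depend on another's.

```python
import xml.etree.ElementTree as ET

def check_measurement(xml_text):
    """Return co-constraint violations for a hypothetical <measurement>.

    The allowed range of <value> depends on the content of <unit>.
    """
    measurement = ET.fromstring(xml_text)
    unit = measurement.findtext("unit")
    value = float(measurement.findtext("value"))

    errors = []
    if unit == "percent" and not 0 <= value <= 100:
        errors.append("a percent value must lie between 0 and 100")
    if unit == "kelvin" and value < 0:
        errors.append("a kelvin value must not be negative")
    return errors

good = "<measurement><unit>percent</unit><value>50</value></measurement>"
bad = "<measurement><unit>percent</unit><value>120</value></measurement>"

print(check_measurement(good))  # []
print(check_measurement(bad))   # ['a percent value must lie between 0 and 100']
```

Neither document can be rejected by looking at unit or value in isolation, which is why such rules fall outside what a purely structural, element-by-element content model expresses.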

Jelliffe expanded:

I don't expect these kinds of constraints will be in XML Schemas 1.1. I imagine the Schema WG will be more interested in making good on the existing functionality (e.g. the incomplete areas such as date/times, key scoping) rather than venturing into new areas.

Another reason not to expect it for 1.1 is that the Schematron model's strength is that it assumes random access to the whole document, while XML Schemas has been limited to allow streaming implementations (a respectable goal too). So in order to be consistent with XML Schema's design, a Schematron-like constraint language would have to limit the kinds of XPaths allowed: there has been no discussion anywhere on this issue (except the related issue of the XPath subset to allowing streaming processing on keys) and whether the result is useful.

Little else has been said publicly about what XML Schema 1.1 may or may not contain, but it seems likely that consolidation and perhaps simplification are in the cards.

Namespace Pains, Again

If you thought Namespaces in XML had caused enough pain for a lifetime, then perhaps it's time to find a different job. Rumors from multiple areas of the XML world indicate that the W3C may be planning to reopen the can of namespace worms, in an attempt to resolve issues with the specification.

While such attention to detail seems admirable, the potential for further confusion and delay makes a compelling case for the W3C to leave Namespaces in XML alone. Given this, and the experience of the XML-URI firestorm last year, one can only conclude that the W3C must have a very strong reason to revive this debate.