XML.com: XML From the Inside Out

Namespace Nuances
by John E. Simpson

Complications ensue...

You may have picked up on a couple of odd, unexplained features of the preceding document.

Once you start using namespaces in a particular document, you must commit to going the whole way. In theory, only the names of the two table element types needed to be disambiguated. In practice, though, you use namespaces to disambiguate entire vocabularies -- even the names of elements, like td and chair above, which are already unambiguous. Thus, if you decide to require that the furniture-type table have a furn: prefix, you're committed to using that prefix on the names of the furniture, chair, and lamp elements as well.
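To see why the prefix has to go on every name in the vocabulary, here is a minimal sketch using Python's standard xml.etree.ElementTree parser. The fragment is hypothetical: it reuses the furn prefix and the two URIs from this article, but deliberately leaves the prefix off chair. The helper name namespace_of is my own invention.

```python
import xml.etree.ElementTree as ET

FURN = "http://myfurn/namespace"
XHTML = "http://www.w3.org/1999/xhtml"

# Hypothetical fragment: <chair> is missing the furn: prefix, <lamp> has it.
DOC = f"""<html xmlns="{XHTML}" xmlns:furn="{FURN}">
  <furn:furniture>
    <furn:table/>
    <chair/>
    <furn:lamp/>
  </furn:furniture>
</html>"""

def namespace_of(xml_text, local_name):
    """Return the namespace URI of the first element with this local name."""
    for el in ET.fromstring(xml_text).iter():
        tag = el.tag
        # ElementTree spells namespaced names as "{uri}local".
        uri, local = tag[1:].split("}", 1) if tag.startswith("{") else ("", tag)
        if local == local_name:
            return uri
    return None

print(namespace_of(DOC, "lamp"))   # the furniture URI
print(namespace_of(DOC, "chair"))  # the XHTML URI, probably not what was meant
```

The unprefixed chair quietly lands in the default (XHTML) namespace, even though its name was never ambiguous to a human reader.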

As previously noted, a particular namespace prefix's associated URI need not be the "web address" of anything in particular. There is nothing at the URI http://myfurn/namespace (except a "document not found" error). On the other hand, there's definitely something at the http://www.w3.org/1999/xhtml URI associated with the "empty namespace prefix." (I'll leave for you the exercise of inspecting that "something.")

But the above document introduces some more profound questions.

Deeper mysteries

The first mystery is that the document above no longer contains two different elements that are both named table. It now contains a table element and a furn:table element. As far as XML 1.0 and a DTD are concerned, the prefix is part of the element name.
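A rough way to look at both views of the same name is Python's standard xml.dom.minidom, which keeps the qualified name (prefix included) alongside the local name and namespace URI. The document content here is made up for illustration, and describe_elements is a hypothetical helper; only the element names and URIs come from the article.

```python
from xml.dom import minidom

# Hypothetical document: names and URIs follow the article's example,
# the content is invented.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:furn="http://myfurn/namespace">
  <table><tr><td>cell</td></tr></table>
  <furn:furniture>
    <furn:table/>
    <furn:chair/>
  </furn:furniture>
</html>"""

def describe_elements(xml_text):
    """Return (qualified name, local name, namespace URI) for every element."""
    dom = minidom.parseString(xml_text)
    return [
        (el.tagName, el.localName, el.namespaceURI)
        for el in dom.getElementsByTagName("*")
    ]

for qname, local, uri in describe_elements(DOC):
    print(f"{qname:16} local={local:10} ns={uri}")
```

Both table and furn:table report the same local name, but their qualified names differ, and the qualified name is the only one a DTD ever sees.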

The second mystery is the real killer, and it's the reason why the original questioner is having trouble with the checkbook application. If you mix element types from two different vocabularies, how can you possibly validate a document at all, given that a valid document may contain no more than one DOCTYPE declaration, referencing no more than one DTD?

The answer is weird but also (once you think about it) obvious. Either (a) you can't validate it at all, or (b) you can validate it only if you include, in the one referenced DTD, all element names -- including their prefixes and all namespace-declaring attributes.

Case (a) isn't as outlandish an option as you might imagine. It's one of the most common solutions, thanks in part to XSLT's popularity. An XSLT stylesheet must contain elements from the XSLT vocabulary, such as xsl:stylesheet and xsl:template, and these are intermingled in the stylesheet with elements from the result tree vocabulary. Validating an XSLT stylesheet is a remote -- but only remote -- possibility. The whole thing works wonderfully using the simpler alternative of well-formedness.
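As an illustration of case (a), the sketch below parses a tiny stylesheet with Python's standard xml.etree.ElementTree, which checks well-formedness only and never consults a DTD, then tallies the elements by namespace. The stylesheet is a made-up example, and count_by_namespace is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XSL_NS = "http://www.w3.org/1999/XSL/Transform"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Made-up stylesheet mixing XSLT elements with result-tree (XHTML) elements.
STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">
  <xsl:template match="/">
    <html>
      <body>
        <table>
          <xsl:apply-templates/>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>"""

def count_by_namespace(xml_text):
    """Parse (well-formedness only) and tally elements per namespace URI."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError if not well-formed
    counts = Counter()
    for el in root.iter():
        # ElementTree spells namespaced names as "{uri}local".
        uri = el.tag[1:].split("}", 1)[0] if el.tag.startswith("{") else ""
        counts[uri] += 1
    return counts

counts = count_by_namespace(STYLESHEET)
print(counts[XSL_NS], "XSLT elements,", counts[XHTML_NS], "XHTML elements")
```

No DOCTYPE declaration is involved anywhere, yet the two vocabularies stay cleanly separated.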

(For some reason, case (a) seems to drive many otherwise sane users of XML absolutely batty: "If I can't validate a document, how do I know it's correct?" This has never bothered me because, in terms of XML 1.0, well-formedness is just as "correct" as validity. If a document works in an application that needs to use the document, who cares if it works in the framework of some other arbitrary application -- like a validating parser?)

The question at hand

For starters, the XML document with which this whole discussion opened is a little strange -- given what you now know about namespaces. Its root element, checkbook, declares four namespace prefixes and their associated URIs: f, s, m, and ars. Of these, only one is actually used anywhere in the document: f, on the f:deposit element. Furthermore, there is no namespace declaration for the "empty prefix" -- which is actually, by default, implicit in the names of all other elements in the document (amount, date, and so on).
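One way to spot declared-but-unused prefixes is sketched below with Python's standard xml.etree.ElementTree. The document is my reconstruction, not the questioner's original: the prefixes, URIs, and element names are the ones mentioned in this article, while the values are invented. The helper unused_prefixes is hypothetical, and it looks only at element names, not attribute names.

```python
import io
import xml.etree.ElementTree as ET

# Reconstructed sample; the text values are invented for illustration.
DOC = """<checkbook
    xmlns:f="http://schemas.ar-ent.net/soap/file/"
    xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:m="http://schemas.ar-ent.net/test/soap.tr/checkbook/"
    xmlns:ars="http://schemas.ar-ent.net/soap/">
  <f:deposit type="check">
    <payor>Example Payor</payor>
    <amount>100.00</amount>
    <date>2001-01-01</date>
  </f:deposit>
</checkbook>"""

def unused_prefixes(xml_text):
    """Return declared prefixes whose URI no element name ever uses."""
    declared = {}      # prefix -> namespace URI
    used_uris = set()  # URIs that appear in at least one element name
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            declared[prefix] = uri
        elif payload.tag.startswith("{"):
            # Namespaced element names look like "{uri}local".
            used_uris.add(payload.tag[1:].split("}", 1)[0])
    return sorted(p for p, uri in declared.items() if uri not in used_uris)

print(unused_prefixes(DOC))
```

For this sample, only f survives the check; s, m, and ars are declared and then never heard from again.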

Let's assume that validation must be achieved somehow, that simple well-formedness won't suffice. Let's also assume that the original document is a fragment of a more complete one, which actually does at some point need to use the s, m, and ars prefixes as well as f. Here's how the fragment of a DTD above, way back at the beginning, could be modified to accommodate both validation and namespaces.

<!ELEMENT checkbook (f:deposit|payment)*>
<!ATTLIST checkbook
    xmlns:f   CDATA #FIXED "http://schemas.ar-ent.net/soap/file/"
    xmlns:s   CDATA #FIXED "http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:m   CDATA #FIXED "http://schemas.ar-ent.net/test/soap.tr/checkbook/"
    xmlns:ars CDATA #FIXED "http://schemas.ar-ent.net/soap/"
    xmlns     CDATA #FIXED "http://mycheckbookURI">

<!ELEMENT f:deposit (payor, amount, date, description?)>
<!ATTLIST f:deposit
    type (cash|check|direct-deposit|transfer) #REQUIRED>
<!ELEMENT amount (#PCDATA)>
<!ELEMENT date (#PCDATA)>
<!ELEMENT payor (#PCDATA)>
<!ELEMENT description (#PCDATA)>
<!ATTLIST description
    category (cash|entertainment|food|income|work) 'food'>

Now your application will find an element named f:deposit in the DTD, whereas before the DTD declared only an element named deposit (no prefix). And now the rest of the document can use any of the four explicit prefixes on any element name, as long as those names, including prefixes, are declared in the DTD. If an element named s:envelope appears in the document, an element named s:envelope must be declared in the DTD. A declaration for a simple envelope element won't suffice.
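The literal, prefix-and-all matching can be sketched with Python's standard xml.parsers.expat run without namespace processing. This is not validation: it ignores content models and attributes, and only compares the element names declared in an internal DTD subset against the names actually used. The trimmed-down DTD and document are hypothetical, and undeclared_elements is my own helper name.

```python
import xml.parsers.expat

# Hypothetical sample: the DTD declares plain "envelope", the document
# uses "s:envelope".
DOC = """<!DOCTYPE checkbook [
<!ELEMENT checkbook (f:deposit|s:envelope)*>
<!ATTLIST checkbook
    xmlns:f CDATA #FIXED "http://schemas.ar-ent.net/soap/file/"
    xmlns:s CDATA #FIXED "http://schemas.xmlsoap.org/soap/envelope/">
<!ELEMENT f:deposit (amount)>
<!ELEMENT amount (#PCDATA)>
<!ELEMENT envelope EMPTY>
]>
<checkbook>
  <f:deposit><amount>100.00</amount></f:deposit>
  <s:envelope/>
</checkbook>"""

def undeclared_elements(xml_text):
    """Return element names used in the document but not declared in its DTD."""
    declared, used = set(), set()
    # No namespace separator: names are reported exactly as written.
    parser = xml.parsers.expat.ParserCreate()
    parser.ElementDeclHandler = lambda name, model: declared.add(name)
    parser.StartElementHandler = lambda name, attrs: used.add(name)
    parser.Parse(xml_text, True)
    return sorted(used - declared)

print(undeclared_elements(DOC))
```

The f:deposit declaration matches because the prefix is spelled out in the DTD; s:envelope is flagged even though a plain envelope is declared.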

Simple? Probably not. Possible? You bet.