Namespace Nuances
by John E. Simpson
Complications ensue...
You may have picked up on a couple of odd, unexplained features of the preceding document.
Once you start using namespaces in a particular document, you must
commit to going the whole way. In theory, only the names of the two
table element types needed to be disambiguated. In
practice, though, you use namespaces to disambiguate entire
vocabularies -- even the names of elements, like td and
chair above, which are already unambiguous. Thus, if you
decide to require that the furniture-type table have a
furn: prefix, you're committed to using that prefix on
the names of the furniture, chair, and
lamp elements as well.
As previously noted, a particular namespace prefix's associated URI
need not be the "web address" of anything in particular. There is
nothing at the URI http://myfurn/namespace (except a
"document not found" error). On the other hand, there's definitely
something at the
http://www.w3.org/1999/xhtml URI associated with the
"empty namespace prefix." (I'll leave for you the exercise of
inspecting that "something.")
But the above document introduces some more profound questions.
Deeper mysteries
The first mystery is that the document above no longer contains
two distinct elements named table. It now contains a
table element and a furn:table element. As far as XML 1.0
and a DTD are concerned, the prefix is part of the element name.
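To watch this happen, here is a minimal Python sketch using the standard library's expat bindings with namespace processing left switched off, which is roughly how a plain XML 1.0 parser sees the document. The sample document is a stand-in I made up to resemble the one discussed above, not a copy of it, and the function name is just for illustration.

```python
import xml.parsers.expat

# Hypothetical stand-in for the mixed XHTML/furniture document.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:furn="http://myfurn/namespace">
  <body>
    <table><tr><td>A cell</td></tr></table>
    <furn:furniture>
      <furn:table/>
      <furn:chair/>
    </furn:furniture>
  </body>
</html>"""

def raw_element_names(xml_text):
    """Return element names in document order, exactly as written."""
    names = []
    # No namespace_separator argument: expat does no namespace processing,
    # so each name is reported with its prefix and colon intact.
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)
    return names

names = raw_element_names(DOC)
print(names)
```

The list contains both table and furn:table as two different strings, which is all a DTD ever gets to work with.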
The second mystery is the real killer, and it's the reason why the
original questioner is having trouble with the checkbook
application. If you mix element types from two different vocabularies,
how can you possibly validate a document at all, given that a valid
document may contain no more than one DOCTYPE
declaration, referencing no more than one DTD?
The answer is weird but also (once you think about it) obvious. Either (a) you can't validate it at all, or (b) you can validate it only if you include, in the one referenced DTD, all element names -- including their prefixes and all namespace-declaring attributes.
Case (a) isn't as outlandish an option as you might imagine. It's
one of the most common solutions, thanks in part to XSLT's
popularity. An XSLT style sheet must contain elements from the XSLT
vocabulary, such as xsl:stylesheet and
xsl:template, and these are intermingled in the
style sheet with elements from the result tree vocabulary. Validating
an XSLT style sheet is a remote -- but only remote -- possibility.
The whole thing works wonderfully using the simpler alternative of
well-formedness.
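As a rough illustration of settling for well-formedness, the following Python sketch parses a tiny style sheet without consulting any DTD. The style sheet is a hypothetical example of my own, and is_well_formed is an illustrative helper, not part of any library.

```python
import xml.etree.ElementTree as ET

# Hypothetical style sheet mixing XSLT elements with result-tree (HTML) elements.
STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html><body><xsl:apply-templates/></body></html>
  </xsl:template>
</xsl:stylesheet>"""

def is_well_formed(xml_text):
    """True if the text parses as XML; no DTD is consulted."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# The intact style sheet parses; dropping a closing tag breaks it.
print(is_well_formed(STYLESHEET))
print(is_well_formed(STYLESHEET.replace("</body>", "")))
```

Nothing here knows or cares which elements belong to which vocabulary; the only question asked is whether the tags nest and close properly.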
(For some reason, case (a) seems to drive many otherwise sane users of XML absolutely batty: "If I can't validate a document, how do I know it's correct?" This has never bothered me because, in terms of XML 1.0, well-formedness is just as "correct" as validity. If a document works in the application that needs to use it, who cares whether it works in the framework of some other arbitrary application -- like a validating parser?)
The question at hand
For starters, the XML document with which this whole discussion
opened is a little strange -- given what you now know about
namespaces. Its root element, checkbook, declares four
namespace prefixes and their associated URIs: f,
s, m, and ars. Of these, only
one is actually used anywhere in the document: f, on the
f:deposit element. Furthermore, there is no
namespace declaration for the "empty prefix" -- which is actually, by
default, implicit in the names of all other elements in the document
(amount, date, and so on).
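A quick way to spot declared-but-unused prefixes is to compare the xmlns: attributes against the prefixes that actually show up on element names. The Python sketch below does that with a non-namespace-aware expat parse. The document in it is my own reconstruction: only the prefixes, URIs, and element names mentioned in the text are taken from the original, and the data values are invented. It looks only at element names, not at prefixed attributes.

```python
import xml.parsers.expat

# Hypothetical reconstruction of the checkbook document (values are made up).
CHECKBOOK = """<checkbook
    xmlns:f="http://schemas.ar-ent.net/soap/file/"
    xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:m="http://schemas.ar-ent.net/test/soap.tr/checkbook/"
    xmlns:ars="http://schemas.ar-ent.net/soap/">
  <f:deposit type="check">
    <payor>Bob</payor>
    <amount>100.00</amount>
    <date>2001-01-01</date>
  </f:deposit>
</checkbook>"""

def prefix_report(xml_text):
    """Return (declared prefixes, prefixes used on element names)."""
    declared, used = set(), set()

    def start(name, attrs):
        # Namespace declarations look like ordinary attributes to XML 1.0.
        for attr in attrs:
            if attr.startswith("xmlns:"):
                declared.add(attr[len("xmlns:"):])
        # A colon in the raw element name means a prefix is in use.
        if ":" in name:
            used.add(name.split(":", 1)[0])

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.Parse(xml_text, True)
    return declared, used

declared, used = prefix_report(CHECKBOOK)
print(sorted(declared - used))
```

For this reconstruction, the prefixes left over are ars, m, and s; only f earns its keep.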
Let's assume that validation must be achieved somehow, that simple
well-formedness won't suffice. Let's also assume that the original
document is a fragment of a more complete one, which actually does at
some point need to use the s, m, and
ars prefixes as well as f. Here's how the DTD
fragment shown way back at the beginning could be modified
to accommodate both validation and namespaces:
<!ELEMENT checkbook (f:deposit|payment)*>
<!ATTLIST checkbook
  xmlns:f   CDATA #FIXED "http://schemas.ar-ent.net/soap/file/"
  xmlns:s   CDATA #FIXED "http://schemas.xmlsoap.org/soap/envelope/"
  xmlns:m   CDATA #FIXED "http://schemas.ar-ent.net/test/soap.tr/checkbook/"
  xmlns:ars CDATA #FIXED "http://schemas.ar-ent.net/soap/"
  xmlns     CDATA #FIXED "http://mycheckbookURI">
<!ELEMENT f:deposit (payor, amount, date, description?)>
<!ATTLIST f:deposit
  type (cash|check|direct-deposit|transfer) #REQUIRED>
<!ELEMENT amount (#PCDATA)>
<!ELEMENT date (#PCDATA)>
<!ELEMENT payor (#PCDATA)>
<!ELEMENT description (#PCDATA)>
<!ATTLIST description
  category (cash|entertainment|food|income|work) 'food'>
Now your application will find an element named
f:deposit in the DTD, whereas before the DTD declared
only an element named deposit (no prefix). And now the
rest of the document can use any of the four explicit prefixes on any
element name, as long as those names, including
prefixes, are declared in the DTD. If an element named
s:envelope appears in the document, an element named
s:envelope must be declared in the DTD. A declaration for
a simple envelope element won't suffice.
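The literal name matching can be sketched in a few lines of Python. To be clear about what this is: it is not a validating parser, and it ignores content models and attributes entirely. It only pulls the declared element names out of a DTD with a regular expression and checks each element name in a document against that list. The DTD and document below are cut-down, hypothetical examples in the style of the fragment above.

```python
import re
import xml.parsers.expat

# Hypothetical, cut-down declarations; note the unprefixed "envelope".
DTD = """
<!ELEMENT checkbook (f:deposit|envelope)*>
<!ELEMENT f:deposit (amount)>
<!ELEMENT amount (#PCDATA)>
<!ELEMENT envelope (#PCDATA)>
"""

# Hypothetical document that uses s:envelope.
DOC = """<checkbook
    xmlns:f="http://schemas.ar-ent.net/soap/file/"
    xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
  <f:deposit><amount>10</amount></f:deposit>
  <s:envelope/>
</checkbook>"""

def declared_names(dtd_text):
    """Collect the names that follow each <!ELEMENT keyword."""
    return set(re.findall(r"<!ELEMENT\s+(\S+)", dtd_text))

def undeclared_elements(xml_text, dtd_text):
    """List element names (prefix included) with no matching declaration."""
    known = declared_names(dtd_text)
    missing = []

    def start(name, attrs):
        if name not in known and name not in missing:
            missing.append(name)

    # No namespace processing: names are compared exactly as written.
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.Parse(xml_text, True)
    return missing

print(undeclared_elements(DOC, DTD))
```

The only name reported is s:envelope, even though a plain envelope is declared, because the comparison is made on the whole name, prefix and all.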
Simple? Probably not. Possible? You bet.