Cracks in the Foundation
by Micah Dubinko
Con-Fusion
Other specific choices made in the development of XML namespaces cause persistent confusion among markup practitioners. For one, namespace prefixes are largely incompatible with DTDs, which, although unfashionable, are still a built-in part of XML and intimately connected to any use of XML involving DOCTYPEs or named character entities. In other words, they're still important to nearly any web developer.
Does a namespace declaration apply to elements or attributes? A reasonable answer would be "yes and no"; the subtleties still pop up in mailing lists. Ron Bourret covered this and more in the seminal Namespace Myths Exploded article previously published on this site. The way things ended up, attributes can't be placed in a specific namespace without an explicit prefix regardless of whether a conflict is even possible -- a decision that would cause problems later.
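The element-versus-attribute asymmetry is easy to see with a parser. The sketch below uses Python's standard xml.etree.ElementTree, which reports names in {uri}local form. The sample document and its href and xl:type attributes are invented for illustration.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
XLINK = "http://www.w3.org/1999/xlink"

# Hypothetical fragment: a default namespace plus one prefixed attribute.
doc = (
    f'<a xmlns="{XHTML}" xmlns:xl="{XLINK}" '
    'href="page2.html" xl:type="simple">next</a>'
)
root = ET.fromstring(doc)

# The default namespace declaration applies to the unprefixed element name.
element_name = root.tag

# It does not apply to the unprefixed attribute: "href" stays in no
# namespace, while the prefixed attribute picks up the XLink URI.
attribute_names = sorted(root.attrib)

print(element_name)
print(attribute_names)
```

The same xmlns declaration is in scope for both the element and its href attribute, yet only the element name is qualified by it.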
Another trouble spot lies in the use of "namespace names," or strings that look like URLs, which add a truckload of URL baggage to the spec and lead many enthusiastic new learners to wonder what will happen if they visit that URL in a browser. This is an old debate, one that won't be rehashed here. But direct confusion from the XML Namespaces spec is only the beginning.
Collateral Damage
QNames in content: what does this phrase bring to mind? The Namespaces in XML spec defined QNames but remained silent on the topic of using them in content; in practice, the technique arrived with XPath 1.0, which needed a way to refer to element and attribute names. At the time, making XPath identifiers look the same as the element names themselves was justified as the most sensible way to deal with two-part names in which one part is bound to a longer third name. Over time, though, this practice has fallen out of favor and been compared to "using TCP packets as delimiters in an application protocol." This bit of unpleasantness has become firmly entrenched in XML vocabularies, at a minimum including any that use XPath. The lineage of this practice can be traced straight back to namespaces.
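To make the problem concrete: a prefixed name inside an XPath string has no meaning of its own, so the processor must be handed prefix bindings separately from the string. This is a minimal sketch using the limited XPath support in Python's xml.etree.ElementTree. The sample document and the prefix x are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# Hypothetical document that binds the prefix "h" to the XHTML namespace.
doc = f'<h:html xmlns:h="{XHTML}"><h:body><h:p>Hello</h:p></h:body></h:html>'
root = ET.fromstring(doc)

# The path is just a string, so its prefix is resolved against a mapping
# supplied by the caller, not against the document's own declarations.
# Here the caller uses "x" even though the document uses "h".
found = root.find(".//x:p", {"x": XHTML})

# An unqualified name does not match the namespaced element at all.
missing = root.find(".//p")

print(found.text if found is not None else None)
print(missing)
```

Any vocabulary that stores such a path in an attribute value or in text has to carry the prefix bindings along with it, which is the dependency the criticism is aimed at.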
Aside from using QNames in content, the XML Schema Part 1 specification has been criticized for its complexity. I started counting how many times the word "namespace" or its plural appears in the document and gave up somewhere around 400. How much of this complexity is caused by namespace-think?
Then there's XLink, once a promising branch of XML technology. XLink failed to meet a key requirement:
It must be possible to apply XML link semantics to existing documents by modifying the documents' DTDs only, requiring no modification to the document instances themselves.
For example by supplying appropriate information in an element's definition (in the DTD), such as a default ROLE attribute. This provides for layering of XML link semantics onto large bodies of XML documents without requiring modification of those documents.
The syntactic restrictions introduced by namespaces caused this conflict. It wasn't possible to meet this requirement and define XLink as a distinct namespaced vocabulary.
Even when vocabularies use namespaces, there's no guarantee of coordination. If anything, scoping encourages folks to go off and do their own thing. Already within the W3C we have paragraphs as html:p as well as speech:p, where html and speech map to the "namespace names" for XHTML and Speech Synthesis Markup Language, respectively. (Don't even get me started on wml:p.) Markup vocabularies also have multiple anchors as a elements, including XHTML, SMIL, and others outside of W3C. So the problem attributed to chameleon namespaces at the outset of this article has already come to pass. In aggregate, the amount of time that has been wasted debating and rehashing such issues in standardization and development communities is staggering.
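As an illustration of the html:p and speech:p overlap, the sketch below parses a made-up compound document and separates each element name into its namespace URI and local part. The wrapper element doc and the helper split_name are hypothetical; the two namespace URIs are those of XHTML and SSML.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
SSML = "http://www.w3.org/2001/10/synthesis"

# Hypothetical compound document mixing two W3C vocabularies.
doc = (
    f'<doc xmlns:html="{XHTML}" xmlns:speech="{SSML}">'
    "<html:p>A paragraph to display.</html:p>"
    "<speech:p>A paragraph to speak.</speech:p>"
    "</doc>"
)


def split_name(tag):
    """Split ElementTree's '{uri}local' form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag


names = [split_name(child.tag) for child in ET.fromstring(doc)]
local_names = {local for _, local in names}
namespace_uris = {uri for uri, _ in names}

print(local_names)
print(len(namespace_uris))
```

Both children share the single local name p and differ only in namespace URI, so the two vocabularies coexist without ever having been coordinated.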
Now What?
There are a few other lines of evidence that will have to go in another article, like the mobile industry reaction to namespaces or examining the continual stream of proposed alternatives. Individually, any of the objections recounted here wouldn't amount to much. Collectively, though, they form a composite sign that suggests that the XML community might need to reconsider not only its approach and attitudes toward HTML, but toward the foundation of namespaces as well.
What would the XML world look like without namespaces, or with a less intrusive version thereof? DocBook and HTML would continue to be fine. Newer languages like SVG would be slightly different in details, but overall the same. The big question is what compound documents would look like, but it's hard to imagine the situation being much worse than what we have today.
So far, the W3C hasn't posted an official notice to the effect of what Tim Berners-Lee wrote on his blog. Nevertheless, it's encouraging to see a willingness to change course when needed. Let's hope this willingness extends deep enough to reconsider namespaces. The tension between incremental HTML 4 philosophies and XML namespace practice will only get stronger.
The formal objection noted at the start of this article concludes with these words:
If we're going to go changing the namespace for every host language that comes along, we might as well not have namespaces in the first place.
Actually, that doesn't sound like such a bad alternative.
Disclosure: the author of this article is a former editor for the XForms and HTML Working Groups, and is a contributor to XML Hacks. He submitted this article in namespace-free HTML.
- Still a skeptic
2007-07-08 14:59:19 arjunray [Reply]
Yeah, me. It's going on a decade now - actually, just over, if it all started with "SD5" - and I'm still far from convinced. Or impressed. I remember how the Namespaces spec evolved by subtraction, use case after use case biting the dust through the series of drafts. The process didn't stop with the Rec, either. I might point to the numerous times my demonstrations, refutations, counterexamples and alternatives went basically unanswered on xml-dev, but I won't. Suffice it to say, showing that XML Namespaces - and colonification - are neither necessary nor sufficient to solve any problem worth solving, is simply irrelevant.
It's here, and we just have to deal with it. Not warts and all, but warts and nothing else.
- XMPP
2007-03-21 09:34:04 phil wilson [Reply]
I'm amazed no-one's mentioned XMPP/Jabber.
Last time I read the XMPP spec, it hard-coded the values of the namespace prefix. That is to say that every XMPP server and client *must* use the same value for the prefix. The URI is irrelevant.
It's not real XML namespace use, but it does seem to work for tens of millions of users on a daily basis.
- Compelling examples
2007-01-29 20:50:04 rpbourret [Reply]
XSLT and SOAP are not possible without namespaces. Sure, you could leave namespaces out, but there would be a theoretical risk of collision between the non-XSLT (non-SOAP) elements and XSLT (SOAP) elements. Specifications should not leave such holes.
Personally, I'm not at all convinced that you need namespaces for local documents -- that is, ones that never leave your application sitting on your desktop. But the minute they go somewhere else -- another desktop, another department, another company -- you should start worrying, if only because the cost of fixing the problem later is much higher than adding namespaces now.
- Middle position
2007-01-11 00:26:54 rjelliffe [Reply]
If the gods removed default namespaces, so that a prefix = a namespace, it would remove many of the practical complications of namespaces.
So XHTML could not be in a namespace, big deal. You have the doctype declaration, the mime type and the file extension already; there is no shortage of hints.
- namespaces: we shouldn't be forced to pick them
2006-12-01 11:35:26 rp_ [Reply]
The one problem I see with namespaces is that it must be decided within an XML document which namespaces it uses. But when we start developing our XML-based document formats and software that depends on them, we usually don't know what would be a good name to pick, and we don't care which name to pick, either. The adopted solution was to suggest a naming convention (the use of URIs) that would make it easy for most people to pick something unique. This led to the problem that URIs already have implied semantics attached to them. But perhaps the whole approach of forcing XML document designers to pick a specific namespace right from the start is wrong. We have to pick a namespace and hardcode it into all of our XML documents, then make all of our XML processing software aware of the namespaces we use before it will work correctly. For example, my XSD and your RNG for our common documents must agree on the namespaces used in them. But at the time you write your RNG you may not have decided whether or not your documents and mine will be compatible, or to what extent. So the real problem is not that picking URLs for namespaces is confusing, it's that we have to pick namespaces prematurely.
What might help is a mechanism by which we can declare or redeclare our namespaces outside of the XML documents we want to apply them to. Some kind of "namespace override" mechanism, like the namespace aliasing in XSDs, but more general. I don't know if anything like that exists, but it would certainly help me, as an application developer, to provide better operability within the XML processing tools I deal with.
- namespaces: we shouldn't be forced to pick them
2008-02-16 07:39:31 madscmhansen3@yahoo.com [Reply]
One mechanism for declaring and/or redeclaring namespaces outside of the XML documents we want to apply them to is through the use of general entities.
Ken Holman explains the technique in his Global large-scale stylesheet deployment case study: http://www.idealliance.org/proceedings/xml05/ship/18/Holman-18.HTML#d0e333
The technique was implemented for an extensive stylesheet library and provided the ability to modify the namespace for the entire library by changing one line in the entity file.
- Not compelling?
2006-11-13 07:50:56 Norman Walsh [Reply]
DocBook, HTML, Atom, RSS, and Dublin Core all have elements called 'title' that mean different things. DocBook and Dublin Core both have elements called date. TEI and HTML both have elements called 'p'.
These examples aren't compelling how?
- Not compelling?
2006-12-25 22:58:41 Micah Dubinko [Reply]
Sorry for the late reply, Norm.
I'm saying I haven't seen a compelling markup fragment that uses any of those elements in a way that's not obvious from context. In practice, there's no ambiguity about what the element at html/head/title is all about. Likewise for other same-named elements.
I'm treating introductory XML texts as a touchstone here, and they generally all fail to provide compelling examples. Still happy to see examples otherwise, though. :)
-m
- Not compelling?
2006-11-14 15:58:02 mikeday [Reply]
"DocBook, HTML, Atom, RSS, and Dublin Core all have elements called 'title' that mean different things."
Very true, but does it matter? This multitude of title elements each occur in different contexts. If you have a title element inside a head element inside an html element, does it really need to be explicitly qualified with the XHTML namespace for you to recognise it as an HTML title?
Why not use context for disambiguation instead of requiring globally unique names for everything?
- Not compelling?
2007-01-29 20:43:03 rpbourret [Reply]
Because keeping a context stack is much, much harder than simply comparing the URI and local name of the element or attribute at hand with the one already in your code.
And there's no guarantee it will be correct. Even if you expect a particular "p" element at a certain place in an XML document, there's no guarantee that an unnamespaced p element will be the one you want. Good luck if it uses a different content model.
- Examples of documents with multiple namespaces
2006-11-10 02:19:32 spepping [Reply]
Examples of documents with multiple namespaces are hard to find? All Elsevier's XML files use multiple namespaces (see the DTDs at http://info.sciencedirect.com/implementing/implementing_sdos/dtds/):
One namespace for the family of documents to which this document belongs, one namespace for elements shared by all our documents, one namespace for the markup of bibliographic references, one namespace for an extension to CALS table markup, namespaces for elements from public standards like XLink and MathML. The documents also use remapping of namespace prefix binding: the default namespace is remapped to the CALS namespace; this was necessary because the CALS DTD does not allow a prefix.
Useful? Certainly so. Technically complicated? Yes, especially the fact that no prefix is bound to a namespace is a trap for many. Does it work well with the DTDs? To a certain extent; I would prefer to use RELAX-NG schemas.
If I would now rewrite these schemas, I would certainly try to reuse XHTML elements, and I would certainly identify them as such by using the XHTML namespace.
In summary, I love namespaces.
- Examples of documents with multiple namespaces
2006-11-10 02:26:00 spepping [Reply]
'that no prefix is bound to a namespace' should be: 'that the absence of a prefix is bound to a namespace'.
- XLink
2006-11-09 11:38:46 Erik Wilde [Reply]
i also liked the example of XLink. personally, i think that one of the reasons why XLink failed is what could be called "markup-centric thinking". markup is good, but i think it should *not* be all that you are doing. do a model, and then map it to markup. markup is good, but models are reusable, so if someone else decides that the model is useful, but that the markup is not good, at least on the model level people can still align what they are doing. xml technologies don't give us any support for this kind of model reuse, but this does not mean that it would not be a better way of doing things.
i proposed this for XLink (http://dret.net/netdret/publications#wil02i), but when discussing it in forums, it very quickly became clear that most people think that all that matters is markup: http://lists.xml.org/archives/xml-dev/200208/msg01196.html
i think that even though markup is good and great and everything, there is more to markup than just the names of things being used there, and failure to separate the two things carefully (the model and its embodiment in a certain vocabulary) is not a good idea. this will probably infuriate the markup purists, but i think there are just too many examples of problems caused by this.
by being more careful in separating models and markup, things like HLink (anyone remembering this?) and similar monstrosities could have been avoided, and it would have been possible to re-use a very useful set of semantics.
only because namespaces are complicated and because xml technologies do not support us in separating models and markup, does not mean that we should not be good engineers and make this separation.
- Namespace Examples
2006-11-09 11:10:06 Erik Wilde [Reply]
i like the section about namespace justifications in books, claiming that potential name conflicts are the problem. they are *not* the most important problem. i used to explain and teach namespaces using this approach in the beginning, but moved away from it because i recognized that this creates a very wrong image of namespaces in the minds of people given this explanation. i think namespaces should not be explained that way.
namespaces are about identification, and avoiding name clashes is just a (useful) side effect. the identification part is the part which is essential and which makes namespaces indispensable, however ugly they may be. and i think that a lot of namespace backlash is caused by people explaining them badly. this does not mean that they do not have their very ugly parts, they certainly do, so let me conclude with what every wanna-be cool web person has to say:
i hate namespaces!
(actually i don't, but they are surprisingly hard to understand and explain and teach properly, given how simple they really are.)
cheers,
dret.
- DocBook Would Be Fine...
2006-11-09 10:35:18 KeithFahlgren [Reply]
> What would the XML world look like without
> namespaces, or with a less intrusive version
> thereof? Docbook and HTML would continue to be
> fine.
DocBook 4.* would be fine. DocBook 5 would _not_ be fine. One of the nicest additions for DB5 is <info> blocks in a lot of places. As we move into a world with more metadata, we need more places for this metadata to live. One of the recent discussions on DB5 included a discussion on whether or not to allow other namespaces in <info> blocks, for RDF, OWL, etc. This presents a wonderful opportunity and would be unavailable without namespaces.
- It's all up to the parser implementation
2006-11-08 18:56:40 chuckwh [Reply]
Hammersley's article is not a good reference for this discussion. Winer tried to get adoption on RSS on a broad scale, and pretty much singlehandedly wrote the "spec" as I understand history. I'm not sure Dave quite grokked RDF when he developed RSS, but I'm not sure he didn't, either. I remember when RSS was in its infancy and Winer was trying to find a home for it. He ended up publishing the spec on the userland site. There is still, to this day, no official home for RSS, no adoption within W3C.
I wrote Mastering XML Premium Edition (published in 2000) with Liam Quinn, who is now XML lead for the W3C, and we didn't devote an entire chapter to namespaces but we did point out, sort of, what you have pointed out here, when we mentioned, on page 62, in the year 2000 (actually, it was written a tad earlier), that IE created tons of namespace issues.
One thing software vendors should keep in mind is that if they *DO* change the URI, and they're creating re-usable components (see Adobe acquisition of Macromedia and resultant Flex namespace issues), they need to change the namespace prefix, too, simply to avoid confusion.
But picking on Dave Winer to me seems bizarre. He's been a very serious contributor to many things XML, and the fact that RSS is not standardized through a more formal process, which normally includes a massive vetting process, is a more serious topic than the implementation of a small, insignificant program named Radio that few people remember or care about.
