XML Namespaces Don't Need URIs

April 13, 2005

Michael Day

The decision to identify XML namespaces with URIs was an architectural mistake that has caused much suffering for XML users and needless complexity for XML tools. Removing namespace URIs altogether and simply using namespace prefixes to identify namespaces would make it easier for people as well as software to read, write, and process XML.

Background

In XML 1.0, element and attribute names were treated as atomic tokens with no interior structure.

Namespaces in XML introduced the concept of element and attribute names existing in namespaces. Namespaces are identified by URIs and bound to namespace prefixes. It is also possible to bind a default namespace to the empty prefix. This namespace will then apply to all elements that have no prefix.

For example, the XSLT elements exist in the http://www.w3.org/1999/XSL/Transform namespace, which is traditionally bound to the xsl namespace prefix:


<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:variable name="foo" select="1"/>
    ...
</xsl:transform>

A namespace-aware XML processor will internally resolve these element names into tuples containing the namespace URI, the namespace prefix, and the local name:


{ http://www.w3.org/1999/XSL/Transform, xsl, transform }
{ http://www.w3.org/1999/XSL/Transform, xsl, variable }
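To see this resolution in practice, here is a minimal sketch using Python's standard xml.dom.minidom module, which exposes all three parts of each element name. The function name resolved_names and the sample document are illustrative assumptions, not part of any specification.

```python
from xml.dom import minidom

XSLT_SOURCE = """<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:variable name="foo" select="1"/>
</xsl:transform>"""


def resolved_names(xml_text):
    """Return (namespace URI, prefix, local name) for every element, in document order."""
    document = minidom.parseString(xml_text)
    names = []

    def walk(node):
        # Only element nodes carry a resolved name; text nodes are skipped.
        if node.nodeType == node.ELEMENT_NODE:
            names.append((node.namespaceURI, node.prefix, node.localName))
        for child in node.childNodes:
            walk(child)

    walk(document.documentElement)
    return names


for name in resolved_names(XSLT_SOURCE):
    print(name)
```

Other parsers keep less: Python's xml.etree.ElementTree, for instance, retains only the URI and the local name and discards the prefix.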

The particular namespace prefix used is supposed to be irrelevant, but in practice people agree on common namespace prefixes for clarity, as it would be very confusing if everyone used different ones. Here are some familiar namespace prefixes from the W3C:

Prefix  URI
xml     http://www.w3.org/XML/1998/namespace
xsl     http://www.w3.org/1999/XSL/Transform
fo      http://www.w3.org/1999/XSL/Format
xsd     http://www.w3.org/2001/XMLSchema
html    http://www.w3.org/1999/xhtml
svg     http://www.w3.org/2000/svg

Looking at this list one might wonder why it should be necessary to specify the namespace URI at all, considering that these namespaces already have a standard prefix that is far more concise and easy to remember. Using URIs to identify namespaces is a problematic approach with many usability flaws, all of which would be solved if namespaces were identified by the namespace prefix instead.

What Is Wrong with Namespace URIs?

Namespace URIs Have Terrible Syntax

As seen in the table above, namespace URIs tend to be long and cryptic, with lots of punctuation and case-sensitive text. In this instance the W3C has compounded the problem by adding dates to ensure that the namespace URIs are unique, as if it were likely that the W3C would create another "XSL/Transform" or "xhtml" namespace in the future.

While namespace URIs may be guaranteed to be unique, they are also guaranteed to be impossible to remember. Quick, without checking, can you remember if the namespace URI for W3C XML Schema ends with "xmlschema", "XML/Schema", or "XMLSchema"? Was the namespace URI for SVG allocated in 1999, 2000, or 2001?

The opaque nature of these namespace URIs is inconvenient for users, who must begin each new XML document with a ritual of carefully copying and pasting all of the namespace declarations from the last document that they were working on. If the namespace URIs are typed slightly wrong, the XML document will lose its intended meaning and software will fail to process it.

HTTP URIs Are a Poor Choice for Namespaces

HTTP URIs are often used as namespace URIs. However, most software treats HTTP URIs as resource locators, not identifiers. For example, the requirement to type namespace URIs exactly as they appear is at odds with the standard practice for HTTP URIs, which usually have many equivalent forms:


http://w3.org/1999/XSL/Transform
http://www.w3c.org/1999/XSL/Transform
http://www.w3.org/1999/XSL//Transform
http://www.w3.org:80/1999/XSL/Transform
http://www.w3.org/1999/XSL/Transform

All of these HTTP URIs will return the same web page if entered into a browser, but only the last one is the correct namespace URI for XSLT. This clashes with user expectations, to put it mildly. The one potential advantage of using HTTP URIs would be that they could act as links to useful resources, but in practice most people don't bother doing this. This lack of interest is most strikingly observed with the XSLT and XSL-FO namespaces, which point to brief documents saying "Someday a schema for XSL Transforms will live here" and "This is another XSL namespace" respectively.
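The mismatch is easy to reproduce, because namespace names are compared as plain strings, character for character, with no URI normalization. The following sketch uses Python's standard xml.etree.ElementTree; the helper names is_xslt_transform and make_document are hypothetical, chosen only for this illustration.

```python
import xml.etree.ElementTree as ET

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

# The five spellings listed above; only the last is the real XSLT namespace.
CANDIDATES = [
    "http://w3.org/1999/XSL/Transform",
    "http://www.w3c.org/1999/XSL/Transform",
    "http://www.w3.org/1999/XSL//Transform",
    "http://www.w3.org:80/1999/XSL/Transform",
    "http://www.w3.org/1999/XSL/Transform",
]


def make_document(namespace_uri):
    """Build a one-element stylesheet that binds the xsl prefix to the given URI."""
    return '<xsl:transform xmlns:xsl="%s" version="1.0"/>' % namespace_uri


def is_xslt_transform(xml_text):
    """True only if the root element is 'transform' in exactly the XSLT namespace."""
    root = ET.fromstring(xml_text)
    # ElementTree spells a resolved name as "{namespace-uri}local-name".
    return root.tag == "{" + XSLT_NS + "}transform"


results = {uri: is_xslt_transform(make_document(uri)) for uri in CANDIDATES}
for uri, recognized in results.items():
    print(recognized, uri)
```

Every document here parses without complaint; the four misspelled ones simply stop being XSLT as far as the software is concerned.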

There was an effort to develop RDDL (Resource Directory Description Language) expressly for creating documents to sit at the end of HTTP namespace URIs and direct XML tools to associated resources such as style sheets, schemas, and documentation. It is not used by any tools on the Web and with good reason: there are better ways to associate resources with individual XML documents.

Aside: Why were URIs chosen over better alternatives?

It is not difficult to construct a better syntax than HTTP URIs for unique identifiers. A good existing example is the syntax used to identify Java packages:


org.w3.xsl.transform

Look at the difference. The identifier is all lowercase to make it easier to remember, the redundant http://www. that wastes the first 11 characters of so many namespace URIs is gone, as are all the slashes.

Given that Java predated the XML Namespaces specification, one can only assume that URIs were chosen to identify namespaces for reasons other than syntactical convenience, such as their intended use in the RDF/XML syntax.

Namespace URIs Don't Help People Read or Write XML

Namespace URIs give people the ability to write XML documents with arbitrary prefixes like <foo:schema> or <superluminal:transform>, but people don't do that, sensibly enough, as it would be confusing and serve no purpose. Since people are already identifying namespaces using sensible namespace prefixes, having to write the namespace URIs as well is just a hindrance.

Namespace URIs don't help people to read XML documents either. They add an unnecessary level of indirection that makes XML documents harder to interpret, as looking at an element name is no longer enough to tell you exactly what that element is. When you read an XML document beginning with <html>, or <svg>, or <xsl:transform>, or <xsd:schema>, should it really be necessary to carefully check that the namespace prefix is bound to the correct namespace URI?

Since namespace URIs don't help people to read or write XML documents, why should XML tools complain if they are omitted? Namespace URIs do not fit in with the goals of XML, which has been designed to be produced and/or consumed by people as well as software.

Could Namespaces Work Without URIs?

If namespace URIs were removed and namespaces were identified solely by namespace prefixes instead, namespaces would still make sense and existing XML specifications would only require minor alterations.

XSLT Without Namespace URIs

XSLT is one of the few XML languages that actually relies on namespaces for disambiguation, specifically to distinguish XSLT elements that are processed specially from other elements, which are output verbatim. XSLT also has the requirement to perform namespace rewriting in order to be able to output elements that are in the XSLT namespace without actively processing them, similar to quoting or escaping in other programming languages.

However, XSLT has no need for namespace URIs. An XSLT processor could instead treat any element with an xsl prefix as being in the XSLT namespace and process it accordingly. Elements with a different prefix or no prefix would be output verbatim in the usual manner, and namespace prefix rewriting would also take place as normal using the existing XSLT aliasing mechanism:


<xsl:namespace-alias stylesheet-prefix="axsl" result-prefix="xsl"/>

Removing all of the namespace URIs from an XSLT transform will make it easier to read and write but will not affect the way it is processed, so why require namespace URIs for XSLT?
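As a rough sketch of the prefix-only dispatch described in this section, the Python code below classifies elements purely by their prefix. It uses the standard xml.sax module, which does not perform namespace processing by default and therefore accepts a stylesheet with no namespace declarations at all. The class name PrefixClassifier, the function classify, and the sample stylesheet are assumptions made for illustration; this is not how any existing XSLT processor works.

```python
import xml.sax

# A stylesheet with every namespace declaration removed (hypothetical sample).
STYLESHEET = """<xsl:transform version="1.0">
  <xsl:template match="/">
    <html><body><xsl:value-of select="title"/></body></html>
  </xsl:template>
</xsl:transform>"""


class PrefixClassifier(xml.sax.ContentHandler):
    """Sort elements into XSLT instructions and literal result elements by prefix."""

    def __init__(self):
        super().__init__()
        self.instructions = []
        self.literals = []

    def startElement(self, name, attrs):
        # Without namespace processing, 'name' is the raw name, e.g. "xsl:template".
        prefix, _, local = name.rpartition(":")
        if prefix == "xsl":
            self.instructions.append(local)
        else:
            self.literals.append(name)


def classify(xml_text):
    """Return (instruction local names, literal element names) in document order."""
    handler = PrefixClassifier()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.instructions, handler.literals


instructions, literals = classify(STYLESHEET)
print("instructions:", instructions)
print("literals:", literals)
```

A namespace-aware parser would reject the same input because the xsl prefix is never declared, which is exactly the requirement this section argues against.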

XHTML Without Namespace URIs

XHTML documents rarely use namespace prefixes, as many web browsers are not XML-aware and do not expect to see them. In any case, a root element of <html> should be sufficient to identify an XHTML document; there is no pressing need to add the namespace URI as well. Current W3C practice encourages XHTML documents to accumulate the namespace URIs for XHTML, SVG, MathML, XForms, XML Schema, XML Events, and who knows what else. All of these have simple prefixes that are sufficient to identify the namespace in question, so there is no reason to place this burden on users. XHTML does not need namespace URIs.

RDF/XML Without Namespace URIs

RDF/XML, the XML syntax for RDF that seems to have been the driving force for the adoption of namespace URIs, does not need namespace URIs. Or to be more accurate, it would be trivial to define a method of binding URIs to namespace prefixes specifically for RDF/XML, without forcing it to be a standard that applied to all XML documents. Given that RDF/XML is not an ideal syntax for representing RDF anyway (there exist numerous superior alternatives), it is unfortunate that it has imposed such a clumsy namespace mechanism on the wider XML community.

The Default Namespace

There are some occasions, such as modular XHTML, where people may wish to write elements without namespace prefixes that are nonetheless in a namespace. This could be done with an attribute like xmlns; let's call it xml:ns, just for fun:


<blockquote xml:ns="html">
    ...
</blockquote>

An explicit namespace prefix is probably a better choice though, as it makes each element stand alone, with a fixed meaning that cannot be changed at the whim of its ancestors.
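To make the inheritance concrete, here is a small Python sketch of how a tool might resolve the hypothetical xml:ns attribute, using the standard xml.etree.ElementTree module. The xml:ns attribute is this article's invention, and the resolve function and sample fragment are likewise assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# ElementTree reports xml:ns under the predefined XML namespace.
XML_NS_ATTR = "{http://www.w3.org/XML/1998/namespace}ns"

SAMPLE = (
    '<blockquote xml:ns="html">'
    "<p>quoted</p>"
    '<svg xml:ns="svg"><circle/></svg>'
    "</blockquote>"
)


def resolve(element, inherited=None):
    """Yield (namespace name, element name), inheriting xml:ns from the nearest ancestor."""
    current = element.get(XML_NS_ATTR, inherited)
    yield (current, element.tag)
    for child in element:
        yield from resolve(child, current)


RESOLVED = list(resolve(ET.fromstring(SAMPLE)))
for namespace, name in RESOLVED:
    print(namespace, name)
```

The meaning of p and circle depends entirely on the element above them, which is the fragility that an explicit prefix avoids.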

QNames in Text Content

One of the uglier architectural warts that namespaces have introduced to XML is the use of qualified names in text content and attribute values:


<foo:message status="foo:severe" ...

The problem, of course, is that according to the current specification of XML Namespaces, namespace prefixes are supposed to be irrelevant and may be changed without altering the meaning of the document. Unless it uses namespace prefixes in text content, in which case the namespace prefixes become very significant indeed. Why not just drop the URIs, admit that the namespace prefixes are significant, and end the whole pointless charade?
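The breakage is simple to demonstrate with a tool that takes the specification at its word. The sketch below round-trips a document through Python's standard xml.etree.ElementTree, which discards the original prefix and generates its own on output. The namespace URI http://example.com/foo and the function name round_trip are placeholders assumed for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the status attribute holds a qualified name as plain text.
SOURCE = '<foo:message xmlns:foo="http://example.com/foo" status="foo:severe"/>'


def round_trip(xml_text):
    """Parse and re-serialize; ElementTree keeps the URI but not the author's prefix."""
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


OUTPUT = round_trip(SOURCE)
print(OUTPUT)
```

The element comes back under a generated prefix, while the attribute value still says foo:severe. The serializer had no way to know that this string was a qualified name, so the output refers to a prefix that is no longer declared anywhere.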

Further Reading

  • Use XML Namespaces with Care, where Uche Ogbuji provides some more handy hints for effective namespace usage.

  • A Plea for Sanity, where Joe English defines the useful concepts of neurotic, borderline, psychotic, normal, and sane use of namespaces in XML documents.

Conclusion

For XML, it may already be too late to remove namespace URIs. While the XML specification itself does not depend on them, enough implementations already do so that it may be impractical to effect a change. However, there are still steps that can be taken when designing XML vocabularies to minimize the problems that namespace URIs cause:

  • Carefully consider whether namespaces are really necessary. Many XML vocabularies don't need them, so don't feel compelled to use them without good reason.

  • If namespaces are necessary, choose namespace URIs that are concise and easy to remember. It helps if they are all lowercase and don't include unnecessary information.

  • Try to get by with only one namespace if you can. There is not much to gain by multiplying namespaces unnecessarily except trouble and complexity. If you must use more than one namespace, at least ensure that the namespace URIs follow a consistent pattern.

  • Agree on standard namespace prefixes for your XML vocabularies; they will help people to read and write your XML without confusion. If you find yourself using the default namespace rather than the prefixes, consider whether you actually need a namespace at all.

Following these steps will help to keep namespace URIs under control in your XML documents.