Answering the Namespace Riddle
February 28, 2001
This tutorial introduces the Resource Directory Description Language (RDDL), which is the result of a recent project conducted by the XML-DEV community. It provides an overview of RDDL's very simple vocabulary and the benefits it can bring to XML applications.
Namespaces are now a common feature of any new XML vocabulary. While their use is spreading, there is still a great deal of controversy associated with them. The controversy has generally focused on the choice and use of URIs, or more commonly URLs, as Namespace identifiers. While the XML Namespaces specification notes that URIs were selected merely as a unique identification system, it is silent on the issue of what, if anything, those URIs should point to.
The received wisdom, as documented in the Namespace FAQ, is that these URIs are not meant to point to anything. Paste one into your web browser, and you'll likely get a "404 Not Found" error. Yet URLs have become synonymous with web resources -- developers and Internet users alike expect to be able to point their browsers at these URLs and obtain something intelligible.
In response to this expectation, many developers have begun placing useful resources at a namespace URL. For example, the RSS 1.0 specification is found at the RSS namespace URI. Other XML applications place XML schemas, of different varieties, at these URLs, giving a handy place for applications to retrieve schemas during processing.
This unregulated practice, and more importantly the mixture of resources that might appear at these URLs, has led to controversy in XML circles. Some believe that the practice should be deprecated, others that it must be regulated in some way. Following a recent resurgence of this debate on XML-DEV, a consensus was finally reached: a namespace URL should point to a directory of resources rather than a single web page or schema. Thus RDDL was born. Additional background on the debate can be found in a recent XML-Deviant column, "Old Ghosts: XML Namespaces".
RDDL (pronounced "riddle") was designed by Jonathan Borden and Tim Bray in collaboration with members of the XML-DEV mailing list. A number of requirements contributed to the design of the language:
- Recognize that a plurality of different resources could be associated with a Namespace; no single resource type should be favored
- Provide machine-readable access to the resource directory
- Provide human-readable (i.e. browsing) access to the resource directory
- Use a simple well-defined syntax
- Recognize that a directory might contain multiple instances of a particular resource type; for example, multiple CSS or XSLT stylesheets
RDDL meets these requirements in the following ways:
- RDDL documents describe a directory of resources
- RDDL is derived from the XHTML standard and is therefore accessible using existing browsers
- The RDDL vocabulary contains only a single element
- RDDL adds an additional level of indirection to its resource directory, allowing a resource to have both a nature and a purpose
The details of these features are covered in the next section. It is worth noting that RDDL takes advantage of the modularization features of XHTML to build its vocabulary upon XHTML Basic. As it adds but a single new element that borrows features from XLink, it's very accessible to programmers and web developers alike.
The XHTML framework
RDDL is layered on XHTML Basic; thus RDDL documents can contain any element from the XHTML Basic module. With this in mind, what follows is a simple template for an RDDL document.
<!DOCTYPE html PUBLIC "-//XML-DEV//DTD XHTML RDDL 1.0//EN"
    "http://www.rddl.org/rddl-xhtml.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      xmlns:rddl="http://www.rddl.org/"
      xml:lang="en">
  <head>
    <title>My First RDDL Document</title>
    <link href="http://www.rddl.org/xrd.css" type="text/css" rel="stylesheet" />
  </head>
  <body>
    <h1>My First RDDL Document</h1>
    <!-- Body of RDDL document to appear here -->
  </body>
</html>
There are several things worth noting about this first example:
- The DOCTYPE declaration refers to the RDDL DTD, which has been defined as an XHTML module (see "XHTML Modularization" for more information)
- Since the majority of the elements in RDDL are from the XHTML namespace, it's been declared as the default namespace in the document.
- The RDDL and XLink namespaces are declared explicitly.
- The document references a CSS stylesheet provided by the RDDL authors for styling RDDL documents; Jonathan Borden has also produced an IE5 behavior.
In all other regards the document is practically identical to any other simple XHTML document; save it to a file and fire up your browser: you'll see the expected result.
Given this initial template we can now begin adding elements from the XHTML and RDDL namespaces.
The resource element
RDDL includes only a single new element, called resource. In fact the element is only required as a placeholder for a number of XLink attributes. RDDL is very simple: if you can understand XHTML and basic XLink, then RDDL is a breeze.
The simplest version of the resource element is as follows:
<rddl:resource
    xlink:href="http://www.bath.ac.uk/~ccslrd/examples/pizzaml/schema.xsd"
    xlink:title="The PizzaML Schema">
  <!-- Description of the resource goes here -->
</rddl:resource>
This adds a single resource to the RDDL document. Its location is defined using an XLink reference. A title is added for completeness. Additional descriptive text can be associated with a resource using elements from the XHTML namespace. An RDDL resource element is roughly equivalent to the xhtml:div element and may therefore contain the usual mixture of paragraph and textual markup elements.
The following is a more complete example using the XHTML template. In this case we're producing an RDDL document that describes the resources associated with the namespace of a fictitious XML vocabulary for describing pizzas.
<!DOCTYPE html PUBLIC "-//XML-DEV//DTD XHTML RDDL 1.0//EN"
    "http://www.rddl.org/rddl-xhtml.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      xmlns:rddl="http://www.rddl.org/"
      xml:lang="en"
      xml:base="http://www.bath.ac.uk/~ccslrd/examples/pizzaml/ns/">
  <head>
    <title>RDDL Document for the Pizza Description Language</title>
    <link href="http://www.rddl.org/xrd.css" type="text/css" rel="stylesheet"/>
  </head>
  <body>
    <h1>RDDL Document for the Pizza Description Language</h1>
    <p>
      This document provides a list of resources associated with the
      Pizza description language, PizzaML.
    </p>
    <rddl:resource
        xlink:href="http://www.bath.ac.uk/~ccslrd/examples/pizzaml/schema.xsd"
        xlink:title="The PizzaML XML Schema">
      <p>
        The XML Schema for PizzaML documents is
        <a href="http://www.bath.ac.uk/~ccslrd/examples/pizzaml/schema.xsd">
        available from here</a>.
      </p>
    </rddl:resource>
  </body>
</html>
We now have a complete RDDL document containing only a single resource, an XML Schema for PizzaML. Obviously this example is very trivial. In reality a directory is likely to contain all sorts of additional resources. Thus, we need to introduce two additional XLink attributes of the resource element to declare the nature and purpose of a resource in our RDDL directory.
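Because an RDDL document is well-formed XML, any namespace-aware parser can read the directory. The following is a minimal sketch in Python, not part of the RDDL specification or of any RDDL API; the function name list_resources and the trimmed-down SAMPLE document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

RDDL_NS = "http://www.rddl.org/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# A cut-down version of the PizzaML directory (illustrative only).
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      xmlns:rddl="http://www.rddl.org/">
  <body>
    <rddl:resource
        xlink:href="http://www.bath.ac.uk/~ccslrd/examples/pizzaml/schema.xsd"
        xlink:title="The PizzaML XML Schema">
      <p>The XML Schema for PizzaML documents.</p>
    </rddl:resource>
  </body>
</html>"""


def list_resources(rddl_text):
    """Return (title, href) pairs for every rddl:resource in the document.

    list_resources is a hypothetical helper, not an RDDL API call.
    """
    root = ET.fromstring(rddl_text)
    pairs = []
    # ElementTree spells qualified names as {namespace-uri}local-name.
    for res in root.iter("{%s}resource" % RDDL_NS):
        title = res.get("{%s}title" % XLINK_NS)
        href = res.get("{%s}href" % XLINK_NS)
        pairs.append((title, href))
    return pairs


for title, href in list_resources(SAMPLE):
    print(title, "->", href)
```

With only a title and a location to go on, a program cannot tell what kind of resource it has found, which is the gap the two extra attributes fill.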
A simplistic directory format might associate a unique name with each resource. Such a format won't scale, though, since it requires the user or processing application to know in advance the name of the resource they wish to retrieve. Without a standard naming convention, it's hard to know resource names in advance, particularly if multiple resources of the same type must be added to the directory. How would you distinguish, in an interoperable way, between two XSLT stylesheets referenced as resources?
RDDL adds a level of indirection and a naming convention that solves these problems. It distinguishes the type of a resource, known as its nature, from the action that it performs, known as its purpose.
We might want to add several new resources to our PizzaML directory: an XSLT stylesheet to convert PizzaML documents into an XHTML menu, a second stylesheet to render a PizzaML document as RSS (to syndicate our menu to customers), and a DTD that can be used for validation instead of our XML Schema. The following example demonstrates such a usage, with the XHTML bits left out for clarity.
<rddl:resource
    xlink:role="http://www.w3.org/1999/XSL/Transform"
    xlink:arcrole="http://www.w3.org/1999/xhtml"
    xlink:href="http://www.bath.ac.uk/~ccslrd/examples/pizzaml/pizza2html.xslt"
    xlink:title="Transform Pizza Menu to XHTML">
  <!-- ... -->
</rddl:resource>

<rddl:resource
    xlink:role="http://www.w3.org/1999/XSL/Transform"
    xlink:arcrole="http://purl.org/rss/1.0/"
    xlink:href="http://www.bath.ac.uk/~ccslrd/examples/pizzaml/pizza2rss.xslt"
    xlink:title="Transform Pizza Menu to RSS">
  <!-- ... -->
</rddl:resource>

<rddl:resource
    xlink:role="http://www.isi.edu/in-notes/iana/assignments/media-types/text/xml-dtd"
    xlink:arcrole="http://www.rddl.org/purposes#validation"
    xlink:href="http://www.bath.ac.uk/~ccslrd/examples/pizzaml/pizzaml.dtd"
    xlink:title="The PizzaML DTD">
  <!-- ... -->
</rddl:resource>

<rddl:resource
    xlink:role="http://www.w3.org/2000/10/XMLSchema"
    xlink:arcrole="http://www.rddl.org/purposes#schema-validation"
    xlink:href="http://www.bath.ac.uk/~ccslrd/examples/pizzaml/schema.xsd"
    xlink:title="The PizzaML XML Schema">
  <!-- ... -->
</rddl:resource>
The new additions are the xlink:role and xlink:arcrole attributes. The xlink:role attribute describes the resource type, or nature. RDDL defines a simple rule to determine the correct URI for a role (XLink requires that both role and arcrole attributes are URI references):
- If the reference is to an XML language that defines its own namespace, then the role should be the URI of this namespace.
- Else, if the reference is not an XML document, but can be distinguished by its MIME type (true for almost all resources), then this is used to derive the correct reference by adding a prefix of "http://www.isi.edu/in-notes/iana/assignments/media-types/".
In our example above, we can see that the XSLT resources refer to the XSLT namespace in their role attributes, while the DTD uses the alternative mode, deriving its nature from the MIME type for a DTD.
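The two-branch rule for natures can be written down as a small function. This is a sketch under stated assumptions rather than anything defined by RDDL itself: the function name nature_uri and its parameters are hypothetical, and the media type text/xml-dtd is simply the one that appears in the listing above.

```python
# Prefix quoted in the rule above for natures derived from MIME types.
MEDIA_TYPE_PREFIX = "http://www.isi.edu/in-notes/iana/assignments/media-types/"


def nature_uri(namespace=None, mime_type=None):
    """Return the xlink:role (nature) URI for a resource.

    Hypothetical helper: prefer the namespace URI of an XML language,
    otherwise fall back to the prefixed MIME type.
    """
    if namespace:
        return namespace
    if mime_type:
        return MEDIA_TYPE_PREFIX + mime_type
    raise ValueError("need either a namespace URI or a MIME type")


# An XSLT stylesheet is an XML language with its own namespace.
print(nature_uri(namespace="http://www.w3.org/1999/XSL/Transform"))

# A DTD is not an XML document, so its nature comes from its MIME type.
print(nature_uri(mime_type="text/xml-dtd"))
```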
The xlink:arcrole attribute describes the purpose of our resource. It is harder to determine a purpose than to determine a type, so RDDL defines a number of well-known purposes. In general the purpose of a resource is dependent on its nature.
The listing above provides several examples. The purpose of the PizzaML Menu stylesheet is the XHTML namespace, which indicates that this is an XSLT stylesheet that generates XHTML. The PizzaML DTD, by contrast, is to be used for validation and so is labeled with the special RDDL validation purpose. It's important to note that RDDL differentiates between DTD and Schema-based validation, as the example clearly demonstrates.
There isn't space to discuss all the potential purposes of RDDL resources, but the initial list of well-known purposes provides many more examples. As with natures, this list will expand over time.
It should be clear from the example that, using nature and purpose together, it's possible to distinguish between two resources of the same type. Such distinction allows an application to define the type of resource it wishes to retrieve. A person can directly browse the RDDL document to determine the resource that he or she requires.
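To make the selection step concrete, here is a minimal Python sketch of an application asking a directory for resources with a given nature and purpose. It is an illustration only: find_resources is a hypothetical helper, and DIRECTORY is a shortened copy of the PizzaML resources wrapped in a stand-in root element.

```python
import xml.etree.ElementTree as ET

RDDL_NS = "http://www.rddl.org/"
XLINK_NS = "http://www.w3.org/1999/xlink"
XSLT_NS = "http://www.w3.org/1999/XSL/Transform"
BASE = "http://www.bath.ac.uk/~ccslrd/examples/pizzaml/"

# Shortened PizzaML directory; the <body> wrapper stands in for the XHTML page.
DIRECTORY = """<body xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      xmlns:rddl="http://www.rddl.org/">
  <rddl:resource xlink:role="http://www.w3.org/1999/XSL/Transform"
      xlink:arcrole="http://www.w3.org/1999/xhtml"
      xlink:href="http://www.bath.ac.uk/~ccslrd/examples/pizzaml/pizza2html.xslt"/>
  <rddl:resource xlink:role="http://www.w3.org/1999/XSL/Transform"
      xlink:arcrole="http://purl.org/rss/1.0/"
      xlink:href="http://www.bath.ac.uk/~ccslrd/examples/pizzaml/pizza2rss.xslt"/>
  <rddl:resource xlink:role="http://www.w3.org/2000/10/XMLSchema"
      xlink:arcrole="http://www.rddl.org/purposes#schema-validation"
      xlink:href="http://www.bath.ac.uk/~ccslrd/examples/pizzaml/schema.xsd"/>
</body>"""


def find_resources(rddl_text, nature, purpose):
    """Return the hrefs of resources whose role and arcrole both match."""
    root = ET.fromstring(rddl_text)
    matches = []
    for res in root.iter("{%s}resource" % RDDL_NS):
        if (res.get("{%s}role" % XLINK_NS) == nature
                and res.get("{%s}arcrole" % XLINK_NS) == purpose):
            matches.append(res.get("{%s}href" % XLINK_NS))
    return matches


# Both stylesheets share a nature; only the purpose tells them apart.
print(find_resources(DIRECTORY, XSLT_NS, "http://purl.org/rss/1.0/"))
print(find_resources(DIRECTORY, XSLT_NS, "http://www.w3.org/1999/xhtml"))
```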
Example RDDL Documents
Presently there are only a few live examples of RDDL on the Internet, although this is likely to change rapidly when developers begin to use RDDL in earnest. The RDDL specification and associated documentation are all RDDL documents. Schematron, the rules-based validation language, provides an RDDL document at its Namespace URI. Jonathan Borden, co-editor of the RDDL specification, has provided an example of how the RSS specification would be converted into RDDL. The RSS example includes some interesting examples of RDDL purposes, notably for describing mailing lists ("http://www.rddl.org/purposes#mailing-list") and contributors ("http://purl.org/dc/elements/1.1/contributor").
Having created a simple RDDL document, the natural question to consider is how we might use it within an XML application. Jonathan Borden has started to create a Java API for RDDL. The API will allow an application to manipulate resources described by an RDDL document, allowing enumeration of resources with given natures and purposes, as well as retrieving individual resources upon request. It's still a work in progress, so we'll confine the following examples to simple XSLT transforms.
Defining XSLT transformations to manipulate RDDL documents -- to extract the locations of resources with particular natures or purposes, for example -- is very simple. We'll attempt something a bit more ambitious. Let's say that we want an XSLT transformation that will
- process an RDDL document, and
- produce a local copy of a desired resource.
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:rddl="http://www.rddl.org/">

  <!-- purpose (arcrole) requested by the user -->
  <xsl:param name="purpose"/>

  <!-- template to grab the RDDL document -->
  <xsl:template name="getResource">
    <xsl:param name="ns"/>
    <xsl:param name="role"/>
    <xsl:param name="arcrole"/>
    <xsl:apply-templates
        select="document($ns)//rddl:resource[$role = @xlink:role and $arcrole = @xlink:arcrole]"
        mode="getResource"/>
  </xsl:template>

  <!-- return a copy of this resource -->
  <xsl:template match="rddl:resource" mode="getResource">
    <xsl:copy-of select="document(@xlink:href)"/>
  </xsl:template>

  <!-- select attributes keep stray whitespace out of the parameter values -->
  <xsl:template match="/">
    <xsl:call-template name="getResource">
      <xsl:with-param name="ns" select="namespace-uri(*)"/>
      <xsl:with-param name="role" select="'http://www.w3.org/1999/XSL/Transform'"/>
      <xsl:with-param name="arcrole" select="$purpose"/>
    </xsl:call-template>
  </xsl:template>

</xsl:stylesheet>
There are several things to note about this stylesheet.
- A named template "getResource" processes the RDDL document, extracting those resources with the correct nature (role) and purpose (arcrole)
- Copying the desired resource is handled with the document() function
- The getResource template is invoked with several parameters by the main template
- The nature is fixed; only XSLT resources are extracted
- The purpose is chosen by the user, using the purpose stylesheet parameter
- The RDDL document location is extracted from the namespace of the root element in the input document
We can run this stylesheet using the example PizzaML document. For this demonstration we'll request resources with the RSS purpose:
C:\projects\rddl>java com.icl.saxon.StyleSheet -o out.xsl pizza.xml getResource.xsl purpose=http://purl.org/rss/1.0/
We should find the PizzaML to RSS stylesheet in the output file.
Now we can dynamically download XSLT stylesheets upon request as long as there is an RDDL document associated with the namespace. Obviously this stylesheet needs to be more robust: it doesn't handle documents without namespaces, and it ignores the fact that a document may contain elements from multiple namespaces. Yet this is enough for us to get started.
We can take this a step further by defining a two-step translation process. The first step selects the stylesheet to download, and the second applies it:
C:\projects\rddl>java com.icl.saxon.StyleSheet -o out.xsl pizza.xml getResource.xsl purpose=http://purl.org/rss/1.0/
C:\projects\rddl>java com.icl.saxon.StyleSheet pizza.xml out.xsl
The end result is an RSS version of our original PizzaML document. In fact we can put this process into a handy batch file.
The advantages of this approach should be obvious. I no longer need to know where the stylesheet is kept, a detail, among others, neatly encapsulated within the RDDL document. There are a number of avenues for further expansion: a GUI to present users with a list of possible transforms would be useful, as would a means to cache downloaded documents locally. Neither of these is an RDDL responsibility, but both can easily be constructed on the basic framework outlined above. Implementing the two-step process as a Java application should also be a simple project.
There are numerous other possibilities for using RDDL in XML applications:
- Retrieve a CSS stylesheet for styling an XML document
- Retrieve various XML schemas for validating a document
- Download application code for processing XML instances
- Download an application plugin to visualize or manipulate XML data
- Retrieve user documentation for a user editing an XML document
Because RDDL does not limit the resources that can be linked to a namespace URI, none of these possibilities are mutually exclusive. The same cannot be said for URIs that resolve to a single resource.
RDDL is the model of simplicity. Layered on top of XHTML, it's instantly familiar. Yet it includes machine-readable data, making it suitable for consumption by applications and humans alike. RDDL is a good example of what can be achieved using XHTML modularization and is perhaps indicative of the types of document that will populate the Semantic Web.
The real benefits of RDDL will be realized as namespace URIs are populated with RDDL documents and as applications can routinely rely on them as a source of required resources. In the short term the lack of use can be encapsulated within RDDL APIs until the network effect takes hold.
RDDL can also facilitate the development of new types of application. By traversing a number of RDDL documents, it's possible for an XML processor to piece together a pipeline of transformations that may be applied to a given document. If the processor needs to transform a document from language X into language Y, it may be able to find one or more intermediate transforms if no direct route is available -- a kind of simple inference which may be key to Semantic Web applications.
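One way to picture this route finding is as a shortest-path search over a graph whose edges are transforms. The sketch below is a hedged illustration, not something RDDL or any existing RDDL API prescribes: it assumes the XSLT resources have already been harvested from several RDDL documents into (source namespace, target namespace, stylesheet location) triples, and every name and URI in it is made up.

```python
from collections import deque

# Hypothetical transforms gathered from RDDL directories: each entry is
# (namespace the directory describes, arcrole/purpose namespace, xlink:href).
TRANSFORMS = [
    ("urn:example:X", "urn:example:A", "x2a.xslt"),
    ("urn:example:A", "urn:example:Y", "a2y.xslt"),
    ("urn:example:X", "urn:example:B", "x2b.xslt"),
]


def find_pipeline(transforms, source, target):
    """Breadth-first search for the shortest chain of stylesheets.

    Returns a list of stylesheet locations to apply in order,
    or None when no route from source to target exists.
    """
    edges = {}
    for src, dst, href in transforms:
        edges.setdefault(src, []).append((dst, href))

    queue = deque([(source, [])])
    seen = {source}
    while queue:
        language, path = queue.popleft()
        if language == target:
            return path
        for dst, href in edges.get(language, []):
            if dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [href]))
    return None


# No direct X-to-Y stylesheet exists, so the route goes through A.
print(find_pipeline(TRANSFORMS, "urn:example:X", "urn:example:Y"))
```

Breadth-first search is chosen here only because it returns the route with the fewest intermediate transforms; a real processor might weigh routes differently.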
The XML community has long desired a true XML browser, an application that can display and manipulate any kind of XML document. A key XML browser component is a way to dynamically download new behaviors to deal with unknown document types. With appropriate application code referenced from RDDL resource elements, this kind of dynamic adoption becomes much easier.
In short, RDDL is an elegant little language that can resolve much of the confusion and debate over XML Namespaces, while providing the means to build some very interesting applications.