The xml:id Conundrum

February 23, 2005

Rich Salz

Identifying Attributes

XML attributes whose type is ID are very important. They are the only fundamental way to identify a piece of XML. While we have XPath, XPointer, and so on, the only identification mechanism that every XML parser, and therefore every XML application, must understand is ID attributes.

They're very cool. If you have a URL to the document, and the piece you want has an ID attribute, you can just glue the two together:

    http://www.example.com/public/report.xml#part2

The problem, however, is that you need out-of-band information — that is, information outside the document itself — in order to know which are the ID attributes! Without a DTD or Schema or hard-wiring your application, you can't tell which of the following is part2:

    <SOAP:Envelope>
      <SOAP:Header>
        <foo iam="part2" id="part2">
          I am part 2
        </foo>
      </SOAP:Header>
      <SOAP:Body>
        <bar tns:id="part2">
          No, it's me
          <baz wsu:Id="part2">
            Ignore them, it's me!
          </baz>
        </bar>
      </SOAP:Body>
    </SOAP:Envelope>

Of course, only one of those attributes can actually be an ID attribute. If more than one of them were of type ID, then the document would violate the uniqueness constraint and wouldn't be valid XML.

This example also illustrates a few other points. For example, if the foo header element were added by a SOAP intermediary, then it could have inadvertently invalidated the document. In order to prevent this kind of thing from happening in a web services deployment, it's necessary for every node to have the schema for every possible bit of XML that might pass through it. And, of course, that's unrealistic. Instead, applications often generate ID nodes with random values unlikely to cause conflicts, such as UUIDs, or pick obvious names like "Body" that are unlikely to be applied anywhere but the SOAP message body.

The second point is that while many different namespaces define ID attributes, they almost always name them ID, Id, or id; interestingly I don't think I've ever seen iD. In fact, you could be pretty successful if you just wrote an XPath search that fetched attributes with those names:

    //attribute::*[translate(local-name(), 'ID', 'id') = 'id']

Why do you need to do this? Suppose you are doing generic processing, such as signature verification, and you want to keep the layering very clean. The xmlsec library, which is built on libxml2, the Gnome XML toolkit, suffers from this problem. It's question 3.2 of the FAQ.
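
The name-matching heuristic can also be sketched outside XPath. The Python below is only an illustration, assuming the standard library's ElementTree; the helper names and the namespace URIs bound to tns and wsu are placeholders invented for the sketch, not values from any specification. It lists every attribute whose local name is "id" in any letter case, which yields candidates, not attributes that are truly of type ID.

```python
import xml.etree.ElementTree as ET

# The namespace URIs for tns and wsu are placeholders chosen for this sketch.
MESSAGE = """<SOAP:Envelope
    xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:tns="http://www.example.com/tns"
    xmlns:wsu="http://www.example.com/wsu">
  <SOAP:Header>
    <foo iam="part2" id="part2">I am part 2</foo>
  </SOAP:Header>
  <SOAP:Body>
    <bar tns:id="part2">No, it's me
      <baz wsu:Id="part2">Ignore them, it's me!</baz>
    </bar>
  </SOAP:Body>
</SOAP:Envelope>"""


def local_name(qualified):
    """Strip ElementTree's '{namespace}' prefix from a tag or attribute name."""
    return qualified.rsplit("}", 1)[-1]


def id_like_attributes(root):
    """Return (element, attribute, value) local-name triples for every
    attribute whose local name is 'id' in any letter case."""
    found = []
    for element in root.iter():  # document order
        for name, value in element.attrib.items():
            if local_name(name).lower() == "id":
                found.append((local_name(element.tag), local_name(name), value))
    return found


matches = id_like_attributes(ET.fromstring(MESSAGE))
for match in matches:
    print(match)
```

All three candidates carry the value part2, so the heuristic alone still cannot say which element a reference to #part2 means.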

It seems pretty obvious that a nice and clean solution would be to define a "universal" ID attribute that could be understood by everyone. There are two obvious ways to do it:

  • Add an xml:id attribute; that is, an id attribute in the well-known namespace for XML, which is under the sole control of the W3C.
  • Add an xmlid attribute, which is in no namespace, but since it starts with the magic three letters is also under the sole control of the W3C.

Which one to use seems a matter of taste and opinion; there is a Candidate Recommendation that uses the namespace form. Careful readers will note that applications that aren't aware of namespaces — do any of those even exist in the web services area? — will just see that xml:id starts with the three reserved letters.
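
To make the appeal of the namespace form concrete, here is a minimal sketch of resolving a fragment identifier against xml:id with no DTD or schema in sight. It assumes Python's standard ElementTree; the report document and the resolve_fragment helper are hypothetical, and the sketch skips details a real xml:id processor would handle, such as value normalization and uniqueness checks.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# Clark-notation name that ElementTree gives the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical document: the plain "id" is an ordinary attribute unless a
# DTD or schema says otherwise, while xml:id is self-describing.
REPORT = """<report>
  <section id="part2">only an ID if a DTD says so</section>
  <section xml:id="part2">identified without a DTD</section>
</report>"""


def resolve_fragment(url, root):
    """Return the element whose xml:id equals the URL's fragment, or None."""
    fragment = urldefrag(url).fragment
    if not fragment:
        return None
    for element in root.iter():
        if element.get(XML_ID) == fragment:
            return element
    return None


root = ET.fromstring(REPORT)
target = resolve_fragment("http://www.example.com/public/report.xml#part2", root)
print(target.text)
```

Because the xml prefix is always bound to the same namespace, the lookup needs no out-of-band knowledge about which attribute is the identifier.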

Who Needs This?

Outside of basic interoperability, one of the biggest areas of activity in the web services world is security. In a SOAP message the preferred way to identify the sender and guarantee that the message hasn't been altered is to use a "detached" signature. For example, a WS-Security SOAP header will contain an XML Digital signature that references the content of the SOAP Body by means of a fragment identifier which points to an ID attribute. Here's a sketch of such a message:

    <SOAP:Envelope>
      <SOAP:Header>
        <wsse:Security>
          <Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
          ...
            <Reference URI="#Body">
            ...
          </Signature>
        </wsse:Security>
      </SOAP:Header>
      <SOAP:Body>
        <Statement xmlns="http://www.example.com/tns" wsu:Id="Body">
          ...data being signed...
        </Statement>
      </SOAP:Body>
    </SOAP:Envelope>

Note that WS-Security had to define an ID attribute in its "utilities" namespace; see the Statement element. Similarly, the SAML working group defined its own ID attribute, as did the XML Key Management working group. No doubt there are others. Anyone who wants to make sure that their data can be signed within a SOAP message is all but guaranteed to define their own ID attribute.

So things would be a lot simpler, a lot cleaner, and a lot safer (i.e., a potential source of errors would vanish), if there were a universal ID attribute.

Of course, since hindsight is 20/20, it would have helped if the base XML specification included this, or if the XML DSIG group had thought of it.

Be Careful What You Wish For

In order to understand this section, we need to have a little background on one particular aspect of digital signatures. Don't worry, math-phobes, it will be pretty painless.

Hashing is the process of taking a string of input bytes of any length, such as the UTF-8 representation of an XML document, and generating a fixed-size string of output bytes, such that it is computationally infeasible to find two inputs that generate the same output. A proper hash, or message digest, is crucial to being able to sign an XML document. The issue with XML is that it's possible to rewrite a document in many different ways and still preserve the syntax and semantics. For example, XML doesn't care about the order of attributes or the amount of whitespace between them. Thus, from the XML perspective, the following elements are identical:

    <foo a="this" b="not" c='a'/>

    <foo a="this" b="not" c='a'></foo>

    <foo b="not" a="this" c='a'></foo>

    <foo    b="not" a="this" c='a'></foo>

    <foo
    b="not"
    a="this"
    c='a'/>

They will, however, all generate different message digests. This means that if someone signs any of the above formats, someone else may not be able to verify the signature because the hash values are different.
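
A quick way to see this is to hash the raw bytes of each spelling. The sketch below assumes Python's hashlib and uses SHA-256 purely for illustration; the choice of digest algorithm is an assumption of the example and does not affect the point.

```python
import hashlib

# The same element written five ways, as in the listing above.
variants = [
    """<foo a="this" b="not" c='a'/>""",
    """<foo a="this" b="not" c='a'></foo>""",
    """<foo b="not" a="this" c='a'></foo>""",
    """<foo    b="not" a="this" c='a'></foo>""",
    """<foo\nb="not"\na="this"\nc='a'/>""",
]

# Hash the raw UTF-8 bytes of each spelling, with no canonicalization.
digests = [hashlib.sha256(v.encode("utf-8")).hexdigest() for v in variants]

for text, digest in zip(variants, digests):
    print(digest[:16], repr(text))
```

Every spelling is a different byte string, so every digest differs even though an XML parser reports the same element each time.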

The solution is to canonicalize the input before hashing it. (By the way, canonicalization is typically written as c14n; count the letters.) The basic canonicalization method, known as Canonical XML Version 1.0, addresses all those variations. It mandates an explicit close tag, defines a sorting order for attributes, and specifies the whitespace between attributes. The document is fairly short — the actual rules take about two pages — but it is remarkably subtle and difficult to get right.
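
The effect can be sketched with Python's standard library. One loud caveat: xml.etree.ElementTree.canonicalize (Python 3.8 and later) implements the newer C14N 2.0, not Canonical XML Version 1.0, so this is an illustration of the idea and not of the exact algorithm discussed here. SHA-256 is likewise just an assumption of the example.

```python
import hashlib
import xml.etree.ElementTree as ET

# Five spellings of the same element.
variants = [
    """<foo a="this" b="not" c='a'/>""",
    """<foo a="this" b="not" c='a'></foo>""",
    """<foo b="not" a="this" c='a'></foo>""",
    """<foo    b="not" a="this" c='a'></foo>""",
    """<foo\nb="not"\na="this"\nc='a'/>""",
]

# Canonicalize first (C14N 2.0 here, standing in for Canonical XML 1.0),
# then hash the canonical bytes instead of the bytes as written.
canonical_forms = [ET.canonicalize(v) for v in variants]
digests = {hashlib.sha256(c.encode("utf-8")).hexdigest() for c in canonical_forms}

print(canonical_forms[0])
print(len(digests), "distinct digest(s)")
```

Once attribute order, quoting, whitespace between attributes, and the close tag are normalized, all five spellings collapse to one byte string and therefore one digest.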

One of the basic rules of c14n is that if you are processing a subset of a document, then all attributes in the official XML namespace are effectively imported into the document. If you consider namespaces, this makes a lot of sense. If you are signing a piece of XML and a namespace declaration is not part of what you're signing, you still need the value to be included in the digest. For example, suppose we are signing the inner element in the fragment below. If an attacker changes the outer namespace to something like "debit", bad things could happen:

    <SOAP:Body
        xmlns:tns="http://www.example.com/banking#deposit">
        <tns:Transaction acct="12341234">
        1200.00
        </tns:Transaction>
    </SOAP:Body>

The Problem

Perhaps unfortunately, the basic c14n specification says that all attributes in the XML namespace must be imported. This makes sense for namespaces and xml:space but is not appropriate for xml:base, as discussed in the specification. (This is arguably a bug in the c14n specification, but it made sense at the time, and with all the implementations out there now, it cannot be changed.)

It's also not appropriate for the proposed xml:id. In the example above, an attacker could add an attribute like xml:id='x2' to the SOAP Body, and any signature on the internal element will fail to verify.

So xml:id breaks c14n. The other common canonicalization method, Exclusive Canonicalization (exc-c14n), doesn't do the importing. It was designed that way because people realized that an attacker could also break a signature just by adding a namespace declaration to the envelope.
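
The difference between the two behaviors can be mimicked with a toy model. The Python below is not an implementation of Canonical XML 1.0 or exc-c14n: it only imitates the single rule at issue, copying attributes in the XML namespace from ancestors onto the signed element when import_xml_attrs is true, and it serializes with ElementTree's C14N 2.0 canonicalize as a stand-in. The function name, the flag, and the use of SHA-256 are all assumptions made for the sketch.

```python
import copy
import hashlib
import xml.etree.ElementTree as ET

XML_NS = "{http://www.w3.org/XML/1998/namespace}"
TNS = "http://www.example.com/banking#deposit"

ORIGINAL = """<SOAP:Body xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:tns="http://www.example.com/banking#deposit">
  <tns:Transaction acct="12341234">1200.00</tns:Transaction>
</SOAP:Body>"""

# The attack: add xml:id to the enclosing element, outside the signed part.
TAMPERED = ORIGINAL.replace("<SOAP:Body ", "<SOAP:Body xml:id='x2' ", 1)


def digest_of_transaction(body_xml, import_xml_attrs):
    """Digest the tns:Transaction subtree (toy model, see the caveats above).

    With import_xml_attrs=True, xml:* attributes found on ancestors are
    copied onto the subtree's top element before hashing.
    """
    root = ET.fromstring(body_xml)
    parent_of = {child: parent for parent in root.iter() for child in parent}
    target = root.find(".//{%s}Transaction" % TNS)
    apex = copy.deepcopy(target)
    apex.tail = None  # whitespace after the element is not part of the subset
    if import_xml_attrs:
        node = parent_of.get(target)
        while node is not None:
            for name, value in node.attrib.items():
                if name.startswith(XML_NS):
                    apex.attrib.setdefault(name, value)  # nearest ancestor wins
            node = parent_of.get(node)
    canonical = ET.canonicalize(ET.tostring(apex, encoding="unicode"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


importing_changed = (digest_of_transaction(ORIGINAL, True)
                     != digest_of_transaction(TAMPERED, True))
exclusive_changed = (digest_of_transaction(ORIGINAL, False)
                     != digest_of_transaction(TAMPERED, False))
print("digest changes when xml:* attributes are imported:", importing_changed)
print("digest changes when nothing is imported:", exclusive_changed)
```

In this model the added xml:id changes the digest of the untouched Transaction element only when ancestor attributes are imported, which is the failure mode described above.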

What should be done? I'm not sure. When I started writing this piece, I was in favor of xmlid. But everything else is in a namespace, so that seems yucky. Also, c14n can already be broken by xmlns, so this doesn't create a new problem; it just makes an existing one bigger. Viewed parochially, web services use exc-c14n anyway, so let's use xml:id. A new namespace has been suggested, and is arguably the purer solution, but I think it sets a bad precedent of creating intimate knowledge of new namespaces at the very core of all our XML processing engines, and I don't think we want that.

For once, I don't have a strong opinion. But I'm inclined to go with xml:id and leave c14n alone. What do you think?