Entities and their Identifiers

If you are going to use any external entities, you have to declare them. In that declaration you have to give a "system identifier", that system identifier has to be a URI, and whoever is using your document has to be able to use that URI to retrieve the contents of the entity.

A URI, of course, is (in the year 1998, in practical terms) just a URL with a party face.

Since you can't tell if a URL is any good by looking at it, the only way to check it out is to try it, using (if you're a programmer) your handy local library (e.g. java.net for Java geeks), or (if you're a real person) your handy local Web browser.
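
For the programmers, here is a minimal sketch of "just try it" using java.net. The class and method names (SystemIdCheck, isRetrievable) are made up for illustration and are not part of any XML library. It assumes that "good" simply means the identifier can be opened and read.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;

// Hypothetical helper for illustration; not part of any XML library.
final class SystemIdCheck {

    // Returns true if the system identifier can actually be opened and read.
    static boolean isRetrievable(String systemId) {
        if (systemId == null || systemId.isEmpty()) {
            return false; // an empty identifier names nothing retrievable
        }
        try (InputStream in = URI.create(systemId).toURL().openStream()) {
            in.read(); // touch the stream so a lazy failure shows up here
            return true;
        } catch (IOException | IllegalArgumentException e) {
            // Malformed, not absolute, unknown scheme, or simply unreachable.
            return false;
        }
    }

    // Usage: pass one or more system identifiers on the command line.
    public static void main(String[] args) {
        for (String id : args) {
            System.out.println(id + " -> " + (isRetrievable(id) ? "ok" : "broken"));
        }
    }
}
```

A successful check only tells you the URL worked at the moment you tried it, which is exactly the fragility the next paragraph complains about.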

This is quite a bit different from SGML, where you can stay away from the fragility and general icky-ness of URLs by using "public identifiers", which are just labels that you look up in a local catalogue, or just auto-magically know about.

Public identifiers are a really good idea, and it would be good if XML could rely on them. However, at the time we designed XML, there was really no generally-agreed-on way to resolve public identifiers and actually find the data. URLs are certainly not perfect, but nearly everyone knows what they mean, more or less, and pretty well everybody has machinery on their desktop that they can use to retrieve them.

So, in XML you can use public identifiers, but you still have to provide a system identifier. If you do something silly like try using an empty identifier, or one that doesn't work, anyone to whom you send the document has every right to complain that it's broken.
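
To make that concrete, here is a rough sketch in Java of a declaration that carries both kinds of identifier, plus a resolver that tries a local catalogue first. The public identifier, the example.invalid URL, the PublicIdDemo class, and the in-memory catalogue map are all invented for illustration. The standard pieces are the JAXP DocumentBuilder and the SAX EntityResolver callback, which is handed both identifiers. Returning null from the resolver leaves the parser to fetch the system identifier itself.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; identifiers and entity text are invented.
final class PublicIdDemo {

    // The entity has a public identifier AND the mandatory system identifier.
    static final String DOC =
        "<!DOCTYPE doc [\n"
      + "  <!ENTITY chap1 PUBLIC \"-//Example//TEXT Chapter One//EN\"\n"
      + "                 \"http://example.invalid/chap1.xml\">\n"
      + "]>\n"
      + "<doc>&chap1;</doc>";

    // Parses xml, resolving entities through a local catalogue keyed by
    // public identifier, and returns the text of the root element.
    static String parseWithCatalogue(String xml, Map<String, String> catalogue)
            throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver((publicId, systemId) -> {
            String local = (publicId == null) ? null : catalogue.get(publicId);
            if (local == null) {
                return null; // unknown label: fall back to the system identifier
            }
            InputSource src = new InputSource(new StringReader(local));
            src.setSystemId(systemId); // keep the declared identifier for error messages
            return src;
        });
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> catalogue = new HashMap<>();
        catalogue.put("-//Example//TEXT Chapter One//EN",
                      "It was a dark and stormy night.");
        System.out.println(parseWithCatalogue(DOC, catalogue));
    }
}
```

If the catalogue has no entry for the label, everything depends on the system identifier again, which is why leaving it empty or broken earns you a complaint.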

Copyright © 1998, Tim Bray. All rights reserved.