Links That Are More Valuable Than the Information They Link?

July 25, 1998

Bob DuCharme

Traditional databases have had them for years, and soon people will make money selling Web links.

As XML opens up new possibilities for using information on the Web, its companion XLink specification is expanding ideas about what we can do with links. XLink will let us create links among documents that we can't edit. It will let us create links between more than two points. It will let us structure our links, allowing greater control over increasingly complex links. What will we do with all these new capabilities? For one thing, we'll be able to treat collections of these links as independent databases of information that lead to new categories of products for sale.

Links As Relationships

Sometimes a link has more value than the information it links. How? A link is more than an instruction to jump from a piece of text to another document; it's the identification of a relationship between two (or more) pieces of information. For example, the Snee Widget Company's list of orders is more important than their list of potential customers or their list of inventory items, because orders bring in money. An order table in a relational database won't include a customer's address or the available colors of an inventory item; the customer and inventory tables hold this information. An order record may say little more than "Customer 4978 ordered 3 units of item 6134 on July 3, 1998." If necessary, an order report can include the customer's address or the item's color choices by using these ID numbers to look up this information in the respective tables; the real job of the order record is to identify the relationship between a piece of information in the customer database and a piece in the inventory database -- in other words, to link them.

The role of these order records demonstrates some new possibilities for links:

  • You can create useful links without editing the linked information. The Snee Widget Company's order records were created without requiring any changes to the customer or inventory records they linked.

  • Along with identifying related pieces of information, a link can add new information: in this case, the order date and the number of items ordered.

  • The links themselves have value. The order database will obviously be used for billing, but it also can be cross-referenced with the other tables to find out who ordered how much of what types of merchandise -- the kind of information that American Express spends huge sums gathering.
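To make the order-as-link idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and the sample customer and item details are assumptions made up for illustration; only the ID numbers, quantity, and date come from the example above.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE item (id INTEGER PRIMARY KEY, description TEXT, colors TEXT);
    -- The order row is the link: two IDs plus the information the link adds.
    CREATE TABLE orders (
        customer_id INTEGER REFERENCES customer(id),
        item_id INTEGER REFERENCES item(id),
        quantity INTEGER,
        order_date TEXT
    );
""")
# Placeholder customer and item details (assumed, not from the article).
conn.execute("INSERT INTO customer VALUES (4978, 'Example Customer', '12 Example St.')")
conn.execute("INSERT INTO item VALUES (6134, 'Example widget', 'red, blue')")
# Creating the order does not modify the customer or item rows.
conn.execute("INSERT INTO orders VALUES (4978, 6134, 3, '1998-07-03')")


def order_report(conn):
    """Follow each order's two IDs to pull in the linked details."""
    return conn.execute("""
        SELECT c.name, c.address, i.description, i.colors,
               o.quantity, o.order_date
        FROM orders AS o
        JOIN customer AS c ON c.id = o.customer_id
        JOIN item AS i ON i.id = o.item_id
    """).fetchall()


report = order_report(conn)
```

The orders table holds nothing but two foreign keys and the data the link itself contributes, yet the join recovers everything a printed order report would need.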

Links Between Documents

Inventories, customer lists, and order lists fit nicely into the rows and columns of relational tables, and that's where they belong. Let's look at an example more suited to Web documents: legal citations. Lawyers presenting an argument to a judge often cite "precedents," or past legal decisions that serve to support their client's case. Let's say that in O'Rourke v. Agarn, Agarn's lawyer cites the Tucker v. Storch case from ten years ago. Then, it turns out that some other lawyer had cited Tucker v. Storch five years ago for Dobbs v. Duffy, and that the Dobbs v. Duffy judge criticized the Tucker v. Storch decision. This may make it a problematic citation for Agarn's lawyer to use. On the other hand, if seven other cases had successfully cited the case Tucker v. Storch in those ten years with no criticism, it's a great case for Agarn's lawyer to cite.

Each citation is an identification of a relationship between two cases -- a link. The relative success or failure of past citations to a particular case is important information for the lawyers who may cite it again, and they would pay for this information. They already pay for similar information from expensive services that sell access to the decisions and the databases of citation information, but, as more court decisions become freely available online through the efforts of groups like the Taxpayer Assets Project and Cornell's Legal Information Institute, all that we need are the citation links between the court decisions on the Web.
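As a rough sketch of what such citation data might look like, each citation can be modeled as a small record naming the citing case, the cited case, and how the citation fared. The Citation class and the treatment labels ("cited", "criticized") are hypothetical names chosen for this illustration, not part of any real legal database.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """One link between two decisions, plus what the link adds."""
    citing_case: str
    cited_case: str
    treatment: str  # hypothetical labels, e.g. "cited" or "criticized"


citations = [
    Citation("Dobbs v. Duffy", "Tucker v. Storch", "criticized"),
    Citation("O'Rourke v. Agarn", "Tucker v. Storch", "cited"),
]


def treatment_summary(citations, cited_case):
    """Count how past citations of one case were treated."""
    return Counter(c.treatment for c in citations if c.cited_case == cited_case)


summary = treatment_summary(citations, "Tucker v. Storch")
```

A lawyer considering Tucker v. Storch would look at the summary before relying on the case; the decisions themselves are never touched.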


XML's XLink provides the mechanism to define these links. The XPointer specification (another in the W3C's family of XML specs) provides ways to identify a point in the link's own document or in an external document as the source, destination, or both for a particular link, even if you have no editing rights for that document. (Contrast this with HTML, which can only link from a particular point if you can insert an A element with an HREF attribute there, and can only link to a specific point within a document if you can put an A element with a NAME attribute at the link destination.) An "out-of-line" XLink link (as opposed to an "in-line" link, which is incorporated into one of the documents being linked, as every HTML A element with an HREF attribute is) can use XPointers to identify the ends of a link from within a document completely separate from the files holding those ends. XLink even lets you define a link between more than two documents (or images, or audio clips, or query results), so that one passage of Shakespeare and three critics' commentaries on it could all be linked to each other.

XLink also lets you create more complex structures for links, so that you can define relationships between the various document components that take part in a link. For example, let's say that the Shakespeare quote and the three critics' commentaries weren't just four ends (or, in XLink parlance, "resources") of the same link, but had specific roles and relationships identified by the markup around them. A style sheet could then distinguish between a Shakespeare-critic relationship and a critic-critic relationship; once it knew the difference, it could implement traversal of one as a popup window and the other as scrolling parallel windows.
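The sketch below shows one way an out-of-line, four-ended link with roles might be read by a program. It makes several assumptions: the attribute names (xlink:type, xlink:role, xlink:href) and the namespace are those of the later XLink 1.0 Recommendation, which may differ from the working draft discussed here; the element names, example.com addresses, and role URIs are invented; and the PRESENTATION table merely stands in for whatever a real style sheet would do.

```python
import xml.etree.ElementTree as ET

XLINK = "{http://www.w3.org/1999/xlink}"

# An out-of-line link: it lives in its own document and only points at
# the four resources. Element names and URIs are hypothetical.
LINK_DOC = """\
<commentary xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="extended">
  <end xlink:type="locator" xlink:role="http://example.com/role/play"
       xlink:href="http://example.com/hamlet.xml#soliloquy"/>
  <end xlink:type="locator" xlink:role="http://example.com/role/critic"
       xlink:href="http://example.com/critic-a.xml#note"/>
  <end xlink:type="locator" xlink:role="http://example.com/role/critic"
       xlink:href="http://example.com/critic-b.xml#note"/>
  <end xlink:type="locator" xlink:role="http://example.com/role/critic"
       xlink:href="http://example.com/critic-c.xml#note"/>
</commentary>
"""


def resources(link_xml):
    """Return (short role name, href) for every locator in the link."""
    root = ET.fromstring(link_xml)
    return [
        (el.get(XLINK + "role").rsplit("/", 1)[-1], el.get(XLINK + "href"))
        for el in root
        if el.get(XLINK + "type") == "locator"
    ]


# Hypothetical presentation rules keyed by the pair of roles traversed.
PRESENTATION = {
    ("play", "critic"): "popup window",
    ("critic", "critic"): "parallel windows",
}


def presentation(from_role, to_role):
    """Pick a traversal style from the roles at each end."""
    return PRESENTATION.get((from_role, to_role), "default")


ends = resources(LINK_DOC)
```

Because the roles travel with the link rather than with the documents, the same four files could take part in other links with entirely different roles.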

Structured Links

XLink's ability to define more structure for these increasingly complex links makes them easier to control, which leads to easier maintenance and, as with XML, to greater possibilities for automated processing into electronic products. When XLink specification co-editor Eve Maler announced the publication of a revised W3C working draft of the specification at "XML '98" in Seattle last March, she drew an analogy between the evolution of linking technology and that of word processing. Early word processing systems had structure, and WYSIWYG systems then skipped it; as people scaled up to larger document processing systems they had to fake structure, and structure for Web documents eventually had to be reinvented in the form of XML. Early hypertext systems far more powerful than HTML had structure, too; HTML's A element skipped it in the name of simplicity, and now XLink puts it back in to make larger, more complex linking systems possible.

By easing this automated processing, structure makes it easier to scale up to larger systems. This makes maintenance of a large hypertext system easier; with out-of-line links stored in a relational database, you can scan for and handle broken links much more efficiently -- so efficiently that even with multiple gigabytes of content you could still check all of its links as a nightly batch job. This vastly reduces the chance that your users will find broken links on any given day.
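A nightly link check over such a database could be as simple as the following sketch. The link table layout and the example.com addresses are assumptions, and the exists function is a stand-in: a real batch job would fetch each document and resolve the pointer into it rather than consult an in-memory set.

```python
import sqlite3

# Hypothetical link table: every out-of-line link is one row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE link (id INTEGER PRIMARY KEY, source TEXT, target TEXT)"
)
conn.executemany(
    "INSERT INTO link (source, target) VALUES (?, ?)",
    [
        ("http://example.com/a.xml#p1", "http://example.com/b.xml#p4"),
        ("http://example.com/a.xml#p2", "http://example.com/gone.xml#p1"),
    ],
)


def find_broken_links(conn, exists):
    """Return the IDs of links with at least one end that does not resolve."""
    rows = conn.execute("SELECT id, source, target FROM link ORDER BY id")
    return [
        link_id
        for link_id, source, target in rows
        if not (exists(source) and exists(target))
    ]


# Stand-in for a real resolver (assumed for this sketch).
known = {
    "http://example.com/a.xml#p1",
    "http://example.com/a.xml#p2",
    "http://example.com/b.xml#p4",
}
broken = find_broken_links(conn, lambda uri: uri in known)
```

Because the links sit in one table instead of being scattered through the content, the job never has to parse the documents just to find out what links exist.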

And that's just mundane maintenance work. Systems like this will encourage the development of new services and new products to build around these services. For example, with link information stored in a database, the extra information stored with each link record provides filtering opportunities that let you charge different rates for different categories of access. Customized link sets related to specific content or subscription levels could also be sold as separate products from the same large database; when Jane User wants to upgrade from basic service to extended service, a simple change in her customer record could grant her access to all links (and the information they link to) instead of the subset that the marketing department selected as the basic access set.
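The upgrade scenario might be sketched like this, again with sqlite3. The min_level column, the numeric service levels (1 for basic, 2 for extended), and the sample addresses are all assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: each link records the lowest service level
    -- that may see it (1 = basic, 2 = extended).
    CREATE TABLE link (id INTEGER PRIMARY KEY, target TEXT, min_level INTEGER);
    CREATE TABLE subscriber (name TEXT PRIMARY KEY, level INTEGER);
    INSERT INTO link (target, min_level) VALUES
        ('http://example.com/basic-1.xml', 1),
        ('http://example.com/basic-2.xml', 1),
        ('http://example.com/extended-1.xml', 2);
    INSERT INTO subscriber VALUES ('Jane User', 1);
""")


def visible_links(conn, name):
    """List the link targets a subscriber's level entitles her to."""
    rows = conn.execute(
        """
        SELECT l.target
        FROM link AS l
        JOIN subscriber AS s ON s.name = ?
        WHERE l.min_level <= s.level
        ORDER BY l.id
        """,
        (name,),
    )
    return [target for (target,) in rows]


before = visible_links(conn, "Jane User")
# The upgrade is one change to the customer record; no link is edited.
conn.execute("UPDATE subscriber SET level = 2 WHERE name = 'Jane User'")
after = visible_links(conn, "Jane User")
```

The same link database serves both products; only the filter applied on the way out differs.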

The Future

XLink isn't complete yet. While the XML spec has become an official W3C Recommendation, the XLink and XPointer specs are still in Working Draft status as the W3C XML Working Group debates and refines their details. XML developers at the Seattle conference this past spring were thrilled to see an advance showing of some compiled Netscape Navigator 5.0 code, which demonstrated some XLink capabilities along with its XML and CSS 2 support. Microsoft's Internet Explorer, the browser that had previously been well ahead of Navigator in XML support, is bound to catch up in XLink support with its own 5.0 release. Soon, we'll all be able to take advantage of this vendor-neutral W3C standard for mass-media and take distributed hypertext to the next level of creativity and commerce.