What should publishers do?

November 20, 1997

Seybold Report on Internet Publishing
Vol 2, No 3

There are many implications one might derive from these developments; we’ll suggest three.

Get serious about metadata. If you haven’t been keeping metadata on your documents and source files (text, images and other media), now is the time to begin doing so. RDF, the first XML application that both Microsoft and Netscape have pledged to support, ultimately will be the basis for site maps, better searching, branded channels and a host of other features that still haven’t been dreamed up.

Ask your vendor what it’s doing. If you’ve been using a document management system to keep metadata, now is the time to ask your vendor when (not if) it will have XML support.

Metadata has always been an integral component of professional publishing systems aimed at collaborative workgroups. But the use of that data was always system- or application-specific. For example, in newspapers, a wire-service header comes into the system and is then used to automatically route the document. But the display of the header was something each newspaper vendor implemented; there is no universal viewer that reads wire-service headers and stories. Now imagine that all of the newspaper’s customers have an RDF-aware viewer. They can get a whole collection of stories with their headers attached and view those stories by section, or by author, without going back to the server. Clearly, the paper’s editorial staff would want to take advantage of such a capability; indeed, they will probably want to drive it.
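The client-side regrouping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the story records, the field names (`headline`, `section`, `author`) and the `group_by` helper are all hypothetical, and plain dictionaries stand in for the RDF metadata a real viewer would have to parse.

```python
from collections import defaultdict

# Hypothetical stories already delivered to the client, each with its
# metadata attached. The field names are assumptions for illustration,
# not part of any RDF vocabulary.
STORIES = [
    {"headline": "Council approves budget", "section": "Local", "author": "Lee"},
    {"headline": "Storm closes harbor", "section": "Local", "author": "Ortiz"},
    {"headline": "Markets end week higher", "section": "Business", "author": "Lee"},
]


def group_by(stories, field):
    """Group headlines by one metadata field, with no request to the server."""
    groups = defaultdict(list)
    for story in stories:
        groups[story[field]].append(story["headline"])
    return dict(groups)


# The same downloaded collection yields two different views.
by_section = group_by(STORIES, "section")
by_author = group_by(STORIES, "author")
```

The point of the sketch is that both views come from one delivery of the data; the server is not asked to re-sort or re-send anything.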

Even though this scenario seems a bit impractical for the home consumer market today, it doesn’t take a great leap of faith to see it as very feasible in the near future. And it is already practical on intranets and in many business-to-business publishing markets. Corporations will be using RDF to create their own "Yahoo-like" views of their internal Web sites. Commercial publishers will be using RDF and CDF to create site maps and channels. The bandwagon is rolling, and any vendor not planning to join just gave you a good reason to talk to its competitors.

Consider separating form from content. In some applications, design and content are best kept tightly interwoven, but in a surprisingly high percentage of cases, keeping the two distinct yields more flexibility. In the past, that flexibility was overshadowed by the cost and complexity of separating form and content, especially in the context of workflows that intertwined the two as pages were made up and corrected in word processors or desktop publishing programs.

Today, however, the cost and complexity of creating new forms of electronic deliverables are rising, even though the final delivery costs have plummeted. There is increasing pressure to deliver products to the market faster, a trend that shows no sign of abating. Experience has already shown that publishers that keep their content separate from its final presentation can both reduce development costs and cut the time it takes to bring products to market.
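As a rough illustration of what keeping content separate from presentation can mean in practice, the Python sketch below stores one article with no presentation markup and produces two deliverables from it. The record layout and the `render_html` and `render_text` functions are hypothetical, chosen only to show the idea; a real system would be considerably more elaborate.

```python
from html import escape

# Hypothetical content record, stored without any presentation markup.
ARTICLE = {
    "title": "What should publishers do?",
    "body": ["Get serious about metadata.", "Ask your vendor about XML support."],
}


def render_html(article):
    """Web presentation: wrap the stored content in HTML tags."""
    paragraphs = "".join(f"<p>{escape(p)}</p>" for p in article["body"])
    return f"<h1>{escape(article['title'])}</h1>{paragraphs}"


def render_text(article):
    """Plain-text presentation of the same content, e.g. for e-mail."""
    return "\n\n".join([article["title"].upper(), *article["body"]])


html_page = render_html(ARTICLE)
text_page = render_text(ARTICLE)
```

Adding a third deliverable would mean writing a third function, not re-keying or re-editing the article itself.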

For now, the only XML authoring tools are SGML editors, although one might argue that relational databases are also a suitable way to capture fielded data, which could then be exported as tag-delimited files. The shortage of XML editors is likely to ease sometime in 1998. In the meantime, publishers that don’t have SGML tools can begin working on how to capture metadata as part of their editorial process.
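The database route mentioned above might look something like the following Python sketch, which reads fielded data from a relational table and writes it out as tag-delimited text. Everything specific here is an assumption for illustration: the `stories` table, its columns, the `story` element name and the `export_rows_as_xml` helper are hypothetical, and an in-memory SQLite database stands in for whatever system a publisher actually uses.

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_rows_as_xml(conn, table, record_tag):
    """Return every row of `table` as XML, with one element per field.

    Assumes `table` is a trusted name and that the column names are
    valid XML element names.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [col[0] for col in cursor.description]
    root = ET.Element(table)
    for row in cursor:
        record = ET.SubElement(root, record_tag)
        for name, value in zip(columns, row):
            ET.SubElement(record, name).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical fielded data held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stories (headline TEXT, section TEXT, author TEXT)")
conn.execute(
    "INSERT INTO stories VALUES (?, ?, ?)",
    ("Storm closes harbor", "Local", "Ortiz"),
)
xml_text = export_rows_as_xml(conn, "stories", "story")
```

Each column becomes an element named after the field, so the metadata travels with the content instead of staying locked inside the database.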