The ICE Protocol: Automating the Exchange of Syndicated Content
October 30, 1998
The Seybold Report on Internet Publishing
Vol. 3, No. 3
THE PROPOSED Information and Content Exchange protocol (ICE) is finally seeing the light of day. Efforts to write the specification seemed to be moving at a glacial pace, an impression fed in part by the lack of public information about what was going on. The authoring group has been meeting behind closed doors since the initiative was announced last February. Out of the public eye, this group has been sharing its work with an advisory council that has grown to 70 companies. During the last week of October, the draft specification was formally made public and submitted to the W3C.
The ICE specification is concerned with content distribution based on business rules. These rules govern the day-to-day distribution and manipulation of content, as well as how it is handled on a site. ICE-based systems will manage and automate syndication relationships, data transfer and results analysis. Combining industry-specific vocabularies and business logic, ICE promises to be the basis for complete solutions for content partners interested in syndicating any type of information.
The ICE protocol addresses the problem of sharing content between publishers and their business partners. On the surface, this may seem like a fairly trivial concern, but in reality, it is a complex problem that currently requires considerable mucking about. Each project is essentially a one-off development effort with little crossover utility. Programmers must combine Perl scripts, cron jobs, conversion routines and even brute-force screen-scraping to extract and convert content to the specifications of the site.
The ongoing management of the process can also be difficult due to mundane problems. (Do they publish on Columbus Day? Was there an image with that story?) The result is phone calls and E-mail between technical staffs to confirm details and schedule resending of the data.
ICE provides a protocol for server-to-server content exchange that works in conjunction with XML document type definitions (DTDs). There are several other XML-based protocols for content distribution (CDF and OSD come to mind), but ICE's server-to-server operation and syndication focus should keep it distinct from other efforts.
The ICE group expects that vertical industries will create XML DTDs that contain the vocabulary specific to their businesses. Already there are a number of efforts to establish industry-specific DTDs that provide common vocabularies for data and document exchange: RosettaNet, Ontology.org and CommerceNet are all promoting targeted vocabularies in industries such as aerospace, electronics and health care. The ICE spec can take advantage of any existing or emerging DTDs; the protocol itself is indifferent to the format of the content.
It's important to recognize that ICE is geared toward automating content exchange, and it is not a replacement for the actual deal-making and associated human interaction. People will still need to meet and set up the details of the subscription and data formats, and to resolve monetary and business issues. Once these issues are resolved, ICE forms the basis for streamlining the routine data exchange between business partners.
ICE assumes that two parties are involved in a business relationship. The syndicator produces content that is consumed by subscribers. In ICE parlance, the subscriber is always another business and never an end user.
The protocol is just that -- a specification defining the rules for data transfer. Both the subscriber and syndicator will be running ICE tools: applications to process the messages and to pass the data on to a content-management system or to create some form of intermediate files for transformation to site-specific formats. The tools may be integrated with the content-management system or they may be independent of it. ICE tools transfer information in both directions, so the syndicator can receive information on how content is accessed on subscriber sites.
A key facet of ICE is its request/response model. All communications are paired; that is, every request requires a response even if it does not seem logical to give one. Each request and response has a unique ID. Subscriptions begin when the subscriber's ICE tool requests a catalog of offerings from the syndicator. The tools can automatically negotiate certain parameters, such as delivery schedule or delivery mode.
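The request/response pairing described above can be sketched in a few lines of code. This is a minimal illustration, not the spec itself: the element names (`ice-payload`, `ice-request`, `ice-get-catalog` and so on) and attribute names are assumptions modeled on the draft's general shape, and a simple counter stands in for whatever unique-ID scheme a real implementation would use.

```python
import itertools
import xml.etree.ElementTree as ET

# A simple counter standing in for globally unique request/response IDs.
_ids = itertools.count(1)

def make_catalog_request(subscriber_id):
    """Build a hypothetical ICE-style payload asking for the syndicator's
    catalog of offerings -- the message that begins a subscription."""
    payload = ET.Element("ice-payload", {"payload-id": str(next(_ids))})
    header = ET.SubElement(payload, "ice-header")
    ET.SubElement(header, "ice-sender", {"sender-id": subscriber_id})
    request = ET.SubElement(payload, "ice-request", {"request-id": str(next(_ids))})
    ET.SubElement(request, "ice-get-catalog")
    return payload

def make_response(request_payload, code, text):
    """Every request is paired with a response; the response echoes the
    request's unique ID so the two can be matched up."""
    req = request_payload.find("ice-request")
    response = ET.Element("ice-response", {
        "response-id": str(next(_ids)),
        "responds-to": req.get("request-id"),
    })
    ET.SubElement(response, "ice-code", {"numeric": str(code), "phrase": text})
    return response

req = make_catalog_request("acme-news")
resp = make_response(req, 200, "OK")
```

The point of the pairing is bookkeeping: because the response carries the request's ID, either party can detect a request that never drew an answer.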
ICE is not concerned with the format of the data being sent nor the style of individual elements. It's important to note that ICE also does not delve into commerce transactions -- invoicing, funds transfer and so forth -- which are the province of EDI.
Elements of ICE
The protocol covers four general types of operations: subscription management, data delivery, event logs and miscellaneous functions.
Subscription management involves the mechanics of the subscription process. Parameters include start and stop dates, time and frequency of delivery, delivery method (push or pull) and alternate delivery addresses.
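The subscription parameters listed above map naturally onto a small record type. The field names below are illustrative, not taken from the spec; this is just a sketch of the kind of state an ICE tool would keep for each subscription.

```python
from dataclasses import dataclass
from datetime import date, time

@dataclass
class Subscription:
    """One syndication subscription, holding the parameters the ICE
    draft covers. All names here are illustrative assumptions."""
    offer_id: str
    start: date            # subscription start date
    stop: date             # subscription stop date
    delivery_time: time    # time of day deliveries occur
    frequency_hours: int   # how often content is delivered
    mode: str              # delivery method: "push" or "pull"
    delivery_url: str      # alternate delivery address, if any

    def active_on(self, day: date) -> bool:
        """Is the subscription live on a given day?"""
        return self.start <= day <= self.stop

sub = Subscription("offer-42", date(1998, 11, 1), date(1999, 10, 31),
                   time(6, 0), 24, "pull", "http://example.com/inbox")
```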
ICE also provides a means to transfer metadata about content-usage limitations, urgency, copyright and whether the producer must be credited. ICE does not enforce any of these rules; it relies on the subscriber's content-management system, human spot checks of the site, and the contract drawn up between the two parties concerning the syndication agreement.
Data delivery is the ICE protocol's raison d'etre. Once business partners commit to an ICE implementation, they exchange content in packages using ICE-compliant tools. ICE defines a sequenced package model that permits packages to contain either full or incremental updates to the content. Multiple content items can be sent in a single message, but care must be taken to keep the package size within practical limits to avoid huge downloads. Subscriber activity is tracked, so the syndicator always knows what items need to be sent to maintain the current version of the content.
Content can be placed directly in the ICE data stream or referenced indirectly as a URL. In the latter case, the subscriber's ICE tool requests that URL from the syndicator. If an item is damaged during transmission, the subscriber can request another delivery.
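The sequenced-package model can be sketched as follows. The class and method names are illustrative, not from the spec; the idea is simply that the syndicator remembers the last sequence number each subscriber confirmed, so it knows whether an incremental update suffices or whether the subscriber has fallen far enough behind that a full refresh is cheaper.

```python
class PackageSequencer:
    """Toy model of ICE-style sequenced packages: incremental updates
    carry sequence numbers, and the syndicator tracks what each
    subscriber has confirmed receiving."""

    def __init__(self, max_increments=10):
        self.updates = []       # list of (seq, items): published increments
        self.confirmed = {}     # subscriber -> last confirmed sequence number
        self.max_increments = max_increments  # practical package-size limit

    def publish(self, items):
        """Record a new incremental update with the next sequence number."""
        self.updates.append((len(self.updates) + 1, list(items)))

    def next_package(self, subscriber):
        """Decide what this subscriber needs to be current."""
        last = self.confirmed.get(subscriber, 0)
        pending = [items for seq, items in self.updates if seq > last]
        if len(pending) > self.max_increments:
            # Too many missed increments: resend everything in one package.
            return "full", [i for _, items in self.updates for i in items]
        return "incremental", [i for items in pending for i in items]

    def confirm(self, subscriber, seq):
        """Subscriber acknowledges receipt through sequence number seq."""
        self.confirmed[subscriber] = seq
```

With state like this on the syndicator side, "the syndicator always knows what items need to be sent" falls out of a dictionary lookup rather than phone calls between technical staffs.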
Tracking anomalies and diagnosing problems are a vital part of syndication management; persistent problems need to be tracked down and corrected. The ICE spec lays out a number of events that are written to log files, each reported as the familiar pairing of a three-digit numeric code and a text string found in HTTP server logs. ICE allows event logs to be automatically distributed between partners, but this can be suppressed for security purposes.
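A sketch of that logging style follows. The specific codes and phrases below are invented for illustration in the HTTP-log idiom the article describes; they are not the spec's actual event table.

```python
# Hypothetical event codes in the three-digit-code-plus-text style the
# draft describes. These values are illustrative, not from the spec.
EVENT_CODES = {
    200: "OK",
    331: "Package omitted",
    401: "Unrecognized sender",
    500: "Internal server error",
}

def log_event(code, detail, share_with_partner=True):
    """Format one event-log line. Returning None when sharing is off
    models ICE's option to suppress automatic log distribution for
    security purposes."""
    line = f"{code} {EVENT_CODES.get(code, 'Unknown')}: {detail}"
    return line if share_with_partner else None
```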
ICE also provides a mechanism for supporting miscellaneous operations, such as sending messages destined for system administrators or debugging and troubleshooting.
ICE supports the notion that there may be constraints placed on content. It would be possible to limit a category, such as banner ads, to GIF files of no more than x pixels by y pixels, for example, or to limit headlines to 128 characters. ICE provides a way to simply reference constraints in a subscription. Packages that don't meet the assigned criteria are not processed, and an error message is returned to the sender.
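A constraint check of the kind just described might look like the sketch below. Since no constraint language existed yet, the constraint names (`max_headline`, `max_pixels`, `format`) are pure assumptions; the point is only the behavior the article describes: items that violate the subscription's criteria cause the package to be rejected with an error for the sender.

```python
def check_constraints(item, constraints):
    """Return (ok, errors) for one content item against subscription
    constraints -- e.g. banner ads limited to GIFs of a given size, or
    headlines capped at 128 characters. Constraint names are invented
    for illustration."""
    errors = []
    if "max_headline" in constraints and \
            len(item.get("headline", "")) > constraints["max_headline"]:
        errors.append("headline too long")
    if "max_pixels" in constraints:
        w, h = item.get("size", (0, 0))
        mw, mh = constraints["max_pixels"]
        if w > mw or h > mh:
            errors.append("image exceeds %dx%d" % (mw, mh))
    if "format" in constraints and item.get("format") != constraints["format"]:
        errors.append("wrong file format")
    return (not errors, errors)
```

A failing check would translate into an error response to the sender rather than a processed package.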
Because there is no existing specification for defining constraints, the authoring group intends to define an ICE-constraint language in the future, in the hope that it might lead to a more general way to describe content constraints.
The original working group included representatives from Firefly and planned to address the exchange of user profile information via the P3P initiative. Microsoft acquired Firefly early in the development of the spec and took over Firefly's position in the authoring group. While ICE can be used to transport user profiles (or any type of data for that matter), P3P is not implemented in ICE. The spec places the responsibility for enforcing profile privacy principles at the application level, not the transport level.
ICE payloads are transmitted as HTTP POST requests. Security is provided at the transport level via SSL, PGP or S/MIME. Applications can also use digital certificates to sign content sent within the ICE protocol, with verification performed at the application level. It's also possible to restrict access to subscription information via a log-in and password mechanism.
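Preparing such a delivery might look like the sketch below. The endpoint URL is hypothetical, and modeling the log-in/password mechanism as HTTP basic auth is an assumption of this sketch, not something the draft dictates; using an `https://` URL is what supplies the SSL transport security mentioned above.

```python
import base64
import urllib.request

def send_ice_payload(url, payload_xml, username=None, password=None):
    """Package an ICE payload as an HTTP POST request. An https:// URL
    gives transport-level security via SSL. The optional credentials
    are sent as HTTP basic auth (an assumption for illustration)."""
    req = urllib.request.Request(
        url,
        data=payload_xml.encode("utf-8"),
        headers={"Content-Type": "text/xml"},
        method="POST",
    )
    if username is not None:
        token = base64.b64encode(f"{username}:{password}".encode()).decode()
        req.add_header("Authorization", f"Basic {token}")
    return req  # a caller would pass this to urllib.request.urlopen()

req = send_ice_payload("https://example.com/ice", "<ice-payload/>",
                       "subscriber-1", "secret")
```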
Implementations of ICE will vary in quality. Content-management vendors are certain to build ICE tools into their product lines. Business partners using the same systems are likely to be able to create richer implementations of ICE that pour content directly into the content-management database. We assume that content-management systems that implement the spec should be able to interact with each other with some additional effort. The spec assumes that, in some cases, the ICE tools may not be of equal quality and allows unsolicited messages to be pushed to what are referred to as "minimal subscribers." Minimal subscribers are defined as those that do not have a persistent ICE server.
Vignette has been in the forefront of the development of the ICE spec, with Neil Webber, Vignette's chief technical officer, serving as the editor of the specification document. Vignette has dedicated a section of its Web site to the ICE protocol and announced earlier this year that it would be introducing new products that support the protocol.
It's no surprise, then, that Vignette timed the introduction of its Syndication Server to coincide with the release of the spec at the ICE summit. Syndication Server will be available in the first quarter of 1999 and pricing starts at around $50,000.
Inso, like Vignette, has recognized that ICE will help solve some of its customers' syndication problems. It already has been supporting content exchange via XML and its Command Line Interface, which opens DynaBase to third-party scripting languages, including Perl and Visual Basic. While the CLI is not the most user-friendly way to build a syndication product, it does offer companies with access to programming talent a way to implement ICE content syndication.
FutureTense is also planning to develop syndication tools based on ICE. The company is likely to focus on tools that are integrated with its IPS content-management system rather than stand-alone applications.
ShiftKey has a Java-based syndication application, SiClone, that is currently being used by Reuters and TheStreet.com to deliver content to their partners. SiClone runs on any platform with a Java Virtual Machine, and since it runs over HTTP, it can pass through firewalls and proxy servers. Version 1.1 of SiClone is not ICE-compliant, because it lacks a syndicator component. It tracks packages by time and date stamps rather than maintaining a dialog on the state of the content.
However, an interesting feature of ShiftKey's syndicator-less implementation is that SiClone can be used as an aggregation server without any software on the syndicator side. This is the opposite of ICE's approach of a strong syndicator and, optionally, a minimal subscriber. This method enables a publisher to access public-domain records, such as the Federal Register or state statutes from government servers, and republish the information. The application operates only in scheduled pull mode; it does not support push.
SiClone goes beyond the ICE spec of just transporting content. It includes a transformation engine that implements CSS2 style sheets. With it, subscribers can alter content to fit the requirements of the site before storing it in a JDBC-compliant database or file system. SiClone also supports transformations of streaming media formats, including NetShow, RealAudio and RealVideo, changing the file-location information embedded in the media so that it can run from subscribers' servers.
At the ICE summit, the company announced the beta of SiClone 2.0, which will support the full ICE protocol. SiClone is priced at $1,500 for a single copy.
The ICE specification is a significant step forward for those trying to automate the distribution of content on the Web. While it is not a silver bullet for those engaged in the conversion process, ICE will provide a stable platform on which developers can build transformation tools that process large quantities of material without human intervention. Assuming that a few vendors create successful products, there should be considerable reuse of the tools and processes, and the economies of scale that this provides for publishers should be dramatic.
Online syndication promises to be a major market for publishers. Dave Mathison, VP of development at Reuters New Media, who has been using SiClone to syndicate content for about a year, said that revenues for online-delivered content have "surpassed traditional markets by a dramatic amount" at Reuters. Many magazine and newspaper publishers that once syndicated to a few online services now have innumerable syndication opportunities on the Web, and ICE could play a key role in helping to make that business profitable.
Vignette has done a good job of shepherding the spec through the authoring stages and soliciting feedback from interested parties via the advisory council. Vignette clearly has a vested interest in defining a specification that its software supports, but it has made sure that competing vendors have ample opportunity to create ICE-compliant syndication servers, a strategy that should foster healthy competition among vendors and draw a wide base of support among content producers.
Though it is back on track as an important initiative, ICE is not finished: It must clear the W3C -- a process that could take up to a year and could result in changes to the specification. But the release of the initial protocol makes this an opportune time to comment on the proposal and to begin evaluating systems for implementation.