eSyndication: Heterogeneity Rules!
July 17, 2000
If you have read syndicated columns in newspapers or watched syndicated television serials, then you are already familiar with the concept of syndication. Syndication is a business model widely used in the print and broadcast industries, and it is rapidly making its way into the Internet world.
The idea behind it is that content producers syndicate the content they create to other companies, instead of directly delivering it to the end user. This frees the producers to focus on creating quality content, and leaves the distribution and delivery problems to the newspaper or television company. Newspapers or TV companies receive content from several producers to get high quality content at a manageable cost of production. There are also companies that re-syndicate, typically by customizing or repackaging the aggregated content, thereby forming a content distribution chain.
Syndication is becoming increasingly relevant to organizations doing business on the Internet. In this article, we'll use the term "eSyndication" to refer to syndication over the Internet.
The eSyndication model is not restricted to digital content; it can be extended to a variety of other digital goods and services. In this article, we first explore the range of applications where eSyndication can be used effectively, the current solutions used to implement eSyndication, and the technical challenges it poses. We then discuss how XML plays a significant role in easing the problems of eSyndication and describe what one should look for in a good eSyndication solution.
The term "syndication" has been used in various contexts, with varying connotations and meanings. Here we follow the terminology proposed by the XML syndication protocol ICE (Information and Content Exchange), and use the following working definitions. eSyndication is the process where a syndicator (content producer or distributor) delivers content (any digital data) to a subscriber (content aggregator or destination) according to an agreed-upon recurring schedule (based on time or business rules). The syndicator is said to syndicate content to the subscriber, and the subscriber is said to subscribe to content from the syndicator.
Alternatively, eSyndication can be described as a subscription-based content exchange, where a subscription is an agreement or relationship in which a syndicator agrees to deliver certain content to a subscriber according to a certain recurring schedule.
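As a rough illustration of these working definitions, a subscription can be modeled as a small record that names the syndicator, the subscriber, the content, and a recurring schedule. The sketch below is not taken from ICE. The class name, field names, and the simple time-based schedule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Subscription:
    """Agreement that a syndicator delivers content to a subscriber on a schedule.

    All field names are hypothetical and chosen only for illustration.
    """
    syndicator: str
    subscriber: str
    content: str             # label for the content being delivered
    interval: timedelta      # time-based recurring schedule
    last_delivery: datetime

    def is_due(self, now: datetime) -> bool:
        """Return True once a full interval has passed since the last delivery."""
        return now - self.last_delivery >= self.interval


sub = Subscription(
    syndicator="news-producer",
    subscriber="portal",
    content="headline-news",
    interval=timedelta(hours=1),
    last_delivery=datetime(2000, 7, 17, 9, 0),
)
```

A schedule driven by business rules rather than time would replace `is_due` with a check against whatever event triggers the delivery.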
The eSyndication model can be applied to a wide range of problems, especially enterprise-to-enterprise content exchange. Here are some of the most common applications:
- Internet portals subscribe to headline news, stock quotes, weather news, etc., from various sources.
- eCommerce portals subscribe to catalog and inventory information from suppliers.
- Manufacturing enterprises syndicate product information and data sheets to their distributors, resellers and partners.
- Financial research houses syndicate their research and financial data to customers and partners.
- Intranet portals subscribe to content and product updates from their various internal departments.
To solve the problem of syndication, enterprises and medium-sized businesses alike have developed ad-hoc and proprietary solutions. These range from web scraping to email, FTP, and in some cases even physical shipment of the data on tape or CD-ROM. If you are syndicating content to one or two partners, you can perhaps get away with an ad-hoc solution. But as the number of exchange partners in your network increases, you will soon discover the painful limitations of ad-hoc solutions:
- They are rarely reliable, and cannot guarantee the timeliness of deliveries.
- They are rarely scalable. Since each link between exchange partners is built as a one-off, there is little re-use of infrastructure, and hence adding new partners to the network becomes very time consuming and expensive.
- They often require human intervention, with very little automation.
One of the biggest challenges to deal with is the heterogeneity of your partner network. This heterogeneity stems from the variety of content storage systems, content formats, and transport protocols in use today.
Content storage systems vary from simple file systems or relational databases to sophisticated content management systems. Content formats can be HTML, XML, CSV, image and video formats like JPEG or MPEG, document formats like PDF, Word, and Excel, and so on. The protocols used to ship the content include HTTP, FTP, SMTP, and IP-based proprietary protocols. In addition, there are XML-based protocols like ICE, BizTalk, and so on.
In a typical syndication scenario, you have the content in one storage system and one format, and your partner wants the same content shipped to him or her in another format through some transport protocol, and will store the received content in a different storage system. Add several partners to your network and the problem of eSyndication becomes very interesting!
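To make the format mismatch concrete, here is a minimal sketch of one such conversion: catalog data held as CSV is re-expressed as XML for a partner. The element names `catalog` and `item` are hypothetical, and the sketch assumes that every CSV column header is already a valid XML element name.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text: str) -> str:
    """Convert CSV rows into a <catalog> of <item> elements (hypothetical names)."""
    root = ET.Element("catalog")
    for row in csv.DictReader(io.StringIO(csv_text)):
        item = ET.SubElement(root, "item")
        # Each column becomes a child element named after the column header.
        for column, value in row.items():
            ET.SubElement(item, column).text = value
    return ET.tostring(root, encoding="unicode")


source = "sku,name,price\nA100,Widget,9.95\n"
xml_text = csv_to_xml(source)
```

Each additional partner format needs another converter of this kind, which is why the number of conversions grows quickly as partners are added.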
Besides supporting heterogeneous storage systems, formats, and protocols, what is equally important is the ability to effectively manage relationships with your partners. Syndication relationships (like most relationships!) are dynamic. Let us say that today you are sending HTML files to one of your partners using FTP. As your partner's business needs and IT infrastructure grow, they may not be willing to deal with FTP and HTML any more, but instead require content structured in XML and shipped over HTTP (and any confidential information shipped over HTTPS). Situations like this mean a good management system is required as an integral part of the syndication solution.
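One way to keep such changes manageable is to hold each partner's format and transport as data rather than code. The sketch below illustrates the FTP/HTML to HTTPS/XML switch described above. The `DeliveryProfile` class, the partner name, and the formatter and transport tables are all hypothetical. The transports are stubs that only record what would be sent, and the formatters skip markup escaping for brevity.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class DeliveryProfile:
    """Per-partner delivery settings (hypothetical structure)."""
    partner: str
    content_format: str   # e.g. "html" or "xml"
    transport: str        # e.g. "ftp", "http", or "https"


def to_html(doc: dict) -> str:
    return f"<h1>{doc['title']}</h1><p>{doc['body']}</p>"


def to_xml(doc: dict) -> str:
    return f"<article><title>{doc['title']}</title><body>{doc['body']}</body></article>"


FORMATTERS: Dict[str, Callable[[dict], str]] = {"html": to_html, "xml": to_xml}

# Stub transports: record (transport, partner, payload) instead of using the network.
outbox: List[Tuple[str, str, str]] = []


def make_stub_transport(name: str) -> Callable[[str, str], None]:
    def send(partner: str, payload: str) -> None:
        outbox.append((name, partner, payload))
    return send


TRANSPORTS = {name: make_stub_transport(name) for name in ("ftp", "http", "https")}


def deliver(profile: DeliveryProfile, doc: dict) -> None:
    """Format the document and hand it to the transport named in the profile."""
    payload = FORMATTERS[profile.content_format](doc)
    TRANSPORTS[profile.transport](profile.partner, payload)


doc = {"title": "Q3 update", "body": "New data sheets."}
profile = DeliveryProfile(partner="acme", content_format="html", transport="ftp")
deliver(profile, doc)

# The partner's needs change: only the profile is edited, not the delivery code.
profile = replace(profile, content_format="xml", transport="https")
deliver(profile, doc)
```

In this arrangement a management interface only needs to edit profiles, so changing a relationship does not require a programmer.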
Scalability is a big challenge, and is often a missing piece in most home-grown solutions. You should ask yourself if your system scales with the number of partners in your network, the volume of content exchanged, and the frequency of updates. The most well-known aspect of scalability is the ability of the server to handle the load. An oft-ignored aspect of scalability, however, is ease of maintaining partner relationships. How easy is it to add a new partner to the network or modify the existing relationship? If the process of adding and maintaining relationships is laborious and requires a programmer, then your system does not scale well--even though your server may be capable of handling the load.
eSyndication differs from traditional syndication in that it is easier to automate by means of computer-to-computer communication. For two computers to communicate, they have to agree on some standard protocol. On the surface, it appears that a binary standard is a good way for computers to talk to each other. Technologies like CORBA and DCOM are based on binary standards, but they are not suitable for enterprise-to-enterprise communication, as they call for tight and expensive integration. Given the heterogeneity of computer hardware, operating systems, computer languages, network hardware, and network protocols, it often turns out that good old text is the best medium for standardizing business communication. This makes XML the natural choice for handling eSyndication.
Will XML solve the problem of heterogeneity? The widespread acceptance of XML over HTTP will certainly ease the problem of heterogeneity, but will not totally eliminate it. We will still be left with multiple XML formats to deal with. However, it is a lot easier to process XML, and we are seeing the market on its way to being flooded with tools to help us with XML transformations!
XML is not the answer by itself, but it is a means to an end. XML offers a framework for defining standards. Most XML standards today describe the format in which the data is laid out. There are also XML standards that are protocols, that is, they describe a grammar and a sequence for two computers to talk to each other. To automate eSyndication, we need an XML-based protocol.
The main XML-based protocol standard developed specifically for eSyndication is ICE (Information and Content Exchange). ICE can be used to ship content of any type, be it HTML, images, audio, video formats, or other XML data. It was designed for automating media content syndication, but can be used to syndicate catalogs and other kinds of content as well. Media content is mostly distributed as HTML and multimedia files. ICE does not specify how your content should be structured. If you are syndicating content in XML, then you have to agree on a format with your partners. It is better to adopt an emerging XML standard than to invent your own. Which XML standards you choose will depend on the vertical industry you are in. For example, NewsML and NITF are emerging XML standards in the news industry.
With the emergence of electronic "market places," catalog syndication is gaining prominence. Each market place has its own way of dealing with catalogs. Some have their own XML-based formats, while others deal with catalogs in simple CSV (Comma-Separated Values) format. In addition to syndicating catalog descriptions, you have to syndicate the pre-negotiated special prices for buyers and partners. For catalog syndication, there is an alphabet soup of emerging standards like cXML, xCBL, RosettaNet, eCX, and OCF, as well as a host of proprietary formats. Unlike ICE, these standards are very specific in their description of the catalogs.
If you are looking for a vendor-supplied syndication solution, here is a check-list of features you should keep in mind:
- Adaptability: Support for easy integration with your content repositories like file systems, databases, or other content management systems.
- Multiple Protocols: Support for multiple transport protocols like HTTP, HTTPS, FTP, and SMTP, and the ability to add other protocols.
- XML Support: Native support for popular XML standards and support for "on the fly" XML transformations.
- Automation: Scheduling, or business-rule-based triggers. Support for both push and pull modes of delivery.
- Subscription Management: User-friendly GUI to create and maintain partner relationships. A browser-based UI will eliminate the need for installing client software.
- Scalability: Scale with the number of partners, amount of data moved, frequency of updates, number of concurrent deliveries.
- Efficiency: Send incremental updates to your content instead of sending all the content at every delivery.
- Security: Support for standard security features like firewalls, SSL, encryption, and Virtual Private Networks (VPNs).
- Reporting: A detailed log of every delivery for preparing business reports.
- Reliability: Retry delivery if your partner's server is down. Notify and log missed deliveries.
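Two items from this checklist, efficiency and reliability, lend themselves to a short sketch. The code below computes an incremental update by comparing content digests against those of the last delivery, then retries a failed delivery and logs each miss. The function names, the in-memory log, and the fixed attempt count are assumptions for illustration. A real system would also track deleted items and wait between attempts.

```python
import hashlib
from typing import Callable, Dict, List

log: List[str] = []   # stand-in for a delivery log


def digest(content: str) -> str:
    """Fingerprint an item's content so changes can be detected."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def incremental_update(items: Dict[str, str], delivered: Dict[str, str]) -> Dict[str, str]:
    """Return only the items that are new or changed since the last delivery.

    `delivered` maps item id -> digest of the version last sent.
    """
    return {key: value for key, value in items.items()
            if delivered.get(key) != digest(value)}


def deliver_with_retry(send: Callable[[Dict[str, str]], None],
                       payload: Dict[str, str], attempts: int = 3) -> bool:
    """Try to send the payload up to `attempts` times, logging every failure."""
    for attempt in range(1, attempts + 1):
        try:
            send(payload)
            return True
        except ConnectionError as exc:
            log.append(f"attempt {attempt} failed: {exc}")
    log.append("delivery missed")
    return False


delivered = {"a": digest("old A"), "b": digest("B")}
items = {"a": "new A", "b": "B", "c": "C"}
update = incremental_update(items, delivered)   # only "a" and "c" need sending

calls = {"n": 0}


def flaky_send(payload: Dict[str, str]) -> None:
    """Simulated partner server that is down for the first two attempts."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("partner server is down")


ok = deliver_with_retry(flaky_send, update)
```

Here the unchanged item "b" is never resent, and the delivery succeeds on the third attempt with the two failures recorded in the log.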
One increasingly important question to ask is whether the product can be hosted, or whether you have to install the software yourself. Application hosting can be convenient, and is gaining in popularity. Additionally, if you enforce a certain look and feel across all your systems, then make sure you can easily customize the software. Finally, ask which platforms are supported and ensure you don't limit your future options.
Enterprises are not isolated digital islands any more, and the need to exchange content with other organizations is growing in importance. Syndication is a simple but powerful model to exchange content such as catalogs, product data, financial data, and training materials with your business partners.
The biggest challenge of syndication is to deal with the heterogeneity of content formats and transport protocols used today. XML over HTTP offers a very affordable and easy way to automate syndication, and it promises to ease the heterogeneity problem.
If you are a content producer, syndicating your content can create new revenue channels. If you are a supplier, sooner or later you will find yourself syndicating your catalogs and product data to your own web site, to multiple market places, to other e-commerce sites, and to your partners. It is important to have a good syndication solution in place to streamline the delivery of catalogs to diverse destinations.
Dr. Mani Manickam is VP of Engineering at arcadiaOne, Inc.