XML.com: XML From the Inside Out


ROME in a Day: Parse and Publish Feeds in Java

February 22, 2006

Ready to parse and publish RSS and Atom feeds in Java? In this step-by-step tutorial, we'll show you how to pull in an existing feed, add your own content, and publish the results in a new format, all in 100 lines of code. (200 lines with whitespace and comments.)

Knowing that RSS and Atom feeds are "just" XML, you might think that parsing and creating syndicated feeds in Java should be a snap. Pick any one type of RSS, and you might be right. Unfortunately, there are at least ten flavors of RSS and Atom out there: RSS 0.90, RSS 0.91 Netscape, RSS 0.91 Userland, RSS 0.92, RSS 0.93, RSS 0.94, RSS 1.0, RSS 2.0, Atom 0.3, and the newest addition to the bunch, Atom 1.0. Then there are all the namespace modules, like Dublin Core, Media, and so on. It's all messy enough to make a grown programmer cry. Wipe those tears, Java developers, and say hello to ROME.

When in ROME

In this tutorial, we'll be using ROME to do all the heavy lifting. ROME is an open source (Apache licensed) Java library designed to make it easy to parse and create syndicated feeds, regardless of format. In fact, ROME supports all of the variants of RSS and Atom mentioned earlier.

ROME doesn't just come with features; it also has a proven track record on sites like My AOL, CNET Networks, and Edmunds.com. The Powered By ROME wiki page describes how ROME is being used in these and other applications.

The basic approach of ROME is to parse any RSS or Atom feed item into a canonical bean interface. This lets you as a developer manage fairly homogeneous item beans regardless of their original format. Even better, ROME makes it easy to create a new RSS or Atom feed, using those very same beans. This tutorial is going to show you how to do just that.
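
To make the idea of a canonical item bean concrete before we bring in ROME itself, here is a toy sketch that uses only the JDK's built-in DOM parser. It is not ROME's API or implementation: the `CanonicalItems` class and its nested `Item` bean are hypothetical names invented for this illustration, and it handles only the title and link of RSS 2.0 items and Atom 1.0 entries.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical illustration only; ROME's real beans are far more complete.
final class CanonicalItems {

    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    /** Format-neutral item: the same shape whether it came from RSS or Atom. */
    static final class Item {
        final String title;
        final String link;

        Item(String title, String link) {
            this.title = title;
            this.link = link;
        }
    }

    /** Parses an RSS 2.0 or Atom 1.0 document into format-neutral items. */
    static List<Item> parse(String xml) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse feed", e);
        }

        Element root = doc.getDocumentElement();
        List<Item> items = new ArrayList<Item>();
        if ("rss".equals(root.getLocalName())) {
            // RSS 2.0: <item><title>..</title><link>..</link></item>, no namespace
            NodeList nodes = root.getElementsByTagName("item");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                items.add(new Item(text(e.getElementsByTagName("title")),
                                   text(e.getElementsByTagName("link"))));
            }
        } else if ("feed".equals(root.getLocalName()) && ATOM_NS.equals(root.getNamespaceURI())) {
            // Atom 1.0: <entry><title>..</title><link href=".."/></entry>, namespaced
            NodeList nodes = root.getElementsByTagNameNS(ATOM_NS, "entry");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                NodeList links = e.getElementsByTagNameNS(ATOM_NS, "link");
                String href = links.getLength() == 0
                        ? "" : ((Element) links.item(0)).getAttribute("href");
                items.add(new Item(text(e.getElementsByTagNameNS(ATOM_NS, "title")), href));
            }
        } else {
            throw new IllegalArgumentException("Unsupported feed root: " + root.getNodeName());
        }
        return items;
    }

    /** Returns the trimmed text of the first node in the list, or "" if the list is empty. */
    private static String text(NodeList nodes) {
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String rss = "<rss version=\"2.0\"><channel><title>Feed</title>"
                + "<item><title>Hello</title><link>http://example.com/1</link></item>"
                + "</channel></rss>";
        String atom = "<feed xmlns=\"http://www.w3.org/2005/Atom\"><title>Feed</title>"
                + "<entry><title>Hello</title><link href=\"http://example.com/1\"/></entry>"
                + "</feed>";
        for (String xml : new String[] {rss, atom}) {
            for (Item item : parse(xml)) {
                System.out.println(item.title + " -> " + item.link);
            }
        }
    }
}
```

Even with just two formats and two fields, the format-specific branches pile up quickly. Multiply that by ten formats plus namespace modules and you can see why handing the job to a library is attractive.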

Warming Up

To illustrate how to use ROME, we are going to mimic some features made popular by FeedBurner, a site which provides feed hosting and statistics for RSS and Atom publishers. FeedBurner itself doesn't use ROME (as far as I know), so we are going to mimic their end product, not their process.

FeedBurner offers a service called FeedFlare, by which publishers can add a contextual footer to each item in their RSS or Atom feed. (This is a great example of the Immediate Action pattern.) The links in the FeedFlare footer are built using data from the feed items, and allow the reader to easily email a link, bookmark the item in del.icio.us, and so on. Figure 1 shows a FeedFlare footer as displayed in NewsAlloy:

Figure 1. FeedBurner adds a FeedFlare footer to an RSS item.
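
To give a feel for how such a footer can be built from nothing more than an item's title and link, here is a minimal sketch using only the JDK. Everything in it is an assumption made for illustration: the `FooterBuilder` class, the link labels, and the query-string layouts of the `mailto:` and del.icio.us links are not taken from FeedBurner's actual markup.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper; the URL layouts below are illustrative assumptions.
final class FooterBuilder {

    /** Percent-encodes a value for use inside a URL query string. */
    static String encode(String value) {
        try {
            // URLEncoder writes spaces as '+'; "%20" also works inside mailto: links.
            // A literal '+' in the input has already become "%2B", so this is safe.
            return URLEncoder.encode(value, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Builds an HTML footer with "email" and "bookmark" links for one feed item. */
    static String buildFooter(String title, String link) {
        String t = encode(title);
        String l = encode(link);
        // '&' is written as "&amp;" because these URLs sit inside HTML attributes.
        String emailHref = "mailto:?subject=" + t + "&amp;body=" + l;
        String bookmarkHref = "http://del.icio.us/post?url=" + l + "&amp;title=" + t;
        return "<p>"
                + "<a href=\"" + emailHref + "\">Email this</a> | "
                + "<a href=\"" + bookmarkHref + "\">Bookmark in del.icio.us</a>"
                + "</p>";
    }

    public static void main(String[] args) {
        System.out.println(buildFooter("Hello World", "http://example.com/a?b=1"));
    }
}
```

The important detail is the encoding step: titles and links routinely contain spaces, ampersands, and question marks, and pasting them raw into another URL produces broken links.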

To demonstrate how easy it is to use ROME, this tutorial will show you how to play the part of an--ahem--FeedWarmer. You will pull in any RSS x.x or Atom x.x feed of your choosing, read key information from each feed item, add an interactive footer, and then republish the results in a new format.

(If you have ever looked at the XML differences between RSS 1.0, RSS 2.0, and Atom 1.0, you'll realize that changing feed formats is no small feat. ROME, thankfully, handles all the hard work.)
