XML.com: XML From the Inside Out


What Is Atom

October 26, 2005

The Atom Syndication Format is the next generation of XML-based file formats, designed to allow information--the contents of web pages, for example--to be syndicated between applications. Like RSS before it, Atom places the content and metadata of an internet resource into a machine-parsable format, perfect for displaying, filtering, remixing, and archiving.

In This Article:

  1. The Preservation of Metadata
  2. Constructs
  3. What's to Come?

This year, it seems, marked a turning point in the world of syndication formats. The collection of formats that started it all, RSS, has reached out of the tech world and into the mainstream. It's rare, nowadays, to find a news site or weblog that doesn't offer some flavor of feed. RSS is supported within Apple's Safari browser, within the next version of Windows, and by an ever-growing mass of applications, both desktop and web-based. Its growth has been remarkable.

But as a technology grows, its shortcomings become more apparent. The different versions of RSS are good for various applications: RSS 2.0 is very useful for simple syndication and ad hoc hacking, and RSS 1.0 is the most commonly deployed application of the complex Semantic Web technology, RDF. Neither format is perfect, though. RSS 2.0 is too loosely defined, and RSS 1.0, conversely, too complicated. And so, over the past three years, a volunteer development team has been building a format called Atom, which provides a formally structured, well-documented system solely for the syndication of entire news articles and the like, as well as their respective payloads of metadata.

One of the key differences between the development of RSS and the development of Atom is that Atom's whole design process has been held in the open, on the Atom-Syntax mailing list and on the Atom wiki. The wiki is a great place to find the latest developments, issues, ideas, and pointers to the latest specification documents. It is well worth exploring if you are interested in the history of the specification and want to see why features are as they are.

Now, though, the specification is being formalized. On August 23, 2005, the Atom Syndication Format became a proposed standard at the Internet Engineering Task Force (IETF), after it was submitted by the AtomPub Working Group. This article covers that version, and it is the one you should work with from now on. Older implementations might use versions 0.3 or 0.4 of the Atom format; those are now out of date.

So what are the differences between Atom and RSS? Apart from the process used to build the specification, and the rigor of the documentation, there are two main substantive changes: the Preservation of Metadata and the concept of Constructs.

The Preservation of Metadata

The key issue when syndicating data is to make sure that you don't lose any information in the process. Apart from the document's content itself, we're also interested in preserving the fundamental metadata about the document, namely:

  1. What it is called
  2. Who created it
  3. When it was created
  4. Where it is

We can know all of these things automatically, and really should keep them, but the different versions of RSS do not preserve this data by default. RSS 2.0, for example, doesn't require a date, an author, or a URI at all.

Atom, on the other hand, is specifically designed to never lose any data. To see this, take a look at this example of an Atom feed. This, as with all the code examples in this article, is taken from the official developer's guide:

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

   <title>Example Feed</title>
   <link href="http://example.org/"/>
   <author>
     <name>John Doe</name>
   </author>

   <entry>
     <title>Atom-Powered Robots Run Amok</title>
     <link href="http://example.org/2003/12/13/atom03"/>
     <summary>Some text.</summary>
   </entry>

</feed>
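
To make the four metadata questions concrete, here is a minimal Python sketch that reads a feed like this one and pulls out what each entry is called, who created it, when it was last updated, and where it is. It is an illustration only, not part of the Atom specification or the developer's guide: the function name entry_metadata and the dictionary keys are hypothetical, and the sample feed embedded in the code adds placeholder updated and id values that are assumptions made for the sketch. It also assumes that an entry without its own author falls back to the feed-level author.

```python
import xml.etree.ElementTree as ET

# Namespace from the feed element in the article's example.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Illustrative feed. Titles, links, and the author name follow the article's
# example; the <updated> and <id> values are placeholders assumed for this sketch.
SAMPLE_FEED = """<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <link href="http://example.org/"/>
  <updated>2005-10-26T00:00:00Z</updated>
  <author>
    <name>John Doe</name>
  </author>
  <id>http://example.org/</id>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link href="http://example.org/2003/12/13/atom03"/>
    <id>http://example.org/2003/12/13/atom03</id>
    <updated>2005-10-26T00:00:00Z</updated>
    <summary>Some text.</summary>
  </entry>
</feed>
"""


def entry_metadata(feed_xml):
    """Return one dict per entry answering: what, who, when, where."""
    # Parse from bytes so the encoding declaration in the prolog is honored.
    root = ET.fromstring(feed_xml.encode("utf-8"))

    # Assumption for this sketch: entries without an author inherit the feed's.
    feed_author = root.findtext("atom:author/atom:name", namespaces=ATOM_NS)

    results = []
    for entry in root.findall("atom:entry", ATOM_NS):
        link = entry.find("atom:link", ATOM_NS)
        entry_author = entry.findtext("atom:author/atom:name", namespaces=ATOM_NS)
        results.append({
            "what": entry.findtext("atom:title", namespaces=ATOM_NS),
            "who": entry_author or feed_author,
            "when": entry.findtext("atom:updated", namespaces=ATOM_NS),
            "where": link.get("href") if link is not None else None,
        })
    return results


if __name__ == "__main__":
    for item in entry_metadata(SAMPLE_FEED):
        print(item)
```

Any key that comes back as None marks a piece of metadata the feed failed to carry, which is exactly the kind of loss Atom is designed to prevent.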

