Catching Up with the Atom Publishing Protocol
by Joe Gregorio
Back to the Future
This isn't very different from the protocol described in draft-gregorio-09, which is what the WG started with almost two years ago. The language of gregorio-09 is a bit rough, and certainly not proper specification text, but the basic ideas are there. What was then called the EntryURI is now called a member resource. In the intervening years the FeedURI and PostURI have merged into a single collection resource. The basic editing model hasn't changed: we use the four basic methods of HTTP (GET, PUT, POST, and DELETE) to shuffle Atom Entries around.
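To make the editing model concrete, here is a minimal in-memory sketch of a collection and its member resources. Nothing here comes from the draft itself: the `Collection` class, the URI scheme, and the specific status codes are assumptions chosen for illustration, and a real implementation would sit behind an HTTP server rather than a Python dictionary.

```python
import itertools


class Collection:
    """In-memory stand-in for an APP collection resource (hypothetical model)."""

    def __init__(self, uri):
        self.uri = uri
        self.members = {}               # member URI -> Atom Entry document (str)
        self._ids = itertools.count(1)  # assumed URI scheme: <collection>/<n>

    def post(self, entry_xml):
        """POST to the collection: create a new member resource."""
        member_uri = f"{self.uri}/{next(self._ids)}"
        self.members[member_uri] = entry_xml
        return 201, member_uri          # assumed: 201 Created plus the new URI

    def get(self, member_uri):
        """GET a member resource: return its current representation."""
        if member_uri not in self.members:
            return 404, None
        return 200, self.members[member_uri]

    def put(self, member_uri, entry_xml):
        """PUT to a member resource: replace its representation."""
        if member_uri not in self.members:
            return 404, None
        self.members[member_uri] = entry_xml
        return 200, entry_xml

    def delete(self, member_uri):
        """DELETE a member resource: remove it from the collection."""
        if self.members.pop(member_uri, None) is None:
            return 404, None
        return 200, None
```

The point of the sketch is the split of responsibilities: POST goes to the collection and yields a new member URI, while GET, PUT, and DELETE all operate on that member URI.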
One of the big differences from gregorio-09 is the absence of SOAP. A SOAP binding may appear some day, but it was considered too much to have that in the core protocol. There were also some proposals to use WebDAV. While those proposals were not accepted, they did have some influence and the WG adopted the term collection from the WebDAV spec. While the APP is not based on WebDAV, there seems to be general agreement that we should peacefully coexist and that a WebDAV collection can also be an APP collection.
You've Got Issues
Now, it may look pretty simple, but there are quite a few issues that have to be dealt with.
Who Controls the Elements in the Entry?
When creating or editing an entry using an Atom Entry, there is the basic question of which side controls which element in the Atom Entry. For example, the client should probably be in control of the atom:title. On the other hand, two elements cause some concern when considered in the context of the protocol. The first is atom:id. As the spec says,
The atom:id element conveys a permanent, universally unique identifier for an entry or feed.
Right. So what does that mean when creating an entry? Does that mean the client has to create a globally unique atom:id just to POST a new entry? Should the server overwrite that atom:id just to guarantee that the atom:id is unique? Remember the use case for this restriction: entries that were duplicated in multiple feeds wouldn't be displayed more than once in your aggregator. How do I transfer my entries from one blogging system to another and keep the atom:id the same?
Consider the following scenarios:
We have collection A and collection B. What should happen if I copy an entry "a" from collection A to collection B? Should that entry be given a new universally unique atom:id or should it keep the old one? It would be nice if we kept the same atom:id because that would avoid users seeing duplicate entries in their aggregator.
If copying "a" from A to B was done because we were migrating content from one system to another, or adding content into a different category, then keeping the atom:id is the desired behavior.
On the other hand, what if I start editing entry "a" in collection B while keeping the original in collection A unchanged? What if I changed it significantly? Then I would have two entries with the same atom:id but diverging content. That's not very consonant with the idea of a "permanent, universally unique identifier for an entry or feed."
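The tension in these scenarios can be sketched with a toy aggregator that displays each atom:id only once. This is an illustration under stated assumptions, not how any particular aggregator works: `make_entry`, `aggregate`, and the example identifier are all hypothetical, and collections are modeled as plain Python lists of entries.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"   # Atom namespace in ElementTree notation


def make_entry(atom_id, title):
    """Build a bare-bones Atom Entry with only an id and a title."""
    entry = ET.Element(ATOM + "entry")
    ET.SubElement(entry, ATOM + "id").text = atom_id
    ET.SubElement(entry, ATOM + "title").text = title
    return entry


def aggregate(*collections):
    """Return the titles to display, showing each atom:id only once."""
    seen = set()
    shown = []
    for collection in collections:
        for entry in collection:
            atom_id = entry.findtext(ATOM + "id")
            if atom_id in seen:
                continue                 # treated as a duplicate and skipped
            seen.add(atom_id)
            shown.append(entry.findtext(ATOM + "title"))
    return shown


# Entry "a" in collection A, plus two possible copies in collection B.
a = make_entry("urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a", "Original")
copy_same = make_entry(a.findtext(ATOM + "id"), "Original")
copy_edited = make_entry(a.findtext(ATOM + "id"), "Heavily rewritten")

print(aggregate([a], [copy_same]))    # the straight copy is suppressed
print(aggregate([a], [copy_edited]))  # so is the diverged one
```

In the first call the shared atom:id does what we want: the reader sees the entry once. In the second, the same rule hides the rewritten entry entirely, which is exactly the problem with two diverging entries sharing one identifier.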
There are also issues with atom:updated.
The atom:updated element is a Date construct indicating the most recent instant in time when an entry or feed was modified in a way the publisher considers significant.
Let's repeat that last bit for emphasis -- "was modified in a way the publisher considers significant."
We're designing a protocol here; what the heck are we supposed to do with that? Is the client or server considered "the publisher"? And how exactly do they determine "significant"?
Introspection
Remember I said at the beginning that collections contain members that have representations as Atom Entries? I lied. There are actually two types of collections, one that contains Atom Entries and the other called a media collection that contains any type of media: images, PDFs, etc. The mechanics of dealing with media collections are the same as entry collections: POST a representation to the collection to create a new member resource; GET, PUT, and DELETE on the member resources to edit. The only difference is that the representation of a member doesn't have to be an Atom Entry.
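As a rough sketch of that one difference, the model below lets an entry collection accept only Atom Entries while a media collection accepts any representation. The class name, the "entry"/"media" flag, and the choice of a 415 response for a rejected POST are assumptions made for illustration; the draft may well specify the rejection differently.

```python
ATOM_TYPE = "application/atom+xml"


class SketchCollection:
    """Hypothetical model: 'entry' collections take Atom Entries only,
    'media' collections take any representation."""

    def __init__(self, uri, kind):
        self.uri = uri
        self.kind = kind        # "entry" or "media"
        self.members = {}       # member URI -> (content type, body bytes)

    def post(self, body, content_type):
        """POST a representation to the collection to create a member."""
        if self.kind == "entry" and content_type != ATOM_TYPE:
            return 415, None    # assumed response: Unsupported Media Type
        member_uri = f"{self.uri}/{len(self.members) + 1}"
        self.members[member_uri] = (content_type, body)
        return 201, member_uri


entries = SketchCollection("http://example.org/entries", "entry")
media = SketchCollection("http://example.org/media", "media")
```

Everything else about the two kinds of collection is meant to be identical, which is why the only branch in the sketch is the content-type check.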
A weblog is more than just a single list of entries. There may be a media collection, or you could have a link blog, a book list, and a music list. How does your blogging client discover all the collections associated with your blog? And what if you have multiple blogs associated with a single account, like on some third-party weblog hosting services?
Here we have the battle of the formats. There have been proposals to use all of the following: