Dreaming of an Atom Store: A Database for the Web
by Joe Gregorio
An Atom Store
The idea of an Atom Store has been bouncing around the blogosphere for a bit now, though not always called by that name. Jesse Andrews points out a few of the sources of inspiration, and as far as I know he was the first person to use the term "Atom Store":
Mark Pilgrim's magicline and monkey do could use it to store data
Rohit Khare & Ben Sittler at Commerce.net have been working on requirements of an Atom Store.
Joe Gregorios[sp], author of Atom Publishing Protocol, is researching it.
You can even hear Google's Adam Bosworth request it on IT Conversations, hoping MySQL folks don't become Oracle as Oracle doesn't scale the way an Atom Store could scale.
The range of applications being talked about here is breathtaking. Monkeydo and magicline would use an Atom Store as a remote persistence mechanism for a Greasemonkey script. Contrast that with the ideas Adam Bosworth is talking about: databases that scale the way Google's GFS does today.
It's All About the REST
That's a huge range of applications, but I think such a thing could happen. There are several forces driving it. First, you and I have lots of data, and it's stored in lots of places. I have my weblog, my email, my subscriptions to all my syndication feeds, maybe del.icio.us and Flickr accounts, and so on, and so on. You are not going to combine those all into one big, happy service. Ever.
I want my choices, and even if you are a big company and end up being able to provide all those services under one brand, I doubt I would trust all that data in one place. Instead of consolidating services, what syndication over the past five years suggests is that I can aggregate feeds from all those places into a single dashboard that lets me see the status of my far-flung data empire at a glance. Now if all those sources of data not only supplied a feed, but also supported the interface of an Atom Store, that passive view changes into a real dashboard -- not only are those entries viewable, but they're editable from one spot.
Yes, I know that some aggregators support search, and some even support some of the current blogging APIs, but that's very different from every source being searchable and editable. An aggregator is only going to be able to search across entries that have appeared since it started subscribing to that feed, and not any earlier ones.
The other advantage of an Atom Store is that it's built on top of RESTful services. That means that we get the advantages of REST -- caching, uniform interfaces, and hypermedia as the engine of application state. For both OpenSearch and the APP there is an XML document that describes the capabilities of each endpoint. They are self-describing. That allows another service to come along and wrap several Atom Stores together by reading those description documents and then presenting itself as an Atom Store, an aggregate of all the stores it uses. That aggregate store could be a melange of your disparate data: your weblog, your email, etc. On the other hand, it could be a uniform series of servers, each with a subset of a huge store: now you're building a monster database.
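To make the wrapping idea concrete, here is a minimal sketch in Python. It is not an implementation of OpenSearch or the APP: the names `Entry`, `MemoryStore`, and `AggregateStore` are hypothetical, in-memory lists stand in for real HTTP endpoints, and the step of reading description documents is left out. The only point it illustrates is that an aggregate can expose the same search interface as the stores it wraps, so aggregates can themselves be aggregated.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass(frozen=True)
class Entry:
    """A stripped-down stand-in for an Atom Entry (hypothetical type)."""
    id: str
    title: str
    updated: str  # timestamp as in atom:updated; assumed to be UTC ("Z") here


class Store(Protocol):
    """Anything that can be searched the way this sketch's stores can."""
    def search(self, term: str) -> List[Entry]: ...


@dataclass
class MemoryStore:
    """Stands in for a single remote Atom Store."""
    entries: List[Entry] = field(default_factory=list)

    def search(self, term: str) -> List[Entry]:
        needle = term.lower()
        return [e for e in self.entries if needle in e.title.lower()]


@dataclass
class AggregateStore:
    """Wraps several stores and presents the same interface they do."""
    stores: List[Store]

    def search(self, term: str) -> List[Entry]:
        seen = set()
        merged = []
        for store in self.stores:
            for entry in store.search(term):
                if entry.id not in seen:  # drop duplicates by entry id
                    seen.add(entry.id)
                    merged.append(entry)
        # Same-format UTC timestamps sort correctly as plain strings.
        return sorted(merged, key=lambda e: e.updated, reverse=True)


weblog = MemoryStore([
    Entry("tag:example.org,2006:1", "Dreaming of an Atom Store",
          "2006-01-10T12:00:00Z"),
])
mail = MemoryStore([
    Entry("tag:example.org,2006:2", "Re: Atom Store requirements",
          "2006-01-12T08:30:00Z"),
])

# An aggregate can wrap plain stores and other aggregates alike.
everything = AggregateStore([weblog, AggregateStore([mail])])
results = everything.search("atom store")
```

Whether the wrapped stores hold a melange of personal data or uniform shards of one huge collection makes no difference to the caller, which is the property the paragraph above depends on.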
"Just" Use a Database
Aren't these just the same promises made in the early days of SQL? Sure they are, but I think an Atom Store has a better chance of living up to the hype, for two reasons. First, the data model is not wide open like SQL's; the format is pretty restricted as far as the core elements of Atom are concerned. Second, the query and update operations are not nearly as comprehensive as SQL's. If you want to point to SQL as the only reasonable way to query over gigabytes of data, I'll just point to Google or Yahoo as counterexamples.
It's Not All Puppies and Roses
Now that I've got you all worked into a lather over how great the world will be with Atom Stores on every street corner, let me splash a little cold water in your direction. I've kind of glossed over some areas that need work. Some of the open questions are:
- Indexing: Does indexing have to be immediate for the idea to be beneficial?
- Annotating: How do you know where to POST to for creating new entries vs. annotations?
- Creation: If I POST a new Entry to an aggregate of a bunch of Atom Stores, which of those Atom Stores should it be created in? How should I route that POST?
- Foreign Markup: Let's say I wanted to use an Atom Store for storing all the customer transactions in my e-commerce store. To do that effectively I may have to add some extra information to an Atom Entry to fully represent a transaction. How and where is that information stored and indexed? Do I start creating microformats for all of that data or do I stuff it in the Entry as foreign markup? How much indexing of foreign markup is useful? Do we need specialized indexing and search terms for that?
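The foreign markup question is easier to weigh with an example in hand. The sketch below uses Python's standard `xml.etree.ElementTree` to build an Atom Entry that carries one extension element for a transaction total. The `http://example.org/ns/transaction` namespace, the `total` element, its `currency` attribute, and the helper names are all made up for illustration; nothing here is a proposed vocabulary.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
TXN = "http://example.org/ns/transaction"  # hypothetical extension namespace

ET.register_namespace("atom", ATOM)
ET.register_namespace("txn", TXN)


def atom(name):
    """Qualified tag name in the Atom namespace."""
    return f"{{{ATOM}}}{name}"


def txn(name):
    """Qualified tag name in the made-up transaction namespace."""
    return f"{{{TXN}}}{name}"


def build_transaction_entry(entry_id, title, updated, amount, currency):
    """Build an Atom Entry with one foreign (non-Atom) child element."""
    entry = ET.Element(atom("entry"))
    ET.SubElement(entry, atom("id")).text = entry_id
    ET.SubElement(entry, atom("title")).text = title
    ET.SubElement(entry, atom("updated")).text = updated
    author = ET.SubElement(entry, atom("author"))
    ET.SubElement(author, atom("name")).text = "Example Shop"
    # The transaction detail rides along as foreign markup.
    total = ET.SubElement(entry, txn("total"), {"currency": currency})
    total.text = amount
    return entry


def foreign_elements(entry):
    """Return the direct children of an entry that are not in the Atom namespace."""
    return [child for child in entry if not child.tag.startswith(f"{{{ATOM}}}")]


entry = build_transaction_entry(
    "tag:example.org,2006:order-1001", "Order 1001",
    "2006-01-15T09:00:00Z", "19.99", "USD",
)
xml_text = ET.tostring(entry, encoding="unicode")

# A round trip: the core Atom elements and the extension are both recoverable.
parsed = ET.fromstring(xml_text)
extras = foreign_elements(parsed)
```

A store that understands only the Atom core can still index `title` and `updated` here; whether it should also index what `foreign_elements` returns is exactly the question left open above.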
As you can see there's plenty of work to be done. Let's roll up our sleeves and make it happen.