Microformats and Web 2.0
by Micah Dubinko
Microformats Community and Process
In that earlier XML.com article, I dipped my toe into the waters of microformat design, proposing a format called the Exam Markup Language, or Examl. Most folks first heard of it when they read the column. Turns out that was a mistake.
A microformats wiki page titled "So you wanna develop a new microformat?" lists the steps one should follow. Note the emphasis on transparency throughout:
- Document current behavior
- Propose
- Iterate
Before starting work on a microformat, a fair amount of research and discussion needs to happen, generally in public. If only one person has ever worked on it, a microformat probably won't succeed. Research done, the next step is to propose the format in an appropriate forum, seeking copious feedback. Repeat as necessary.
The wiki asks two further guiding questions:
- If I looked at this microformat in a browser that didn't support CSS or had CSS turned off, would it still be human readable?
- Are this format's elements stylable with CSS?
To put it another way, those annoying non-semantic elements wither and fade under the scrutiny of the microformats community. This outlook enforces a proper view of markup as an intention-carrying component, not a presentational shortcut.
While I'm confessing past sins, I also wrote that "some gray areas remain. For example, is RSS a microformat? It seems to bear at least some of the characteristics of one..." It is true that RSS bears some of those characteristics, but analysis since that article has concluded that RSS is definitely not a microformat.
Most folks, though, won't need to create a microformat--they can use an existing one, secure in the knowledge of how much consideration has gone into its creation.
Influence on the Web
One possible objection is that microformats encourage "screen scraping"; instead of using a carefully crafted Web Services API, people (and their machines) will instead fetch regular pages and struggle on from there.
I asked Casey West about this. He noted that search engine crawlers will, except in very special cases, always prefer to enter a site through the same interface used by regular web browsers, because a single search engine would never be able (or even want) to keep up with all the possible third-party APIs that might exist at any given time. In other words, microformats are a natural companion to REST-philosophy web services, where useful data is only a GET request away. Microformats are human readable, but not at the expense of machine readability. Thus, it's not exactly fair to say one would have to "scrape" a page with microformat data present, since the data is structured and accessible by design. Put simply, microformats tend to work with the grain of the web.
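To make the "structured by design" point concrete, here is a minimal sketch in Python of reading hCard data out of an ordinary XHTML page using only the standard library. The page fragment, the name, and the URL are invented for illustration, and the helper names (`has_class`, `text_of`, `extract_hcards`) are my own. The sketch handles only the `fn`, `org`, and `url` properties and ignores nested cards, so treat it as an illustration and not as a conforming hCard parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment; the person, company, and URL are made up.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>Contact:
      <span class="vcard">
        <a class="url fn" href="http://example.org/~jane">Jane Example</a>,
        <span class="org">Example Corp</span>
      </span>
    </p>
  </body>
</html>"""


def has_class(elem, name):
    """True if `name` is one of the space-separated class values."""
    return name in elem.get("class", "").split()


def text_of(elem):
    """All text inside an element, with whitespace collapsed."""
    return " ".join("".join(elem.itertext()).split())


def extract_hcards(xhtml):
    """Return one dict per element marked class="vcard"."""
    root = ET.fromstring(xhtml)
    cards = []
    for elem in root.iter():
        if not has_class(elem, "vcard"):
            continue
        card = {}
        for child in elem.iter():
            if child is elem:
                continue
            # Text-valued properties: keep the first occurrence of each.
            for prop in ("fn", "org"):
                if has_class(child, prop):
                    card.setdefault(prop, text_of(child))
            # For url, the value lives in the href attribute.
            if has_class(child, "url") and child.get("href"):
                card.setdefault("url", child.get("href"))
        cards.append(card)
    return cards


cards = extract_hcards(PAGE)
print(cards)
```

Nothing here depends on the page's layout or on guessing where the name sits in the text. The class names carry the meaning, which is the difference between this and scraping.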
A closely related question is, what kind of effect might microformats have on browsers and Web 2.0 applications that run in them? I liked West's answer, that basically they don't need to change beyond support for XHTML. Çelik added that a key principle involved is users controlling their own data. In several of his presentations, he asks the audience how many different email clients they've had over their lifetimes. How did the data migration go at each step? Not too well, usually, but intelligent use of microformats could perhaps improve the situation. This especially goes for calendar and address book applications, where existing microformat work is well-established.
Microformat Annoyances
Like any new technology, microformats don't solve every problem, and in fact introduce a few problems of their own. One is the general problem of microcontent, that is, useful units of data at a granularity less than that of individual documents. Many existing content management systems aren't equipped to deal with, say, a single XHTML document that contains 27 hCard instances. As microformats gain prominence, though, microcontent management systems should begin to catch up.
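As a rough sketch of what managing microcontent might involve, the Python below indexes each hCard in a single document under its own address. The addressing scheme (document URL plus a fragment taken from the element's `id`, with a positional fallback) is purely my assumption, not something the microformats community prescribes. The document URL, the names, and the function `index_hcards` are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

DOC_URL = "http://example.org/staff.html"  # hypothetical document address

# Hypothetical XHTML document holding several hCard instances.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml"><body><ul>
  <li class="vcard" id="alice"><span class="fn">Alice Example</span></li>
  <li class="vcard"><span class="fn">Bob Example</span></li>
  <li class="vcard" id="carol"><span class="fn">Carol Example</span></li>
</ul></body></html>"""


def classes(elem):
    """The element's class attribute as a list of names."""
    return elem.get("class", "").split()


def index_hcards(xhtml, doc_url):
    """Map a per-card address to the card's formatted name (fn)."""
    root = ET.fromstring(xhtml)
    index = {}
    position = 0
    for elem in root.iter():
        if "vcard" not in classes(elem):
            continue
        position += 1
        # Assumed scheme: use the id if present, else a positional label.
        fragment = elem.get("id") or f"hcard-{position}"
        fn = next(
            (" ".join("".join(e.itertext()).split())
             for e in elem.iter() if "fn" in classes(e)),
            None,
        )
        index[f"{doc_url}#{fragment}"] = fn
    return index


index = index_hcards(DOC, DOC_URL)
for address, name in index.items():
    print(address, "->", name)
```

The point is only that each card becomes a unit that can be stored, linked to, or updated independently of the document it came from.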
Presently, microformat progress is almost exclusively based on XHTML. Depending on your viewpoint, this may be a strength or a weakness. We'll get to possible alternatives in a bit. In some ways, the microformats movement and community compete with consortia-based standards development, which is slower to adjust to a new, less expansionary era. On the other hand, XHTML 2.0 shows all the signs of being an excellent microformat foundation, if and when browsers support it.
As with any highly intentional language, working with microformats can sometimes be painful; the urge to insert presentational tags can be overpowering. For this reason, working with microformats eventually requires in-depth knowledge of XHTML, CSS, and other XML best practices. Any shortfall in these skills can make it hard to understand why certain things are done the way they are, and how to make effective use of existing tools. Fortunately, the learning curve is not too steep, and the new skills can be added on an incremental, as-needed basis.
Lastly, standards developed as a microformat exist in a more constrained environment--new elements and attributes in general can't be created as needed. This can make versioning, already a hard problem, even worse. Existing microformats are young enough--and focused enough on solving a single small problem--that versioning hasn't become a serious problem. This will be an area to keep close tabs on in the future.
Things to Watch
The new generation of browsers finally supports more than just HTML. Will new microformats arise around SVG, XForms, or other existing markup languages? It's an open question.
Another question is how tightly microformats are (or need to be) bound to browsers. Many instances of full-scale XML vocabulary development fall outside browsers. In any of these cases, would it make sense to apply the microformat treatment to, say, DocBook, OpenDocument, or UBL? Time and community interest will tell.
One more thing to keep an eye on: what is Mark Pilgrim up to? "Or do you just use your browser to browse? That's so 20th century."
The Bottom Line
Vocabulary proliferation is one of the biggest XML annoyances around. If you're like me, your brain can hold three, maybe four markup languages at a time. The microformats way of life prefers reusing existing work wherever possible. Recycled knowledge goes a long way. An active community works to continue progress on specifications, which tend to be easier reading than full-blown committee standards.
RSS is pretty successful today, but it took nearly nine years to get there. In a universe where, instead of RSS, an equivalent microformat started things off, would adoption have happened more quickly?
If you think the answer to that question might be "yes," then microformats are worth a look.