On the Extreme Fringe of XML

August 3, 2005

Roger Sperberg

The "X" in XML stands for "extensible." It doesn't stand for "expert" or "extreme." But when I think of XML I always think of the Extreme Markup Languages conference as the place to become expert in XML. I say it's where the graduate seminars in XML are held. The marketing slogan for Extreme once said it's where presenters don't have to be afraid of being technical. ("A week of geek speak," as the Extreme wiki currently has it.)

When I attended my first Extreme, in 2000, I was new enough to XML that I thought that meant everything would go over my head. That's what I thought was meant by, "Not for beginners, nor the technically faint. This is the edge, the hard bits, the theory behind the practice, the practice that outstrips current theory — the Extreme." Instead I found plenty of people in my circumstances there, people responsible for planning and implementing new directions in their organization, who knew a bit already but not everything, and who wouldn't quail to have a presenter go into specifics. There were lots of geeks and über-geeks, but primarily people who could listen to geeks talk about "everything touched by the question of how best to allow information to describe itself" — the stuff of XML itself.

There was also a strange, extraordinary quality to the conference. When I figured it out, I pulled aside one of the conference chairs (there are five, the same five each year) and said in amazement: "There's nobody in the halls during the presentations! Everyone is sitting in a presentation!" This was no exaggeration. Literally every single attendee was listening to a presentation. This we attributed to two things. The first is scheduling: after every other presentation, there's a 45-minute interlude so that all the private conversations necessary for a conference to succeed can take place without anyone missing anything. At first, that seemed excessive to me; now I regard it as a necessity. No one wants to miss anything.

The other aspect of 100 percent attendance wasn't incredible affinity between audience and subject (which is high but not supernatural), but an awareness, I think, of how focused one's own experience is and how you can't solve all your own problems. I'm saying this poorly, but what I mean is that the technologies I'm dealing with today are ones I first heard of at Extreme three or five years ago and thought at the time to be irrelevant to my problems. I can see now that the sessions I attended to deal with my then-current problems — electronifying a publisher's content — were helpful and saved me a month here and there in resolving things. But I didn't know then what I would need to know now. And I wasn't alone. That's why no one wanted to miss any sessions. Might be on the test!

For instance, I work for a publisher and have always worked for publishers, as an editor and production person in IT support and new technologies. Four years ago, kind of out of nowhere it seemed, the conference organizers invited Elaine Svenonius to come and give a keynote address. Well, you probably have never heard of Svenonius, because she's not exactly a computer person. She's a librarian (well, professor emeritus of library science at UCLA to be precise). Hello? Is this really keynote material? And Allen Renear, another academic, gave a paper the next year on the problems librarians were having identifying electronic files: the old breakdown of work, manifestation, edition and item, whereby you could clearly place The Wizard of Oz, whether you were talking about the conception, play, movie, or specific copy of the book, was breaking down, as it were.

I listened politely. I was not excited, shall we say, but I was there. Luckily for me, as it happens. Last month I wrote a report to my management pointing out that with our electronic, online activity, we were more like a library than a publisher: locating information is the main service we were providing. The problems of locating information are just what librarians have been grappling with for two hundred years.

And now my two bedside books are Svenonius' The Intellectual Foundation of Information Organization (which, despite its formidable title and MIT Press imprint, is really an introductory text for a nonlibrarian like me) and Vanda Broughton's Essential Classification, a straight library-school text. This latter title was recommended to me by someone I met at Extreme, Murray Altheim, who was then modularizing XHTML and who is now grappling with these same issues. And the bibliography of Renear's paper makes up half my current reading list.

That first Extreme I attended was the infamous one where the RDF people and the Topic Map people first approached each other, ready to duke it out, it seemed, and instead emerged warily as new best friends. I was so caught up in the practical end of markup that I concluded, "This is fringe stuff I definitely don't need to know about," and pretty much ignored it. I actually skipped a session!

Today, where I work, we are implementing a topic map-based system. And a major reason I'm going to Extreme this year is a paper Lars Marius Garshol is presenting on interoperability between Topic Maps and RDF using "Quads," or RDF-triples-plus-identity. (Full disclosure: I'm also going because the organizers offered me a deal, and I'm staying at an 1830 B&B that costs only $45 a night, the Alacoque.)


Extreme seems to be the place where extreme notions are freely talked about. Let me quote a closing keynote address that got captured and put online:

Those of you with long memories will remember the late 1980s and early 1990s when some of our number were already convinced that SGML was suitable as a universal modeling language and said so, loudly and in public. The more rational, or at least more conservative, members of our community would say, "Well ... calm down, Eliot. [Audience laughs.] SGML is very good for what it does, but there are some things for which, even though you could do them in SGML, it would be pointless, it would be dumb. For example, to make a graphics format in SGML — no one would do that. [More laughter.] It would be dumb to make a programming language in SGML; nobody would do that." [Laughter.] I come to you from teaching a workshop in XSLT in which the home-run demonstration was using XSLT to generate SVG images, and I submit to you that that is a demonstration of the network effect in action.

No one thought that not being in markup would be a disadvantage for a programming language or a graphics language. But when we tried doing them in markup, we discovered a lot of advantages that we had never suspected. That's the network effect at work.

Well, I'm very glad that Eliot Kimber had those notions about markup, extreme as they were taken to be, and spoke up about them. Extreme is a small forum, intimate almost, and that made it natural that I met Eliot there. I went out to dinner in groups with Jon Bosak, Norm Walsh, Eve Maler, Henry Thompson, Dave Hollander, Steve DeRose, and Michael Sperberg-McQueen, markup pioneers whose names I would otherwise know only from the standards those names appear on. This is, after all, a place to talk about "technical aspects of markup, markup languages, markup systems, and markup applications." (That would be the "M" in XML for those keeping score.)

And, to keep things relative, at the first conference I attended, Syd Bauman of Brown University's Scholarly Technology Group taught a one-day workshop in Python, and it was at Extreme that I first encountered a session on Eiffel and heard of Ruby. Who knew then that these would be such essential tools now? Well, Syd Bauman and Sam Hunting, I suppose, which is why I'm glad I met them at Extreme. Not just theory but languages. (I didn't forget the "L.")


Librarians, markup theorists, programmers, even some actual extremists. There's more to the mix. Five years ago, I thought e-books would be top of the heap by now. I was wrong. I first heard of approaches to overlapping markup (e.g., nonhierarchical and hence non-easy XML) at Extreme. I sense that's a really significant area, but it hasn't had the impact I expected.

The most interesting irrelevancy at that first conference I attended was another out-of-left-field keynote. In it, the speaker searched a photo database for images of "happy people" and one result was a photo captioned, "father watches daughter take first steps." That was mind-blowing: no "happy" in the caption! No "happy" synonym! AI in search, wow! And I learned that Doug Lenat of Cycorp was not so obscure. I would have predicted that Cyc's techniques would have revolutionized search as we know it by, oh, 2005. Well, not right, again.

At other conferences, and I have a publishing bent, I've heard Bill Joy (of Sun), Steve Jobs, Steve Ballmer, John Warnock (of Adobe) and lesser-known luminaries of the biggest players in the computing sphere speak about new directions. These guys don't keynote Extreme. The bigger names have had slick presentations and spoken knowledgeably about where we might be going, realistically it seemed to me, given their companies' influence on things. At Extreme, there's been as much interest in where we are going, and speculation sounds truly authoritative, but the content has most often been the reporting of the pioneers at the fringe, and not the business leaders' visions. And I don't think anyone knows exactly which fringe we'll be at next, but the ones discussed have all looked equally likely.

What's There

This year, the papers are divided into a variety of areas: schemas, Topic Maps and RDF, querying XML documents, XSLT, tools and techniques, theory and philosophy, markup of overlapping hierarchies, and uses and applications of markup, to use the categories and order offered by the official program.

The papers have not only already been written, they're already online. I like that. I could say, oh, the paper on "x" by "y" looks interesting. But the only papers I can say that about come from my short-term perspective. That's what I'd like to know about this year. All the things that I'd like to know about three years down the road, well, my track record at predicting what I need to know about what I don't know yet is mixed. The peer-reviewed approach by Extreme has just as many misses, but way more hits.

Frankly, some years I've preferred the closer-to-hand, stick-to-a-single-track approach I can take at the XML 200x conferences. Usually it's a different set of issues that take me to Extreme, a sense of needing to scan the horizon and know that I have enough time to prepare for what is coming three years or more down the pike. Like everyone, I have to pick and choose which conferences I can afford to attend, and I don't have the time for all that I want to attend. So what's good on the schedule at Extreme this year? Really, who can say yet? (Ask me in three years.) Some schema stuff. Topic Map meets RDF stuff. A keynote on how mainstream matters start out on the fringe (my thought exactly — I'm beginning to think like Tommie Usdin!). XSLT and overlapping markup stuff. Elliotte Rusty Harold. Michel Biezunski. Markup stuff for this and markup stuff for that. It's a conference about markup! There'll be stuff on what I already figure I need to know and stuff on things I don't yet realize I need to know — research stuff and graduate classes in XML. Some of the topics are extremely useful and some extremely abstract. Theory and practice. Fortunately I can find the real experts at hand to help me sort through things, like Steve Newcomb, Lee Iverson, and those Technorati guys. I always come back from Montreal thinking how extraordinary my experience was.

And, in the end, I guess the "X" in Extreme Markup Languages really stands for extensible too, because I always find that my horizons, knowledge, and possibilities have all been extended.

Extreme Markup Languages 2005 is going on now. The conference takes place in Montreal each summer. The conference website has links to descriptions of last year's conference, including Jim Mason's report on