XTech 2005

June 1, 2005

Micah Dubinko

"I understand XML, I just don't understand any of the other things that start with 'X'"--Overheard at XTech 2005

In this XML-Deviant column, I reflect on the recent XTech conference in Amsterdam, May 25-27, 2005. The conference used to be called 'XML Europe', but under the guidance of conference chair Edd Dumbill, this year's event got a new name as well as two new conference tracks: Open Data and Browser Technologies. Thus, instead of keeping a pure XML focus, the event could stretch out and cover related areas such as web applications, weblogs, search, and the semantic web. Even the conference schedule got remixed into several new forms, showing a mix of both open data and browser technology.

Attendees liked this new format, though it did at times create some tough choices over which sessions to attend. As a result, a purely chronological discussion of the sessions would involve several abrupt context switches. So this discussion is grouped by topic rather than following the strict order in which things occurred.

Open Data

Only the first day, Wednesday, started with keynotes. The first came from Paula Le Dieu, who holds several titles, including Executive Director, Science Commons, Creative Commons International. She spoke at length about the BBC's ongoing effort under the banner of Creative Archive. You have to love a talk that includes clips of overdubbed world leaders crooning and a music video created by a huge number of in-game players of Star Wars Galaxies, a popular, massively multiplayer game. By showing off such works, Le Dieu emphasized that people are using new technology to be creative in ways that weren't even imagined a few years ago--but in order for this to work, copyright and other IP laws need to be appropriately balanced. The time is coming soon, if it's not here already, when individuals and small groups can produce new forms of creativity that can compete with mass media.

Overall, the BBC archive has 1.5 million video and film pieces and half a million recordings. Before the BBC could release some of these materials, it needed a communications and legal framework that would let people "rip, mix, and share" them. The license it ended up with is similar to a Creative Commons license, with five main facets:

  1. Non-commercial use only.
  2. Share-alike clause (derived works must be placed under the same license).
  3. Attribution required.
  4. No endorsement and no derogatory use (not for campaigning, soapboxing or to defame others).
  5. Available to broadband users within the UK for use primarily within the UK.

Some of the restrictions beyond vanilla Creative Commons licenses are necessary because of the way many BBC works are originated and funded. However, it will be interesting to see how the UK-only rule works in practice on the inherently borderless internet. Le Dieu did note that the project is experimenting with GeoIP restrictions, and that any more sophisticated kind of DRM simply "doesn't work".


In another session, Michael Brauer, chair of the OASIS OpenDocument committee, talked about OpenDocument, which was recently ratified as an OASIS standard. Work on the file format has been ongoing since 1999, and since 2002 within OASIS. In his words, the format provides a way to store documents so that they will still be readable "in 20 years". Despite this refreshingly conservative stance, the format has taken in several recent XML technologies, including RELAX NG, CSS3, XForms 1.0, and SVG 1.1. OpenOffice 2.0 will use OpenDocument as its primary, native save format. Other packages, including KOffice, are also coming online with support.


Novell's Jon Trowbridge gave an informative and entertaining talk on "Beagle: Free and Open Desktop Search". Beagle is an open source search service much like the recently vaunted Spotlight included in OS X Tiger. Does it work? "More or less. You can run it every day," said Trowbridge. A key strength of the system is that it works mainly with open data formats, reading from the file system, email/contacts (via Evolution), IM (via Gaim), notes (via Tomboy), audio and ID3 tags, image and EXIF metadata, source code, and even PDF. Trowbridge's conclusion and take-home point: "Just like data should be free and open, the way to search it should be free and open."

Compound Documents

Morning sessions on Friday covered the always-challenging area of compound documents and integration. IBM's Kevin Kelly spoke about a model-driven approach to compound documents. The two major categories of compound documents are 1) by reference: for example, with an <object> or <img> tag pointing to an external file such as an SVG image; and 2) by inclusion: for example, with SVG elements appearing inline in an XHTML document. Kelly's work in this area uses the Eclipse framework, which includes an extensive modeling framework, to develop platform-independent, and then platform-specific, models of compound documents. Within Eclipse, this allows new compound document types to be created for editing based on an internal model repository, complete with a list of allowed root elements and directed editing to show what is allowed at any point. Kelly concluded with an intriguing thought: "Couldn't we use these models for other things as well?"
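
To make the two categories concrete, here is a minimal sketch of telling them apart in a single XHTML host document. It is not taken from Kelly's talk or from the Eclipse tooling: the sample document, the file name chart.svg, and the helper names split_name and classify are all assumptions made for illustration, and Python's standard ElementTree parser stands in for whatever a real editor would use.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical sample document: one SVG image by reference, one by inclusion.
DOC = f"""\
<html xmlns="{XHTML_NS}">
  <body>
    <object data="chart.svg" type="image/svg+xml"/>
    <svg xmlns="{SVG_NS}" width="10" height="10">
      <circle cx="5" cy="5" r="4"/>
    </svg>
  </body>
</html>"""


def split_name(tag):
    """Split ElementTree's '{namespace}local' tag into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


def classify(root):
    """Return (by_reference, by_inclusion) lists for an XHTML host document."""
    by_reference = []   # external resources named by <object data> or <img src>
    by_inclusion = []   # namespaces of foreign subtrees appearing inline

    def walk(element):
        namespace, local = split_name(element.tag)
        if namespace != XHTML_NS:
            # A foreign-namespace element marks the top of an inline island;
            # record it once and do not descend into it.
            by_inclusion.append(namespace)
            return
        if local == "object" and "data" in element.attrib:
            by_reference.append(element.attrib["data"])
        elif local == "img" and "src" in element.attrib:
            by_reference.append(element.attrib["src"])
        for child in element:
            walk(child)

    walk(root)
    return by_reference, by_inclusion


references, inclusions = classify(ET.fromstring(DOC))
print(references)   # ['chart.svg']
print(inclusions)   # ['http://www.w3.org/2000/svg']
```

The sketch only detects which style is in use; the harder questions about how the combined vocabularies should behave together are exactly what it leaves out.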

Immediately following Kelly's session, Mark Baker talked about the ongoing work within the W3C Compound Documents Working Group. Unfortunately, an official working draft wasn't available by the time of the conference, leaving little to talk about, but the long Q&A session that followed proved useful.

Overall, the open data movement shows signs of being a healthy, thriving, growing community. The early struggles of the Compound Documents group, however, point to deeper problems, which I suspect are related to inadequate scoping of the initial XML namespaces work. One of the original visions for XML has always been the ability to combine all different sorts of vocabularies into structured documents, in a semantically meaningful way. XML namespaces helped solve the potential problem of conflicting element names, but didn't anticipate bigger problems like cross-vocabulary DOM access, event flow, combinatorial media type explosions, or content negotiation. It's a testament to the strength of open data that it has attained so much traction, even in the current environment. The Compound Documents Working Group has quite a challenge ahead.

Browser Technology

The other keynote to open the conference came from Mike Shaver of the Mozilla Foundation, who asked the question "What if the next big thing on the web isn't a big thing at all?" He talked about "little bangs": localized, sharp, incremental technology changes that take advantage of the installed base. As examples he cited the <canvas> element (a programmable image tag), SVG, and browser extensions like Greasemonkey. He predicted that little bangs to come include better accessibility, drag and drop, richer widgets, platform fidelity, and an offline model.


Steven Pemberton's talk continued on a similar theme, with the title "XHTML2: Accessible, Usable, Device Independent, and Semantic", delivered to a completely packed audience that overflowed out into the hallway. Pemberton emphasized that XHTML 2.0 makes significant strides in accessibility and usability, while retaining a structure and feel that will be instantly familiar to existing HTML authors.

RDF and XHTML have had a rocky history, but a solution may be near. Pemberton described how XHTML2 uses familiar <link> and <meta> elements to represent useful RDF triples in an easy-to-grasp (and DTD-validatable) syntax, something that "won't send HTML authors screaming in the other direction". Further, XHTML 2.0 allows metadata anywhere in the document, not just in the header, which should encourage authors to create more metadata. In short, lots of long-standing metadata problems have been solved. In a similar vein, a new role attribute allows common substructures, say a toolbar or a blogroll, to be cleanly and unambiguously identified.
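
As a rough sketch of the idea, the snippet below reads <meta> and <link> elements as subject-predicate-object triples about the enclosing document. The attribute usage shown (property/content on <meta>, rel/href on <link>), the sample values, the document URI, and the helper name triples are assumptions made for illustration; this is a simplification, not a quotation of the XHTML 2.0 draft syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the attribute usage is a simplification for
# illustration, not a quotation from the XHTML 2.0 draft.
DOC = """\
<head>
  <meta property="dc:creator" content="Jane Author"/>
  <link rel="dc:rights" href="http://example.org/license"/>
</head>"""


def triples(head, subject):
    """Yield (subject, predicate, object) for each <meta> and <link> child."""
    for element in head:
        if element.tag == "meta" and "property" in element.attrib:
            # <meta> carries a literal value as the object.
            yield subject, element.get("property"), element.get("content")
        elif element.tag == "link" and "rel" in element.attrib:
            # <link> points at another resource as the object.
            yield subject, element.get("rel"), element.get("href")


result = list(triples(ET.fromstring(DOC), "http://example.org/doc"))
for triple in result:
    print(triple)
```

The appeal is that an HTML author writes nothing more exotic than elements they already know, while a consumer can still recover machine-readable statements.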

Pemberton said XHTML 2.0 will enter Last Call "real, real soon".

A lively Q&A session followed. Notable responses: Consensus has played a major role in shaping XHTML 2.0. So has interaction between Working Groups, including deciding when another group should be in charge--for example, the CSS Working Group would need to address the request for special style treatment of role attributes. The HLink specification was basically a position statement that XML meeting their requirements really was possible. QNames in attribute values is a ship that has already sailed, since XSLT 1.0 and XML Schema. Significant effort continues to be expended on reducing the negative impact XML namespaces have on HTML authors, who are "allergic" to them.

Extending HTML4

Immediately following, Ian Hickson presented a contrasting talk, "Proposing Extensions to HTML4 and the DOM", highlighting some of the work by the Web Hypertext Application Technology Working Group, or WHAT WG, a loose coalition between Opera, Apple, and Mozilla. The proposals covered a huge surface area, from a <datagrid> widget to <canvas>, from drag-and-drop to contentEditable as already implemented in IE, from a <section> element almost exactly like that in XHTML 2.0 to new elements like <aside> or <footer>. A working title for the set of extensions is HTML5. True, browser vendors had caused trouble years ago by racing ahead with their own elements, but this time things are different, since there are specifications. From the audience Steven Pemberton, a veteran himself of the earlier browser wars, commented that they had forgotten the <irony> tag.

During this session, I found the IRC backchannel to be particularly incisive. IRC participants pulled no punches in pointing out apparent inconsistencies in the talk: the extensions are called "proposals" but are nearly completely implemented, and the process is described as "open" while the roster is invitation-only.

An even livelier Q&A session followed, which was captured by Leigh Dodds, a former XML-Deviant columnist, on the XTech Wiki. Some highlights: many folks would like a 3D version of <canvas>. Much of the WHAT WG feedback has said that divergence from (X)HTML is not of any concern. In fact, Hickson stated that of all the potential client-side technologies in the same general space, only a single one may survive in the long term, so making compromises to align better with others is too costly. Assessing what features go into HTML5 is largely a personal judgment call, though a few others on the WHAT WG charter have veto power. The mailing list has about 400 subscribers, of which about two dozen are active. Every comment given will receive a personal response. Microsoft was invited into the group, but didn't respond. Mobile vendors and others haven't expressed a large amount of interest in participating. Dean Edwards of IE7 fame has expressed interest, and should be able to implement 90% of HTML5 for IE.


With all that action going on, it's easy to lose track of all the other browser technology covered at the conference. Rob Relyea of Microsoft spoke about XAML to another packed crowd. Microsoft is emphasizing the platform aspects of the markup language, as well as making it usable for a "10-ft UI"--the type of simple interface that's usable from a greater distance from the screen.

News networks like CNN call TV "interactive" because they have nice graphics, transparency effects, and so on. XAML is adding all the same sorts of features. At its heart, though, XAML is an interchangeable XML representation of underlying .NET objects. This tight coupling effectively forecloses clean separation of content from presentation, as well as reuse of existing vocabularies like XHTML or CSS.


Immediately following the XAML talk, Google's Ben Goodger presented on XUL, the Mozilla markup language used to provide the user interface throughout several Mozilla product lines. The talk had more of a bare-metal developer feel, as opposed to the polished, marketing-slick feel of the prior talk. XUL has the major advantage of working cross-platform, as well as a proven deployment with millions of applications in use. XBL, the binding language used with XUL, adds an additional layer of functionality. The big news is that Firefox 1.5 will be a full "XULRunner" application, which will make it very easy to deploy entire XUL applications in a lightweight fashion.


Later, Brian Ryner dug even deeper into XUL and XBL, and gave an introduction to XTF, the Extensible Tag Framework, a set of APIs enabling things like a pluggable XForms module for Mozilla and Firefox.

Oliver Steele presented about Laszlo, the open source XML vocabulary and Flash interface for web applications. Like the XAML team, the Laszlo folks have put a huge amount of work into a "broadcast quality" interface experience, complete with tasteful animation and partial transparency effects. As a former commercial product, it required a licensed server; however, now that it's an open source product, it is available in a server-less profile.

Jeff Bennett talked about xfy (pronounced X-fie), an XML document management solution from Justsystem, makers of the most popular Japanese-language word processor. As a stand-alone app, xfy is able to understand several different vocabularies, including XForms and SVG, and intelligently process them in combination. Several attendees compared it to X-Smiles, though as a commercial application it had a great deal more polish.

As someone personally contemplating new development of a web application, I find the number and quality of options available to be staggering. Much food for thought.

And we talked about XML too

While the new parts of the conference were popular, the core of the conference was still XML, and the presentations there did not disappoint.

The topic of XML pipelining was all the rage about a year ago, though it seems to have died down a bit lately. So it was refreshing to see a solid technical presentation from Jeni Tennison extolling the virtues of "Managing Complex Document Generation through Pipelining". The presentation traced out the interaction between designers, developers, and authors. Often, by structuring a workflow into smaller, specific transformations in a pipeline, all the parties involved will be happier and have better-defined (and easier-to-debug) individual pieces to manage. While not a panacea, it can be a palliative, and new features in XSLT 2.0 make defining a pipeline even easier.
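
The shape of such a pipeline can be sketched in a few lines. This is only an illustration under stated assumptions: Tennison's talk concerned XSLT transformations, whereas here plain Python functions over ElementTree stand in for the individual stylesheets, and the step names (drop_drafts, number_sections, build_toc), the run_pipeline helper, and the sample document are all hypothetical.

```python
import xml.etree.ElementTree as ET
from functools import reduce


def drop_drafts(root):
    """Step 1: remove sections marked status="draft"."""
    for section in root.findall("section[@status='draft']"):
        root.remove(section)
    return root


def number_sections(root):
    """Step 2: give each remaining section a sequential number attribute."""
    for index, section in enumerate(root.findall("section"), start=1):
        section.set("number", str(index))
    return root


def build_toc(root):
    """Step 3: prepend a table of contents built from the section titles."""
    toc = ET.Element("toc")
    for section in root.findall("section"):
        entry = ET.SubElement(toc, "entry", number=section.get("number"))
        entry.text = section.findtext("title")
    root.insert(0, toc)
    return root


def run_pipeline(root, steps):
    """Feed the output of each step into the next one, in order."""
    return reduce(lambda tree, step: step(tree), steps, root)


SOURCE = """\
<doc>
  <section status="final"><title>Overview</title></section>
  <section status="draft"><title>Scratch</title></section>
  <section status="final"><title>Details</title></section>
</doc>"""

result = run_pipeline(ET.fromstring(SOURCE), [drop_drafts, number_sections, build_toc])
toc_entries = [(e.get("number"), e.text) for e in result.find("toc")]
print(toc_entries)  # [('1', 'Overview'), ('2', 'Details')]
```

Each step does one job and can be tested or replaced on its own, which is the debugging benefit described above; the output of any intermediate stage can also be serialized and inspected.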

RESTful Services

Leigh Dodds spoke about "Connecting Social Content Services using FOAF, RDF and REST". He commented that many existing services, to the extent that they are RESTful, are so by accident. With a little effort, such services could provide a much more useful face to the world, including proper application of HTTP status codes.
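
As one small illustration of what proper application of HTTP status codes can mean, the sketch below maps the outcomes of operations on a single resource to distinct codes, rather than answering everything with 200 and burying errors in the body. It is not from Dodds's talk: the in-memory PROFILES store, the handle function, and the sample payload are hypothetical, and no real web framework is involved.

```python
from http import HTTPStatus

# Hypothetical in-memory store standing in for a social content service.
PROFILES = {"alice": "<Person><name>Alice</name></Person>"}


def handle(method, name, body=None):
    """Return (status, payload) for a request on the profile called `name`."""
    if method == "GET":
        if name in PROFILES:
            return HTTPStatus.OK, PROFILES[name]
        return HTTPStatus.NOT_FOUND, None
    if method == "PUT":
        if not body:
            # A PUT without a representation cannot be stored.
            return HTTPStatus.BAD_REQUEST, None
        created = name not in PROFILES
        PROFILES[name] = body
        # 201 for a new resource, 204 when an existing one was replaced.
        return (HTTPStatus.CREATED if created else HTTPStatus.NO_CONTENT), None
    if method == "DELETE":
        if PROFILES.pop(name, None) is None:
            return HTTPStatus.NOT_FOUND, None
        return HTTPStatus.NO_CONTENT, None
    return HTTPStatus.METHOD_NOT_ALLOWED, None


status, payload = handle("GET", "alice")
print(int(status), status.phrase)  # 200 OK
```

With codes like these, generic clients, caches, and crawlers can react sensibly without knowing anything about the service's own payload format.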

XForms, Etc.

A couple of XForms talks fell into the general XML category as well. The first, from Klaas Bals of Inventive Designers, showed off a new integration of XForms with XSL-FO, providing dynamic, great-looking Universal Business Language (UBL) forms. By using XSLT triggered off the instance data, the product is able to accomplish all kinds of rich effects, and the demos seemed to shine brightest on reasonably complicated multi-page layouts, like the kind you get with your monthly telephone bill.

Another XForms talk came from Erik Bruchez of Orbeon, who demonstrated the power (and client-freedom) that results from using a server-side XForms engine. The flexibility of XForms not only allows a server-side implementation, it even enables some optimizations that wouldn't be possible in a client-only environment.

Rick Jelliffe's talk was "Organic Extensibility as a Browser Design Approach, as Implemented in the TreeWorld Browser for Ad Hoc XML". Jelliffe's company, Topologi, has a lightweight TreeWorld browser, not for HTML but for semi-structured tree objects. His talk covered lessons learned with the software as a component in products and systems.

The conference concluded after lunch on Friday with a final keynote from Microsoft's Jean Paoli, who started off with a video about the virtualization of French public services in a small village of about 400, using Microsoft technologies. Paoli said that he was "very happy to see XForms in OpenOffice", as a validation of XML for documents and document processing. He clarified the licensing situation around the XML vocabularies used in Office 2003, stating that anyone is free to use the languages, with proper attribution, through a perpetual, royalty-free, open source-compatible license. In fact, the earlier video, which showed people completing InfoPath forms in ordinary browsers, was possible through an open source converter from the InfoPath format to an ASP.NET format.

Paoli's passion for XML and documents shone through the entire talk, especially in two of the final points. He spoke out against binary XML, simply saying "No, please," and concluded with a prediction: in 2010, 75% of new documents worldwide will be created in XML.


As Edd Dumbill wrote in his blog, "From my immodest point of view, XTech 2005 was pretty much the best conference I've ever chaired." If hallway conversations are any indicator, the conference was well-received indeed.

This kind of conference gives a good idea of the general direction XML and related technologies will take over the next year, as well as a chance to reflect.

Two technologies in particular stood out. One is XForms--mentions of XForms have noticeably increased, even surpassing the level seen around the time it became a W3C Recommendation. The other is UBL, which is quietly showing up in all kinds of places. These technologies represent, in a way, a return to the original goals of XML. I expect to see more of both over the next year, hopefully in concert.

Next year, XTech will likely happen in Amsterdam again. See you there!

Overheard at XTech

Finally, a few more interesting or humorous quotes overheard at XTech.

"Nobody likes to talk to developers."

"ISO is glad to hear this. We're from ISO."

"The lengths that people will go to in order to engage in massive media."

"One WG member builds a browser in a refrigerator."

"Basically that says, you can't do anything."

"We changed the terms of the license to say you can do what you're already doing."

"XML Schema datatypes are mostly harmless."