The State of XML: Why Individuals Matter

May 30, 2001

Edd Dumbill

Introduction

This article is adapted from the closing keynote speech I delivered at XML Europe 2001 in Berlin, May 2001. I describe the progress of XML over the last year, emphasizing that in an industry increasingly dominated by large vendors, individual contributors are still key.

New Beginnings

XML has a tendency to spark new beginnings. Many existing technologies are being re-engineered to take advantage of XML, gaining interoperability benefits previously too costly to realize; industries are finding that XML vocabularies can form a basis for collaboration and cost-cutting, where such cooperation was previously thought counterproductive. XML's influence is proving disruptive to the technological status quo.

For better or for worse, many parts of today's computing infrastructure are being re-examined in the light of XML. For better, in that the benefit to be gained from interoperability at the syntax level is large. For worse, in that lessons from the past are being overlooked; however, not learning from history is too broad a charge to lay on the shoulders of overzealous XML developers alone.

Adding XML into your computing environment can be like initiating a chain reaction. Once one component can import, export, or process XML, it becomes obvious that there will be great benefit if the next component does, and the next, and so on. Within organizations and systems, XML is starting to form the basis for a "data bus," where information can flow between applications with less resistance and effort than previously.

To illustrate the diversity of XML's applications, here are just some of the areas XML has moved into recently.

  • Distributed computing: SOAP and the XML Protocol work have XML playing an important role as a wire format for intermachine communication, which would have beggared belief a few years ago.
  • Configuration: XML is now a popular choice for the humble configuration file, a ready-made, expressive syntax that means developers don't have to worry about creating new syntaxes and parsers for configuration and state files.
  • Directory services: The long-running DSML effort provides an XML-based version of LDAP, and the recently created UDDI provides a more specific directory service for "web services."
  • Storage: WebDAV allows for the storage and management of data in remote filesystems using an XML-based protocol. All major databases now offer some degree of XML storage and searching.
  • Page layout: As XSL-FO and SVG near their completion, we now have languages for paginated, precise layout with pointy brackets. It is early days, but XML is making incursions into the world of professional print production.
  • HTML: XHTML, particularly XHTML Modularization, is changing the face of web page markup. Merely having well-formed XML in web pages yields many benefits, including easy parsing of pages, and a reliable platform for client-side applications. Interestingly, many of these benefits are being reaped in non-PC-based browsing environments.
  • Knowledge management: Technologies such as Topic Maps and RDF seek to extend the "XML data bus" by providing further layers of semantic interoperability. Work is also underway to bring angle brackets to logic and proofs.
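To make the configuration case from the list above concrete, here is a minimal sketch in Python using the standard library's ElementTree parser. The configuration vocabulary (`config`, `server`, `logging`, `file`) and the `load_config` helper are invented for this illustration and do not come from any particular product.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration document: element and attribute names
# are assumptions made up for this example.
CONFIG_XML = """\
<config>
  <server host="example.org" port="8080"/>
  <logging level="info">
    <file>app.log</file>
  </logging>
</config>
"""

def load_config(text):
    """Parse the XML text and return the settings as a plain dictionary."""
    root = ET.fromstring(text)
    server = root.find("server")
    logging = root.find("logging")
    return {
        "host": server.get("host"),
        "port": int(server.get("port")),
        "log_level": logging.get("level"),
        "log_file": logging.findtext("file"),
    }

settings = load_config(CONFIG_XML)
print(settings)
```

No custom syntax or hand-written parser is involved: the application only decides which elements and attributes it expects.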

Looking at these diverse areas of application, it would seem that little is safe from the attack of the angle brackets -- and this is without looking at the many initiatives in vertical industries to create XML frameworks and vocabularies.

XML Over the Last Year

The progress made by core XML technologies over the last year can be split into three categories: new initiatives, works-in-progress, and completed projects.

New: XML Protocol

The W3C's XML Protocol work is at the center of the most recent "revolutionary" shift in Web computing: web services. Although horribly over-marketed, the central benefit of web services is that they seek to standardize machine-to-machine exchange of XML. In particular, the aim is to make such exchanges easy for application and database developers. The use of XML over HTTP carries with it less overhead and fewer headaches than using the machinery of CORBA or DCOM.
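As a rough illustration of what "XML over HTTP" involves, the following Python sketch builds a SOAP 1.1 style envelope and wraps it in an HTTP POST request without sending it. The envelope namespace is the one defined by SOAP 1.1. The endpoint URL, the application namespace, the `GetQuote` operation, and the `SOAPAction` value are all hypothetical.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:quotes"            # hypothetical application namespace
ENDPOINT = "http://example.org/quotes"   # hypothetical endpoint, never contacted here

def build_envelope(symbol):
    """Return a serialized envelope carrying one hypothetical GetQuote call."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}GetQuote")
    ET.SubElement(call, f"{{{APP_NS}}}symbol").text = symbol
    return ET.tostring(envelope, encoding="utf-8")

def build_request(symbol):
    """Wrap the envelope in an HTTP POST request object (not sent)."""
    return urllib.request.Request(
        ENDPOINT,
        data=build_envelope(symbol),
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": '"urn:example:quotes#GetQuote"',  # hypothetical value
        },
        method="POST",
    )

request = build_request("XMLC")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

The whole exchange is an ordinary HTTP request with an XML payload, which is the contrast with CORBA or DCOM drawn above.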

There is a body of opinion that holds that the W3C should have rubber-stamped the SOAP protocol, devised by Microsoft et al., which is already widely deployed and implemented. However, the role played by the XML Protocol Working Group is less the strictly technical one of standardization than a political one: forming industry consensus (i.e. removing the Microsoft stigma). The unprecedented size of the Working Group -- at my last count it had 84 members, including invited experts -- confirms this suspicion. A lean, mean, spec machine? Doubtful.

New: Semantic Web

For a long time, the Semantic Web has been Tim Berners-Lee's ambition for the future development of the World Wide Web. By enabling the machine processing of web pages, the Semantic Web aims to make the Web more useful for users. This year, the W3C has officially chartered a Semantic Web Activity.

One of the most immediate tasks of this Activity is fixing the Resource Description Framework (RDF), a core Semantic Web technology which suffers from a poorly written specification, and which has many unresolved issues. An encouraging development is the growing dialogue between the W3C and the developers of Topic Maps applications (who mostly have an SGML/ISO heritage), who are attempting to solve similar problems in semantic representation.

New: Many Verticals

There are now many more applications of XML in specific industry sectors than it is possible to keep track of. Not all of these efforts will succeed -- XML is no silver bullet -- but there are beneficial effects of XML's adoption here: hitherto undiscovered possibilities for interoperability with other industries, and cost-savings to be gained from agreement.

As development continues, common chunks of technology across verticals are being factored out, and may soon be available as a platform on which industry-specific applications can be built. Perhaps the most ambitious project to create an underlying platform is ebXML, of which more may be found below.

Getting There: SVG

It has taken its time, but Scalable Vector Graphics in XML is nearly a stable W3C Recommendation. This is exciting news for those to whom high-fidelity graphical representation is important. There are many applications for which the existing range of rendering options is inadequate, e.g. the distribution of engineering drawings.

Another exciting feature of SVG is its small file size compared to bitmap graphics, especially for complex drawings. Combined with its inherent flexibility, this makes SVG a good choice for use on mobile and other compact devices, as well as the desktop browser.
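A back-of-the-envelope comparison gives a feel for the size difference. The Python sketch below measures a small, invented SVG drawing against an uncompressed 24-bit bitmap of the same dimensions. The drawing and the canvas size are assumptions. Compressed bitmap formats would narrow the gap, so treat this as an upper bound on the difference rather than a benchmark.

```python
# Hypothetical drawing: a 1000 x 1000 canvas with a circle and a diagonal line.
WIDTH, HEIGHT = 1000, 1000

svg = f"""<svg xmlns="http://www.w3.org/2000/svg" width="{WIDTH}" height="{HEIGHT}">
  <circle cx="500" cy="500" r="400" fill="none" stroke="black"/>
  <line x1="0" y1="0" x2="1000" y2="1000" stroke="black"/>
</svg>
"""

# Size of the SVG document as it would be stored or transmitted.
svg_bytes = len(svg.encode("utf-8"))

# An uncompressed 24-bit bitmap stores 3 bytes per pixel,
# regardless of what the image contains.
bitmap_bytes = WIDTH * HEIGHT * 3

print(f"SVG: {svg_bytes} bytes, raw bitmap: {bitmap_bytes} bytes")
```

The SVG size depends on the number of shapes, not on the canvas dimensions, which is why the same drawing stays small at any resolution.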

SVG represents the most important step forward in web user interface technology for a long time. The test, as ever, will be in its deployment. There are some excellent implementations of SVG in the field, and it would be great to see SVG make it as a default feature in web browsers.

Getting There: XSL-FO

A styling and pagination language has been in the mind of XML's creators since the beginning. As the work on XSL proceeded, it was decided that the transformation part, XSLT, was immensely useful as a separate technology. While XSLT was being spun off, the formatting objects half of XSL languished a bit. Now formatting objects are reaching maturity too and have the potential to make considerable changes in the way everyday page layout and printing is done.

I was excited to hear of a book written from beginning to end using XML. The author wrote the book's content in XML, then used XSLT to transform to XSL-FO (employing the skills of an external designer to fashion the look of the book in XSL-FO). He then went straight from XSL-FO to PDF, which went to the printer. The XML Europe 2001 conference proceedings were produced in a similar manner. While there is much development to do before XSL-FO can cater to all printing requirements, it's starting to have an impact in low-end applications.

There are other areas of use for XSL-FO. It's been used, for example, to computer-generate printed labels with bar codes. It could also be used within a company in the production of business cards -- most companies still design all their cards by hand, and a change of address could cost a lot of money and time. It's not just in typical document production where XSL-FO could have an impact.

Complete: XML Schema

The completion of the W3C XML Schema Definition Language comes as a relief to many. The most controversial XML technology over the last two years, W3C XML Schema should please most of those who were involved in the project, but it has never been anything less than a hot potato. At the commencement of the XML Schema work in 1999, everyone was worried that Microsoft would, rather than implementing the nascent spec, continue down its own path with XDR, its proposal for an XML schema language. Happily, Microsoft committed to the W3C route. As XML Schema developed, the focus of the dissent shifted to dissatisfaction with the spec itself: many found the drafts hard to read, others thought XML Schema lacked vital features, others thought that there were too many features.

The political imperative to complete XML Schema proved irresistible, and a (sometimes uneasy) middle ground has been found. As with XML Protocol, the big story about W3C XML Schema seems more about political consensus than the technical details of the specification itself. Schema provides the hook from XML into databases and programming language data types, which is the missing link for many developers using XML, cutting down the effort required to bind XML into programs as a data transport. Yet at the same time, it carries a worrying risk of increasing XML developers' reliance on external tools in order to abstract away the difficult details of the XML Schema technology. I'll say more about this worry below.

Complete: XML Topic Maps

The completion of the XML Topic Maps specification marks an important step forward for document technologies in XML. It's certainly generated a lot of interest, as well as several new companies. Topic maps technologies are now making their way into mainstream content management solutions. There is still a lot for the XTM community to do: education has to be a priority -- many respected XML developers still don't have a clue what a topic map is actually for -- as well as continuing the work on integration with the Web.

Complete: ebXML

The UN/CEFACT and OASIS ebXML project to create an XML framework for global electronic business has now run the course of its chartered eighteen months. During that time it has created a platform on which it is hoped industries can build electronic commerce systems. Given the time constraints, ebXML is by no means finished. It represents a solid start, rather than a conclusion. Work is now continuing in separate working groups chartered by the UN and OASIS.

Providing an alternate approach to the much-hyped web services route, ebXML is tackling a similar problem, but, if you will, from the other end. It is hoped that the overarching framework from ebXML and the low-level web services technology will meet to complement each other. Much implementation experience is required, and during that process the more ambitious (or even unrealistic) aims will be tempered.

Good News and Bad

The developments of the last year are, no doubt, good news for XML. More and more of the tools we need to make use of XML are now in place: whether for developers or those creating industry frameworks. New areas open to XML as standardization and implementation continue. And, finally, as new incentives are created, the attack of the angle brackets in the name of interoperability proceeds apace.

Not everything in the garden is rosy, however. Despite some excellent progress over the last year, there have been several disappointments, and also some new challenges thrown up. Linking and pointing in XML (the W3C's XLink and XPointer specifications) are still lagging behind. Hypertext seems no longer the darling of mainstream markup technology. The notion of competition in the Web browser market seems like a very dead thing indeed, and there is little leverage left for users to encourage browser vendors to directly support XML technologies such as XHTML. In fact, the cutting edge of browser technology seems to have transferred to the mobile market, where much better use is being made of XML to reap benefits in areas such as speed of parsing.

XML's increasing uptake as the natural choice for new file formats, coupled with the Namespace Recommendation, poses some difficulties for existing Internet infrastructure. Until now, the Internet Media Type (MIME type) of a file pretty much established its content (e.g. audio/wav, image/jpeg, text/html). Now it is possible to mix languages -- SVG, XHTML, and SMIL for example -- in one document. Beyond establishing such a document as text/xml, there's little more we can do. Clearly more work needs to be done to revise Internet infrastructure. This is much easier said than done, and it requires the participation of dedicated individuals. The steering of any specification through the IETF is no easy task. There seems little immediate commercial imperative for companies to fund such work.
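The problem can be seen in a small example. The Python sketch below parses an invented document that mixes XHTML and SVG, then lists the namespaces it actually uses. The document and the `namespaces_used` helper are assumptions for illustration. The point is that a receiver has to look inside the document to learn this, because a media type of text/xml says nothing about it.

```python
import xml.etree.ElementTree as ET

# Hypothetical compound document: XHTML with an embedded SVG drawing.
DOC = """\
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>A chart:</p>
    <svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
      <rect width="100" height="100"/>
    </svg>
  </body>
</html>
"""

def namespaces_used(text):
    """Return the set of namespace URIs of all elements in the document."""
    found = set()
    for element in ET.fromstring(text).iter():
        tag = element.tag
        # ElementTree writes qualified names as "{namespace-uri}local-name".
        if isinstance(tag, str) and tag.startswith("{"):
            found.add(tag[1:].split("}", 1)[0])
    return found

print(sorted(namespaces_used(DOC)))
```

Both vocabularies turn up only after a full parse, which is exactly the information the media type alone cannot convey.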

Don't Lose Control of XML

As the XML family of specifications grows, and money flows into all things XML, there are dangers that loom on the horizon. One highly worrying trend is the apparent rise of political over technical motives for participation in W3C Working Groups. The implausible sizes of the Schema and Protocol WGs indicate that many vendors consider it important to be part of these developments. This is all well and good -- consensus can only increase interoperability -- but as one XML old-hand said, the new members from the fringe all feel the need to make some kind of contribution in order to justify their participation. The result is a difficult time for the WG chairs and spec editors, not to mention the inevitable consequences of design-by-committee. Perhaps the W3C should invent a new level of participation to suit the political needs of vendors without jeopardizing the hard technical work done by the core of the working groups?

XML Schema is considered by the W3C to be part of the new core of XML, together with XML 1.0 and Namespaces in XML. That the specification caters to many needs is its weakness as well as its strength. Albeit unintentionally, the W3C has significantly raised the barrier for participation in XML. I wrote above that XML has an excellent role in lowering the barrier for participation in many areas, such as pagination or intermachine communications, which makes this increase in size of "core XML" all the more concerning.

Apologists for XML Schema typically counter protests of complexity by assuring developers that tools will shortly be available, and so they need not make the effort to understand all of the specification. (The fact that this excuse was at one point given to potential developers of open source XML Schema processors is breathtaking.) While most developers with a reasonable budget for buying software development tools will accept this reasoning, it results in not only a raising of the technical barrier for participation in XML but of the financial barrier too.

If global business using XML is ever, as the ebXML initiative intends, to involve the whole world, then XML cannot become a plaything of vendors who wish to spin up the version numbers game. At the basic level of XML, remarkably few resources are required to process it -- the simplifying invention of well-formed documents was arguably the gateway to XML's widespread success. To backpedal on such decisions now by further complicating XML runs counter to the original intent of its creators. XML may yet become the victim of its own success.
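To show how little is needed at this basic level, here is a minimal Python sketch of a well-formedness check using only the standard library's non-validating parser. No DTD, schema, or additional tooling is involved. The `is_well_formed` helper and the sample documents are invented for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

# Properly nested tags: well-formed.
print(is_well_formed("<order><item>book</item></order>"))

# The <item> element is never closed: not well-formed.
print(is_well_formed("<order><item>book</order>"))
```

Anything beyond this, such as validation against a schema, is an extra layer that a processor can choose to add.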

This tendency to complexity underlines the continuing importance of heroes in XML. XML's history is littered with individuals whose vision and technical ability have made significant contributions. Vital technologies such as SAX, and more recently the emergent RDDL, might never have appeared were it not for these people (not to mention XML 1.0 and XSLT). Yet we now seem to be in a position where many of the heroes have tired, and those that remain are seen as quaint oddities of the XML community by some vendors.

Perhaps that's right; now that the groundwork is done, and practical application is underway, aren't these folk just crazy idealists? Absolutely not. While considering the large-scale deployment of XML in companies such as Boeing, recall that XML started with the vision and determination of a handful of people, and not as a corporate strategic initiative. From conversation with big industry consumers of XML products, it has become apparent that these companies actually value the contribution of individuals in the XML community very highly, for the role they play in keeping the software vendors sharp and on their toes.

The emergence of the alternative XML schema proposals RELAX, TREX, and Schematron -- all from individual contributors -- has done much to raise the profile of the technical debate about XML Schema, and it seems likely to have a positive effect on the future development of the schema language at the W3C. It's good to have consensus on the core of XML, but even W3C specifications continually need competition at a technical level, as there can be a considerable gap between what the vendors in the Working Groups want and what is in the best interests of users or XML itself.

Conclusion

The progress of adoption and change wrought by XML has accelerated over the last year, but with it come certain dangers. XML must not be allowed to become so complex that it defeats the point of its original creation and unacceptably raises the level of financial and technological resource needed to use it. A growing reliance on vendor products also runs the risk of creating an identifiable market growth area, which, when it inevitably hits a decline, could take a chunk of XML as a technology down with it. Because of these dangers, the role of individual contributors in the XML community (whether affiliated with a company or not) is more important than ever. They remain among the most creative and influential participants in the development of XML.