XML Reduced

October 11, 2000

Leigh Dodds

Perhaps reeling from the hype associated with the XML family of standards, members of the XML-DEV list this week have been getting back to the basics to discover what's necessary for a first-time XML developer.

Don't Believe the Hype

The boom, the possible subsequent bust, and, through it all, the hype surrounding new technologies are facts of life in the IT industry. They provide a rich source of material for the technology media. Unsurprisingly, XML has received its share of criticism, including recent claims that it's killing the Web. Rarely tolerant of this kind of reporting, XML-DEV has reacted to some of these claims.

Observing that much good work has been done in the area of server-side XML, David Megginson noted that these efforts rarely make for good press. Client-side XML (that is, XML on the Web) is a more interesting topic, but it's seen little real progress. This is largely attributable to the slow addition of XML capabilities to the current generation of browsers.

Megginson commented that the community may have brought some of this criticism on itself.

When XML came out 2 and a half years ago, XML's promoters (W3C and otherwise) made a Faustian bargain -- promote XML as the next-generation of HTML (which is plainly misleading, but very interesting) rather than as a low-level layer for serializing tree structures in clear text (which is accurate, but boring as hell).

To put this another way, XML has promised to deliver a semantically richer, and less broken, Web, but we're still a long way from realizing this promise. Hype has led many to believe that these advances might take place sooner rather than later. Yet it is important to ensure that this doesn't detract from the many real and immediate benefits of XML technologies.
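Megginson's "boring" description, a low-level layer for serializing tree structures in clear text, is easy to make concrete. The following is a minimal sketch in Python using the standard library's ElementTree module; the element names and values are invented for illustration and do not come from the discussion.

```python
import xml.etree.ElementTree as ET

# Hypothetical data, invented for illustration: a small tree built in memory.
memo = ET.Element("memo", {"priority": "high"})
ET.SubElement(memo, "to").text = "XML-DEV"
ET.SubElement(memo, "body").text = "Back to basics"

# Serialize the tree to clear text...
text = ET.tostring(memo, encoding="unicode")
print(text)

# ...and parse that text back into an equivalent tree.
restored = ET.fromstring(text)
print(restored.get("priority"), restored.find("to").text)
```

The round trip is the whole point: whatever produced the text and whatever consumes it need agree only on the markup, not on a programming language or in-memory representation.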

Michael Maddox noted that XML, like any technology solution, involves certain trade-offs, but produces results which are much better than the alternatives.

As computer scientists, haven't we realized that no single paradigm perfectly fits every problem? We make tradeoffs in every solution, trying to minimize the pain and maximize efficiency, but never entirely ridding ourselves of the pain and never quite building the perfect system. We know this is true, even with XML. But we know it's a better solution for interop[ability] and a world of other issues that currently plague our field.

However, while the developer community may not believe the hype, it still may not be in a position to deliver all the available rewards. As Matthew Gertner pointedly asked,

How many of the numerous XML experts on this list are really keeping up with the standards and can make intelligent statements about more than a couple of them?

The proliferation of XML-based standards has been a source of anxiety within the community, raising concerns about interoperability and the stability of the XML infrastructure being developed. But for beleaguered developer Richard Lanyon, the real question is

how [many] of the XML-associated technologies do you need in order to be able to start working on XML?

Where to Begin?

Jonathan Robie attempted to answer Lanyon's question, drawing a distinction between the core XML technologies and their related standards.

You can be an XML expert if you really know XML itself and a few related core technologies (DOM, XSLT, DTD design, perhaps W3C Schema). There are many horizontal and vertical technologies or standards based on XML, and a good handful of XML-centric technologies like W3C Query that are under development, but you don't need to know these to get real work done today.

Sometimes media and marketing organizations seem to suggest that every new thing done with or for XML is part of XML itself. Sure, it is always good to learn from the things other people are doing with XML, and to see if there is already a well-designed solution to the problem you are trying to solve, but that doesn't mean I have to keep up with everything anyone is doing with or for XML.

One effort that's attempted to reduce XML to its essence is the Common XML specification, produced by members of the SML-DEV mailing list. Common XML defines a subset of the XML language that highlights the important aspects of the XML specification and builds a foundation for interoperability. The specification includes guidance on the costs and benefits associated with specific XML language features.

Simon St. Laurent, who edited the Common XML specification, noted that beyond the core, individual requirements vary considerably.

Beyond that core of elements, attributes, and simple namespaces, it seems things really diverge. Some people want nothing to do with DTDs, planning to move directly to Schemas (or even RELAX). Others don't care about either DTDs or Schemas. Some folks can get their work done with SAX alone, while others use just the DOM and others need SAX, DOM, and XSLT working together, and then maybe they switch to JDOM from DOM and...

Hypertext people need XLink and/or RDF and/or Topic Maps, while programmers might be more interested in SOAP, XML-RPC, or some other protocol or messaging work.
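The core St. Laurent describes (elements, attributes, and simple namespaces) fits in a few lines. What follows is a sketch only, written in Python with the standard ElementTree module; the document, the namespace URIs, and the prefixes are all hypothetical and are not taken from the Common XML specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one default namespace, one prefixed namespace,
# and plain (un-namespaced) attributes.
ORDER_DOC = """<order xmlns="http://example.com/ns/order"
       xmlns:c="http://example.com/ns/customer" id="42">
  <c:name>Ada</c:name>
  <item sku="A1">Widget</item>
</order>"""

order = ET.fromstring(ORDER_DOC)

# Prefixes are local shorthand; only the namespace URIs matter when matching.
NS = {"o": "http://example.com/ns/order",
      "c": "http://example.com/ns/customer"}

print(order.tag)                        # the tag carries its namespace URI
print(order.get("id"))                  # unprefixed attributes have no namespace
print(order.find("c:name", NS).text)
print(order.find("o:item", NS).get("sku"))
```

Note that the prefixes chosen in the lookup table need not match those in the document: the default namespace has no prefix at all in the source, yet is addressed here as "o".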

Michael Brennan praised the Common XML effort and indicated the difficulty that newcomers face when approaching the array of XML-related standards.

I feel for people who are just starting to approach these technologies. I think many on this list don't really have a good sense of what a challenge it can be for someone just to figure out which of the specs are relevant to their needs. On the other hand, those who complain about fragmentation and the loss of simplicity don't have a good grasp of the breadth of problems being solved by it all. It all gets lumped together as "XML", when it is really a rich plethora of solutions to varied problems that happen to share XML as a common foundation.

Brennan stressed the importance of providing developers with a road map to these specifications as well as the value of supporting tutorials and mentorship.

David Megginson added some useful advice for the new XML user and warned against attempting too much too soon.

My advice to a new XML user would be to learn XML 1.0 itself, XML Namespaces, and (if she's a coder) at least one of the XML-related APIs. A glance at a Unicode tutorial might be a good idea as well.

After that, she should ignore the other specs until she has a serious problem that she cannot easily solve otherwise; if she never ends up reading RDF, SMIL, DOM, SAX, XML Schemas, XLink, XPointer, XSLT, SOAP, RSS, CSS, XHTML, XHTML modules, etc., then she didn't need them in the first place.

On the other hand, if she reads these specs too early, she'll just end up inventing problems for the solutions she's learned. Part of my consulting work is cleaning up after people like that.

This is useful advice: a great deal can be achieved with the trinity of XML, DOM, and SAX without recourse to the other specifications. All hype aside, there are many interesting and important applications that can be developed on these foundations.
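To give a sense of how far XML, DOM, and SAX alone can go, here is a hedged sketch that extracts the same data from one document with each API. It uses Python's standard xml.dom.minidom and xml.sax modules as stand-ins for whichever implementations a developer prefers; the sample document and the function and class names are hypothetical.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, invented for illustration.
CATALOG = """<?xml version="1.0"?>
<catalog>
  <book id="b1"><title>XML Basics</title></book>
  <book id="b2"><title>SAX and DOM</title></book>
</catalog>"""


def titles_with_dom(text):
    """DOM: build the whole tree in memory, then navigate it."""
    dom = minidom.parseString(text)
    return [
        "".join(n.data for n in title.childNodes
                if n.nodeType == n.TEXT_NODE)
        for title in dom.getElementsByTagName("title")
    ]


class TitleHandler(xml.sax.ContentHandler):
    """SAX: react to parse events as they stream past; no tree is kept."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


def titles_with_sax(text):
    handler = TitleHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.titles


print(titles_with_dom(CATALOG))
print(titles_with_sax(CATALOG))
```

The trade-off is the usual one: the DOM version is shorter and allows random access to the tree, while the SAX version holds only the current title in memory and so scales to documents too large to load whole.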