
XML 2.0 -- Can We Get There From Here?
by Kendall Grant Clark, February 20, 2002
What Will XML 2.0 Look Like?
The W3C's Technical Architecture Group (TAG) was chartered primarily to make the tough, overarching decisions about how all the various parts and pieces of "Web technology" are supposed to fit together. In Mary Shelley's famous Gothic novel, Frankenstein, the narrator creates an eight-foot-tall creature out of bits and pieces of ordinary humans stolen from graves and charnel houses. While it is too harsh to suggest that W3C Working Groups are like charnel houses, one gets the sense that the work of the TAG is going to be rather more hodge-podge and ad hoc than one might assume from its lofty name. The range and quality and sheer number of the requests for TAG adjudication already suggest that there are a lot of corner cases lurking in the dark.
Perhaps it should surprise no one, then, that out of the ad hoc, variegated work of TAG comes one of the first substantive proposals or trial runs or "thought experiments" (though that's a misnomer, really) for what XML 2.0 might look like. It is fitting that Tim Bray, who was so instrumental in XML 1.0, would first offer a draft for what XML 2.0 might eventually become: Extensible Markup Language - SW (for Skunkworks). Though, it should be said at the outset, Bray offers XML-SW as a highly provisional proposal; or, as he put it, "nobody so far - not even me - has taken the stand that this is a good idea". But it is a start.
XML-SW is a conglomeration of XML 1.0 Second Edition minus the DTD machinery (including entities), with the addition of namespaces, XML Base, and the XML Infoset. The result, in Bray's view, as well as that of some other notable XML developers, is a net gain of simplicity and elegance. Bray described some of the changes in detail:
All the endless circumlocutions around parameter entities: gone. "For interoperability": gone. The attribute value normalization and line-end handling migrate into the infoset, where they belong. xml:base goes with xml:lang and xml:space into a section about reserved attributes. Namespaces go into the discussion of elements and attributes, where they belong. "standalone=": gone. There's a nice "other markup" section for comments, PIs, and a vestigial doctype declaration. The vestigial doctype is defined purely syntactically and has no internal subset - a low-cost way to let people do DTD validation with XML 1.0 processors. The conformance section has real content, including the error-handling, which has migrated out of its awkward home in the definitions list. All the links out of infoset and namespaces are internal.
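To make Bray's list a little more tangible, here is a minimal sketch in Python of two of the properties he describes: a doctype with no internal subset, and the reserved xml:base and xml:lang attributes sitting alongside namespaces. The sample documents, the example.org URIs, and the helper names are my own assumptions for illustration; none of this is taken from the XML-SW draft itself.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat

XML_NS = "http://www.w3.org/XML/1998/namespace"  # namespace bound to the xml: prefix

# Hypothetical document in the spirit of XML-SW: namespaced, with reserved
# attributes, and a purely syntactic doctype that has no internal subset.
SW_STYLE = """<!DOCTYPE memo SYSTEM "memo.dtd">
<memo xmlns="http://example.org/memo" xml:base="http://example.org/docs/" xml:lang="en">
  <title>Status</title>
</memo>"""

# Hypothetical XML 1.0 document that leans on an internal subset and an entity.
XML10_STYLE = """<!DOCTYPE memo [ <!ENTITY co "Example Co."> ]>
<memo>&co;</memo>"""


def has_internal_subset(document):
    """Return True if the document's DOCTYPE carries an internal subset."""
    found = False

    def on_doctype(name, system_id, public_id, internal_subset):
        nonlocal found
        found = bool(internal_subset)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(document, True)
    return found


def reserved_attributes(document):
    """Read xml:base, xml:lang and xml:space from the root element."""
    root = ET.fromstring(document)
    return {name: root.get(f"{{{XML_NS}}}{name}") for name in ("base", "lang", "space")}
```

Under these assumptions, has_internal_subset(XML10_STYLE) is true and has_internal_subset(SW_STYLE) is false, which is roughly the line a "vestigial doctype" rule would draw.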
As interesting as these changes may be, what's even more interesting, I think, is the degree to which the process that will finally settle these questions, as well as all of the wrangling and squabbling leading up to the start of that process, isn't technical at all. Whatever XML 2.0 eventually becomes technically, the process that creates it will be more social and political than anything else, and it's that process which seems perilous and fragile at best.
Bray is certainly aware of all the non-technical work which goes into making, to say nothing of remaking, an important public standard like XML. He even suggests a kind of in-advance rule or guideline. "The temptation to introduce," Bray says, "JUST A FEW little obvious improvements that nobody could possibly disagree with is overwhelming, but that is a slippery slope leading into the most noisome of ratholes". And so it is both a "noisome rathole" and an "overwhelming temptation".
So overwhelming, in fact, that no one could resist it and still make a substantive proposal for XML 2.0. The whole point of creating a new version of the standard is to improve upon the existing one. And so, as Bray clearly knows, his message announcing XML-SW abjures precisely what it accomplishes; namely, Bray's XML-SW makes many small improvements, for example, dropping the DTD and entity machinery, even as he pleads with others to refrain from suggesting their pet improvements.
Whatever else the value of XML-SW, or any such proposal, Bray's in-advance rule for structuring the XML 2.0 process is doomed to fail, that is, it's certain to be ignored by everyone who is centrally or peripherally involved in the conversation from which the world will get a revised XML specification. If anyone has moral standing to go first in proposing what XML 2.0 might look like, Bray and a few others have it. Someone has to go first. Someone has to make a proposal which will initiate the ideally collaborative and consensual process. To put this point in Shelleyean terms, Frankenstein had to create the monster, but after that it gained a life of its own.
But no one has enough standing to ask others to refrain from suggesting the "little obvious improvements that nobody could possibly disagree with". Sorting out those suggested improvements, saying yes to some and no to others, just is the process, and it will be messy and political, and there isn't anything anyone can do to prevent it.
Frankenstein's Monster's Neck Bolts
As if to illustrate this general point, most of the subsequent discussion of Bray's XML-SW draft focused on whether XML 2.0 should drop processing instructions (PIs). PIs became something akin to the neck bolts in the classic Karloff version of Frankenstein: a part of the monster it was never entirely clear he needed but without which he wouldn't be the monster. Which is to say that PIs are an XML feature that some people cannot imagine living without and for which others cannot imagine ever having a good use.
David Orchard responded to XML-SW by saying that "[t]his is really great stuff. While I think that PIs should also be lopped off, and XInclude for entity replacement and an optional XML Schema validation level added, I can certainly live with this." But, Elliotte Rusty Harold responded, PIs can be very useful. Tim Berners-Lee took a sort of mixed approach, agreeing that PIs are useful, but suggesting they be excised nonetheless. "I feel they are harmful," he said, "because they bypass all the extensibility power one has with namespaces to make well-defined extensions. PIs also add a barnacle onto the XML syntax which it really doesn't need."
In response to the anti-PI clamor, Simon St. Laurent suggested that PIs could not be replaced by elements unless "you'd be willing to throw validation out". Further, St. Laurent added, "I would hope that the W3C would drop its continuing institutional animus against processing instructions. If there is a need to blast some bit of XML's SGML heritage as incompatible with the Web, may I suggest notations, unparsed entities, or both."
However syntactically inelegant or practically useful, there is some cost to removing PIs, just as there would have been some cost to the monster of removing the neck bolts, even if he no longer needed them. Dan Connolly made exactly this point when he said that "the cost of keeping PIs is lower than the cost of getting rid of them... The cost of keeping PIs is no more than the cost of comments, as far as I can tell: one method in the SAX API, a few lines in the XPath spec, etc."
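Connolly's "one method in the SAX API" can be seen directly in Python's xml.sax binding, where the method is ContentHandler.processingInstruction(target, data). The following is a minimal sketch; the PICollector class, the collect_pis helper, and the render-hint target are hypothetical names chosen for illustration.

```python
import xml.sax


class PICollector(xml.sax.ContentHandler):
    """Collects every processing instruction the parser reports."""

    def __init__(self):
        super().__init__()
        self.instructions = []

    def processingInstruction(self, target, data):
        # This single callback is the whole SAX surface area for PIs.
        self.instructions.append((target, data))


def collect_pis(document):
    """Parse a document given as bytes and return its PIs as (target, data) pairs."""
    handler = PICollector()
    xml.sax.parseString(document, handler)
    return handler.instructions


SAMPLE = (
    b'<?xml-stylesheet type="text/xsl" href="style.xsl"?>'
    b'<doc><?render-hint page-break?><p>text</p></doc>'
)
```

A handler that does not override processingInstruction simply inherits the default, which does nothing, so the cost to applications that ignore PIs is close to zero.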
Norm Walsh, chair of DocBook's Technical Committee, offered a strikingly concrete example of the practical, everyday goodness of PIs, which really ought to be required reading for anyone who suggests they be removed.
What I really want here is, uh, how can I describe this? What I want is an instruction that I can insert into my document that will tell a particular processor that it should do something special. I want a, wait for it, a processing instruction! ...
The PI is entirely harmless (and invisible) to processors that don't care about it, but provides useful information for processors that go out of their way to look for it.
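Walsh's description can be sketched with Python's standard ElementTree module, assuming Python 3.8 or later for the insert_pis option. The render-hint target and both helper functions are hypothetical; this is not Walsh's DocBook example, only an illustration of the same property.

```python
import xml.etree.ElementTree as ET

WITH_PI = '<article><?render-hint keep-together="always"?><para>Some text.</para></article>'
WITHOUT_PI = '<article><para>Some text.</para></article>'


def element_view(document):
    """What a processor that ignores PIs sees: element names and their text."""
    return [(el.tag, el.text) for el in ET.fromstring(document).iter()]


def find_pis(document):
    """What a processor that goes looking for PIs sees, as (target, data) pairs."""
    # insert_pis=True asks the tree builder to keep PIs in the tree (Python 3.8+).
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_pis=True))
    root = ET.fromstring(document, parser=parser)
    return [
        tuple(el.text.partition(" ")[::2])
        for el in root.iter()
        if el.tag is ET.ProcessingInstruction
    ]
```

With the default parser the two documents are indistinguishable, while a processor that opts in gets the target and data back.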
Lastly, Elliotte Rusty Harold suggested that PIs are an important part of the extensibility of XML documents. PIs are useful as a way to add processing "information to documents written in XML vocabularies we do not control and cannot change. Perhaps schema languages should be written in a more permissive fashion so that they automatically allow anything from other namespaces...[but] that is not how either DTDs or the W3C XML Schema Language is written." Which is a nicely abstract way of making the very concrete point Norm Walsh made.
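Harold's and St. Laurent's point about validation can be illustrated with a toy check. The sketch below is not a DTD or W3C XML Schema validator; it is a hypothetical stand-in for a closed content model, with an invented vocabulary and an invented hints namespace, and it models only the one property at issue: a closed model constrains which child elements may appear, while PIs fall outside it.

```python
import xml.etree.ElementTree as ET

# Toy stand-in for a closed schema: each known element lists the children it allows.
# The vocabulary and the hints namespace are hypothetical.
ALLOWED_CHILDREN = {"article": {"title", "para"}, "title": set(), "para": set()}

HINT_AS_ELEMENT = (
    '<article><title>T</title>'
    '<h:keep-together xmlns:h="http://example.org/hints"/>'
    '<para>Some text.</para></article>'
)
HINT_AS_PI = (
    '<article><title>T</title>'
    '<?render-hint keep-together?>'
    '<para>Some text.</para></article>'
)


def violations(document):
    """List (parent, child) pairs that the toy content model does not allow."""
    root = ET.fromstring(document)  # the default tree builder does not keep PIs
    return [
        (parent.tag, child.tag)
        for parent in root.iter()
        for child in parent
        if child.tag not in ALLOWED_CHILDREN.get(parent.tag, set())
    ]
```

Carrying the hint as a foreign-namespace element trips the closed model; carrying it as a PI does not, which is why replacing PIs with elements would mean either loosening the schema or, as St. Laurent put it, throwing validation out.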
Conclusion
As the discussion about processing instructions makes amply clear, the indispensably useful feature of one constituency is the unbearably ugly wart of another. What goes for PIs is sure to go for the DTD machinery Bray excised from his XML-SW and perhaps other things, to say nothing of all the other improvements which lurk in the minds of XML devotees.
The problem lies not in figuring out which XML 2.0 we'll end up with. The real problem lies in managing the process of getting from here to there, a process that shows every sign of being far more politically difficult than getting XML 1.0 was to begin with. What the XML world needs are proposals and thought experiments about what that process will look like, how it will be managed, whether the corporate entities that sponsor the W3C are willing to consider industry-wide and public goods in addition to institutional self-interest, and so on.
Do we need an XML 2.0, and if so who should call the shots as to what does and does not get in it?
Comment on this Article
- I am willing to agree on Tim's in-advance rule
2002-02-27 20:57:58 Makoto Murata
Kendall Clark wrote: "Whatever else the value of XML-SW, or
any such proposal, Bray's in-advance rule for structuring the
XML 2.0 process is doomed to fail, that is, it's certain to be
ignored by everyone who is centrally or peripherally involved
in the conversation from which the world will get a revised
XML specification."
I was a member of the original XML WG. Although I have some
favorite extensions to XML, I am happy to agree on Tim's
in-advance rule. Without that rule, XML 2.0 is likely to
become a 400-page monster. If this rule is not agreed on in
advance, I would oppose creating XML 2.0.
<warning>Speaking for myself only</warning>
MURATA Makoto (FAMILY Given)
- processing instructions
2002-02-27 02:37:12 bryan rasmussen
I can't understand anyone considering processing instructions an ugly wart, because they don't get in the way of anything you want to do. The only way they could ever cause grief is if you need to work with them, in which case, well, duh, you need to work with them, and needing to work with something implies that you need it. I hardly ever use PIs, generally finding that some other solution is better, but just because I hardly ever use something doesn't mean I can't see the need for it.
- patience with existing standards
2002-02-21 03:13:51 jim fuller
Now is the time to allow for the enormous range of specifications to be adopted, in their current (and flawed) form.
Adoption and maturity of existing standards will create barriers to change later on, but allow for a more representative range of commercially based use cases to be created.
In addition, there are many XML 1.0-dependent specifications and technologies that require more time, such as:
- schema: the rich range of validation technologies requires convergence
- query mechanism: XSLT/XPath 2.0 and XQuery require adoption prior to monkeying around with XML.
- signature and encryption: lack of implementations
- input: lack of XForms implementations
- xlink: hasn't stuck as an integrated standard, lack of implementations
Currently, I would like to poll every author of an 'xml' book to see just how many commercial implementations they have been involved with; with respect to the range of technologies that XML 1.0 enables.
I think (and yes, this is anecdotal) that one would find the number to be fewer than 5 projects, and of those 5, budgets of less than £1,000,000 (admittedly a poor metric, but probably honest).
And if this rings true for them, one can only imagine adoption rates outside of this group.
I'm all for discussion of XML 2.0, but I think the current discussions, especially those presented in this article, are academic (normally I find xml-dev quite esoteric, which can be fun on certain days) when contrasted with the problems encountered in implementing a solution.
Without feedback from actual commercial experience in migration and use cases with existing XML 1.0, embarking on 'fixing' XML 1.0 would be achieved by those who do not represent the requirements of the largest group of potential users/developers.
regards, jim fuller
-------------------------------------------
on-IDLE ltd. www.on-idle.com
Technical Director cutlass@secure0.com
James Fuller jim@on-idle.com
34 Bedford Row, London UK WC1R 4JH
T: (44) 02077945829 F:(44)02077941628
-------------------------------------------
