XML.com: XML From the Inside Out

A Path to Enlightenment
by Leigh Dodds

A Departure from Type

Arguments about types seem to fill the landscape this week, as runoff from the schema and namespace debate summarized in last week's XML-Deviant continues. (None too surprising, perhaps, given that those topics have probably consumed more XML-DEV threads than any others.)

It seems that there are different viewpoints about where and how, or even whether, type information is associated with XML markup. The core issues relate to whether one is working with simple well-formed markup or with systems that involve validation and strong typing. Tim Bray has been at the center of this discussion, arguing at length that properly labeled markup is the key to maximum flexibility.

Q1: Why would you use XML?
A1: One of the important reasons is so that you can re-use data for purposes other than those envisioned by its creator. This is why, in the document space, XML is an unqualifiedly better storage format than MS Word, Frame, PDF, or any other proprietary binary display-oriented format. A lot of the XML-as-serialized-objects people probably don't care that much about this, but I think they're missing an important boat. Computers are important because they are *general-purpose* machines, and to the extent that you can make data general-purpose as well, you win.

Q2: Why would you use namespaces?
A2: One of the important reasons is so you can pull together data objects from multiple sources without losing track of where the pieces come from.

If you believe A1 and A2, then it seems to me like you get maximum re-usability and ability to mix-n-match if you've got everything unambiguously and completely labeled with minimal reliance on context.

Avoiding any mention of "type" as the key to enlightenment, Bray also presented a worldview which he argued was consistent with all interpretations: XML + Namespaces is a means to associate labels with data structures. One is free to build any kind of architecture on this, including one that is strongly influenced by type information.
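As a rough illustration of the "labels" view, the sketch below lists each element of a small document as a (namespace, local name) pair, which is all that XML + Namespaces itself provides. The document, the namespace URIs, and the `labels` function are made up for this example; only Python's standard `xml.etree.ElementTree` module is assumed.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mixing elements from two made-up namespaces.
DOC = """\
<order xmlns="http://example.com/ns/orders"
       xmlns:addr="http://example.com/ns/address">
  <item>Widget</item>
  <addr:city>Bath</addr:city>
</order>
"""


def labels(xml_text):
    """Return (namespace, local name) for every element, in document order."""
    result = []
    for elem in ET.fromstring(xml_text).iter():
        tag = elem.tag
        # ElementTree writes qualified names as "{namespace}local".
        if tag.startswith("{"):
            ns, local = tag[1:].split("}", 1)
        else:
            ns, local = "", tag
        result.append((ns, local))
    return result


if __name__ == "__main__":
    for ns, local in labels(DOC):
        print(f"{local:6} from {ns}")
```

Nothing here says what an `item` or a `city` is; each piece simply carries an unambiguous label recording where it came from, and any typing would be layered on top.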

While this may show good architectural design by not limiting decisions in other layers, several people, including Paul Prescod, were concerned that the plurality of different architectures isn't being explored; rather, typing is in fact leading the way.

For better or for worse, the emerging XML architecture DOES elevate schemas, validation and ty[p]e declarations above other "XML processing applications". For example SOAP and WSDL implementations use XML schema types to do type conversions. WSDL actually uses XML Schema as some kind of abstract type definition system (completely distinct from its use as an XML validation tool). XSLT 2.0 and XPath 2.0 are also going to use information from the schema.

These specifications do not build on XML Schema for its validation facilities. They do for its t[ype] system. So flaws in that system will eventually become material to all XML users. Some future applications may not deal with element labels (or ulabels) at all. They will deal with t[ype] names.

Publication of the XQuery 1.0 and XPath 2.0 Functions and Operators Version 1.0 Working Draft clearly demonstrates Prescod's point.
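To make the distinction between labels and type names concrete, here is a minimal sketch of processing that is driven by a declared type rather than by the element's label. It is an illustration only, not how any particular SOAP or WSDL toolkit works: the document, the `CONVERTERS` table, and `typed_values` are hypothetical, and the `xsd:` prefix is matched as a literal string instead of being resolved against its namespace declaration.

```python
import xml.etree.ElementTree as ET

# Qualified name of the xsi:type attribute as ElementTree reports it.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Converters keyed by type name, not by element label.
# Simplification: the "xsd:" prefix is compared as plain text.
CONVERTERS = {
    "xsd:int": int,
    "xsd:boolean": lambda s: s in ("true", "1"),
    "xsd:string": str,
}

# Hypothetical instance document with type annotations on two elements.
DOC = """\
<reading xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <count xsi:type="xsd:int">42</count>
  <valid xsi:type="xsd:boolean">true</valid>
  <note>no declared type</note>
</reading>
"""


def typed_values(xml_text):
    """Map each child's label to a value converted according to its xsi:type."""
    values = {}
    for child in ET.fromstring(xml_text):
        type_name = child.get(XSI_TYPE)
        # Elements without a known type annotation stay as plain strings.
        convert = CONVERTERS.get(type_name, str)
        values[child.tag] = convert(child.text or "")
    return values


if __name__ == "__main__":
    print(typed_values(DOC))
```

The conversion never consults the labels `count` or `valid`; renaming those elements would change nothing, while changing the type names would change the result.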

These discussions hint at further splits in the community. Is there a potential fork in our road? Some of us are interested only in well-formed documents, while others want to mix in a constraints (validation) mechanism. Both groups seem to need much less than is currently being designed. Still others desire strong typing and object-oriented features; these are the ones who currently seem to be best served by the latest W3C deliverables. But their satisfaction may come at the expense of those seeking a simpler existence.

Reverend Occam's Barber Shop

This wouldn't be the first time that a fork in the road has been highlighted. In fact, in another thread this week, the same suggestion, simplification through refactoring, was made several times. Pertinent to the previous discussion, Alexander Nakhimovsky suggested refactoring W3C XML Schemas.

...it would be a good thing, IMO, if XML Schema were *re-factored* into the validating part (such as RELAX NG) and a "complex-type-relations" part, for use in specialized applications.

Following a discussion concerning the use of XPointer within XInclude, which showed that a streaming processing model is not possible without some kind of subsetting (ideally of XPath), Sean McGrath made a plea to the W3C to explore further subsetting of their output.

The W3C could stop for breath and find out what pieces of XML 1.0 the majority *really* use. Don't ask vendors - they are not a reliable source of information. Don't ask consultants, their business case is based on complexity. Don't ask theoreticians, they find it all easy (and even if they don't they will probably say they do as they have egos like the rest of us and they are paid to be really smart).

Instead, ask XML users. Zoom in on the uncommonly used bits that cause the most problems for the ancillary specifications. Work towards issuing new iterations of the core specifications that take things OUT rather than add stuff in. A bold, brave step that would differentiate W3C from all the previous tower-of-babel standards bodies.

Do it as an experiment. Do it as a controlled fork. If it does not yield benefits, scrap it.

Some argued that this isn't as easy as it might seem. Henry Thompson wondered how one identifies the users who matter:

There is a fundamental question buried within your request: who are the users whose voices matter, and how do we identify them? Remember Lot pleading for Sodom and Gomorrah? How many people really using feature F of XML 1.0 + Namespaces + ... does it take in order to render it safe from pruning?

Michael Champion agreed with Thompson, believing that a close shave at Reverend Occam's Barbershop, while overdue, would be a hard-fought battle:

Like all zealots, I'm firmly convinced that "the people" are on our side <grin> and that very few of the paying customers would object if XML+namespaces+infoset+schemas got a close shave at Rev. Occam's Barbershop. On the other hand, I know that plenty of XML specialists would raise hell if their favorite feature was shaved off. So I have the same question "who are the users whose voices matter and how do we identify them?" I don't have any great suggestions.

Yet these aren't so much arguments against attempting the task as they are an acknowledgment that no subset is likely to please everyone. If the same viewpoint had won out several years ago, then we wouldn't have XML in front of us now. And this subset hasn't stopped useful work from being done. Another point worth making is that many argue for refactoring rather than subsetting; the former involves retaining functionality while improving the architecture. So there are other means to the same end. Smaller, refactored specifications will also lend themselves well to being put together in different ways, with alternative pieces in key positions. This seems a good way to achieve greater satisfaction.

So, at the end of this week's jaunt, we've come full circle. For much of the last year the same obstacles have been in our way. One wonders how long this will continue before new ground is broken.
