XML.com: XML From the Inside Out

Watching TAG Again

July 03, 2002

Keeping Up With the Taggers

Every few months I report on the state of the W3C's Technical Architecture Group (TAG). The TAG is probably the single most influential group of web architects, if for no other reason than the scope of W3C standards work, for which, institutionally, it serves as something like a court of last appeal. In my last report, called "Tag Watch", I reviewed the TAG's issues list and found that it was declining to address between a quarter and a third of the requests that it received, and that it had accepted but not yet resolved eight issues.

Upon reviewing the public TAG materials recently, I was surprised to learn that several issues pending in April are still unresolved. These include an algorithm for creating a URI from a qualified name, the nature and identity of "namespace documents" (which is partially resolved), the nature and identity of XML documents "composed of content in mixed namespaces" (which appears to have last been formally discussed in early May), the extent of HTTP dereferencing, and, finally, the use of HTTP as a substrate for higher-level protocols.

New Issues

I have previously suggested that the TAG may well become a bottleneck in the development of W3C standards. While it's not yet clear whether that suggestion will be borne out, the number of issues which the TAG agrees to address continues to multiply. Since my last report, it has accepted seven new issues, raising the total from 16 to 23. Of those seven, one was pretty quickly resolved, four others have been assigned to TAG members, and two have been accepted but not yet assigned.

A few of the new issues concern specific W3C recommendations or the general principles which ought to guide W3C recommendations. First, responding to a request from Misha Wolf, and owing to its impact on so many W3C recommendations, the TAG agreed to review a draft of Character Model for the World Wide Web. The resolution consists of Chris Lilley's notes on the charmod draft and Norm Walsh's amendment to them. In short, the TAG suggests that the charmod spec should not add rules concerning character sets or character encodings beyond those of the XML specification.

Responding to a request from Rob Lanphier, the TAG accepted and assigned to Chris Lilley a request for clarification about the principle which ought to guide W3C specifications concerning error recovery. What should recommendations say about how conforming software should recover from errors? This TAG issue is presently unresolved.

Mark Baker raised (in early January), and the TAG accepted (not until early June), an issue about W3C working groups and media types. The XML Protocol WG specifically asked whether WGs should or could define new media types, whether some coordination of custom types should be attempted, and how media types and XML namespaces are related. In short, the question touches on the canonical reading of RFC 3023 as it pertains to W3C working groups -- do the SHOULD provisions of RFC 3023, Section 7.1, apply to W3C WG productions? This issue was assigned to Chris Lilley and is presently unresolved.
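
As background to the media type question, RFC 3023 introduced the "+xml" suffix convention for naming XML-based media types. The following is a minimal Python sketch of how a generic processor might rely on that convention. The function name is_xml_media_type and the sample values are hypothetical, and the check deliberately covers only the two generic XML types plus the suffix.

```python
def is_xml_media_type(content_type):
    """Guess whether a Content-Type value names an XML-based media type.

    Assumption: only text/xml, application/xml, and the "+xml" suffix
    convention are considered; parameters such as charset are ignored.
    """
    # Drop any parameters (";charset=...") and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in ("text/xml", "application/xml") or media_type.endswith("+xml")


if __name__ == "__main__":
    for value in ("image/svg+xml", "application/xml; charset=utf-8", "text/html"):
        print(value, is_xml_media_type(value))
```

A working group that mints a new type without the suffix would be invisible to a check like this one, which is one reason coordination of custom types matters.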

One issue about which a great deal of blood has been shed on many mailing lists, not least XML-Dev, is the use of qualified names (which I covered in detail in "The Value of Names in Attributes"). Should QNames appear in attribute content, for example? Perhaps bowing to the power of practice, the TAG's resolution, as represented by Norm Walsh's "Using Qualified Names as Identifiers in Content", suggests that despite the warts of QNames as identifiers, the practice was already too widespread for any architectural group to deem inappropriate:

Whatever the architectural ramifications of using QNames as identifiers in contexts other than XML element and attribute names, it is already established practice.

It is simply not practical to suggest that this usage should be forbidden on architectural grounds.

This not only smacks of a kind of prudential wisdom, but also suggests that in some instances, the TAG's reach exceeds its grasp. Some kinds of convention, as evidenced by widespread usage and the lack of an equally widespread consensus to overturn them, are simply not reversible. And given the persistence and range of discussions about simplifying written-by-hand representations of XML (or of the XML infoset, I suppose one should say), the TAG's acknowledgment that QNames are very handy because they are concise is yet another indication that, whatever its machine-to-machine charms, XML can be a pain for humans to deal with by hand. In other words,

The TAG recognizes that there are pragmatic reasons why it is desirable to provide the same kind of URI/local-name shortcuts that QNames provide for element and attribute names in other contexts.

But it has not yielded to convention without some attempt to sharpen practice into best practice. Its architectural recommendations about QNames include:

  • Specifications should not introduce QNames into mixed content or attribute values with untyped string content.
  • Specifications should not introduce union types that include xs:QName as a possible component.
  • Specifications should not use tokens that are syntactically QNames...unless they are also semantically QNames.
  • Specifications describing an XML language must not introduce new namespace declaration or scoping rules.
  • Element or attribute values that contain a single QName should be declared with the xs:QName type.
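
Why QNames in content are awkward is easier to see in code: a prefix in an attribute value means nothing without the namespace declarations in scope at that point, and a generic parser does not resolve it for you. The Python sketch below tracks those declarations by hand. The sample document, the attribute name "type", and the helper resolve_qname_attrs are illustrative assumptions, not part of the TAG's finding. Note also that whether an unprefixed QName picks up the default namespace varies between specifications; this sketch simply looks up the empty prefix.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document with a QName in attribute content.
DOC = """<root xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <item type="xs:integer">42</item>
</root>"""


def resolve_qname_attrs(xml_text, attr_name):
    """Return (namespace URI, local name) pairs for QNames in attr_name."""
    scopes = []    # stack of (prefix, uri) declarations currently in scope
    resolved = []
    events = ("start-ns", "start", "end-ns")
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=events):
        if event == "start-ns":
            scopes.append(payload)       # payload is a (prefix, uri) tuple
        elif event == "end-ns":
            scopes.pop()                 # a declaration went out of scope
        else:                            # "start": payload is the element
            value = payload.get(attr_name)
            if value is None:
                continue
            prefix, _, local = value.rpartition(":")
            in_scope = dict(scopes)      # inner declarations override outer
            resolved.append((in_scope.get(prefix), local))
    return resolved


if __name__ == "__main__":
    print(resolve_qname_attrs(DOC, "type"))
```

The parser resolves element and attribute names on its own, but the value "xs:integer" stays an opaque string unless the application does this bookkeeping. Moving the item element into another document without its declarations would silently change or lose its meaning, which is why the finding discourages QNames in untyped content.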

Another issue which has occasioned mountains of discussion is PSVI and the XML infoset. The question of the relation of type-augmented XML, the XML infoset, and PSVI is a vexatious one. This was evidenced by the fact that even some TAG members weren't completely sure about the extent or nature of the issue, which Tim Bray raised in email to the TAG mailing list. This is something we will have to keep our eyes on, as it was only very recently raised and accepted, and it's unclear at this point what the TAG is going to do.

One of the problems with the scope of the W3C's output is fragmentation, which is a prime reason that the TAG is supposed to set architectural principles. The risks of fragmentation are particularly great when, in and across disparate domains or contexts, various working groups, often with substantively different goals, set out to accomplish similar subgoals. For example, several different W3C specifications, despite their different overarching goals, may need to say something about formatting properties. As Steve Zilles said, in an email which the TAG subsequently accepted for review,

Formatting properties are the properties that various document formats (HTML, XML, SVG, SMIL, MathML, ...) use to control the styling of the content of the format for some presentation medium, such as display screens, audio systems or printed page.

For an illustrative example that shows the architectural problem, consider embedding an XHTML/XML chunk within an SVG chunk within an XHTML/XML document. Further, assume that the XHTML/XML pieces are styled either with CSS or XSL. Typically, the author (and the reader) would want consistent styling for all three pieces. For the styling to be consistent, both SVG and CSS/XSL must use the same properties for the same purpose. In addition to using the same properties, the interpretation of those properties must be the same in both SVG and in CSS/XSL (or one or the other piece may have no understanding of the property).

Without some measure of coordination, fragmentation across formatting properties seems inevitable. And that's not, to put it bluntly, a good thing. The TAG's resolution of this issue, "Consistency of Formatting Property Names, Values, and Semantics", drafted by Norm Walsh, establishes a sensible principle:

Formatting property names, values, and semantics must be consistent across all specifications. Whenever a working group suggests the creation of a new formatting property, or the addition of, or a change to, an existing formatting property's allowed values, the working group must show a strong justification for not using an existing formatting property or properties that are related to the proposed new property or value.

Clearly innovation on the web will create situations where new properties are required and existing properties will need to be extended. What we must avoid doing is changing the semantics of existing properties in ways that introduce unnecessary interoperability issues.

Lastly, Tim Berners-Lee suggested that the TAG consider the extent of XLink's applicability: in particular, whether xlink:href should be required for any URI parameter, required only for URI parameters which point to a document hyperlinked from the present document, or optional. For example, SVG uses XLink for referencing symbols, and it can also be used to embed images or other content within a document. After some debate as to the political implications of accepting this as an issue, the TAG did so. It remains to be assigned and is presently unresolved.
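
The practical stakes of the XLink question can be sketched in a few lines of Python. A generic tool that looks only for xlink:href finds the symbol reference and the embedded image in an SVG document, but not an XHTML-style link that uses a plain href attribute. The sample document and the helper name xlink_targets are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree spells a namespaced attribute as "{namespace-uri}local-name".
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical sample: SVG using XLink, plus an XHTML anchor with a plain href.
DOC = """<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink"
     xmlns:h="http://www.w3.org/1999/xhtml">
  <symbol id="dot"><circle r="2"/></symbol>
  <use xlink:href="#dot"/>
  <image xlink:href="photo.png" width="10" height="10"/>
  <foreignObject><h:a href="http://example.org/">text</h:a></foreignObject>
</svg>"""


def xlink_targets(xml_text):
    """Return (local element name, target) for every xlink:href attribute."""
    root = ET.fromstring(xml_text)
    return [
        (el.tag.rpartition("}")[2], el.get(XLINK_HREF))
        for el in root.iter()            # document order, root included
        if XLINK_HREF in el.attrib
    ]


if __name__ == "__main__":
    print(xlink_targets(DOC))
```

Here the XHTML anchor goes unnoticed, since its href is not in the XLink namespace. Whether every URI-valued attribute ought to be discoverable this way is the scope question put to the TAG.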

Conclusion

Each time I review the TAG's progress, I come away thinking that it's some kind of very low-grade miracle that, on the good days, the Web works as well as it does. The details do in fact matter, but they don't all matter equally. The Web's fundamentals seem as sound as ever, which is a good thing since, increasingly, some of the higher levels are murky and ill-conceived. Straightening out the details, answering the tough calls, and solving the corner cases can be thankless work, but someone has to do it.