XML.com: XML From the Inside Out

In this week's XML-Deviant, we take a look at two conversations on the XML-DEV mailing list that highlight XML's disruptive aspect -- more specifically, the disturbance XML can cause to the dominant incumbent in a technology area in which XML is being introduced.

XHTML

XHTML is the evil/liberating tool of a bunch of religious maniacs/lazy developers/visionaries intent on confusing/improving/revolutionizing the lives of hapless/worthy/troublesome web developers, aided and abetted by broken/customer-supporting/pragmatic web-browser implementations.

Or so we learn in a thread started off by Len Bullard about embedding XML in HTML and the rules for its processing. More entertaining than the mechanics of Len's trouble was the bile spilled onto the list from XHTML-haters.

It's an interesting question: should web browsers really be processing XML? Isn't their domain the broken mishmash we call HTML? Leave it to other applications to deal with XML. Joshua Allen speaks, and opens the case for the prosecution of XHTML:

I'm really suspicious of calls to rewrite a better, XML-aware browser. "Better" is always in theory; in practice you end up with tons of bugs and unintended consequences. XHTML purists have had a number of years to prove their thesis, and all they have proved is that the "pure" way results in yet more complexity for web developers, no appreciable benefit, and new bugs.

It's tough for Microsoft employees such as Allen to mention bugs, as Elliotte Rusty Harold points out:

I'm not sure what thesis you think the so-called XHTML purists are trying to prove, but the new bugs seem mostly to be in Microsoft products and the benefits are quite clear.

What benefits? Harold goes on to argue the benefits of content that is both human and machine readable.

XHTML vastly simplifies machine processing for all sorts of purposes. For instance, I could not generate RSS feeds for my web sites if they were not well-formed XML ... if more data were available in well-formed HTML, I could do more cool things with it. For instance, why should Amazon/Google/eBay, etc. have to provide separate interfaces to the same data for web services and for browsers? Why can't one suffice for both? If they were using XHTML or XML, instead of HTML, one set of pages would serve double duty.
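Harold's point about generating feeds from well-formed pages can be illustrated with a short sketch. The page content, the `class="item"` convention, and the `page_to_rss` helper below are all hypothetical and are not taken from Harold's sites; the sketch only assumes that the page is well-formed XHTML, so an ordinary XML parser can read it directly. The resulting feed is deliberately minimal and omits channel elements that a complete RSS 2.0 feed would carry.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical page: a well-formed XHTML document with two news items.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example News</title></head>
  <body>
    <div class="item">
      <h2><a href="http://example.org/a">First story</a></h2>
      <p>Summary of the first story.</p>
    </div>
    <div class="item">
      <h2><a href="http://example.org/b">Second story</a></h2>
      <p>Summary of the second story.</p>
    </div>
  </body>
</html>"""


def page_to_rss(xhtml_text):
    """Build a minimal RSS-style feed from a well-formed XHTML page."""
    ns = {"h": XHTML_NS}
    # A strict XML parser is enough here; no tag-soup repair step is needed.
    root = ET.fromstring(xhtml_text)

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = root.findtext(
        "h:head/h:title", namespaces=ns
    )

    # Each <div class="item"> (an assumed page convention) becomes one <item>.
    for div in root.iterfind(".//h:div[@class='item']", ns):
        link = div.find("h:h2/h:a", ns)
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = link.text
        ET.SubElement(item, "link").text = link.get("href")
        ET.SubElement(item, "description").text = div.findtext(
            "h:p", namespaces=ns
        )

    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(page_to_rss(PAGE))
```

The same page serves the browser and the feed generator, which is the "double duty" Harold describes; with tag soup, the `ET.fromstring` call would fail before any extraction could happen.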

Ah, screen-scraping. "Dubious at best," says Dare Obasanjo. Worse, cries Allen, it's narcissism! Laziness might be nearer to the mark. Allen continues to the heart of his complaint:

For people who really want re-purposable data; we already have capability to do XML+XSLT+CSS. My RSS feed and OPML feed are both pure XML (no XHTML crap) and render nicely in IE and Mozilla. XHTML is a Frankenstein.

Mike Champion, who if he ever founds a monastic order will be held in history as St. Michael of the Pragmatists, agrees that developers are lazy, but suggests that this might still be preferable:

It's amazing how much ingenuity goes into doing useful things with tag soup and minimal metadata, but how much real benefit we get from all that lousy HTML via browsers, search engines, script applications, etc. But maybe it is more globally efficient to have a small group of developers learn how to make sense out of tag soup than to force the masses to deal with the very real pain of full standards compliance.

He goes on to address what is probably the root of the argument, differing opinions on what XHTML is actually for:

Still, the point of XHTML is not so much to be a stopgap but to bring some rigor to content so that all sorts of XML technology can be thrown at it. Screen-scraping is just one use case, there's also querying, transformation, syndication, content re-use (without worrying about the HTML escaping hassles), web-services enablement ... for just about every XML infrastructure spec, there's a plausible scenario in which having web content in XHTML enables all sorts of interesting things without relying on tidy or a tag soup parser to build a clean syntax or info-set.
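The "rigor" Champion mentions comes down to well-formedness: a generic XML parser either accepts the content or rejects it, with no guessing. The following is a minimal sketch of that distinction using Python's standard library; the two markup strings and the `is_well_formed` helper are illustrative assumptions, and the check covers well-formedness only, not validity against an XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Hypothetical samples: the same content as XHTML and as tag soup.
XHTML = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>One<br/>two</p></body></html>"
)
TAG_SOUP = "<html><body><p>One<br>two<p>three</body></html>"

if __name__ == "__main__":
    print(is_well_formed(XHTML))     # accepted by the XML parser
    print(is_well_formed(TAG_SOUP))  # rejected: unclosed <br> and <p>
```

Content that passes this check can be handed straight to querying and transformation tools; content that fails needs tidy or a tag-soup parser first, which is the dependency Champion wants to avoid.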

It's time to see some give and take, says Champion, and realign some of the W3C Recommendations with reality -- a "refactoring" of the specs to accept real-world concerns and also to reinvigorate the vision enabled by standardization.

Does ebXML Simplify EDI?

"There's no XML in complexity" is the fallacious cry we often hear from eager XML adopters embarking upon the re-engineering of their domain in XML. One user is certainly suspicious of these claims about the electronic business technologies ebXML and EDI.

Is it really a fallacy? I for one was convinced by ebXML advocates at XML conferences who lauded the increased equality of opportunity that ebXML afforded.

Peter Hunsberger indicates that there's little advantage for existing EDI users with the XML-ification of EDI. Not too surprising. But he also points out that ebXML's lower cost of implementation is a key factor:

Bottom line, for me, was that if you already had a (good) EDI exchange mechanism in place there was little benefit to XML. To clarify my parenthetic qualification; it depended a lot on your tools, many were hard to adapt to new business areas. However, the lower cost of entry for XML-based technologies allows it to displace EDI technologies from the bottom up and horizontally.

The assertion that ebXML opens up electronic trading to smaller businesses is still contested, though. I'm sure we'd all like to think it does, but it may not be through simplicity of specification that it achieves its aims. According to Dale Moberg:

There was originally some hope that ebXML specifications could be used to produce a solution more attractive to small- and medium-sized businesses that would be less trouble to use ... Actually ebXML ended up defining functionality beyond what is usually defined by EDI standards. So from that standpoint, it is not simpler. Whether it promotes simpler solutions for end users is debatable, but there are vendors pursuing simplified ways to make use of ebXML under the covers (of forms) so that the complexity of ebXML is largely concealed from end users. I think it is safe to say that it is probably not simpler for implementers (by implementers, I mean the software vendors or open-source providers, not the end-user deployers).

Additionally, though not mentioned in this debate, one large factor in ebXML's favor is its ability to function over the Internet rather than over expensive closed networks. With both ebXML and XHTML, simplification isn't really the touchstone; it's more about future opportunity.

Births, Deaths, Marriages

The latest announcements from XML-DEV.

Piccolo

Lesser-known Java XML parser returns from the dead, fixing bugs, improving performance, and moving to an Apache license.

Oxygen XML Editor 4.2

Commercial schema-aware XML editor and XSLT editor/debugger. New features include presentation of schema information while browsing and editing an XML document.

Second Semantic Technologies for eGov Conference

Call for papers for conference focused on using semantic web technologies for e-government.

Mark Logic XML Query Engine

Free license to run a 50MB-limited version of the XQuery engine available.

RFC 3023 Redux

Death to text/xml! XPointer added as a fragment identifier for application/xml. XBase recognized for specifying base URIs.

XMLOpen 2004 Program Available

UK-based XML and open source conference. Star speaker line-up, with plenary sessions from Rick Jelliffe, Jeni Tennison, and Sean McGrath.

XQuery and XSLT Interim Working Drafts

Incorporate changes made so far due to issues received from the Last Call working drafts.

SEC Initiative to Assess Benefits of Tagged Data in Commission Filings

Will the SEC accept filings in XBRL?

XML for Binary Interchange, Addressing Machine-to-Machine Interoperability & Tactical and Mobile Computing

Ironically, space is limited at the conference with the longest title ever.

Scrapings

World-Wide-Wait reimplemented with SOAP ... 2.5 hours of solitude ... Mails to XML-DEV last week 128, Len rating 10% ... Riddle me this -- one of these namespace technologies is not like the other ... Stumbling up against XML-DEV's anti-verbosity measures ... you know XML has made it when ... it takes off like this.



  1. Ignorant or Forgetful
    2004-08-06 08:31:40 
    None of the posts touched on the very real and beneficial concept of using XML/XSLT to transform XHTML documents into other XHTML documents. This is a great advantage when you have a group of designers creating the pages and the data is to be populated later from a back end or the page needs to be localized. There are also benefits in scripting and CSS because in HTML you don't need to close the tags so the browser is guessing where the end tag is. This guesswork can throw off CSS calls and script calls and cause the developer extra work. While HTML CAN be well formed there's no good way to verify that it IS well formed. Tidy and widely available XML tools make it easy to validate and check XHTML files for errors. 


    I don't understand how some developers will say "Well, I can create well-formed documents with HTML" and then go on to bash XHTML. If you can create well-formed documents, then what is the problem with using XHTML? It's not complicated; it's not rocket science. It sounds to me as if the lazy people aren't on the XHTML side of the argument but on the XHTML-haters' side.

  2. Anti Verbosity Measures
    2004-07-29 07:13:22 Len Bullard
    It's good to know BigBubba has arrived at OASIS.  Better late than never.


    The funny bit here, Edd, is how many don't notice that XML and HTML are incompatible while article after article on the web insist that XML is the child of HTML.


    http://www.intersystems.com/cache/technology/whitepapers/hybrid.html


    has the usual errors.


    Once the kudzu is in the fields, you can never quite kill it off, and that is why XHTML finds little fertile ground. YAGNI.


    OTOH, the concluding remarks in the thread were more interesting. In summary, the web browser as a development platform isn't the only contender. If we want full XML conformance, predictable and reliable implementation, then the HTML-centric browsers must give way to the next generation of systems; yet even XML is just a side show. Systems such as Longhorn with built-in rules for extensibility are the future. The browser wars are over and the framework competitions are beginning. This has implications in the device space given a push toward a richer interface that won't play well on platforms where the limiting factor is the batteries, not pixel space. For all the talk of convergence, quite the opposite is probable.


    len
