Gentrifying the Web

September 13, 2000

Leigh Dodds

Taking a look at XHTML, the XML-Deviant finds that although the W3C HTML Activity is moving forward, the rest of the web is still lagging behind.

XHTML and Beyond

Few XML developers haven't heard of XHTML. Ask one of them to describe what it is, and the answer you get will be something like: "it's just HTML 4 expressed in XML." This is quite true, but it captures only the first phase of a much more ambitious task. Take a look at the W3C HTML Activity page, and you'll discover that XHTML 1.0 is just the first step down a long road that is set to radically alter the face of the Web.

Following the XHTML 1.0 specification comes a suite of documents that describe how XHTML can be divided up into a set of modules, and how one goes about defining new modules. This adds a great deal of extensibility to the language, allowing user-defined subsets and extensions to be created within a common framework.

Modularization is the real launching point for the XHTML effort, and the foundation upon which later versions of the language will be built. XHTML 1.1 and XHTML Basic are the first two "flavors" of XHTML to be built using this modular architecture. XHTML 2.0 is already on the horizon, with a draft planned for as early as the end of the year. This will merge in other standards (e.g. XLink), requiring rewrites and extensions to many modules.

One of the driving forces behind XHTML is the wide range of new types of devices which will be using it as a content markup language. XHTML Basic is specifically targeted at these devices. The ultimate aim is a system in which

... software processes can autonomously tailor content to a device, or the device can automatically load the software required to process a module

While no one underestimates the scale of this task, there's a big problem: the web authoring community is already a long way behind, even at these early stages.

Image Problem

Commenting on the XHTML mailing list, Molly Holzschlag pointed out that XHTML still hasn't made significant penetration into the web community:

If anyone on this list believes that XHTML 1.0 -- much less modularization and other concepts under discussion -- is being employed and studied by, or in some cases even known to, the majority of professionals in the industry (excluding the high-end developers and standards-oriented few), it's time for me to pour you a fresh cup of my extra-strong coffee.

There's also an intriguing argument to be made that the move to XML applications is in effect a gentrification of the Web -- making it the domain of programmers. While that may seem attractive in terms of well-built technology, there's the paradox that the Web of the people, by the people, will no longer be as an accessible reality, something that breaks my poor grass-roots heart.

Holzschlag continued by expressing support for the standards initiative but warned that dismissal of current technologies and practices could do more harm than good:

... I honestly believe that to be too casual about methods that people have learned to use and are using today is a position that does nothing to help those good people of the world doing their jobs. If anything, it scares the hell out of people. And that does very little to help -- and perhaps even limits -- the idealistic goal of creating more effective and appropriate technologies.

Michel Rodriguez characterized the effort to move the Web to XML as an attempt by IT departments to regain control of the web by increasing complexity:

XML strikes me as being somehow the revenge of IT departments, which were taken completely by surprise by the success of the web: they found that users had created web sites, including a huge amount of ad hoc CGI's in whatever language they felt like using. XML offers those IT people a wonderful opportunity to wrestle the power back from the hands of unsuspecting users.

Views like this suggest that content authors are still far from clear on the benefits of XHTML. Noting that XHTML will eventually incorporate a host of other XML specifications, one can sympathize with this viewpoint. Suddenly, there's a whole lot more to learn.

Disappointed with the growth of XML on the web, Simon St. Laurent argued that the XML and Web communities have yet to talk to each other effectively:

I think the folks "selling" XML and XHTML have done a really bad job of it, at least with regard to Web developers. After about ten tantrums on XML-Dev, I did get people to stop using Web developers as a synonym for huddled masses, but I haven't really seen the XML community reach out to the Web community in a significant, sustained way.

[It's] going to take a lot more work, a lot more consideration of the 'grass roots,' and a lot more of "here's how your current skills will help you" rather than "here's why your current practices suck."

Browser Blues

While XHTML does have some tool support, in many cases the tools are developer-oriented rather than user-oriented. Because XHTML 1.0 is just the foundation of a new architecture, it offers little to the content author. The mass of invalid HTML on the web speaks volumes about the perceived importance of validation -- a viewpoint that will have to change if the future of XML on the web is to be realized.

The point at which XHTML 1.1 becomes supported in a popular browser is the point when serious use is likely to begin. The requirement that content must be well-formed and, hopefully, valid will be a welcome side effect of embracing the new features browser support for XHTML will offer.
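To make the well-formedness requirement concrete, here is a minimal sketch in Python using the standard library's XML parser. The two markup strings and the helper name is_well_formed are illustrative assumptions, not taken from any specification or tool. The sketch tests well-formedness only; checking validity against an XHTML DTD would need a validating parser.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Common HTML 4 habits: an unclosed <p> and a <br> with no closing slash.
html_style = "<html><body><p>Hello<br>world</body></html>"

# The same content with every element explicitly closed, as XHTML requires.
xhtml_style = "<html><body><p>Hello<br />world</p></body></html>"

print(is_well_formed(html_style))   # the XML parser rejects this one
print(is_well_formed(xhtml_style))  # the XML parser accepts this one
```

A browser's HTML parser would typically render both strings; an XML parser stops at the first error in the former, which is the change in habits that authors would have to absorb.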

Basic XML and CSS support in browsers is far from consistent, so support for XHTML 1.1 seems a long way off. Some XML browsers are available, although they have yet to gain significant market penetration, and again seem primarily intended for developers. Peter Murray-Rust has long campaigned on XML-DEV for the development of an open source XML browser. Mozilla may fill that niche eventually, although personal experience has shown that it still needs a lot of work just on usability. When XHTML 1.1 reaches Candidate Recommendation, one hopes that the W3C will extend their Amaya web browser as a basis for a trial implementation. Amaya already supports XHTML 1.0.

Another issue for the XML community to consider is exactly how a browser should "automatically load the software" required to process an XHTML module. This may not be a technically difficult problem to solve, especially in languages like Java. But if it isn't standardized, then emerging browsers will use proprietary techniques. If we have learned anything from the recent URI, Namespace and Schema debate, it's that this will probably be a difficult standard to agree upon. But at least that will give the rest of the Web a chance to catch up.
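As a rough illustration of why loading module software on demand is not the hard part, the following Python sketch keeps a registry that loads a handler the first time a module is requested. Everything in it is hypothetical: the ModuleRegistry class, the "forms" module name, and the handler itself stand in for whatever a real browser might use, and no standard defines such an interface.

```python
from typing import Callable, Dict, List

# A handler takes a piece of module content and returns processed output.
Handler = Callable[[str], str]


class ModuleRegistry:
    """Maps module names to handlers, loading each handler on first use."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable[[], Handler]] = {}
        self._loaded: Dict[str, Handler] = {}

    def register(self, name: str, loader: Callable[[], Handler]) -> None:
        """Record how to obtain a handler without loading it yet."""
        self._loaders[name] = loader

    def handler_for(self, name: str) -> Handler:
        """Return the handler for a module, loading and caching it if needed."""
        if name not in self._loaded:
            if name not in self._loaders:
                raise LookupError(f"no handler registered for module {name!r}")
            self._loaded[name] = self._loaders[name]()
        return self._loaded[name]


load_log: List[str] = []  # records each time a loader actually runs


def load_forms_handler() -> Handler:
    """Hypothetical loader; a real one might fetch code over the network."""
    load_log.append("forms")
    return lambda content: f"[forms] {content}"


registry = ModuleRegistry()
registry.register("forms", load_forms_handler)

first = registry.handler_for("forms")("login form")
second = registry.handler_for("forms")("search form")
print(first, second, load_log)
```

The mechanics fit in a few lines. What the sketch leaves open, and what a standard would have to settle, is how modules are named, where their software is found, and how far it should be trusted.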