What's Next for HTML?

September 4, 2002

Micah Dubinko

HTML has always been the predominant data format of the Web, and it has influenced the design of nearly every XML vocabulary ever written. So it's a little disappointing to see the languid adoption of XHTML, which takes HTML from its SGML heritage into the world of XML. Partly this is because XHTML 1.0 dealt only with the XML-specific mechanics of the transition, and version 1.1 addressed only modularization -- ways to combine bits and pieces of XML languages. Benefits associated with switching to XHTML 1.x are modest at best.

XHTML 2.0, in contrast, marks the beginning of a new phase of development for HTML. The key goal in version 2.0 is to factor away the accumulated baggage in the language while retaining the familiarity and simplicity that have always been associated with HTML. This means that the language will improve in ways that aren't backwards compatible. This article takes a deeper look at some of the new features in XHTML 2.0, based on the initial public W3C documents of XHTML 2.0 and XFrames, dated 5 and 6 August 2002 respectively, and the further developed XML Events and XForms 1.0 drafts dated 12 and 21 August 2002 respectively.


The two biggest areas in need of improvement are forms and frames, both of which are now independent W3C specifications that XHTML 2.0 includes by reference. Another article discusses the XForms 1.0 specification, which in XHTML 2.0 completely replaces <form> and the associated patchwork of form-related tags. XForms was recently republished incorporating Last Call public comments and will shortly move to the Candidate Recommendation phase to build implementation experience.

The XFrames specification replaces the <frameset> element, which was removed entirely from XHTML 1.1. XFrames is actually a new document type, and not a part of XHTML itself; it can be used with XHTML, SVG, or nearly any other web language. The concept is simple: an XFrames document defines only the basic description of how to compose multiple documents into a single view, like this example with a row containing two columns:

<frames xmlns="">
  <style> ... </style>
    <frame id="navigation"/>
    <frame id="main"/>
</frames>

Each <frame> may contain a default URI, but in general, deciding what to render in each frame is accomplished by a special URI fragment syntax, with a mapping from each frame to a relative or absolute URI. For example: ...,main=main.html)

The great thing about this design is that framesets are fully bookmarkable, the "back" button in browsers will work as expected, and it's harder for sites to "framejack" (use frames to represent someone else's content as their own), since the URLs are more visible. Details, like the sizes and borders of the frames, are all defined in a style sheet language such as CSS. The specification makes a point of not limiting the presentation to ordinary non-overlapping frames seen on the Web today. For instance, each frame could be a movable window-like page, or a set of tabs -- again under control of a style sheet. To encourage this kind of separation, the XFrames language includes a <style> element to contain the not-yet-specified CSS that can control such presentation details.

XML Events

Another, more general problem in markup languages is the proliferation of scripting and event-specific syntaxes. For example, a typical HTML interface to a search engine might include code like this (while SVG and SMIL have similar but incompatible syntaxes):

  <script type="text/javascript"><!--
      function init()
      { document.frm.query.focus(); }
      // --></script>
  <body onload="init()">

For one thing, this requires a specific attribute onload to be part of the HTML language, so any change to the event model would require a change to the language. Worse yet, it's awkward to describe what scripting language should be used to interpret the attribute. And why should script be needed at all for such a simple function?

XML Events addresses all of these problems with a declarative syntax that builds directly on DOM Level 2 Events. In this system, every event begins at the root of the DOM tree and "propagates" down to a particular target node in a process called the "capture phase". Any node on the path between the root and the target can act as an "observer", which can trigger declarative actions (including script), stop the event from continuing, or block the default action of the event. An additional phase, called the "bubbling phase", occurs after the event reaches the target, as it travels back up to the root node. In many cases, however, the interesting things happen right at the target element, and thus both the capture phase and the bubbling phase can be ignored.

In the search engine example, the "load" event makes its way to the <body> target node. A "handler", such as <xforms:setfocus>, can cause some action to occur in response to the event. The following example shows this:

  <xforms:setfocus control="query" id="focus_handler"/>
  <listener event="load" observer="body_id" handler="#focus_handler"/>
  <body id="body_id">

(Note: The initial draft of XHTML indicates the presence of the <listener> element, as well as XForms elements, but doesn't indicate where the elements are allowed.)
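The observer capabilities described earlier (acting during the capture phase, stopping the event, and blocking its default action) correspond to optional attributes on <listener>. The fragment below is a sketch rather than a tested example: the ids "nav" and "veto" are hypothetical, and the attribute names and values (phase, propagate, defaultAction) are given as they appear in the XML Events specification, so the draft discussed here may differ in detail.

```xml
<!-- Hypothetical ids: "nav" is assumed to be an ancestor of whatever element
     the user clicks, and "#veto" refers to a handler defined elsewhere. -->
<listener event="click" observer="nav" handler="#veto"
          phase="capture"
          propagate="stop"
          defaultAction="cancel"/>
<!-- phase="capture":         run the handler on the way down, before the target
     propagate="stop":        the event travels no further after this observer
     defaultAction="cancel":  the event's default action is not performed -->
```

Leaving all three attributes off gives the common case the article describes, where the handler simply runs at the target or as the event bubbles back up.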

XML Events also defines various defaulting scenarios, using "global attributes", that alleviate the need for a separate <listener> element. The event attributes can be placed on the observer element:

<body ev:event="load" ev:handler="#focus_handler">

Or on the handler element:

<xforms:setfocus control="query" ev:event="load" ev:observer="body_id"/>

In cases where the handler is a child element of the observer, the only needed attribute is ev:event, a pattern that is frequently used in XForms. No matter what the syntax, the ability to use declarative markup instead of scripting is a welcome change for anyone who has had to write or maintain web pages, and the end result works much better in non-visual browsers and accessibility tools.
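A minimal sketch of that child-handler pattern, reusing the search engine example. It assumes that <xforms:setfocus> may appear directly inside <body>, which the initial draft does not actually confirm, and it omits the namespace declarations for the xforms: and ev: prefixes, as the other examples here do.

```xml
<body>
  <!-- No ev:observer or ev:handler is needed here: the parent <body> is
       taken as the observer, and this element itself is the handler. -->
  <xforms:setfocus control="query" ev:event="load"/>
  <!-- rest of the page, including the control identified as "query" -->
</body>
```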

At the time of writing, one piece that is still missing is a promised Part 2 of XML Events, which will define specific event handlers to replace many common uses of script. Future versions of SVG and SMIL will probably use both the XML Events framework and the common handlers.

Changes to XHTML

Changes to the main language in XHTML 2.0 have been covered previously (in an XML-Deviant column, "XHTML 2.0: The Latest Trick"). Still, it's interesting to look for patterns in the changes so far:

<section> and <h> elements
This design favors structural markup that encloses an entire logical unit, rather than <hn> elements that just mark the boundaries. This nudges authors in the direction of more meaningful markup.
Navigation Lists (<nl>)
Declarative navigation alone could remove the need for scripting from millions of web pages. This will be a huge win for producers and consumers alike.
<applet> and <img> removed in favor of <object>
HTML specifications have been suggesting this for a number of years, and now it's official. The <object> element is more capable than the alternatives it replaces, for instance allowing multiple levels of fallback, and removing two unnecessary elements simplifies the language.
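The three changes above might look like the following in practice. This is an illustrative sketch, not markup taken from the draft: the file names and text are made up, the <name> child used to title the navigation list is an assumption about the initial draft's content model, and href on <li> relies on the proposal to allow linking attributes on almost every element.

```xml
<!-- Nested sections: each <h> takes its level from its depth, not from a number -->
<section>
  <h>Products</h>
  <p>Overview of the product line.</p>
  <section>
    <h>Widgets</h>
    <p>This heading is one level deeper because its section is nested.</p>
  </section>
</section>

<!-- Navigation list: a declarative menu with no script -->
<nl>
  <name>Site map</name>
  <li href="index.html">Home</li>
  <li href="products.html">Products</li>
</nl>

<!-- <object> in place of <img>, with two levels of fallback:
     SVG first, then PNG, then plain text -->
<object data="logo.svg" type="image/svg+xml">
  <object data="logo.png" type="image/png">
    Example Corp logo
  </object>
</object>
```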

The aspect of XHTML 2.0 that has gathered the most attention (and which was reviewed in a recent XML-Deviant column, "The Absent Yet Present Link") is something that is unchanged from the earliest versions of HTML: the use of href for hyperlinks, rather than the different, incompatible syntax recommended by XLink. This has come to the fore in XHTML 2.0, since backwards compatibility is less of a goal, and since linking attributes are proposed to be added to almost every element. The HTML Working Group has promised to release information on the technical reasons for this approach, as part of a specification that has come to be called "HLink". A future article will take a look at HLink, XLink, and their relationship to XHTML.

Using XHTML 2.0 Today

Modern browsers, including Internet Explorer, the Mozilla family, and Opera, have been moving toward universality, providing hooks for generic XML processing rather than using hard-coded semantics of HTML. Sjoerd Visscher has capitalized on this, providing a demonstration of XHTML 2.0 on current browsers. His implementation uses a combination of behaviors on Internet Explorer, XBL on Mozilla, and special linking styles on Opera.

Sjoerd has also done some good work on an XSLT conversion of XHTML 2.0 into something renderable on both Mozilla and IE.

Browser vendors are beginning to release information on their plans to support XHTML 2.0 and family. Opera has been working on an improved layout engine, which their standards engineers claim will support XHTML 2.0 by the time it becomes a W3C Recommendation. The Mozilla project has feature requests for XHTML 2.0, XForms, XFrames, and XML Events. Interested parties can register their desire for these features to be implemented by visiting these links and clicking on "vote for this bug".

Future Directions

While XForms and XML Events are more or less fully developed, so far only the initial Working Drafts of XHTML 2.0 and XFrames have been released. It's natural to wonder what kinds of changes are still in store. Based on a large amount of activity on the public www-html mailing list, a few topics that have been proposed are:

  • Whether to allow lists and <object> to have a caption.
  • Whether to use RDF or a simpler RDF-compatible format for metadata.
  • Possible new elements <number>, <notice>, and <footer>.
  • Whether or not to keep <hr>, and whether it is presentational or semantic in nature.
  • Whether to allow <title> within the body section.

While it may not make it into the official specification, Masayasu Ishikawa has posted some unofficial RELAX NG schemas for various XHTML 2.0 modules.

The ultimate outcome of each of these issues is in the hands of the HTML Working Group. They welcome feedback and discussion on the www-html mailing list. (This is a subscription list. To subscribe, send an initial message with the subject "subscribe".) A new roadmap shows the estimated time frame for the different pieces of the HTML family.

What You Can Do

The question on everyone's mind is whether XHTML 2.0 has any chance at widespread adoption, especially since XHTML 1.x still isn't properly implemented in mainstream browsers. The two basic approaches to evolving a markup language are to maintain backwards compatibility or to make a clean break. Until recently, the W3C pushed the compatibility approach to its limits. The result has been a document format that has costs associated with switching to it, but relatively limited benefits. By making improvements without the shackles of strict compatibility, XHTML 2.0 is now able to provide enough benefit to justify the cost of switching.


The main disadvantage of the clean break approach is the resulting chicken-and-egg problem -- if no browsers support XHTML 2.0, why write web pages in it? And if there are no web pages written in it, what incentive do browser vendors have to support it? Both of these questions need to be addressed.

As this article has shown, XHTML 2.0 isn't that incompatible with modern browsers, which are already flexible enough to handle much of XHTML 2.0. (In fact, this article was written, prepared, proofread, and distributed in XHTML 2.0 format.) The prospect of clean, declarative markup over custom-written scripting and other hacks gives webmasters ample motivation to begin experimenting with XHTML 2.0 and to start asking their favorite browser vendor when they will natively support the standard.

From a growing murmur of discontent with HTML 4.x and XHTML 1.x, it's clear that demand for better client-side web solutions is building to a critical level. Will this need be filled with an open standard like XHTML 2.0, or will a closed proprietary solution fill the gap? This will be a key question over the next year. The answer will depend in part on the choices you make.