XML.com: XML From the Inside Out


Forming Opinions


April 20, 2005

"Tao has reality and evidence but no action or physical form." — Chuang Tzu

Nobody looks forward to forms. Stop a person on the street and ask them what they think about forms and you'll get an earful. Curiously, though, in XML circles forms hold a great deal of interest. Admittedly, not the filling of forms per se, but the technology involved.

Forms and XML have a special affinity. Both are embodiments of structured data exchange. Completing, say, a vacation request by writing on a blank sheet (or Word document) is far less efficient than filling out a form that asks the right questions. Getting rid of paper also proves popular with technology fans. The same holds for data exchange. The right rules and constraints, as embodied in XML syntax and specific vocabulary schemas, yield a vast improvement over an "anything goes" policy of interchange.

Recently, the W3C published a new Member Submission: Web Forms 2.0, or WF2. The numbering assumes that version 1.0 is the forms chapter of HTML 4.01 plus some DOM interfaces, which I collectively call "classic forms". To be clear, the Submission process is designed "to propose technology or other ideas for consideration by the Team" (that is, W3C staffers). Unlike documents on the Recommendation track, Submission status doesn't imply any future course for the W3C or any endorsement of the content. A Submission hasn't run the gauntlet of broad participation, review cycles for conformity with the W3C vision, accessibility, international-friendliness, web architecture integration, or Intellectual Property Rights claims: all the things that make official W3C specifications take so long. A Member Submission is just an idea in writing.

WF2 isn't the first forms-related Submission to the W3C. XFDL from UWI (now PureEdge), XFA from JetForm (now Adobe), Form-based Device Input and Upload in HTML from Cisco, and User Agent Authentication Forms from Microsoft were also submitted back in the heady days of 1998 and 1999. These documents were not taken directly to the Recommendation track, vendor wishes notwithstanding, but did provide useful information for what eventually developed as a standard. A segment of the HTML working group considered the existing forms landscape and with W3C approval eventually developed the XForms 1.0 Recommendation.

Exploring Forms

To really get a feel for this new specification, I need to get my hands on some running code as a reference point. I'll start with some basic code and see how WF2 could help. The code is a simple client-side forms framework that I'm calling FormAttic. For full details beyond the brief summary here, I've set up a Wiki page, and am releasing the code under a Creative Commons license.

The client I wrote this for is non-technical, and needs to be able to easily change things around. Thus the key feature of FormAttic is that it uses a declarative technique to record author intent, and is highly amenable to cut-and-paste modification. While it is based on an "executable definition" in JavaScript code, markup could easily be substituted in its place.

Intentional JavaScript

function configure_required(require, require_one_of) {

  require("Please include your last name", "lname");
  require("Please include your first name", "fname");
  require_one_of("Pick your favorite colors", "red", "green", "blue");
}

In FormAttic, no special markup other than id attributes is needed on form controls.

Here require and require_one_of are actually functions passed in to the configuration method configure_required, which establishes a context for the declarations. The first parameter of each is a message to be shown if the validation fails, and subsequent parameters name the controls that are getting marked as required. Notice that require_one_of is an instance of a general case of selecting one or more items from a list — something that requires extra scripting in classic HTML forms. This format is easily extended to cover additional form declarations, such as data types. The remainder of the library consists of initialization code, event handlers that get attached in the right places, and auxiliary functions.
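To make the calling convention concrete, here is a minimal sketch of how a library could hand those two functions to configure_required and record what gets declared. It is not FormAttic's actual source: the name collectRules and the shape of the rule objects are assumptions made for illustration.

```javascript
// Hypothetical sketch, not FormAttic's real code: collectRules and the
// rule-object layout are assumed names chosen for this example.
function collectRules(configure) {
  var rules = [];

  // Marks each named control as individually required.
  function require(message) {
    var ids = Array.prototype.slice.call(arguments, 1);
    for (var i = 0; i < ids.length; i++) {
      rules.push({ kind: "required", message: message, ids: [ids[i]] });
    }
  }

  // Marks a group of controls of which at least one must be chosen.
  function require_one_of(message) {
    var ids = Array.prototype.slice.call(arguments, 1);
    rules.push({ kind: "one_of", message: message, ids: ids });
  }

  // The configuration function only declares intent; it never touches
  // the document itself.
  configure(require, require_one_of);
  return rules;
}

function configure_required(require, require_one_of) {
  require("Please include your last name", "lname");
  require("Please include your first name", "fname");
  require_one_of("Pick your favorite colors", "red", "green", "blue");
}

var rules = collectRules(configure_required);
```

Because the declarations end up as plain data, the same list could later drive event handlers, error messages, or some other form of the rules such as markup.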

The first part of the WF2 document describes the goals and scope of the specification and relationships to existing materials at length. These will be covered in a later installment. For now, let's skip section 1 and get right into the next section — and the code.

Extending Forms

Section 2 starts off saying "At the heart of any form lies the form controls", which seems like a poor choice of words. Many behind-the-scenes structured data exchanges, including tasks currently done with XMLHTTP, can be implemented as a form without controls. The W3C lists a current requirement for just this feature. In practice, the primary definition of a form is a model, often as part of a Model-View-Controller pattern; form controls are secondary. It's hard to tell whether this sentence is a minor oversight barely worth mentioning, or a hint of deeper underlying assumptions. Keep an eye on this as we proceed.

One important class of changes in WF2 consists of newly added attributes that older clients will rightly ignore as unknown. Included in this list is the required attribute. For FormAttic, this could be implemented by a tiny adjustment to each required form control:

Required control in WF2

<input type="text" name="fname" id="fname" required="required" />

Browsers that support WF2 would "just work" with this change. But to support older browsers, an attached script would need to locate all such attributes and configure things so that the form won't submit until the required controls are satisfactorily filled. Essentially, this is what the FormAttic script already does. The only difference is where the description of the required state is kept. FormAttic keeps it all in one place, and WF2 spreads it out across the form control markup. Which is better?

The answer, of course, is "it depends". If the document markup is simple or fairly clean, there wouldn't be much difference between the two. On the other hand, in a document riddled with nested tables, font tags, and all manner of non-semantic tag soup, having everything in one place — generally towards the top of the document — is a big plus. Indeed, this was a major motivation for the design of FormAttic.
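For the older-browser case, the attached script that hunts for required attributes might look roughly like the sketch below. The name findMissingRequired is hypothetical, only text-style controls are handled, and the check assumes getAttribute returns null when the attribute is absent.

```javascript
// Hypothetical fallback for browsers without WF2 support.
// findMissingRequired is an assumed name; controls is any array-like
// collection of form controls, such as form.elements.
function findMissingRequired(controls) {
  var missing = [];
  for (var i = 0; i < controls.length; i++) {
    var control = controls[i];
    // Skip anything that does not carry a required attribute.
    // Assumption: getAttribute returns null for an absent attribute.
    if (!control.getAttribute || control.getAttribute("required") === null) {
      continue;
    }
    // Only text-style controls are covered: an empty value counts as missing.
    if (control.value === "" || control.value == null) {
      missing.push(control.id || control.name);
    }
  }
  return missing;
}

// Possible wiring in a browser (left as a comment so the sketch
// also runs outside one):
// form.onsubmit = function () {
//   return findMissingRequired(form.elements).length === 0;
// };
```

Returning the list of offending ids, rather than a bare true or false, leaves room to show a message next to each control.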

Another difference is in the DOM interface. Under WF2 the required state of a form control would be available for inspection or modification under a validity property of the form control object. But again, for older browsers, the script would need to do feature detection and in cases where the property isn't implemented, go off and do its own thing. Less adroit scripters might even fall back into old habits of browser version sniffing. On the other hand, for browsers that lack a validity property, it would be possible with some care to implement one in client-side script.
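Feature detection of this kind could be sketched as follows. The helper name isValueMissing is hypothetical; validity and its valueMissing flag come from the WF2 draft, and the fallback branch is an assumption that covers only text-style controls.

```javascript
// Hypothetical helper: use the WF2 validity object when the browser
// provides one, otherwise fall back to a script-side check.
function isValueMissing(control) {
  // Detect the feature itself rather than sniffing the browser version.
  if (control.validity && typeof control.validity.valueMissing === "boolean") {
    return control.validity.valueMissing;
  }
  // Fallback for older browsers: a required control with an empty value.
  // Assumption: getAttribute returns null for an absent attribute.
  var required = control.getAttribute("required") !== null;
  return required && control.value === "";
}
```

Callers ask one question and never need to know which branch answered it, which is the main argument for detection over version sniffing.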

Forming an Opinion

A solid case could be made for either all-in-one-place configuration, as in FormAttic, or spread-out-among-form-controls configuration, as in WF2. The relative benefit of either option is outweighed by the benefit of working with whichever one has wider support in the overall community. My experience pushes me in the direction of keeping things separate, but I'm keeping an open mind and am willing to be persuaded.

As things worked out, I wasn't involved in the production of WF2 to date. As I prepare these columns, I really am forming initial opinions of the technology. Next week, as I continue reviewing WF2, I will focus on the many things I like in the specification.

Births, Deaths, and Marriages

Topologi Difference Detective

A US$29 utility for a broad range of XML-diff actions.

Sun Java Streaming XML Parser

An implementation of JSR, available at java.net.

XPL Draft Specification

A first draft of the XML Pipeline specification from Orbeon.

Relax NG seminar in Leuven, Belgium

Relax NG session featuring Eric van der Vlist.

XML Enhancements for Java 1.0

A compiler and runtime system to extend Java 1.4 with first-class support for XML.

Documents and Data

If you walk behind the wall, you see little trees growing behind some of the bricks.

Serious QName advice from Rick Jelliffe.

There should be some kind of award for finding prior-art this fast.

This week's final word on namespaces.