Forming Opinions, Part 2
April 27, 2005
"But digital technologies enable a different kind of tinkering — with abstract ideas, though in concrete form." -- Lawrence Lessig, Free Culture
Previously, this column examined Web Forms 2.0, or WF2, a technical report recently presented to the W3C by Opera and the Mozilla Foundation. WF2 seeks to extend forms-related aspects of SGML-based HTML 4.01 and XHTML 1.x.
Effectively extending an existing vocabulary is always a challenge: it means keeping track of which pieces are free to change and which need to be locked down. The HTML 4.01 Recommendation, in particular the forms chapter, is not a paragon of specification rigor. It leaves several opportunities, some by design and some by accident, for little tweaks and adjustments.
To resume the discussion, we'll continue our look inside WF2 where we left off in section 2. One of my favorite parts of this section consists of all the little tweaks suggested to classic forms as we know them. Anyone who has worked with form-scripting has probably run into one of these limitations. These include:
1. Allowing empty content for optgroup. In cases where script will insert values, the initial state is best left empty, but doing so currently triggers validation errors.
2. Controls no longer need to be nested underneath a form element, and can instead use an IDREF to point to one. With this change, a form can be declared in the head, further separating concerns.
3. Final word on some debates about how specific corner cases should behave, specifically for radio group or single-select with no initial selections.
4. Mentioning platform behavior for 'label' and allowing nesting of
5. Allowing nested form elements (though with no semantic significance implied by the nesting).
At first this seems unusual, but the need can easily arise in portal situations, where small, self-contained blocks of markup are needed, and aren't allowed to make assumptions about surrounding content.
Another useful resource is the description of common features that are widely implemented in browsers but not formally documented as a standard anywhere. In this camp is the autocomplete attribute. There are likely others as well, but the arrangement of the document makes it difficult to see which features fall into this category.
Newly-added attributes, including required, were covered previously. In the same camp are others such as pattern for regular expressions, all of which work pretty much like you'd expect them to. Also welcome is the wider applicability of readonly attributes (though I'd like to see readonly go even farther, to cover radio buttons, checkboxes, and lists).
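Because browsers that don't know these attributes will simply ignore them, a server receiving the form will probably want to repeat the checks. Here is a minimal Python sketch of what that might look like. It assumes that a pattern has to match the entire value rather than a substring, and that the pattern in question means the same thing in Python's `re` syntax as in the browser's regular expression dialect. The function name `check_field` and the sample pattern are hypothetical, made up for illustration.

```python
import re


def check_field(value, required=False, pattern=None):
    """Return a list of problems found in one submitted form value."""
    problems = []
    if value == "":
        if required:
            problems.append("value is required")
        # An empty optional value is not tested against the pattern.
        return problems
    # Assumption: the pattern must match the whole value, hence fullmatch.
    if pattern is not None and re.fullmatch(pattern, value) is None:
        problems.append("value does not match pattern")
    return problems


# Hypothetical part-number pattern: two capitals, a dash, four digits.
PART_NO = r"[A-Z]{2}-[0-9]{4}"

print(check_field("", required=True))
print(check_field("AB-1234", pattern=PART_NO))
print(check_field("AB-12345", pattern=PART_NO))
```

The first and third calls report a problem; the second returns an empty list.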
WF2 defines several new values for existing attributes. One of the areas where HTML is loosely defined is the type attribute, as in <input type="text">. The specification fails to mention what happens if some new value is present in that attribute. (WF2 deals extensively with this topic.) Most browsers, encountering an unknown attribute value for an input type, will fall back to a standard text control, so that as long as fallback users don't mind manually typing things like "2008-12-31T18:36:00", the form will continue to work. This is actually a clever hack that provides some of the same benefits as XForms Schema-driven design, without an xsd namespace in sight.
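The flip side of the fallback is that whatever receives the form can't assume a date picker produced the value; it may have been typed by hand. A rough Python sketch of a defensive parse follows. It accepts only the exact layout of the sample string above, which is an assumption on my part; the full set of formats WF2 allows is not covered here, and the function name `parse_typed_datetime` is hypothetical.

```python
from datetime import datetime


def parse_typed_datetime(text):
    """Parse a value shaped like "2008-12-31T18:36:00".

    Returns a datetime, or None if the text is not in that layout
    (for example, when a fallback user typed something free-form).
    """
    try:
        return datetime.strptime(text.strip(), "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        return None


print(parse_typed_datetime("2008-12-31T18:36:00"))
print(parse_typed_datetime("New Year's Eve, around half past six"))
```

A caller would treat None as a validation failure and ask the user to try again.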
One more nice touch in this section is an output control that displays form values. In a script-based solution it's not much different than just having a span or other element with a known id, though with the addition of external form data, it becomes more useful.
Since I promised to highlight the parts of WF2 I like, this week's installment will skip section 3 on repeating sections, and move on to DOM features.
The recurring theme of WF2 is dependency on script. In some cases, like calculations, script is necessary to perform basic functions. In other cases, script is needed transitionally to replicate new things defined by WF2. But either way, authors have a genuine need for DOM interfaces. DOM access is important. Script isn't inherently evil or even inherently inaccessible; only specific uses of it are. When different browser platforms each use different DOM interfaces, a developer's job of using script in a non-evil way becomes that much more difficult.
So there is value in hammering out common ground. Section 4, and later section 7, define a scripting interface that appears to be a straightforward extrapolation of classic forms, including new script-bearing attributes hard-wired into the language for each new event. But I think the W3C would have a hard time swallowing it.
The situation with events is much like a smaller-scale replay of the situation with forms. The W3C has collectively identified many limitations in the inline attribute syntax style, and is pursuing a different course, based on XML Events. In the long term, the W3C is right. In the short term, though, I think there are some interesting possibilities for transitional strategies.
The original forms DOM took several years of work, and not entirely within the W3C, before a significant amount of cross-browser scripting became feasible. WF2 is clearly attempting to accelerate this process, though some of the more advanced DOM functions introduced in WF2, useful though they are, stray pretty far from what conventional HTML scripters will be comfortable with.
XML In, XML Out
The original specs for classic forms date back to the early 1990s, well before XML became a Recommendation in 1998. Increasingly, XML is used as an interoperability layer or an on-the-wire format between different systems. As discussed before, XML and forms share a deep connection, so a natural impulse is to extend older forms systems with XML. WF2 does this in sections 5 and 6, which among other things define a new content type "application/x-www-form+xml" and a specific syntax for it. An example of the submitted XML basically looks like this:
XML data submitted from WF2
<submission xmlns=...>
  <field name="fname" index="0">Moe</field>
  <file name="file" index="0" filename="todo.txt" type="text/plain;charset=UTF-8">
    VG9kbzogd3JpdGUgcGFydCAz
  </file>
  <field name="dt" index="0">2008-12-31T18:36:00</field>
</submission>
Note that WF2 intentionally doesn't define a specific namespace string that would be used here, in order to discourage premature implementation. Moving in an XML direction is beneficial, even with the obvious observation that what we have is Yet Another XML vocabulary. Any useful back-end processing of data submitted this way would require a custom transformation step, one that would be similar in structure to code for converting urlencoded or multipart submission data into an XML vocabulary of choice. In contrast, other systems like Microsoft InfoPath work with arbitrary XML and can avoid an extra transformation step during the data integration phase of a forms project.
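To give a feel for the size of that transformation step, here is a minimal Python sketch that reads a submission shaped like the sample and flattens it into plain dictionaries. Several things in it are assumptions. The namespace `urn:example:wf2-submission` is a made-up placeholder, since WF2 deliberately leaves the real one unspecified. File content is treated as base64, which is what the sample payload appears to be. The index attribute is ignored, and values are kept in document order. The name `flatten_submission` is hypothetical.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical placeholder: WF2 does not define the real namespace string.
NS = "urn:example:wf2-submission"

SAMPLE = f"""<submission xmlns="{NS}">
  <field name="fname" index="0">Moe</field>
  <file name="file" index="0" filename="todo.txt"
        type="text/plain;charset=UTF-8">
    VG9kbzogd3JpdGUgcGFydCAz
  </file>
  <field name="dt" index="0">2008-12-31T18:36:00</field>
</submission>"""


def flatten_submission(xml_text):
    """Split a submission document into (fields, files).

    fields maps each control name to a list of its values;
    files maps each filename to its decoded bytes.
    """
    root = ET.fromstring(xml_text)
    fields, files = {}, {}
    for el in root:
        local = el.tag.rsplit("}", 1)[-1]  # drop the "{namespace}" prefix
        text = el.text or ""
        if local == "field":
            fields.setdefault(el.get("name"), []).append(text)
        elif local == "file":
            # Assumption: file content is base64 with optional whitespace.
            files[el.get("filename")] = base64.b64decode(text.strip())
    return fields, files


fields, files = flatten_submission(SAMPLE)
print(fields)
print(files["todo.txt"].decode("utf-8"))
```

From here, mapping the dictionaries onto a target vocabulary is ordinary application code, much as it would be after parsing a urlencoded or multipart submission.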
WF2 also defines similar ways to bring XML data into the form as data or list choices, which is even more useful as it can help server developers steer around the area of templating madness. Existing formats like UBL, though, will still require pre- (and post-) processing.
So XML features make this week's list of things I like in WF2, though just barely. More advanced features naturally drift away from the stated ideal of leveraging existing developer knowledge. This interesting relationship between the past and future of forms will be the focus of the next issue of XML-Deviant.
Births, Deaths, and Marriages
A slow week for announcements in the XML world.
Call for implementations: UBL input specifications - Developer's Preview
A preview of my own work on modeling UBL for form authors.
Documents and Data
Sound advice on spokespersonship
REST + XML namespaces + XSLT (oh my!)
More thoughts on Documents v. Databases