XForms, XML Schema, and ROX

August 17, 2007

"If I have an XML schema, is there any way that I can work with that schema to build forms for populating instances of that schema?"

Over the years, I've seen a number of variations on this same question, and generally for a pretty good reason. It takes a lot of work to create a schema in the first place, but when you're done, what you end up with seems like it should be a good basis for generating something; you have data type information, constraint information, enumerations, and enough other pieces that making forms from them would seem to be a cakewalk.

However, the process is generally fraught with more land mines than you might expect. While the schema is generally a pretty good blueprint from which to start, it faces a number of problems that tend to limit the ability to build forms of any sort:

Labels and localization: it is generally possible to use the names of elements as labels in a pinch, but such labels aren't necessarily ideal for professional-looking forms. Moreover, shifting out of the language that the schema was originally written in can usually destroy any hope for simple text transformation functions.
Non-schema interface elements: even the most basic form has a need for more than just fields of text. There needs to be a way of introducing additional content, from descriptions to containment boxes, into the form in a consistent manner.
Binding fields: you need to devise a consistent mechanism for tying each element to an interface control, such that changes to the value of one control the appearance of the other and vice versa.
Presentation choices: a number of schema elements may have multiple ways of being expressed to retrieve the same data or they may be some form of a container that could be represented in different ways. Because of this choice, it's not always easy to determine from a single element declaration in a schema how best to express this content.
Reordering and filtering: often, the order of elements in a schema may be less than ideal from an interface standpoint or elements may be displayed that would better be hidden in practice (such as record ID numbers).
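To make the first point concrete, here is a minimal Python sketch of the "element names as labels" fallback. It assumes camel-case element names such as the cv:TitleOfWork element used in the CV examples later in this article; the function name label_from_name is hypothetical and is not part of ROX.

```python
import re

def label_from_name(qname: str) -> str:
    """Derive a display label from a (possibly prefixed) camel-case element name."""
    # Drop any namespace prefix, e.g. "cv:TitleOfWork" -> "TitleOfWork".
    local = qname.split(":")[-1]
    # Insert a space wherever a capital letter follows a lowercase letter or digit.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", local)
    # Capitalize the first letter so that names like "firstname" still read as labels.
    return spaced[:1].upper() + spaced[1:]

print(label_from_name("cv:TitleOfWork"))
print(label_from_name("cv:NumberOfPages"))
```

A function like this is serviceable only while the schema's naming convention and the form's language happen to line up; abbreviated names or a translated interface need an explicit label source.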

Having laid out these objections, it's also worth bringing up one point that I've heard expressed more than once: that a schema-generated output makes for uninteresting or awkward user interfaces. As it turns out, this particular argument is something of a red herring. Typically, schemas tend to be designed hierarchically, with the most significant portions of the schema appearing near the top of the schema.

Curiously enough, this structure is actually pretty typical of most form content, with the order of items appearing in a form usually correlating fairly closely with their significance. If you have a mechanism for providing some kind of exception handling when this is not the case in your forms generators, you can usually get the schema to do the heavy lifting for you.

Working with ROX

I've long been intrigued by XForms, in part because they represent a purely declarative approach to creating interactive forms, but also because they work against a common XML-based data model or, to put it in other terms, an XML schema plus its associated instance data. Because of that, the idea that you can create an XForms document from a schema is not as absurd as it may seem on first reading, because that form is in fact working against the schema anyway and, as such, probably reflects the inner schema more readily than a more raw HTML format might.

Because of this, I set out some time ago to create a set of transformations that would perform such a mapping between schema and XForms content, and after a few false starts, settled upon the following architecture:

Schema design: start with a schema that is Element Nominal (explained shortly), that explicitly defines cardinality (the minimum and maximum number of child elements in a parent element) and that uses an explicitly declared namespace. In general, such schemas can be transformed easily from Element Referential schemas.
Base metastructure: using an XSLT Transformation, analyze the relationships between elements and assign them into metastructure categories (such as Category Sets, Collections, Records, Bags, etc., explained shortly). The result of this is a base metastructure document.
Metastructure overlays: an overlay uses a similar schematic notation to define exceptions to the Base Metastructure. Such overlays work either by overriding the default characteristics of an element in the base metastructure (such as the order of its children) or by adding new elements that define additional behaviors or presentations (such as adding a style element for a given element). Overlays inherit from other internally referenced overlays, so that you can have one overlay that provides a French language version on top of an English language one, another that establishes XBL bindings on elements of certain types and yet another that contains help information, and these are then evaluated from the outermost overlay (as specified by a pointer in a configuration document) to the innermost (so that overlay two gets added into overlay one, and the result then gets added into the base metastructure).
Editor and report generators: once the final metastructure is created, it in turn generates three things: an XForms Editor document, an XForms Report document, and an XML Default Instance used to populate newly created records. It also registers the newly created items in a configuration file (conf.xml) in an associated XML Database (in this particular case, the eXist XML database).
XQuery (eXist) publishing system: the server-side code for handling the actual publishing of the various components uses the eXist XML database and eXist's XQuery system as its primary engine.
Syndication feeds: one of the key aspects of this system is to recognize that the objects created should be usable by as large an audience as possible. For this reason, the objects so created in this process will be accessible via syndication feeds (so that you can get a list of most recent invoices, bibliographic entries, medical requests, etc., readable by standard news readers). This system also includes hooks for an additional summarization transformation that performs a summary on a given entry, with a default being created as part of the report generators. System managers can add views of this summary data as additional XSLT transformations, passed to the client as part of the syndication feed.
Searches: similarly, structured data establishes a fairly high bar in terms of being able to search against a given database. The system defines a default OpenSearch template for performing base searches on objects of a given type, with the option of adding additional custom searches with more specific parameters for querying the system in more novel ways (and generating the output accordingly). The combination of Custom Search and Custom View makes it possible to customize the output in a wide variety of ways while at the same time keeping the base system simple to use.
Validators and work-flow system: users can save XML files built from within the XForms without validating them (in essence working with draft copies), but once it becomes necessary to add a document into the system, the XForms submit it to server-side validators that use XSD and Schematron (client-side validators are added in as part of the overlay layer). Once a document is validated, it then triggers on the server side a work-flow event that sets the flags necessary to move it to new syndication queues.

Now, if you've been following all of that, one thing that you may be thinking is that this system has suddenly jumped from simply generating XForms from schemas and is now essentially running as a full blown application. Yup, it is. Realistically, the XForms act essentially as mechanisms for creating valid and consistent XML, but without some kind of server-side component the created XForms are fairly worthless. With them, on the other hand, this system is rather disturbingly powerful.

In previous articles and blogs, I've mentioned an open source project that I've been working on, called the REST Objectified XML System, or ROX System. What I've outlined here is essentially the core of that system. The idea behind this is to marry the concepts inherent in data editing (which typically have been built one field at a time, usually with an implicitly defined data model) with document editing CMS systems, which is the basis for the full name.

As an aside, the acronym preceded the formal title of the project. Those of us working on this project, including David Baker of eVision, Mark Birbeck of X-Port, Dan McCreary and myself, happened to like the acronym better than any other we could come up with. What the acronym means, however, has shifted back and forth for a while, and includes such titles as REST on XML, REST Oriented XML, and Really Out-there XML, which is my own favorite, if somewhat irreverent, name for what we're trying to do. However, the jury's still out; if you can think of a more suitable meaning, by all means, let me know!

Generating MetaStructure

One of the key issues I faced while designing this system was attempting to deal with schematic information at an abstract level, what was essentially the meta-schema (or design patterns) of a given object. Such a meta-schema needed to be semantically neutral, because it needed to apply to any schema, not just a single given schema. Moreover, it couldn't be dependent on a particular naming convention, as different people and organizations tend to use different conventions, often for very good organizational reasons. Thus began my search for the metastructure of a given schema.

One of the things that seemed to emerge in this process was that there were a number of criteria that determined which pattern a given element fell into:

Arity: the minOccurs and maxOccurs values for a given element.
Containment: for a given container node, the number (and arities) of each child element.
Identity and uniqueness: whether a given node had some form of identifier that uniquely differentiated that node from other nodes.
Choice: whether one or another element was considered to be in force at any given time.
Atomic type: for simple types, this is the XSD primitive or atomic types such as string, number, date, and so forth.

These criteria open up a fairly wide possible matrix of choices (some of which are of course nonsensical), but in general what emerged from this were the following patterns:

Category binder: an element that contains a set of unique (min=0/max=1) subcontainers. Category binders are often represented as tabs or menus.
Collection: a container element that holds a set of zero or more (min=0/max=unbounded) schematically identical elements (known as records).
Record: a container element that is both unique (it has a unique identifier in the context of the collection) and has either property elements, collections, category binders, or bags as children.
Bag: an element that contains either property elements or other bags and serves primarily as a grouping mechanism. A good example of a bag is a <name> element that may contain <firstname> and <lastname> children.
Property: an element that only contains a scalar value; in a given record or bag, such a property is typically unique (occurs once and only once), though it may not be unique within the entire schema.
Switch: a container element holding two or more container elements, only one of which can be relevant at any given time. For instance, a switch element may hold an American address block, a Canadian address block, and a European address block, but only one of those blocks can be present in any given instance.

Additionally, properties can have different expressions based on the schema type of the element; strings, dates, numbers, and so forth each may bind to different types of controls, while enumerants bind to selected enumerated properties, with the arity of the base type (one or more than one allowed value) defining whether the property displays as a single or multi-item selection control. Thus a given metastructure binding may include both a pattern (Property) and a type (Enumerated Single).
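The way these criteria combine into patterns can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the XSLT that ROX actually uses: the Element class, its field names, and the order in which the rules are tried are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

UNBOUNDED = None  # stands in for maxOccurs="unbounded"

@dataclass
class Element:
    name: str
    min_occurs: int = 1
    max_occurs: Optional[int] = 1            # None means unbounded
    children: List["Element"] = field(default_factory=list)
    is_choice: bool = False                  # children are mutually exclusive alternatives
    has_id: bool = False                     # carries a unique identifier

def classify(el: Element) -> str:
    """Assign one of the metastructure patterns to an element (simplified rules)."""
    if not el.children:
        return "Property"                    # scalar value only
    if el.is_choice:
        return "Switch"                      # only one child relevant at a time
    if len(el.children) == 1 and el.children[0].max_occurs is UNBOUNDED:
        return "Collection"                  # repeated, schematically identical records
    if el.has_id:
        return "Record"                      # uniquely identified container
    if all(c.children and c.min_occurs == 0 and c.max_occurs == 1
           for c in el.children):
        return "CategoryBinder"              # set of optional, unique subcontainers
    return "Bag"                             # plain grouping of properties or bags

book = Element("Book", 0, UNBOUNDED, has_id=True,
               children=[Element("TitleOfWork"), Element("Publisher")])
books = Element("Books", 0, 1, children=[book])
name = Element("name", children=[Element("firstname"), Element("lastname")])

for el in (books, book, name, book.children[0]):
    print(el.name, "->", classify(el))
```

The rule order matters in a sketch like this one: a Record also looks like a Bag if the identity test is skipped, which is one reason the overlays described later allow the metadata pattern to be overridden by hand.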

The specific transformations to generate the metastructure patterns for each element were done using an XSLT 2.0 template (specifically with Saxon 8.9, though no Saxon-specific code is used). One underlying assumption in the transformation was that the schema must be defined as an Element Nominal Schema (the author's notation). Such a schema assumes that there are only a limited number (preferably only one) of high-level element declarations, and each declaration in turn points to a Complex Type that holds the child elements:

<xs:schema ...>
    <xs:element name="toplevel" type="toplevel.Type"/>

    <xs:complexType name="toplevel.Type">
        <xs:sequence>
            <xs:element name="sublevelA" type="sublevelA.Type"/>
            <xs:element name="sublevelB" type="sublevelB.Type"/>
        </xs:sequence>
    </xs:complexType>
    <xs:complexType name="sublevelA.Type">
        ...
    </xs:complexType>
    <xs:complexType name="sublevelB.Type">
        ...
    </xs:complexType>
    ...
</xs:schema>

Note that in this particular format, most of the content of the schema should be either simple or complex type declarations, and child elements are defined specifically by name (thus Element Nominal) rather than containing a reference to another element declaration block elsewhere in the schema. Moreover, in this particular example (and in the examples used in ROX in general) the use of anonymous type declarations, where an element contains an enclosed complex type element that isn't named, is discouraged.

The transformations themselves are reasonably complex and, as such, are best reviewed offline; check out the references at the end of this article for the set of code used in this application, including XSLT transformations and XQueries. Once the transformation is done, however, it creates a one-to-one mapping between each element in the schema and its corresponding matched template that describes the element's namespace, CSS overlays, constraints, labels and so forth. The result of this (in the particular case of a CV schema) looks as follows for a portion of the metastructure document:

<structure lang="us-en" root-match="cv:CV"
    namespace="http://www.casrai.org/xmlns/2007/cv"
    namespace-prefix="cv">
    ...
    <element match="cv:Book">
        <title>Book</title>
        <display-title>yes</display-title>
        <type>cv:Book.Type</type>
        <core-type>complexType</core-type>
        <metadata>Record</metadata>
        <propertysheet>
            <property label="Title Of Work" select="cv:TitleOfWork"/>
            <property label="Co Authors" select="cv:CoAuthors"/>
            <property label="Publisher" select="cv:Publisher"/>
            <property label="Published Work Status" select="cv:PublishedWorkStatus"/>
            <property label="Number Of Pages" select="cv:NumberOfPages"/>
            <property label="Publication Location" select="cv:PublicationLocation"/>
            <property label="Publication Year" select="cv:PublicationYear"/>
            <property label="Edition" select="cv:Edition"/>
            <property label="Number Of Volumes" select="cv:NumberOfVolumes"/>
            <property label="Volume" select="cv:Volume"/>
            <property label="Series Title" select="cv:SeriesTitle"/>
            <property label="Is Refereed" select="cv:IsRefereed"/>
            <property label="Description" select="cv:Description"/>
        </propertysheet>
    </element>
    <element match="cv:Books">
        <title>Books</title>
        <display-title>yes</display-title>
        <type>cv:Books.Type</type>
        <core-type>complexType</core-type>
        <metadata>Collection</metadata>
        <collection>
            <record label="Book" select="cv:Book"/>
        </collection>
    </element>
    <pattern match="Collection"/>
    <pattern match="Record"/>
    <pattern match="CategoryBinder"/>
    <pattern match="Property" type="String"/>
    ...
</structure>

The structures so described can be thought of as an alternative schema, one that concentrates on the larger relationships between the pieces. Thus the <cv:Books> element gets described as a collection of <cv:Book> elements, while each <cv:Book> element in turn is made up of a set of properties, given in the associated order. The select attribute provides an XPath expression relative to the match attribute in the <element> description; in essence this is a match to the schema element. One advantage to this approach is that the properties can be rearranged, and the property select attributes can also be overridden to point to different elements defined by the corresponding XPath expressions.

Similarly, the generators also create default stubs for the various default metadata patterns. These pattern templates may be used to define CSS classes or establish other bindings (such as XBL bindings) in the initial generators, but more significantly, they also provide ways to change the default implementation.

Overriding MetaStructure

You can make a rather surprisingly robust application just with the default generated metastructure, though whether it will be pleasant to look at or even work with will be highly dependent upon the schema in question. It is more likely that you may want 85 percent of the functionality as is, but you may want to override the remaining 15 percent to handle special cases that are unique to this particular schema. This is the role of the overlays.

Whereas the base metastructure is automatically generated, the overlays are created by hand (though it's my intent to develop tools to generate the overlays through other user interfaces) and provide the means to shape the forms in a manner that better expresses the needs of the final application.

Each overlay document is similar to the base metastructure document in that it supports an <overlay> container element holding collections of <element> and <pattern> elements. In general, only those elements or patterns that are specifically being overridden need to be included.

<overlay id="cv-overlay-standard">
    <element match="cv:Book">
        <propertysheet>
            <property label="Title" select="cv:TitleOfWork"/>
            <property label="Publisher" select="cv:Publisher"/>
        </propertysheet>
        <label-style>color:blue;font-weight:bold;</label-style>
    </element>
    <pattern match="Record">
        <box-style>border:inset 3px;</box-style>
    </pattern>
</overlay>

In the overlay cv-overlay-standard, the first entry indicates that for the <cv:Book> element, the property sheet is replaced with a different one that has only two fields, while the label-style (the style of the label for the book control) is set to a bold blue. In the second entry, all elements that are of pattern Record are wrapped in a three-pixel border that surrounds both the label and the relevant control. Note that other properties on the element match, such as the title used for the label, remain the same as before; only those entries that are first-level children of the match element are overridden.
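The override rule described here (first-level children of a matched entry replace their counterparts, everything else is left alone) can be sketched in Python with the standard ElementTree module. This is an illustration under assumptions rather than the ROX merge processor itself: the function name apply_overlay is hypothetical, entries are paired on tag and match attribute only, and overlay entries with no counterpart in the base are simply appended.

```python
import copy
import xml.etree.ElementTree as ET

def apply_overlay(base: ET.Element, overlay: ET.Element) -> ET.Element:
    """Return a copy of the base metastructure with one overlay merged in."""
    merged = copy.deepcopy(base)
    for entry in overlay:                      # <element> and <pattern> entries
        target = next((e for e in merged
                       if e.tag == entry.tag
                       and e.get("match") == entry.get("match")), None)
        if target is None:                     # assumption: unknown entries are added
            merged.append(copy.deepcopy(entry))
            continue
        for child in entry:                    # first-level children only
            existing = target.find(child.tag)
            if existing is not None:           # replace in place, keeping position
                index = list(target).index(existing)
                target.remove(existing)
                target.insert(index, copy.deepcopy(child))
            else:                              # new behavior or presentation hint
                target.append(copy.deepcopy(child))
    return merged

base = ET.fromstring("""
<structure>
  <element match="cv:Book">
    <title>Book</title>
    <propertysheet>
      <property label="Title Of Work" select="cv:TitleOfWork"/>
      <property label="Co Authors" select="cv:CoAuthors"/>
      <property label="Publisher" select="cv:Publisher"/>
    </propertysheet>
  </element>
  <pattern match="Record"/>
</structure>""")

overlay = ET.fromstring("""
<overlay id="cv-overlay-standard">
  <element match="cv:Book">
    <propertysheet>
      <property label="Title" select="cv:TitleOfWork"/>
      <property label="Publisher" select="cv:Publisher"/>
    </propertysheet>
    <label-style>color:blue;</label-style>
  </element>
  <pattern match="Record">
    <box-style>border:inset 3px;</box-style>
  </pattern>
</overlay>""")

merged = apply_overlay(base, overlay)
book = merged.find("element[@match='cv:Book']")
print([p.get("label") for p in book.find("propertysheet")])
print(book.findtext("title"))
```

In the merged result the property sheet carries only the two overlay fields, while the title, which the overlay never mentions, is still the one from the base metastructure.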

The processor that merges the overlay with the base metastructure looks in a system level file called conf.xml in order to determine which overlays to use, and in what order:

<conf>
    <domain id="app" path="/db/app">
        <topics>
            <topic id="cv" namespace-prefix="cv"
                namespace="http://www.metaphoricalweb.org/xmlns/2007/cv"
                object-data="/db/app/cv/data"
                object-template="/db/app/cv/templates/new.xml">
                <overlay-set id="editor-en">
                    <overlay ref="cv-overlay-standard"/>
                </overlay-set>
                <overlay-set id="editor-fr">
                    <overlay ref="cv-overlay-standard"/>
                    <overlay ref="cv-overlay-fr"/>
                </overlay-set>
            </topic>
        </topics>
    </domain>
</conf>

The ROX processor can handle multiple accounts, so the domain specifies the account in question and gives pertinent information about namespaces, the location of critical resources and collections, and the like. In this particular case, the topic (cv) contains as children two overlay sets, editor-en and editor-fr, for the English and French versions of the editor respectively, and the control program that applies the overlays takes a parameter that specifies which overlay set will be used to generate the output. In the French case, the ordering here is important; cv-overlay-standard is applied directly to the generated base metastructure first, then cv-overlay-fr is applied on the resulting metastructure. This final metastructure then defines the characteristics that will be used for the generation of the XForms.
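Resolving which overlays to apply, and in what order, amounts to a simple lookup against conf.xml. The Python sketch below uses the overlay-set ids exactly as they appear in the conf.xml listing; the function name overlay_order is hypothetical, and the real control program may resolve the sets differently.

```python
import xml.etree.ElementTree as ET

CONF = """
<conf>
  <domain id="app" path="/db/app">
    <topics>
      <topic id="cv" namespace-prefix="cv"
             namespace="http://www.metaphoricalweb.org/xmlns/2007/cv">
        <overlay-set id="editor-en">
          <overlay ref="cv-overlay-standard"/>
        </overlay-set>
        <overlay-set id="editor-fr">
          <overlay ref="cv-overlay-standard"/>
          <overlay ref="cv-overlay-fr"/>
        </overlay-set>
      </topic>
    </topics>
  </domain>
</conf>"""

def overlay_order(conf_root: ET.Element, topic_id: str, set_id: str) -> list:
    """List the overlay ids for one overlay set, in the order they should be applied."""
    path = f".//topic[@id='{topic_id}']/overlay-set[@id='{set_id}']/overlay"
    # findall returns matches in document order, which is the application order.
    return [o.get("ref") for o in conf_root.findall(path)]

conf = ET.fromstring(CONF)
print(overlay_order(conf, "cv", "editor-fr"))
```

The returned list can then be folded over the base metastructure, applying each named overlay to the result of the previous step.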

The full set of actions in a given overlay is still being defined, but includes the following:

box-style: the style of the general container for both label and control within the output language.
label-style: the style of the label on the entry.
control-style: the style of the control on the entry (can be used to assign XBL bindings).
title: changes the title used within the label.
display-title: provides a quick way to hide or show a given title, especially in non-CSS based renderings.
metadata: changes the default metadata characteristics of the entity.
prefix: contains a block of content to be inserted prior to the entry itself.
suffix: contains a block of content to be inserted after the entry itself.
propertysheet: used with tables, property sheets, tabsets, categories, and bags to indicate which properties need to be displayed in the control.
collection: indicates for a collection the item being collected and can also contain a pointer to the resource required when new items in a given collection need to be created.
calculated: indicates that, rather than taking the value directly from an entry in the data model, the processor should calculate the value (using XPath) and display that as part of the output (note that such controls are, by definition, read-only).
rule: contains an XPath rule that constrains the value within the control, given as a Schematron assertion. In XForms, this will usually be translated as an <xf:bind> statement.
relevance: contains an XPath rule that constrains the relevance of a control, given as a Schematron assertion. In XForms, this will usually be translated as an <xf:bind> statement.
type: contains an XPath rule that indicates the data type of the control, containing the name of a class defined in the schema or using simple xsd: types. Note that except for structural elements (such as enumerations), schema type is generally ignored as a constraint unless it is specified as an overlay property. In XForms, this will usually be translated as an <xf:bind> statement.
passthru: when a constraining filter such as a property sheet is defined, the passthru element makes it possible for other elements to be processed automatically (they are normally terminated). This is a child of the property sheet, and its order in the property sheet indicates where it will be processed in the output document.
action: takes an ev:event attribute appropriate to the output format (such as an XForms event) and invokes the corresponding handler (the specific syntax is still being worked out on this one).
help: is invoked whenever help is requested and the object in question has the focus. This contains text or XHTML content.
tooltip: is invoked typically during a rollover or similar action and appears as a tooltip on both the control and the label.

This list is fairly daunting, admittedly, but it is also an indication of the role that ROX itself plays; the idea here is that the schema generally provides basic structure for an XForms or related form document, but cannot always (usually) determine intent. ROX does a lot of the heavy lifting of establishing the underlying mappings and building the presentation structures, but it should be seen primarily as a tool for getting developers to the 90 percent mark. The overlays should take developers at least the next 9 percent without actually having to dip into XForms code directly and, moreover, are designed so that common tasks such as generating foreign language versions of a given site become a simple matter of setting up a dictionary as an overlay.

Note that the element properties also include two other optional attributes: auth-user and auth-group. Normally, you can have only one element template per element, but you can create specialized templates that are valid only for the user or group indicated, so that an admin group, for instance, could see a section of a record differently from the way that the typical user would see it. Note that the order of listing is consequently important; if an authenticated element template appears after a non-authenticated one, the non-authenticated one will be the one used (this may change to a priority system).
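The first-match behavior described here is easy to get wrong, so a small Python sketch may help. It models templates as plain dictionaries; the function name select_template, the id keys, and the rule that a template carrying both attributes matches on either one are assumptions made for illustration.

```python
def select_template(templates, match, user=None, groups=()):
    """Return the first template for `match` that the current user may use."""
    for template in templates:                 # listing order decides the winner
        if template["match"] != match:
            continue
        auth_user = template.get("auth-user")
        auth_group = template.get("auth-group")
        if auth_user is None and auth_group is None:
            return template                    # unrestricted template always applies
        if auth_user is not None and auth_user == user:
            return template
        if auth_group is not None and auth_group in groups:
            return template
    return None

templates = [
    {"id": "book-admin", "match": "cv:Book", "auth-group": "admin"},
    {"id": "book-default", "match": "cv:Book"},
]

print(select_template(templates, "cv:Book", groups=("admin",))["id"])
print(select_template(templates, "cv:Book", user="guest")["id"])
```

Reversing the list would hand the unrestricted template to everyone, administrators included, which is why restricted templates need to be listed first under the current scheme.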

From MetaStructure to Forms

Until this point, what has been created is still not a form, per se; it is, in essence, a customized schema with (strong) presentation hints. Once the metastructure is built and augmented, the next stage is to generate the actual presentation pages. Note that this does not necessarily need to be an XForms presentation; the output of the third transformation could be any XML-based language such as Adobe Flex, Microsoft XAML, or OpenLaszlo though, given the one-to-one correspondence between presentation and model, the XForms model is probably the best. The code that ROX is currently targeting is the Mozilla XForms add-on, though transformations to X-Port, Orbeon, and Chiba will likely not be far behind.

The basic generator that ROX uses also assumes that eXist is the server technology that will be used for deploying the XForms and is built accordingly. Again, it's possible that additional XML databases will be supported in the future as certain core features (such as XQuery server language extensions) become available elsewhere, but that's still at the speculative stage. Several features of the eXist database (its focus on REST-based solutions, its ability to work as a middle-tier layer for connection into larger XML and SQL databases, its ability to combine XQuery and XSLT calls in the same package, and a generally performant engine for low-to-mid sized data queries) make it ideal for the kind of applications that XForms in particular works well against.

In the default case, the ROX system transforms the final metastructure into a contained XForms model, one that uses relative XPaths in order to create local contexts. To give an example of what that means, consider the book situation above. The transformation engine works by walking down each element in the metastructure tree and creating <xf:repeat> nodes to create a local context. A fairly simplified version of the output follows:

<xf:repeat nodeset="cv:Books">
    <div class="cv:Books collection">
        <div class="label"><xf:label>Books</xf:label></div>
        <div class="body">
            <xf:repeat nodeset="cv:Book">
                <div class="cv:Book record">
                    <div class="label"><xf:label>Book</xf:label></div>
                    <div class="body">
                        <xf:repeat nodeset="cv:Title">
                            <div class="cv:Title property">
                                <div class="label"><xf:label>Title</xf:label></div>
                                <div class="body"><xf:input ref="."/></div>
                            </div>
                        </xf:repeat>
                        <xf:repeat nodeset="cv:Publisher">
                            <div class="cv:Publisher property">
                                <div class="label"><xf:label>Publisher</xf:label></div>
                                <div class="body"><xf:input ref="."/></div>
                            </div>
                        </xf:repeat>
                    </div>
                </div>
            </xf:repeat>
        </div>
    </div>
</xf:repeat>

The <xf:repeat> nodes in general serve to establish context for their internal contents. The advantage to this is that it does not necessitate that the system establish a fully resolved XPath expression, and changing specific nodes can often be as simple as modifying the overlay <property>'s select attribute.
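The recursive walk that produces this kind of nesting can be sketched briefly in Python. The tuple layout (select, pattern, label, children) and the function name build are assumptions made for the example, and the div wrappers are emitted without a namespace for brevity; the actual ROX generator is an XSLT transformation over the final metastructure.

```python
import xml.etree.ElementTree as ET

XF = "http://www.w3.org/2002/xforms"   # the XForms namespace
ET.register_namespace("xf", XF)

def build(node):
    """Turn (select, pattern, label, children) into a nested <xf:repeat> element."""
    select, pattern, label, children = node
    # Each level binds with a relative XPath, so the repeat supplies the local context.
    repeat = ET.Element(f"{{{XF}}}repeat", {"nodeset": select})
    box = ET.SubElement(repeat, "div", {"class": f"{select} {pattern.lower()}"})
    label_box = ET.SubElement(box, "div", {"class": "label"})
    ET.SubElement(label_box, f"{{{XF}}}label").text = label
    body = ET.SubElement(box, "div", {"class": "body"})
    if children:
        for child in children:
            body.append(build(child))          # containers recurse into their children
    else:
        ET.SubElement(body, f"{{{XF}}}input", {"ref": "."})   # leaf property control
    return repeat

tree = ("cv:Books", "Collection", "Books", [
    ("cv:Book", "Record", "Book", [
        ("cv:Title", "Property", "Title", []),
        ("cv:Publisher", "Property", "Publisher", []),
    ]),
])

root = build(tree)
print(ET.tostring(root, encoding="unicode"))
```

Because every binding is relative to its parent repeat, pointing a property at a different node only requires changing that one select value.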

In a similar manner, the CSS is actually generated from the various template style elements and inserted directly into the web page (though a static CSS document can also be set up). This approach was taken largely because it minimizes the initialization time of XBL bindings. Note that the generator also creates sets of tabs corresponding to various Category Binder types, thus letting the schema actually drive the larger scale user interfaces on the web page. XML code can also be inserted before or after given elements via the prefix and suffix actions; these similarly inherit the context for that particular element, so that things like additional XForms content can be introduced through these mechanisms without requiring that you edit the page's XML content directly.

Working with the ROX System

The process of defining and implementing the various subsections of the ROX application is ongoing and has been underway for about the last six months. We are now at a stage where it's worthwhile to make the code publicly available, so I announced the creation of a new Google Project called ROX Server, at http://code.google.com/p/rox-server/. I'll be updating code and documentation to this site over the course of the next week, but if you are interested in XForms generation and building support for robust XML forms solutions, then I'd ask that you check out the site and become a contributing member.

At this stage, most of the core generation tools are done, though the terms and functionality differ a little from the list above (the latter reflects the direction that the site will be going). Already the system is able to build and host 100+ field XForms pages, producing output that is fast, functional, and highly interactive. The XQuery publishing and search capabilities (to be discussed in my next article) are also nearing a functional completion point, though there's a fair amount of finishing needed in both the transformations and the organizational code in order to ensure that it's at a level suitable for enterprise production.

Once the core system is completed, there are three additional areas of code development. The first is a system for building a formal workflow engine into ROX. This project is ongoing now and is built partially upon the Schematron core language. When done, this should make it possible to create, comment on, and approve or reject appropriate forms as part of an orchestrated work flow. The second area is the creation of libraries of standard packs that enable functionality for commonly used schemas. The intent here is to make ROX Server usable right out of the box for common tasks (from invoicing and scheduling to library systems and metadata tagging solutions). The final arena is the deployment of ROX Server as the foundation for a general content management system a la Drupal.

ROX Server is written using open standards technologies (and we will strive to use established standards whenever possible) and the code is made available under a GPL license.

My goal with ROX Server is simple: I believe that as we move to an increasingly interactive Web 2.0, the effective processing of XML-based objects not only will happen, but must happen. Currently an incredible amount of time and money is spent trying to build and manage form-based systems; by providing a set of tools for building rich XForms, I see ROX Server and technologies like it able to free up budgets for other, more important (and more interesting) development projects.

For more information, check out the following resources:

ROX Server Project Site
XForms.org
Or contact me at kurt(dot)cagle(at)gmail(dot)com.

Kurt Cagle is the project lead for the ROX Forms project, the webmaster for XForms.org and an author and information architect specializing in XML, AJAX, and Semantic Web technologies. He lives in Victoria, British Columbia with his wife and daughters, and is watching nervously as the oldest heads off to high school.