
XForms and Microsoft InfoPath

October 29, 2003

Micah Dubinko

Micah Dubinko is author of XForms Essentials.

"Do not go where the path may lead; go instead where there is no path and leave a trail." -- Ralph Waldo Emerson

This month Microsoft is releasing Microsoft Office InfoPath 2003, putting an end to speculation about what the software giant's approach to XML data collection would be. InfoPath appears on the surface to be similar in functionality to several of the XForms engines I wrote about earlier. Although the official Microsoft FAQ no longer even mentions XForms as of this writing, InfoPath is frequently compared and contrasted with XForms.

To be more precise: InfoPath is an application and XForms is a data format, so the two cannot be compared directly. It is possible, however, to compare the data format and processing model underlying InfoPath with XForms, which is the approach this article takes.

How Does It Work?

The InfoPath application, much like an XForms engine, converts user input into new or modified XML, which can then be fed into a back-end system. A single Windows-based application is used for both designing and completing a form.

An InfoPath document is stored and processed as several files, which can be either combined into a single CAB-compressed file with a file extension of .xsn or stored in the same directory. Either way, there are several key components.

manifest.xsf

This file contains a manifest: a listing of all the other files in the bundle, which InfoPath calls a solution. InfoPath keeps several other details here as well, including metadata, information on toolbars and menus associated with each view, information on external data sources, and error messages.

This file is roughly analogous to an XForms Model in that it serves as the central hub of a form.

view1.xsl, view2.xsl, view3.xsl, ...

One or more sequentially-numbered XSLT files are always included, each one defining an InfoPath view, which presents an editable view of a portion of the XML data under consideration. Each XSLT accepts the XML instance as input and produces an output format similar to HTML forms, but augmented with several InfoPath-specific features.

This XSLT transformation doesn't have an equivalent in XForms, but the HTML-like format produced by the transformation is conceptually similar to the XForms User Interface specification.

template.xml

This file contains the actual XML data that is edited by InfoPath. When the overall InfoPath form is published to a well-known location, the XML instance can be separately transported, via email or any other supported path. A special pair of processing instructions included in this file help maintain the connection between the form data and the rest of the form:

<?mso-infoPathSolution solutionVersion="1.0.0.1" href="manifest.xsf"
   productVersion="11.0.5531" PIVersion="1.0.0.0" ?>
<?mso-application progid="InfoPath.Document"?>

When Internet Explorer encounters any XML document with these processing instructions, it attempts to launch the locally installed InfoPath application, pointing it to the indicated manifest file. Unfortunately, this technique is specific to IE and will not work with other browsers.
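
To make the role of these processing instructions concrete, here is a minimal sketch of how a tool other than InfoPath might read the pseudo-attributes out of the mso-infoPathSolution instruction to locate the manifest. The function names parsePseudoAttributes and findSolutionPI are hypothetical and are not part of any InfoPath API; the sketch works on raw text with regular expressions rather than a full XML parser, so treat it as illustrative only.

```javascript
// Hypothetical helper (not part of InfoPath): extract name="value"
// pseudo-attributes from the data portion of a processing instruction.
function parsePseudoAttributes(piData) {
  var attrs = {};
  var pattern = /([A-Za-z_][\w.\-]*)\s*=\s*(?:"([^"]*)"|'([^']*)')/g;
  var match;
  while ((match = pattern.exec(piData)) !== null) {
    // Group 2 holds a double-quoted value, group 3 a single-quoted one.
    attrs[match[1]] = match[2] !== undefined ? match[2] : match[3];
  }
  return attrs;
}

// Hypothetical helper: find the mso-infoPathSolution processing
// instruction in raw XML text; returns null when it is absent.
function findSolutionPI(xmlText) {
  var m = /<\?mso-infoPathSolution\s+([\s\S]*?)\?>/.exec(xmlText);
  return m ? parsePseudoAttributes(m[1]) : null;
}

var sample =
  '<?mso-infoPathSolution solutionVersion="1.0.0.1" href="manifest.xsf"\n' +
  '   productVersion="11.0.5531" PIVersion="1.0.0.0" ?>\n' +
  '<?mso-application progid="InfoPath.Document"?>';

var info = findSolutionPI(sample);
console.log(info.href); // manifest.xsf
```

The href value is what ties a standalone data file back to its form definition, so a receiving system that only wants the data can ignore both instructions entirely.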

myschema.xsd

InfoPath is strongly based on W3C XML Schema (WXS), and the application maintains a WXS instance for the main XML data. InfoPath will refuse to open a form if the template XML is not valid according to the specified schema. A graphical Data Source view in InfoPath, while not a full-fledged editor, allows the designer to make changes to the WXS.

script.js

InfoPath also includes extensive scripting capabilities, in either JScript or VBScript. If an InfoPath document contains any script, it is stored by default in this file.

It's possible for the InfoPath document to contain other user-inserted files as well, including images, XML that can be used as an alternative data source, and even HTML files that can be displayed in a special area called the Task Pane.

Similar, Different

How comparable is InfoPath to XForms? At a high level, both seek to overcome a similar challenge: translating user interaction into XML. Upon closer examination, however, the two technologies differ in focus, target audience, and scope.

Focus
The InfoPath application is focused on providing a superb visual environment, of similar quality to the rest of the Microsoft Office System suite, for creating and filling out forms. In contrast, the XForms specification is designed to encourage implementations not to focus exclusively on visual media, but rather to define only the intent behind various data-gathering controls. The XForms specification gives implementations wide latitude in choices relating to how a particular form control can be implemented, though new CSS properties can narrow down the specific appearance of a form control. Additionally, while XForms is designed to be readily produced by automated tools, InfoPath appears to be put together in such a way that only mouse-designed forms are readily possible.
Target Audience
The recommended system requirements for InfoPath demand a fairly modern Intel-compatible computer: a Pentium III or better as well as Microsoft Windows 2000 (with Service Pack 3) or greater. Further, the software is bundled only in the Enterprise version of Office System, which will in practice be most often used by larger, more Microsoft-committed organizations. By contrast, the XForms specification was designed to work on the broadest possible range of devices, from tiny phones and PDAs to beefy servers. XForms software is being made available in a variety of packages, both open source and commercial, on an assortment of platforms.
Scope
XForms encourages development using a defined declarative XML syntax, while InfoPath, like HTML forms, continues to encourage the deployment of script. Some interesting differences are also found in the choices of form controls supported. For example, InfoPath includes ordered and unordered lists as a form control, but doesn't support the equivalent of a multiple selection or free entry select form control (combobox).

A Word on Standards

InfoPath is built upon an impressive list of standard technologies, including WXS, DOM, and XSLT. For web developers modifying existing InfoPath content, such a design can be of great assistance. Other design decisions in InfoPath, however, tend to reduce the ability to use InfoPath with non-Microsoft browsers, platforms, or servers. As a result, an investment in designing InfoPath solutions can be difficult to recoup after switching to a different set of tools, no matter how standards-compliant those tools are.

A Real-World Example

Despite the differences, comparisons between XForms and InfoPath have been inevitable. A chapter in XForms Essentials examines a UBL purchase order application. It is possible to recreate that application in InfoPath and thus compare the results. Doing so is largely a hand-to-mouse experience with the InfoPath application. The result is shown in Figure 1.

Figure 1. The design view of Microsoft InfoPath

Limitations in InfoPath made a few changes necessary (for example, InfoPath has no counterpart to the XForms range control, which would typically be rendered as a volume-style slider), but overall the solution ended up quite similar to the XForms version.

One notable difference, however, is that tables, which can be seen here as dotted lines, are required for any kind of layout, which might make things more challenging for non-visual users.

The other major difference was the lack of declarative elements. In XForms, the bind element establishes a relationship that the XForms Processor maintains at all times; for example, that a currency code entered once is copied to all of the line items. In InfoPath, script attached to a number of different entry points is required. To conform to the rules of UBL with this simple interface, the purchase order application had four assertions to maintain:

  1. A single currency code is copied into each repeating line item.
  2. Since the currency code appears in two places in each line item, it is also copied to the second location.
  3. For each line item, the extended price is calculated as price times quantity.
  4. The extended prices of all line items are summed to produce the order total.

In XForms, these four assertions are accomplished through four XML elements, each of which contains a desired condition to be met:

<xforms:bind nodeset="cat:OrderLine/cat:LineExtensionAmount/@currencyID"
     calculate="../../cat:LineExtensionTotalAmount/@currencyID"/>
<xforms:bind nodeset="cat:OrderLine/cat:Item/cat:BasePrice/cat:PriceAmount/@currencyID"
     calculate="../../../../cat:LineExtensionTotalAmount/@currencyID"/>
<xforms:bind nodeset="cat:OrderLine/cat:LineExtensionAmount" 
     type="xs:decimal"
     calculate="../cat:Quantity * ../cat:Item/cat:BasePrice/cat:PriceAmount"/>
<xforms:bind nodeset="cat:LineExtensionTotalAmount" type="xs:decimal"
     calculate="sum(../cat:OrderLine/cat:LineExtensionAmount)"/>

In InfoPath, however, the needed script is approximately 70 lines:

XDocument.DOM.setProperty("SelectionNamespaces",
    'xmlns:cat="urn:oasis:names:tc:ubl:CommonAggregateTypes:1.0:0.70" ' +
    'xmlns:ns1="urn:oasis:names:tc:ubl:Order:1.0:0.70" ' +
    'xmlns:my="http://schemas.microsoft.com/office/infopath/2003/myXSD/2003-04-19T20:40:35"');

function XDocument::OnLoad(eventObj) {
  updateCurrency();
}

// This function is associated with: 
// /ns1:Order/cat:OrderLine/cat:Quantity
function msoxd_cat_Quantity::OnAfterChange(eventObj) {
  recalcLineItem(eventObj.Site.parentNode);
  recalcTotal();
}

// This function is associated with: 
// /ns1:Order/cat:OrderLine/cat:Item/cat:BasePrice/cat:PriceAmount
function msoxd_cat_PriceAmount::OnAfterChange(eventObj) {
  recalcLineItem(eventObj.Site.parentNode.parentNode.parentNode);
  recalcTotal();
}

// This function is associated with: /ns1:Order/cat:LineExtensionTotalAmount/@currencyID
function msoxd__LineExtensionTotalAmount_currencyID_attr::OnAfterChange(eventObj) {
  updateCurrency();
}

function recalcLineItem( lineNode ) {
  var quantity = lineNode.selectSingleNode("cat:Quantity");
  var price = lineNode.selectSingleNode("cat:Item/cat:BasePrice/cat:PriceAmount");
  var extended = lineNode.selectSingleNode("cat:LineExtensionAmount");
  var extPrice = parseFloat(getElementValue(quantity)) * parseFloat(getElementValue(price));

  setElementValue(extended , floatToString(extPrice, 2));
}

function recalcTotal() {
  var dom = XDocument.DOM;
  var extended = dom.selectSingleNode("/ns1:Order/cat:LineExtensionTotalAmount");
  var newTotal = sum("/ns1:Order/cat:OrderLine/cat:LineExtensionAmount");
  setElementValue( extended, newTotal );
}

function updateCurrency() {
  var dom = XDocument.DOM;
  var copyFrom = dom.selectSingleNode("/ns1:Order/cat:LineExtensionTotalAmount/@currencyID");
  var lines = dom.selectNodes("/ns1:Order/cat:OrderLine");

  // loop through each line item, copying in the currencyID
  for (var idx=0; idx<lines.length; idx++) {
    var copyTo = lines[idx].selectSingleNode("cat:LineExtensionAmount/@currencyID");
    copyTo.nodeValue = copyFrom.nodeValue;
    copyTo = lines[idx].selectSingleNode("cat:Item/cat:BasePrice/cat:PriceAmount/@currencyID");
    copyTo.nodeValue = copyFrom.nodeValue;
  }
}

// Utility functions

function getElementValue( node ) {
  if (node.firstChild)
    return node.firstChild.nodeValue;
  else
    return "";
}

function setElementValue( node, newval ) {
  if (node.firstChild) {
    node.firstChild.nodeValue = newval;
  } else {
    var textnode = node.ownerDocument.createTextNode( newval );
    node.appendChild( textnode );
  }
}

function sum(xpath) {
  var nodes = XDocument.DOM.selectNodes(xpath);
  var total = 0;
  for (var idx=0; idx<nodes.length; idx++) {
    total = total + parseFloat(getElementValue(nodes[idx]));
  }
  return total;
}

// Format a number as a string with a fixed number of decimal places.
function floatToString(value, decimals) {
  return value.toFixed(decimals);
}

This script approach requires a bit of care in getting and setting values from XML elements. In accordance with the DOM worldview, actual data values are stored in a text node child of the element node, except that an empty value is signified by the lack of any text child node.
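
One consequence of that convention is worth spelling out: an empty element yields an empty string, and parseFloat("") evaluates to NaN, so a single blank quantity or price would turn the whole total into NaN. The following is a minimal sketch of one way to guard against that, using plain strings in place of DOM nodes. The helpers toNumber and sumValues are hypothetical names introduced for illustration, and treating a blank as zero is an assumption that may not suit every form.

```javascript
// Hypothetical helper: convert form text to a number, substituting a
// fallback (0 unless specified) for empty or non-numeric input.
function toNumber(text, fallback) {
  var n = parseFloat(text);
  return isNaN(n) ? (fallback === undefined ? 0 : fallback) : n;
}

// Hypothetical helper: add up raw text values without letting one
// blank entry turn the whole total into NaN.
function sumValues(values) {
  var total = 0;
  for (var i = 0; i < values.length; i++) {
    total += toNumber(values[i]);
  }
  return total;
}

console.log(parseFloat(""));                // NaN
console.log(sumValues(["10.5", "", "2"]));  // 12.5
```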

On my initial attempt at this script, I neglected to attach an OnLoad entry point. The solution seemed to work, but had a subtle bug: if the initial value in the currency selection list was never changed, the currency value never got propagated throughout the XML. One advantage of a declarative approach is that it applies equally to initial conditions as well as the ongoing state. The XForms example didn't have to be special-cased for initialization.
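
The same idea can be approximated in script by keeping all of the rules in one place and running them both at load time and after every change. The sketch below is not how an XForms Processor or InfoPath works internally; it uses a hypothetical plain-object stand-in for the purchase order, with illustrative field names rather than UBL element names, and it relies on the rules being listed in dependency order, whereas an XForms Processor determines the recalculation order itself.

```javascript
// Hypothetical rule list: each entry restates one or more of the
// assertions as a function over a plain-object order.
var rules = [
  // Assertions 1 and 2: copy the currency code to both places on each line.
  function copyCurrency(order) {
    order.lines.forEach(function (line) {
      line.extendedCurrency = order.currency;
      line.priceCurrency = order.currency;
    });
  },
  // Assertion 3: extended price is price times quantity.
  function extendLines(order) {
    order.lines.forEach(function (line) {
      line.extended = line.quantity * line.price;
    });
  },
  // Assertion 4: the total is the sum of the extended prices.
  function totalLines(order) {
    order.total = order.lines.reduce(function (sum, line) {
      return sum + line.extended;
    }, 0);
  }
];

// Run every rule in order; the same call serves load and change events.
function recalculate(order) {
  rules.forEach(function (rule) { rule(order); });
  return order;
}

// Initial load: the currency is propagated even though it never "changed".
var order = recalculate({
  currency: "EUR",
  lines: [{ quantity: 2, price: 1.5 }, { quantity: 1, price: 4 }]
});
console.log(order.total, order.lines[0].extendedCurrency); // 7 EUR

// A later change goes through exactly the same path.
order.currency = "USD";
recalculate(order);
console.log(order.lines[1].priceCurrency); // USD
```

Because initialization and updates share a single entry point, the forgotten-OnLoad bug described above has no place to hide, at the cost of recomputing everything on each change.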

Though not used here, the extensive sample forms that come with InfoPath include a large library of script that can be reused via copy-and-paste. The resulting InfoPath document can be filled out in the same application that designed the form, as shown in Figure 2.

Figure 2. The data entry view of Microsoft InfoPath

Conclusion

Both InfoPath and XForms are version 1.0 efforts, and both are likely to improve substantially in future revisions. For organizations that have already licensed Office System 2003, InfoPath will provide an excellent means to automate data collection tasks. For use on systems not running Office System 2003, including Mac and Linux desktops, phones, PDAs, and even some PCs, XForms remains a better path.
