U.S. Federal XML Guidelines

February 6, 2002

Alan Kotok

The United States federal government's XML Work Group, a sub-committee of the Chief Information Officers Council (CIOC), has drafted its first guidelines spelling out best practices for the use of XML in federal agencies. This document, which began circulating in early draft form for comment in January 2002, shows that U.S. government agencies, major users of information technology, are trying hard to get their arms around fast-moving XML developments. But the guidelines also show the difficulties the group faces.

The CIOC created the XML Work Group in June 2000 and gave it the job of identifying best practices and recommending standards for XML in federal agencies. The U.S. Navy had already written a set of XML guidelines to cover its operations, and the government-wide work group took the Navy's document and generalized it.

As of early February 2002, agencies were still reviewing the draft XML document. Once the XML Work Group approves it, the document will be submitted to the CIOC, and eventually to the Office of Management and Budget (OMB) for consideration as government-wide policy.

Any practices or guidelines adopted across the federal government will have a large impact on XML developments elsewhere. The sheer scale and scope of federal information technology spending can cause ripples throughout the private sector economy. Federal Computer Week magazine reported in December 2001 that U.S. agencies are expected to spend some $45 billion on information technology in the current fiscal year. And Congressional leaders anticipate a 10 percent increase in the next fiscal year, which begins 1 October.

Getting a Fix on XML Standards

The first substantive section of the guidelines, on software applications, jumps right into the often confusing subject of XML standards. One of the XML Work Group's continuing responsibilities is to partner with standards bodies involved with XML to help guide agencies through the transition from EDI to XML-based exchanges. The growing number and overlapping reach of standards related to or based on XML make that continuing task a real challenge.

The U.S. government bases its IT policies, like everything else in government, on explicit legislative authority. Federal agencies are required to follow existing technical standards before writing their own, a policy which comes from the National Technology Transfer and Advancement Act of 1995, Public Law 104-113. OMB issued its guidance to agencies on this law in circular A-119 in 1998, which directs agencies to "use voluntary consensus standards in lieu of government-unique standards except where inconsistent with law or otherwise impractical."

The XML guidelines say that, as a general rule, production applications should use software that implements only World Wide Web Consortium (W3C) final recommendations, the W3C's term for approved standards. However, the guidelines leave open the use of software based on specifications other than W3C recommendations if the agency can ensure that no competing W3C recommendation exists, and that the specification is the product of a "credible, recognized consortium or organization." The document then lists several examples including OASIS, UN/CEFACT, Object Management Group, Open Applications Group, Universal Description Discovery and Integration (UDDI), RosettaNet, and BizTalk.

The document also provides cases where agencies may use W3C proposed or candidate recommendations, or even working drafts, usually for pilot tests or advance-concept demonstrations. For proposed recommendations, or where new versions are anticipated soon, such as the release of Simple Object Access Protocol (SOAP) 1.2, agencies are required to update their software to support the final recommendations.

The guidelines expressly prohibit the use of any specifications that compete directly with W3C recommendations. Later, in section 4.1, the guidelines say flatly that only W3C-recommended languages shall be used within the government for describing documents, except for document-oriented applications, where agencies may continue to use document-type definitions (DTDs). That prohibition would seem to apply to RELAX NG, an OASIS specification that provides many of the functions of XML Schema (see Eric van der Vlist's recent article on RELAX NG).

Marion Royal, of the General Services Administration and co-chair of the XML Work Group, says the decision to follow W3C recommendations is based on the W3C's international scope. Royal adds that much of the federal government's work has significant international implications, and as a result, specifying standards with more international support benefits the agencies. However, the RELAX NG team has discussed plans to submit its specification to the International Organization for Standardization (ISO), which, if accepted, would give it more international standing as well.
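The acceptance rules described in this section can be read as a small decision procedure. The Python sketch below is one illustrative reading of the article's summary, not text from the guidelines themselves. The Spec class, its field names, and the two functions are hypothetical, and the sketch leaves out the DTD exception for document-oriented applications.

```python
from dataclasses import dataclass
from typing import Optional

# W3C maturity levels mentioned in the article (labels are illustrative).
W3C_STAGES = {"recommendation", "proposed", "candidate", "working-draft"}


@dataclass(frozen=True)
class Spec:
    """Hypothetical description of a specification an agency wants to use."""
    name: str
    w3c_stage: Optional[str] = None      # None marks a non-W3C specification
    competes_with_w3c_rec: bool = False  # a competing W3C recommendation exists
    from_recognized_body: bool = False   # "credible, recognized consortium"


def usable_in_production(spec: Spec) -> bool:
    """General rule: only final W3C recommendations, with one exception."""
    if spec.w3c_stage == "recommendation":
        return True
    if spec.w3c_stage is None:
        # Non-W3C work is acceptable only from a recognized body and only
        # when no W3C recommendation covers the same ground.
        return spec.from_recognized_body and not spec.competes_with_w3c_rec
    # Proposed, candidate, and working-draft specs are not for production.
    return False


def usable_in_pilot(spec: Spec) -> bool:
    """Pilots and demonstrations may also use W3C work still in progress."""
    if spec.w3c_stage in W3C_STAGES:
        return True
    # Assumption: non-W3C specs face the same test as in production.
    return usable_in_production(spec)


draft = Spec("ExampleDraft", w3c_stage="working-draft")
print(usable_in_production(draft), usable_in_pilot(draft))
```

Under this reading, a specification from a recognized consortium still fails the production test whenever it competes with a W3C recommendation, which is the situation the article suggests RELAX NG would face.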

Reuse, reuse, and reuse again

The guidelines encourage agencies at several points to try to meet their business needs with existing XML solutions before writing their own. For schemas and data items within schemas, the guidelines require agencies to search the Federal XML Registry (FXR) for existing suitable components in other federal applications. Agencies also need to search for suitable commercial vocabularies that can apply to their business needs. Some XML business vocabularies, such as Extensible Business Reporting Language and Human Resources XML, have been developed with participation from the public sector.

The guidelines consider data components suitable if they meet the needs of the business domain and follow the prescribed naming conventions. And if agencies decide to use data components from other federal schemas, they need to register their use of those components with the FXR. Royal says the FXR will be an important part of XML development in the federal government, but notes that it is still in development. The guidelines document will fill in the details about the FXR once it is operational.

But if you do need to develop your own solutions...

If an agency needs to develop its own schema, the guidelines encourage it to engage program managers and business domain experts in the task along with IT specialists. They also encourage initial business process modeling to better understand data exchange requirements. The document does not require a specific modeling method but says agencies may use the Unified Modeling Language (UML) for this purpose. The guidelines mention no other modeling methods.

Royal said the UML recommendations in the guidelines are aimed more at developers than program managers, in order to provide them with a common modeling language. The guidelines may need to clarify this point further, given that some agencies have recently found UML lacking (see Interoperate or Evaporate, 12 December 2001).

If agencies need to create their own vocabularies, the document offers detailed guidance on element naming. The guidelines recommend using the naming conventions found in ISO standard 11179, Specification and Standardization of Data Elements. That standard divides data elements into three parts:

  • Object class: a set of ideas or artifacts that can be identified and delineated, and with properties or behaviors that follow the same rules;
  • Property: a common condition found in all members of an object class;
  • Representation: describes the manifestation of values in the element, such as datatype or unit of measure.

ISO 11179 uses periods as delimiters between units, but the Federal XML guidelines recommend the approach used by Electronic Business XML (ebXML) core components that puts the three parts together as one string. The document specifies camel case for Federal XML components, with upper camel case (UpperCamelCase) assigned to elements and lower camel case (lowerCamelCase) applying to attributes.
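To make the naming convention concrete, here is a minimal Python sketch that joins the three ISO 11179 parts into a single camel-case string. The function names and the sample terms (purchase order, total, amount, currency code) are invented for illustration and do not come from the guidelines or from ebXML.

```python
import re


def _words(text):
    """Split a phrase into alphanumeric words, dropping spaces and periods."""
    return re.findall(r"[A-Za-z0-9]+", text)


def upper_camel(*parts):
    """Join phrases as UpperCamelCase, the form assigned to elements."""
    # Only the first letter of each word is changed, so existing
    # capitalization inside a word is preserved.
    return "".join(w[0].upper() + w[1:] for part in parts for w in _words(part))


def lower_camel(*parts):
    """Join phrases as lowerCamelCase, the form assigned to attributes."""
    name = upper_camel(*parts)
    return name[:1].lower() + name[1:]


def element_name(object_class, prop, representation):
    """Build an element name from object class, property, representation."""
    return upper_camel(object_class, prop, representation)


# A period-delimited name such as "Purchase Order. Total. Amount"
# becomes one string with no delimiters.
print(element_name("purchase order", "total", "amount"))  # PurchaseOrderTotalAmount
print(lower_camel("currency", "code"))                    # currencyCode
```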

What, no acronyms?

The document's naming conventions also follow ebXML's recommendation to avoid acronyms in tag names. It leaves the decision on the use of acronyms to the program managers rather than systems designers, and it urges that the decision-maker consider the need for communicating across multiple communities of interest. Well-known federal agency acronyms, such as NASA and FBI, would likely survive in federal schemas.

The guidelines discuss the use of XML elements and attributes in schemas. The document advises agencies to use attributes only for metadata that will not be parsed and to provide additional business meaning for the element. Also, attributes should be short, not be subject to further subdivision, apply to the entire contents of the element structure (including child elements), and not contain data specific to a particular application or database.
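A short example may help show the distinction. The Python sketch below uses the standard library's xml.etree.ElementTree module to build a fragment in which the business data sits in an element and a short, indivisible piece of metadata sits in an attribute. The element and attribute names are hypothetical, chosen only to follow the camel-case conventions described earlier, and treating a currency code as attribute-worthy metadata is an assumption rather than an example from the guidelines.

```python
import xml.etree.ElementTree as ET

# Business data goes in elements, named in UpperCamelCase.
order = ET.Element("PurchaseOrder")

# The currency code is short, cannot be subdivided further, and applies to
# the entire content of the element, so it is carried as a lowerCamelCase
# attribute rather than as a child element.
total = ET.SubElement(order, "PurchaseOrderTotalAmount", {"currencyCode": "USD"})
total.text = "1250.00"

# Serialize to a string for inspection.
xml_text = ET.tostring(order, encoding="unicode")
print(xml_text)
```

By contrast, something like a mailing address, which breaks down into street, city, and postal code, would fail the "not subject to further subdivision" test and belongs in child elements.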

While the document contains a lot of normative content (the thou-shalts and thou-shalt-nots with XML), most of the guidelines consist of explanations and examples. The drafters have cast the guidelines as a tool to encourage the use of XML to promote interoperability among federal systems. The document is expected to evolve over time.

The XML Work Group deserves credit for taking on many tough questions. Providing guidance means agreeing on rules and making choices, which are rarely easy tasks. The question now is whether the group made the right choices.

Sidebar: e-GIF, an alternative approach for XML in government

The United Kingdom has taken a more assertive and encompassing approach than the U.S. toward the use of XML in government systems. The U.K. has established an e-Government Interoperability Framework (e-GIF) that spells out technical policies and specifications for achieving interoperability across the public sector. In the U.K., e-GIF "defines the essential pre-requisite for joined-up and web enabled government. It is a cornerstone policy in the overall e-government strategy."

The main objective of e-GIF is to adopt Internet and Web specifications for all government systems, with XML and XSL specifically noted as the core standards for data integration and representation. The goal is to use XML specifications that are already well supported in the marketplace, to reduce the cost and risk for government systems, while keeping them aligned with global developments.

The e-GIF specifications deal mainly with infrastructure, but a team has also developed a core set of XML schemas for business use in public agencies. The Government Schemas Group, as it is called, also works with and monitors key standards bodies such as the W3C, IETF, and OASIS, much like its counterpart in the U.S., the XML Work Group.