Marking Up Bureaucracy

September 24, 2003

Paul Ford

If there is a perfect user of XML, it's the huge, sprawling United States government. With thousands of diverse offices, from the Navy to the National Park Service, each federal agency routinely exchanges gigabytes' worth of documents and data with other offices, businesses, and citizens.

Organizations as large as the US government rarely move quickly, so at first it's surprising to see so much XML activity underway. Historically, however, many government organizations are no strangers to markup. There's a story, probably apocryphal, that when all the printed documentation for a battleship's systems was removed from the battleship, it rose a foot in the water. So, well over a decade ago, seeking to tame its mind-boggling need for technical documentation, the Navy embraced XML's ancestor, SGML, and others followed suit. The Navy eventually contributed to numerous standards -- the CALS table model used in DocBook, for example, and the HyTime standard (parts of which live on in XLink) were each influenced by the needs of the military.

But even within those government organizations relatively new to XML, awareness of the standard is increasing, in part because it's required by law. Joseph Chiusano, XML Thought Leader with consulting firm Booz Allen Hamilton, explained why: OMB Circular A-119 and the National Technology Transfer and Advancement Act of 1995 require federal agencies to first use VCSs, or Voluntary Consensus Standards, to "carry out policy objectives." Increasingly, these VCSs are XML-based schemas built on standards established by the W3C, OASIS, and similar organizations.

Chiusano also explained that in April 2002 the General Accounting Office (GAO) published Challenges to Effective Adoption of the Extensible Markup Language. This document identified a number of problems with XML -- in particular the lack of a centralized registry, the risk of redundant schemas, and unresolved security issues -- but ultimately recommended that the US government as a whole "develop a strategy for governmentwide adoption of XML" to ensure that the technology is used consistently across agencies.

In addition, the Government Paperwork Elimination Act of 1998 requires federal agencies to use electronic documents and accept electronic signatures as of October 2003, another potential use of XML. Taken together, all of this means that not only is there historical momentum behind markup in organizations like the IRS and the DoD, but there's also a legal requirement for agencies to pay attention to XML and deploy it to manage their documents and data.

What's Happening Today

Right now, centralization is the exception, not the norm. Different XML applications are scattered across different government agencies. The DoD, EPA, IRS, and others create schemas as needed, and apply them internally.

In an effort to encourage centralization of all online government services, including those using XML, the White House created the E-government initiative, which divides government technology into three roles: Government-to-Government (G2G), Government-to-Business (G2B), and Government-to-Citizen (G2C).

Most effort has been focused on G2G. As described above, one of the major creators and consumers of markup is the Department of Defense. Earlier efforts at standardizing schemas DoD-wide met with significant resistance, so the DoD now uses a "market-oriented" strategy to manage its own XML registry. According to Owen Ambur, co-founder and co-chair of the XML Working Group, "essentially, individual departments are encouraged to post schemas," and other departments are encouraged to work with existing schemas instead of inventing new ones, with the hope that, over time, the most useful schemas will be identified and promoted broadly throughout government.
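In practice, "posting a schema" means publishing a W3C XML Schema document that other departments can discover and reuse. A minimal sketch of what such a shared schema might look like follows; the namespace and element names here are hypothetical, invented purely for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical shared schema for a personnel record;
     the namespace and element names are invented for illustration -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="urn:example:dod:personnel"
           elementFormDefault="qualified">
  <xs:element name="PersonnelRecord">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Name" type="xs:string"/>
        <xs:element name="ServiceBranch" type="xs:string"/>
        <xs:element name="DateOfBirth" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A department that finds such a schema in the registry can validate its own documents against it, rather than defining a new, incompatible record format of its own.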

Much effort is also being applied to the E-Forms for E-Gov project, which is creating an infrastructure for using XForms, PDF, and related technologies so that the myriad federal forms can be filled out and signed electronically. This technology is expected to be useful in both G2G and G2B, allowing the processing of common forms -- passport applications, applications for federal assistance, travel vouchers -- to be completely automated. Like many XML efforts within government, E-Forms for E-Gov is very much a work in progress; issues of security and schema design are still being discussed, and no final recommendations have been issued.
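To give a flavor of the XForms approach, here is a minimal sketch of a model and a single form control; the instance structure and submission URL are hypothetical, not drawn from the actual E-Forms work:

```xml
<!-- Hypothetical XForms fragment; instance names and URL invented for illustration -->
<xforms:model xmlns:xforms="http://www.w3.org/2002/xforms">
  <xforms:instance>
    <application xmlns="">
      <applicantName/>
      <dateOfTravel/>
    </application>
  </xforms:instance>
  <xforms:submission id="submit-app"
                     action="http://example.gov/submit"
                     method="post"/>
</xforms:model>

<xforms:input ref="applicantName" xmlns:xforms="http://www.w3.org/2002/xforms">
  <xforms:label>Applicant name</xforms:label>
</xforms:input>
```

Because what gets submitted is plain XML conforming to a known structure, the same form data could flow straight into back-end G2G processing without rekeying.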

Because the government enters citizens' lives in so many ways, and because, on the whole, United States citizens have only limited Internet access, the deployment of XML technologies for G2C is the hardest to pin down. According to Mark Frautschi, a consultant specializing in knowledge management for government, some of the most promising work is emerging from places like the Universal Access Expedition Working Group. Started over two years ago without a set agenda, the group has been exploring "ways to live with section 508" of the Rehabilitation Act, which requires that federal web sites be accessible to users with disabilities.

"With the emergence of a number of XML-tied technologies, you could get a lot of bang for the buck implementing for section 508," said Frautschi. "You could even have benefits for non-disabled users -- XML makes it better for everyone."

Much other work is under discussion, but few individuals were willing to discuss upcoming implementations. Still, it can be inferred that many projects lean toward making publicly available information (like that found on FirstGov and on THOMAS) available via a public, web-services-based API. The occasional tantalizing PDF shows that sites which present information to the public are thinking in terms of taxonomies, and that, along with the proceedings of a September 8, 2003 conference on "Semantic Technologies for eGov," indicates that the US government is not shying away from the promise of the Semantic Web.

Some of the proposed applications of semantic technologies include:

  1. Coordinating data sets between the EPA and CDC to determine pesticide levels, the safety of drinking water, and air quality in different areas.

  2. Smarter search and better knowledge management for the NSF.

  3. The development of expert systems for Homeland Security applications by the Homeland Security Digital Library.

  4. Expert systems to ensure compliance with law by the FAA and DOT.

  5. Regulating imports and exports by the USDA.
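Semantic Web approaches like these typically express agency data as RDF, so that, for example, an EPA measurement and a CDC health record can be merged on a shared identifier. A hypothetical RDF/XML fragment, with the vocabulary and URIs invented for illustration:

```xml
<!-- Hypothetical RDF/XML; the env: vocabulary and URIs are invented for illustration -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:env="http://example.gov/env#">
  <rdf:Description rdf:about="http://example.gov/site/county-42">
    <env:pesticideLevel>0.03</env:pesticideLevel>
    <env:drinkingWaterRating>safe</env:drinkingWaterRating>
  </rdf:Description>
</rdf:RDF>
```

Because both agencies' statements would refer to the same resource URI, a semantic application could join the data sets without a custom point-to-point integration.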

In keeping with the idea of interoperability, most of what's happening with XML is taking place in open meetings, run across government divisions. "What's exciting is that there's been just enough relaxation of control for innovation to show up," said Frautschi. "A citizen should only need to enter a birth date once."

Getting to FEAPMO

As XML is increasingly used throughout government, it's only logical that there will be attempts to centralize documents, data sets, and schemas to eliminate redundancy. To that end, the registry, a well-designed site created by the XML Working Group, serves as the government's XML portal; to a lesser degree, the (surprisingly undesigned) site provides insight into how the government might use XML as a data transmission language instead of solely as a document markup language.

These are the first steps towards centralization. But the next step is likely to be within the framework defined by FEAPMO, the Federal Enterprise Architecture Program Management Office.

FEAPMO is an initiative of the OMB, the Office of Management and Budget. The OMB is part of the White House and responsible for the federal budget; it's also chartered to implement E-Government.

In this capacity, the OMB works closely with the CIO Council, a group comprised of the CIOs and deputy CIOs of the largest federal agencies (along with officials from the OMB). The CIO Council serves as "the principal interagency forum for improving practices in the design, modernization, use, sharing and performance" of federal government resources. Essentially, it's an IT think tank for the entire government, with the power to create new working groups that will be useful government-wide -- the XML Working Group, for instance, which is responsible for the government's XML registry, was created by the CIO Council.

The FEA in FEAPMO refers to the Federal Enterprise Architecture, a means of organizing all IT investment across the government according to a set of "reference models," which will, it is hoped, make it easier to design and implement IT solutions for every agency, and will also make it possible to assess the success of those solutions.

For instance, the top-level reference model is the Performance Reference Model, which defines how to measure an agency's performance in relation to other government agencies. Other reference models (there are five in total) define processes for managing customers and partners, defining public government services, and so forth. Of most interest to the XML community is the Data and Information Reference Model (DRM), which will define all of the data that supports government operations, categorizing the government's information along "general content areas."

Because this reference model is intended to provide an infrastructure for all government data interchange, XML is proving to be a natural choice. While very little about the DRM appears to be set in stone, Ambur said "it is my hope and expectation the XML Registry will become the embodiment of the DRM."

The XML Working Group is working toward this goal. "Our greatest success is helping to raise the visibility of XML," said Ambur. "We've brought focus to bear on the XML registry; you can't collaborate on data elements and schemas if people can't find them."

For the XML developer looking to understand where the government is taking XML, the FEA is the place to look. As more agencies deploy XML-based IT strategies, XML will no longer be an end in itself for the government; it will instead become a means to an end -- and that end is the DRM component of the FEA.

Footing the Bill

While these goals are large, the XML-specific IT budget is surprisingly small: $7 million for the 2004 fiscal year, spread over 3 divisions and 5 projects. Compare this to around $4.7 billion in spending, government-wide, on IT security alone.

Complicating matters is the fact that Congress routinely cuts funding for electronic-government initiatives that aren't associated with a particular agency. "There isn't any payoff for [Congress] to fund government-wide projects," explained Ambur, which is why most XML projects run under the auspices of individual agencies rather than being centralized. The implicit assumption is that the work being done can be leveraged government-wide in time, most likely under the umbrella of the DRM. And despite the difficulty of funding government-wide projects, several people working with XML indicated that the current efforts around Homeland Security have made it easier to focus on data sharing and to promote these technologies throughout security-related agencies.

The effects of increasing automation using standardized data are hard to predict. "I think the number of dull jobs is going to go down," said Frautschi. "With one VoiceXML application, the dullest jobs can be replaced by machines." The increasing use of web services and interoperable schemas, along with efforts that allow for the automated submission of forms, could make the government more accountable to itself and to the rest of the world.

Outside of strict accounting, the greatest benefit may be the increased access of citizens to government information, whether educational materials, law databases, or information on bills under discussion. As Ambur writes on his personal web site, "We should 'speak truth to power' but we must do more than that. Talk is cheap. The highest purpose to which we might dedicate ourselves in this life is to create a record of which we can be truly and truthfully proud." A very decent goal for the deployment of XML in the service of a government and its citizens.