Can XML Help Write the Law?

May 9, 2001

Alan Kotok

A Report from the Conference on Congressional Organizations' Application of XML

XML has spawned a number of new initiatives to improve the way enterprises, including government and not-for-profit organizations, do business. A meeting held on 24 April 2001 on Capitol Hill in Washington, DC focused on applying XML to the process of crafting legislation, with the potential at least of transforming the basic relationship between citizens and their elected representatives.

The meeting, organized by LegalXML and the House Committee on Administration, had speakers on the current ways of generating legislative documents and turning them into full-fledged laws and regulations. However, the meeting also discussed ways that the public, and the political process, could benefit from the wealth of data in government databases when linked to legislation made available in XML documents.

Early applications show encouraging results

The few uses of XML in legislation so far have shown some impressive results. Brian Breneman of the Breneman Group talked about the State of Michigan's experiences applying XML to its legislative documents. Breneman served as the contractor that developed the Michigan system. In Michigan, the state legislature converts its compiled law to XML, which makes it easier to offer the documents online in HTML and PDF formats. An earlier project, begun in 1995, captured legal documents in SGML, but XML offers more tools, as well as Web renderings that SGML did not.
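As a rough illustration of why a single XML source makes online publishing easier, the sketch below renders a tiny statute fragment as HTML using only Python's standard library. The element names (`section`, `num`, `heading`, `para`) and the sample text are hypothetical; the talk did not describe the actual markup used in the Michigan system.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical markup and text, invented for illustration only.
SAMPLE = """<section id="s1">
  <num>1.1</num>
  <heading>Short title</heading>
  <para>This act may be cited as the Example Act.</para>
</section>"""


def section_to_html(xml_text: str) -> str:
    """Render one <section> element as a small HTML fragment."""
    sec = ET.fromstring(xml_text)
    num = escape(sec.findtext("num", default=""))
    heading = escape(sec.findtext("heading", default=""))
    parts = [f'<div class="section" id="{escape(sec.get("id", ""))}">']
    parts.append(f"<h2>{num} {heading}</h2>")
    # Each <para> child becomes one HTML paragraph.
    for para in sec.findall("para"):
        parts.append(f"<p>{escape(para.text or '')}</p>")
    parts.append("</div>")
    return "\n".join(parts)


html_out = section_to_html(SAMPLE)
print(html_out)
```

The same parsed source could feed a second function that produces print-oriented output, which is the general appeal of keeping the law in one structured format.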

The Michigan experience helped state agencies improve the production and management of the legislative process, but the system also included features to encourage public access. Breneman said the Michigan system allows visitors to search the documents themselves, without the need for a legal researcher. Visitors can also draw from different source documents to build their own customized output documents.

This public-friendly application got a positive response from the Michigan public. Breneman reported over 6,000 people use the Michigan system every day. The site has drawn over 43 million hits since it began in September 1999, with some 7.5 million hits this year alone. The build-your-own-document application has resulted in Michiganders generating over 50,000 of these customized documents.

Power (and data) to the people

Patrice McDermott of OMB Watch, a public-interest research and advocacy group in Washington, DC, envisioned a standard government-wide XML vocabulary that would link legislative activities with government databases. This XML vocabulary would enable the public to see the relationship between legislative actions on one hand and the actual results of those actions, as expressed in government records, on the other. The idea generated more than a little nervous laughter among the meeting participants.

McDermott presented a few examples of the potential power of this standard tagging scheme. A current OMB Watch project, the Right to Know Network, lets the public search several environmental and housing databases, including ones that contain information about the release of toxic substances into the air and banking community investment statistics under the Home Mortgage Disclosure Act. McDermott said that a standard legislative vocabulary would enable the public to link these statistics to legislators' committee or floor votes, as well as election-campaign contribution databases. That kind of machine-readable information would give the public much more power and add accountability to the political process.
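The kind of linking McDermott described depends on shared identifiers across data sets. The sketch below is one hypothetical way to picture it: vote records in XML are joined to a separate table through a common member identifier. Every element name, identifier, and figure here is invented for illustration and does not come from any real vocabulary or database.

```python
import xml.etree.ElementTree as ET

# Invented sample data; the bill number, member IDs, and amounts are made up.
VOTES_XML = """<votes bill="HR-0000">
  <vote member="M001" position="yea"/>
  <vote member="M002" position="nay"/>
</votes>"""

# Stand-in for a separate database keyed by the same hypothetical member IDs.
CONTRIBUTIONS = {"M001": 1500, "M002": 0}


def join_votes(votes_xml, contributions):
    """Pair each recorded vote with the matching record from another data set."""
    root = ET.fromstring(votes_xml)
    rows = []
    for vote in root.findall("vote"):
        member = vote.get("member")
        rows.append({
            "bill": root.get("bill"),
            "member": member,
            "position": vote.get("position"),
            # None if the other data set has no record for this member.
            "contributions": contributions.get(member),
        })
    return rows


rows = join_votes(VOTES_XML, CONTRIBUTIONS)
for row in rows:
    print(row)
```

The join only works because both sources use the same key, which is the practical argument for agreeing on a standard tagging scheme before the data is published.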

McDermott also suggested that a standard XML vocabulary would give the public much more access to Federal records through Freedom of Information Act (FOIA) requests. A standard tagging scheme on databases would reduce the ability of agencies to hide behind the "practical obscurity" of government records, a legal term often cited in court cases involving public access to those records.

The Web gets citizens closer to the law

Ken Carson of MyCounsel.Com, an online legal resource site, said the Web has shortened the distance between citizens and the law, and both the law and legislation need to adjust to these new realities. Carson noted early advertisements for lawyers that often showed them sitting in front of a wall of law books, which implied the consumer needed the lawyer to get at the knowledge stored in those volumes.

Carson said the Web has changed all that. Before the Web, the average citizen had little or no access to laws and legislation; now much of that information is available for free or at low cost. Lawyers may still use the Lexis and WestLaw databases for legal research, but legal resource sites and forums provide citizens with more legal information than ever. Publishers like National Journal and Congressional Quarterly also provide low-cost clipping and bill-tracking services with information that used to be the monopoly of lobbyists.

Carson cited the Securities and Exchange Commission's EDGAR database of publicly traded companies, an SGML application, as an example of this process. EDGAR used to take special skills to search online, and brokerage houses reserved EDGAR searches as perks for their well-heeled customers. Now EDGAR is available on the Web, and investment sites provide sophisticated research tools as part of their basic package.

Meeting the needs of the present

The first panel at the meeting discussed the current state of legislative and related systems, as well as the underlying procedures for legislative markup (the term for writing bills, as opposed to markup languages like XML). The session showed the challenge faced by LegalXML in applying XML to the current process of writing laws.

To do its work, Congress has a network of internal agencies that provide research, production, and management services to the House and Senate. These agencies include the Library of Congress, Congressional Research Service, a separate law library, and Government Printing Office (GPO). The Clerk of the House and Secretary of the Senate each provide administrative and technology services to their respective institutions. Many of these individual offices have worked with XML, with varying degrees of success.

Most American school children learn that Congress passes bills and the President signs them into law. But Executive agencies really do not carry out the laws until they go through the codification process. That process begins with the Federal Register, a daily publication of proposed rules and public notices, which gives the public a chance to comment on the actual enactment of the legislation. Once the proposed rules go through this public vetting, they get codified in the Code of Federal Regulations or CFR. The GPO prints the Federal Register daily.

Jim Hemphill of the National Archives and Records Administration (NARA), an independent Executive agency that publishes the Federal Register, said his agency had a tight deadline to get the Federal Register online in 1993, and the original version did not allow for searching. To help provide a search capability, NARA began working with SGML in 1996, but it wasn't until 2000 that it could generate an electronic Federal Register with SGML. NARA also converted its CFR database to SGML and is now taking the CFR to the next step with XML, a work still in progress. Hemphill said that because of the constant additions and amendments from the Federal Register, the CFR constantly changes, which makes converting it a particular challenge.

Hemphill noted that NARA needs to work with documents coming in from over 1,000 points of entry in the Executive agencies. Still unknown is whether the individual agencies could provide the content already in XML, or if NARA needs to provide that function. Hemphill said if NARA must perform the tagging, it would be difficult to build a scalable government-wide system.

Ink still needs to hit paper

It became clear in the meeting that the GPO plays a critical role, from the standpoint of both document production and information management. In the world of legislation, only printed documents have official status, so for all the talk of online services, legislation isn't real until the documents get printed. As Reynolds Reichart, the director of technology for the House Committee on Administration, noted, the printed versions provide a common denominator that guarantees equal access to the information generated by Congress.

GPO produces the Congressional Record and Federal Register daily, with teams of editors, composers, printers, binders, and deliverers working around the clock. Any innovations or enhancements cannot compromise this production process; the books need to get out the door and on the Members' desks every morning.

Robert Winters discussed GPO's work with SGML, and how XML will likely be the next step. Winters said that GPO not only produces the printed copy but also the electronic equivalents. He said that GPO maintains all the text in a database, from which the composition system reads the data, compares it to the DTD, then generates the marked-up text. Winters said GPO needs to handle original data in 15 different formats, including DOS-based word processors still in use at some agencies.

Bill Reilly, another GPO speaker, said agencies need to pay for GPO services, but they get a 35 percent discount if they provide marked-up text. So far only five agencies take advantage of this incentive. Reilly said each agency has different procedures and requirements. You cannot just flip a switch and expect the Federal government to convert to XML.

The job of crafting a standard vocabulary for legislation is made even more difficult by the lack of agreement on some of the basic components of legislation. Rich Greenfield of the Congressional Research Service discussed three different formats for citations, a form of legal references, on which the legal community still has no consensus. LegalXML has a separate work group devoted just to this issue.

21st century technology and 19th century processes

The setting for the meeting underscored both the problems and the potential of XML in the legislative process. The conference took place in the hearing room of the House Committee on Administration (occupied the following day by a hearing on election reform, starring Florida Secretary of State Katherine Harris of election 2000 recount fame), featuring an ornate raised wooden dais and leather-topped testimony table. But the room design would not allow for a screen and projection equipment, which meant speakers could not use computer-generated slides or any other audio-visual aids in their presentations. While the setting was impressive, it symbolized the difficulty of trying to accommodate a now-familiar artifact of 21st century life into the 19th century world of Congress.

But LegalXML faces a bigger challenge. Will its work with legislation stop with fixing the current document production system, or will it provide a vocabulary that helps make government data more accessible and the people's elected representatives more accountable? When asked this question, LegalXML Chairman Donald Bergeron said it is up to the LegalXML task group assigned to define the requirements. The experience in Michigan shows the public makes good use of this kind of XML application. The American public at large deserves no less.