Electronic Publishing with XML
In this article, we describe the process of creating electronic publications using XML and related standards. This publishing procedure was used to generate the conference proceedings for the XML Europe 2001 Conference. We describe the most important steps in this XML-based publishing process and highlight some of its advantages.
XML Europe 2001
Now in its seventeenth year, the XML Europe Conference was held this year in Berlin (May 21-25, 2001). Formerly known as SGML Europe, the conference was renamed SGML/XML Europe in 1998 and subsequently became XML Europe.
In the past, the proceedings for XML Europe have been available in both paper and electronic formats. For various reasons, the conference organizers, GCA, discarded the paper version this year and opted for an electronic publication only. This was distributed on CD-ROM to each of the conference delegates. Additionally, the GCA used this publication as the basis for an online version on their web site. XML technologies were used throughout the creation process.
An XML-based Publishing Process
Producing a publication using XML technologies involves a number of distinct steps: content creation, validation, and publication. These steps are discussed in the following sections and are applicable to the production of any publication (electronic or print) with XML.
Step 1: XML Content Creation
The first step in an XML-based publishing process is the creation or acquisition of content in an appropriate XML vocabulary. The vocabulary should be flexible enough to represent all common features (e.g. headings, sections, sub-sections, paragraphs, links) and advanced features (e.g. tables, figures and bibliography) of a publication. One possible vocabulary is DocBook XML, which is used to mark up documents such as books, articles, and technical documentation in logical sections.
For the XML Europe conference, an XML DTD was developed that defines the structure of a generic conference paper. This is known as the GCAPaper DTD. Each author whose presentation abstract was accepted by the conference program committee was requested to submit the final paper in XML according to the GCAPaper DTD. The use of this DTD ensures a similar structure for each paper. Thus, all papers can be processed in an identical manner by the publishing process. Here is an example document.
<gcapaper id="s01-1" day="Tuesday" attendee="All">
  <front>
    <title>The power of XML</title>
    <author refid="s01-1auth1">
      <fname>John</fname>
      <surname>Smith</surname>
      <jobtitle>Senior Consultant</jobtitle>
      <address>
        <affil>Global Enterprises</affil>
        <city>Dublin</city>
        <cntry>Ireland</cntry>
        <email>email@example.com</email>
      </address>
      <bio id="s01-1auth1">
        <para><highlight>John Smith</highlight> - John is a senior consultant for Global Enterprises</para>
      </bio>
    </author>
    <abstract>
      <para>XML is a powerful language for defining markup languages for specific application domains. The XML Specification has been a W3C recommendation since February 1998.</para>
    </abstract>
  </front>
  <body>
    <para>Paper unavailable at press time.</para>
  </body>
</gcapaper>
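Because every paper shares this structure, a single routine can handle all of them. The following Python sketch illustrates the idea using only the standard library; it is not the tooling used for the proceedings. The function name `summarize_paper` and the trimmed `SAMPLE` string are assumptions made for this example, while the element and attribute names are taken from the example document.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example paper; element names come from the example,
# everything else in this sketch is illustrative.
SAMPLE = """<gcapaper id="s01-1" day="Tuesday" attendee="All">
  <front>
    <title>The power of XML</title>
    <author refid="s01-1auth1">
      <fname>John</fname>
      <surname>Smith</surname>
      <address>
        <affil>Global Enterprises</affil>
        <city>Dublin</city>
        <cntry>Ireland</cntry>
      </address>
    </author>
  </front>
  <body>
    <para>Paper unavailable at press time.</para>
  </body>
</gcapaper>"""


def summarize_paper(xml_text):
    """Return a small dict of metadata for one GCAPaper-style document."""
    root = ET.fromstring(xml_text)
    # Build "First Last" for each author element found under <front>.
    authors = [
        f"{a.findtext('fname', '')} {a.findtext('surname', '')}".strip()
        for a in root.iterfind("front/author")
    ]
    return {
        "id": root.get("id"),
        "day": root.get("day"),
        "title": root.findtext("front/title", default=""),
        "authors": authors,
    }


summary = summarize_paper(SAMPLE)
print(summary)
```

The same function could be applied unchanged to every submitted paper, which is the practical benefit of a shared DTD.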
To support authors and facilitate the creation of papers in XML, a variety of tools were provided. These included dedicated XML editors (Epic by Arbortext and XMetaL by SoftQuad) and extensions to Microsoft Word that allow content to be exported to XML (WorX by HyperVision and S4/Text by i4i). Each of these tools was made available under an evaluation license and was customized to produce XML content adhering to the GCAPaper DTD.
Step 2: Input Validation
Once the content for a publication is in XML, it needs to be validated against the publication DTD. This type of structural validation is a core feature of XML and can easily be performed using any validating XML parser. In addition to structural validation, it is also necessary to validate the contents of the publication logically. This ensures that elements in the DTD have been used in a consistent and correct manner (e.g. "Dublin" is marked as a city and not as a country). The content validation step is particularly important when the content originates from many sources.
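The logical checks described above can be sketched in Python. This is a minimal illustration, not the validation used for the conference: the standard-library parser used here is non-validating, so it only confirms well-formedness, and checking against the DTD itself would require a validating parser. The function `check_paper`, the `KNOWN_COUNTRIES` list, and the rule that an author's `refid` should match the nested `bio` id (inferred from the example document) are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical reference list; a real check would use a fuller source.
KNOWN_COUNTRIES = {"Ireland", "Germany", "France", "United Kingdom"}


def check_paper(xml_text):
    """Return a list of problems found in one paper (empty if none)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed: {err}"]

    problems = []
    if not (root.findtext("front/title") or "").strip():
        problems.append("missing or empty title")

    for author in root.iterfind("front/author"):
        name = author.findtext("surname", default="(unknown)")
        # Logical check: the value tagged as a country should be a country.
        country = (author.findtext("address/cntry") or "").strip()
        if country and country not in KNOWN_COUNTRIES:
            problems.append(f"author {name}: unrecognized country '{country}'")
        # Assumed rule: the author's refid matches the id of the nested bio.
        bio = author.find("bio")
        if bio is not None and bio.get("id") != author.get("refid"):
            problems.append(f"author {name}: refid does not match bio id")
    return problems


GOOD = """<gcapaper id="s01-1"><front><title>The power of XML</title>
<author refid="s01-1auth1"><surname>Smith</surname>
<address><city>Dublin</city><cntry>Ireland</cntry></address>
<bio id="s01-1auth1"><para>Bio text</para></bio></author></front></gcapaper>"""

# Same paper with city and country swapped: structurally fine, logically wrong.
BAD = GOOD.replace("<city>Dublin</city><cntry>Ireland</cntry>",
                   "<city>Ireland</city><cntry>Dublin</cntry>")

print(check_paper(GOOD))
print(check_paper(BAD))
```

The second document would pass a purely structural check, since both elements are present and correctly nested; only the content check flags "Dublin" as an unlikely country.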
Almost all papers submitted to XML Europe 2001 adhered to the GCAPaper DTD. The exceptions were Microsoft PowerPoint presentations, which had to be converted to the GCAPaper DTD structure before they could be included in the conference proceedings publication. Further validation of all papers was then required to ensure they adhered to specific authoring guidelines for the DTD.
The authoring guidelines accompanying the GCAPaper DTD specify the correct usage of elements in the DTD and also define naming conventions for cross-references and images used within each paper. Validation against the authoring guidelines is especially important for conference proceedings, as a variety of authoring tools are used to produce the papers. Once all conference papers were received and validated, they were imported into a master document representing the conference proceedings publication.
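As a rough illustration of this last step, the Python sketch below checks a naming convention and then merges the papers into one master document. The actual guidelines and the structure of the master document are not reproduced in this article, so the id pattern (modelled on the example id "s01-1"), the `proceedings` root element, and the function `build_master` are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumed convention modelled on the example id "s01-1"; the real
# authoring guidelines may differ.
PAPER_ID = re.compile(r"^s\d{2}-\d+$")


def build_master(paper_texts):
    """Merge papers into one <proceedings> element (hypothetical name)."""
    master = ET.Element("proceedings")
    seen = set()
    for text in paper_texts:
        paper = ET.fromstring(text)
        paper_id = paper.get("id", "")
        if not PAPER_ID.match(paper_id):
            raise ValueError(f"paper id {paper_id!r} breaks the naming convention")
        if paper_id in seen:
            raise ValueError(f"duplicate paper id {paper_id!r}")
        seen.add(paper_id)
        master.append(paper)
    return master


papers = [
    '<gcapaper id="s01-1"><front><title>The power of XML</title></front></gcapaper>',
    '<gcapaper id="s01-2"><front><title>Another paper</title></front></gcapaper>',
]
master = build_master(papers)
titles = [t.text for t in master.iterfind("gcapaper/front/title")]
print(titles)
```

Rejecting a paper at import time, rather than silently including it, keeps naming problems from surfacing later as broken cross-references in the published proceedings.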