Electronic Publishing with XML
Step 3: Producing electronic publication formats from XML
XML is predominantly used to define markup vocabularies for specific application domains and, unlike HTML, generally has no default formatting styles. Instead, stylesheets are used to associate presentational information with XML documents. The W3C has developed a stylesheet language specifically for XML known as the eXtensible Stylesheet Language (XSL).
XSL consists of a transformation language (XSLT) and a language for high-quality formatting and layout of XML documents known as XSL Formatting Objects (XSL-FO). The XSLT Specification has been a W3C Recommendation since 1999 and is widely used as a means of transforming content from XML to other formats (including, but not limited to, XML). XSLT processors are available in various programming languages. Recent versions of certain Web browsers also support XSLT processing.
The XSL Specification, which defines XSL-FOs, is considerably larger than the XSLT Specification and is currently a W3C Candidate Recommendation. As a result, software support for XSL-FOs is not yet as widespread as support for XSLT, although a number of tools that will support XSL-FOs are under development. One tool already available is FOP, an open source print formatter driven by XSL-FOs, from the Apache XML Project. Although FOP is still in development and does not currently support the full XSL specification, it can be used to create PDF documents from XML content.
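To give a feel for what a formatter such as FOP consumes, here is a minimal sketch that builds a small XSL-FO document with Python's standard library. It is an illustration only, not the stylesheets used for the proceedings: the function name `build_fo_document` and the page master name `paper-page` are hypothetical, and the element and attribute names follow the XSL 1.0 Recommendation, which may differ from what a given FOP release accepts.

```python
import xml.etree.ElementTree as ET

# Namespace of XSL Formatting Objects as defined by the W3C.
FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag: str) -> str:
    """Return the namespace-qualified name of an XSL-FO element."""
    return f"{{{FO_NS}}}{tag}"


def build_fo_document(title: str, paragraphs: list[str]) -> str:
    """Build a one-flow XSL-FO document (hypothetical helper for illustration)."""
    root = ET.Element(fo("root"))

    # Page geometry: a single A4 page master with a body region.
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "paper-page",  # assumed name, referenced below
        "page-height": "29.7cm",
        "page-width": "21cm",
        "margin-top": "2cm",
        "margin-bottom": "2cm",
        "margin-left": "2cm",
        "margin-right": "2cm",
    })
    ET.SubElement(page, fo("region-body"))

    # Content: one page sequence whose flow fills the body region.
    sequence = ET.SubElement(root, fo("page-sequence"),
                             {"master-reference": "paper-page"})
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})

    heading = ET.SubElement(flow, fo("block"),
                            {"font-size": "18pt", "font-weight": "bold"})
    heading.text = title
    for text in paragraphs:
        block = ET.SubElement(flow, fo("block"), {"space-before": "6pt"})
        block.text = text

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_fo_document("Sample Paper", ["First paragraph.", "Second paragraph."]))
```

In a real pipeline the formatting objects would normally be produced by an XSLT stylesheet rather than written by hand, and the resulting file would then be passed to the formatter to render PDF.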
For this year's XML Europe conference, the GCA chose deepX (the authors' Ireland-based company, specializing in electronic publishing with XML) to create a version of the conference proceedings for distribution on CD-ROM and publication on the GCA web site. These proceedings included HTML as well as printable PDF versions of each conference paper. The creation of these formats (and others such as eBook) was entirely achieved through the use of XSL.
XSLT was used in the production of the XML Europe conference proceedings to generate an HTML version of each conference paper. Additional information pages were generated for efficient navigation within the proceedings, including a table of contents, index pages, and a biography page for each author. FOP was used to create a PDF version of each paper and a single PDF document containing the entire publication. The XSL stylesheets used to create the PDF documents were based on the DocBook XSL stylesheets developed by Norman Walsh.
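The navigation pages described above are derived entirely from the papers themselves. The sketch below shows the idea in plain Python rather than XSLT, so it is a stand-in for the technique and not the production stylesheets. The `<paper>`, `<title>`, and `<author>` element names, the `id` attribute, and the `build_navigation` function are all assumptions made for the example; the actual proceedings used DocBook-based markup.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from html import escape

# Hypothetical, simplified paper markup used only for this illustration.
PAPERS = [
    '<paper id="p1"><title>Publishing with XML</title>'
    '<author>A. Author</author></paper>',
    '<paper id="p2"><title>Styling &amp; Layout</title>'
    '<author>B. Writer</author><author>A. Author</author></paper>',
]


def build_navigation(paper_sources):
    """Return (table-of-contents HTML, author -> list of paper ids)."""
    toc_items = []
    by_author = defaultdict(list)
    for source in paper_sources:
        paper = ET.fromstring(source)
        paper_id = paper.get("id", "unknown")
        title = paper.findtext("title", default="(untitled)")
        # One table-of-contents entry per paper, linking to its HTML page.
        toc_items.append(
            f'<li><a href="{escape(paper_id)}.html">{escape(title)}</a></li>'
        )
        # Record the paper under each of its authors for the index pages.
        for author in paper.findall("author"):
            by_author[(author.text or "").strip()].append(paper_id)
    toc_html = "<ul>\n" + "\n".join(toc_items) + "\n</ul>"
    return toc_html, dict(sorted(by_author.items()))


if __name__ == "__main__":
    toc, index = build_navigation(PAPERS)
    print(toc)
    print(index)
```

Because the table of contents and the author index are computed from the source documents, adding or correcting a paper only requires regenerating the pages.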
To demonstrate further the potential of an XML-based publishing process, deepX produced an eBook version of the conference proceedings, based upon the Open eBook (OEB) publication structure. The OEB format describes the content and structure of an electronic publication and is supported by most eBook hardware and software readers. In some cases an OEB publication is compiled into a hardware/software specific eBook format. XSLT was used to generate the OEB version of the conference publication. A variety of eBook implementations were demonstrated at the deepX booth at XML Europe 2001 using eBook reader software on a desktop computer and PocketPC (Microsoft Reader), as well as on a dedicated eBook device (CyBook).
Advantages
A publishing process based on XML has a number of significant advantages, and the GCA adopted one for XML Europe as much for those advantages as for the conference's XML focus. Content defined in XML is platform- and software-independent. It is also independent of any particular display format, since XML separates content from presentational information. This simplifies the generation of multiple formats from a single source using technologies like XSLT. It also keeps the content compatible with emerging publication formats: supporting a new format only requires defining an appropriate transformation to it.
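The single-source idea can be sketched in a few lines. The example below uses plain Python functions in place of XSLT stylesheets, which is a substitution made only so the example runs with the standard library. The `<paper>`, `<title>`, and `<para>` elements and the `publish` function are hypothetical names chosen for the illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source document; the markup is invented for this sketch.
SOURCE = """<paper>
  <title>Electronic Publishing</title>
  <para>XML separates content from presentation.</para>
  <para>Each output format gets its own transformation.</para>
</paper>"""


def to_html(paper):
    """Render the paper as an HTML fragment."""
    parts = [f"<h1>{escape(paper.findtext('title', default=''))}</h1>"]
    parts += [f"<p>{escape(p.text or '')}</p>" for p in paper.findall("para")]
    return "\n".join(parts)


def to_text(paper):
    """Render the paper as plain text with an underlined title."""
    title = paper.findtext("title", default="")
    lines = [title, "=" * len(title)]
    lines += [p.text or "" for p in paper.findall("para")]
    return "\n".join(lines)


# Supporting a new output format means adding one more entry here;
# the source document is never touched.
RENDERERS = {"html": to_html, "txt": to_text}


def publish(source):
    """Parse the source once and run every registered transformation."""
    paper = ET.fromstring(source)
    return {name: render(paper) for name, render in RENDERERS.items()}


if __name__ == "__main__":
    for name, output in publish(SOURCE).items():
        print(f"--- {name} ---")
        print(output)
```

The design choice to note is that the content is parsed once and each format is a separate, independent transformation, which is the property that makes new formats cheap to add later.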
Challenges
XML is much easier to learn, use, and process than its parent SGML, and it has been adopted by a wider range of application domains than SGML. However, current support for XML as a publishing format is only provided by specialized software or as an export format within more common publishing tools. Facilitating the creation of content in XML requires tools that allow authors to produce structured information without having to change their current practices.
Although submissions for XML Europe 2001 were requested in an XML format, a small number did not adhere to this guideline. The submissions received included non-XML documents (e.g. Microsoft PowerPoint) as well as well-formed but invalid XML documents. To keep the publication process consistent, a significant amount of manual work had to be undertaken to correct several submissions. The process for generating the conference proceedings could be made more flexible by providing import filters from common authoring environments such as PowerPoint.
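The distinction between a submission that is not well-formed and one that is well-formed but invalid can be checked automatically before any manual correction begins. The following is a minimal sketch under stated assumptions: Python's `xml.etree.ElementTree` parser does not validate against a DTD, so the "validity" test here is a simplified structural check against a hypothetical list of required elements (`REQUIRED_CHILDREN`), not real DTD validation. A validating parser would be needed for that.

```python
import xml.etree.ElementTree as ET

# Hypothetical structural rules standing in for a real DTD.
EXPECTED_ROOT = "paper"
REQUIRED_CHILDREN = ("title", "author", "abstract")


def check_submission(source):
    """Classify a submission as 'not well-formed', 'invalid', or 'ok'.

    Returns a (status, problems) tuple, where problems is a list of messages.
    """
    try:
        root = ET.fromstring(source)
    except ET.ParseError as error:
        # The parser could not read the document at all.
        return "not well-formed", [str(error)]

    problems = []
    if root.tag != EXPECTED_ROOT:
        problems.append(f"root element is <{root.tag}>, expected <{EXPECTED_ROOT}>")
    for name in REQUIRED_CHILDREN:
        if root.find(name) is None:
            problems.append(f"missing <{name}>")
    return ("ok" if not problems else "invalid"), problems


if __name__ == "__main__":
    samples = [
        "<paper><title>Unclosed</title>",
        "<paper><title>No author</title></paper>",
        "<paper><title>T</title><author>A</author><abstract>x</abstract></paper>",
    ]
    for sample in samples:
        print(check_submission(sample))
```

Running a check like this at submission time would let authors fix structural problems themselves, reducing the manual correction work described above.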
Conclusions
XML is an ideal technology around which to build a publishing process. It is a platform- and software-independent language that can be transformed into a variety of common publishing formats, including content formats used on the Web. The publishing process adopted by the GCA has proven very successful, and it's fitting that a conference promoting XML demonstrates one of its many useful applications.
Links
- Apache - http://xml.apache.org/
- deepX - http://www.deepX.com/
- DocBook XML - http://www.docbook.org/
- DocBook XSL Stylesheets - http://www.nwalsh.com/
- GCA - http://www.gca.org/
- OEBForum - http://www.openebook.com/
- W3C - http://www.w3.org/
- XML Europe 2001 - http://www.gca.org/papers/xmleurope2001/
Acknowledgments
The authors would like to thank Pam Gennusa for her efforts in the coordination and production of the conference proceedings for XML Europe 2001. Thanks also to Hewlett Packard Ireland and Cytale, who provided devices for demonstration purposes at the deepX booth at XML Europe 2001. More information about these devices (Jornada 548 and 720, CyBook) can be found on the companies' respective web sites.
Want to know more about the processes used, or have your own experience with 100% XML publishing? Let us know by using the forum.
- XML Publishing
2001-07-06 05:34:04 Greg Duncan [Reply]
Publishing to web, paper and CD-ROM from a single source is not new and exciting; the pieces have been in place for some time and organisations are doing this to 1200 dpi web press standard with automated systems.
The problem is to abstract the users from the XML by enabling word processors to use XML as their native storage format, using user-defined DTDs and mapping tools.
While ever we ask people to understand new technology (XML) and ask large corporations to purchase specific tools, we are going nowhere fast - we must enable the familiar tools that are already deployed and that staff are trained to use.
Greg Duncan
XML Australia
