XML.com: XML From the Inside Out

XML Technologies: A Success Story

May 16, 2001

We've all heard stories of how new XML technologies have helped build immense corporate databases and complex, dynamic web sites. Well, this isn't one of those stories. This story is about how the Apache Software Foundation's XML tools helped improve this year's California Central Coast Section High School wrestling tournament.

Background

[Image: portion of printed bracket]

I've been using a computer to do the scorekeeping for the CCS wrestling tournament for the past ten years. The program, written in C for DOS, displays the bracket on the screen. As you enter results of matches, the winners are automatically advanced and the losers automatically dropped to the consolation bracket or out of the tournament. The program prints bracket sheets to an inkjet printer. A portion of the output is shown in the image at the right. A dot matrix printer produces mailing labels with the results of the matches. These labels are affixed to large wall charts.

Other labels containing the information for upcoming matches are affixed to pre-printed bout sheets such as the one you see at the right. Until about 1998, the wall charts and bout sheets were designed for a label size of 2.5 inches by 15/16 inches. These labels are a non-standard size, so they have to be ordered in advance and are hideously expensive. Recently, the wall charts were redesigned to take the standard 3.5 inch by 15/16 inch labels, but the bout sheets weren't. Fate intervened when the CCS Events Coordinator asked me to send him a blank bout sheet to copy. This was my opportunity to create a new bout sheet template that would let me use the larger labels.

Since I'm using Linux and the CCS uses Windows, I needed a cross-platform solution. Adobe PDF format was the answer, and this is where Scalable Vector Graphics (SVG) and Formatting Objects to PDF (FOP) enter the story.

Creating the Bout Sheet

The bout sheet is not a typical text document; it's mostly a set of lines, empty boxes and a circle with minimal text labeling. Thus, I decided to use Scalable Vector Graphics (SVG) to describe the form, and use FOP as a wrapper to produce the desired PDF output. I took a ruler and an old bout sheet, redrew the lines, and measured the widths and locations of the boxes and text, and created the formatting objects XML file by hand. You may download the FO file and the resulting PDF file.
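To make the "SVG inside an FO wrapper" idea concrete, here is a minimal sketch in Python rather than hand-written XML. It is not the article's FO file: the page size, the shapes, and every coordinate are made-up placeholders, and the page-sequence attribute follows the XSL 1.0 Recommendation, which may differ from what FOP 0.16.0 expected. The one structural point it shows is that the SVG drawing sits inside an fo:instream-foreign-object.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("fo", FO_NS)
ET.register_namespace("svg", SVG_NS)


def fo(tag):
    """Qualified name for an element in the XSL FO namespace."""
    return f"{{{FO_NS}}}{tag}"


def svg(tag):
    """Qualified name for an element in the SVG namespace."""
    return f"{{{SVG_NS}}}{tag}"


def build_bout_sheet():
    """Return an FO document (as a string) that wraps one SVG drawing."""
    root = ET.Element(fo("root"))

    # Assumed US Letter page; the real sheet size is not given in the article.
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "sheet", "page-width": "612pt", "page-height": "792pt"})
    ET.SubElement(page, fo("region-body"))

    sequence = ET.SubElement(root, fo("page-sequence"), {"master-reference": "sheet"})
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})
    block = ET.SubElement(flow, fo("block"))

    # The drawing travels inside the FO document as a foreign object.
    holder = ET.SubElement(block, fo("instream-foreign-object"))
    drawing = ET.SubElement(holder, svg("svg"), {
        "width": "540pt", "height": "720pt", "viewBox": "0 0 540 720"})

    # Placeholder shapes: a label-sized box, a circle, a rule, and a caption.
    outline = {"fill": "none", "stroke": "black"}
    ET.SubElement(drawing, svg("rect"), {
        "x": "20", "y": "20", "width": "252", "height": "67.5", **outline})
    ET.SubElement(drawing, svg("circle"), {"cx": "400", "cy": "60", "r": "30", **outline})
    ET.SubElement(drawing, svg("line"), {
        "x1": "20", "y1": "120", "x2": "520", "y2": "120", "stroke": "black"})
    caption = ET.SubElement(drawing, svg("text"), {"x": "20", "y": "140", "font-size": "10"})
    caption.text = "Match statistics"

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_bout_sheet())
```

The printed document could then be handed to an FO processor; whether a particular FOP release accepts it unchanged is not something this sketch checks.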

The advantages of the new bout sheet were small but significant. I was able to use cheaper labels, and they didn't overlap the area where the scorekeepers write the match statistics. This meant that the results were easier to read. But the real payoff from XML technologies came near the end of the tournament.

Printing the Results

A few years ago, I added code to the scorekeeping program to output the brackets as a Rich Text Format (RTF) file. The file didn't describe the bracket completely; once you loaded it into a word processor you had to set the font to a small size so that the contents would fit on the page width, and you had to use a monospace font since the RTF mirrored the screen display and bracket print code, which was also monospaced. It worked, but it was ugly.

I thought that it would be nicer to use a proportional font with true underlining and vertical rules, but I didn't know RTF well enough to achieve this effect. However, I had been learning about XSL Formatting Objects (XSL FO), and, at the beginning of the second day of the tournament, I realized that formatting objects, in conjunction with FOP, would do exactly what I wanted.

Data to XML

The first problem was converting the match data, stored in binary data files, to XML. Luckily, there's a three-hour break between the end of the consolation matches and the finals. I used part of this time to write a Perl script that would produce an XML file by reading the files that describe the bracket and match results. I didn't want to go directly to formatting objects; doing that would only complicate the Perl script. I decided to create a simple ad-hoc XML notation that would be nearly a one-to-one correspondence to the data file structure. The result looked like this:

<line num="1">
   <cell num="0" short="yes"></cell>
   <cell num="1" underline="yes">Chris Jaworski</cell>
</line>
<line num="2">
   <cell num="0" short="yes">Bout 1</cell>
   <cell num="1" vbar="yes">(St. Francis)</cell>
   <cell num="2" underline="yes">Chris Jaworski</cell>
   <cell num="5" text="yes">CCS Championships</cell>
</line>

I associated a num attribute with each line and cell; this permitted me to skip empty cells.
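The numbering scheme can be sketched in a few lines. This is Python, not the Perl script from the tournament, and it does not read the binary data files, whose layout the article doesn't describe. It assumes a hypothetical in-memory bracket: a list of rows, where each cell is either None (empty) or a (text, flags) pair. Because every cell that is written carries its own num, empty cells can simply be left out.

```python
from xml.sax.saxutils import escape


def cell_xml(num, text, flags):
    """Render one non-empty cell; each flag becomes a name="yes" attribute."""
    attrs = "".join(f' {name}="yes"' for name in flags)
    return f'   <cell num="{num}"{attrs}>{escape(text)}</cell>'


def bracket_to_xml(rows):
    """Turn a grid of cells into <line>/<cell> markup, skipping empty cells."""
    out = []
    for line_num, row in enumerate(rows, start=1):   # lines are numbered from 1
        out.append(f'<line num="{line_num}">')
        for cell_num, cell in enumerate(row):        # cells are numbered from 0
            if cell is None:
                continue                             # the num attribute marks the gap
            text, flags = cell
            out.append(cell_xml(cell_num, text, flags))
        out.append("</line>")
    return "\n".join(out)


# Hypothetical grid that mirrors the two sample lines shown above.
SAMPLE_ROWS = [
    [("", ["short"]), ("Chris Jaworski", ["underline"])],
    [("Bout 1", ["short"]), ("(St. Francis)", ["vbar"]),
     ("Chris Jaworski", ["underline"]), None, None,
     ("CCS Championships", ["text"])],
]

if __name__ == "__main__":
    print(bracket_to_xml(SAMPLE_ROWS))
```

In the second row, cells 3 and 4 are None, so the output jumps straight from num="2" to num="5".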

XML to XSL FO

I could now use XSL Transformation (XSLT) to convert the ad hoc XML to XSL FO. The plan was to make each cell of the bracket into a <fo:block> inside an absolutely-positioned <fo:block-container>. By turning on the bottom and right borders of each block-container, I could construct the underlines and vertical bars of the bracket. You may download the XSL file that does the conversion. I used Xalan to do the transformation and FOP (0.16.0) to convert the formatting objects to PDF.
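The mapping from cells to positioned containers can be illustrated outside XSLT as well. The sketch below uses Python's standard library in place of Xalan and is not the downloadable stylesheet. The column width, line height, and rule thickness are invented values, the `<bracket>` root element is a hypothetical wrapper around the line elements, and the short and text attributes are ignored. It uses absolute-position, the property name in the XSL 1.0 Recommendation, which may not be what FOP 0.16.0 accepted.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)

# Assumed layout constants; the article does not give the real measurements.
COLUMN_WIDTH_PT = 90
LINE_HEIGHT_PT = 12
RULE_WIDTH = "0.5pt"


def cell_to_container(line_num, cell):
    """Build an absolutely positioned fo:block-container for one <cell>."""
    column = int(cell.get("num"))
    attrs = {
        "absolute-position": "absolute",
        "left": f"{column * COLUMN_WIDTH_PT}pt",
        "top": f"{(line_num - 1) * LINE_HEIGHT_PT}pt",
        "width": f"{COLUMN_WIDTH_PT}pt",
        "height": f"{LINE_HEIGHT_PT}pt",
    }
    if cell.get("underline") == "yes":      # bottom border draws the underline
        attrs["border-bottom-style"] = "solid"
        attrs["border-bottom-width"] = RULE_WIDTH
    if cell.get("vbar") == "yes":           # right border draws the vertical bar
        attrs["border-right-style"] = "solid"
        attrs["border-right-width"] = RULE_WIDTH
    container = ET.Element(f"{{{FO_NS}}}block-container", attrs)
    block = ET.SubElement(container, f"{{{FO_NS}}}block")
    block.text = cell.text or ""
    return container


def bracket_to_containers(bracket_xml):
    """Convert every cell under a (hypothetical) <bracket> root."""
    root = ET.fromstring(bracket_xml)
    return [cell_to_container(int(line.get("num")), cell)
            for line in root.iter("line")
            for cell in line.iter("cell")]
```

Because each container carries its own left and top, skipped cells need no placeholder: the position comes from the num attributes alone.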

If this were a press release, I'd rhapsodize about how successful the whole process was and that would be the end of this article. That would not, however, be the entire story -- a little “post-game analysis” is in order.

The construction of the bout sheet, in fact, was entirely straightforward. I had to correct some minor errors, and I had to convert some of the units to points and do some minor adjustment to get the text right where I wanted it, but the SVG and FO themselves worked perfectly.
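For the unit conversion, the arithmetic is simple enough to show directly. This small Python sketch assumes the usual definition of 72 points to the inch; the function names are made up for illustration, and the label sizes are the ones mentioned earlier in the article.

```python
POINTS_PER_INCH = 72.0


def inches_to_points(inches):
    """Convert a ruler measurement in inches to points."""
    return inches * POINTS_PER_INCH


def label_size_points(width_in, height_in):
    """Return a label's (width, height) in points."""
    return (inches_to_points(width_in), inches_to_points(height_in))


if __name__ == "__main__":
    print(label_size_points(3.5, 15 / 16))   # standard label
    print(label_size_points(2.5, 15 / 16))   # older non-standard label
```

The standard 3.5 inch by 15/16 inch label comes out at 252 by 67.5 points, and the older 2.5 inch label at 180 by 67.5 points.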

Converting the data files to XML was not a big problem; the biggest annoyance was having to look at code I had written years ago to figure out the internal data structure of the bracket.

Writing the XSLT file to convert to XSL FO was a bit more problematic and required quite a bit of experimentation. I encountered some problems during this process:

  • After a failed attempt to place a <fo:block break-after="page"> around a <fo:block-container>, I tried putting the break-after="page" in the <fo:block-container> element. However, block containers are not part of the text flow, so that doesn't work either. That's why the template for page has an added <fo:block break-after="page"> to force a new page.
  • The names were too close to the vertical bars; I fixed it by adding a start-indent to each cell.
  • The names were too far above the underlines. I tried to use the alignment-adjust attribute, but it wasn't implemented in FOP 0.16.0. The vertical-align attribute didn't do the trick either, so I ended up adding a space-before to move the text closer to the line.
  • The title at the upper right of the bracket was longer than the cell width, so it word-wrapped and only the first word showed up. That's why I had to add the wrap-option="no-wrap".
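The workarounds in this list amount to a handful of attributes. The Python sketch below shows where each one would sit; it is not taken from the XSL file. The 3pt and 2pt values are invented, and space-before is assumed here to be the property used to add space above the text.

```python
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Qualified name for an element in the XSL FO namespace."""
    return f"{{{FO_NS}}}{tag}"


def cell_block(text, start_indent="3pt", space_before="2pt", no_wrap=False):
    """Build the fo:block that holds a cell's text, with the spacing fixes."""
    attrs = {
        "start-indent": start_indent,   # keeps names off the vertical bar
        "space-before": space_before,   # pushes text down toward the underline
    }
    if no_wrap:
        attrs["wrap-option"] = "no-wrap"   # for the long title cell
    block = ET.Element(fo("block"), attrs)
    block.text = text
    return block


def page_break():
    """An empty block in the normal flow that forces the next page."""
    return ET.Element(fo("block"), {"break-after": "page"})


if __name__ == "__main__":
    title = cell_block("CCS Championships", no_wrap=True)
    print(ET.tostring(title, encoding="unicode"))
    print(ET.tostring(page_break(), encoding="unicode"))
```

The page break is a separate block because, as the first item explains, the positioned containers are outside the text flow and cannot carry the break themselves.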

You can download a ZIP file that contains the XML and XSL files if you'd like to take a closer look. You can download the resulting PDF files from this directory.

Summary

So after all that work and trouble, was it worth it? Yes. It took me less time to produce the bout sheet with SVG than it would have taken to find a Windows machine, learn to use a drawing program, and produce a file that would have been in a proprietary format.

The bracket printout was also worthwhile, both as a learning exercise and as a proof of concept, and the PDF output looks better than the RTF. Again, I saved time: it was easier for me to learn the syntax for formatting objects than it would have been to learn enough RTF to produce an equally good-looking result in that format.

Finally, the fact that I was able to accomplish all of these tasks with open source software is the icing on the cake.