Using XSL Formatting Objects
The World Wide Web Consortium's specification for Extensible Stylesheet Language (XSL) comes in two parts:
- XSLT, a language for transforming XML documents, and
- XSL Formatting Objects (XSL FO), an XML vocabulary for specifying formatting semantics.
XSLT is easy to learn and use. With only a modest investment of time, developers can convert an XML file to an HTML file that users can display in their browsers. This explains why developers are greeting XSLT with great enthusiasm. XML.com's column "Transforming XML" is a great place to get started working with XSLT.
XSL Formatting Objects is itself an XML-based markup language that lets you specify in great detail the pagination, layout, and styling information that will be applied to your content. The XSL FO markup is quite complex. It is also verbose; virtually the only practical way to produce an XSL FO file is to generate it with XSLT. Finally, once you have this XSL FO file, you need some way to render it to an output medium. There are few tools available to do this final step. For these reasons, XSL FO has not caught on as quickly as XSLT.
Rather than explain XSL FO in its entirety, this article will give you enough information to use the major features of XSL FO. Our case study will be a short review handbook of Spanish that will be printed as an insert for a Spanish language learning CD-ROM. We'll use the Apache Software Foundation's FOP tool to convert the FO file to a PDF file.
Initialization
Since the XSL FO file will be an XML document, it must begin with the standard XML declaration, followed by the FO root element, which declares the fo namespace.
<?xml version="1.0" encoding="utf-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
The structure of the remainder of the document is:
- The layout master set, which consists of
  - Descriptions of the kinds of pages that can occur in the document.
  - Sequences in which those page formats can occur.
- The pages and their content.
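As a rough sketch of that outline (not the finished handbook file), a skeleton FO document might look like the following. The master-reference attribute follows the XSL 1.0 Recommendation; some early FOP releases expected master-name on fo:page-sequence instead, so check the version you are using. The placeholder text is purely illustrative.

<?xml version="1.0" encoding="utf-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- 1. The layout master set -->
  <fo:layout-master-set>
    <!-- Descriptions of the kinds of pages -->
    <fo:simple-page-master master-name="cover"
        page-height="12cm" page-width="12cm">
      <fo:region-body/>
    </fo:simple-page-master>
    <!-- Sequences of page formats (fo:page-sequence-master) would go here -->
  </fo:layout-master-set>
  <!-- 2. The pages and their content -->
  <fo:page-sequence master-reference="cover">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Placeholder cover text</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>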
Page Layouts
After the FO document's beginning <fo:root> tag, we have to describe what kinds of pages our document can have. Our document will have three kinds of pages shown in the diagram below. To accommodate the stapling area, the cover page and right-hand pages will have more margin space at the left. The content pages will also have a region for a header and footer.
Let's start out by specifying the page widths and heights and margins. The units below are all in centimeters, but you may use any of the CSS units, such as px (pixel), pt (point), em, in, mm, etc. Each of these specifications is called a simple-page-master and must be given a master-name so you can refer to it later.
<fo:layout-master-set>
  <fo:simple-page-master master-name="cover"
      page-height="12cm"
      page-width="12cm"
      margin-top="0.5cm"
      margin-bottom="0.5cm"
      margin-left="1cm"
      margin-right="0.5cm">
  </fo:simple-page-master>
  <fo:simple-page-master master-name="leftPage"
      page-height="12cm"
      page-width="12cm"
      margin-left="0.5cm"
      margin-right="1cm"
      margin-top="0.5cm"
      margin-bottom="0.5cm">
  </fo:simple-page-master>
  <fo:simple-page-master master-name="rightPage"
      page-height="12cm"
      page-width="12cm"
      margin-left="1cm"
      margin-right="0.5cm"
      margin-top="0.5cm"
      margin-bottom="0.5cm">
  </fo:simple-page-master>
  <!-- more info will go here -->
</fo:layout-master-set>
The margins are areas which will not contain any printed output.
The Content Area
All of the printing occurs within the dotted lines in the diagram above. This is the page content area (officially called the page-reference-area), which can be divided into five regions as shown below.
Directions
Before continuing, we have to take a side trip to explain some terminology. When we set margins, we use words like top, bottom, left, and right, because everyone agrees which edge of a piece of paper is the top edge, left edge, etc. We will use different words when we talk about the content area, because not all languages are written left-to-right, top-to-bottom.
FO considers a page to be made up of two classes of elements: block elements (such as paragraphs), which begin on a new line, and inline elements (such as bold or italic text), which don't. You can think of FO's block-progression-direction as the order in which paragraphs are placed on a page. The before-edge precedes a paragraph, and the after-edge follows it.
The inline-progression-direction is the order in which characters are placed within a line. The start-edge precedes a line, and the end-edge follows it.
For Hebrew, as shown below, the start- and end- edges are the opposite of those used for English. (Arabic is written similarly.)
Japanese is sometimes written as shown below. The picture is from the XSL specification.
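In FO markup, these directions are governed by the writing-mode property defined in the XSL specification. The fragment below is only a sketch: it assumes your formatter honors writing-mode on fo:block-container, which has not been checked against FOP, and the content is a placeholder.

<fo:block-container writing-mode="rl-tb">
  <!-- rl-tb: lines run right-to-left, blocks stack top-to-bottom -->
  <fo:block text-align="start">
    <!-- Hebrew or Arabic text goes here; "start" is now the right-hand edge -->
  </fo:block>
</fo:block-container>

The specification also defines lr-tb (the English arrangement) and tb-rl (vertical lines that progress from right to left, as in the Japanese example).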
The advantage of using this new vocabulary is that it is language-independent. If you want a heading to be at the opposite side of the page from normal text, you set its text-align="end" so it appears like this:
                                                        An interesting heading
Headings set like the one above are unusual, and thus more likely to catch a reader's attention.
If the document is later translated to Arabic or Japanese, you will be assured that the heading will still appear at the corresponding “opposite side” of the text. There will be no need to go through your document reversing left and right or switching them with top and bottom.
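A minimal sketch of the markup behind that example, assuming the heading and the paragraph are each an fo:block inside a flow (the font-weight setting is an illustrative addition, not part of the original example):

<fo:block text-align="end" font-weight="bold">An interesting heading</fo:block>
<fo:block text-align="start">Headings set like the one above are unusual, and thus more likely to catch a reader's attention.</fo:block>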
Specifying Region Dimensions
The cover page doesn't need a header or footer, so we need only specify information for the region-body by adding the fo:region-body element shown below.
<fo:simple-page-master master-name="cover"
    page-height="12cm"
    page-width="12cm"
    margin-top="0.5cm"
    margin-bottom="0.5cm"
    margin-left="1cm"
    margin-right="0.5cm">
  <fo:region-body
      margin-top="3cm" />
</fo:simple-page-master>
The left and right pages will have a header and footer, so we must specify the extent of the region-before and region-after.
<fo:simple-page-master master-name="leftPage"
    page-height="12cm"
    page-width="12cm"
    margin-left="0.5cm"
    margin-right="1cm"
    margin-top="0.5cm"
    margin-bottom="0.5cm">
  <fo:region-before extent="1cm"/>
  <fo:region-after extent="1cm"/>
  <fo:region-body
      margin-top="1.1cm"
      margin-bottom="1.1cm" />
</fo:simple-page-master>
<fo:simple-page-master master-name="rightPage"
    page-height="12cm"
    page-width="12cm"
    margin-left="1cm"
    margin-right="0.5cm"
    margin-top="0.5cm"
    margin-bottom="0.5cm">
  <fo:region-before extent="1cm"/>
  <fo:region-after extent="1cm"/>
  <fo:region-body
      margin-top="1.1cm"
      margin-bottom="1.1cm" />
</fo:simple-page-master>
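To see how these numbers fit together, here is the arithmetic for the leftPage master, written out as an XML comment that could sit beside the definition. It is only a worked check of the values above, assuming the regions are laid out as the text describes; the rightPage master gives the same totals because its left and right margins are simply swapped.

<!--
  leftPage, all values in cm
  content area width  = page-width  - margin-left - margin-right  = 12 - 0.5 - 1   = 10.5
  content area height = page-height - margin-top  - margin-bottom = 12 - 0.5 - 0.5 = 11
  region-before: top 1 cm of the content area (the header)
  region-after:  bottom 1 cm of the content area (the footer)
  region-body height = 11 - 1.1 - 1.1 = 8.8
  gap between the body and the header (or footer) = 1.1 - 1 = 0.1
-->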
Important: The margins you set for the region-body must be greater than or equal to the extents of the region-before and region-after (and of the region-start and region-end, if you use them; FOP does not currently support them). If you do something like this:
<fo:region-before extent="1cm"/>
<fo:region-after extent="1cm"/>
<fo:region-body
    margin-top="0.20cm"
    margin-bottom="0.20cm" />
you can expect the body text to overlap the header and footer.