XML.com: XML From the Inside Out

Using XSL Formatting Objects

January 17, 2001

Table of Contents

Initialization

Page Layouts

The Content Area

Directions

Specifying Region Dimensions

Page Sequences

The Cover Page

Creating the PDF file

Beginning the Content Pages

Watch this space

The World Wide Web Consortium's specification for Extensible Stylesheet Language (XSL) comes in two parts:

  1. XSLT, a language for transforming XML documents, and
  2. XSL Formatting Objects (XSL FO), an XML vocabulary for specifying formatting semantics.

XSLT is easy to learn and use. With only a modest investment of time, developers can convert an XML file to an HTML file that users can display in their browsers. This explains why developers are greeting XSLT with great enthusiasm. XML.com's column "Transforming XML" is a great place to get started working with XSLT.

XSL Formatting Objects is itself an XML-based markup language that lets you specify in great detail the pagination, layout, and styling information that will be applied to your content. The XSL FO markup is quite complex. It is also verbose; virtually the only practical way to produce an XSL FO file is to use XSLT to generate it from a source document. Finally, once you have this XSL FO file, you need some way to render it to an output medium, and few tools are available for this final step. For these reasons, XSL FO has not caught on as quickly as XSLT.

Rather than explain XSL FO in its entirety, this article will give you enough information to use the major features of XSL FO. Our case study will be a short review handbook of Spanish that will be printed as an insert for a Spanish language learning CD-ROM. We'll use the Apache Software Foundation's FOP tool to convert the FO file to a PDF file.

Initialization

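Since the XSL FO file is an XML document, it must begin with the standard XML declaration, followed by the FO root element. The root element declares the fo namespace prefix that the rest of the document uses.

<?xml version="1.0" encoding="utf-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">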

The structure of the remainder of the document is:

  • The layout master set, which consists of
    • Descriptions of the kinds of pages that can occur in the document.
    • Sequences in which those page formats can occur.
  • The pages and their content.
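
Put together, that outline corresponds to a skeleton like the one below. This is a minimal sketch rather than the handbook's actual file: the master name "simple" and the placeholder text are assumptions made for illustration. The master-reference attribute follows the XSL 1.0 Recommendation; FOP releases that predate the Recommendation may expect a different attribute name.

```xml
<?xml version="1.0" encoding="utf-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- 1. The layout master set: the kinds of pages the document can use -->
  <fo:layout-master-set>
    <!-- "simple" is a hypothetical master name chosen for this sketch -->
    <fo:simple-page-master master-name="simple"
        page-height="12cm"
        page-width="12cm"
        margin-top="0.5cm"
        margin-bottom="0.5cm"
        margin-left="1cm"
        margin-right="0.5cm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>

  <!-- 2. The pages and their content -->
  <fo:page-sequence master-reference="simple">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Placeholder content</fo:block>
    </fo:flow>
  </fo:page-sequence>

</fo:root>
```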

Page Layouts

After the FO document's beginning <fo:root> tag, we have to describe what kinds of pages our document can have. Our document will have three kinds of pages, shown in the diagram below. To accommodate the stapling area, the cover page and right-hand pages will have more margin space at the left. The content pages will also have regions for a header and a footer.

Page layout diagrams

Let's start out by specifying the page widths and heights and margins. The units below are all in centimeters, but you may use any of the CSS units, such as px (pixel), pt (point), em, in, mm, etc. Each of these specifications is called a simple-page-master and must be given a master-name so you can refer to it later.

<fo:layout-master-set>
    <fo:simple-page-master master-name="cover"
        page-height="12cm"
        page-width="12cm"
        margin-top="0.5cm"
        margin-bottom="0.5cm"
        margin-left="1cm"
        margin-right="0.5cm">
    </fo:simple-page-master>

    <fo:simple-page-master master-name="leftPage"
        page-height="12cm"
        page-width="12cm"
        margin-left="0.5cm"
        margin-right="1cm"
        margin-top="0.5cm"
        margin-bottom="0.5cm">
    </fo:simple-page-master>

    <fo:simple-page-master master-name="rightPage"
        page-height="12cm"
        page-width="12cm"
        margin-left="1cm"
        margin-right="0.5cm"
        margin-top="0.5cm"
        margin-bottom="0.5cm">
    </fo:simple-page-master>

    <!-- more info will go here -->
</fo:layout-master-set>

The margins are areas which will not contain any printed output.
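
To make the arithmetic concrete, here is the cover master again with the same dimensions written in millimeters (1cm = 10mm) and the resulting content area worked out in comments. The master name "coverInMm" is a hypothetical name used only for this sketch, and the namespace declaration is repeated so that the fragment is well-formed on its own.

```xml
<fo:simple-page-master xmlns:fo="http://www.w3.org/1999/XSL/Format"
    master-name="coverInMm"
    page-height="120mm"
    page-width="120mm"
    margin-top="5mm"
    margin-bottom="5mm"
    margin-left="10mm"
    margin-right="5mm">
    <!-- Content area width:  120mm - 10mm - 5mm = 105mm (10.5cm) -->
    <!-- Content area height: 120mm - 5mm - 5mm = 110mm (11cm) -->
    <!-- Region declarations are added in a later step -->
</fo:simple-page-master>
```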

The Content Area

All of the printing occurs within the dotted lines in the diagram above. This is the page content area (officially called the page-reference-area), which can be divided into five regions (region-body, region-before, region-after, region-start, and region-end) as shown below.

Regions of the page content area

Directions

Before continuing, we have to take a side trip to explain some terminology. When we set margins, we use words like top, bottom, left, and right, because everyone agrees which edge of a piece of paper is the top edge, left edge, etc. We will use different words when we talk about the content area, because not all languages are written left-to-right, top-to-bottom.

FO considers a page to be made up of two classes of elements: block elements (such as paragraphs), which begin on a new line, and inline elements (such as bold or italic text), which don't. You can think of FO's block-progression-direction as the order in which paragraphs are placed on a page. The before-edge precedes a paragraph, and the after-edge follows it.

The inline-progression-direction is the order in which characters are placed within a line. The start-edge precedes a line, and the end-edge follows it.

For Hebrew, as shown below, the start-edge and end-edge are the opposite of those used for English. (Arabic is written similarly.)

Hebrew written right-to-left

Japanese is sometimes written as shown below. The picture is from the XSL specification.

Japanese written top-to-bottom, right-to-left

The advantage of using this new vocabulary is that it is language-independent. If you want a heading to be at the opposite side of the page from normal text, you set its text-align="end" so it appears like this:

An interesting heading

Headings set like the one above are unusual, and thus more likely to catch a reader's attention.

If the document is later translated to Arabic or Japanese, you will be assured that the heading will still appear at the corresponding “opposite side” of the text. There will be no need to go through your document reversing left and right or switching them with top and bottom.
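
In markup, this might look like the fragment below. It is a sketch, not part of the handbook: the sample text is invented, and the namespace declaration is repeated only so the fragment is well-formed on its own. Support for the writing-mode property varies between formatters, so check your tool's documentation before relying on it.

```xml
<!-- Fragment: in a real document this fo:flow sits inside an fo:page-sequence -->
<fo:flow xmlns:fo="http://www.w3.org/1999/XSL/Format"
    flow-name="xsl-region-body">

    <!-- Default left-to-right writing mode: the end-edge is on the right -->
    <fo:block text-align="end" font-weight="bold">An interesting heading</fo:block>
    <fo:block text-align="start">Normal text begins at the start-edge.</fo:block>

    <!-- Right-to-left writing mode: the same text-align values now refer
         to the opposite physical sides, so "end" means the left -->
    <fo:block-container writing-mode="rl-tb">
        <fo:block text-align="end" font-weight="bold">Translated heading goes here</fo:block>
        <fo:block text-align="start">Translated text goes here.</fo:block>
    </fo:block-container>

</fo:flow>
```

Only the writing-mode value changes between the two cases; the text-align values are identical.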

Specifying Region Dimensions

The cover page doesn't need a header or footer, so we need only specify information for the region-body by adding an fo:region-body element to its page master, as shown below.

    <fo:simple-page-master master-name="cover"
        page-height="12cm"
        page-width="12cm"
        margin-top="0.5cm"
        margin-bottom="0.5cm"
        margin-left="1cm"
        margin-right="0.5cm">
        <fo:region-body
            margin-top="3cm" />
    </fo:simple-page-master>

The left and right pages will have a header and footer, so we must specify the extent of the region-before and region-after.

    <fo:simple-page-master master-name="leftPage"
        page-height="12cm"
        page-width="12cm"
        margin-left="0.5cm"
        margin-right="1cm"
        margin-top="0.5cm"
        margin-bottom="0.5cm">
        <!-- The XSL 1.0 Recommendation requires region-body to come first -->
        <fo:region-body
            margin-top="1.1cm"
            margin-bottom="1.1cm" />
        <fo:region-before extent="1cm"/>
        <fo:region-after extent="1cm"/>
    </fo:simple-page-master>

    <fo:simple-page-master master-name="rightPage"
        page-height="12cm"
        page-width="12cm"
        margin-left="1cm"
        margin-right="0.5cm"
        margin-top="0.5cm"
        margin-bottom="0.5cm">
        <fo:region-body
            margin-top="1.1cm"
            margin-bottom="1.1cm" />
        <fo:region-before extent="1cm"/>
        <fo:region-after extent="1cm"/>
    </fo:simple-page-master>

Important: The margins you set for the region-body must be greater than or equal to the extents of the region-before and region-after (and of the region-start and region-end if you use them; FOP does not currently support them). If you do something like this:

<fo:region-before extent="1cm"/>
<fo:region-after extent="1cm"/>
<fo:region-body
    margin-top="0.20cm"
    margin-bottom="0.20cm" />

you can expect results like this:

Text overwrites heading
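
The same rule applies on all four sides. The following sketch adds side regions to a page master; the master name "withSidebars" and the 2cm extents are assumptions made for illustration, and, as noted above, FOP may not render region-start and region-end. The region-body comes first, as the XSL 1.0 Recommendation requires, and the namespace declaration is repeated so the fragment is well-formed on its own.

```xml
<fo:simple-page-master xmlns:fo="http://www.w3.org/1999/XSL/Format"
    master-name="withSidebars"
    page-height="12cm"
    page-width="12cm"
    margin-top="0.5cm"
    margin-bottom="0.5cm"
    margin-left="0.5cm"
    margin-right="0.5cm">
    <!-- Each body margin is at least as large as the extent of the
         region on the same side, so the regions cannot overlap the body -->
    <fo:region-body
        margin-top="1.1cm"
        margin-bottom="1.1cm"
        margin-left="2.1cm"
        margin-right="2.1cm" />
    <fo:region-before extent="1cm"/>
    <fo:region-after extent="1cm"/>
    <fo:region-start extent="2cm"/>
    <fo:region-end extent="2cm"/>
</fo:simple-page-master>
```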
