Printing from XML: An Introduction to XSL-FO
Dave Pawson is the author of XSL-FO: Making XML Look Good in Print
One of the issues many users face when introduced to the production of print from XML is that of page layout. Without having the page layout right, it's unlikely that much progress will be made. By way of introducing the W3C XSL Formatting Objects recommendation, I want to present a simplified approach that will enable a new user to gain a foothold with page layout.
The aim of this article is to produce that first page of output -- call it the "Hello World" program -- with enough information to allow a user to move on to more useful things. I'll introduce the most straightforward of page layouts for XSL-FO, using as few of the elements needed as I can to obtain reasonable output.
One of the problems is that, unlike the production of an HTML document from an XML source using XSLT, the processing of the children of the root element is not a simple xsl:apply-templates from within a root element. Much more initial output is required in order to enable the formatter to generate the pages.
Let's look at the processing necessary to get from your XML document to a PDF printable document. First, the XML must be fed to an XSLT processor with an appropriate stylesheet (developed below) in order to produce another XML document which uses the XSL-FO namespace and is intended for an XSL-FO formatter. The second stage is to feed the output of the first stage to the XSL-FO formatter, which can then produce the end product: a printable document, styled for visual presentation.
  XML      ->  XSLT    ->  XSL-FO    ->  XSL-FO     ->  printable
  document     engine      document      formatter      document
                 ^
                 |
          XSLT stylesheet
This approach has the advantage that the XML source document is still format neutral and may be used with other XSLT stylesheets to produce other media.
The XSL-FO Document
We need to be aware of the initial target of the XSLT transformation, the XSL-FO document. The document you are producing, which is fed to the XSL-FO formatter, contains a small number of elements:
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="simple">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="simple">
    <fo:flow flow-name="xsl-region-body">
      content
    </fo:flow>
  </fo:page-sequence>
</fo:root>
Let's look at each of the identified elements in turn.
In order to lay out content on a page, the formatter needs to know what sizes it has to deal with. The layout-master-set contains the simple-page-master, which contains this information, e.g. whether you use a European A4 page size or an American US-letter size. The simple-page-master also contains the region-body element, which may be seen as the main body of the page layout.
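As a rough illustration of how page dimensions might be stated, the sketch below sets an A4 page on the simple-page-master. The page-height, page-width and margin properties are standard XSL-FO properties, but the specific margin values are assumptions chosen for the example, not figures from this article.

```xml
<fo:layout-master-set xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- A4 portrait is 21cm wide by 29.7cm high; for US letter, use
       page-width="8.5in" and page-height="11in" instead. -->
  <fo:simple-page-master master-name="simple"
                         page-width="21cm"
                         page-height="29.7cm"
                         margin-top="2cm"
                         margin-bottom="2cm"
                         margin-left="2.5cm"
                         margin-right="2.5cm">
    <!-- The body region occupies the area left inside the page margins.
         The margin values above are illustrative assumptions. -->
    <fo:region-body/>
  </fo:simple-page-master>
</fo:layout-master-set>
```

If the size properties are left out, as in the skeleton above, the formatter falls back on its own defaults, which may differ from one formatter to another.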
In order to support complex pagination, the page-sequence element is used. For a simple page layout, very little content is required here, other than to refer back to a particular page definition (the master-reference attribute, whose value matches the master-name of the simple-page-master). Also within the page-sequence element is a flow element. The idea of a flow may or may not be familiar to you. I came across it using desktop publishing packages, where I poured text into page areas to build up columns for a college magazine; hence the content flowed into page areas.
Identifying which region of the page to pour the text into is the rationale for the flow-name attribute, here set to xsl-region-body. This differentiates the body of the page from the outer areas (margins, header, footer, etc.) of the page. Finally, there is some content, which is a child of the main flow. Plain text cannot be inserted here, since the formatter would have to guess what you wanted to do with it, so the real content for the flow would take the form of <fo:block>content</fo:block>, which defines a block of text (rectangular in shape, as big as you like, taking a full list of defaults for everything) that will be placed as the first item on the page.
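Putting the pieces described so far together, a minimal "Hello World" XSL-FO document might look like the sketch below. It is the skeleton shown earlier with the content placeholder replaced by a single fo:block; the text "Hello World" is simply an example.

```xml
<?xml version="1.0" encoding="utf-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- Page definition; referred to by name from the page-sequence -->
    <fo:simple-page-master master-name="simple">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="simple">
    <!-- Pour the content into the body region of the page -->
    <fo:flow flow-name="xsl-region-body">
      <!-- Text sits inside a block, not directly inside the flow -->
      <fo:block>Hello World</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

Fed to an XSL-FO formatter, a file like this should yield a single page with the text at the top of the body region, using the formatter's default font and page size.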
In order to get a better grasp of all this, let's fill out, minimally, how it might fit into a stylesheet whose task is to take a simple XML document and produce another XML document, which is then fed to an XSL-FO formatter.
A basic XSLT stylesheet to produce XSL-FO is shown below.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:fo="http://www.w3.org/1999/XSL/Format"
                version="1.0">
  <xsl:output method="xml"/>
  <xsl:template match="/">
    ....
  </xsl:template>
  <!-- Other templates go here. -->
</xsl:stylesheet>
In the xmlns:xsl and xmlns:fo declarations we see the namespaces, respectively, of the XSLT and FO content in this document, which differentiate transformation instructions from output content.
If the XSLT engine sees content in the FO namespace, it simply writes it to the output, which is exactly what we want. The xsl:output element says that we want the output document to be XML, which is just what an XSL-FO document is: an XML document. The template matching "/" is the root template, which fires first; hence this is the point at which we add the essential outline content mentioned above.
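To make the role of the root template concrete, here is one way it might be filled in. This is a sketch under stated assumptions, not the author's stylesheet: it assumes a hypothetical source document whose text lives in para elements, and it adds xsl:strip-space so that whitespace between source elements is not copied into the flow.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:fo="http://www.w3.org/1999/XSL/Format"
                version="1.0">
  <xsl:output method="xml"/>
  <!-- Drop whitespace-only text nodes from the source document -->
  <xsl:strip-space elements="*"/>

  <!-- Root template: fires first and writes the XSL-FO outline -->
  <xsl:template match="/">
    <fo:root>
      <fo:layout-master-set>
        <fo:simple-page-master master-name="simple">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="simple">
        <fo:flow flow-name="xsl-region-body">
          <!-- Hand the source content to the other templates -->
          <xsl:apply-templates/>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>

  <!-- Hypothetical: assumes the source marks paragraphs as para -->
  <xsl:template match="para">
    <fo:block>
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>
</xsl:stylesheet>
```

The design point is that the literal fo: elements are written once, in the root template, and xsl:apply-templates is deferred until inside the flow, so every other template only needs to emit block-level content.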
Finally, in the place marked for other templates, we can start to add useful processing. We can now combine the two snippets above to do something useful. What we have below is a complete XSLT stylesheet, which is used by the XSLT engine to produce a valid XSL-FO document.