XML.com: XML From the Inside Out
oreilly.comSafari Bookshelf.Conferences.


Excel Reports with Apache Cocoon and POI

January 22, 2003


Generating professional reports within a web application can be a difficult problem to solve. However, by combining two Apache projects (Cocoon and POI) you can produce Excel reports from a pure Java server application. The key to this solution is to embrace Excel on the client and deploy a Java solution on the server.

Why Use Microsoft Excel?

Excel is the business world's spreadsheet of choice. While most readers of this article are software technologists, many of the projects and solutions you're developing are meant for use by non-technologists in a corporate environment, in which Microsoft Office is the dominant software suite. Your end user is probably already familiar and skilled with Office. Providing reports as Excel spreadsheets allows the data to be manipulated to meet the end user's needs. As long as a sufficiently similar report and associated data are available, most end users can manipulate the report to obtain the desired results.

Although Excel is a Microsoft Windows application, its binary file format is well known and may be manipulated by many low cost solutions, including the pure Java Apache POI project. Rendering reports in Excel does not require that the web application runs Windows, only that the client runs Windows.

The Solution in a Nutshell

Basic Solution Design
Figure 1: Basic Solution Design

A report is defined in two parts: an Excel "Report Template" document and a solution specific "Report Scheme" XML document.

Excel "Report Template"

In addition to using Excel to render the report, this solution uses Excel as a client GUI with which to format the report. An Excel document supplies all the static text, desired data cell formatting, colorization, delineation lines, at-a-future-date formula manipulation, and all elements that are independent of the actual data. This strategy eliminates the need to create some other GUI. Since Excel is used for both the GUI and the rendering mechanism, the solution is feature rich and defect free.

TOMCAT Installation from cocoon-reports.war (Tested with Tomcat 4.1.12)

  1. Download war file from http://candlelightsoftware.com/downloads/cocoon-reports.war
  2. Place cocoon-reports in ~jakarta-tomcat-4.1.12/webapps/cocoon-reports.war
  3. Start up Tomcat
  4. Place browser URL to http://localhost:8080/cocoon-reports You should now see selection for the two demo report. This will also un-archive the war file so the additional configuration and setup steps below can be performed.
  5. Create MySql database table: mysql < ~/cocoon-reports/demo.sql on command line.
  6. Adjust Cocoon Sitemap file:
    1. Adjust URL for database if needed: see line 230 of sitemap file
    2. Adjust line 237 of sitemap file (i.e. parameter absolute-path) to reflect your installation.

XML "Report Scheme" File

The Excel file does not meet all needs for a report generator. In particular, three other components are needed to produce useful practical reports:

  1. The primary database query statement to retrieve the data. In this article's examples, standard SQL query statements are executed against the MySQL database. However, with slight modification of the solution, XQuery statements could be used against an XML database or SOAP web services over the net.
  2. Placement instructions of the resulting query data into the Excel spreadsheet.
  3. Many practical reports need input parameters. For example, it makes little sense to write unique report for each and every month. Instead, a more practical solution is for the report to accept a parameter to specify which month it should process.

A few other tidbits are tossed into the report schema file like title, description, and label.

Generation Solution

This solution is based on the Apache Cocoon Framework. The main mechanism to create a report is a Cocoon pipeline, which processes the two report definition documents (template and scheme), accepts optional input parameters, and then serializes the report into a binary stream compatible with Excel. For those unfamiliar with Apache Cocoon, XML.com published an introductory tutorial last year.

Rendering Machinery

This solution is driven primarily by HTTP request-response pair communications typically delivering HTML content. However, if response mime type is set to application/vnd.ms-excel, and the content is recognizable by Excel, the MS Internet Explorer browser will show the content using Excel. To the end user, it is a very smooth transition.

A Simple Example

Let's look at a very simple example report: calculate the daily sales total over a specified time period for a specific staff member. Both this example and the next assume that some financial transaction data is available in a relational database. Source code for both examples is available as a zipped download.

The Data

The two examples presented in this article both assume a relational database containing a single table of financial transactions is available as shown in Figure 2 below. 1000 entries in this table are supplied in the downloadable software. Note for simplicity only a single table is used, and it is not fully normalized.

Figure 2: Example Data

Id Staff Type Date Amount
1 mel Milk 2003-01-19 9.99
2 george Hardware 2003-01-19 0.10
3 kathy Produce 2003-01-13 5.72
4 mel Cannedfood 2003-01-23 1.21
5 cynthia Dairy 2003-02-13 9.68
6 george Bread 2003-01-08 3.71
7 george Bread 2003-01-01 1.19
8 jasson Bread 2003-02-20 1.06
9 cynthia Cannedfood 2003-02-24 8.62
10 jasson Hardware 2003-02-21 4.34

The Report Template

A Excel Report Template document, as show in figure 3, created by the report author, forms the base of the solution. All static text, cell formatting, colorization, etc. is supplied here by simply using Excel as a GUI. In this example, we will create a table of dates and associated dollar amounts.

Simple Example Report Template in Excel
Figure 3: Simple Summary Example Report Template

Input Parameters

For this report, the user must supply three parameters: Start Date, End Date, and Staff Member Name. The associated section in the Report Scheme file defining these input parameters looks like this:

    <title>Staff Person Name</title>
    <option value="george">George Fumpter</option>
    <option value="mel">Mellannie Miller</option>
    <option value="jasson">Jasson Joskophie</option>
    <option value="cynthia">Cynthia Clark</option>
    <option value="kathy">Kathy Kompton</option>
    <title>Start Date</title>
    <title>End Date</title>

The resulting HTML page rendered by the Cocoon input parameter pipeline will appear to the user as show in Figure 4 below.

Simple Example HTML Input Parameter Page
Figure 4: Simple Example Input Parameter Page

Primary Query

The second section of the Report Scheme document supplies the actual SQL text. In this case, as with many reports, input parameters must be supplied. This solution provides several tags to achieve this. The two tags shown here are <rtp:get-select> and <rpt:get-date>. These tags will be replaced with the selected values before the document is sent to the SQLTransformer for processing.

    SELECT SUM(amount) as sum, date
      FROM transactions
        staff = '<rpt:get-select>staff-person</rpt:get-select>'
        AND transactions.date BETWEEN
      GROUP BY date

Pages: 1, 2

Next Pagearrow