Excel Reports with Apache Cocoon and POI
Generating professional reports within a web application can be a difficult problem to solve. However, by combining two Apache projects (Cocoon and POI) you can produce Excel reports from a pure Java server application. The key to this solution is to embrace Excel on the client and deploy a Java solution on the server.
Why Use Microsoft Excel?
Excel is the business world's spreadsheet of choice. While most readers of this article are software technologists, many of the projects and solutions you're developing are meant for use by non-technologists in a corporate environment, in which Microsoft Office is the dominant software suite. Your end user is probably already familiar and skilled with Office. Providing reports as Excel spreadsheets allows the data to be manipulated to meet the end user's needs. As long as a sufficiently similar report and associated data are available, most end users can manipulate the report to obtain the desired results.
Although Excel is a Microsoft Windows application, its binary file format is well known and may be manipulated by many low cost solutions, including the pure Java Apache POI project. Rendering reports in Excel does not require that the web application runs Windows, only that the client runs Windows.
The Solution in a Nutshell
|Figure 1: Basic Solution Design|
A report is defined in two parts: an Excel "Report Template" document and a solution specific "Report Scheme" XML document.
Excel "Report Template"
In addition to using Excel to render the report, this solution uses Excel as a client GUI with which to format the report. An Excel document supplies all the static text, desired data cell formatting, colorization, delineation lines, at-a-future-date formula manipulation, and all elements that are independent of the actual data. This strategy eliminates the need to create some other GUI. Since Excel is used for both the GUI and the rendering mechanism, the solution is feature rich and defect free.
TOMCAT Installation from cocoon-reports.war (Tested with Tomcat 4.1.12)
XML "Report Scheme" File
The Excel file does not meet all needs for a report generator. In particular, three other components are needed to produce useful practical reports:
- The primary database query statement to retrieve the data. In this article's examples, standard SQL query statements are executed against the MySQL database. However, with slight modification of the solution, XQuery statements could be used against an XML database or SOAP web services over the net.
- Placement instructions of the resulting query data into the Excel spreadsheet.
- Many practical reports need input parameters. For example, it makes little sense to write unique report for each and every month. Instead, a more practical solution is for the report to accept a parameter to specify which month it should process.
A few other tidbits are tossed into the report schema file like title, description, and label.
This solution is based on the Apache Cocoon Framework. The main mechanism to create a report is a Cocoon pipeline, which processes the two report definition documents (template and scheme), accepts optional input parameters, and then serializes the report into a binary stream compatible with Excel. For those unfamiliar with Apache Cocoon, XML.com published an introductory tutorial last year.
This solution is driven primarily by HTTP request-response pair
communications typically delivering HTML content. However, if response
mime type is set to
application/vnd.ms-excel, and the content
is recognizable by Excel, the MS Internet Explorer browser will show the
content using Excel. To the end user, it is a very smooth transition.
A Simple Example
Let's look at a very simple example report: calculate the daily sales total over a specified time period for a specific staff member. Both this example and the next assume that some financial transaction data is available in a relational database. Source code for both examples is available as a zipped download.
The two examples presented in this article both assume a relational database containing a single table of financial transactions is available as shown in Figure 2 below. 1000 entries in this table are supplied in the downloadable software. Note for simplicity only a single table is used, and it is not fully normalized.
Figure 2: Example Data
The Report Template
A Excel Report Template document, as show in figure 3, created by the report author, forms the base of the solution. All static text, cell formatting, colorization, etc. is supplied here by simply using Excel as a GUI. In this example, we will create a table of dates and associated dollar amounts.
|Figure 3: Simple Summary Example Report Template|
For this report, the user must supply three parameters: Start Date, End Date, and Staff Member Name. The associated section in the Report Scheme file defining these input parameters looks like this:
<parameters> <input-select> <title>Staff Person Name</title> <label>staff-person</label> <init>george</init> <option value="george">George Fumpter</option> <option value="mel">Mellannie Miller</option> <option value="jasson">Jasson Joskophie</option> <option value="cynthia">Cynthia Clark</option> <option value="kathy">Kathy Kompton</option> </input-select> <input-date> <title>Start Date</title> <label>start-date</label> <init>2003-01-01</init> </input-date> <input-date> <title>End Date</title> <label>end-date</label> <init>2003-01-10</init> </input-date> <parameters>
The resulting HTML page rendered by the Cocoon input parameter pipeline will appear to the user as show in Figure 4 below.
|Figure 4: Simple Example Input Parameter Page|
The second section of the Report Scheme document supplies the actual SQL text. In this case, as with many reports, input parameters must be supplied. This solution provides several tags to achieve this. The two tags shown here are <rtp:get-select> and <rpt:get-date>. These tags will be replaced with the selected values before the document is sent to the SQLTransformer for processing.
<sql:execute-query> <sql:query> SELECT SUM(amount) as sum, date FROM transactions WHERE staff = '<rpt:get-select>staff-person</rpt:get-select>' AND transactions.date BETWEEN '<rpt:get-date>start-date</rpt:get-date>' AND '<rpt:get-date>end-date</rpt:get-date>' GROUP BY date </sql:query> </sql:execute-query>
Pages: 1, 2