XML.com: XML From the Inside Out


Getting Started With Cocoon 2


July 10, 2002


Cocoon 2, part of the Apache XML Project, is a highly flexible web publishing framework built from reusable components. Although reusability is an oft-touted quality of software frameworks, Cocoon stands out because of the simplicity of the interface between the components. Cocoon 2 uses XML documents, via SAX, as its intercomponent API. As long as a component accepts and emits XML, it works.

The purpose of this article is to provide an overview of Cocoon 2's functionality and to get you started writing small applications using it.

What is Cocoon 2?

Cocoon 2 is an XML publishing framework. What does that mean? It is neither a database that stores XML content, nor a J2EE application server that provides web server facilities to serve the content. Instead, Cocoon 2 fits architecturally between these two layers. It is a framework for processing content. Processing is carried out by an assembly line, or pipeline, of components, and the designer defines these pipelines.

Simple Examples

Let's start with the easy case. A document written in XML is stored in a file (file.xml), processed by an XSL stylesheet (stylesheet.xsl), and then served up as HTML. A Cocoon pipeline suitable for this task is shown in figure 1.

Figure 1: Three-stage pipeline

All pipelines begin with a generator. In figure 1, the generator reads a file from the file system and turns it into an XML SAX stream. The middle component, in this case an XSL transformer, applies the HTML presentation tags; it accepts an XML stream and emits one, too. Finally the end component, a serializer, terminates the stream and writes the content out as the HTTP response. The user decides which pages this three-stage pipeline applies to: a handful, or every page in the site. The example may seem trivial, since two of the three components exist only to start and end the pipeline, but it illustrates the simplest situation.
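The same generator, transformer, serializer arrangement can be imitated outside Cocoon with the JAXP classes that ship with the JDK. The sketch below is not Cocoon code and does not use Cocoon's component interfaces; it only shows three stages connected by SAX events. The class name `ThreeStagePipeline`, the `SAMPLE_XSL` stylesheet, and the sample `<page>` document are assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class ThreeStagePipeline {

    // Hypothetical stylesheet: wraps the page title in minimal HTML.
    public static final String SAMPLE_XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='/page'>"
        + "<html><body><h1><xsl:value-of select='title'/></h1></body></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Pushes the source XML through one XSL stage and returns the serialized HTML. */
    public static String run(String xml, String xsl) throws Exception {
        // Assumes the default factory supports SAX handlers, as the JDK's built-in one does.
        SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();

        // Stage 3, the "serializer": an identity handler that writes SAX events as HTML text.
        TransformerHandler serializer = factory.newTransformerHandler();
        serializer.getTransformer().setOutputProperty(OutputKeys.METHOD, "html");
        StringWriter out = new StringWriter();
        serializer.setResult(new StreamResult(out));

        // Stage 2, the "transformer": applies the stylesheet and forwards SAX events onward.
        TransformerHandler transformer =
                factory.newTransformerHandler(new StreamSource(new StringReader(xsl)));
        transformer.setResult(new SAXResult(serializer));

        // Stage 1, the "generator": parses the source document into SAX events.
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        XMLReader generator = parserFactory.newSAXParser().getXMLReader();
        generator.setContentHandler(transformer);
        generator.parse(new InputSource(new StringReader(xml)));

        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<page><title>Hello</title></page>", SAMPLE_XSL));
    }
}
```

Each stage only sees the SAX events handed to it by the previous one, which is why a stage can be swapped or an extra one inserted without touching its neighbours.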

Figure 2: Four-stage pipeline

Figure 2 depicts a more typical situation. Pages contain both static content and dynamic content obtained from a database. The new component introduced here is the SQL Transformer. SQL statements embedded in the original XML document are processed and replaced with an XML result set tree fragment. For example, if the source content document (i.e. file.xml) contains:

      SELECT CONCAT(lastName, ', ', firstName) as name, age
          FROM guest WHERE status = 'ARRIVING';

then a possible document coming out of the SQL Transformer would contain:

      <name>Bush, George</name>
      <name>Jackson, Michael</name>
      <name>Einstein, Albert</name>

The key architectural advantage here is that the source file becomes a very condensed business logic document. We neither need to know about the JDBC API nor deal with it directly. Instead, the content of the starting document is the business problem at hand.
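The idea of replacing an embedded query with its result set can be sketched as a plain SAX filter. This is not Cocoon's SQL Transformer and does not use its element vocabulary: the `<query>` element name, the `QueryReplacingFilter` class, and the `runQuery` function (a stand-in for a real JDBC call) are all hypothetical. The sketch also assumes the query element contains only text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

public class QueryReplacingFilter extends XMLFilterImpl {
    private final Function<String, List<String>> runQuery; // stand-in for a JDBC call
    private StringBuilder queryText;                       // non-null while inside <query>

    public QueryReplacingFilter(Function<String, List<String>> runQuery) {
        this.runQuery = runQuery;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        if ("query".equals(qName)) {
            queryText = new StringBuilder();   // start capturing; emit nothing downstream
        } else {
            super.startElement(uri, local, qName, atts);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (queryText != null) {
            queryText.append(ch, start, length);   // collect the SQL text
        } else {
            super.characters(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if ("query".equals(qName)) {
            String sql = queryText.toString().trim();
            queryText = null;
            // Emit one <name> element per result in place of the query element.
            for (String name : runQuery.apply(sql)) {
                super.startElement("", "name", "name", new AttributesImpl());
                super.characters(name.toCharArray(), 0, name.length());
                super.endElement("", "name", "name");
            }
        } else {
            super.endElement(uri, local, qName);
        }
    }

    /** Parses the XML through the filter and serializes whatever comes out the other end. */
    public static String process(String xml, Function<String, List<String>> runQuery)
            throws Exception {
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        QueryReplacingFilter filter = new QueryReplacingFilter(runQuery);
        filter.setParent(parserFactory.newSAXParser().getXMLReader());

        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(
                new SAXSource(filter, new InputSource(new StringReader(xml))),
                new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<guests><query>SELECT name FROM guest</query></guests>";
        // Canned results instead of a database, so the sketch runs anywhere.
        System.out.println(process(xml, sql -> Arrays.asList("Bush, George", "Einstein, Albert")));
    }
}
```

Because the filter both accepts and emits SAX events, the stages before and after it need not know that a database was involved at all.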

Finally, suppose a local database contains a list of stock symbols for which we wish to obtain the current market prices. This page could be part of a portal like Yahoo. The business problem could be solved with the multi-component pipeline shown in figure 3 below:

Figure 3: Six-stage pipeline

The XML input fragment to the SOAP transformer may look something like:

<soap:query url="http://www.mystock.org:8080">

Note that in this example an intermediate XSL transformer reshapes the SQL transformer's output into the exact format required by the SOAP transformer. XSL can be used for a wide variety of tasks far beyond HTML presentation.
