XML.com: XML From the Inside Out


Introducing Cocoon 2.0

February 13, 2002

A Short History of Apache Cocoon

It took two years, but we finally released Apache Cocoon, the second generation. Cocoon started simply enough. In 1998 Jon Stevens -- of Apache JServ, Turbine, Velocity, Anakia, and Tigris Scarab fame -- and I created scripts that managed the automatic update of the java.apache.org site. The scripts were dead simple: iterate over all the CVS modules that java.apache.org had under /docs and copy them to the right place.

The problem was that people were continuously messing up the docs. Few people want to write documentation for open source projects; when they do, you thank them and don't complain about coherence of style and stuff like that. Otherwise you won't have any docs at all.

The solution was obvious: we needed a way to separate style from content. In late 1998 the first XSL working draft was released and IBM made a Java XSL processor, LotusXSL, available. I downloaded both and started to play around with what was later called XSLT. While playing with this stuff, I quickly grew tired of typing a command line, moving to the browser to see the result, over and over. I wanted a less tedious change-transform-reload cycle.

So I wrote a servlet that handled the tedious bits for me; I could modify the stylesheet, hit reload on the browser, and the servlet would handle everything. This was at the very end of 1998 and Ron Howard's movie Cocoon was playing on the television, which explains the weird name only partially. I believed at the time that these technologies were a key part of the future of the Web, so a cocoon was just what was needed to allow them to incubate and grow stronger.

Apache Cocoon 1.0 was a servlet, about 100 lines of code, that used XML4J (later Apache Xerces) and LotusXSL (later Apache Xalan) to transform an XML file with an XSL stylesheet. At that time, XSLT, XPath and XSL:FO were still part of one big spec. I didn't think it was very useful for anyone else, so I kept it on my disk for a few months. Then, around March 1999, somebody on the jserv-dev mailing list asked about XSL, and I said that I'd written a servlet that did all that transformation on the server side. Many people asked for it, so I requested a formal vote and the Apache Cocoon project was started under the java.apache.org umbrella.

The 1.0 version contained very little code, but lots of examples and some simple docs that explained what XSL was and why I thought it was important to learn it. After its release, people started joining active development, and we turned a small servlet into a full XML-based publishing system, which is now used in many production sites around the world.

But Cocoon 1.x was designed when the XML world was very young and we had little experience, and it rested on several design choices that turned out to be very limiting. So, around November 1999, I announced my intention to work on the next generation (what people started calling Cocoon2 or simply C2) to solve those architectural issues.

Cocoon 2.0

It took two years and three different project leaders to finish Cocoon 2.0 but we made it. It's an XML framework that raises the usage of XML and XSLT technologies for server applications to a new level. Designed for performance and scalability around pipelined SAX processing, Cocoon offers a flexible environment based on the separation of concerns between content, logic and style. A centralized configuration system and sophisticated caching enable you to create, deploy, and maintain rock-solid XML server applications.

Cocoon was designed as an abstract engine that could be connected to almost anything, but it ships with servlet and command line connectors. The servlet connector allows you to call Cocoon from your favorite servlet engine or application server. You can install it beside your existing servlets or JSPs. The command line interface allows you to generate static content as a batch process. It can be useful to pre-generate those parts of your site that are static, some of which may be easier to create with Cocoon's features than by hand (say, SVG rasterization or applying stylesheets). For example, the Cocoon documentation and web site are all generated by Cocoon from the command line.

Component Pipelines

Cocoon is now based on the concept of component pipelines. A pipeline works like a UNIX pipe, but instead of passing bytes between STDIN and STDOUT, Cocoon passes SAX events between components.

The three types of pipeline components are generators, which take a request and produce SAX events; transformers, which consume SAX events and produce SAX events; and serializers, which consume SAX events and produce a response. A Cocoon pipeline is composed of one generator, zero or more transformers, and one serializer. As with UNIX pipes, a small number of components give you an incredible number of possible combinations. Think of active Lego bricks for XML manipulation.
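To make the generator, transformer, serializer chain concrete, here is a minimal sketch that uses only the SAX and JAXP classes bundled with the JDK. It is not Cocoon's own API: the class name SaxPipelineSketch and the upper-casing filter are hypothetical stand-ins, chosen only to show SAX events flowing through three stages.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical illustration, not part of Cocoon.
class SaxPipelineSketch {

    /** Transformer stage: consumes SAX events and re-emits them with text upper-cased. */
    static class UpperCaseFilter extends XMLFilterImpl {
        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            char[] upper = new String(ch, start, length).toUpperCase(Locale.ROOT).toCharArray();
            super.characters(upper, 0, upper.length);
        }
    }

    static String run(String xml) throws Exception {
        // Generator stage: a SAX parser turns the input into a stream of events.
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        XMLReader generator = parserFactory.newSAXParser().getXMLReader();

        // Transformer stage: sits between the generator and the serializer.
        UpperCaseFilter transformer = new UpperCaseFilter();
        transformer.setParent(generator);

        // Serializer stage: an identity TransformerHandler writes the events out as XML text.
        SAXTransformerFactory factory = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler serializer = factory.newTransformerHandler();
        serializer.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.setResult(new StreamResult(out));

        // Wire the stages together and push the document through.
        transformer.setContentHandler(serializer);
        transformer.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<doc><p>hello pipeline</p></doc>"));
    }
}
```

Adding another transformer would mean inserting one more filter between the parser and the serializer; the other stages would not need to change, which is the point of the pipeline design.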

Cocoon ships with a number of these components which were donated over the years by users and developers. If a component is general enough, we'll ship it with Cocoon. Some of Cocoon's generators include the following. FileGenerator acts as a parser, reading a file (or any other URL) and producing SAX events from it. DirectoryGenerator reads a directory listing, formats it as XML and produces SAX events. ServerPagesGenerator generates dynamic XML from XSP server pages. JSPGenerator is similar but parses the result of a JSP page. VelocityGenerator is also similar but uses Velocity as a template language.
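A generator does not have to start from an XML file. The sketch below, loosely in the spirit of DirectoryGenerator, emits SAX events directly from a list of file names. The class name and the "directory" and "file" element names are assumptions made for illustration and are not Cocoon's actual output format. A real generator would read the names from the file system rather than take them as an argument.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical illustration, not part of Cocoon.
class ListingGeneratorSketch {

    /** Emits one "file" element per name, wrapped in a "directory" element. */
    static void generate(List<String> names, ContentHandler out) throws SAXException {
        out.startDocument();
        out.startElement("", "directory", "directory", new AttributesImpl());
        for (String name : names) {
            AttributesImpl attrs = new AttributesImpl();
            attrs.addAttribute("", "name", "name", "CDATA", name);
            out.startElement("", "file", "file", attrs);
            out.endElement("", "file", "file");
        }
        out.endElement("", "directory", "directory");
        out.endDocument();
    }

    /** Feeds the generated events into an identity serializer to make them visible as text. */
    static String toXml(List<String> names) throws Exception {
        SAXTransformerFactory factory = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler serializer = factory.newTransformerHandler();
        serializer.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.setResult(new StreamResult(out));
        generate(names, serializer);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml(Arrays.asList("index.xml", "style.xsl")));
    }
}
```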

Some of Cocoon's transformers include XSLTTransformer, which transforms a SAX stream according to a given XSLT stylesheet. XIncludeTransformer augments the SAX stream by processing the xinclude namespace and including external sources into the stream. I18NTransformer transforms content based on an i18n dictionary and a language parameter.
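The idea of an XSLT step that both consumes and produces SAX events can be sketched with the JDK's JAXP classes. This is not how XSLTTransformer is implemented in Cocoon. The class name, the inline stylesheet, and the sample element names are all hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical illustration, not part of Cocoon.
class XsltStageSketch {

    // A tiny stylesheet, assumed for this example only.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='/doc'><html><body><xsl:apply-templates/></body></html></xsl:template>"
        + "<xsl:template match='title'><h1><xsl:value-of select='.'/></h1></xsl:template>"
        + "</xsl:stylesheet>";

    static String run(String xml) throws Exception {
        SAXTransformerFactory factory = (SAXTransformerFactory) TransformerFactory.newInstance();

        // Serializer stage: identity handler that writes incoming events as XML text.
        TransformerHandler serializer = factory.newTransformerHandler();
        serializer.getTransformer().setOutputProperty(OutputKeys.METHOD, "xml");
        serializer.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.setResult(new StreamResult(out));

        // XSLT stage: receives SAX events and sends its result on as SAX events.
        TransformerHandler xslt =
            factory.newTransformerHandler(new StreamSource(new StringReader(STYLESHEET)));
        xslt.setResult(new SAXResult(serializer));

        // Generator stage: a namespace-aware SAX parser feeding the XSLT stage.
        SAXParserFactory parserFactory = SAXParserFactory.newInstance();
        parserFactory.setNamespaceAware(true);
        XMLReader generator = parserFactory.newSAXParser().getXMLReader();
        generator.setContentHandler(xslt);
        generator.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<doc><title>Cocoon</title></doc>"));
    }
}
```

Because the stylesheet result is delivered as a SAXResult rather than written to a stream, further transformers could be chained after it without re-parsing the intermediate document.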

Some of Cocoon's serializers include XMLSerializer, which streams the SAX events into XML. HTMLSerializer streams SAX events into browser-compatible HTML. TextSerializer streams only the textual SAX events, which is useful for non-XML languages like code or CSS or VRML. PDFSerializer produces a PDF stream out of XSL:FO SAX events, using Apache FOP. SVG2JPGSerializer produces a JPG stream out of SVG SAX events, using Apache Batik.
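A serializer is the end of the pipeline: it turns SAX events into the bytes or characters of a response. As a rough sketch of what a text-only serializer does, the hypothetical handler below keeps the character events and ignores all markup. It uses only the JDK's SAX classes and is not Cocoon's TextSerializer.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration, not part of Cocoon.
class TextSerializerSketch extends DefaultHandler {

    private final StringBuilder text = new StringBuilder();

    /** Only character events contribute to the output; element events are ignored. */
    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    static String serialize(String xml) throws Exception {
        TextSerializerSketch handler = new TextSerializerSketch();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize("<style><rule>body { margin: 0; }</rule></style>"));
    }
}
```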
