Apache XML Project Launches

November 10, 1999

Edd Dumbill

This Tuesday saw the launch of the Apache Software Foundation's latest undertaking, the Apache XML Project. The new effort has as its core aim the creation of open source, commercial-quality tools and solutions for XML.

Best known for its Apache web server, the Apache Software Foundation (ASF) has turned its attention to XML in response to a growing demand for open source XML and XSL tools. The project is being seeded by the contribution of XML tools from commercial vendors, including Sun and IBM, and from existing open source projects. Not since Netscape's decision to release the source code to Mozilla has the open source world seen such a significant commitment by commercial tools vendors.

Project Aims: A Close Link to Standards

In addition to its main aim of providing XML tools, the Apache XML Project states that it will seek to provide feedback to standards bodies from an implementation perspective. The sheer intellectual weight of the group should give it a significant voice in the W3C and IETF. This, in turn, should deliver advantages to the standards bodies. A major factor in the recent return of XHTML 1.0 to the HTML working group was the community feedback from XML developers. Over the last few months the desire to be able to express such feedback to the W3C has been growing in the XML community.

Being standards-based is an issue of the utmost importance to the Apache XML Project. In recent issues we have discussed the importance of both open source software and open standards in continuing innovation and interoperability on the Web. The XML Project's web site includes a link to the W3C's site right alongside the link to the main Apache site, a clear signal of its commitment to the W3C.

Brian Behlendorf, President of the Apache Software Foundation, says the new XML tools are intended to be definitive with respect to XML standards: "Building a solid reference suite of applications and libraries for managing XML will help ensure consistency of implementation between free and commercial software, and reassure developers that XML is a reliable choice for building applications upon."

Both IBM and Sun, the major vendors contributing to the project, are keen to flag it as a commitment on their part to open standards. Additionally, the presence of Tim Bray, one of the editors of the XML 1.0 specification, on the team of coordinating project managers further reinforces the link with XML standards.

The other stated aim of the project is to be a focus for all Apache XML-related activity. The inclusion of Cocoon, Apache's XML/XSL server framework, in the new project underlines this aim.

Major Products and Contributors

The Apache XML Project currently expects to release its XML parser and XSLT processor by the end of 1999. That schedule owes a great deal to the contributions of vendors and freeware authors.

The project's XML parser, Xerces, is based on the XML4J and XML4C parsers from IBM. The fully validating parser, to be released as both Java and C++ components, implements DOM levels 1 & 2 in addition to SAX2. An added benefit of the use of IBM's parser as the base technology is early support for the XML Schema working draft. Beta downloads of this product are already available, and a usable release should be available by the end of this year.
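For readers unfamiliar with the two interfaces named above, the following is a minimal sketch of the difference between DOM (tree-based) and SAX (event-based) parsing. It is not taken from the Xerces code base: it assumes the JDK's standard `javax.xml.parsers` API rather than Xerces's own classes, and the class name `ParserSketch`, its methods, and the sample document are hypothetical names invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of Xerces or the Apache XML Project.
final class ParserSketch {
    // Sample document invented for this example.
    static final String SAMPLE =
        "<catalog><book id=\"1\"/><book id=\"2\"/></catalog>";

    // DOM: build the whole document as an in-memory tree, then query it.
    static int countWithDom(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tag).getLength();
        } catch (Exception e) {
            throw new IllegalStateException("DOM parse failed", e);
        }
    }

    // SAX: receive a callback for each element as the parser streams through.
    static int countWithSax(String xml, String tag) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if (qName.equals(tag)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println("DOM count: " + countWithDom(SAMPLE, "book"));
        System.out.println("SAX count: " + countWithSax(SAMPLE, "book"));
    }
}
```

Both methods count the same elements. The DOM version keeps the full tree available for later navigation, while the SAX version holds nothing beyond the running count, which is why event-based parsing tends to suit large documents.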

Xalan, an XSLT processor, also to be available soon, is based around the LotusXSL processor from IBM and Lotus. This processor has been in use for a while as part of the Apache Cocoon project. The first release will be in Java. A C++ component is planned, but no schedule is available as yet.
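As a rough idea of what an XSLT processor does, the sketch below applies a small stylesheet to a small document. It assumes the JDK's standard `javax.xml.transform` API, not Xalan's or LotusXSL's own entry points, and the class name `XsltSketch`, the stylesheet, and the input document are all hypothetical, invented for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration class; not part of Xalan or LotusXSL.
final class XsltSketch {
    // Input document invented for this example.
    static final String XML = "<greeting>Hello, XML</greeting>";

    // Stylesheet invented for this example: wraps the greeting text in brackets
    // and emits plain text rather than XML.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/greeting\">"
      + "<xsl:value-of select=\"concat('[', ., ']')\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compile the stylesheet, run it over the input, and return the result text.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)),
                new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XSLT transform failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```

The stylesheet is itself an XML document, which is why it can be loaded through the same kind of source object as the input it transforms.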

Cocoon is a Java-based XML publishing framework written by Stefano Mazzocchi and the Java-Apache community. Started in January 1999, Cocoon is essentially a proof-of-concept work that will be refined and re-engineered in the Cocoon 2 project.

The fourth project to be integrated under the Apache-XML umbrella is FOP, James Tauber's XSL formatting objects processor. FOP converts XSL formatting objects into PDF documents, accepting its input as an XML document, a DOM tree, or a stream of SAX events.

The above four products constitute the immediate sub-projects being pursued by the Apache XML Project. Contributions from other vendors and projects will feed into these and other future sub-projects.

Sun has donated its Java "Project X" XML parser, and its experimental XHTML parser. Features and functionality from Sun's parser will be merged into the Xerces project. The donation of the XHTML parsing technology is an interesting move—although not currently represented explicitly in the four projects, the Apache XML Project also has client-side development in its sights, in which the XHTML technology could play a part. The Project's ambitions towards client-side development are certainly welcome: open-source browsing technology is lagging significantly compared to server-side and commercial offerings.

DataChannel has contributed its XPages technology for aggregating disparate data sources into XML. Additional contributions come from the open-source projects XSL:P (an XSLT processor) and OpenXML (XML and HTML parsers)—both affiliated with ExOffice, a new venture centered around open source and XML.

The organizational structure of the Apache XML Project is similar to that used by Apache's other efforts. Overall management is provided by a Project Management Committee, which includes representatives from the Apache group, the founding vendors, and freeware authors.

As with the Apache web server, development is open to all. More information for those wishing to get involved with the project is available from the site.

A Sensible Step for IBM and Sun

The technology donations from IBM and Sun are the latest in a continuing trend, started by the free distribution of Java, for companies to give away their development platforms. Both Sun and IBM have looked to garner developer attention for their technologies by releasing them for free and providing online developer support.

Integrating their XML efforts into a single open-source project seems to make a lot of sense for IBM and Sun. There is no clear reason for them to maintain disparate, competing XML parsers. Sun especially seems to have lost its edge in XML development since the departure of David Brownell, the chief developer of their XML parser, earlier this year.

Additionally, Sun and IBM already have a history of collaboration with the Apache Software Foundation, in the Jakarta project. Jakarta is taking the Apache JServ work to its next level, a commercial-quality Java server platform. In many ways the XML work mirrors ideas already in action on this project.

Two major vendors who have not yet thrown their hats into the ring are Microsoft and Oracle. Arguably, both of these companies have more to gain from pursuing their own XML offerings that integrate tightly with their respective platforms.

Oracle is not ruling out entry into the Apache XML Project at a later date. Steve Muench, Oracle's XML Evangelist, said, "We definitely may participate in the future. Anything that raises awareness of the power of combining relational databases, XML, and XSLT for data and content delivery and exchange is an exciting development for the industry."

Muench also pointed out that, since Oracle is also committed to Java and W3C standards, the Apache XML components could be used against Oracle databases.

With the current pressure for standards-compliance, it would certainly behoove any database or application server vendor to ensure this level of plug-and-play compatibility.

A Great Boost for XML and Open Source

The announcement of the Apache XML Project, and the contribution of code and resources from IBM and Sun, is a milestone for both XML and open source development.

For XML, it means that a large group of developers is pulling together to ensure that reliable, standards-based XML tools will become as commonplace as, say, the GCC compiler is now. This will lead to wider adoption of XML technologies, and the consequent increase in openness and interoperability between applications.

For the open source world, this project is a huge vote of confidence from commercial vendors that the open source paradigm can provide the environment in which to build commercial-quality, standards-conformant, and stable tools.