OSCON 2002 Perl and XML Review

August 21, 2002

Kip Hampton


O'Reilly & Associates' Open Source Convention (or OScon) is an amazing event. It's more than just the first-class sessions or the chance to socialize with like-minded folks from all over the world -- the place fairly crackles with creative geek energy. Code is hacked, plans are made, and every conversation seems loaded with hints about where various technologies are and where they are going. It's a great place to take the pulse of the Open Source movement.

This month, in place of the module introduction or other technical tutorial that usually appears here, I want to offer a few observations about the state of the Perl-XML world in general, based on my experiences at the recent OScon 2002.

Cultural Barriers Diminishing

It's not at all surprising that XML has been a bit of a tough sell to the Perl community. For some other languages, XML neatly fills a large void in their text processing facilities, a void that Perl did not really have. And the white-hot hype, which is often advanced by those selling applications that use XML, has sent many Perl coders scrambling for the hills; or, depending on their personal bent, for the torches and pitchforks.

From the other side, much of the XML World has ignored Perl based on the misconception that it is little more than a muscular shell scripting language.

I am pleased to say that I saw evidence of growth in both the Perl and XML communities. Of the Perl developers I spoke with, most seem to have learned that using XML in their applications is neither magic nor poison -- rather, they are beginning to see it as just another technology and set of tools that can be really good for some cases and really bad for others.

At the same time, the growing success of Open Source tools like Bricolage and AxKit is proving to some in the larger XML World that Perl and XML can be a powerful combination. It was not uncommon to hear these tools mentioned in the same breath as their Java counterparts.

XSLT No Longer Considered Harmful

Since XSLT is being used in instances where Perl has historically been king -- transforming and generating markup, templating, etc. -- many Perlgeeks have seen XSLT as a competing, even threatening technology. In his Perl and XSLT Success Stories session, Dan Brian put that all to rest by illustrating how Perl and XSLT can be combined to solve complex, real-world problems by taking advantage of the best features of both languages.

People Want More (Perlish) XML APIs

As it stands now, the Perl-XML world offers stable, complete implementations of the three most common XML APIs: the W3C's Document Object Model (DOM), the W3C's XPath Language, and the Simple API for XML (SAX). In addition to these, we also have implementations of the PYX interface and a number of Perl-specific APIs (XML::Twig, XML::Simple, among others). Despite this smorgasbord of choices, many Perl developers are still largely dissatisfied with their options for accessing and creating the contents of XML documents.

This dissatisfaction with the existing XML APIs seems to revolve around two related points:

  1. The common interfaces, especially DOM and SAX, are too low-level and, as a consequence, make easy things hard.
  2. The design and syntax of the common interfaces favor another language's needs and conventions and do not reflect the needs and expectations of Perl coders.

It's hard to argue with complaints that the standard XML APIs are not high-level enough. SAX provides an excellent bridge between Perl's other data processing facilities and XML, but the hoops that you often have to jump through to maintain context while processing some documents can get very ugly very fast. The code required to generate well-formed XML documents by directly firing the appropriate SAX events is tedious and error-prone in all but the simplest cases.
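To make the context problem concrete, here is a minimal sketch of the bookkeeping a SAX handler needs just to tell one title element from another. It uses Python's standard-library xml.sax only because that module follows the same event model and runs without extra installs; a Perl SAX handler would have the same shape. The sample document, the TitleCollector class, and the book_titles function are all hypothetical names invented for illustration.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Collect the text of <title> elements that are direct children of <book>."""

    def __init__(self):
        super().__init__()
        self.stack = []      # names of the currently open elements
        self.titles = []     # finished title strings
        self._buffer = None  # text chunks for the title being read, if any

    def startElement(self, name, attrs):
        self.stack.append(name)
        # SAX gives us no parent pointer, so we inspect our own stack.
        if self.stack[-2:] == ["book", "title"]:
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate rather than assign.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if self._buffer is not None and self.stack[-2:] == ["book", "title"]:
            self.titles.append("".join(self._buffer))
            self._buffer = None
        self.stack.pop()


def book_titles(xml_text):
    """Parse xml_text and return the titles of its <book> elements."""
    handler = TitleCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.titles


# Hypothetical sample: only two of the four <title> elements belong to a book.
SAMPLE = """<library>
  <title>Reading Room</title>
  <book><title>Perl and XML</title></book>
  <book><title>Learning XSLT</title><author><title>Dr.</title></author></book>
</library>"""

if __name__ == "__main__":
    print(book_titles(SAMPLE))
```

The element stack, the text buffer, and the two-step check in endElement exist only to answer the question "where am I in the document?", which is the kind of hoop-jumping described above.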

Similarly, W3C DOM's goal of language-neutrality means that it has layers of abstraction that make many simple tasks seem much harder to perform than they should be: the data you want always seems to be at least two method calls from wherever you are.
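As a small illustration of the "two method calls away" feeling, the sketch below reads the text of a single element through a DOM interface. It uses Python's standard-library xml.dom.minidom as a stand-in for any W3C DOM implementation; the first_title function and the sample markup are hypothetical.

```python
from xml.dom.minidom import parseString


def first_title(xml_text):
    """Return the text of the first <title> element in xml_text."""
    doc = parseString(xml_text)
    # Step 1: find the element node.
    title_element = doc.getElementsByTagName("title")[0]
    # Step 2: the text is not on the element itself; it lives in a child text node.
    text_node = title_element.firstChild
    # Step 3: only now do we reach the string we wanted.
    return text_node.data


if __name__ == "__main__":
    print(first_title("<book><title>Perl and XML</title></book>"))
```

Even this trivial lookup passes through a node list, an element node, and a text node before any data appears, and it still assumes the title holds exactly one text node.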

In general, complaints about the foolishness of implementing "Java interfaces", as some have called W3C DOM and SAX, in Perl often sound like manifestations of the Not Invented Here syndrome, but I think there's a valid question lurking about. Why aren't there more XML API modules which are both complete in their ability to access or create all parts of an XML document and friendly to Perl and its culture?

I believe it's important to have support in Perl for the standard APIs, but there is nothing that says we cannot also have more interfaces -- even if only convenience extensions to existing ones -- that reflect a uniquely Perlish approach. We finally have a strong, credible foundation; now it's time to make the complexities and low-level details of reading and writing XML invisible to those that neither know nor care about them. Modules like XML::Twig, XML::Simple, and XML::Generator::PerlData point the way forward.

The Times They Are A-Changin'

The questions I heard people asking about Perl and XML this year shared one significant difference from the ones that I've heard before. Last year people were asking how, or whether it was possible at all, to do some specific task using XML in Perl. This year they were asking about the best or easiest or fastest way to perform some task. The focus seems to be shifting away from basic XML processing and applications and toward finding ways to combine the tools and modules that we have now to create higher-level tools.

This trend reflects a significant shift in the Perl-XML landscape. Developers are getting a grip on the nuts-and-bolts of XML processing in Perl, even if they don't like the current interfaces, and are beginning to learn which modules and APIs are best used in which cases. What they want now is a way to use each of those tools to its best advantage, and to do so in ways which don't force them to duplicate parts of their code because the underlying implementations do not communicate.

I typically avoid making predictions, but I expect that the next generation of XML Perl modules will reflect the growing sophistication among Perl-XML developers in general, and that a significant result of that will be better integration between the capable tools that we already have.


Overall, I was thrilled by what I saw pertaining to Perl and XML at this year's OScon. There is always room for improvement, but interest in Perl's XML processing facilities continues to grow and that trend shows no signs of stopping anytime soon. New developers are joining the community, experienced developers are getting smarter and are learning to do more with less, and the tools that are being created are stable and useful enough to turn the heads of the technical decision makers at more than a few large companies. Taken together, these point to a vital, evolving community with a promising future. I can't wait to see what the next year brings.

Finally, I thank all the folks who went out of their way at OScon to express to me how much they enjoy this column. I was astonished by the number of regular readers, and I was both gratified and humbled to learn that the things we have touched on here have been so helpful to so many.