XML.com: XML From the Inside Out

Functional Programming and XML
by Bijan Parsia

ErXML

Erlang is generally considered to be the industrial poster child of FP. Born at Ericsson, designed to handle huge, fault-tolerant, highly concurrent applications, and evolved under the pressure of mission-critical, production telephony apps, Erlang is a worker's FP language (though there is a good deal of interesting academic research done on and with it). Erlang is also a very friendly language to get going in. Indeed, it has a lot of the feel of Smalltalk (okay, "of Python") -- a simple core language, dynamic typing, and extensive libraries.

Erlang has many of the characteristic syntactic and semantic features of FP: higher-order and referentially transparent functions, list comprehensions, single-assignment variables, pattern matching, and so on. It has a very clean and easy-to-use module system. Its outstanding feature, of course, is seamless support for concurrency. In fact, many Erlang advocates wax enthusiastic about "process oriented" design.
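As a minimal sketch of how these features look in practice, the following module touches each of them in a few lines. The module name `fp_features` and its functions are hypothetical, invented for illustration; only `lists:map/2`, `lists:sum/1`, `math:pi/0`, `spawn/1`, and `self/0` come from the standard distribution.

```erlang
-module(fp_features).
-export([demo/0, area/1]).

%% Pattern matching: one clause per shape of input.
area({circle, R})       -> math:pi() * R * R;
area({rectangle, W, H}) -> W * H.

demo() ->
    %% Single assignment: Shapes is bound once and cannot be rebound.
    Shapes = [{circle, 1.0}, {rectangle, 2, 3}, {rectangle, 4, 5}],
    %% Higher-order function: lists:map/2 takes a fun as an argument.
    Areas = lists:map(fun area/1, Shapes),
    %% List comprehension: the pattern in the generator acts as a filter,
    %% so only the rectangles contribute.
    RectAreas = [W * H || {rectangle, W, H} <- Shapes],
    %% Concurrency: spawn a process that sends its result back as a message.
    Parent = self(),
    Pid = spawn(fun() -> Parent ! {self(), lists:sum(RectAreas)} end),
    Total = receive {Pid, Sum} -> Sum end,
    {Areas, RectAreas, Total}.
```

Calling `fp_features:demo()` returns a three-element tuple: the list of all areas, the list of rectangle areas, and the rectangle total computed in the spawned process.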

At the Sixth International Erlang/OTP User Conference, Ulf Wiger introduced what looks to become Erlang's canonical XML processing tool, XMerL (there is an earlier toy parser and a binding for the Edinburgh LT XML toolkit as well). Though still in early beta, XMerL is nevertheless immediately useful and has already been used to implement an XML datastore on top of Mnesia, Erlang's distributed, fault-tolerant database system.
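To give a feel for the tool, here is a small sketch that parses an XML string and pulls out some text. It assumes the `xmerl` application as shipped with current Erlang/OTP distributions (`xmerl_scan:string/1` and the records in `xmerl.hrl`), which may differ in detail from the early beta described here; the module name `xmerl_sketch` and the `<title>` element are made up for the example.

```erlang
-module(xmerl_sketch).
-export([titles/1]).

%% Record definitions (#xmlElement{}, #xmlText{}) shipped with xmerl.
-include_lib("xmerl/include/xmerl.hrl").

%% Parse an XML string and return the text of each <title> child of the root.
titles(XmlString) ->
    %% xmerl_scan:string/1 returns the root element plus any unparsed rest.
    {Root, _Rest} = xmerl_scan:string(XmlString),
    %% The record pattern in the generator skips text nodes and other elements.
    [text_of(E) || #xmlElement{name = title} = E <- Root#xmlElement.content].

%% Concatenate the text nodes directly under an element.
text_of(#xmlElement{content = Content}) ->
    lists:flatten([V || #xmlText{value = V} <- Content]).
```

For example, `xmerl_sketch:titles("<books><title>A</title><title>B</title></books>")` yields the list of the two title strings. Note how the parsed document is just ordinary Erlang records and lists, so the pattern matching and list comprehensions of the core language apply directly.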

Here are three reasons to start playing with XMerL:

  1. Erlang makes a good starting ground for playing with XML and FP. Those with Scheme or Lisp experience should find it comfortable enough, and it's "non-parenthesized" enough not to turn off others. Probably the trickiest bit is the necessity of using tail recursion for loops, but in Erlang I haven't found it as contorting in practical use as the standard Scheme examples suggest.
  2. The XMerL set itself has several interesting features, such as a highly customizable parser (with hooks at various stages of the entire parsing process and within parsing events).
  3. The Erlang distribution comes with a number of applications suited for building large-scale web sites, most notably Mnesia and Inets (an Apache-compatible HTTP server). Other Erlang-based projects of interest include Eddie (distributed web farm software), the various Bluetail products, and IDX-xmensia, an XML datastore. What starts as a prototype or toy today can easily turn into a robust production application in Erlang. Erlang is good that way.
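The tail-recursive loops mentioned in the first point usually take the shape of a public function that delegates to a helper carrying an accumulator. The sketch below shows that idiom; the module `loops` and its functions are hypothetical names chosen for illustration.

```erlang
-module(loops).
-export([sum/1, count/2]).

%% Public entry point: start the loop with an accumulator of 0.
sum(List) -> sum(List, 0).

%% The recursive call is the last thing each clause does, so the
%% loop runs in constant stack space.
sum([], Acc)      -> Acc;
sum([H | T], Acc) -> sum(T, Acc + H).

%% Count the elements of a list that satisfy a predicate.
count(Pred, List) -> count(Pred, List, 0).

count(_Pred, [], N) -> N;
count(Pred, [H | T], N) ->
    case Pred(H) of
        true  -> count(Pred, T, N + 1);
        false -> count(Pred, T, N)
    end.
```

Here `loops:sum([1, 2, 3])` evaluates to 6, and `loops:count(fun(X) -> X > 1 end, [1, 2, 3])` evaluates to 2. Pattern matching on the empty and non-empty list does the work that a loop condition would do in an imperative language.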

XMerL is not as standards-compliant as the latest and greatest mainstream XML tools. But as part of a general Erlang solution, it can fill many practical niches.

Uncharted Lands

There's a lot more going on in the FP world, and even in these particular examples, than I've touched on. Ideas from the functional programming world have always percolated into mainstream practice, but we seem to be reaching a point where many FP techniques and tools are poised for wholesale -- or at least retail -- acceptance. For example, James Clark's recently proposed Trex validating language for XML is (in his own words):

basically the type system of XDuce with an XML syntax and with a bunch of additional features (like support for attributes and namespaces) needed to make it a practical language for structure validation

While it's clear that as long as James Clark is around most of us will have the chance to benefit in some ways from FP innovations, there is still much to be gained from direct familiarity with FP techniques. Even if the tools get imported, there remains the issue of understanding them -- getting a good grip on these imported tools is often furthered by a more general familiarity with the practices and theories which produced them.

One serious lack in the FP world is a large body of good and gentle documentation, especially of the sort designed to get programmers steeped in other paradigms (e.g., imperative, OOP) up to speed. Much of the jargon of FP is unfamiliar, being derived from advanced logical and mathematical terminology (indeed, there is a current thread on comp.lang.functional with the subject, "A Plea for Plain Functional English and Syntax"). While the situation is continually improving, it's important to recognize that deep and full mastery of FP is not a trivial endeavor even in the best of circumstances.

Fortunately, one doesn't need a deep and full mastery of FP to benefit from studying it. I've found that, at least for dealing with XML, a little bit of dabbling goes a long way.