The Future of XSLT 2.0

March 21, 2007

I recently wrote a weblog entry about the directions that I saw with XML, and while it has proved to be fairly popular, it has also generated a fair number of comments that really deserve more detailed examination. One of these comments — and one that I've been planning to write about for a while anyway — has to do with my statement that XSLT 2.0 is increasingly being used as a "router" language, replacing such applications as Microsoft's BizTalk Server.

However, in the long run, as the world increasingly chooses XML as the preferred data transport story, other technologies, such as XSLT 2.0, will likely end up making most of its functionality redundant and useful at best in the edge cases. Most databases now produce XML, either directly, through specialized extensions, or via an XQuery layer. Given that databases are also increasingly being sequestered behind data abstraction layers, this also means that from the standpoint of an external application, such databases are simply another vector for supplying XML in a given format (and increasingly, for consuming XML being sent to them).

SQL does not clearly specify how content is serialized. Database vendors use this to their advantage, wrapping access to such databases in their own internal APIs. And for the most part, even in cases such as MySQL, the primary serialization format is either an explicit API wrapper or a text output that is highly vendor-dependent.

This has resulted in a rather remarkable services industry built almost exclusively on "translating" between SQL output (or input) and some formal presentation layer. It can be argued that XSLT itself is simply another example of a translation layer; and, to be honest, if the language is used improperly, that statement is actually quite true.

But XSLT has a paradigm-altering function called document(), as well as another interesting capability called parameters. The document() function can work on static XML content, but it can also use HTTP GET requests (through query strings) to retrieve content from web services. Parameters can be set from the hosting language to determine these web services invocations and, additionally, can be calculated from within the XSLT and passed in the same manner.

However, there are several problems with this approach. For starters, creating such query strings in the first place and passing them in is painful because you have to use the rather cumbersome call-template syntax, wrap the results in a variable, then pass the variable into the document() function. There are few checks to handle error conditions, and once you create the output, you can't necessarily use that output as the input to some other action. This is because, in XSLT 1.0, the output is a result tree fragment rather than a node-set.
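To make the awkwardness concrete, here is a minimal XSLT 1.0 sketch of the pattern just described. The service URL, the parameter names, and the build-url template are all hypothetical, chosen only for illustration; the shape (call-template, then a variable, then document()) is the point.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical values that the hosting language would normally set -->
  <xsl:param name="serviceBase" select="'http://example.com/orders'"/>
  <xsl:param name="customerId" select="'42'"/>

  <!-- Named template that assembles the query string -->
  <xsl:template name="build-url">
    <xsl:param name="base"/>
    <xsl:param name="id"/>
    <xsl:value-of select="concat($base, '?customer=', $id)"/>
  </xsl:template>

  <xsl:template match="/">
    <!-- The output of call-template has to be captured in a variable... -->
    <xsl:variable name="url">
      <xsl:call-template name="build-url">
        <xsl:with-param name="base" select="$serviceBase"/>
        <xsl:with-param name="id" select="$customerId"/>
      </xsl:call-template>
    </xsl:variable>
    <!-- ...and only then passed to document(), which performs the GET.
         There is no standard 1.0 way to test beforehand whether it will succeed. -->
    <results>
      <xsl:copy-of select="document(string($url))/*"/>
    </results>
  </xsl:template>
</xsl:stylesheet>
```

Three separate constructs are needed for what is conceptually a single function call, and a failed retrieval is handled however the processor sees fit.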

Thus, while it has certainly been possible to use XSLT in this fashion, you have all too often been forced to rely upon inconsistently implemented extensions such as the node-set() function. Indeed, most of the really interesting things you could do with XSLT 1.0 came down to these self-same extensions, which again raised questions regarding whether there really was much benefit in using XSLT 1.0 in the first place.
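As an illustration of that dependence on extensions, the sketch below builds a small structure in a variable and then tries to query it. In XSLT 1.0 the variable holds a result tree fragment, so a node-set() extension is required before a path expression can be applied to it. The example assumes a processor that supports the EXSLT common namespace; other processors expose the same function under their own vendor namespaces. The job elements are made up for the example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <!-- In 1.0 this variable is a result tree fragment, not a node-set -->
  <xsl:variable name="jobs">
    <job id="a"/>
    <job id="b"/>
  </xsl:variable>

  <xsl:template match="/">
    <!-- count($jobs/job) is an error in XSLT 1.0;
         the extension call converts the fragment first -->
    <count>
      <xsl:value-of select="count(exsl:node-set($jobs)/job)"/>
    </count>
  </xsl:template>
</xsl:stylesheet>
```

In XSLT 2.0 the same variable is a temporary tree, so $jobs/job can be used directly with no extension at all.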

Important New Features

However, much if not most of this concern evaporates with XSLT 2.0, which I believe has significantly advanced the state of the art. Here are some of the new features:

xsl:function. This element makes it possible to create XSLT functions that can then be placed in special namespaces and invoked from within XPath expressions, which makes it far easier to modularize XSLT functionality in order to turn XSLT into a formal "programming language."
Formal XPath extension mechanism. XSLT now has a formal (and consistent) means of invoking methods written in other languages from within an XSLT expression.
unparsed-text-available(). This function addresses one of the biggest problems of working with document(), namely dealing with situations where the URL is unable to retrieve content, by checking a URL to ensure that it is in fact capable of retrieving something. The companion doc-available() function performs the same check for XML documents.
unparsed-text(). This solves another problem: loading non-XML content into an XSLT transformation. This works on textual content in any encoding the processor supports, though not on arbitrary binary data. SOAP web services pass a fair amount of information in the headers, and this content should be passable as a bundle to the transformation; this means that XSLT can in fact be used to process these.
Sequences. One of the reasons why XSLT 2.0 took so long to get out the door was the limitations of XPath. It turns out not to be possible to legally create an internal node-set() function. After considerable effort, what emerged was the decision to support general sequences of items that could be either atomic values or XML nodes. This change enabled more sophisticated groupings: set operations (union, intersection, difference) and collapsing lists, including numeric iterations.
Numeric iterations. You can now use expressions such as (1 to 10) that will return increasing iterative values, reducing the need for recursive expressions dramatically and consequently simplifying the code base for any number of different operations.
Regular expressions. Both XSLT 2.0 and XPath 2.0 contain support for regular expressions, as well as a number of string functions for taking advantage of regexes. For instance, the tokenize() function can split a string into a sequence based on a regular expression (or straight text), making it much easier to split apart lines and fields in CSV files, extract data from irregular phone number formats, perform actions if two words are within a given number of characters of one another, and so forth. This also makes it generally possible to use XSLTs for general schema validation, and gives a considerable leg up in the generation of rich Schematron output.
result-document and output. This element makes it possible to send content (and not necessarily just XML content) to a file or web service, independent of the final output mechanism used by the transformation itself. The two limitations that result-document faces are the fact that these are asynchronous POST events, and that you can control only a very limited number of HTTP headers (depending upon the implementation).
Inline control keywords. XSLT 2.0 now supports a number of XQuery extensions (not the entire set, but a fair number) for doing things, such as iterating with a for loop or performing various actions based upon conditional statements, directly within XPath. This can reduce file sizes considerably, and generally makes for code that is somewhat easier to read.
Character maps. With XSLT 2.0, you can now create character maps that let you map certain characters to some output form. Character maps replace the rather cumbersome (and often poorly used) disable-output-escaping to ensure that specific characters (such as the less-than "<" symbol) are preserved properly in output. This actually proves very useful for creating intermediate XML structures that can nonetheless be processed through other XSLT calls, and even more, is useful for generating output files that resemble XML but are not quite identical (such as JSP pages, which might have inline <% %> elements), for example:

<jsp:setProperty name="user" property="id" value="<%= "id" + idValue %>"></jsp:setProperty>

I think this feature is a useful one, because it effectively opens up XSLT to the world of generating processing logic in most web server languages.

Tunnel parameters. Parameterization has always been at odds with the recursive nature of XSLT and has often proven an impediment to modularization. In general, if you passed parameters through templates, it meant that the called templates had to declare those parameters, even if the only reason was to pass the parameters on to some other called template down the recursion. However, in 2.0, you can now make an xsl:with-param call with the tunnel attribute set to yes. When this happens, only the template that actually needs the parameter value has to declare the parameter (also with tunnel set to yes), not any of the intervening templates.
Datatypes. For die-hard XSLT programmers, datatypes are something of a mixed blessing. If you specify that certain variables should be considered to be of a specific datatype (with the whole XSD simple type set supported), the operations done on these will then reflect the datatype in question. Additionally, if your XSLT processor is schema-aware, then such types are automatically assigned into the infoset and all operations work on the presupposition that the operands have known types.
Standalone processing. It is also now possible to invoke an XSLT transformation without needing an additional XML file on which to operate. This isn't a big issue, but it works quite effectively in routing systems.
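Several of the features in this list can be seen working together in one small stylesheet. What follows is a sketch rather than a recommended design: the my namespace URI, the data.csv location, the batch/job vocabulary, and the client-17 value are all invented for illustration. It assumes an XSLT 2.0 processor that can be started from a named initial template (Saxon, for instance, offers an -it option for this), so no source document is needed.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/ns/my"
    exclude-result-prefixes="xs my">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical CSV location; a hosting application would override it -->
  <xsl:param name="csv-uri" as="xs:string" select="'data.csv'"/>

  <!-- A temporary tree: usable as nodes directly, no node-set() needed -->
  <xsl:variable name="jobs">
    <batch><job id="a"/><job id="b"/></batch>
  </xsl:variable>

  <!-- xsl:function: callable from any XPath expression as my:fields(...) -->
  <xsl:function name="my:fields" as="xs:string*">
    <xsl:param name="line" as="xs:string"/>
    <!-- tokenize() splits a string on a regular expression -->
    <xsl:sequence select="tokenize($line, '\s*,\s*')"/>
  </xsl:function>

  <!-- Standalone entry point: invoked by name, with no input document -->
  <xsl:template name="main">
    <report>
      <!-- Numeric iteration with an inline for expression -->
      <squares>
        <xsl:value-of select="for $i in 1 to 5 return $i * $i" separator=" "/>
      </squares>
      <xsl:choose>
        <!-- Check that the text resource can be read before loading it -->
        <xsl:when test="unparsed-text-available($csv-uri)">
          <xsl:for-each
              select="tokenize(unparsed-text($csv-uri), '\r?\n')[normalize-space()]">
            <row>
              <xsl:for-each select="my:fields(.)">
                <field><xsl:value-of select="."/></field>
              </xsl:for-each>
            </row>
          </xsl:for-each>
        </xsl:when>
        <xsl:otherwise>
          <missing uri="{$csv-uri}"/>
        </xsl:otherwise>
      </xsl:choose>
      <!-- Tunnel parameter: set here, consumed two templates down -->
      <xsl:apply-templates select="$jobs/batch">
        <xsl:with-param name="client" select="'client-17'" tunnel="yes"/>
      </xsl:apply-templates>
    </report>
  </xsl:template>

  <!-- This intervening template never mentions the client parameter -->
  <xsl:template match="batch">
    <jobs><xsl:apply-templates select="job"/></jobs>
  </xsl:template>

  <!-- Only the template that needs the value declares it -->
  <xsl:template match="job">
    <xsl:param name="client" tunnel="yes"/>
    <job id="{@id}" client="{$client}"/>
  </xsl:template>
</xsl:stylesheet>
```

The squares element should contain 1 4 9 16 25, and each job element in the output carries the client value even though the batch template never declared it. If data.csv cannot be read, the stylesheet emits a missing element instead of failing.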

XSLT 2.0 as Router

An XML message enters a system to be processed in some manner. One of the more fundamental distinctions in programming models on the Web has to do with the question of where intent is located; that is, where does the responsibility for indicating what should be done with a message reside? In REST mode the intent resides solely within the URL: the message itself contains the associated data, but doesn't by itself contain the relevant processing intent. In RPC mode, on the other hand, the responsibility for processing resides primarily within the envelope (typically a SOAP message), which may also include parameters, all of which are intended to invoke a method in some other language such as Java or C#.

Now, suppose, for a moment, that you created an XSLT2 transformation and bound it to one or more external objects under appropriate namespaces. It is not generally possible to invoke an XSLT transformation that is created by another transformation in one pass, regardless of the version (nor should it be, for security reasons). However, it is certainly possible to create a dual pass system, the first of which constructs from the incoming message one or more transformations that invoke the appropriate external class method calls in standalone mode, which then in turn either pass the results to the output stream or generate secondary streams that pass newly created XML to different places.
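The first pass of such a dual-pass system might look something like the sketch below, which uses xsl:namespace-alias to write out a second stylesheet. The command message format and the alias namespace URI are hypothetical, and the built-in upper-case() function stands in for whatever bound external method a real system would call, since extension bindings differ from processor to processor.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:out="http://example.com/ns/xsl-alias">

  <!-- Elements written with the out: prefix are emitted as xsl: elements -->
  <xsl:namespace-alias stylesheet-prefix="out" result-prefix="xsl"/>

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical incoming message: <command name="..." arg="..."/> -->
  <xsl:template match="/command">
    <out:stylesheet version="2.0">
      <!-- The generated stylesheet runs standalone from this template -->
      <out:template name="main">
        <!-- Message data is copied in as text content, never spliced
             into an XPath expression, so it cannot alter the logic -->
        <out:variable name="name"><xsl:value-of select="@name"/></out:variable>
        <out:variable name="arg"><xsl:value-of select="@arg"/></out:variable>
        <!-- Doubled braces survive the first pass as single braces,
             becoming an attribute value template in the second -->
        <result command="{{$name}}">
          <!-- Stand-in for a call to a bound external method -->
          <out:value-of select="upper-case($arg)"/>
        </result>
      </out:template>
    </out:stylesheet>
  </xsl:template>
</xsl:stylesheet>
```

The host then runs the generated stylesheet as a second, separate transformation. Keeping the incoming values in text nodes rather than inside select expressions is a deliberate choice here, in keeping with the security concerns discussed next.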

There are several benefits to this approach. First, what you are passing is initially XML, to be transformed as XML. This means that while you are doing this transformation you can also apply tests that will determine whether the incoming data is not only well-formed but business valid, and that protects the system from potentially serious attacks. If Schematron is also generating XML, then you can send messages back up the pipe (if such communication exists) to outline in a user-friendly form what has gone wrong. Second, it also makes it easier to stop potentially expensive server operations from being invoked if such calls exceed some parameter (such as the rate of automated requests coming from a given client). Since you're processing the information as XML, you're not putting the integrity of your system at risk from insertion attacks.

The first process also makes it possible to create batches consisting of multiple jobs: once a given job (a command call, for instance) or some additional processing is made, then the job is removed from this stack, and the XSLT can then choose to process the remainder of the job stack based upon some conditional expression emerging from the previous job's processing. In other words, XSLT at that point acts as a job-control language, based not only upon the incoming data but also upon the results of processing that data.

Such invocations could (and generally should) be done asynchronously. The first XSLT passes the initially processed XML to a second asynchronous transformation using result-document, which would then be retrieved as a message from a set of queued messages. In this particular case, the effective routing could be done solely within the first XSLT, with little need to create multiple synchronous chains of transformations. This system uses XSLT as a message router. That it can also serve as a validation system is not accidental. One of the powers of XML is that you can check for the validity of XML without the danger of instantiating the object in live form.
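A stripped-down version of such a router might be sketched as follows. The message vocabulary, the queue names, and the file-based queue location are all assumptions made for the example; whether result-document can write to a given URI scheme depends on the processor and how it is configured.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical base URI under which each queue is a directory -->
  <xsl:param name="queue-base" select="'file:///tmp/queues/'"/>

  <!-- Hypothetical message format: <message type="order">...</message> -->
  <xsl:template match="/message">
    <!-- Routing decision made inline with an XPath 2.0 conditional;
         unknown or missing types fall through to a dead-letter queue -->
    <xsl:variable name="target" select="
        if (@type = 'order') then 'orders'
        else if (@type = 'invoice') then 'invoices'
        else 'dead-letter'"/>

    <!-- Secondary output: the message itself goes to the chosen queue -->
    <xsl:result-document method="xml"
        href="{$queue-base}{$target}/{generate-id()}.xml">
      <xsl:copy-of select="."/>
    </xsl:result-document>

    <!-- Principal output: a receipt returned to the caller -->
    <receipt routed-to="{$target}"/>
  </xsl:template>
</xsl:stylesheet>
```

Because the routing test runs before anything is written, the same template is a natural place to hang validation rules: a message that fails them is simply sent to the dead-letter queue rather than on to a live service.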

XSLT 2.0, XQuery, and XForms

Over the years I've written a great deal about XQuery. I have to admit that I was not originally very taken with it and saw it as being a somewhat awkward replacement for XSLT. One possibility that I thought about was the idea of using XQuery to retrieve content that could be transformed by XSLT into the appropriate format. Until I saw the eXist database, I assumed that these would occur in separate processes. With eXist, however, you can make an XQuery call to the transform:transform() extension function to invoke a transformation on an XML node. With a quick download and a little fiddling with some of the configuration files, you can switch from the default Xalan transformer to Saxon 8.9, enabling XSLT2, XPath2, and XQuery all in the same system. And then you can turn your XQueries directly into web services.

This makes for an incredible combination, in part because you never leave the XML context. You don't have to spend a large amount of time writing different configurations of transformer objects, XML Document resources, pipes, parameters, or the like. You basically end up working with XML pretty much throughout the entire process. The combination of this with the ability to work with the various server objects (request, response, session, etc.) essentially gives you the entire application context in a single XQuery program.

This becomes especially important when working with XForms. I find it increasingly difficult not to work with XForms, to be honest, even given some of the complexities involved in different implementations. With XForms, you can build the data model XML on the client side, send it up to an XQuery that will validate and process it, and then this object can in turn be passed off to a transformation to generate another XForms instance, an XHTML report, or an SVG chart of some sort. XSLT2 works well in building such input templates, again giving you fine-grained conditional control and the establishment of interface capabilities. I think the "X" model (XQuery + XSLT2 + XHTML + XForms) will likely prove a potent one in the future. It is already gaining traction in various industries, especially in the medical, insurance, government, and education sectors.

The Future of XSLT

Predicting the future of any technology, especially one as esoteric as XSLT, is an exercise fraught with risk. XSLT 2.0 is generally easier to learn than its predecessor, is considerably more powerful, makes most of the right moves with regard to extensibility, and already has some first-class implementations in place. Microsoft recently announced that it will be producing an XSLT 2.0 processor, and I wouldn't be surprised if other XSLT implementation owners are at least evaluating the option. It does what a good second version is supposed to do, in that it solves most of the problems of the first version without introducing a whole raft of new ones.

Overall, I see it becoming far more heavily used within the next couple of years as implementations proliferate, especially when you consider that XSLT 1.0 implementations now exist for very nearly every platform in use today, making it a remarkably successful cross-platform solution. As someone who has wrestled with runaway recursive stacks, clunky called template invocations, and implementation headaches, I find that it couldn't come soon enough.