XSLT UK 2001 Report
by Jeni Tennison | Pages: 1, 2, 3

Building and Maintaining the DocBook XSL Family

Norm Walsh took to the stage again, this time in his role as maintainer of the DocBook stylesheets. They can be used to transform DocBook into HTML or XSL-FO and include support for XML vocabularies derived from DocBook as well as full DocBook itself. The DocBook stylesheets are a huge undertaking, currently around 1000 templates.

They are designed to be modular, to be customizable through parameterization and the use of template documents, and to have a literate programming style. They use extensions to address some of the problems.

Norm focused on the problem of supporting internationalization within the DocBook stylesheets. The main issue with internationalization is the use of different text or labels within the output. For example, in English chapters are labeled with 'Chapter' and in French they are labeled with 'Chapitre'. Norm has addressed this by having an external XML document that holds translations for the words in this generated text, and a named template that accesses and inserts them as appropriate.

However, it's not just the terminology that differs between languages but also the arrangement of generated phrases, the formatting of numbers, and so on. Norm's current solution is to use a format string, which holds escape codes such as %t that are replaced with the value of different pieces of text.

Norm thus separates the translation of words from the arrangement of phrases and both of these from the stylesheets themselves. This means that translators do not have to worry about learning XSLT or even DocBook in order to translate and rearrange terms. Norm admitted, though, that it does mean that translators have to learn the escape codes, and that XSLT doesn't cope with string processing of this kind very well. If he has to introduce more escape codes (there are currently three), he will try another approach.
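
The escape-code idea can be illustrated outside XSLT. The following is a minimal Python sketch, not the DocBook implementation: the TEMPLATES table, the function names, and the %n code are assumptions made for illustration (only %t is mentioned above). It shows how a translator-maintained string can control both the wording and the arrangement of a generated label.

```python
# Hypothetical localization table. The real DocBook stylesheets keep this
# kind of data in an external XML document; the entries here are invented.
TEMPLATES = {
    "en": {"chapter": "Chapter %n. %t"},
    "fr": {"chapter": "Chapitre %n. %t"},
}


def expand(template, values):
    """Replace each two-character escape code (such as %t) with its value."""
    out = []
    i = 0
    while i < len(template):
        code = template[i:i + 2]
        if template[i] == "%" and code in values:
            out.append(values[code])
            i += 2
        else:
            # Not a known escape code: copy the character through unchanged.
            out.append(template[i])
            i += 1
    return "".join(out)


def label(lang, kind, number, title):
    """Build a generated label for the given language."""
    # %n (the number) is an assumed code; %t (the title) comes from the talk.
    return expand(TEMPLATES[lang][kind], {"%n": str(number), "%t": title})


print(label("fr", "chapter", 3, "Les feuilles de style"))
# prints: Chapitre 3. Les feuilles de style
```

In a general-purpose language this scan is a few lines; Norm's point was that the same character-by-character processing is awkward to express in XSLT 1.0.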

XSLT and Databases: A Compelling Combination for Web Apps

Steve Muench, an XML Evangelist from Oracle, is a member of the XSL Working Group, a member of the team developing the XSQL Servlet, and author of Building Oracle XML Applications. The basis of Steve's talk was the equation

SQL + XML + XSLT = WOW

Earlier in the day, we'd heard from Mike Kay about the problems with XSLT applications reading in large documents. Steve's approach to this problem is to use a blend of technologies, so that the information in the XML source that's used in the transformation is the information that you want to see. It's a common problem for people to access hundreds of rows from a database as XML and filter it using XSLT down to the few rows that they're actually interested in. Steve presented XSQL as a solution for those problems.

The basis of this approach is to have data (or documents) stored within databases and have an application that dynamically produces XML based on this data and that can also import data held in XML into the database. The XML produced from the database can then be restructured with XSLT to produce the view that's required. Oracle's XML SQL Utility gives a mapping between an Object View of the structure of the database and XML output, either through DOM or SAX. The XSQL Servlet supports this by processing requests, accessing the database, and increasing performance by pooling connections, pooling stylesheets, and caching XPaths.

These techniques and tools make up XSQL Pages, a server-side processing framework like Cocoon, but based on databases rather than file structures. XSQL Pages hold queries in a loose XML structure (actually textual SQL queries as the body of an xsql:query element). While there is a command-line utility to use them, the functionality comes to the fore when they're used with JSP. This allows the parameterization of the SQL queries through URLs, so that the same XSQL Page can be used to access different parts of the database.
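
The principle of filtering in the database and then serializing only the wanted rows as XML can be sketched without any Oracle software. The Python example below is an illustration under stated assumptions, not the XSQL API: sqlite3 stands in for the database, the emp table and the rows_to_xml function are invented, and the ROWSET/ROW element names are assumed for the output shape.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn, sql, params=()):
    """Run a parameterized query and wrap each row in a ROW element."""
    cursor = conn.execute(sql, params)
    columns = [description[0] for description in cursor.description]
    rowset = ET.Element("ROWSET")
    for record in cursor:
        row = ET.SubElement(rowset, "ROW")
        for name, value in zip(columns, record):
            ET.SubElement(row, name.upper()).text = str(value)
    return rowset


# Hypothetical in-memory table used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("King", "Sales"), ("Scott", "Research"), ("Ford", "Research")],
)

# The bound value stands in for a URL parameter such as ?dept=Research.
rowset = rows_to_xml(
    conn, "SELECT name FROM emp WHERE dept = ? ORDER BY name", ("Research",)
)
print(ET.tostring(rowset, encoding="unicode"))
# prints: <ROWSET><ROW><NAME>Ford</NAME></ROW><ROW><NAME>Scott</NAME></ROW></ROWSET>
```

Because the WHERE clause does the filtering, the stylesheet that formats this output only ever sees the two rows of interest.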

Steve showed many examples of XSLT being used to format the same data in different ways: as HTML tables, as new SQL queries for inserting data into different databases, and as bar charts using SVG. All together, it was a very convincing demonstration of the power of the approach.

Charlie, An XML Application Framework

The next talk was also about a framework for XML using XSLT. Petr Cimprich, of Ginger Alliance, presented Charlie as a solution to the problems of performance with server-side applications (like XSQL and Cocoon) and of portability with client-side applications (like Internet Explorer).

Charlie can sit on a client machine and act as a kind of proxy, taking connections from clients through handlers and translating them into actions. These actions can involve accessing information on a server through data drivers, which currently support file access, HTTP, SQL and SOAP.

Schematron: validating XML using XSLT

Leigh Dodds from XMLhack.com and Ingenta, and author of the XML-Deviant column on XML.com, spoke next about Schematron, a user-centered schema language that uses XPath expressions to describe the rules governing a particular XML vocabulary. It differs from grammar-based schema languages and is better when it comes to hard documents (such as those containing multiple namespaces) or constraints that are difficult to express (such as those governing accessibility).

Leigh talked through some of the syntax of Schematron, including the differences between assert, which checks conformance, and report, which highlights features in the XML document. Diagnostics, introduced in Schematron 1.5, give additional information to the user, and patterns group rules together to allow validation phases. Validation doesn't have to be a single process but, rather, an iterative one tied to the authoring process.
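
The difference between assert and report can be sketched in a few lines. The Python below is a deliberately reduced illustration, not a Schematron implementation: the RULES list and validate function are invented, and each test is narrowed to "a child element with this name exists", whereas real Schematron tests are arbitrary XPath expressions.

```python
import xml.etree.ElementTree as ET

# Each hypothetical rule: (context path, kind, child name to test, message).
RULES = [
    (".//chapter", "assert", "title", "A chapter must have a title."),
    (".//chapter", "report", "note", "This chapter contains a note."),
]


def validate(doc, rules):
    """Return (kind, message) pairs for every rule that fires."""
    messages = []
    for context, kind, test, message in rules:
        for node in doc.findall(context):
            holds = node.find(test) is not None
            # An assert fires when its test fails;
            # a report fires when its test succeeds.
            if (kind == "assert" and not holds) or (kind == "report" and holds):
                messages.append((kind, message))
    return messages


doc = ET.fromstring(
    "<book><chapter><title>One</title><note/></chapter><chapter/></book>"
)
for kind, message in validate(doc, RULES):
    print(kind, message)
# prints:
# assert A chapter must have a title.
# report This chapter contains a note.
```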

Schematron schemas are transformed using XSLT into a particular Schematron implementation, a stylesheet that can be run with an instance document to give information about the validity of that instance document. These implementations may give user diagnostic information to help with authoring, or RDF descriptions of errors for application processing, or anything that can be produced using XSLT.

For the future, Leigh talked about integration with XML Query, access of Schematron schemas through RDDL, and its use in authoring environments.

Short Papers

There was a series of short papers and presentations. The most important of these was the announcement by Sharon Adler, a co-chair of the XSL Working Group, that XSLT 1.1 is officially "on hold" so that the WG can focus on XSLT 2.0 and XPath 2.0. The Working Group target for XSLT 2.0 is the end of the year.

Francis Norton, from iE Ltd, spoke about using schemas, in particular XML Schema, as a documentation of requirements of XML documents, and he talked about producing HTML documentation from XML schemas. He also showed how to use Schematron to encode things like intradocument constraints, and how Schematron can be integrated with XML Schema through the appinfo element, using XSLT to pull out the Schematron schema as required.

I spoke a little about Extensions to XSLT (EXSLT) and the website that Jim Fuller, Uche Ogbuji, Dave Pawson and I have set up. The aims are to standardize extension functions and elements and to provide a repository of implementations for the extensions to help implementers and authors. Anyone can get involved. See http://www.exslt.org for more details.

Ken Holman, from Crane Softwrights Ltd, and member of the OASIS committee on XSLT conformance, presented the OASIS test suite. He showed off his new Chrysler with "XML GURU" license plates. The various discretionary items within the XSLT Recommendation mean that putting together a test suite is not straightforward. Anyone can submit test cases; the committee will decide whether they are acceptable and put together a normative package. Each XSLT processor implementer will submit a definition of the choices they have made on the discretionary items in the XSLT Recommendation, and stylesheets will be used to construct a configured test suite tailored to the particular implementation. The committee's work is currently fairly far behind schedule; for now they are focusing on xsl:number to try out the process.
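
The idea of a configured test suite can be sketched as a simple filter. The committee plans to do this with stylesheets; the Python below is only an illustration of the selection logic, and the test identifiers, the discretionary item name, and its values are all invented for the example.

```python
# Hypothetical test catalogue. Each case lists the discretionary choices
# it depends on; an empty mapping means the case applies to every processor.
TEST_CASES = [
    {"id": "number-01", "requires": {}},
    {"id": "attribset-07", "requires": {"duplicate-attribute-set": "use-last"}},
    {"id": "attribset-08", "requires": {"duplicate-attribute-set": "raise-error"}},
]


def configure_suite(cases, choices):
    """Keep only the cases whose required choices match the implementer's."""
    return [
        case["id"]
        for case in cases
        if all(choices.get(item) == value
               for item, value in case["requires"].items())
    ]


# An implementer's (invented) declaration of the choices they have made.
my_processor = {"duplicate-attribute-set": "use-last"}
print(configure_suite(TEST_CASES, my_processor))
# prints: ['number-01', 'attribset-07']
```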

Markup Meets Middleware

The first talk of the second day was given by Wolfgang Emmerich from University College London and Zühlke Engineering AG. He talked about some work he'd done to support trading within a German bank. The existing trading system had cross interfaces between various systems, and the goal of the project was to introduce a common infrastructure with a central hub managing communication between the systems.

Wolfgang described the integration issues on two levels -- the logical integration of information (how to gain a common data format), and the reliable transfer of data across the system. Wolfgang comes from a CORBA background, and in the initial phase of the project, they experimented with both an IDL/CORBA and an XML/XSLT solution to the problem. However, the IDL/CORBA approach had a number of drawbacks:

  • it is hard to accommodate change when using IDL because of the dependencies between different parts of the representation
  • CORBA may not be a standard technology in two years' time
  • the mappings between different data formats are hard, and need to be done by IDL specialists rather than the people who know the business rules

Thus they decided to use XML/XSLT as the basis of the system. They drew upon several existing standard DTDs, such as FpML and FixML, with transformations from the proprietary formats being used by the bank systems into the common FixML. The XSLT was supported with extension functions to carry out conversions and validations, especially of dates. The aim of the system was to support 100,000 transfers a day, with a target of 10 seconds per trade delivery. While they originally stored information about the mappings between values within XML files, there were problems with the speed of this approach, and they moved to a database to alleviate them. They also used cached, compiled stylesheets to improve the speed of the solution.

Wolfgang finally gave a useful summary of the real world benefits of XML/XSLT. They had originally been worried about using open source XSLT processors in a real world system but have found them to be very good quality. They estimate that the current solution is four times more cost efficient than the original system, and that there will be a complete return on investment by the end of the year.

Using XSLT to Derive Schemas from UML

The next speaker was Mario Jeckle from DaimlerChrysler Research and Technology. As a representative of a car company, he justified his presence at an IT conference by pointing out that IT is critical in car development. The central theme of Mario's talk was that developing XML and XML schemas should be transparent to users. If XML is the next ASCII, then users shouldn't have to care about the fact that they're using it.

Mario pointed out several problems that need to be addressed when developing XML vocabularies. They need to be flexible to accommodate changes, developed quickly, coherent with legacy systems, accurate, have a good style to make them usable, be integrated with other systems, and be reusable. As an approach, Mario discussed taking UML diagrams and turning these into XMI, which is an XML vocabulary designed to represent UML diagrams. This XMI can then be transformed -- using XSLT, naturally -- to XML Schemas.

Some of the talk was spent outlining the syntax of UML, which is a standard graphical modeling language that is supported by many products but has no standard textual, portable representation. XMI was developed as an interchange format for UML, and it can be used as a basis for code generation, model assessment, and checking modeling guidelines, using the information within the UML models in a programmatic way.

However, UML is not a static standard. The developers of XMI needed an easy way to keep the XMI schema synchronized with UML as it developed. Fortunately, the structure of UML can be described in UML itself, as meta-level models known as the MOF. Therefore, given an automated means of transforming a UML model into an XML Schema, the developers of XMI could automate the generation of the XMI schema from the MOF.

Unfortunately, Mario didn't have much time to go into the technical details of the generation of schemas from UML models, but it seemed to be a fairly straightforward process. The transformation they use supports UML data types but also allows XML Schema data types to be used within UML models. One of the issues of UML models is that they don't have a distinct starting point for navigation around the model, whereas XML documents have to have a single document element. For this reason, the XML Schemas the transformation produces allow any nesting method.
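
Since the talk did not go into the details, the following is only a guess at the general shape of such a mapping, written in Python rather than XSLT and working from a plain dictionary rather than XMI. The MODEL contents and the model_to_schema function are hypothetical. Declaring every class as a global element is one assumed way to cope with the lack of a distinct starting point, because any global element can then serve as the document element.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

# Hypothetical model: class name -> list of (attribute name, schema type).
MODEL = {
    "Car": [("vin", "xs:string"), ("doors", "xs:integer")],
    "Engine": [("power", "xs:decimal")],
}


def model_to_schema(model):
    """Emit one global element declaration per class in the model."""
    schema = ET.Element(f"{{{XS}}}schema")
    for class_name, attributes in model.items():
        element = ET.SubElement(schema, f"{{{XS}}}element", name=class_name)
        complex_type = ET.SubElement(element, f"{{{XS}}}complexType")
        sequence = ET.SubElement(complex_type, f"{{{XS}}}sequence")
        for attr_name, attr_type in attributes:
            # Each UML attribute becomes a child element with a simple type.
            ET.SubElement(sequence, f"{{{XS}}}element",
                          name=attr_name, type=attr_type)
    return schema


print(ET.tostring(model_to_schema(MODEL), encoding="unicode"))
```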
