XML.com: XML From the Inside Out

REST Reporting

February 16, 2005

Producing paper reports is a fundamental requirement of many applications. As more systems are exposed as services, REST, XSLT, and the mighty URI can create a reporting approach that has a number of advantages over traditional, database-direct reporting engines. These benefits include:

  • Reporting from the business level of an application
  • Combining the data from multiple services
  • Building a flexible reporting framework that is largely platform and vendor independent

Though REST-based reporting is simple in concept, there are a number of issues that can affect its performance. This article discusses some of these issues and how they relate to XSLT, REST service design, and toolset selection.

Assembling the Data

Often, the data required to build a report is not available from a single webservice response, and the data must be composited from a number of webservice calls. As a team, REST and XSLT are especially useful tools for accomplishing this. If you are unfamiliar with REST or using XSLT to transform multiple input documents, you may want to first review these XML.com articles:

An example of a REST-based report is this hypothetical status report of open work-orders. Additional details, such as maps and photographs, are included for work-orders that are over-budget.

Figure: Example report that consumes a number of REST services

The construction of this report requires the compilation and transformation of data from a variety of REST resources, each identified by a URI:

  • [1] A listing of open work orders from http://workmgmt/workorders/?status=open
  • [2] A detail of one specific work order from http://workmgmt/workorders/161803
  • [3] Basic information and photo about the asset from http://assetmgmt/summary/23044
  • [4] An SVG transform of GML feature data from http://assetmgmt/maps/23044?radius=2000&layers=parcel,street,utility

The key to making this happen is XSLT's document function.

A Closer Look at XSLT's Document Function

In the XSLT for the example report, we conditionally obtain detailed work-order information with a request using a relative URI [1].

<xsl:template match="m:workorder">

  <!-- .. show work order information .. -->

  <xsl:if test="m:actualCost &gt; m:budgetCost">

    <!-- [1] Get work order detail; @workorderID is resolved relative to the input document -->
    <xsl:apply-templates
      select="document(@workorderID,.)/m:workorder"
      mode="workorder-detail"/>

  </xsl:if>

</xsl:template>

Using relative URIs with XSLT's document function requires some care. When expressed as a string, a relative URI is resolved against the stylesheet's base URI; when expressed as a node, it is resolved against the base URI of the document that contains the node. You can also pass a second argument to the document function: a node whose document supplies the base URI for the relative URI.

As an example, consider the resolution of the relative URI 161803 from within the XSLT document file://d:/workmgmt/xslt/budgetReport.xsl.

Input Document URI            Document Request                 Resource Returned
----------------------------  -------------------------------  ---------------------------------
http://workmgmt/workorders/   document('161803')               file://d:/workmgmt/xslt/161803
http://workmgmt/workorders/   document(string(@workorderID))   file://d:/workmgmt/xslt/161803
http://workmgmt/workorders/   document(@workorderID)           http://workmgmt/workorders/161803
http://workmgmt/workorders    document('161803',.)             http://workmgmt/161803
http://workmgmt/workorders/   document('161803',.)             http://workmgmt/workorders/161803
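
The effect of the trailing slash can be reproduced outside of XSLT. The following is a minimal sketch, assuming Java's java.net.URI as a stand-in for the processor's own resolution logic; the class and method names are made up for illustration.

```java
import java.net.URI;

/** Resolves a relative reference against a base URI, much as document('161803', .) would. */
final class RelativeUriDemo {

    static String resolve(String baseUri, String relative) {
        // URI.resolve replaces everything after the last "/" in the base path
        // with the relative reference.
        return URI.create(baseUri).resolve(relative).toString();
    }

    public static void main(String[] args) {
        // Trailing slash: the "workorders/" segment is kept.
        System.out.println(resolve("http://workmgmt/workorders/", "161803"));
        // No trailing slash: "workorders" is the last segment and is replaced.
        System.out.println(resolve("http://workmgmt/workorders", "161803"));
    }
}
```

The two calls mirror the two document('161803',.) cases above: only the base URI that ends in a slash keeps the workorders segment.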

Dynamic XML and Base URIs

If you are performing a server-side XSLT transformation on a dynamically created XML source, there may be no base URI from which to derive relative URIs. If there is a base, it may be something unfortunate like c:\winnt\system32\.

One way to deal with this is to explicitly set the base URI of the input XML document. Where the base URI comes from depends on your transformation engine; Java engines often use the javax.xml.transform.Source.getSystemId() method, and .NET uses the XmlReader.BaseURI property.

In some cases (as with many .NET XmlReader implementations), you cannot set the value of the base URI directly. Fortunately, a quick customization of the XML source lets you set the base URI explicitly. The following example inherits from XmlNodeReader, which can be used in .NET to read DOM nodes into an XPathDocument to be used as input for an XSLT transform:

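using System.Xml;

// XmlNodeReader that reports a caller-supplied base URI
public class ContextXmlNodeReader : XmlNodeReader {

  private readonly string _baseUri;

  public ContextXmlNodeReader(XmlNode node, string baseUri) : base(node) {
    _baseUri = baseUri;
  }

  // Relative URIs in document() calls are resolved against this value
  public override string BaseURI {
    get { return _baseUri; }
  }

}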
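
On the Java side, the equivalent step is usually simpler, because a JAXP Source lets you set its system ID directly. The following is a minimal sketch, assuming the XML has already been produced in memory as a string; the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.transform.stream.StreamSource;

final class BaseUriSource {

    /** Wraps in-memory XML in a Source that reports the given base URI. */
    static StreamSource withBaseUri(String xml, String baseUri) {
        StreamSource source = new StreamSource(new StringReader(xml));
        // The system ID is what getSystemId() returns; processors can use it
        // to resolve relative URIs found while transforming this input.
        source.setSystemId(baseUri);
        return source;
    }

    public static void main(String[] args) {
        StreamSource source = withBaseUri("<workorders/>", "http://workmgmt/workorders/");
        System.out.println(source.getSystemId());
    }
}
```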

Handling Local REST Requests with URI Resolvers

The example report in this article shows more detail for some work-orders by making additional requests to the REST service. If a lot of details are shown, generating the report can require a lot of potentially costly HTTP calls. When using XSLT on the server side, you can avoid an HTTP round trip for each of these requests by implementing a custom URI resolver that handles local REST GET requests.

URI resolvers are used by many XSLT processors to interpret and return resources identified in XSLT constructs such as the <xsl:include> element or the document() function. Custom URI resolvers let you intercept these URI requests and handle them directly. This is done by overriding methods such as XmlResolver's GetEntity (.NET) or URIResolver's resolve (Java). In the case of our example, all requests for URIs beginning with "http://workmgmt/workorders" would be handled directly by the business logic of the work-order REST service. All other URI requests trigger the default behavior of the URI resolver and result in an HTTP GET request.
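
As an illustration, here is a minimal sketch of such a resolver using JAXP's URIResolver. Several things in it are assumptions: the in-memory map is a hypothetical stand-in for the work-order service's business logic, the sample XML uses un-namespaced elements and an invented title element for brevity, and the exact base URI a processor passes to resolve may vary by implementation.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.net.URI;
import java.util.Map;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Serves work-order URIs from local data; everything else falls through. */
final class LocalRestResolver implements URIResolver {

    static final String LOCAL_PREFIX = "http://workmgmt/workorders";

    // Hypothetical stand-in for the work-order service's business logic.
    private final Map<String, String> localResources;

    LocalRestResolver(Map<String, String> localResources) {
        this.localResources = localResources;
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        String absolute = (base == null || base.isEmpty())
                ? href
                : URI.create(base).resolve(href).toString();
        if (absolute.startsWith(LOCAL_PREFIX)) {
            String xml = localResources.get(absolute);
            if (xml == null) {
                throw new TransformerException("No local resource for " + absolute);
            }
            // Answer directly, without an HTTP round trip.
            return new StreamSource(new StringReader(xml), absolute);
        }
        // Returning null asks the processor to use its default resolution.
        return null;
    }

    /** Runs a tiny stylesheet that pulls in one work-order detail via document(). */
    static String render() {
        String xslt =
              "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "<xsl:template match='workorder'>"
            + "<xsl:value-of select='document(@workorderID, .)/workorder/title'/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
        String input = "<workorders><workorder workorderID='161803'/></workorders>";
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            transformer.setURIResolver(new LocalRestResolver(Map.of(
                    "http://workmgmt/workorders/161803",
                    "<workorder><title>Replace valve</title></workorder>")));
            StringWriter out = new StringWriter();
            // The system ID gives the in-memory input a base URI for document().
            transformer.transform(
                    new StreamSource(new StringReader(input), "http://workmgmt/workorders/"),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

Because the resolver returns null for anything outside the work-order prefix, requests to other services, such as the asset management URIs, would still go out over HTTP.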

When your REST service supports server-side XSLT transforms, it can sometimes be useful to use a REST handler that can act both as a URI resolver to handle local GET requests and as an HTTP handler to handle remote and non-GET requests.

Figure: Java and .NET UML for a REST handler

URI Structure

Relative URIs can help you maintain context and simplify requests when creating a report detail. While transforming the work-order information from http://workmgmt/workorder/161803/, you can obtain the detailed labor charging information located at http://workmgmt/workorder/161803/laborDetail simply by calling document('laborDetail',.).

The relative REST URIs in this article are generative URIs. Generative URIs have an understandable structure that can be used to build URIs to resources for which you do not already have an explicit link. This is in contrast to opaque URIs, which require an index, search service, or some other link topology in order to be known.

Even though generative URIs can be very useful in developing ad-hoc relationships between resources and extending services, reliance on generative URIs is justifiably controversial. A good reason for this controversy is that reliance on generative URIs is only as robust as the assumption that the URI structure will not have to change.

Putting It on Paper

XSLT can generate a number of output formats suitable for printing, including plain text, HTML, and the XML formats native to some newer word processing applications. We'll take a closer look at RTF and XSL-FO, which are widely supported, provide control over printed page layout, and allow embedded graphics.

RTF

At first glance, an RTF document appears to be an evil forest of 1980s text markup. It is, however, logical, well documented, and widely supported by a variety of word processors. Since RTF is entirely text-based, you can generate it directly from XSLT and send it to the client. With IE and MS Office, using the application/msword MIME type will display the RTF directly in the browser.
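
To give a feel for how little markup a minimal RTF document needs, here is a sketch that builds one from plain text. It is written in Java rather than XSLT only to keep it self-contained; the class name, font choice, and font size are arbitrary assumptions, and a stylesheet would emit the same characters from its templates.

```java
final class RtfText {

    /** Escapes plain text so it can be placed inside an RTF group. */
    static String escape(String text) {
        StringBuilder rtf = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c == '\\' || c == '{' || c == '}') {
                // Backslash and braces are RTF syntax and must be escaped.
                rtf.append('\\').append(c);
            } else if (c == '\n') {
                rtf.append('\\').append("par ");
            } else if (c > 127) {
                // Unicode escape: signed 16-bit decimal, then a fallback character.
                rtf.append('\\').append('u').append((int) (short) c).append('?');
            } else {
                rtf.append(c);
            }
        }
        return rtf.toString();
    }

    /** Wraps the text in a minimal document: header, one font, 12-point text. */
    static String document(String bodyText) {
        return "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times New Roman;}}"
             + "\\f0\\fs24 " + escape(bodyText) + "\\par}";
    }

    public static void main(String[] args) {
        System.out.println(document("Work order 161803 is over budget."));
    }
}
```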

The RTF specifications, which are bound to the releases of MS Word, are not light reading: Version 1.8 (Word 2003) is the most current at the time of this article, but Version 1.5 (Word 97) is probably the most appropriate to use in practice. Sean Burke's RTF Pocket Guide is a good and gentle introduction to the basics of RTF. Also, you can avoid a lot of RTF coding by building a template document in your favorite word processor, saving it as RTF, and carefully cutting and pasting it into your XSLT templates.

Despite the fact that it is cheap and fast, RTF is seldom my first choice when XSL-FO is also an option. Even though RTF can control printed output with decent precision, complex layouts or data-driven graphics can quickly become very complicated to express. Version issues, limited debugging feedback (malformed RTF often simply causes MS Word to crash), and tricky syntax make RTF a less-than-ideal match for XSLT and REST reporting.

Hello XSL-FO

XSL-FO is a W3C specification purposely designed for the job of generating printed output from XML data. A good introduction to XSL-FO is at Printing from XML: An Introduction to XSL-FO.

XSL-FO is normally not intended to be viewed directly by end users; rather, a post-processor reads the XSL-FO to produce paper-ready output. Typically, the output of these post-processors is PDF, but it can also be a bitmap, PostScript, or even RTF. These post-processors are often called Formatters or Formatting Object Processors (FOPs).

Currently, XSL-FO enjoys the support of FOPs from a number of organizations and vendors. At the time of this article, they include Adobe, Altsoft, Antenna House, Apache, Chive, Ecrion, RenderX, and Visual Programming. Since the input (XSL-FO) and output (PDF) of these FOPs are stable, we are finally in the enviable position of being able to swap vendors to meet our performance, platform, and pricing needs without breaking our reports. Many of these FOPs also support proprietary extensions, such as XML tags for charting or control over PDF-specific features like bookmarks.

Images, XSL-FO, and Formatting Object Processors (FOPs)

The XSL-FO specification does not directly deal with graphics. However, many FOPs also support graphics in the form of embedded SVG. SVG content can be exported from a graphics software package and placed in the XSLT, or SVG content can be produced from a data transformation, like a chart from numerical data or a map from GML.

Even though SVG is often associated with vector graphics, it can also be a vehicle for embedding photographs into your reports. Many APIs that serialize database or object data as XML will serialize binary image data as base-64 text. This image data can be embedded in an XSL-FO report by using XSLT and the following approach:

  • [1] Wrap the image's base-64 text in a URI using the data: URL scheme.
  • [2] Use the <svg:image> element to size and transform the image as necessary.
  • [3] Wrap the SVG document in a <fo:instream-foreign-object> element.

<!-- Inline image in XSL-FO -->
<!-- [3] The SVG document is wrapped in fo:instream-foreign-object -->
<fo:instream-foreign-object
  content-width="0.8in"
  content-height="0.6in">

  <!-- SVG data -->
  <svg:svg width="0.8in"
    height="0.6in"
    viewBox="0 0 640 480"
    preserveAspectRatio="xMidYMid">

    <!-- [2] Embedded JPEG image, sized and rotated as needed -->
    <!-- [1] xlink:href holds the data: URI that wraps the base-64 text -->
    <svg:image
      width="640"
      height="480"
      transform="rotate(90 320 240)"
      xlink:href=".."/>

  </svg:svg>

</fo:instream-foreign-object>
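
The data: URI in step [1] is plain string assembly. In a stylesheet it would typically be built with concat(); the sketch below shows the same construction in Java, assuming a JPEG image. The class and method names are hypothetical.

```java
import java.util.Base64;

final class DataUris {

    /** Wraps base-64 text, as serialized in the XML, in a data: URI. */
    static String fromBase64(String base64Text, String mediaType) {
        // Serializers often wrap base-64 across lines; remove the whitespace.
        String compact = base64Text.replaceAll("\\s+", "");
        return "data:" + mediaType + ";base64," + compact;
    }

    /** Produces the same kind of URI starting from raw image bytes. */
    static String fromBytes(byte[] imageBytes, String mediaType) {
        return fromBase64(Base64.getEncoder().encodeToString(imageBytes), mediaType);
    }

    public static void main(String[] args) {
        // The first three bytes of a JPEG file, used here as sample data.
        byte[] jpegStart = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
        System.out.println(fromBytes(jpegStart, "image/jpeg"));
    }
}
```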

FOPs vary greatly in their SVG support. If you are shopping for a FOP and you need graphics support, it is a good idea to run a number of large test documents through trial software before you make a purchase.

Getting the Most Out of Your FOP

FOPs often have APIs and connector kits that you can use to tie the FOP directly into your application or publishing framework. However, high-end FOPs are not cheap, and you may want to install your FOP as a service that can be shared by many applications, regardless of their native platform.

A straightforward approach is to use a RESTful FOP service, where you POST XSL-FO to the FOP service and get a binary PDF as a response.
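
A client for such a service might look like the following sketch, which uses the HTTP client that ships with Java 11 and later. The endpoint URI is supplied by the caller (the one in main is made up), and the Content-Type and Accept values are assumptions about what a particular FOP service expects.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class FopServiceClient {

    /** Builds the POST request that carries the XSL-FO document. */
    static HttpRequest buildRequest(String fopServiceUri, String xslFo) {
        return HttpRequest.newBuilder(URI.create(fopServiceUri))
                .header("Content-Type", "application/xml") // assumed media type
                .header("Accept", "application/pdf")
                .POST(HttpRequest.BodyPublishers.ofString(xslFo))
                .build();
    }

    /** Sends the XSL-FO and returns the response body, expected to be PDF bytes. */
    static byte[] renderPdf(String fopServiceUri, String xslFo)
            throws IOException, InterruptedException {
        HttpResponse<byte[]> response = HttpClient.newHttpClient().send(
                buildRequest(fopServiceUri, xslFo),
                HttpResponse.BodyHandlers.ofByteArray());
        if (response.statusCode() != 200) {
            throw new IOException("FOP service returned HTTP " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) {
        // Builds the request without sending it; the endpoint is hypothetical.
        HttpRequest request = buildRequest("http://fopserver/render", "<fo:root/>");
        System.out.println(request.method() + " " + request.uri());
    }
}
```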

Figure: FOP Service Diagram

Another RESTful approach is to pipeline the XSL-FO through the FOP server. This approach lets you bind a FOP to a service simply by creating a URI.

Figure: FOP Pipeline Diagram

Consider the URI to the PDF of the example report at the beginning of this article:
http://fopserver/pipe.ashx/workmgmt/workorders/?status=open&xsl=budgetReportFO.xsl.
The sequence of actions to produce this report would be:

  • [1] The client requests the report from the URI
    http://fopserver/pipe.ashx/workmgmt/workorders/?status=open&xsl=budgetReportFO.xsl.
  • [2] The FOP pipeline service, located at http://fopserver/pipe.ashx, uses the remainder of the URI to build a request to http://workmgmt/workorders/?status=open&xsl=budgetReportFO.xsl.
  • [3] The work management service applies the budgetReportFO.xsl stylesheet to the local data at http://workmgmt/workorders/?status=open and responds with XSL-FO data to the FOP server.
  • [4] The FOP server converts the XSL-FO to PDF and sends it to the client.
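
Step [2] in this sequence is a small URI rewrite. The following is a minimal sketch of that step, assuming the upstream service is always reached over plain HTTP and that the first path segment after the pipeline path names the upstream host; the class and method names are hypothetical.

```java
import java.net.URI;

final class FopPipeline {

    /** Turns a pipeline request URI into the URI of the upstream REST resource. */
    static String upstreamUri(String requestUri, String pipePath) {
        URI uri = URI.create(requestUri);
        String prefix = pipePath + "/";
        String path = uri.getRawPath();
        if (path == null || !path.startsWith(prefix) || path.length() == prefix.length()) {
            throw new IllegalArgumentException("Not a pipeline URI: " + requestUri);
        }
        // e.g. "workmgmt/workorders/": host "workmgmt", then path "/workorders/"
        String remainder = path.substring(prefix.length());
        String query = uri.getRawQuery();
        // The query string, including the xsl parameter, is passed through unchanged.
        return "http://" + remainder + (query == null ? "" : "?" + query);
    }

    public static void main(String[] args) {
        System.out.println(upstreamUri(
                "http://fopserver/pipe.ashx/workmgmt/workorders/?status=open&xsl=budgetReportFO.xsl",
                "/pipe.ashx"));
    }
}
```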

A Final Note About Security

Performance will normally be the primary driver in deploying FOPs, services, and XSLT engines; however, if access to some resources is restricted, security may become the controlling factor in the design of your reporting framework.

When restricted resources are part of a REST report, questions to consider are "How am I going to authenticate to the REST resources?" and "Whose credentials should be used?" Even though URIs provide a standard method of identifying resources, your resources may not have a shared approach for authentication. The document() function does not provide a mechanism for basic authentication, and you may have to choose another approach, such as using tickets or setting up your URI resolver to preauthenticate or support negotiated network authentication. Also be aware of whose credentials are being used by the XSLT engines and URI resolvers, since they may be able to access resources not intended to be viewable by the reports' end users.
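
As one possible shape for a preauthenticating resolver, the sketch below attaches an HTTP Basic Authorization header to every request it makes on behalf of document(). It assumes the REST resources accept Basic authentication and that a single service account is acceptable, which is exactly the kind of decision the questions above are meant to surface; the class name is hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

final class PreauthenticatingResolver implements URIResolver {

    private final String authorization;

    PreauthenticatingResolver(String user, String password) {
        this.authorization = basicAuthHeader(user, password);
    }

    /** Builds the value of an HTTP Basic Authorization header. */
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        try {
            URI target = (base == null || base.isEmpty())
                    ? URI.create(href)
                    : URI.create(base).resolve(href);
            // Caution: this sends the credentials to whatever host the stylesheet
            // names. A real resolver should restrict this to trusted hosts.
            URLConnection connection = target.toURL().openConnection();
            connection.setRequestProperty("Authorization", authorization);
            return new StreamSource(connection.getInputStream(), target.toString());
        } catch (IOException | IllegalArgumentException e) {
            throw new TransformerException("Cannot fetch " + href, e);
        }
    }
}
```

The resolver would be registered with Transformer.setURIResolver before the transform runs. Every document() call then uses the same account, regardless of which end user requested the report.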