Multi-Interface Web Services Made Easy
Editor's Note: Kip Hampton will be presenting a tutorial on Perl and XML at the upcoming O'Reilly Open Source Convention.
There is little doubt that the hype associated with web services has reached astronomical proportions. Notably missing from the current flood of information, however, is a nuts-and-bolts examination of how to build applications which provide both browser-based access for human users and programmatic access for automated clients. This month we will take a look at just how easy it is to build these multi-interface services using Perl and XML.
This column is not about the relative merits or weaknesses of SOAP, XML-RPC, or REST, nor will it attempt to address the reasons why you might choose one over another. The goal here is to demonstrate that, with a little forethought and a few Perl modules, you can easily create useful Web applications that can be accessed from any or all of these types of clients.
We have a lot to cover, so let's get straight to business.
For this month's sole code example we will build a Web interface to
my XML::SemanticDiff module. For those unfamiliar with it,
XML::SemanticDiff compares the contents of two XML
documents while ignoring common things like formatting differences
and uncommon things like otherwise identical elements and attributes
with divergent namespace prefixes bound to the same URI.
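To make that description concrete, here is a minimal sketch of calling the module directly. The three sample documents are invented for illustration; the constructor option and the `context` and `message` keys are the same ones the code later in this column relies on. Going by the description above, the first comparison should report nothing, but the script simply prints whatever the module returns.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::SemanticDiff;

# Illustrative documents (not from the sample code).
# The first two differ only in indentation and line breaks.
my $compact  = '<root><item id="1">text</item></root>';
my $indented = qq{<root>\n  <item id="1">text</item>\n</root>};

# In the third, an attribute value really has changed.
my $changed  = '<root><item id="2">text</item></root>';

# compare() returns a list of hash references, one per difference.
my @formatting_only =
    XML::SemanticDiff->new( keeplinenums => 1 )->compare( $compact, $indented );
my @real_change =
    XML::SemanticDiff->new( keeplinenums => 1 )->compare( $compact, $changed );

print 'Differences reported for formatting-only change: ',
    scalar(@formatting_only), "\n";

foreach my $change (@real_change) {
    print "$change->{context}: $change->{message}\n";
}
```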
A basic understanding of Christian Glahn's
CGI::XMLApplication is suggested before proceeding. It's
not a requirement, and you will still be able to follow along, but
having a more detailed introduction can only help. If you are
impatient and decide to skip reading the earlier column on that
module, it is enough to understand that the typical
CGI::XMLApplication application consists of three parts:
a tiny CGI script that connects the client to the application, a Perl
module that handles the heavy lifting, and one or more XSLT
stylesheets that transform the DOM tree returned from the Perl module
into something palatable for the requesting client.
Understanding CGI::XMLApplication's basic architecture is
important because, from a high level, it is exactly the same model
used by Paul Kulchenko's wildly popular
SOAP::Lite module. The key to providing simple, multi-client access lies in
understanding how these two fine modules can be used together.
First, we will look at the base module that both
CGI::XMLApplication and
SOAP::Lite will use to compare the files uploaded to the server:
package WebSemDiff;
use strict;
use CGI::XMLApplication;
use XML::SemanticDiff;
use XML::LibXML::SAX::Builder;
use XML::Generator::PerlData;
use vars qw( @ISA );
@ISA = qw( CGI::XMLApplication );
After importing the necessary modules and declaring the package's
inheritance from
CGI::XMLApplication, we need to implement the methods required to
make the browser interface work.
The browser interface has two states: a default state that prompts
the user to upload two XML documents to compare, and a result state
that shows the result of the comparison (or any errors that may have
occurred while processing). The selectStylesheet()
method returns the path to the appropriate stylesheet that will
transform the DOM tree built by our application. To keep things on
track we will not look at the two stylesheets
(semdiff_default.xsl and semdiff_result.xsl) themselves, but they are
available in this month's sample code if you are curious.
sub selectStylesheet {
    my ( $self, $context ) = @_;
    my $style = $context->{style} || 'default';
    my $style_path = '/www/site/stylesheets/';
    return $style_path . 'semdiff_' . $style . '.xsl';
}
By default, the required getDOM() method is expected to
return an XML::LibXML::Document object. This document
object, which we will create later, is transformed by the XSLT
stylesheet set by the selectStylesheet() method before
delivering the result to the browser.
sub getDOM {
    my ( $self, $context ) = @_;
    return $context->{domtree};
}
The getXSLParameter() method provides a way to pass
values from this class out to the stylesheets (the values are
available via <xsl:param> elements). Here we just push
all the request parameters, leaving it to the stylesheet to
pick and choose which fields are relevant.
sub getXSLParameter {
    my $self = shift;
    return $self->Vars;
}
With the low-level details out of the way we will now create the event callbacks which will be called in response to the state of the browser interface. Since the default state is a simple prompt that requires no application logic or special processing, we need only implement the callback for the result state.
# event registration and event callbacks
sub registerEvents {
    return qw( semdiff_result );
}
sub event_semdiff_result {
    my ( $self, $context ) = @_;
    my ( $file1, $file2, $error );
    my $fh1 = $self->upload('file1');
    my $fh3 = $self->upload('file2');
    $context->{style} = 'result';
After retrieving the filehandles that contain the uploaded XML documents and setting the appropriate style for the result state, we check that both handles are defined and, if so, read their contents into plain scalars.
    if ( defined( $fh1 ) and defined( $fh3 ) ) {
        # slurp mode: read each upload in one go
        local $/ = undef;
        $file1 = <$fh1>;
        $file2 = <$fh3>;
Next we create the DOM tree that contains the results of the
comparison by calling the compare_as_dom()
method. Wrapping this call in an eval block ensures that
we can safely capture any parsing errors encountered while processing
the uploaded documents. We will look at the details of the
compare_as_dom() and
dom_from_data() methods shortly.
        eval {
            $context->{domtree} = $self->compare_as_dom( $file1, $file2 );
        };
        if ( $@ ) {
            $error = $@;
        }
    }
    else {
        $error = 'You must select two XML files to compare ' .
                 'and wait for them to finish uploading';
    }
    if ( $error ) {
        $context->{domtree} = $self->dom_from_data( { error => $error } );
    }
The compare_as_dom() method returns undef
if the two documents are identical. If no DOM object was returned and
no error occurred, we create a document with a single
<message> element telling the user that the
documents are semantically the same.
    unless ( defined( $context->{domtree} )) {
        my $msg = "Files are semantically identical.";
        $context->{domtree} = $self->dom_from_data( { message => $msg } );
    }
}
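The callback above has three possible outcomes: an error, a set of differences, or an "identical" message. The following stand-alone sketch isolates that control flow using only core Perl. Both function names are hypothetical, and risky_compare() is a deliberately crude stand-in for compare_as_dom(), not the real comparison. The point is only to show that a true value in $@ after the eval, rather than the eval's return value, is what separates a failure from a legitimate undef result.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for compare_as_dom(): dies on input that does not
# look like XML, returns undef when the inputs match, a hash ref otherwise.
sub risky_compare {
    my ( $xml1, $xml2 ) = @_;
    for my $doc ( $xml1, $xml2 ) {
        die "not well-formed: missing root element\n" unless $doc =~ /<\w/;
    }
    return $xml1 eq $xml2 ? undef : { difference => 'documents differ' };
}

# Mirrors the three outcomes handled in event_semdiff_result().
sub safe_compare {
    my ( $xml1, $xml2 ) = @_;
    my $result = eval { risky_compare( $xml1, $xml2 ) };
    if ( my $error = $@ ) {
        chomp $error;
        return { error => $error };
    }
    # undef here means "no differences", not "failure"
    return $result // { message => 'Files are semantically identical.' };
}

for my $pair ( [ '<a/>', '<a/>' ], [ '<a/>', '<b/>' ], [ '<a/>', 'oops' ] ) {
    my $outcome = safe_compare( @{$pair} );
    my ($kind)  = keys %{$outcome};
    print "$kind: $outcome->{$kind}\n";
}
```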
Having completed the single event callback we can move on to writing the core methods which both it and the SOAP dispatcher will share.
First, we will create the compare() method. Not much
more than a wrapper for the XML::SemanticDiff method of
the same name, it accepts two scalars containing the XML documents to
be compared and returns the results, if any, as an array reference.
sub compare {
    my $self = shift;
    my ( $xmlstring1, $xmlstring2 ) = @_;
    my $diff = XML::SemanticDiff->new( keeplinenums => 1 );
    my @results = $diff->compare( $xmlstring1, $xmlstring2 );
    return \@results;
}
We will finish up the WebSemDiff class with a couple of
handy convenience methods.
The dom_from_data() method creates an
XML::LibXML::Document object (an XML document in the
form of a DOM tree) by processing a reference to any common Perl data
structure through XML::Generator::PerlData and hooking
that generator to XML::LibXML::SAX::Builder to populate
the tree. Recall that we call this method in the result event
callback to create the DOM tree containing the appropriate messages
if an error occurred, or if the documents being compared are
identical.
sub dom_from_data {
    my ( $self, $ref ) = @_;
    my $builder = XML::LibXML::SAX::Builder->new();
    my $generator = XML::Generator::PerlData->new( Handler => $builder );
    my $dom = $generator->parse( $ref );
    return $dom;
}
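To see what this method produces without going through the Web interface, the same generator-to-builder pipeline can be run from a short script. This is a sketch that reuses the method body as a plain function; the error text is made up for illustration. The root element name is left at the generator's default, which appears as <document> in the pass-through output shown later in this column.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::SAX::Builder;
use XML::Generator::PerlData;

# Same pipeline as the dom_from_data() method, written as a plain function.
sub dom_from_data {
    my ($ref) = @_;
    my $builder   = XML::LibXML::SAX::Builder->new();
    my $generator = XML::Generator::PerlData->new( Handler => $builder );
    return $generator->parse($ref);
}

# Illustrative data: the kind of structure the error branch passes in.
my $dom = dom_from_data( { error => 'Illustrative error text' } );

# Serialize the resulting DOM tree with indentation.
print $dom->toString(1);
```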
Finally, we will create the compare_as_dom() method. A
simple wrapper for the last two methods, it returns the results of a
semantic comparison between two documents as a DOM object.
sub compare_as_dom {
    my $self = shift;
    my $diff_messages = $self->compare( @_ );
    return undef unless scalar( @{$diff_messages} ) > 0;
    return $self->dom_from_data( { difference => $diff_messages } );
}
1;
With the foundation now in place, we need only create the CGI script
that will provide access to the various clients, which is where the
architectural overlap between CGI::XMLApplication and
SOAP::Lite really pays off.
#!/usr/bin/perl -w
use strict;
use SOAP::Transport::HTTP;
use WebSemDiff;
if ( defined( $ENV{'HTTP_SOAPACTION'} )) {
    SOAP::Transport::HTTP::CGI
        -> dispatch_to('WebSemDiff')
        -> handle;
}
else {
    my $app = WebSemDiff->new();
    $app->run();
}
Yes. That's all there is to it.
SOAP::Lite's dispatch_to() method connects
the SOAP plumbing to a given module (or directory of modules). In
this case, it allows us to reuse the same
WebSemDiff class that also implements the browser
interface. Sharing that module means that the publicly visible CGI is
nothing more than a request broker that provides access to the methods
in a single application class based on the type of client making the
connection. Users accessing the application through a Web browser are
prompted to upload two XML files, and the posted data is run through
the compare_as_dom() method to obtain the result, while
SOAP clients have direct access to
compare_as_dom(), the lower-level
compare(), and the other methods.
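The broker's decision can be pulled out into a small function that is easy to exercise without a Web server. The sketch below uses only core Perl; interface_for() is a hypothetical name, and the hash it receives stands in for %ENV. The SOAPAction value shown is illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper that mirrors the test in the CGI script: a request
# carrying a SOAPAction header goes to the SOAP dispatcher, anything else
# is treated as a browser request.
sub interface_for {
    my ($env) = @_;
    return defined $env->{HTTP_SOAPACTION} ? 'soap' : 'browser';
}

# Illustrative environments, as a Web server might set them up.
my %soap_request    = ( HTTP_SOAPACTION => '"http://my.host.tld/WebSemDiff#compare"' );
my %browser_request = ( REQUEST_METHOD  => 'GET' );

print interface_for( \%soap_request ),    "\n";   # soap
print interface_for( \%browser_request ), "\n";   # browser
```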
Now that we have a working (if not totally complete and sanity-checked) application, let's connect a few clients to it, compare two XML documents, and check out the results.
In the interest of clarity we will keep the documents being compared
simple. We'll call the first doc1.xml
<?xml version="1.0"?>
<root>
  <el1 el1attr="good"/>
  <el2 el2attr="good">Some Text</el2>
  <el3/>
</root>
and the second, doc2.xml
<?xml version="1.0"?>
<root>
  <el1 el1attr="bad"/>
  <el2 bogus="true"/>
  <el4>Rogue</el4>
</root>
A request to /cgi-bin/semdiff.cgi prompts the user to
upload two documents. After the files are compared, the results are
displayed in the browser.
SOAP::Lite provides both a server and a client
implementation. We will use it here to create the client that
connects to the SOAP interface of our application. For brevity's
sake we will skip over the parts of the client script that are
concerned with argument processing and with opening and reading the
XML files to be compared, and focus on the SOAP-related parts. The
complete script is available in this month's sample code as
soap_semdiff1.pl.
#!/usr/bin/perl -w
use strict;
use SOAP::Lite;
...
my $soap = SOAP::Lite
    -> uri('http://my.host.tld/WebSemDiff')
    -> proxy('http://my.host.tld/cgi-bin/semdiff.cgi')
    -> on_fault( \&fatal_error );
my $result = $soap->compare( $file1, $file2 )->result;
print "Comparing $f1 and $f2...\n";
if ( defined $result and scalar( @{$result} ) == 0 ) {
    print "Files are semantically identical\n";
    exit;
}
foreach my $diff ( @{$result} ) {
    print $diff->{context} . ' ' .
          $diff->{startline} . ' - ' .
          $diff->{endline} . ' ' .
          $diff->{message} .
          "\n";
}
Passing this script the paths to our two tiny XML documents produces the following result:
Comparing docs/doc1.xml and docs/doc2.xml...
/root[1]/el1[1] 3 - 3 Attribute 'el1attr' has different value in element 'el1'.
/root[1]/el2[1] 4 - 4 Character differences in element 'el2'.
/root[1]/el2[1] 4 - 4 Attribute 'el2attr' missing from element 'el2'.
/root[1]/el2[1] 4 - 4 Rogue attribute 'bogus' in element 'el2'.
/root[1] 5 - 5 Child element 'el3' missing from element '/root[1]'.
/root[1] 5 - 5 Rogue element 'el4' in element '/root[1]'.
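The client hands a fatal_error callback to on_fault(), but its body lives in the part of the script skipped here. The following is a sketch of what such a handler might look like, not the version from the sample code. It assumes the calling convention described in the SOAP::Lite documentation: the callback receives the client object and either a response object (when the server returned a SOAP fault) or a plain value (when the transport itself failed). The fault_message() helper is a hypothetical name introduced to keep the logic testable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn the arguments passed to an on_fault callback
# into a human-readable message. A reference in $res means the server
# answered with a SOAP fault; anything else is a transport-level failure.
sub fault_message {
    my ( $soap, $res ) = @_;
    return ref $res
        ? 'SOAP fault: ' . $res->faultstring
        : 'Transport error: ' . $soap->transport->status;
}

# Illustrative handler with the name used by the client script.
sub fatal_error {
    die fault_message(@_) . "\n";
}

# Wiring it up would look like the client code in this column:
#   my $soap = SOAP::Lite
#       -> uri('http://my.host.tld/WebSemDiff')
#       -> proxy('http://my.host.tld/cgi-bin/semdiff.cgi')
#       -> on_fault( \&fatal_error );
```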
As an alternative, we could use SOAP::Lite's
autodispatch mechanism to make the code a little easier
to read:
use SOAP::Lite +autodispatch =>
    uri => 'http://my.host.tld/WebSemDiff',
    proxy =>'http://my.host.tld/cgi-bin/semdiff.cgi',
    on_fault => \&fatal_error ;
my $result = SOAP->compare( $file1, $file2 );
print "Comparing $f1 and $f2...\n";
# etc ..
Fans of the
REST Architecture will appreciate the fact that our application
(and indeed, any application built using
CGI::XMLApplication) offers the ability to access the
untransformed XML used to create the browser interface by including a
"pass thru" parameter either in the query string of a
GET request, or as a POSTed field.
#!/usr/bin/perl -w
use strict;
use HTTP::Request::Common;
use LWP::UserAgent;
my ( $f1, $f2 ) = @ARGV;
usage() unless defined $f1 and -f $f1
           and defined $f2 and -f $f2;
my $ua = LWP::UserAgent->new;
my $uri = "http://my.host.tld/cgi-bin/semdiff.cgi";
my $req = HTTP::Request::Common::POST( $uri,
    Content_Type => 'form-data',
    Content => [
        file1 => [ $f1 ],
        file2 => [ $f2 ],
        passthru => 1,
        semdiff_result => 1,
    ]
);
my $result = $ua->request( $req );
if ( $result->is_success ) {
    print $result->content;
}
else {
    warn "Request Failure: " . $result->message . "\n";
}
sub usage {
    die "Usage:\nperl $0 file1.xml file2.xml \n";
}
This script (restful_semdiff.pl in the sample code) prints the following XML document to STDOUT
(formatted here for readability):
<?xml version="1.0" encoding="UTF-8"?>
<document>
  <difference>
    <context>/root[1]/el1[1]</context>
    <message>
      Attribute 'el1attr' has different
      value in element 'el1'.
    </message>
    <startline>3</startline>
    <endline>3</endline>
  </difference>
  <difference>
    <context>/root[1]/el2[1]</context>
    <message>
      Character differences in element 'el2'.
    </message>
    <startline>4</startline>
    <endline>4</endline>
  </difference>
  ...
</document>
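A client that receives this document is free to process it however it likes. As one possibility, the sketch below parses the two <difference> elements shown above with XML::LibXML and prints them in the same one-line format as the SOAP client. The XML is embedded in the script so that it runs without a server; a real client would feed it the response content instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# The first two differences from the pass-through output, embedded here
# so the sketch is self-contained.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<document>
  <difference>
    <context>/root[1]/el1[1]</context>
    <message>
      Attribute 'el1attr' has different
      value in element 'el1'.
    </message>
    <startline>3</startline>
    <endline>3</endline>
  </difference>
  <difference>
    <context>/root[1]/el2[1]</context>
    <message>
      Character differences in element 'el2'.
    </message>
    <startline>4</startline>
    <endline>4</endline>
  </difference>
</document>
XML

my $doc = XML::LibXML->load_xml( string => $xml );

my @lines;
foreach my $diff ( $doc->findnodes('/document/difference') ) {
    # normalize-space() collapses the line breaks inside <message>
    push @lines, sprintf '%s %s - %s %s',
        $diff->findvalue('context'),
        $diff->findvalue('startline'),
        $diff->findvalue('endline'),
        $diff->findvalue('normalize-space(message)');
}

print "$_\n" for @lines;
```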
Careful readers will have noticed that we did not touch on
XML-RPC at all. There are two
reasons. First, the XML-RPC client and server interfaces provided by
SOAP::Lite are nearly identical to those used for SOAP,
so showing the example code would add little value to the overall
package. Second, unlike SOAP clients, XML-RPC clients have no
standardized, unambiguous HTTP header associated with their
requests. This means that our CGI request broker would have to resort
to some level of voodoo to differentiate between XML-RPC clients and
regular Web browsers. Detecting XML-RPC requests might be possible by
checking for a combination of a POST request and a
Content-Type of "text/xml", but, at best, this solution
seems brittle and naive and would only cloud the example code
(assuming it works at all). If you know a more robust way to detect
requests from XML-RPC clients, please share your knowledge by posting
a comment to this article.
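For the curious, the heuristic described above might be sketched as follows. It is exactly as brittle as the paragraph suggests: any client that POSTs a body labeled text/xml without a SOAPAction header would be classified as XML-RPC. The function name is hypothetical, the hash stands in for %ENV, and nothing here is part of the sample application.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical classifier extending the broker's SOAPAction test with the
# POST + text/xml guess for XML-RPC. SOAP is checked first because SOAP
# requests can also arrive as POSTed text/xml.
sub classify_request {
    my ($env) = @_;
    return 'soap' if defined $env->{HTTP_SOAPACTION};

    my $method = uc( $env->{REQUEST_METHOD} // '' );
    my $type   = lc( $env->{CONTENT_TYPE}   // '' );
    return 'xmlrpc' if $method eq 'POST' and $type =~ m{^text/xml\b};

    return 'browser';
}

# Illustrative environments.
my @requests = (
    { REQUEST_METHOD => 'POST', CONTENT_TYPE => 'text/xml', HTTP_SOAPACTION => '""' },
    { REQUEST_METHOD => 'POST', CONTENT_TYPE => 'text/xml; charset=utf-8' },
    { REQUEST_METHOD => 'POST', CONTENT_TYPE => 'multipart/form-data; boundary=xYz' },
    { REQUEST_METHOD => 'GET' },
);

print classify_request($_), "\n" for @requests;
```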
We've covered a lot of ground this month and have glossed over a number of details in an effort to keep things focused. The complete, working application and all client examples are available in the sample code if you need clarification.
Putting aside the debates about which architecture is best for
implementing automated Web services, or whether or not those services
add anything new to Web technology, the bottom line is that if you do
the Web for a living, chances are good that you will be asked about
your knowledge of Web services. It is my sincere hope that this
introduction to how SOAP::Lite and
CGI::XMLApplication can be combined to create clean,
modular solutions that support access via SOAP, REST, and HTML
browser will give you a head start.
SOAP::Lite Homepage
XML-RPC Info Pages
XML::SemanticDiff Documentation
XML.com Copyright © 1998-2006 O'Reilly Media, Inc.