Multi-Interface Web Services Made Easy
by Kip Hampton
SOAP::Lite's dispatch_to() method connects
the SOAP plumbing to a given module (or directory of modules). In
this case, it allows us to reuse the same
WebSemDiff class that also implements the browser
interface. Sharing that module means that the publicly visible CGI is
nothing more than a request broker that provides access to the methods
in a single application class based on the type of client making the
connection. Users accessing the application through a Web browser are
prompted to upload two XML files and the posted data is run through
the compare_as_dom() method to obtain the result while
SOAP clients have direct access to
compare_as_dom, as well as the lower-level
compare(), and other methods.
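The request-broker idea can be sketched in a few lines of Perl. This is only an illustration, not the actual semdiff.cgi from the sample code: it assumes that SOAP clients send a SOAPAction HTTP header (which a CGI script sees as HTTP_SOAPACTION), and the names client_type, route, and %handler_for are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: classify a request from its CGI environment.
# Assumption: SOAP clients send a SOAPAction header, which CGI exposes
# as HTTP_SOAPACTION (it may be an empty string, so test definedness).
sub client_type {
    my ($env) = @_;
    return defined $env->{HTTP_SOAPACTION} ? 'soap' : 'browser';
}

# Placeholder handlers. In a real broker these would hand the request
# to the SOAP dispatcher or to the browser interface respectively.
my %handler_for = (
    soap    => sub { 'hand the request to the SOAP dispatcher' },
    browser => sub { 'run the HTML browser interface' },
);

# Pick the handler that matches the type of client and call it.
sub route {
    my ($env) = @_;
    return $handler_for{ client_type($env) }->();
}

print route( \%ENV ), "\n";
```

Either handler ends up calling into the same application class, which is what keeps the public CGI script so small.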
Now that we have a working (if not totally complete and sanity-checked) application, let's connect a few clients to it, compare two XML documents, and check out the results.
In the interest of clarity we will keep the documents being compared
simple. We'll call the first doc1.xml
<?xml version="1.0"?>
<root>
  <el1 el1attr="good"/>
  <el2 el2attr="good">Some Text</el2>
  <el3/>
</root>
and the second, doc2.xml
<?xml version="1.0"?>
<root>
  <el1 el1attr="bad"/>
  <el2 bogus="true"/>
  <el4>Rogue</el4>
</root>
Access From Web Browser
A request to /cgi-bin/semdiff.cgi prompts the user to
upload two documents and, after the files are compared, displays the results.
Access From A SOAP Client
SOAP::Lite provides both a server and a client
implementation. We will use it here to create the client that
connects to the SOAP interface of our application. For brevity's
sake we will skip over the parts of the client script that are
concerned with argument processing, opening and reading the XML
files to be compared, and focus on the SOAP-related parts. The complete
script is available in this month's sample code as
soap_semdiff1.pl.
#!/usr/bin/perl -w
use strict;
use SOAP::Lite;
...
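# uri() names the service namespace; proxy() gives the endpoint URL.
my $soap = SOAP::Lite
    -> uri('http://my.host.tld/WebSemDiff')
    -> proxy('http://my.host.tld/cgi-bin/semdiff.cgi')
    -> on_fault( \&fatal_error );

# result() returns the deserialized return value of the remote call.
my $result = $soap->compare( $file1, $file2 )->result;
print "Comparing $f1 and $f2...\n";

# An empty list of differences means the documents match.
if ( defined $result and scalar( @{$result} ) == 0 ) {
    print "Files are semantically identical\n";
    exit;
}

# Each difference is a hash with a context, a line range, and a message.
foreach my $diff ( @{$result} ) {
    print $diff->{context}   . ' ' .
          $diff->{startline} . ' - ' .
          $diff->{endline}   . ' ' .
          $diff->{message}   .
          "\n";
}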
Passing this script the paths to our two tiny XML documents produces the following result:
Comparing docs/doc1.xml and docs/doc2.xml...
/root[1]/el1[1] 3 - 3 Attribute 'el1attr' has different value in element 'el1'.
/root[1]/el2[1] 4 - 4 Character differences in element 'el2'.
/root[1]/el2[1] 4 - 4 Attribute 'el2attr' missing from element 'el2'.
/root[1]/el2[1] 4 - 4 Rogue attribute 'bogus' in element 'el2'.
/root[1] 5 - 5 Child element 'el3' missing from element '/root[1]'.
/root[1] 5 - 5 Rogue element 'el4' in element '/root[1]'.
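The fatal_error callback passed to on_fault() is not shown in the excerpt above. A minimal sketch of what such a handler might look like follows. It relies on the documented SOAP::Lite convention that the callback receives the client object and either a response object (for a SOAP fault) or a plain value (for a transport failure). The helper name fault_message is hypothetical, and the version in the sample code may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the arguments SOAP::Lite hands to an
# on_fault callback into a readable message.
sub fault_message {
    my ( $soap, $res ) = @_;
    return ref $res
        ? 'SOAP fault: ' . $res->faultstring              # server returned a fault
        : 'Transport error: ' . $soap->transport->status; # HTTP-level failure
}

# Assumed shape of the handler: report the problem and stop.
sub fatal_error {
    die fault_message(@_) . "\n";
}
```

Keeping the message building separate from the die makes the handler easy to reuse in a client that would rather log the problem and carry on.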
As an alternative, we could use SOAP::Lite's
autodispatch mechanism to make the code a little easier
to read:
use SOAP::Lite +autodispatch =>
    uri      => 'http://my.host.tld/WebSemDiff',
    proxy    => 'http://my.host.tld/cgi-bin/semdiff.cgi',
    on_fault => \&fatal_error;

my $result = SOAP->compare( $file1, $file2 );
print "Comparing $f1 and $f2...\n";
# etc ..
Access From A RESTful Client
Fans of the
REST architecture will appreciate the fact that our application
(and indeed, any application built using
CGI::XMLApplication) offers the ability to access the
untransformed XML used to create the browser interface by including a
"passthru" parameter either in the query string of a
GET request, or as a POSTed field.
#!/usr/bin/perl -w
use strict;
use HTTP::Request::Common;
use LWP::UserAgent;

my ( $f1, $f2 ) = @ARGV;
usage() unless defined $f1 and -f $f1
           and defined $f2 and -f $f2;

my $ua  = LWP::UserAgent->new;
my $uri = "http://my.host.tld/cgi-bin/semdiff.cgi";
my $req = HTTP::Request::Common::POST( $uri,
    Content_Type => 'form-data',
    Content      => [
        file1          => [ $f1 ],
        file2          => [ $f2 ],
        passthru       => 1,
        semdiff_result => 1,
    ]
);

my $result = $ua->request( $req );
if ( $result->is_success ) {
    print $result->content;
}
else {
    warn "Request Failure: " . $result->message . "\n";
}

sub usage {
    die "Usage:\nperl $0 file1.xml file2.xml \n";
}
This script (restful_semdiff.pl in the sample code) prints the following XML document to STDOUT
(formatted here for readability).
<?xml version="1.0" encoding="UTF-8"?>
<document>
  <difference>
    <context>/root[1]/el1[1]</context>
    <message>
      Attribute 'el1attr' has different
      value in element 'el1'.
    </message>
    <startline>3</startline>
    <endline>3</endline>
  </difference>
  <difference>
    <context>/root[1]/el2[1]</context>
    <message>
      Character differences in element 'el2'.
    </message>
    <startline>4</startline>
    <endline>4</endline>
  </difference>
  ...
</document>
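A client will usually want to do more with this XML than print it. The following sketch, which assumes the XML::LibXML module is installed, turns a document of this shape into the same list of hashes that the SOAP client receives. The function name parse_differences is hypothetical, and the sample string simply repeats the first difference shown above.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: convert the pass-through XML into a reference
# to an array of hashes, one per <difference> element.
sub parse_differences {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml( string => $xml );
    my @diffs;
    for my $node ( $doc->findnodes('/document/difference') ) {
        my %diff;
        for my $field (qw( context message startline endline )) {
            # normalize-space() trims and collapses the whitespace
            # that pretty-printing leaves inside <message>.
            $diff{$field} = $node->findvalue("normalize-space($field)");
        }
        push @diffs, \%diff;
    }
    return \@diffs;
}

my $sample = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<document>
  <difference>
    <context>/root[1]/el1[1]</context>
    <message>
      Attribute 'el1attr' has different
      value in element 'el1'.
    </message>
    <startline>3</startline>
    <endline>3</endline>
  </difference>
</document>
XML

for my $d ( @{ parse_differences($sample) } ) {
    print "$d->{context} $d->{startline} - $d->{endline} $d->{message}\n";
}
```

In the REST client above, the sample string would be replaced by $result->content.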
Conclusions
Careful readers will have noticed that we did not touch on
XML-RPC at all. There are two
reasons. First, the XML-RPC client and server interfaces provided by
SOAP::Lite are nearly identical to those used for SOAP,
so showing the example code would add little value to the overall
package. Second, unlike SOAP clients, XML-RPC clients have no
standardized, unambiguous HTTP header associated with their
requests. This means that our CGI request broker would have to resort
to some level of voodoo to differentiate between XML-RPC clients and
regular Web browsers. Detecting XML-RPC requests might be possible by
checking for a combination of a POST request and a
Content-Type of "text/xml", but, at best, this solution
seems brittle and naive and would only cloud the example code
(assuming it works at all). If you know a more robust way to detect
requests from XML-RPC clients, please share your knowledge by posting
a comment to this article.
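For the curious, the heuristic described above might be sketched as follows. It is shown only to make the idea concrete and carries all of the caveats already mentioned. The function name might_be_xmlrpc is hypothetical, and the extra check for a SOAPAction header is an assumption added here, since SOAP requests are also POSTs that typically carry text/xml.

```perl
use strict;
use warnings;

# Hypothetical and admittedly brittle: guess whether a CGI request
# comes from an XML-RPC client by looking at its environment.
sub might_be_xmlrpc {
    my ($env) = @_;
    my $method = defined $env->{REQUEST_METHOD} ? $env->{REQUEST_METHOD} : '';
    my $type   = defined $env->{CONTENT_TYPE}   ? $env->{CONTENT_TYPE}   : '';

    # Assumption: a request carrying a SOAPAction header is SOAP.
    return 0 if defined $env->{HTTP_SOAPACTION};

    # A POST whose media type is text/xml (parameters such as charset
    # are allowed) is treated as a possible XML-RPC call.
    return ( $method eq 'POST' and $type =~ m{^text/xml\s*(?:;|$)}i ) ? 1 : 0;
}

my %request = ( REQUEST_METHOD => 'POST', CONTENT_TYPE => 'text/xml' );
print might_be_xmlrpc( \%request ) ? "possibly XML-RPC\n" : "something else\n";
```

Any other client that POSTs text/xml would be misclassified, which is exactly why the approach is not used in the example application.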
We've covered a lot of ground this month and have glossed over a number of details in an effort to keep things focused. The complete, working application and all client examples are available in the sample code if you need clarification.
Putting aside the debates about which architecture is best for
implementing automated Web services, or whether or not those services
add anything new to Web technology, the bottom line is that if you do
the Web for a living, chances are good that you will be asked about
your knowledge of Web services. It is my sincere hope that this
introduction to how SOAP::Lite and
CGI::XMLApplication can be combined to create clean,
modular solutions that support access via SOAP, REST, and HTML
browser will give you a head start.
Resources
- Download the sample code.
- XML and Modern CGI Applications
- The SOAP::Lite Homepage
- The XML-RPC Info Pages
- REST Wiki Pages
- Additional XML::SemanticDiff Documentation
Have you developed any techniques for easily building web applications with multiple interfaces? Share your experience in our forum.
- Security
2002-05-09 03:32:10 Chris Morris
This code very elegantly passes user-supplied data to deeper and deeper levels, without any filtering.
1) return $style_path . 'semdiff_' . $style . '.xsl';
can be attacked with the double dot/poisoned null exploit..
2) sub getXSLparameter passes user data to XSLT style sheets without examining it. Stylesheet authors are not used to looking at security issues - what if they use one of these parameters as the name of an included file?
3) SOAP::Lite has a serious security hole, see
http://www.phrack.com/show.php?p=58&a=9
Is it responsible to publish code with so many security deficiencies? If your aim is to illustrate how to use some new technologies, you could at least put a comment in:
#TODO validate data here
Your fundamental point - how easy it is to wrap an existing service for remote access - is exactly the source of the security issue, which is "What facilities do we want to expose?". The big advantage of the REST paradigm is it invites you to begin by thinking about that.
- Security
2002-05-09 06:59:31 Kip Hampton
chrishmorris wrote:
"return $style_path . 'semdiff_' . $style . '.xsl'; can be attacked with the double dot/poisoned null exploit.."
Careful examination reveals that the $style_path property is set internally within the unexposed application class ( via $context->{style} ) and is at no time exposed to the world (or based on data passed in from the user).
"sub getXSLparameter passes user data to XSLT style sheets without examining it. Stylesheet authors are not used to looking at security issues - what if they use one of these parameters as the name of an included file?"
Putting aside your presumptions about what XSLT stylesheet authors typically do or think about, no external document can be included (accidentally or otherwise) into a stylesheet via an <xsl:param/> element. Also, any key/value pairs passed to a stylesheet processor (which I'm sure you're aware is neither a script interpreter, nor able to call other executables) that are not explicitly addressed in the stylesheet *by name* are ignored so, assuming that one did hack a "mystery field" into the POST, it would have no effect whatsoever on the stylesheet transformation, or its result.
That said, yes, it is good practice for production code to explicitly pass only the data that the stylesheet requires to the XSLT processor.
"SOAP::Lite has a serious security hole, see http://www.phrack.com/show.php?p=58&a=9"
Which is fixed in the current version (.55). See soaplite.com
"If your aim is to illustrate how to use some new technologies, you could at least put a comment in:
#TODO validate data here"
You mean like: "Now that we have a working (if not totally complete and sanity-checked) application..." ? Maybe I need to dust off the <blink> tag?
Yes, it is true, in this column I often do presume that the reader is smart enough to take the code samples as intended; that is, as merely illustrative of a specific concept and not something to be dropped as-is into production. I also realize that treating my readers as capable peers that do not require each jot and tittle to be pre-chewed for them puts me at odds with some of the accepted conventions of technical writing.
You do raise a very good point about security and "web services". Diligence in this area is key, especially as more and more services become available.
Thanks for reading, and for your comments.
-kip
- Middleware and software contracts
2002-05-13 04:31:55 Chris Morris
Thank you for your thoughtful reply.
You rightly object to my statement that "Stylesheet authors are not used to looking at security issues". It isn't a question of anyone's skills or character - it is a question of the division of responsibilities between modules.
As you say, <xsl:include> does not accept parameters. However, document() and <xsl:document> do, and unfortunately some people work around the limitation in include by writing a stylesheet whose output is another stylesheet. So it seems likely that there will be some exploitable XSLT created.
Your perl script knows that the parameters are user input, and the XSLT will know whether one of them is being used as a file path. One or the other has to clean the input. Since the perl is taking care of / hiding the network interface, it seems to me that it is responsible for security.
I agree that $context->{style} can't be exploited in your script. But you are publishing model code here. The context object that CGI::XMLApplication uses has the power and risk of global variables. If someone decides for other reasons to copy the CGI parameters into $context - much like you did with the XSLT call - then this could create an exploit.
Thinking about these issues has made me reconsider a script I recently wrote which attempts to match the CGI parameter names to SQL column names.
I think the issue is one of middleware design, and goes wider than security. To make middleware powerful, we try to make it transparent. We look for a design that enforces few preconditions, and promises few postconditions. If we succeed, any design contract must be agreed between the outer layers of the sandwich. Unfortunately such contracts can get neglected. At least, the documentation of the middleware must point out to its clients which responsibilities still lie with them.
It is good to hear that the problem in SOAP::Lite is fixed. I think it came about by starting with considerations of power, transparency, and elegance, instead of starting with the question "What do we want this component to do?".
I agree that you couldn't cover everything in a short article ... I hope you agree that published code is fair game for criticism!
