January 16, 2002
This week the XML-Deviant dips into both the XML-DEV and xml-dist-app mailing lists to examine claims that the weight of web services is about to bring your network to its knees.
An Evil Secret
This discussion has spilled over between XML-DEV and the xml-dist-app mailing list, the public forum for discussion of the XML Protocol Working Group. The discussion was prompted by the publication of a recent opinion piece on ZDNet. The article, while generally in favor of SOAP as a means to integrate applications, claims that there is an "evil little secret" hidden in the closet of the web services community.
The so-called secret is that because SOAP is so verbose (a "fat protocol" as the author termed it) it causes a great deal of network overhead. And that's not all. Because XML is plain text it's insecure. So you're going to suffer a performance hit when encrypting and decrypting all these heavyweight transmissions. The misconception that textual markup is inefficient and binary is much better has long been a debating point in the community. And not surprisingly this article has generated a fair amount of feedback on both mailing lists.
The resulting discussion is interesting in two regards: first, it provides some objective feedback on the points raised in the article and, second, it's highlighted some issues with web services that could use additional scrutiny.
Not Fat, Just Big Boned
The initial disagreements were over whether SOAP really does introduce overheads simply because it's XML. Commenting on XML-DEV, Tim Bray reiterated a recurrent theme from previous discussions, that it's overall system performance that counts.
What you care about is the performance of the whole system. What proportion of that performance is due to the delays in pumping the RPC messages back and forth, and which proportion is consumed by business logic at the endpoints of the transaction? When somebody does some quantitative work showing that in a significant real-world application, the number is high enough to be a problem worth addressing, then it's worth addressing.
Members of xml-dist-app were also quick to point out that other architectural constraints were likely to introduce limiting factors far more quickly than too many pointy brackets. In response to general claims about the overhead of SOAP services, Joshua Allen pointed out that these are true of any RPC system:
Take "XML" out of this, and everything you say is still true. Synchronous RPC over a WAN is rough, no matter how it is encoded. I don't see any evidence at all that the RPC people wrap with XML is worse than the crazy things people do with RMI and DCOM.
Encoding is one of the least important issues in most cases, so if you are implying that finding a less "fat" encoding will solve the problems you mention, you are misleading people.
Disregarding the assertion that a binary format would be more efficient, Jeff Bone also suggested that RPC, particularly finely-grained calls, suffers from some general scaling problems.
...Some bright guys at IBM Almaden have demonstrated that the performance impact of XML encoding vs. an optimized binary encoding (for example, previously posted here and elsewhere) is usually essentially insignificant compared to other performance considerations.
The problem is, I would assume, that "Web services" tend to be chatty, with lots of little round trips and a subtle statefulness between these individual communications. And that's a function of failing to realize that the API call model isn't well-suited to building communicating applications where caller and callee are separated by a medium (networks!) with variable and unconstrained performance characteristics/latency.
This theme set the focus for the later discussion. While SOAP and XML may not themselves be cause for concern, there may be issues about the general scalability of RPC as a means for building services.
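The point about chatty, fine-grained calls can be illustrated with a back-of-the-envelope model. The sketch below is only an illustration: the latency, bandwidth, payload size, and the threefold size penalty assumed for the XML encoding are invented numbers, not measurements from the discussion, and `total_time_ms` is a hypothetical helper.

```python
def total_time_ms(round_trips, latency_ms, payload_bytes, bytes_per_ms,
                  encoding_factor=1.0):
    """Rough wall-clock estimate for a series of synchronous calls.

    Each round trip pays the network latency once; the payload, inflated
    by the encoding factor, is then pushed through the available bandwidth.
    """
    wire_bytes = payload_bytes * encoding_factor
    return round_trips * latency_ms + wire_bytes / bytes_per_ms


# Assumed, illustrative figures only.
LATENCY_MS = 50.0      # round-trip latency over a WAN
BYTES_PER_MS = 125.0   # roughly a 1 Mbit/s link
PAYLOAD_BYTES = 4000   # total application data to move

# Twenty fine-grained calls using a compact binary encoding.
chatty_binary = total_time_ms(20, LATENCY_MS, PAYLOAD_BYTES, BYTES_PER_MS, 1.0)

# One coarse-grained call using an XML encoding assumed to be 3x larger.
coarse_xml = total_time_ms(1, LATENCY_MS, PAYLOAD_BYTES, BYTES_PER_MS, 3.0)

print(f"chatty binary: {chatty_binary} ms")
print(f"coarse XML:    {coarse_xml} ms")
```

Under these assumptions the number of round trips dominates the total, and the larger encoding matters far less. Different figures, such as a very slow link or very large payloads, would shift the balance.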
Messaging versus RPC
A general concern aired during the discussion was that the vision of web services advocated by much of the industry seems to rely on RPC rather than messaging. Several people suggested asynchronous messaging as an alternative to RPC that avoids the scalability issues. Yet asynchronous messaging isn't a paradigm that many developers are familiar with.
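For readers who have not worked with the asynchronous style, the following minimal sketch shows its general shape using in-process queues from the Python standard library. It is an assumption-laden stand-in: a real deployment would use a message broker or a SOAP one-way exchange, and the `service` function, the queue names, and the message format here are all hypothetical.

```python
import queue
import threading


def service(inbox, outbox):
    """Consume messages until a None sentinel arrives, posting each reply."""
    while True:
        message = inbox.get()
        if message is None:
            break
        correlation_id, body = message
        # Stand-in for real business logic.
        outbox.put((correlation_id, body.upper()))


inbox = queue.Queue()
outbox = queue.Queue()
worker = threading.Thread(target=service, args=(inbox, outbox))
worker.start()

# The sender posts every message without waiting for any reply.
for correlation_id, body in enumerate(["order", "invoice", "receipt"]):
    inbox.put((correlation_id, body))
inbox.put(None)  # tell the service there is nothing more to process

worker.join()

# Replies are collected later and matched to requests by correlation id.
replies = {}
while not outbox.empty():
    correlation_id, result = outbox.get()
    replies[correlation_id] = result

print(replies)
```

The sender never blocks on an individual reply, so network latency is paid in parallel with other work and not once per call. The cost is that the application has to track correlation ids and cope with replies that arrive late or not at all.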
Anne Thomas Manes stressed that the architecture for a web service should be dictated by its requirements, and that there is a challenge involved in introducing this alternate model.
The application architecture that you choose to use should be determined by the requirements of your application. Some applications lend themselves most easily to a request-response architecture, while others favor an asynchronous model. Sometimes you can build your application with either architecture. Sometimes, though, you really need to use one and not the other...
As I see it, the biggest challenge is that a huge majority of developers have never developed an asynchronous application. Lots of people who are playing with SOAP and .NET these days are using request-response because it's what's familiar (or it's the only way they know). Meanwhile, all the tools out there are designed to serve this majority, so the tools promote the use of request-response also.
These points were echoed on XML-DEV by Gavin Thomas Nicol, who noted that development tools make it extremely easy for developers to duck this architectural decision.
[I]f you use fatter protocols, you need to take that fatness into account in the design... it skews the set of applications away from synchronous fine-grained RPC (which is what a lot of people are doing) to coarser-grained, possibly asynchronous RPC/messaging (which most people aren't doing). I know this, but I'm not sure that developers at large do... especially as most of the tools make it *trivial* to wrap any old object up in RPC/SOAP (I remember the very cool Visual .NET demo where they took a plain 'ol COM object and made it a web service with but a few clicks).
In a followup Nicol noted that achieving good web service performance involves dealing with a lot of old issues:
...to use them well will require skills that many people today simply don't have... and to get good performance will take a lot more effort than people might think. They might appear "cool and new", but the problems, the solutions, and the complexities are actually pretty old.
Summarizing, then, it seems that while web services are being carried forward on a wave of new technologies, they exhibit the same characteristics and suffer the same problems as the more 'conventional' distributed systems that people have been building for years. On the positive side this means that there should be a wealth of experience for developers to draw upon. On the negative side, the rush to provide "point and click" web service deployment tools may be obscuring this viewpoint.
There were several attempts to distill some best practice guidelines from the discussion, which met with some resistance. Anne Thomas Manes argued that mandating how a technology should be used is a fruitless exercise.
While I agree that many Web services should be designed using an asynchronous, connectionless architecture, I believe that SOAP can and should be used however people decide to use it. SOAP is simply a messaging envelope. SOAP messages can be sent over a variety of protocols. They can be sent and consumed by pretty much any type of application.
Making mandates about the proper use of a technology will generally get ignored by the general public...People will use technology wherever that technology can help them solve the problem at hand.
The general guidance to use coarse-grained services was also disputed by Amy Lewis, who outlined a simple scenario in which more complex web services are developed from several fine-grained underlying services. This scenario does seem to offer the greatest potential for the development of a wide diversity of web services and is perhaps a prerequisite for achieving some of the larger aims of the web services and Semantic Web efforts. Sean McGrath presented a rather poetic vision for how services should interact:
In nature, powerful functionality emerges from the bottom up. The queen in the ant hill is not a monarch. There is no "top down management" and the functionality did not emerge from a top down design.
This is the really interesting [thing] about the connectedness inherent in what people call "web services". We will build 'em small and put them out there. We will not know -- because we cannot know -- how the services we build will be inter-connected. The results will surprise, delight and terrify us in equal measure.
Treasure simplicity, keep module interactions local, facilitate interconnection. That is all we need to do. Nature will do the rest.
While it is perhaps too early to begin defining best practices for web services, this discussion and the suggestions summarized by Roger Costello do provide some useful reference points. Developers should take some time out from keeping up to date with the dizzying array of web services technologies to spend time exploring alternate service architectures.