WSDL Tales From The Trenches, Part 1
Recently I retrofitted WSDL to a set of existing web services. A customer had a server running and there was a client implementation. The client and server teams had been working closely together, and now the time had come for another client implementation by a development team on the other side of the globe. A clear specification of the services was needed, and that's what WSDL is for. So I set out to make explicit what was previously implicit. It turned out to be an instructive experience, reaffirming some good old software engineering practices and uncovering a new set of problems specific to web services, WSDL, and XML Schema.
There were clearly some design flaws at the outset which were hard to pinpoint. It is likely that these mistakes would not have been made if the designers had formally written down their service definitions in WSDL. So this is the core message of the series: write WSDL up front, do not generate it as an afterthought, as is often suggested by vendors.
In all the attention that has been lavished on web services, it is often difficult to distinguish between vision and reality. This series will be short on vision and long on reality. It will not provide an overview of WSDL, and it also assumes familiarity with W3C XML Schema (WXS). This first article in the series considers what sound software engineering practice and distributed computing experience offer to web service design. I review some of the important design decisions that a web service designer must make and offer some advice to guide the process. The rest of the series is about how to represent the design. In the second article I shine a light in some of the dark corners of the WSDL 1.1 specification, leaving out data type definitions, which are the subject of the third article. There I look at WXS from the perspective of someone who uses it to specify the data which will be sent across the web service interface.
Designing Web Services
I use the term web services specifically for a kind of technology which focuses on interoperability, which implies that it has been standardized: heterogeneous systems only work together by virtue of open standards. Are web services the solution to your problem? Web services do not earn you street credibility anymore, so you had better have some other reason for using them. If you control both ends of the pipe, chances are that there are other, better-suited technologies out there. We are only at the dawn of the era of open standards in distributed computing. The price for support of web services in this early adopters' market remains high. That price is paid in poor performance, high development cost, and degraded security.
The argument that web services are "firewall friendly" is misleading. Traditional firewalls protect corporate assets against intruders exploiting weaknesses in application software by only opening ports that run hardened services. Web services, on the other hand, expose application software via these ports. In other words, they subvert security by giving outsiders the access to applications that the network-level firewall was designed to deny. Far from being firewall-friendly, web services call for a new breed of firewall. There are a number of new entrants in this market offering products to target the problem. The traditional firewall vendors are also starting to take note. But it is early days and the technology has yet to prove itself.
Even if you go with web services, remember that you do not need to use SOAP for everything. For example, I have seen forms being submitted as an argument to a SOAP-RPC request. I have seen forms being returned as part of a SOAP-RPC response. I fail to see where the added value of SOAP might be in these cases. Surely plain XML or HTML over HTTP would be simpler?
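To make the "plain XML over HTTP" alternative concrete, here is a minimal sketch in Python. It only prepares the request; nothing is sent. The form layout (a form element with field children), the function names, and the URL in the usage note are my own assumptions for illustration, not part of any system described here.

```python
import urllib.request
import xml.etree.ElementTree as ET


def form_to_xml(fields):
    """Serialize a flat form (field name -> value) as a small XML document.

    The <form>/<field> vocabulary is hypothetical, chosen for illustration.
    """
    root = ET.Element("form")
    for name, value in fields.items():
        ET.SubElement(root, "field", name=name).text = value
    return ET.tostring(root, encoding="utf-8")


def build_post(url, fields):
    """Prepare (but do not send) a plain XML-over-HTTP POST request."""
    return urllib.request.Request(
        url,
        data=form_to_xml(fields),
        headers={"Content-Type": "application/xml; charset=utf-8"},
        method="POST",
    )
```

Calling build_post("http://example.com/orders", {"item": "widget", "qty": "3"}) yields a request whose body is the form itself, with no envelope, no RPC wrapper, and no encoding rules to agree on. Whether that is enough depends on what the SOAP layer was actually buying you.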
Do Not Generate WSDL
Is WSDL 1.1 fit for human consumption? The commonly offered advice is that WSDL should be generated, not written by hand. I take issue with this. It may be true that simple, demonstration services can be developed this way, but this approach falls down when applied to larger systems. Even a designer working on her own will soon lose an overview; it's even more difficult in the context of cooperative development by heterogeneous, geographically separated teams. Large distributed systems need designing; throwing them together and hoping they will work invites disaster.
When distributed components may be developed in a variety of programming languages, you need a language-neutral Interface Definition Language (IDL) to specify how services must be invoked. CORBA has one and so does DCOM. IDL is a contract between the service requester and provider, but it only captures syntax. Semantics are not covered; IDL leaves the question of what the service does unanswered.
WSDL is the IDL of web services. It defines how to invoke web services. It also specifies the responses that you may receive, both when the invocation succeeds and when it fails. A complete WSDL specification nails down the format of messages, the protocols used, and the addresses at which the services reside. Unfortunately, getting it said in clear, crisp WSDL is still no guarantee for the quality of the design. Like all IDLs, WSDL is strong on syntax and weak on semantics. Nonetheless, do not neglect this task as, at the end of the day, it is the semantics that matter; syntax merely serves to unlock them.
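As a rough illustration of what a WSDL 1.1 document pins down, the following Python sketch embeds a deliberately tiny WSDL and pulls out the three things mentioned above: the message formats per operation (including the fault), the protocol, and the address. The service, its namespace, the operation name, and the URL are all invented for this example; only the WSDL and SOAP binding namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

# Hypothetical service: every name and URL under example.com is invented.
WSDL_DOC = """\
<definitions name="PackageCatalog"
    targetNamespace="http://example.com/catalog"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="http://example.com/catalog"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <message name="ListPackagesRequest">
    <part name="platform" type="xsd:string"/>
  </message>
  <message name="ListPackagesResponse">
    <part name="packages" type="xsd:string"/>
  </message>
  <message name="CatalogFault">
    <part name="reason" type="xsd:string"/>
  </message>
  <portType name="CatalogPortType">
    <operation name="listPackages">
      <input message="tns:ListPackagesRequest"/>
      <output message="tns:ListPackagesResponse"/>
      <fault name="catalogUnavailable" message="tns:CatalogFault"/>
    </operation>
  </portType>
  <binding name="CatalogSoapBinding" type="tns:CatalogPortType">
    <soap:binding style="rpc"
        transport="http://schemas.xmlsoap.org/soap/http"/>
    <operation name="listPackages">
      <soap:operation soapAction="http://example.com/catalog/listPackages"/>
      <input><soap:body use="literal"
          namespace="http://example.com/catalog"/></input>
      <output><soap:body use="literal"
          namespace="http://example.com/catalog"/></output>
      <fault name="catalogUnavailable">
        <soap:fault name="catalogUnavailable" use="literal"/>
      </fault>
    </operation>
  </binding>
  <service name="CatalogService">
    <port name="CatalogPort" binding="tns:CatalogSoapBinding">
      <soap:address location="http://example.com/services/catalog"/>
    </port>
  </service>
</definitions>
"""

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",
}


def summarize(wsdl_text):
    """Extract operations, transports, and endpoint addresses from a WSDL."""
    root = ET.fromstring(wsdl_text)
    operations = {}
    for op in root.findall("wsdl:portType/wsdl:operation", NS):
        operations[op.get("name")] = {
            "input": op.find("wsdl:input", NS).get("message"),
            "output": op.find("wsdl:output", NS).get("message"),
            "faults": [f.get("message") for f in op.findall("wsdl:fault", NS)],
        }
    transports = [
        b.get("transport")
        for b in root.findall("wsdl:binding/soap:binding", NS)
    ]
    addresses = [
        a.get("location")
        for a in root.findall("wsdl:service/wsdl:port/soap:address", NS)
    ]
    return {
        "operations": operations,
        "transports": transports,
        "addresses": addresses,
    }
```

Note what summarize(WSDL_DOC) cannot tell you: what listPackages actually does, or when the fault is raised. That is the syntax-versus-semantics gap in miniature.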
Designing the Interfaces
Before sitting down to write WSDL, make a clear agreement with the stakeholders about what the web service should do. Write use cases, unambiguously capturing how the web service interacts with its environment. An actor invokes a web service to achieve a goal. Write the "sunny day" scenarios that deliver the goal, but also include the "rainy day" scenarios that fail to deliver. What guarantees can the system extend when the goal succeeds? When it fails? Where does the responsibility of the client end and the server's responsibility begin?
The importance of getting the requirements right cannot be overemphasized. This is the time to think about the value proposition. What is it exactly? What data are needed from the client and what must be supplied by the server? Making a mistake here is expensive.
Here is a war story about an update web service taking a list of installed software packages and the amount of free persistent memory as parameters. The service should return a list of packages to upgrade. If there is sufficient free memory, the problem is easy. But what happens if there isn't? What upgrades should be selected? Should the server assume that the old versions will be removed by the client to make room for the new ones? These are hard questions leading to complex algorithms and buggy implementations. Of course, the system should never have been specified like this. The server should serve the list of packages, the dependencies between them, and their memory requirements. Figuring out which packages to install should be the client's job. Offloading that responsibility to the server does not add value for the client, but it does add cost to server development.
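To show how little the server needs to know under the better split of responsibilities, here is a sketch in Python of a client-side selection step. It is not the algorithm from the project in question. The Package record, the greedy policy, and the assumption that replacing a package first frees the space of its old version are all mine, chosen to keep the example short; a real client is free to pick any policy it likes, which is exactly the point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """What the server publishes per package (hypothetical record layout)."""
    name: str
    version: int
    size_kb: int
    depends_on: tuple = ()


def plan_upgrades(catalog, installed, free_kb):
    """Pick upgrades on the client, greedily, in catalog order.

    Assumptions (illustrative only): replacing a package releases the
    space of the old version, and a dependency is satisfied by any
    installed version of the package it names.
    """
    plan = []
    current = dict(installed)  # name -> Package currently on the device
    for pkg in catalog:
        old = current.get(pkg.name)
        if old is None or old.version >= pkg.version:
            continue  # not installed, or already up to date
        if any(dep not in current for dep in pkg.depends_on):
            continue  # a required package is missing on this device
        needed = pkg.size_kb - old.size_kb  # net extra memory required
        if needed <= free_kb:
            plan.append(pkg.name)
            free_kb -= needed
            current[pkg.name] = pkg
    return plan
```

The server's part reduces to publishing the catalog. Questions such as "what if memory is short?" are answered where the knowledge lives, on the device, and a different client can answer them differently without any change to the service.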
So run a cost-benefit analysis at an early stage. You can ignore technology details: consider the service to be a black box. But do not compromise on capturing the details of the interactions between the service and its environment.
Next consider how the use cases can be realized. Is there going to be a single interface (WSDL 1.1's portType) offering all functionality? Or are there several interfaces? Each interface may be offered at several endpoints, but should normally be indivisible. That is, an endpoint should offer all the functionality of an interface or none. Similarly, the interface should be semantically coherent. These are guidelines; there is nothing in the WSDL spec stopping you from slicing it differently. They simply help maintain sanity.
Are any supporting interfaces needed? How and when will they be invoked? What is the nature of the dependency? There are proposals for languages to address service choreographies, to allow services to hyperlink to other services, but standards are some way off. In other words, you are in uncharted territory.
Does your service return a result to the client? Must the client know whether the service completed successfully? Must the client call the service synchronously? Or asynchronously? Or will you provide a version for both invocation styles?