May 15, 2002
Unlike today's Web, web services can be viewed as a set of programs interacting across a network with no explicit human interaction involved during the transaction. In order for programs to exchange data, it's necessary to define strictly the communications protocol, the data transfer syntax, and the location of the endpoint. For building large, complex systems, such service definitions must be done in a rigorous manner: ideally, a machine-readable language with well-defined semantics, as opposed to parochial and imprecise natural languages.
It is possible to define service definitions in English; XML-RPC and the various weblogging interfaces are a notable example. But XML-RPC is a very simple system, by design, with a relatively small set of features; it's ill-suited to the task of building large-scale or enterprise applications. For example, you can't use XML-RPC to send an arbitrary XML document from one system to another without converting it to a base64-encoded string.
Almost all distributed systems have a language for describing interfaces. They were often C or Pascal-like, and often named similarly: "IDL" in DCE and CORBA, "MIDL" in Microsoft's COM and DCOM. The idea is that after rigorously defining the interface language, tools could be used to parse the IDL and generate code stubs, thus automating some of the grungier parts of distributed programming.
The web services distributed programming model has an IDL, too; and as you can probably guess, it's the Web Services Description Language, WSDL. It's pronounced by spelling out the letters or saying ``whizz-dell,'' which nearly rhymes with ``diesel.''
WSDL derives from two earlier efforts by a number of companies; the current de facto standard is a W3C Note submitted by IBM and Microsoft. There's a web services description working group, which is creating the next version of the note for eventual delivery as a W3C standard. So far the group has published a requirements document and some usage scenarios. One reason to like the requirements document is that it renames some of WSDL's more confusing terms.
I find WSDL to be a frustrating mixture of verbosity -- most messages are essentially described three times -- and curious supposedly helpful defaults, such as omitting the name of a message in an operation. I'll use the now much-discussed Google WSDL to point some of these out.
But, first, let's look at the state of web services programming and IDLs. In the classic IDL world, the definitions were processed by an IDL compiler to generate stubs for clients, which look like local function calls, and dispatch routines for the server that invoke the developer's code. When new applications were developed, the interfaces were designed from scratch, and all the benefits of ``contract'' programming were possible, including clean and regular definition of function semantics.
But back in the real world, these distributed systems usually had to interact with existing systems. Often the project involved ``remoting'' an existing application by putting an RPC interface on an existing service. In cases like this, the IDL files could resemble compiler torture-tests, as network-oriented interface languages were coerced into supporting legacy applications.
I mention this because it's about the stage at which WSDL and web services are today. The most widespread tools take an existing Java class or COM object and then generate a WSDL definition. This is backwards and half-assed. It's backwards because -- as we should have learned with earlier infrastructures -- the right thing to do is write the interface first. It's half-assed because while everybody's generating interfaces, nobody is capable of automatically consuming them.
So how did we get here? I can think of two ways. First, the vendors recognize that folks aren't going to throw out their existing code. Just like they put an HTTP front-end on legacy applications when the Web became popular, developers are now going to want to put a web services front-end on their existing code. The reason why we don't yet have good client development -- i.e., WSDL parsing -- is that it requires being able to turn arbitrary XML Schema definitions into useful stubs, which is hard for a couple of reasons.
First, it's not clear how to fit web services programming into existing client frameworks. If you're using SOAP RPC, then all of the classic problems of IDL-based computing come back: memory management, transient network errors, etc. If using SOAP to send XML documents, then new issues such as DOM and SAX support must be dealt with.
Second, WSDL prefers to use XML Schema to define the data to be transferred, and understanding XML Schema requires a significant amount of effort. As the experiences of the ``soapbuilders'' group (a mailing list of SOAP toolkit providers, working on achieving interoperability across their implementations) have shown, it can require a great deal of work just to be able to properly handle XML Schema's primitive types.
I think there's a third reason, but one that nobody will admit in public. No vendor wants to spend the enormous effort involved in developing client-side WSDL toolkits when Microsoft can practically wipe them off the desktop by providing one of their own. Yes, I realize that this ignores peer-to-peer and servers talking to subservers, but I still stand by the statement.
It's time to examine parts of GoogleSearch.wsdl, which is part of the Google developer's kit. A WSDL description is a set of definitions:
- Type definitions, contained in the types element, used to describe the data being exchanged. They can be written in any description language, although -- and I swear I'm not making this up -- "WSDL prefers" XML Schema.
- Message definitions, appearing as multiple message elements. As we'll see, message definitions are where we get the first hints that WSDL exceeds the 80/20 rule of flexibility and complexity.
- Operation definitions, appearing within a binding element, which confusingly defines something called a ``port.''
- A service definition, contained in the service element. This defines the endpoint (URL) where the server can be found, and -- by referring to the binding, err, port -- specifies how to communicate with it.
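To make the layout concrete, here is a minimal Python sketch that counts the top-level sections of a WSDL document. The skeleton it parses is a hypothetical, heavily trimmed stand-in (the names Example, doThing, and so on are invented for illustration), not the real GoogleSearch.wsdl. It also counts portType, the WSDL 1.1 element that holds the abstract operations.

```python
import xml.etree.ElementTree as ET
from collections import Counter

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

# Hypothetical skeleton for illustration only; every name here is invented.
SKELETON = """\
<definitions name="Example" targetNamespace="urn:Example"
             xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:tns="urn:Example">
  <types/>
  <message name="doThing"/>
  <message name="doThingResponse"/>
  <portType name="ExamplePort"/>
  <binding name="ExampleBinding" type="tns:ExamplePort"/>
  <service name="ExampleService"/>
</definitions>
"""


def summarize(wsdl_text):
    """Count the top-level WSDL children by local element name."""
    root = ET.fromstring(wsdl_text)
    counts = Counter()
    for child in root:
        # ElementTree spells qualified tags as "{namespace}local".
        namespace, _, local = child.tag[1:].partition("}")
        if namespace == WSDL_NS:
            counts[local] += 1
    return dict(counts)


summary = summarize(SKELETON)
print(summary)
```

Pointing summarize at the text of a real WSDL file should give a quick feel for how many message elements pile up relative to everything else.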
The Google WSDL file defines a SOAP RPC interface, which means it follows the encoding rules found in Section 5 of the SOAP 1.1 specification. I'll avoid the SOAP vs. REST discussions now, other than to mention that RPC is a familiar programming model to developers. Conceptually, the GoogleSearchResult element resembles the following fragment of a C/C++ object:
    bool documentFiltering;
    char* searchComments;
    int estimatedTotalResultsCount;
    bool estimateIsExact;
    ResultElementArray resultElements;
    int _numresultElements;
Note that my hypothetical Schema to C mapping required the addition of a new element to keep track of the size of the array.
More interesting is the way the interacting specs require Google to define the ResultElementArray datatype. According to the SOAP RPC encoding rules, arrays are written by generating each element inside a container. XML Schema requires the container to be declared as its own type. SOAP 1.1 requires arrays to have a defaultable attribute that declares the type and size; SOAP 1.2 rightly divides this into two separate attributes. I say ``rightly'' because XML Schema doesn't have a way to let you default an attribute value in the SOAP 1.1 style. Because of this, WSDL provides its own arrayType attribute that does provide a default.
Taking all of this together, the fairly straightforward array requires the following contortions:
    <xsd:complexType name="ResultElementArray">
      <xsd:complexContent>
        <xsd:restriction base="soapenc:Array">
          <xsd:attribute ref="soapenc:arrayType" wsdl:arrayType="typens:Resultelement"/>
        </xsd:restriction>
      </xsd:complexContent>
    </xsd:complexType>
Given all of this complexity, we shouldn't be surprised that Google apparently missed
text that said the element should have been named
It's also hard not to look at that fragment and despair. All that complication, just to say "we're sending an array." Unfortunately, since WSDL is caught between two other specs, there seems little else that could be done. The WSDL authors couldn't change SOAP, since they were defining a use for it, and one can only imagine the howls if they tried to modify XML Schema.
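For comparison, here is roughly what that declaration buys on the wire. This Python sketch builds a SOAP 1.1 Section 5 style array whose single arrayType attribute carries both the item type and the size. To stay short it sends plain xsd:string items rather than real ResultElement structures, and the helper name encode_array and the item element name are my own assumptions, not anything from the Google kit.

```python
import xml.etree.ElementTree as ET

SOAPENC = "http://schemas.xmlsoap.org/soap/encoding/"


def encode_array(container_name, item_type, values):
    """Build a SOAP 1.1 Section 5 style array element.

    item_type is a prefixed type name such as "xsd:string"; the caller is
    assumed to declare that prefix on an enclosing element.
    """
    array = ET.Element(container_name)
    # SOAP 1.1 packs the item type and the size into one attribute value.
    array.set("{%s}arrayType" % SOAPENC, "%s[%d]" % (item_type, len(values)))
    for value in values:
        # Each array member is written as a child of the container.
        item = ET.SubElement(array, "item")
        item.text = value
    return array


# Ask ElementTree to serialize the encoding namespace with a readable prefix.
ET.register_namespace("soapenc", SOAPENC)
elem = encode_array("resultElements", "xsd:string", ["first", "second"])
xml_text = ET.tostring(elem, encoding="unicode")
print(xml_text)
```

The combined "type[size]" value is the part that XML Schema cannot default, which is why WSDL ends up with an arrayType attribute of its own.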
Let's now look at some message definitions. The following two definitions define a request message and its response. Because they are labeled as ``opname'' and ``opnameResponse,'' WSDL will let us default those names later on.
    <message name="doGetCachedPage">
      <part name="key" type="xsd:string"/>
      <part name="url" type="xsd:string"/>
    </message>
    <message name="doGetCachedPageResponse">
      <part name="return" type="xsd:base64Binary"/>
    </message>
In the list above, I said we have our first hint about WSDL's excessive flexibility. A message is intended to be an abstract definition -- one that specifies nothing about the bytes on the wire. As the spec concedes, however, "in some cases, the abstract definition may match the concrete representation very closely or exactly." When sending XML, the representation is exact, and it should be possible to omit message elements altogether. (A Google search for "optimize the common case" finds over 300,000 hits.)
The message element also shows too much flexibility. The individual message parts can be specified in-line, they can reference a type from the types section, they can have a mix of name and type declarations, and so on. Can anyone look at the doGoogleSearch message and the GoogleSearchResult datatype, and give a good, practical rationale for the style differences?
And why don't all message elements appear in their own container?
An operation is defined as a set of message exchanges. WSDL supports two-party communication, and four operation types are defined (single incoming, single outgoing, incoming request with response, outgoing request with reply), although only the obvious two are currently supported: client sends message, and client sends message and gets a reply.
Here is an abstract operation definition -- remember, we don't yet know how bytes appear on the wire -- that uses the earlier message formats:
    <operation name="doGetCachedPage">
      <input message="typens:doGetCachedPage"/>
      <output message="typens:doGetCachedPageResponse"/>
    </operation>
The input and output each have a name; because none is given here, WSDL defaults them as described above.
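As a rough illustration of what a client-side tool has to do with these abstract definitions, the Python sketch below follows each operation's input and output references back to the message parts. The message and operation fragments are the ones shown in this article; the surrounding definitions wrapper, the namespace declarations, and the function names are my assumptions. It matches messages by local name only, which a real tool could not get away with.

```python
import xml.etree.ElementTree as ET

WSDL = "{http://schemas.xmlsoap.org/wsdl/}"

# The wrapper element and namespace declarations are assumed for the sketch.
DOC = """\
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:typens="urn:GoogleSearch"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             targetNamespace="urn:GoogleSearch">
  <message name="doGetCachedPage">
    <part name="key" type="xsd:string"/>
    <part name="url" type="xsd:string"/>
  </message>
  <message name="doGetCachedPageResponse">
    <part name="return" type="xsd:base64Binary"/>
  </message>
  <portType name="GoogleSearchPort">
    <operation name="doGetCachedPage">
      <input message="typens:doGetCachedPage"/>
      <output message="typens:doGetCachedPageResponse"/>
    </operation>
  </portType>
</definitions>
"""


def local_name(qname):
    """Drop the prefix from a QName; a real tool would resolve the namespace."""
    return qname.split(":")[-1]


def describe_operations(wsdl_text):
    """Map each operation name to the parts of its input and output messages."""
    root = ET.fromstring(wsdl_text)
    # Index each message by name, keeping its parts as (name, type) pairs.
    messages = {}
    for message in root.findall(WSDL + "message"):
        parts = [(p.get("name"), p.get("type"))
                 for p in message.findall(WSDL + "part")]
        messages[message.get("name")] = parts
    result = {}
    for op in root.iter(WSDL + "operation"):
        entry = {}
        for direction in ("input", "output"):
            ref = op.find(WSDL + direction)
            if ref is not None:
                entry[direction] = messages[local_name(ref.get("message"))]
        result[op.get("name")] = entry
    return result


operations = describe_operations(DOC)
print(operations)
```

Even this toy version has to chase two levels of indirection before it knows that doGetCachedPage takes two strings.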
Finally, we're ready to bring these abstract messages and datatypes down to earth. This is done in the binding element, which has its own set of operation elements:
    <binding name="GoogleSearchBinding" type="typens:GoogleSearchPort">
      <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
      <operation name="doGetCachedPage">
        <soap:operation soapAction="urn:GoogleSearchAction"/>
        <input>
          <soap:body use="encoded" namespace="urn:GoogleSearch"
                     encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
        </input>
        <output>
          <soap:body use="encoded" namespace="urn:GoogleSearch"
                     encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
        </output>
      </operation>
      <!-- remaining operations omitted from this excerpt -->
    </binding>
This is where the fairly clean WSDL elements fall apart into a set of nasty special-case elements. For that reason alone, I can appreciate why binding is its own element, but I still think the separation comes at the cost of too much redundancy and repetition. For example, notice the duplication of attributes in each soap:body element. That element is also an example of excessive flexibility. First, we've already declared our intent to use SOAP RPC encoding, through the style attribute in the previous element. While SOAP defines document and RPC styles, WSDL doubles this to define ``document/encoded'', ``document/literal'', ``rpc/encoded'', and ``rpc/literal''.
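Here is a small Python sketch of how a tool might work out which of the four combinations a binding uses. It reads the style from soap:binding (allowing a per-operation override on soap:operation, and falling back to ``document'' when no style is given, as WSDL 1.1 specifies) and the use from each soap:body. The function name and the namespace declarations added to the fragment are my assumptions; the binding itself is the excerpt from this article.

```python
import xml.etree.ElementTree as ET

WSDL = "{http://schemas.xmlsoap.org/wsdl/}"
SOAP = "{http://schemas.xmlsoap.org/wsdl/soap/}"

# Namespace declarations are added so the fragment parses on its own.
BINDING = """\
<binding xmlns="http://schemas.xmlsoap.org/wsdl/"
         xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
         xmlns:typens="urn:GoogleSearch"
         name="GoogleSearchBinding" type="typens:GoogleSearchPort">
  <soap:binding style="rpc" transport="http://schemas.xmlsoap.org/soap/http"/>
  <operation name="doGetCachedPage">
    <soap:operation soapAction="urn:GoogleSearchAction"/>
    <input>
      <soap:body use="encoded" namespace="urn:GoogleSearch"
                 encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
    </input>
    <output>
      <soap:body use="encoded" namespace="urn:GoogleSearch"
                 encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
    </output>
  </operation>
</binding>
"""


def style_and_use(binding_text):
    """Return {operation name: sorted list of "style/use" strings}."""
    binding = ET.fromstring(binding_text)
    soap_binding = binding.find(SOAP + "binding")
    default_style = "document"
    if soap_binding is not None:
        default_style = soap_binding.get("style", "document")
    combos = {}
    for op in binding.findall(WSDL + "operation"):
        style = default_style
        soap_op = op.find(SOAP + "operation")
        if soap_op is not None:
            # A style on soap:operation overrides the binding-wide default.
            style = soap_op.get("style", default_style)
        uses = {body.get("use") for body in op.iter(SOAP + "body")
                if body.get("use")}
        combos[op.get("name")] = ["%s/%s" % (style, use) for use in sorted(uses)]
    return combos


combos = style_and_use(BINDING)
print(combos)
```

Note that the same fact has to be assembled from two or three different elements before the tool knows which encoding rules apply.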
The service element ties the abstract messages and their concrete realization together with an endpoint (in this case, a SOAP URL).
So we now know what to send, and where to send it. Next month we'll write some code to do just that.