WSDL First

July 22, 2003

Will Provost

Web services vendors will tell you a story if you let them. "Web services are a cinch," they'll say. "Just write the same code you always do, and then press this button; presto, it's now a web service, deployed to the application server, with SOAP serializers, and a WSDL descriptor all written out."

They'll tell you a lot of things, but probably most glorious among them will be the claim that you can develop web services effectively without hand-editing SOAP or WSDL. Does this sound too good to be true? Perhaps one can argue that SOAP has in some cases been relegated to the role of RPC encoding, and that it's no more relevant to the application developer than IIOP or the DCOM transport.

When it comes to WSDL, though, don't buy it. If you're serious about developing RPC-style services, you should know WSDL as well as you know W3C XML Schema (WXS); you should be creating and editing descriptors frequently. More importantly, a WSDL descriptor should be the source document for your web service build process, for a number of reasons, including anticipating industry standardization, maintaining fidelity in transmitting service semantics, and achieving the best interoperability through strong typing and WXS.

Development Paths

The willingness in some quarters to minimize the visibility of service description betrays a more basic and troubling bias, one which has to do with code-generation paths and development process. It assumes that service semantics are derived entirely from application source code. There are two viable development paths for RPC-style service development: from implementation language to WSDL and vice-versa. In fact, to start from the implementation language is the weaker strategy.

Each of these options actually implies a broader scenario. Everyone agrees that WSDL is the proper starting point when building clients, so we choose between two visions of web services development:

— Implementation language first (Impl-to-WSDL), which results in a path from service semantics to a client that understands them that passes through a WSDL descriptor:

Impl-to-WSDL Path

For example, WSDL descriptors express service semantics in a neutral format, so that interfaces defined in Java can be mapped to and understood by a client built using .NET.

— WSDL first, in which both service and client rely on code generated from a central descriptor:

WSDL-to-Impl Path

In this case all development flows from the WSDL document, which still functions as a neutral format for service semantics and type information.

Ah, Temptation...

Certainly, the Impl-to-WSDL path calls to us. It makes a certain kind of sense to start from code written in one's native language, especially in the case of pilot projects, in which there's already a lot to learn, and in "wrapping" efforts which build a SOAP interface for legacy code.

Commercial tools naturally push in this direction, too. For one thing, the Impl-to-WSDL path supports the sort of RAD approach that is central to the value proposition of IDEs and application servers alike. Sales and marketing folks want to be able to demonstrate that little or no coding and a lot of code generation add up to a complete web service. And WSDL is yet another enterprise standard, and companies don't get rich selling standards, even ones to which they adhere.

It's also a matter of what you're used to. Existing distributed object computing platforms -- and that's what RPC-style web services already are and will increasingly be -- vary in this regard. EJB programmers are accustomed to building Java interfaces as their source documents for service semantics and JavaBeans as their serializable types. CORBA developers, on the other hand, are more familiar with the idea that some IDL is the natural starting point, from which language-specific artifacts are generated. DCOM/ATL developers have seen this approach, too, although the Visual Studio environment blurs this line by dint of its RAD tooling; one most often defines an interface by filling in a dialog box.

Consider these platforms -- excluding EJB as language-specific -- as precedents. The other two derive interoperability between programming languages by relying on generation from an IDL. This is the proven pattern. Why are we walking away from it for web services? WSDL is IDL for web services, and it should be used as such.

Missing the Point of Web Services

The primary problem with the Impl-to-WSDL approach is that the service implementer assumes that he or she is the ultimate authority on service semantics. For services deployed on an intranet and meant for use within a company or division, this might be a safe assumption, but more generally it's just a naïve one.

Progressively more vertical standards

The common creation and deployment of widely available business-to-business SOAP services is not so far off. In this context, starting with the implementation language misses the basic point of web services: component interoperability based on progressively more vertical standards. It's generally understood that we get incrementally better interoperability (read: usability) as we add XML, WXS, and WSDL to the technology mix. Better still -- if we control the WXS and WSDL content -- we are positioned to specify service semantics appropriate to smaller scopes: per industry, per business activity, per community, per partnership.

And there's the rub. Consensus about business semantics will demand expression in the neutral language, which is WSDL. It's neither interesting nor useful to build services from implementation-language interfaces and serializable types in the face of this (welcome) trend. The web services architecture is not a wrapping technology; or, rather, it won't be for much longer.

Lost in the Translation

At a tactical level, another argument for using WSDL first is related to type mapping and data binding. RPC-style services require mappings between WSDL (including WXS) and the implementation language, so that SOAP elements can be translated to method arguments and return values. Let's first understand that the fidelity of these mappings isn't perfect; the loss of type information is common in translating between programming languages and WSDL, in either direction.

Concerns, then, are when this mapping occurs, how often, and in what directions. The answers vary by path, and the difference is telling:

  • Using WSDL first, there will be many mappings, all flowing from the same original document.
  • When the original semantics are written into the service implementation, there will again be several mappings, but now they occur serially and in different directions -- that is, to and then from WSDL.

Therefore, the Impl-to-WSDL path places a much greater burden on the various language mappings. Using WSDL first, no component is ever more than a single generation away from the source document, and whatever imprecision exists in a given mapping -- while potentially irritating to the programmer -- will have limited impact.

Going the other way, though, one encounters two problems: multiple serial translations pose a greater risk of lossage, and these several mappings will each introduce different losses. After the initial Impl-to-WSDL mapping does some damage, a successive WSDL-to-Impl pass will erode a whole other set of model details. Thus the passage from implementation to WSDL to implementation threatens to degrade the semantic definition significantly unless the mappings are really tight.

Are they? Not by a long shot. Even same-language mappings are not reversible; after a round-trip through WSDL the type model will have metastasized. To get viable round-tripping, you must resign yourself to using a tiny subset of the types and semantics commonly used in distributed programming today.
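The asymmetry is easy to see in miniature. The following Java sketch models the two mapping directions as plain lookup tables and composes them. The class name RoundTripDemo and every table entry are assumptions made for illustration; they are loosely modeled on common Java-to-schema conventions and are not the output of any particular toolkit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of a Java -> schema -> Java round trip.
// All table entries are illustrative assumptions, not a real toolkit's mapping.
class RoundTripDemo {

    // Implementation-to-schema direction: several Java types may collapse
    // onto a single schema type.
    static final Map<String, String> JAVA_TO_XSD = new LinkedHashMap<>();

    // Schema-to-implementation direction: each schema type yields one Java
    // type, so distinctions collapsed above cannot be recovered.
    static final Map<String, String> XSD_TO_JAVA = new LinkedHashMap<>();

    static {
        JAVA_TO_XSD.put("int", "xsd:int");
        JAVA_TO_XSD.put("java.lang.String", "xsd:string");
        JAVA_TO_XSD.put("java.util.Calendar", "xsd:dateTime");
        JAVA_TO_XSD.put("java.util.Date", "xsd:dateTime");
        // A weakly typed collection has no precise schema equivalent;
        // "ArrayOfAnyType" is a hypothetical complex type name.
        JAVA_TO_XSD.put("java.util.List", "ArrayOfAnyType");

        XSD_TO_JAVA.put("xsd:int", "int");
        XSD_TO_JAVA.put("xsd:string", "java.lang.String");
        XSD_TO_JAVA.put("xsd:dateTime", "java.util.Calendar");
        XSD_TO_JAVA.put("ArrayOfAnyType", "java.lang.Object[]");
    }

    // Composes the two mappings: what does a client generated from the
    // descriptor see for a type the service author originally wrote?
    static String roundTrip(String javaType) {
        String schemaType = JAVA_TO_XSD.get(javaType);
        return XSD_TO_JAVA.get(schemaType);
    }

    public static void main(String[] args) {
        for (String javaType : JAVA_TO_XSD.keySet()) {
            String result = roundTrip(javaType);
            String verdict = javaType.equals(result) ? "preserved" : "changed";
            System.out.println(javaType + " -> " + JAVA_TO_XSD.get(javaType)
                    + " -> " + result + " (" + verdict + ")");
        }
    }
}
```

In this toy table the primitives survive, while the date type and the untyped list come back as something the original author never wrote. Starting from the descriptor, only the second table would ever be applied.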

In fairness, this is only a snapshot of the current state of the art and much of this will be cleaned up over time. The major vendors continue to put a lot of effort into supporting clean round-trips in their mappings. But no vendor even claims to support round-trips perfectly. Further, this capability is a bit of a pipe dream, even in theory. The WSDL-WXS type model simply doesn't align that cleanly to any programming language. More to the point, it isn't meant to express the same things. Weakly-typed collections, for example, are wonderful programming tools, but nearly hopeless in data binding. Support for enumerated types is suspect, too, although for different reasons in different languages.

Finally, a side benefit of mapping from WSDL-WXS is that names as well as types can be preserved, reminding us that "descriptors" should indeed describe and not just define. Java developers who've worked Impl-to-WSDL via JAX-RPC are familiar with the results of mapping methods: nice descriptive parameter names are translated to string0, string1, and booleanVal0. It's illustrative to note that the JAX-RPC specification requires that these names be carried precisely into the WSDL descriptor, yet the current implementations all refuse to do so. Why? Java Reflection doesn't provide this information. So it's not realistic to expect tools to map this successfully. Going WSDL-to-Impl (or, in fact, from .NET, where type information doesn't suffer this shortcoming), there's no such problem.

Here's an example. This is the current state of the art, starting from Java:

She Sells Seashells
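The reflection limitation mentioned above can be checked directly. The sketch below is not a reconstruction of the original figure; the OrderService interface and its parameter names are hypothetical. It relies on java.lang.reflect.Parameter, which was added to Java well after this article was written (Java 8). Even with that API, real parameter names are available only when the class was compiled with the javac -parameters option; otherwise reflection reports synthesized names such as arg0 and arg1.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;

// Shows what reflection can (and cannot) report about parameter names.
class ParameterNameDemo {

    // Hypothetical service interface with descriptive parameter names.
    interface OrderService {
        boolean placeOrder(String customerName, String shellType, boolean giftWrap);
    }

    // Looks up the single service method, wrapping the checked exception
    // so callers do not have to declare it.
    static Method placeOrderMethod() {
        try {
            return OrderService.class.getMethod(
                    "placeOrder", String.class, String.class, boolean.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the parameter names exactly as reflection reports them.
    static String[] reflectedNames(Method method) {
        Parameter[] params = method.getParameters();
        String[] names = new String[params.length];
        for (int i = 0; i < params.length; i++) {
            names[i] = params[i].getName();
        }
        return names;
    }

    public static void main(String[] args) {
        for (Parameter p : placeOrderMethod().getParameters()) {
            // isNamePresent() is true only if the class file carries the names,
            // which requires compiling with: javac -parameters
            System.out.println(p.getType().getSimpleName() + " " + p.getName()
                    + " (real name in class file: " + p.isNamePresent() + ")");
        }
    }
}
```

A tool that generates a descriptor from compiled classes is in the same position as this program: unless the names were deliberately preserved at compile time, it has only positions and types to work with.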

Best Interoperability

Finally, WSDL first offers a clear advantage in interoperability of generated components. Under the WS-I Basic Profile, and in all typical practice, web services rely on WXS as the fundamental type model. This is a potent choice. WXS offers a great range of primitive types, simple-type derivation techniques such as enumerations and regular expressions, lists, unions, extension and restriction of complex types, and many other advanced features.
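To make a few of these facilities concrete, here is a minimal sketch that validates small documents against a schema using the standard javax.xml.validation API. The schema itself (the order element and the Grade, Sku, and Quantities types) is invented for illustration and does not come from any real service; it combines an enumeration, a regular-expression pattern, and a list type.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SchemaTypingDemo {

    // Hypothetical schema: an enumeration, a pattern facet, and a list type.
    static final String XSD =
          "<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>"
        + " <xsd:simpleType name='Grade'>"
        + "  <xsd:restriction base='xsd:string'>"
        + "   <xsd:enumeration value='standard'/>"
        + "   <xsd:enumeration value='premium'/>"
        + "  </xsd:restriction>"
        + " </xsd:simpleType>"
        + " <xsd:simpleType name='Sku'>"
        + "  <xsd:restriction base='xsd:string'>"
        + "   <xsd:pattern value='[A-Z]{3}-\\d{4}'/>"
        + "  </xsd:restriction>"
        + " </xsd:simpleType>"
        + " <xsd:simpleType name='Quantities'>"
        + "  <xsd:list itemType='xsd:positiveInteger'/>"
        + " </xsd:simpleType>"
        + " <xsd:element name='order'>"
        + "  <xsd:complexType>"
        + "   <xsd:sequence>"
        + "    <xsd:element name='grade' type='Grade'/>"
        + "    <xsd:element name='sku' type='Sku'/>"
        + "    <xsd:element name='quantities' type='Quantities'/>"
        + "   </xsd:sequence>"
        + "  </xsd:complexType>"
        + " </xsd:element>"
        + "</xsd:schema>";

    // Compiles the schema; a failure here means the schema text is wrong.
    static Schema compileSchema() {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema could not be compiled", e);
        }
    }

    // Returns true if the document satisfies the schema, false otherwise.
    static boolean isValid(String xml) {
        Validator validator = compileSchema().newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a type or structure violation was reported
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a test document from three field values.
    static String order(String grade, String sku, String quantities) {
        return "<order><grade>" + grade + "</grade><sku>" + sku
                + "</sku><quantities>" + quantities + "</quantities></order>";
    }

    public static void main(String[] args) {
        System.out.println(isValid(order("premium", "ABC-1234", "1 2 12")));
        System.out.println(isValid(order("deluxe", "ABC-1234", "1 2 12")));
        System.out.println(isValid(order("premium", "abc-12", "1 2 12")));
        System.out.println(isValid(order("premium", "ABC-1234", "1 0")));
    }
}
```

Only the first document satisfies all three constraints. The others fail on the enumeration, the pattern, and the list item type respectively, which is the kind of precision that a plain string or integer parameter in an implementation language does not carry.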

To put it simply, WXS is by far the most powerful type model available in the XML world. It's more flexible than relational DDLs and much more precise and sophisticated than the type system of many programming languages. Why would we choose to use anything else to express service semantics?

What good are WXS's advanced features if they can't be mapped to the implementation language? First, let's not confuse the lack of a native language feature with the ability to build that feature using the language. WXS enumerated types mapped into flyweight classes in Java (the standard JAX-RPC) are a perfect example of this. Using WXS one can easily build an enumerated type into a descriptor, and it will be well supported in the generated Java code. Note that there is no means of describing an enumerated type in the Java-to-WSDL direction.
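As a rough illustration of the flyweight approach, the class below is a hand-written sketch of what a mapping of an enumerated schema type might look like in Java. It is not generated output from any JAX-RPC implementation; the Grade type, its two values, and the method names fromValue and getValue are assumptions chosen for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Flyweight ("typesafe enum") sketch for a hypothetical schema type:
//
//   <xsd:simpleType name="Grade">
//     <xsd:restriction base="xsd:string">
//       <xsd:enumeration value="standard"/>
//       <xsd:enumeration value="premium"/>
//     </xsd:restriction>
//   </xsd:simpleType>
final class Grade {

    // Registry of the only instances that will ever exist.
    // Declared first so it is initialized before the constants below.
    private static final Map<String, Grade> INSTANCES = new LinkedHashMap<>();

    public static final Grade STANDARD = new Grade("standard");
    public static final Grade PREMIUM = new Grade("premium");

    private final String value;

    // Private constructor: callers cannot create values outside the enumeration.
    private Grade(String value) {
        this.value = value;
        INSTANCES.put(value, this);
    }

    // Maps a lexical value from a message back to the shared instance.
    public static Grade fromValue(String value) {
        Grade grade = INSTANCES.get(value);
        if (grade == null) {
            throw new IllegalArgumentException("Not a legal Grade: " + value);
        }
        return grade;
    }

    public String getValue() {
        return value;
    }

    @Override
    public String toString() {
        return value;
    }

    public static void main(String[] args) {
        Grade parsed = Grade.fromValue("premium");
        // Identity comparison is safe because instances are shared.
        System.out.println(parsed == Grade.PREMIUM);
        System.out.println(parsed.getValue());
    }
}
```

Because each legal value is represented by exactly one shared instance, a method that accepts a Grade cannot be handed an arbitrary string, even though the language version of the day had no native enumerated type.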

Secondly, as I said in the previous section, it's important to think ahead and to anticipate that those features that don't enjoy great support at the moment -- lists, unions, disjunctions and restricted complex types are a few common examples -- will be absorbed into a growing kernel of universal types that already includes most primitives and simple structs.

What's the payoff? Using WSDL first, services and clients share and enforce the same vision of message content, and that vision can include the strong, precise types of WXS.


For new service development, and even for most adaptations of existing enterprise code assets, the WSDL-to-Impl path is the most robust and reliable; it also fits the consensus vision for widely available services based on progressively more vertical standards. It does a better job of preserving service semantics as designed, and it offers best interoperability based on the rich type model of WXS.

So dial up your favorite Web-services tools vendor. We should be getting the same facility and productivity using WSDL first as we already see for Impl-to-WSDL development, if not better. Let the cry go forth: WSDL first!

Thanks to Michael Stiefel of Reliable Software and Bob Oberg of Object Innovations for .NET clarifications.