WSDL Tales From the Trenches, Part 3
August 5, 2003
This article is the third and final part of the WSDL Tales from the Trenches
series, and in it I concentrate on the data in web services. More specifically, I
discuss the type definitions and element declarations in the
types element of a WSDL
document. Such types and elements are for use in the abstract messages, the
message elements in a WSD.
WSDL does not constrain data definitions to W3C XML Schema (WXS). However, alternatives to WXS are not covered in this article: the goal of the series is to provide help and guidance with current real-world problems, and I have not seen any of the alternatives to WXS being used for web services on a significant scale to date. This may change in the future: while only the WXS implementation is discussed in the WSDL 1.1 spec, it was always the intention of the WSDL designers to provide several options. The WSDL 1.2 draft's appendix on Relax NG brings this closer to realization.
Data modeling with WXS is not for the faint-hearted. It presents a lot of pitfalls. This article will point some of these out and help you avoid them. At the very least, it should caution you to tread carefully. I will not attempt to explain WXS. There is a wealth of good texts that do so; this article focuses on how to do basic data modeling for web services. Many of the more advanced topics are avoided.
Importing data definitions
Data may be defined directly in the
types element of a document containing
abstract messages. The recommended practice, however, is to import a separate document; the
previous installment discussed the increased readability, extensibility and opportunities
for reuse this brings.
This can be done by using WSDL's
import element or by using WXS's. Although
they have the same name, they are different elements as they reside in different namespaces.
In order to distinguish them, I will refer to WSDL's element as
wsdl:import and use xsd:import to denote that of WXS. I will explain the difference between
them with examples of both mechanisms.
Use wsdl:import to import another WSDL
document that only contains data definitions. In other words, the only top level
Note that the WSDL 1.1 specification's example
2, a stockquote service, does not do either of these: it uses
wsdl:import to import a schema at top level. However, the WS-I (draft) basic profile clarifies the
mechanism in rules
2001 to 2004 and castigates the W3C Note for "... incorrectly show[ing] the WSDL
import statement being used to import WXS definitions".
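As a rough sketch of the two mechanisms, the hypothetical importing document below shows them side by side; in practice you would pick one. All names, namespaces and locations (the example.org URIs, types.wsdl, library.xsd) are placeholders of my own and do not come from any specification.

```xml
<?xml version="1.0"?>
<definitions name="LibraryService"
    targetNamespace="http://example.org/library/service"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:data="http://example.org/library/data">

  <!-- Alternative 1: wsdl:import. The imported document is itself a
       WSDL document whose definitions element holds only a types
       element. Note the attribute name: location. -->
  <import namespace="http://example.org/library/typedefs"
          location="http://example.org/library/types.wsdl"/>

  <!-- Alternative 2: xsd:import. It lives inside a schema inside
       types and points at a plain schema document. Note the
       attribute name: schemaLocation. -->
  <types>
    <xsd:schema>
      <xsd:import namespace="http://example.org/library/data"
                  schemaLocation="http://example.org/library/library.xsd"/>
    </xsd:schema>
  </types>

  <!-- Either way, the abstract messages can then refer to the
       imported declarations (Book is a hypothetical element). -->
  <message name="GetBookResponse">
    <part name="body" element="data:Book"/>
  </message>
</definitions>
```

In both alternatives the importing document and the imported definitions deliberately use different namespaces.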
The above examples are in essence the same as those the WS-I basic profile offers in
correction to WSDL 1.1's, except that in the basic profile examples the imported and
importing elements have the same target namespace. In the case of xsd:import
this is wrong; the WXS spec does not allow it. In the case of wsdl:import this
is unfortunate; as pointed out in the previous installment, this is bad style and should
have been disallowed.
If it takes several documents to define a schema with a single namespace,
xsd:redefine should be used.
Schema Design Styles
This section is about what data definitions should be exposed and which should be hidden. The trade off is between the potential for reuse and a narrow interface.
Rule 2203 of the basic profile stipulates that abstract message parts, bound to a concrete
message transporting an RPC invocation, should be defined using the type attribute. Rule
2204 states that abstract message parts used in document-style invocation should have
an element attribute. If you are using SOAP, it is a good idea to try to stick
to these rules, even though it makes a mockery of the "abstract message" doctrine.
Therefore, there must be an exposed type definition for data passed as a parameter
to an RPC
invocation and an exposed element declaration for a document-style invocation. In
the latter case, this means that the
types element may well end up with mainly element
declarations and little or no type declarations. It looks confusing but, as Roald
said, "what I mean and what I say are entirely different things."
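To make the distinction concrete, here is a minimal sketch of the two kinds of abstract message part. The message names, the lib namespace, BookType and Book are all hypothetical; only the type and element attributes on part are the point.

```xml
<?xml version="1.0"?>
<definitions name="LibraryMessages"
    targetNamespace="http://example.org/library/wsdl"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:lib="http://example.org/library">

  <types>
    <xsd:schema targetNamespace="http://example.org/library">
      <!-- Exposed (global) type: needed by the RPC-style part. -->
      <xsd:complexType name="BookType">
        <xsd:sequence>
          <xsd:element name="Title" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
      <!-- Exposed (global) element: needed by the document-style part. -->
      <xsd:element name="Book" type="lib:BookType"/>
    </xsd:schema>
  </types>

  <!-- Part intended for an RPC binding: refers to a type. -->
  <message name="AddBookRpcRequest">
    <part name="book" type="lib:BookType"/>
  </message>

  <!-- Part intended for a document binding: refers to an element. -->
  <message name="AddBookDocumentRequest">
    <part name="body" element="lib:Book"/>
  </message>
</definitions>
```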
The Russian doll design style defines root elements globally. Elements that cannot be a document's root are defined as the need arises and so are attributes and types; these definitions are nested in the definitions that use them. Such definitions nested inside another definition are said to be local and cannot be reused in other definitions, neither by other components in the same schema nor by external components. Moreover, type definitions are anonymous and cannot be referenced.
A salami slice, on the other hand, declares all elements globally. A third design style is referred to as Venetian blinds. Venetian blinds define all types globally but only expose the elements that can be used as root element of a document.
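The three styles are easier to compare when applied to the same data. The sketch below models a hypothetical Library containing Book elements three times over, once per style; the three schemas sit in separate made-up namespaces inside a single types element purely so that they fit in one well-formed fragment.

```xml
<wsdl:types xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
            xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Russian doll: one global root element, everything else nested
       and anonymous, so nothing inside can be reused. -->
  <xsd:schema targetNamespace="http://example.org/doll">
    <xsd:element name="Library">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="Book" maxOccurs="unbounded">
            <xsd:complexType>
              <xsd:sequence>
                <xsd:element name="Title" type="xsd:string"/>
              </xsd:sequence>
            </xsd:complexType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
  </xsd:schema>

  <!-- Salami slice: every element is global and referenced with ref;
       the types remain anonymous. -->
  <xsd:schema targetNamespace="http://example.org/salami"
              xmlns:s="http://example.org/salami">
    <xsd:element name="Title" type="xsd:string"/>
    <xsd:element name="Book">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element ref="s:Title"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
    <xsd:element name="Library">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element ref="s:Book" maxOccurs="unbounded"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>
  </xsd:schema>

  <!-- Venetian blinds: every type is global and named; only the
       root element is declared globally. -->
  <xsd:schema targetNamespace="http://example.org/blinds"
              xmlns:v="http://example.org/blinds">
    <xsd:complexType name="BookType">
      <xsd:sequence>
        <xsd:element name="Title" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
    <xsd:complexType name="LibraryType">
      <xsd:sequence>
        <xsd:element name="Book" type="v:BookType" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
    <xsd:element name="Library" type="v:LibraryType"/>
  </xsd:schema>
</wsdl:types>
```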
Clearly, none of these styles is optimal with respect to the trade off presented. Still,
it is instructive to contrast the three styles with respect to the set criteria. In a web
services context, the equivalent of appearing as a root element of a document is to be
the value of the
element attribute on an abstract message part.
Since neither Russian doll nor salami slice exposes types, they cannot be used if you want
to do RPC style invocation. Venetian blinds, on the other hand, works with both RPC and
document style invocation. Venetian blinds encourages the reuse of types since it defines
them all globally. However, some types may not be intended for reuse while their global
definition makes the interface less narrow.
For a document style web service, Russian doll could not be improved upon if the only objective were a narrow interface. It does not score well on the reuse front though. Salami slice sits at the other end of this spectrum with a high score for reuse and a low one for narrowness.
Namespaces were discussed briefly in the previous installment. There we asked what goes into the WSD's target namespace. Here I address the question of what goes into a W3C XML Schema namespace. The rules were briefly reviewed in the previous article, but here we go into more detail with the aid of some examples.
Elements, types and attributes that belong to a namespace are said to be qualified. The declaration of a target namespace is a necessary, but not sufficient condition for elements, types and attributes to be qualified. So when are they qualified and when unqualified?
Let us deal with types first, they are easy: globally defined types, both simple and complex, are always qualified. Locally defined types are anonymous and so there is no way of referencing them; the question to which namespace they belong is purely academic.
Global element declarations are also easy: globally declared elements are qualified.
To illustrate what we know so far, this instance
document is validated by this schema. We see
indeed that the 2 globally defined elements
are part of the target namespace; the locally defined
Collection element is not.
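For readers without the linked documents to hand, a sketch along the same lines follows. The lib namespace and the Library and Book elements are my own placeholders; a matching instance is shown in the trailing comment.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:lib="http://example.org/library"
            targetNamespace="http://example.org/library">

  <!-- Two global declarations: both belong to the target namespace. -->
  <xsd:element name="Book" type="xsd:string"/>
  <xsd:element name="Library">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Local declaration with no form attribute and no
             elementFormDefault on the schema: unqualified. -->
        <xsd:element name="Collection">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element ref="lib:Book" maxOccurs="unbounded"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
<!-- A matching instance. Collection carries no namespace, while
     Library and Book do:

     <lib:Library xmlns:lib="http://example.org/library">
       <Collection>
         <lib:Book>Some title</lib:Book>
       </Collection>
     </lib:Library>
-->
```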
Whether or not attributes and locally defined elements are qualified is governed by the
form attribute. The attribute can take 2 values: qualified and
unqualified. Therefore, in order to qualify the Collection
element in our previous example, it can be reworked as so. You will find
that it validates this document.
form is not a required attribute, neither when declaring attributes nor local
elements. If it is absent, form is assigned a value implicitly, either by respectively the value
of the attributeFormDefault and elementFormDefault attributes on the
schema element, or by the default value of these attributes; the default
is unqualified in each case. So here is another
schema that validates the document.
WSDL 1.1 recommends setting
elementFormDefault to qualified
and keeping the default for
attributeFormDefault. This should minimize the use
of explicit namespace qualifiers if you judiciously set the schema's target namespace
as the default namespace in your messages.
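A minimal sketch of that combination, with a hypothetical Library element and namespace: local elements are qualified through the schema-wide default, attributes are left unqualified, and the instance in the trailing comment needs no prefixes at all.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://example.org/library"
            elementFormDefault="qualified">
  <!-- attributeFormDefault is omitted, so it keeps its default value
       of unqualified. -->
  <xsd:element name="Library">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Local, but qualified because of elementFormDefault. -->
        <xsd:element name="Collection" type="xsd:string"
                     maxOccurs="unbounded"/>
      </xsd:sequence>
      <xsd:attribute name="city" type="xsd:string"/>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
<!-- A matching instance, using the target namespace as default
     namespace. The unprefixed city attribute is in no namespace,
     which is what the unqualified default requires:

     <Library xmlns="http://example.org/library" city="Ghent">
       <Collection>Maps</Collection>
       <Collection>Periodicals</Collection>
     </Library>
-->
```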
We have only skimmed the surface here; W3C XML Schema (see Resources for a full reference) devotes a complete chapter to controlling namespaces. However, the questions that you will most likely encounter are covered.
W3C XML Schema has 3 compositor elements that construct complex data types from
particles: sequence, choice and all. Particles are nested inside compositor elements.
A sequence defines a compound structure in which the particles occur in order. The
particles within a choice are mutually exclusive. However, there may be multiple
occurrences of the chosen particle. all defines an unordered group. For all three
compositors, the number of legal occurrences of the particles within them is governed
by the maxOccurs and minOccurs attributes on those particles. These
attributes are not required and their default value is 1.
The simplest particle is an element; sequence and
choice can both act as particles too.
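The following sketch puts the three compositors next to each other. The type and element names are invented for the purpose; the occurrence attributes are only spelled out where they differ from the default of 1.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://example.org/compositors">

  <!-- sequence: particles must appear in this order. -->
  <xsd:complexType name="Address">
    <xsd:sequence>
      <xsd:element name="Street" type="xsd:string"/>
      <xsd:element name="Line2" type="xsd:string" minOccurs="0"/>
      <xsd:element name="City" type="xsd:string"/>
      <!-- A choice acting as a particle inside the sequence. -->
      <xsd:choice>
        <xsd:element name="Zip" type="xsd:string"/>
        <xsd:element name="PostCode" type="xsd:string"/>
      </xsd:choice>
    </xsd:sequence>
  </xsd:complexType>

  <!-- choice: either Email or Phone, never both, but the chosen
       particle may repeat. -->
  <xsd:complexType name="Contact">
    <xsd:choice>
      <xsd:element name="Email" type="xsd:string" maxOccurs="unbounded"/>
      <xsd:element name="Phone" type="xsd:string" maxOccurs="3"/>
    </xsd:choice>
  </xsd:complexType>

  <!-- all: the elements may appear in any order. Each may occur at
       most once, and only element particles are permitted. -->
  <xsd:complexType name="Dimensions">
    <xsd:all>
      <xsd:element name="Width" type="xsd:decimal"/>
      <xsd:element name="Height" type="xsd:decimal"/>
      <xsd:element name="Depth" type="xsd:decimal" minOccurs="0"/>
    </xsd:all>
  </xsd:complexType>
</xsd:schema>
```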
The sequence compositor is the one that is most often encountered in WSDs.
This seems a good choice; even if, conceptually, particles could occur in any order, laying
down the order will make parsing of messages that bit easier. However, some implementations
do not observe the order constraints. This can be shown by invoking a web service with
elements in a different order from the one laid down by a sequence: it often does not seem
to matter. That is not such a bad thing. After all, if the server is more liberal
in what it
accepts than it strictly needs to be, this does not harm well-behaved clients and leaves
some margin for error on more sloppily implemented clients. In other words, a server that
did this can hardly be accused of being in breach of contract. Not so if the server does not
guarantee the order of the particles that are being sent back. Faced with such a server
implementation, I spent a good deal of time working through the ramifications of this once
upon a time.
The first reflex is to replace sequence with all.
However, be aware that the remedy is not without its problems since the expressiveness of
this compositor has been severely curtailed in the WXS spec. A detailed account of why this
is so and what the precise constraints are is beyond the current scope. However, one
limitation has already been pointed out:
all cannot be used as a particle.
Since derivation by extension in effect uses the compositor of the base type as a particle
in the subtype, opportunities for reuse of types defined with
all are limited.
Derivation is covered in further detail in a dedicated section.
The current WXS Recommendation is 1.0 and its namespace is
http://www.w3.org/2001/XMLSchema. However, some implementations still being
used today follow the specifications of previous working drafts, e.g.
http://www.w3.org/1999/XMLSchema. This is unfortunate and the perpetrators
should be encouraged to migrate to the released standard, but if you should come across such
implementations, here are two of the common pitfalls. Firstly, there is a WXS data type in
common use that has changed from the 1999 to the 2001 version: 1999's
timeInstant became 2001's
dateTime. Make sure that the data type
you use fits the version of WXS. Secondly, derivations also changed significantly between
1999 and 2001. These will be covered in the following section.
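As a small illustration of the first pitfall, a declaration against the released namespace might look as follows; the orders namespace and the orderPlaced element are made up.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="http://example.org/orders">
  <!-- With the xsd prefix bound to the 1999 working draft namespace,
       http://www.w3.org/1999/XMLSchema, the type named here would
       have been xsd:timeInstant instead. -->
  <xsd:element name="orderPlaced" type="xsd:dateTime"/>
</xsd:schema>
```

The version in use is determined by the namespace URI the prefix is bound to, so check that binding before choosing the type name.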
Derivation is a technique to define subtypes of a given base type. There are two kinds of derivation in WXS: extension and restriction. The former adds components at the end of the content model of the base type, the latter constrains the base type. Hence valid instances of a subtype derived by extension are not necessarily valid instances of the base type. Valid instances of a subtype derived by restriction, on the other hand, are always valid instances of the base type.
A subtype may be used anywhere where its base type is used, unless otherwise specified.
This may have the following impact on message definitions: assume that a message definition
declares a part with type
Foo and that
Bar is derived by extension
from Foo. A party may send an element of type
Bar in such a message.
The recipient may be unable to validate this message. Fortunately, it is possible to switch
off the ability to substitute subtypes for base types by using the block
attribute on the base type or on an element declared to be of a given base type.
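The sketch below shows the situation with the Foo and Bar types from the text. The namespace, the child elements and the two element declarations (openItem, closedItem) are hypothetical, and I use the block attribute of WXS 1.0 to switch substitution off on one of them.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:t="http://example.org/types"
            targetNamespace="http://example.org/types">

  <xsd:complexType name="Foo">
    <xsd:sequence>
      <xsd:element name="id" type="xsd:int"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Bar appends a particle to the content model of Foo. -->
  <xsd:complexType name="Bar">
    <xsd:complexContent>
      <xsd:extension base="t:Foo">
        <xsd:sequence>
          <xsd:element name="note" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <!-- Accepts content of type Foo, and also of type Bar when the
       instance says so through xsi:type. -->
  <xsd:element name="openItem" type="t:Foo"/>

  <!-- Same type, but types derived by extension are blocked here.
       Putting block="extension" on the Foo type itself would block
       them for every element of that type. -->
  <xsd:element name="closedItem" type="t:Foo" block="extension"/>
</xsd:schema>
<!-- With these declarations, an instance such as
       <t:openItem xsi:type="t:Bar"> ... </t:openItem>
     is acceptable, whereas the same xsi:type on t:closedItem is not. -->
```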
Beware of derivation by extension: that is the message of this section so far. But what about derivation by restriction? From the discussion so far, it seems reasonable enough. However, it becomes less attractive once you realize that each particle of the subtype's content model must be listed explicitly. This makes for very verbose definitions. It also does not bring the modularity benefits that an inheritance hierarchy in an OO programming language might bring: common features are not factored out, but must be repeated in each subtype. This is a change w.r.t. W3C XML Schema 1999 that caused a good deal of confusion.
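A short sketch of the verbosity in question, using invented Person types: the restricted type only tightens the occurrence of email, yet it has to restate the whole content model, including the untouched name particle.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:p="http://example.org/people"
            targetNamespace="http://example.org/people">

  <xsd:complexType name="Person">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="email" type="xsd:string"
                   minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="PersonWithOneEmail">
    <xsd:complexContent>
      <xsd:restriction base="p:Person">
        <xsd:sequence>
          <!-- Unchanged, but must be repeated all the same. -->
          <xsd:element name="name" type="xsd:string"/>
          <!-- The actual restriction: exactly one email. -->
          <xsd:element name="email" type="xsd:string"/>
        </xsd:sequence>
      </xsd:restriction>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
```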
Defining an array is one of the most confusing issues in WSDL. It has also caused a great many interoperability problems. Proceed with caution. A common approach is to extend the Array type defined in the SOAP encoding schema. In fact, this is mandated by WSDL 1.1 (see section 2.2). I was therefore surprised to see that rules 2110 through 2112 of the WS-I Basic Profile Working Group overrule this. On the other hand, I understand their position: WSDL 1.1 makes a pig's ear of array specifications. The basic profile's approach, on the other hand, is simple.
When I originally planned this article, it was my intention to write a good deal about SOAP arrays, how to use them in WSDs that are as near correct as is possible given the flaws in WSDL 1.1. However, given the basic profile's recommendation, the sensible thing is to avoid them altogether.
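As I read the basic profile, the simple alternative amounts to a plain repeating element, with no reference to the SOAP encoding schema at all. The sketch below uses made-up names (TitleList, title) and is my interpretation, not a quotation from the profile.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:a="http://example.org/arrays"
            targetNamespace="http://example.org/arrays">

  <!-- An "array" of strings: a sequence holding one element that
       may repeat any number of times, including zero. -->
  <xsd:complexType name="TitleListType">
    <xsd:sequence>
      <xsd:element name="title" type="xsd:string"
                   minOccurs="0" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:element name="TitleList" type="a:TitleListType"/>
</xsd:schema>
```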
The purpose of this article was to flag some of the issues that require attention when modeling data. You should underestimate neither the importance of defining data nor the complexity of the task. It is important because the data passed across the web service interface largely determine the quality of the interface. It is complex because data modeling is inherently complex. Nonetheless, I cannot help feeling that W3C XML Schema 1.0 does not mitigate this complexity adequately. I look forward to tools better suited to data modeling for web services.
XML Schema by Eric van der Vlist, published by O'Reilly, 2002, proved to be an invaluable companion in my encounters with W3C XML Schema. Warmly recommended to anyone who is serious about data modeling with WXS.
xFront has an item on global versus local element and type declarations in its excellent best practices section. While you are browsing xFront, do have a look at what they have to say about web services as well, which is controversial and thought-provoking.