Really Simple Web Service Descriptions
October 14, 2003
Last month I discussed the trend of using the schema to describe the shape of the XML document, thus avoiding the W3C XML Schema (WXS) type system. This fits nicely with the so-called Service Oriented Architecture (SOA), which claims that exchanging XML documents leads to a more loosely-coupled architecture than exchanging strongly-typed RPC messages.
I take a broader view, seeing almost any distributed request-response message exchange as a kind of remote procedure call. I do this because all of the really hard parts -- network connectivity, distributed authentication, and so on -- are still there no matter what bytes are being exchanged. This seems to be a minority viewpoint, however. One reason is that so much was made of "RPC Encoding" in the SOAP 1.1 document that when it was dropped from SOAP 1.2, WS-I, and WSDL 1.2 (see the last section of "Introducing WS-I" for the political details), "RPC" became a tainted word.
That's unfortunate. If we look at the big picture -- a request sent to a remote process with a reply usually coming back -- then this is RPC. Without a doubt, most web services implementations are doing SOAP request-response over HTTP. Keep that in mind, along with last month's view of schema use, as I sketch out a replacement for WSDL. Just like RPC, the three letters RSS have all sorts of charged meanings in the web world, so we'll add the extraneous "Web" in the middle, giving us the neutral acronym RSWS.
Service Description
A service description has three parts:
- a schema definition -- what messages look like
- an interface definition -- what methods are provided
- a location definition -- where to find the service
Let's define a container to hold these parts. This container can have a name, which is a URI, and each part can have an ID. The examples here all use rsws as the namespace prefix.
<rsws:description name="http://example.com/rsws/">
  <rsws:schema id="mytypes">
    . . .
  </rsws:schema>
  <rsws:interface id="sample">
    . . .
  </rsws:interface>
  <rsws:location>
    . . .
  </rsws:location>
</rsws:description>
Both the schema and the interface elements can be repeated.
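To make the container concrete, here is a minimal sketch of how a tool might read one. It assumes a made-up namespace URI for the rsws prefix (the examples here only fix the prefix, not the URI) and uses empty parts in place of the elided content.

```python
import xml.etree.ElementTree as ET

# Assumption: no namespace URI is given for the rsws prefix,
# so this one is invented purely for the example.
RSWS_NS = "http://example.com/ns/rsws"
NS = {"rsws": RSWS_NS}

DESCRIPTION = """
<rsws:description xmlns:rsws="http://example.com/ns/rsws"
                  name="http://example.com/rsws/">
  <rsws:schema id="mytypes"/>
  <rsws:schema id="moretypes"/>
  <rsws:interface id="sample"/>
  <rsws:location/>
</rsws:description>
"""

def summarize(xml_text):
    """Return the container name and the ids of its repeatable parts."""
    root = ET.fromstring(xml_text.strip())
    return {
        "name": root.get("name"),
        "schemas": [e.get("id") for e in root.findall("rsws:schema", NS)],
        "interfaces": [e.get("id") for e in root.findall("rsws:interface", NS)],
        "locations": len(root.findall("rsws:location", NS)),
    }

summary = summarize(DESCRIPTION)
print(summary)
```

The second schema element ("moretypes") is hypothetical; it is there only to show that repeated parts come back as a list.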
I don't have a definite idea about how the name attribute would be used. It makes sense, however, that everything should have a name, and that the name should be independent of the location where it can be found. For example, consider the schemaLocation attribute of XML Schema, which has both the namespace and a URL hint.
The id attribute on top-level child elements is there to allow easy reuse. If the previous description is at http://www.example.com/test.rsws, then the href attribute lets us share descriptions:
<rsws:schema href="http://www.example.com/test.rsws#mytypes"/>
I'm leaning against the schemaLocation approach because it's painfully obvious that href is the way to fetch things on the Web. I think XInclude is probably overkill, although I like the fallback element, which lets you provide alternate content if the URL suffered linkrot. On the other hand, XInclude seems to be in limbo, and SOAP 1.2 is defining its own include element for adding binary data to SOAP messages. And there is also the two-value approach of schemaLocation. Faced with all those choices, good old href seems like the way to go.
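Resolving such an href takes very little machinery, which is part of its appeal. The sketch below splits the reference into a document URL and a fragment, then looks for the top-level child carrying that id. The namespace URI, the fetch callback, and the in-memory DOCS table standing in for HTTP are all assumptions made for the example.

```python
from urllib.parse import urldefrag
import xml.etree.ElementTree as ET

def split_href(href, base_url):
    """Split an href into (document URL, fragment); a bare '#id' means base_url."""
    url, fragment = urldefrag(href)
    return (url or base_url, fragment)

def find_by_id(root, fragment):
    """Return the top-level child of a description with a matching id, or None."""
    for child in root:
        if child.get("id") == fragment:
            return child
    return None

def resolve(href, base_url, fetch):
    """Fetch the referenced description and return the part the href points at."""
    doc_url, fragment = split_href(href, base_url)
    root = ET.fromstring(fetch(doc_url))
    return find_by_id(root, fragment)

# Stand-in for fetching over HTTP; the rsws namespace URI is hypothetical.
DOCS = {
    "http://www.example.com/test.rsws": (
        '<rsws:description xmlns:rsws="http://example.com/ns/rsws" '
        'name="http://example.com/rsws/">'
        '<rsws:schema id="mytypes"/>'
        '<rsws:interface id="sample"/>'
        '</rsws:description>'
    ),
}

shared = resolve("http://www.example.com/test.rsws#mytypes",
                 "http://www.example.com/other.rsws", DOCS.__getitem__)
local = resolve("#sample", "http://www.example.com/test.rsws", DOCS.__getitem__)
print(shared.get("id"), local.get("id"))
```

The same function handles both a reference into another file and a same-document reference such as "#sample".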
Schema definitions
In order to be truly inclusive, RSWS should support all sorts of schema languages. We can do this by adding a schemaType attribute to the schema element, whose value is a URI identifying the schema language. For example, RelaxNG compact notation might look like this:
<rsws:schema schemaType="..."><![CDATA[
default namespace = "http://example.com"
element foo { attribute bar { string } }
]]></rsws:schema>
I used CDATA here to show that the schemaType attribute determines the content of the schema definition, and that it doesn't have to be in XML syntax.
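One way a processor could act on schemaType is a simple lookup table from URI to handler. The sketch below assumes a placeholder URI for RelaxNG compact syntax (the example above leaves the real value elided, and so does this article) and a hypothetical rsws namespace URI. The handler does no real parsing; it only shows where a schema-language-specific parser would plug in.

```python
import xml.etree.ElementTree as ET

# Both URIs below are invented for illustration only.
RNC_TYPE = "urn:example:schematype:relaxng-compact"

DOC = """<rsws:schema xmlns:rsws="http://example.com/ns/rsws"
    schemaType="urn:example:schematype:relaxng-compact"><![CDATA[
default namespace = "http://example.com"
element foo { attribute bar { string } }
]]></rsws:schema>"""

def handle_rnc(text):
    # Placeholder: a real tool would hand this text to a RelaxNG compact parser.
    return ("relaxng-compact", text.strip())

HANDLERS = {RNC_TYPE: handle_rnc}

def load_schema(xml_text, handlers):
    """Pick a handler by schemaType and give it the raw element content."""
    elem = ET.fromstring(xml_text)
    schema_type = elem.get("schemaType")
    handler = handlers.get(schema_type)
    if handler is None:
        raise ValueError(f"unsupported schemaType: {schema_type!r}")
    # The parser folds the CDATA section into the element's text.
    return handler(elem.text or "")

kind, body = load_schema(DOC, HANDLERS)
print(kind)
print(body)
```

Because the content is handed over as opaque text, adding another schema language means adding a table entry, not changing the description format.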
One nice aspect of using href and allowing multiple schema elements is that doing so supports a refinement approach to interfaces. A service can import the basic definitions, and then include its own, more precise descriptions for the set of operations that it implements.
Interface definition
An interface is a set of operations, related solely by the fact that they're collected together:
<rsws:interface id="mgmt">
  <rsws:operation>
    <rsws:input ref="tns:fooIn"/>
    <rsws:output ref="tns:fooOut"/>
  </rsws:operation>
  . . .
</rsws:interface>
The word operation isn't completely free of baggage, but it's more neutral than method. The input and output elements define the message exchange pattern. This will almost always be request-response, but I don't see a need to construct a special default form that expresses this in a shorter way. I'm concerned about WSDL's magic defaults, and I shy away from that practice. Also, this is XML, where terseness doesn't seem to be much of a virtue.
To allow multiple and optional responses, we'll use the minOccurs and maxOccurs attributes from XML Schema.
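As a rough illustration, a tool could derive the exchange pattern of each operation from these attributes. The sketch assumes the attributes sit unqualified on rsws:output, that both default to 1 as they do in XML Schema, and that "unbounded" is allowed for maxOccurs. The pattern labels, the extra operations, and the namespace URI are my own inventions for the example.

```python
import xml.etree.ElementTree as ET

NS = {"rsws": "http://example.com/ns/rsws"}  # hypothetical namespace URI

INTERFACE = """
<rsws:interface xmlns:rsws="http://example.com/ns/rsws" id="mgmt">
  <rsws:operation>
    <rsws:input ref="tns:fooIn"/>
    <rsws:output ref="tns:fooOut"/>
  </rsws:operation>
  <rsws:operation>
    <rsws:input ref="tns:logIn"/>
  </rsws:operation>
  <rsws:operation>
    <rsws:input ref="tns:pingIn"/>
    <rsws:output ref="tns:pingOut" minOccurs="0"/>
  </rsws:operation>
  <rsws:operation>
    <rsws:input ref="tns:pollIn"/>
    <rsws:output ref="tns:pollOut" maxOccurs="unbounded"/>
  </rsws:operation>
</rsws:interface>
"""

def occurs(elem):
    """Return (min, max); max is None for 'unbounded'. Both default to 1."""
    lo = int(elem.get("minOccurs", "1"))
    raw = elem.get("maxOccurs", "1")
    hi = None if raw == "unbounded" else int(raw)
    return lo, hi

def classify(operation):
    """Label the message exchange pattern implied by the output element."""
    output = operation.find("rsws:output", NS)
    if output is None:
        return "one-way"
    lo, hi = occurs(output)
    if hi == 0:
        return "one-way"
    if hi is None or hi > 1:
        return "multiple-response"
    if lo == 0:
        return "optional-response"
    return "request-response"

root = ET.fromstring(INTERFACE.strip())
patterns = [classify(op) for op in root.findall("rsws:operation", NS)]
print(patterns)
```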
We could also define an rsws:fault element in order to specify which faults are allowed. But what, exactly, are we specifying? Should we specify the content of the fault -- the detail element -- or is that too much detail? If we specify the fault code, then we have issues with how to reasonably abstract that across different SOAP versions. I think it's important to keep the operation independent of the transport, so we should avoid having to do something like this:
<rsws:fault soap11Faultcode="env:Server"
            soap12Code="env:Receiver"
            soap12Subcode="tns:ErrorValues">
  . . .
</rsws:fault>
It's also nasty that the fault codes are QNames for URI values, while the subcode is a QName for an enumeration type. Given these issues, I think it makes sense to defer fault support to a future version of RSWS.
Individual operations don't have a name or id attribute. We're aggregating operations into an interface; since there doesn't seem to be any compelling need to distinguish individual operations within an interface, there's no need to identify them. For example, if an operation isn't implemented, a server can send back an unimplemented fault. This will result in an extra round-trip for the client, but that seems to be the right trade-off: it's far more common for a server to implement a complete interface, and since there is a reasonable fallback, it's not worth the complexity that would be added to the description language.
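The fallback can be pictured with a small server-side sketch. It rests on an assumption of mine, not something the description language states: that a server recognizes an anonymous operation by the element name of the incoming message, here written as the same strings used in the ref attributes. The fault is represented as a plain dictionary, with no attempt to model a real SOAP fault.

```python
def make_dispatcher(handlers):
    """Build a dispatch function keyed by the input message's element name."""
    def dispatch(input_element, payload):
        handler = handlers.get(input_element)
        if handler is None:
            # No per-operation flag in the description; the client learns
            # about the gap only when it tries the call.
            return {"fault": "unimplemented", "element": input_element}
        return {"result": handler(payload)}
    return dispatch

# A server that implements only part of an interface.
# "tns:barIn" is a hypothetical second operation.
dispatch = make_dispatcher({"tns:fooIn": lambda payload: payload.upper()})

ok = dispatch("tns:fooIn", "hello")
missing = dispatch("tns:barIn", "hello")
print(ok)
print(missing)
```

The cost is the one extra round-trip for the unimplemented call; the description itself stays free of per-operation bookkeeping.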
Location definition
While the previous elements described what the data looks like and when it's to be used, the location element describes how to contact the server. This is where we identify the interfaces which are available, as well as the communication protocols used to communicate with the server. "Location" isn't a perfect name, but it's probably good enough. Clearly, the overall theme of RSWS is to be good enough, rather than comprehensive.
A location element has one child for each interface that it supports. As you can see, we use the standard href/id pattern to refer to an interface.
<rsws:location>
  <rsws:provides href="#mgmt"/>
</rsws:location>
In addition to encouraging data sharing, this approach avoids the complexity of WSDL's multiple namespaces for each type of object name. Within the provides element are transport-specific elements providing communication information. The first defined element is for SOAP over HTTP:
<rsws:provides href="#mgmt">
  <rsws:httpsoap soapVersion="1.1">
    <url>/cgi-bin/mgmt</url>
    <ssl required="true">
      <issuerDN>C=US, O=Foo, CN=SSL CA</issuerDN>
    </ssl>
  </rsws:httpsoap>
</rsws:provides>
As you might expect, soapVersion defines the version of SOAP to use. Note that the URL is missing a protocol identifier, host, and port. The protocol identifier is specified by the use of the httpsoap element. Leaving out the host and port means, respectively, that the host is whatever host the client is talking to and that the standard port is used. This position-independent URL style allows a single RSWS file to define services on multiple hosts.
The ssl element -- purists may want to name it tls -- is an example of how security requirements might be specified. In the example above, we are saying that SSL is required, and that the server certificate will be signed by the specified CA. Other elements would include WS-Security requirements, XACML statements, and so on.
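Putting the pieces of the httpsoap element together, a client might compute the endpoint roughly as follows. This is a sketch under assumptions: the scheme is https when ssl is marked required and http otherwise, the host is passed in by the caller as "whatever host the client is talking to", and the port is left out so the scheme's standard port applies. The host names and the rsws namespace URI are made up, and the issuerDN check on the server certificate is not attempted here.

```python
from urllib.parse import urlunsplit
import xml.etree.ElementTree as ET

NS = {"rsws": "http://example.com/ns/rsws"}  # hypothetical namespace URI

PROVIDES = """
<rsws:provides xmlns:rsws="http://example.com/ns/rsws" href="#mgmt">
  <rsws:httpsoap soapVersion="1.1">
    <url>/cgi-bin/mgmt</url>
    <ssl required="true">
      <issuerDN>C=US, O=Foo, CN=SSL CA</issuerDN>
    </ssl>
  </rsws:httpsoap>
</rsws:provides>
"""

def endpoint_url(httpsoap, current_host):
    """Combine the position-independent path with the host being contacted."""
    ssl_elem = httpsoap.find("ssl")
    required = ssl_elem is not None and ssl_elem.get("required") == "true"
    scheme = "https" if required else "http"
    path = httpsoap.findtext("url")
    # No port in the netloc, so the standard port for the scheme is implied.
    return urlunsplit((scheme, current_host, path, "", ""))

httpsoap = ET.fromstring(PROVIDES.strip()).find("rsws:httpsoap", NS)
first = endpoint_url(httpsoap, "server1.example.com")
second = endpoint_url(httpsoap, "server2.example.com")
print(first)
print(second)
```

The same description yields a different endpoint for each host, which is the point of leaving the host out of the url element.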
If every interface provided by a server has the same communication parameters, we could define an httpsoapdefault element to hold them; thus, in most cases, the provides element only has to reference the interface definition, as shown at the start of this section.
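A lookup with that default might work as sketched below. Where httpsoapdefault would live is not settled above; this example assumes it is a direct child of location, and that a provides element with its own httpsoap child overrides it. The "#billing" interface, its paths, and the namespace URI are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"rsws": "http://example.com/ns/rsws"}  # hypothetical namespace URI

LOCATION = """
<rsws:location xmlns:rsws="http://example.com/ns/rsws">
  <rsws:httpsoapdefault soapVersion="1.1">
    <url>/cgi-bin/soap</url>
  </rsws:httpsoapdefault>
  <rsws:provides href="#mgmt"/>
  <rsws:provides href="#billing">
    <rsws:httpsoap soapVersion="1.2">
      <url>/cgi-bin/billing</url>
    </rsws:httpsoap>
  </rsws:provides>
</rsws:location>
"""

def transport_for(location, provides):
    """Prefer the interface's own httpsoap settings, else the shared default."""
    own = provides.find("rsws:httpsoap", NS)
    if own is not None:
        return own
    return location.find("rsws:httpsoapdefault", NS)

location = ET.fromstring(LOCATION.strip())
table = {}
for provides in location.findall("rsws:provides", NS):
    transport = transport_for(location, provides)
    table[provides.get("href")] = (transport.get("soapVersion"),
                                   transport.findtext("url"))
print(table)
```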
Conclusion
I've sketched a brief example of an interface definition language for web services. It is motivated by a desire to avoid the excesses of WSDL and to obtain an interface language that does a better job of encouraging reuse and of supporting the typeless, or document, style of schema definitions that will be seeing increasing use. It would be nice if RSWS were further refined, and if some tools became available. While that is unlikely, RelaxNG shows that it is possible for systems that compete with W3C specifications to gain some traction and market share of their own.