Really Simple Web Service Descriptions

October 14, 2003

Rich Salz

Last month I discussed the trend of using the schema to describe the shape of the XML document, thus avoiding the W3C XML Schema (WXS) type system. This fits nicely with the so-called Service Oriented Architecture (SOA), which claims that exchanging XML documents leads to a more loosely-coupled architecture than exchanging strongly-typed RPC messages.

I take a broader view, seeing almost any distributed request-response message exchange as a kind of remote procedure call. I do this because all of the really hard parts -- network connectivity, distributed authentication, and so on -- are still there no matter what bytes are being exchanged. This seems to be a minority viewpoint, however. One reason is that so much was made of "RPC Encoding" in the SOAP 1.1 document that, when it was dropped from SOAP 1.2, WS-I, and WSDL 1.2 (see the last section of "Introducing WS-I" for the political details), "RPC" became a tainted word.

That's unfortunate. If we look at the big picture -- a request sent to a remote process with a reply usually coming back -- then this is RPC. Without a doubt, most web services implementations are doing SOAP request-response over HTTP. Keep that in mind, along with last month's view of schema use, as I sketch out a replacement for WSDL. Just like RPC, the three letters RSS have all sorts of charged meanings in the web world, so we'll add the extraneous "Web" in the middle, giving us the neutral acronym RSWS.

Service Description

A service description has three parts:

  1. a schema definition -- what messages look like
  2. an interface definition -- what methods are provided
  3. a location definition -- where to find the service

Let's define a container to hold these parts. This container can have a name, which is a URI, and each part can have an ID. The examples here all use rsws as the namespace prefix.


    <rsws:description name="http://example.com/rsws/">
        <rsws:schema id="mytypes">
            . . .
        </rsws:schema>
        <rsws:interface id="sample">
            . . .
        </rsws:interface>
        <rsws:location>
            . . .
        </rsws:location>
    </rsws:description>

Both the schema and the interface elements can be repeated.

I don't have a definite idea about how the name attribute would be used. It makes sense, however, that everything should have a name, and that the name should be independent of the location where it can be found. For example, consider the schemaLocation attribute of XML Schema, which has both the namespace and a URL hint.

The id attribute on top-level child elements is there to allow easy reuse. If the previous description is at http://www.example.com/test.rsws, then the href attribute lets us share descriptions:


    <rsws:schema href="http://www.example.com/test.rsws#mytypes"/>

I'm leaning against the schemaLocation approach because it's painfully obvious that href is the way to fetch things on the Web. I think XInclude is probably overkill, although I like the fallback element, which lets you provide alternate content if the URL has suffered linkrot. On the other hand, XInclude seems to be in limbo, and SOAP 1.2 is defining its own include element for adding binary data to SOAP messages. There is also the two-value approach of schemaLocation.

Faced with all those choices, good old href seems like the way to go.

Schema definitions

In order to be truly inclusive, RSWS should support all sorts of schema languages. We can do this by adding a schemaType attribute to the schema element; its value is a URI identifying the schema language. For example, RelaxNG compact notation might look like this:


    <rsws:schema schemaType="..."><![CDATA[
        default namespace = "http://example.com"
        element foo { attribute bar { string } }
    ]]></rsws:schema>

I used CDATA here to show that the schemaType attribute determines the content of the schema definition, and that the content doesn't have to be in XML syntax.
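For comparison, here is a sketch of how roughly the same definition might be embedded when the schema language does use XML syntax, in this case W3C XML Schema. Using the XML Schema namespace URI as the schemaType value and the namespace URI bound to the rsws prefix are both assumptions made for illustration; the article does not define either.

```xml
<!-- The rsws namespace URI and the schemaType value are hypothetical. -->
<rsws:schema id="mytypes" schemaType="http://www.w3.org/2001/XMLSchema"
    xmlns:rsws="http://example.com/ns/rsws">
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
        targetNamespace="http://example.com">
        <!-- Element foo with a required string attribute bar. -->
        <xs:element name="foo">
            <xs:complexType>
                <xs:attribute name="bar" type="xs:string" use="required"/>
            </xs:complexType>
        </xs:element>
    </xs:schema>
</rsws:schema>
```

No CDATA section is needed here, since the embedded schema is itself well-formed XML.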

One nice aspect of using href and allowing multiple schema elements is that doing so supports a refinement approach to interfaces. A service can import the basic definitions, and then include its own, more precise descriptions for the set of operations that it implements.
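A rough sketch of that refinement pattern follows: one schema element pulls in shared definitions by reference, and a second adds local ones. The description name, the localtypes id, and the namespace URI bound to the rsws prefix are placeholders assumed for illustration, and the schemaType value is left elided as in the earlier example.

```xml
<!-- The rsws namespace URI and the description name are hypothetical. -->
<rsws:description name="http://example.com/rsws/refined"
    xmlns:rsws="http://example.com/ns/rsws">
    <!-- Import the basic definitions shared by several services. -->
    <rsws:schema href="http://www.example.com/test.rsws#mytypes"/>
    <!-- Add more precise definitions specific to this service. -->
    <rsws:schema id="localtypes" schemaType="...">
        . . .
    </rsws:schema>
</rsws:description>
```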

Interface definition

An interface is a set of operations, related solely by the fact that they're collected together:


    <rsws:interface id="mgmt">
        <rsws:operation>
            <rsws:input ref="tns:fooIn"/>
            <rsws:output ref="tns:fooOut"/>
        </rsws:operation>
        . . .
    </rsws:interface>

The word operation isn't completely free of baggage, but it's more neutral than method.

The input and output elements define the message exchange pattern. This will almost always be request-response, but I don't see a need to construct a special default form that expresses this in a shorter way. WSDL's magic defaults concern me, and I shy away from that practice. Also, this is XML, where terseness doesn't seem to be much of a virtue.

To allow multiple and optional responses, we'll use the minOccurs and maxOccurs attributes from XML Schema.
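As an illustration, an operation with an optional reply and one with possibly many replies might look like the sketch below. Placing the attributes unqualified on rsws:output is an assumption, as are the message names (notifyIn, queryResult, and so on) and the namespace URIs bound to rsws and tns. The values follow XML Schema conventions, where unbounded means no upper limit.

```xml
<!-- Namespace URIs and message names here are hypothetical. -->
<rsws:interface id="mgmt"
    xmlns:rsws="http://example.com/ns/rsws"
    xmlns:tns="http://example.com">
    <!-- Optional response: zero or one reply. -->
    <rsws:operation>
        <rsws:input ref="tns:notifyIn"/>
        <rsws:output ref="tns:notifyAck" minOccurs="0" maxOccurs="1"/>
    </rsws:operation>
    <!-- Multiple responses: at least one reply, with no upper limit. -->
    <rsws:operation>
        <rsws:input ref="tns:queryIn"/>
        <rsws:output ref="tns:queryResult" minOccurs="1" maxOccurs="unbounded"/>
    </rsws:operation>
</rsws:interface>
```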

We could also define an rsws:fault element in order to specify which faults are allowed. But what, exactly, are we specifying? Should we specify the content of the fault -- the detail element -- or is that too much detail? If we specify the fault code, then we have issues with how to reasonably abstract that across different SOAP versions. I think it's important to keep the operation independent of the transport, so we should avoid having to do something like this:


    <rsws:fault soap11Faultcode="env:Server"
        soap12Code="env:Receiver" soap12Subcode="tns:ErrorValues">
        . . .
    </rsws:fault>

It's also nasty that the fault code operations are QNames for URI values, while the subcode is a QName for an enumeration type. Given these issues, I think it makes sense to defer fault support to a future version of RSWS.

Individual operations don't have a name or id attribute. We're aggregating operations into an interface; since there doesn't seem to be any compelling need to distinguish individual operations within an interface, there's no need to identify them. For example, if an operation isn't implemented, a server can send back an unimplemented fault. This will result in an extra round-trip for the client, but that seems to be the right trade-off: it's far more common for a server to implement a complete interface, and since there is a reasonable fallback, it's not worth the complexity that would be added to the description language.

Location definition

While the previous elements described what the data looks like and when it's to be used, the location element describes how to contact the server. This is where we identify the interfaces that are available, as well as the protocols used to communicate with the server. "Location" isn't a perfect name, but it's probably good enough. Clearly, the overall theme of RSWS is to be good enough, rather than comprehensive.

A location element has one child for each interface that it supports. As you can see, we use the standard href/id pattern to refer to an interface.


    <rsws:location>
        <rsws:provides href="#mgmt"/>
    </rsws:location>

In addition to encouraging data sharing, this approach avoids the complexity of WSDL's multiple namespaces for each type of object name. Within the provides element are transport-specific elements providing communication information. The first defined element is for SOAP over HTTP:


    <rsws:provides href="#mgmt">
        <rsws:httpsoap soapVersion="1.1">
            <url>/cgi-bin/mgmt</url>
            <ssl required="true">
                <issuerDN>C=US, O=Foo, CN=SSL CA</issuerDN>
            </ssl>
        </rsws:httpsoap>
    </rsws:provides>

As you might expect, soapVersion defines the version of SOAP to use. Note that the URL is missing a protocol identifier, host, and port. The protocol identifier is specified by the use of the httpsoap element. Leaving out the host and port means, respectively, that the host is whatever host the client is talking to and that the standard port is used. This position-independent URL style allows a single RSWS file to define services on multiple hosts.

The ssl element -- purists may want to name it tls -- is an example of how security requirements might be specified. In the example above, we are saying that SSL is required, and that the server certificate will be signed by the specified CA. Other elements would include WS-Security requirements, XACML statements, and so on.

If every interface provided by a server has the same communication parameters, we could define an httpsoapdefault element to hold them; thus, in most cases, the provides element only has to reference the interface definition, as shown at the start of this section.
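A sketch of what that might look like follows. Making httpsoapdefault a child of location, and giving it the same attribute and children as httpsoap, are assumptions, since the element is only proposed here. The namespace URI bound to rsws is likewise a placeholder.

```xml
<!-- The rsws namespace URI and the placement of httpsoapdefault are hypothetical. -->
<rsws:location xmlns:rsws="http://example.com/ns/rsws">
    <!-- Communication parameters shared by every interface below. -->
    <rsws:httpsoapdefault soapVersion="1.1">
        <url>/cgi-bin/mgmt</url>
        <ssl required="true">
            <issuerDN>C=US, O=Foo, CN=SSL CA</issuerDN>
        </ssl>
    </rsws:httpsoapdefault>
    <!-- Each provides element only references its interface. -->
    <rsws:provides href="#mgmt"/>
    <rsws:provides href="#sample"/>
</rsws:location>
```

An interface that needs different parameters could presumably still carry its own httpsoap child, which would take precedence over the defaults.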

Conclusion

I've sketched a brief example of an interface definition language for web services. It is motivated by a desire to avoid the excesses of WSDL and to obtain an interface language that does a better job of encouraging reuse and of supporting the typeless, or document, style of schema definition that will be seeing increasing use. It would be nice if RSWS were further refined, and if some tools became available. While that is unlikely, RelaxNG shows that it is possible for systems that compete with W3C specifications to gain some traction and market share of their own.