Web Services and Sessions

July 22, 2003

Introduction

Web services are becoming an important tool for solving enterprise application and business-to-business integration problems. An enterprise application is usually exposed to the outside world as a single monolithic service, which can receive request messages and possibly return response messages, as determined by some contract. Such services are designed according to the principles of a service-oriented architecture. They can be either stateless or stateful. Stateful services can be useful, for example, for supporting conversational message exchange patterns and are usually instance or session-based, but they are monolithic in the sense that the session instantiation is always implicit.

However, when building a stateful service, it is often tempting, given some experience with CORBA or DCOM, to design it the way the underlying application has been designed. With traditional distributed technologies, complex systems are often designed with finer granularity, as session-oriented, where a single service object, acting as a factory, returns new or existing session objects to clients.

In general, a service-oriented approach (simple interactions, complex messages) may be better suited to building stateful web services, especially in the bigger B2B world, where integration is normally achieved through an exchange of XML documents. Coarse-grained services, with their API expressed in terms of the document exchange, are likely to be more suitable for creating loosely coupled, scalable and easily composable systems.

Yet there still exists a certain class of applications which might be better exposed in a traditional session-oriented manner. Sometimes a cleaner design can be achieved by assigning orthogonal sets of functionality to separate services, and thus using simpler XML messages. Such web services are fine-grained. They may not be designed for general large-scale consumption but, rather, for cooperation with some specific external clients. The integration scope of such services can be limited in many cases to a single enterprise, or even to a single department or a small project. However, this route inevitably leads to state management, distributed resource management, and scalability problems, which are likely to be much more serious than the same problems in traditional distributed programming.

These problems don't mean you should necessarily avoid a fine-grained web services design. If you believe that for a particular use case a fine-grained design can result in a better interface, and that a reasonable compromise with respect to those problems can be achieved, then such a route should at least be explored.

It is likely we'll see some standardization efforts in this area of state and resource management in the near future. Meanwhile, this article will look at ways of building stateful web services. In particular we highlight different ways of defining service references and identifying individual sessions.

How We Did It: URLs Can Help

We originally developed a COM-based device management software framework for controlling video-processing systems, which our company produces. A client can request a connection to a particular device from a service factory object, which returns either a new device session object or an existing one.

By the time we had to write a front-end to this system I happened to come across my first SOAP article, A Young Person's Guide to SOAP by Don Box. After some consideration we decided that a client application should talk to devices using SOAP over HTTP, rather than DCOM. This decision was based not only on the eternal programmers' desire to try a hot, new technology, but also in the hope that it would be easier to integrate our system with different third-party software applications.

We designed our first production web service the same way the encapsulated system was designed: a manager service returns a device key from a local map of device sessions in response to a connection request, and the client passes that key as part of the request URL to a device session service. We also tried to describe our service in WSDL as soon as it appeared.

<definitions ...>
<!-- not all messages are shown -->
<message name="connectRequest">
<!-- parts omitted for brevity -->
</message>
<message name="connectResponse">
<part name="deviceKey" type="xsd:long"/>
</message>
<!-- port types, not all operations are shown -->
<portType name="ICommManagerPortType">
<operation name="connect">
 <input message="tns:connectRequest"/>
 <output message="tns:connectResponse"/>
</operation>
</portType>
<portType name="ICommSessionPortType">
<!-- session-specific operations are not shown -->
</portType>
<!-- bindings are omitted for brevity -->
<service name="ZManagerService">
<port binding="zm:ICommManagerBinding" name="ICommManagerPort">
<soap:address location="http://www.zandar.com/devices"/> 
</port>
</service>
<service name="ZSessionService">
<port binding="zm:ICommSessionBinding" name="ICommSessionPort">
 <soap:address location="http://www.zandar.com/devices"/> 
</port>
</service>
</definitions>

What is visible in the above fragment is that there's no way to specify in WSDL 1.1 that an instance of a particular abstract port type (such as ICommManagerPortType) returns or accepts a reference to an instance of another port type (such as ICommSessionPortType) as part of some message.

As a result, an extra burden falls on the client-side programmer, because a proxy builder does not know how these port types relate to each other. Here's some client code in C#.

ZManagerService zmanagerService = new ZManagerService();
// connect to a device
long deviceKey = zmanagerService.connect();
ZSessionService zSession = new ZSessionService();
// explicitly modify zSession endpoint address
zSession.Url = zmanagerService.Url + "?DeviceKey=" + deviceKey;
// talk to device
SystemProps sysProps = zSession.getSystemProps();
// explicitly disconnect
zSession.disconnect(deviceKey);
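
The server side of this exchange can be sketched roughly as follows, in Java. This is only an illustration under stated assumptions, not the original implementation: the class name DeviceManagerSketch, the device name parameter, and the helper methods are hypothetical; only the "?DeviceKey=" URL convention and the endpoint address come from the fragments above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a manager that hands out device keys from a local map.
class DeviceManagerSketch {
    private final Map<String, Long> keysByDevice = new HashMap<>();
    private long nextKey = 1;

    // Returns the existing key for a device, or allocates a new one.
    synchronized long connect(String deviceName) {
        Long key = keysByDevice.get(deviceName);
        if (key == null) {
            key = nextKey++;
            keysByDevice.put(deviceName, key);
        }
        return key;
    }

    // Builds the session endpoint by appending the key as a query parameter.
    static String sessionUrl(String managerUrl, long deviceKey) {
        return managerUrl + "?DeviceKey=" + deviceKey;
    }

    // Extracts the key from a session request URL; returns -1 if it is absent.
    static long deviceKeyFrom(String requestUrl) {
        String marker = "?DeviceKey=";
        int i = requestUrl.indexOf(marker);
        if (i < 0) {
            return -1;
        }
        return Long.parseLong(requestUrl.substring(i + marker.length()));
    }

    public static void main(String[] args) {
        DeviceManagerSketch manager = new DeviceManagerSketch();
        long key = manager.connect("mixer-1"); // device name is an assumption
        String url = sessionUrl("http://www.zandar.com/devices", key);
        System.out.println(url);
        System.out.println(deviceKeyFrom(url));
    }
}
```

The session service would use the extracted key to look up the device session before dispatching the request.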

This design was strongly influenced by our COM experience. If we were to build the service today, we would do it differently. We'd use a device name rather than an opaque number to help the service identify the session. We would consider creating a wrapper around our COM objects and exposing its interface instead, passing a device name as part of the application data and completely hiding the way our native implementation is organized. One reservation about this approach is that orthogonal operations, such as connect() and send(), would likely have to be put into a single interface, as it may not be possible for a service to connect automatically as part of the first control request.

Should a service (port) reference be described in WSDL or in another specification layered on top of WSDL? It seems that a WSDL document is a good place for describing references which allow for simple unconstrained interactions between an application client and a service, and hence do not require complex runtime support. This would make the above code fragment look like this instead.

ZSessionService zSession = zmanagerService.connect();
SystemProps sysProps = zSession.getSystemProps();

However, more complex interactions and dependencies between services can be described by a higher-level language, such as Business Process Execution Language for Web Services (BPEL4WS).

Passing Session Keys in Messages

When building fine-grained session-oriented web services, one of the issues which needs to be addressed is how to identify an individual session.

One approach is to explicitly pass instance keys as part of request messages to a session service. Consider the following two interfaces:

interface IAccount {
void credit (long amount);
}
interface IBankManager {
IAccount openAccount();
}

A corresponding WSDL description may look like this:

<definitions ...>
<!-- non-standard SessionKey type is defined in the types section -->
<message name="openAccountRequest"/>
<message name="openAccountResponse">
  <part name="sessionKey" type="xsd1:SessionKey"/>
</message>
<message name="creditRequest">
 <part name="sessionKey" type="xsd1:SessionKey"/>
 <part name="amount" type="xsd:long"/>
</message>
<portType name="IBankManager">
 <operation name="openAccount">
  <input message="tns:openAccountRequest"/>
  <output message="tns:openAccountResponse"/>
</operation>
</portType>
<portType name="IAccount">
 <operation name="credit">
  <input message="tns:creditRequest"/>
  <!-- output may also be present --> 
 </operation>
</portType>
</definitions>

SessionKey is returned in the response to an openAccount request and passed as part of a credit operation. Its format varies between toolkits; for example, it can be an integer or a GUID. A proxy automatically adds the key to each request, so it is visible only on the wire and not in the client code, which can now look like this:

IAccount account = bankManager.openAccount();
account.credit(1000);

This approach requires that the same toolkit (and often the same language) be used on the client and server sides, because instance tokens, as well as the mechanisms for mapping them to message fields, are not standardized.
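
What such a proxy does can be illustrated with a small Java sketch. It assumes an in-memory stand-in for the wire and a GUID-style key; the names SessionKeyProxySketch, Wire and AccountProxy are hypothetical and do not belong to any particular toolkit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch: a client proxy that adds a session key to every request.
class SessionKeyProxySketch {
    interface IAccount {
        void credit(long amount);
    }

    // Stand-in for the wire: records each request as a map of message parts.
    static final class Wire {
        final List<Map<String, String>> sent = new ArrayList<>();

        void send(String operation, Map<String, String> parts) {
            Map<String, String> message = new HashMap<>(parts);
            message.put("operation", operation);
            sent.add(message);
        }
    }

    // The proxy holds the key so that client code never sees it.
    static final class AccountProxy implements IAccount {
        private final Wire wire;
        private final String sessionKey;

        AccountProxy(Wire wire, String sessionKey) {
            this.wire = wire;
            this.sessionKey = sessionKey;
        }

        @Override
        public void credit(long amount) {
            Map<String, String> parts = new HashMap<>();
            parts.put("sessionKey", sessionKey);
            parts.put("amount", Long.toString(amount));
            wire.send("credit", parts);
        }
    }

    // A real proxy would read the key from openAccountResponse;
    // this sketch generates one locally (an assumption).
    static IAccount openAccount(Wire wire) {
        return new AccountProxy(wire, UUID.randomUUID().toString());
    }

    public static void main(String[] args) {
        Wire wire = new Wire();
        IAccount account = openAccount(wire);
        account.credit(1000);
        System.out.println(wire.sent.get(0));
    }
}
```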

Using SOAP Metadata

Possibly the strongest side of SOAP, which can swing the decision on how to do web services in its favor, is its extension mechanism, which makes it possible to carry metadata related to a message body's content. It allows for building sophisticated, highly distributed web services, which is more difficult to achieve with pure XML content alone.

Using SOAP metadata presents another alternative to the way session identification tokens are passed around. Systinet, a company emerging as one of the major players on the web services market, was the first to add remote references into its line of WASP Servers. For example, consider the following two Java interfaces:

//this interface is implemented by a CounterFactoryImpl class
public interface CounterFactory {
Counter createCounter();
}
public interface Counter extends java.rmi.Remote {
int getCount();
void increment();
}

At deployment time, only a WSDL for a CounterFactoryImpl service is generated:

<wsdl:definitions ...>
<wsdl:types>
<schema targetNamespace="http://idoox.com/interface" ...>
<complexType name="serviceReference">
<sequence>
<!-- first two elements are not shown -->  
<element name="wsdl" type="anyURI"/>
<element name="instanceID" type="string"/>
</sequence>
</complexType>
<element name="instance">
<complexType>
<choice>
<element name="id" type="string"/>
<!-- last two elements are not shown --> 
</choice>
</complexType>
</element>
</schema>
</wsdl:types>
<wsdl:message name="createCounterRequest"/>
<wsdl:message name="createCounterResponse">
<!-- serviceRef_res is declared in a separate schema -->
<wsdl:part element="ns0:serviceRef_res" name="response"/>
</wsdl:message>
<wsdl:portType name="CounterFactoryImpl">
<wsdl:operation name="createCounter">
<wsdl:input message="tns:createCounterRequest"/>
<wsdl:output message="tns:createCounterResponse"/>
</wsdl:operation>
</wsdl:portType>
<!-- binding and service elements are not shown for brevity -->
</wsdl:definitions>

When a client sends a createCounter request, an instance of a serviceReference type is returned, which includes an identifier of a Counter service instance and a URI of its automatically generated WSDL. This eventually results in a client stub being created (this is an example of a static runtime binding). Using an <instance> extension, the instance ID can then be passed as a value of its <id> child element during communication with a specific Counter service.

<e:Envelope xmlns:e="http://schemas.xmlsoap.org/soap/envelope/">
 <e:Header>
   <n0:instance xmlns:n0="http://idoox.com/interface">
     <id>localhost:2002.07.17 03:57:09 GMT:16A26A3EC87D818A:9</id>
   </n0:instance>
 </e:Header> 
 <e:Body>
  <n0:getCount
  xmlns:n0="http://localhost/RemoteReferencesDemo/wsdl/demo.remote_references.CounterE9A7C2Counter"/>
 </e:Body>
</e:Envelope>

The lifecycle of a newly created service can be managed either explicitly, for example, by calling an endCounter method on a Counter service, or implicitly, using an instance of LifeCycleService, which can decide when a particular instance should time out, for example, when it has not been called for a specified period of time.
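
The two lifecycle styles described above can be sketched in Java as follows. This is not the WASP LifeCycleService API; IdleTimeoutSketch and its methods are hypothetical names, and the caller supplies the clock so the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical sketch of explicit and implicit (idle timeout) instance lifecycle.
class IdleTimeoutSketch {
    private final long timeoutMillis;
    private final Map<String, Long> lastCalled = new HashMap<>();

    IdleTimeoutSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Records a call on an instance; the entry is created on first use.
    void touch(String instanceId, long nowMillis) {
        lastCalled.put(instanceId, nowMillis);
    }

    // Explicit end of life, comparable to calling an end method on the instance.
    boolean end(String instanceId) {
        return lastCalled.remove(instanceId) != null;
    }

    // Implicit end of life: removes instances idle for longer than the timeout
    // and returns how many were removed.
    int sweep(long nowMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastCalled.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() > timeoutMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    boolean isAlive(String instanceId) {
        return lastCalled.containsKey(instanceId);
    }

    public static void main(String[] args) {
        IdleTimeoutSketch registry = new IdleTimeoutSketch(1000);
        registry.touch("counter-1", 0);
        registry.touch("counter-2", 900);
        System.out.println(registry.sweep(1500));
        System.out.println(registry.isAlive("counter-2"));
    }
}
```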

In general, this approach is a fairly serious attempt to apply a remote references idiom to web services design. Using SOAP headers for passing state data is guaranteed to work for all SOAP-based web services, irrespective of whether a message is sent within an application protocol, such as HTTP, or directly over transport protocols. An initial sender can also send such data to a receiver separated from it by multiple hops.

However, being a proprietary solution, it also has some drawbacks. It requires a client to understand the semantics of the underlying server application and to use the same toolkit as is used on the server side. It would be great if it were possible for a client to get all the required knowledge from a WSDL description only, which then can be processed by a client's SOAP environment when required.

On the Grid

The Globus Project, led by Argonne National Laboratory and the University of Southern California's Information Sciences Institute, is a research and development project focused on defining and building Grid middleware standards and software that can be applied to a broad range of scientific and technical computing applications. Open Grid Services Architecture (OGSA), the result of an ongoing collaboration between the Globus Project and IBM, is an effort to define a Grid system architecture based on an integration of Grid and Web services concepts and technologies.

OGSA defines a Grid service as an instance-based Web Service that conforms to a set of conventions (interfaces and behaviors) that define, in terms of WSDL, mechanisms required for creating and composing sophisticated distributed systems. It standardizes on a factory/instance pattern for managing stateful service instances, and deals with such issues as instance identification, discovery, lifetime management and, soon, provisioning of resources.

The Grid Services specification introduces the notion of a handle and a reference. A handle is a permanent, globally unique identifier for a Grid service instance and is represented as a URI. It is returned by a factory and must be resolved, using a URI-specific protocol (for example, HTTP GET for an HTTP URI), to a reference to the service instance. It is possible to discover what type of service instances a factory creates before asking it to create an instance.

While references can take a variety of forms specific to the binding mechanism of a service, a standard form defined by OGSA that is expected to be used widely is a WSDL document. It is recommended to contain only a <service> element which can be used to discover what port types a service instance implements and get required port and binding details. Resolving a service handle into such a minimal WSDL fragment is likely to lead to a more efficient and type-safe static (and possibly dynamic) runtime binding, especially when a client chooses between different instances of the same service. Depending on the definition of the selected binding, a service instance can be actually identified in a number of ways, for example, by passing the handle in a SOAP header.

A reference is transient; that is, its lifecycle may be independent of the lifecycle of the service instance it refers to. An important point is that the same handle to a specific instance can be resolved to a new reference after the original one becomes invalid (for example, after the service port or binding has changed), which allows for improved load balancing and scalability.
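
The handle and reference relationship can be sketched in Java as follows. The sketch simplifies a reference to a plain endpoint address, whereas the standard form described above is a WSDL document, and all class names, method names and example addresses are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a permanent handle re-resolved to a transient reference.
class HandleResolutionSketch {
    // Maps each permanent handle to its current reference.
    static final class Resolver {
        private final Map<String, String> current = new HashMap<>();

        void publish(String handle, String reference) {
            current.put(handle, reference);
        }

        String resolve(String handle) {
            String reference = current.get(handle);
            if (reference == null) {
                throw new IllegalArgumentException("unknown handle: " + handle);
            }
            return reference;
        }
    }

    // Caches references and resolves again only after one is invalidated.
    static final class Client {
        private final Resolver resolver;
        private final Map<String, String> cache = new HashMap<>();
        int resolutions = 0;

        Client(Resolver resolver) {
            this.resolver = resolver;
        }

        String referenceFor(String handle) {
            String reference = cache.get(handle);
            if (reference == null) {
                reference = resolver.resolve(handle);
                cache.put(handle, reference);
                resolutions++;
            }
            return reference;
        }

        // Called when a request against the cached reference fails.
        void invalidate(String handle) {
            cache.remove(handle);
        }
    }

    public static void main(String[] args) {
        Resolver resolver = new Resolver();
        resolver.publish("http://grid.example/handles/42", "http://node-a.example/instance");
        Client client = new Client(resolver);
        System.out.println(client.referenceFor("http://grid.example/handles/42"));
        // The instance moves; the handle stays the same.
        resolver.publish("http://grid.example/handles/42", "http://node-b.example/instance");
        client.invalidate("http://grid.example/handles/42");
        System.out.println(client.referenceFor("http://grid.example/handles/42"));
    }
}
```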

The specification defines a flexible lifecycle management scheme, allowing clients to destroy service instances either by explicitly invoking a destroy operation or by using a soft-state approach, in which an instance expires unless its lifetime is extended.

The "Business" Way

Good examples of a coarse-grained, service-oriented approach toward building stateful services can be found in today's B2B integration solutions.

Consider the following simplified fragment of a purchase order service definition.

<definitions ...>
<types ...>
 <schema ...>
  <complexType name="PurchaseOrder">
    <element name="orderId" type="xsd:int"/>
    <!-- other elements are not shown --> 
  </complexType> 
 </schema>
</types>
<message name="PORequest">
  <part name="PurchaseOrder" type="xsd1:PurchaseOrder"/>
</message>
<!-- POResponse message is not shown -->
<portType name="PurchaseOrderPortType">
 <operation name="submitOrder">
  <input message="tns:PORequest"/>
  <!-- output is not defined for asynchronous responses --> 
  <output message="tns:POResponse"/>
</operation>
</portType>
</definitions>

This service can process purchase order requests from its clients synchronously or asynchronously. An initial request can lead to a multi-step conversation between a service and a client. The service needs to maintain conversational state for each individual request and is instance (or session) based.

The value of a simple token, orderId, which is part of a business document instance, is associated with an implicitly created session and is used during the subsequent conversation regarding a specific purchase order. In this example, it is the responsibility of a buyer to initialize the orderId field. A token may also be passed within a SOAP header.

The BPEL4WS specification, which allows the exposure of a business process as a stateful compound service, generalizes this approach by introducing message correlation sets. These are named groups of abstract properties that are mapped by XPath selections to fields within the message data. A correlation set is instantiated within the scope of a business process instance as part of some initial operation and uniquely identifies this instance during an application-level conversation afterwards. For a good and clear explanation of how it all works please read the relevant parts of the specification.
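
The idea behind a correlation set can be illustrated with a small Java sketch that uses the standard javax.xml.xpath API. It is not BPEL4WS syntax or a BPEL4WS runtime: the CorrelationSketch class, the CancelOrder message and the XPath expressions are hypothetical, and the messages are kept free of namespaces for brevity.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch: routing messages to conversations by a correlated property.
class CorrelationSketch {
    // One message log per conversation, keyed by the correlation value.
    private final Map<String, List<String>> conversations = new HashMap<>();

    // Selects a property value from a message with an XPath expression.
    static String select(String xml, String path) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(path, doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot evaluate " + path, e);
        }
    }

    // Delivers a message to the conversation identified by the selected value,
    // creating the conversation on first use. Returns the number of messages
    // that conversation has received so far.
    int deliver(String xml, String path) {
        String key = select(xml, path);
        List<String> log = conversations.computeIfAbsent(key, k -> new ArrayList<>());
        log.add(xml);
        return log.size();
    }

    public static void main(String[] args) {
        CorrelationSketch router = new CorrelationSketch();
        router.deliver("<PurchaseOrder><orderId>42</orderId></PurchaseOrder>",
                "/PurchaseOrder/orderId");
        // A different message type maps the same property through a different path.
        int count = router.deliver("<CancelOrder><orderId>42</orderId></CancelOrder>",
                "/CancelOrder/orderId");
        System.out.println(count);
    }
}
```

Both messages land in the same conversation because each message type supplies its own path to the shared orderId property.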

BPEL4WS 1.1 uses endpoint references defined by the WS-Addressing specification. These references may include the qualified name of a wsdl:service element, which itself can be inlined within the service reference if it's not already known. A reference may also contain instance-specific properties, such as an instance identifier; however, they're not strictly required to be present when a reference refers to another BPEL4WS-enabled service, because correlation sets can identify specific instances.

A client typically cannot directly affect the lifecycle of an individual session, unless some custom conversational protocol is used or a close operation is added. For example, BPEL4WS says that a business process instance can die after an activity it executes has ended normally or as a result of an exception, etc.

Using a single monolithic service is good for supporting peer-to-peer interactions between business partners, but it may not be ideal in other situations.

For example, consider a banking web service which allows the opening of an account and retrieval of the balance. An open request (possibly conversational) is orthogonal to debit/credit messages, so making them belong to a single interface, or opening an account implicitly as part of some update, requires the service to build a state machine. So, it might make sense to have one service to open an account and another account service to accept update requests:

Account a = bankManager.openAccount();
a.debit(getAmountFromUser());
showBalanceToUser(a.getBalance());

New account details (number, sort code, etc.) and an account service reference may be returned in an open response. A BPEL4WS-like client runtime can privately use the account details for correlating future interactions with the account service. However, it all can become quite complex, as the runtime may need to interact with a user during an open request and has to make only part of the response (a service reference converted to an Account proxy) available to the user application, so that a programmer does not have to write:

AccountResponse response = bankManager.openAccount();
Account a = Runtime.getProxy(response.serviceReference);
a.debit(response.accountNumber, getAmountFromUser());

While it may not be applicable in the above scenario, a returned standardized service reference containing both an instance handle and a WSDL service definition can make this work without BPEL4WS or a proprietary toolkit's support, especially when such a reference is all a client needs to start talking to an associated service instance whenever it wants to. Passing handles such as URLs within headers for session identification can be preferable to using application-specific tokens in such cases.

While maintaining a state machine may not be a major problem with a single interface to a stateful service, handling a heavy load of client requests may become one. By returning session service references it is possible not only to redirect clients to different physical nodes but also to balance the load effectively.
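
A minimal Java sketch of a factory that spreads sessions across nodes through the references it returns is shown below. Round-robin selection, the node addresses and the "/sessions/" URL layout are assumptions chosen for illustration; a real deployment might select nodes by load or locality instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: a factory that returns session references on rotating nodes.
class ReferenceBalancerSketch {
    private final List<String> nodes;
    private int next = 0;

    ReferenceBalancerSketch(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = new ArrayList<>(nodes);
    }

    // Returns a reference to a session service on the next node in rotation.
    synchronized String openSession(String sessionId) {
        String node = nodes.get(next);
        next = (next + 1) % nodes.size();
        return node + "/sessions/" + sessionId;
    }

    public static void main(String[] args) {
        ReferenceBalancerSketch factory = new ReferenceBalancerSketch(
                Arrays.asList("http://node-a.example", "http://node-b.example"));
        System.out.println(factory.openSession("s1"));
        System.out.println(factory.openSession("s2"));
    }
}
```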

In .NET

Microsoft .NET Remoting provides an infrastructure for exposing .NET objects to remote processes. It includes support for passing objects by value or by reference. Marshal-by-reference (MBR) objects can be activated by either a client or the server. Client-activated objects are created on the server when a client issues an activation request, which returns a serialized object reference and results in a client proxy being created.

An interesting aspect of this .NET Remoting feature (thanks to Simon Fell for referring me to it) is that when SOAP over HTTP is used for inter-domain integration, an object reference is serialized as a URI that uniquely identifies an object instance on the server, and subsequent calls by a proxy are made against that URI.

Time to REST

Representational State Transfer (REST) is an architectural style underlying the way the Web is built. It encourages thinking in terms of URIs and resources, but not in terms of messages, procedures, and objects, and may help in building web-friendly web services.

Acknowledging this fact, the SOAP 1.2 Working Draft introduced the Web Method Specification Feature, which allows the identification of the HTTP method (such as GET or POST) being used and is currently supported by the SOAP HTTP binding. In particular, it recommends the use of the GET method in conjunction with the SOAP Response message exchange pattern for information retrievals that are safe and have no side effects, and for which no parameters other than a URI and no SOAP request headers are required. The SOAP 1.2 Part 0 Primer explains well how and when this feature can be practically applied.

Simon Fell, the author of the PocketSOAP toolkit, has done some experiments with a server implementation of the SOAP 1.2 GET HTTP binding for the SOAPBuilders Interop Registration and Notification Service, which allows toolkit builders to register server implementations and client results for a specific service used for interoperability tests. The service, which allows clients to query the registry for information about which servers provide implementations of a particular service, is a good candidate for being accessed using the GET HTTP binding.

This is a fragment of an experimental XML schema types definition:

<xs:schema   targetNamespace="http://www.pocketsoap.com/registration/12/">
 <xs:import namespace="http://www.w3.org/1999/xlink"   schemaLocation="xlink.xsd"/>
 
<!-- not all elements and types are shown -->
<xs:element name="Toolkits" type="tns:Toolkits"/>
<xs:complexType name="Locator">
  <xs:simpleContent>
   <xs:extension base="tns:Empty">
      <xs:attribute ref="xlink:href" use="required"/>
      <!-- two other attributes are not shown -->
   </xs:extension>
  </xs:simpleContent>
</xs:complexType>
 
<xs:complexType name="Toolkits">
 <xs:sequence>
   <xs:element name="Toolkit" type="tns:Locator" minOccurs="0"  maxOccurs="unbounded"/>
 </xs:sequence>
</xs:complexType>
 
<xs:complexType name="Toolkit">
  <xs:sequence>
      <xs:element name="Name" type="xs:string"/>
      <xs:element name="Version" type="xs:string"/>
      <xs:element name="Homepage" type="tns:Locator"/>
  </xs:sequence>
</xs:complexType>
</xs:schema>

This schema defines several resource types, such as a Toolkit and a Test (not shown in the above fragment), which can be identified by URIs. When an HTTP GET query about a particular type of resource is executed, a sequence of references (of type Locator) to resource instances is returned, which allows the retrieval of information about individual resources, each of them possibly providing links to other resources. For example, a response to a GET request to http://soap.4s4c.com/registration/toolkits/ looks like this:

<s:Envelope xmlns:s="http://www.w3.org/2002/06/envelope/">
 <s:Body>
  <q:Toolkits xmlns:q="http://www.pocketsoap.com/registration/12/" 
                  xmlns:x="http://www.w3.org/1999/xlink">
   <q:Toolkit x:href="19d3e126-317c-4680-9bad-0035cbf44c35"/>
   <!-- many more toolkits -->
  </q:Toolkits>
 </s:Body>
</s:Envelope>

Each <q:Toolkit> element refers to a particular Toolkit resource. By appending a value of an XLink x:href attribute to the original URI, information about the toolkit, such as its name, version and homepage, can be obtained.
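
In Java, the appending step can be sketched with java.net.URI. The class and method names are hypothetical; the collection URI and the href value are the ones shown above. Resolving the relative href is equivalent to appending it here only because the collection URI ends with a slash.

```java
import java.net.URI;

// Hypothetical sketch: turning a relative XLink href into a resource URI.
class ToolkitLinkSketch {
    // Resolves the href against the URI of the collection it was retrieved from.
    static URI toolkitUri(String collectionUri, String href) {
        return URI.create(collectionUri).resolve(href);
    }

    public static void main(String[] args) {
        URI toolkit = toolkitUri(
                "http://soap.4s4c.com/registration/toolkits/",
                "19d3e126-317c-4680-9bad-0035cbf44c35");
        // A GET request to this URI would return the toolkit's details.
        System.out.println(toolkit);
    }
}
```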

It would be nice if it were possible to automate this information retrieval process, which requires type-safe proxy generation at run-time. For this to happen, it is necessary to identify at compile or deployment time a WSDL binding which can be used for accessing the resource instance that a particular element, such as <q:Toolkit>, refers to. In fact, this is the essence (at least as seen by some people) of WSDL requirement R085, which states that "a description language should allow describing messages that include references (URIs) to strongly typed referents, both values and Services".

This can probably be achieved by annotating locator elements with information about the relevant WSDL bindings using schema extensions. An annotation needs to be made only once for each specific locator. However, it might require that all such locators be declared as global elements and referenced from elsewhere, as the same locator may appear as part of different elements. Also, some bindings may not be known at schema design time.

Alternatively, this might be done by adding an element or attribute of type anyURI to a corresponding complex type, and stating in a WSDL port type definition that this element (attribute), identified by applying an XPath expression to an output/input of an operation, serves as a reference to a binding required for accessing a particular port type.

The proposal by Arthur Ryman shows how requirement R085 can be resolved with the introduction of a new <wsdl:endpoint> element, which can identify URIs included either as hyperlinks using XLink or as part of endpoint references defined by WS-Addressing.

Resolving R085 will likely make it easier to write session-based web services. For example, another way to expose our application would be to GET a required device from a management service (assuming that obtaining a device handle can be regarded as a safe operation!) and then send the control requests to that device service with HTTP POST. This is nearly identical to the way we did it with one major difference: an application method is now GET rather than connect(). What is also interesting to observe is that in this case REST interactions can be considered as factory-based: each resource can implicitly serve as a factory by explicitly providing references to other resources in its state representation.

An interesting aspect of the REST approach to programming stateful web services is that it can make custom conversational policies unnecessary, as the current state of a conversation/session is normally represented by a resource addressable by a URL. A RESTful web service "guides" the clients through its state machine. More thoughts on this can be found in A Web-Centric approach to State Transition by Paul Prescod.

It may not always be possible or easy to commit to building web services the REST way, for example, when transport protocols or different application protocols are used, or when SOAP headers need to be present. In such cases it's still worth putting some effort into maximizing the visibility of a web service.

Conclusion

In this article I tried to assess different approaches toward building session-based web services. In general, it is recommended that web services be designed according to the principles of a service-oriented architecture. However, it is sometimes desirable to build services capable of referencing each other, which may lead to a finer-grained, session-oriented services design. When building a new service, it is worth considering carefully the pros and cons of both design styles, which can result in a better integration solution for a targeted domain.

Acknowledgments

This article was inspired by many other people's implementations, specifications and discussions of various session-related aspects of web services.

Many thanks to Steve Tuecke for detailed critical comments on an earlier version of this article.

Many thanks to Simon Fell, Jacek Kopecky and Zdenek Svoboda for reading the article and commenting on its various parts.

Resources

Web Services Interaction Models by Steve Vinoski.
A Young Person's Guide to SOAP by Don Box.
A Web-Centric approach to State Transition by Paul Prescod.