Protocol Design: Sessions

January 20, 2004

Many protocols require sending a number of messages or commands that are connected in one way or another. Sessions, a way of grouping together multiple messages, occur on all levels of protocol design, from the low-level parts of the TCP/IP stack to high-level constructs built on top of other protocols.

To understand the concept of sessions, it's best to start with UDP. UDP is a thin layer on top of IP, the underlying protocol of the Internet that deals with delivering messages between different hosts. UDP traffic is made up of datagrams: self-contained sequences of bytes, each typically less than 1500 bytes long. If the sender sends 500 bytes, the receiver will receive those 500 bytes as a single unit, separately from any other datagrams. UDP datagrams sent from one host to another may get lost, duplicated, or arrive out of order. As a means of communication, UDP is somewhat like mailing letters. Sometimes the post office screws up and the letter never arrives, and the recipient of the letter may be getting other letters from other people. There isn't necessarily any way of knowing which of these letters are connected; some correspondents may write multiple letters about different subjects.

For some simple protocols, UDP is a good fit. One good example is DNS, which is used to turn human-readable domain names into IP addresses. The DNS client sends a request datagram to a DNS server asking for information about "www.example.com", and at some point in the future the client will receive a datagram indicating that "www.example.com" has the IP address 192.0.34.166. Duplicate messages can be ignored, and if no response is received the client can try again.
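
The retry behavior described above can be sketched in Python. The `send` and `receive` callables are hypothetical stand-ins for real socket operations, and replies are modeled as simple (name, address) pairs. Real DNS matches replies to requests with a numeric query ID rather than by name, so this is only a sketch of the idea, not of how any particular resolver works.

```python
def lookup(name, send, receive, max_attempts=3):
    """Ask for `name`, resending the request if no reply arrives.

    send(name)  -- hypothetical: transmits one request datagram
    receive()   -- hypothetical: returns a (name, address) reply,
                   or None if the timeout expired
    Returns (address, number_of_attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        send(name)
        reply = receive()
        # A lost request or lost reply shows up as a timeout (None).
        # A reply about some other name is simply ignored.
        if reply is not None and reply[0] == name:
            return reply[1], attempt
    raise TimeoutError(f"no reply for {name!r} after {max_attempts} attempts")
```

Because each request stands alone, the client needs no memory of earlier datagrams beyond "have I been answered yet", which is why no session is required here.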

Other protocols have more complex needs. What if the server, in response to a request, asks for more information (e.g. for a username and password to authorize the request)? Likewise, how can large amounts of data be sent over UDP? One partial solution is to have each datagram contain all necessary data. If a client receives a response requesting authorization, it will send a new request datagram that contains all of the original data together with the additional required information, a username and password. Since UDP packets have a maximum size, this solution doesn't solve the second problem. Even if it were possible, sending multiple copies of the same information is inefficient.

The second solution to these issues is the concept of a session. Multiple datagrams will be identified as being in the same session. This can be done based on the source IP and port of the datagrams, i.e. all datagrams from a specific sender are considered a session. Another way of grouping datagrams is using a unique session identifier, included in each datagram. All datagrams that have the same identifier are in the same session. Once datagrams are in a session, they can refer to the contents of previous datagrams in that session, so information the protocol needs later does not have to be resent. It's equivalent to having a filing cabinet that stores old letters sorted based on the sender of the letter. Without the concept of a session, there'd be no way of knowing to which datagram another datagram is referring.
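
The two grouping strategies can be sketched as a small "filing cabinet" in Python. The class name, the dictionary layout of a datagram, and the `key_by` parameter are all assumptions made for illustration.

```python
from collections import defaultdict


class SessionTable:
    """Files incoming datagrams under a session key (hypothetical sketch)."""

    def __init__(self, key_by="address"):
        if key_by not in ("address", "session_id"):
            raise ValueError("key_by must be 'address' or 'session_id'")
        self.key_by = key_by
        self.sessions = defaultdict(list)  # session key -> earlier bodies

    def receive(self, source, datagram):
        """Store one datagram and return everything seen in its session.

        source   -- (ip, port) tuple the datagram came from
        datagram -- dict with a "body" and, when keying by identifier,
                    a "session_id"
        """
        if self.key_by == "address":
            key = source
        else:
            key = datagram["session_id"]
        history = self.sessions[key]
        history.append(datagram["body"])
        return history
```

Keying by address breaks if the sender's port changes between datagrams, while keying by identifier survives that at the cost of a few extra bytes in every datagram.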

Sessions don't solve another issue with UDP, the fact that datagram delivery is unreliable, with possibly undesirable side effects. Datagrams may be duplicated (requesting the download of a 200MB file twice) or arrive out of order (a "delete item 1" command may arrive before the "backup item 1" command). To solve this, a protocol can use a message counter. The first datagram contains, in addition to its session id, the fact that it is message 1. The second datagram contains the fact that it is message 2, and so on. The recipient can then re-order received datagrams if they arrive in the wrong order by sorting them based on their message counter. Datagram loss can be handled by the recipient asking for missing messages after a certain timeout ("resend message 2, I never got it") or by having the recipient acknowledge all received messages. If the sender doesn't receive acknowledgment for a specific message after a certain amount of time, the sender will resend it. Duplicates are easy to detect as they have duplicate message counters. The TFTP protocol uses the second method to transfer files over UDP in a reliable fashion.
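
The receiving side of a message counter scheme might look something like the following Python sketch. The class and method names are hypothetical, and the sketch covers a single session only; it is not how TFTP or any specific protocol implements this.

```python
class OrderedReceiver:
    """Re-orders and de-duplicates numbered messages for one session."""

    def __init__(self):
        self.next_expected = 1  # counter of the next message to deliver
        self.pending = {}       # counter -> payload that arrived early
        self.delivered = []     # payloads handed on, in counter order

    def receive(self, counter, payload):
        """Accept one datagram; return the counter to acknowledge."""
        if counter < self.next_expected or counter in self.pending:
            # Duplicate: acknowledge again, but never deliver it twice.
            return counter
        self.pending[counter] = payload
        # Deliver every message that is now in sequence.
        while self.next_expected in self.pending:
            self.delivered.append(self.pending.pop(self.next_expected))
            self.next_expected += 1
        return counter

    def missing(self):
        """Counters worth re-requesting once a timeout has expired."""
        if not self.pending:
            return []
        return [n for n in range(self.next_expected, max(self.pending))
                if n not in self.pending]
```

With this in place, a "delete item 1" command numbered 2 is held back until the "backup item 1" command numbered 1 has arrived, whatever order the network delivers them in.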

Implementing all of this from scratch for each protocol would be a pretty wasteful effort, and this is where TCP comes in. Implemented on top of IP, using more sophisticated versions of these methods, TCP provides connections, that is, reliable, ordered streams of bytes. A client opens a connection to a server, sends bytes over the connection and receives bytes from the server, and then at some point either side may end the connection. Unlike UDP, bytes are not grouped into datagrams, so if the sender writes 500 bytes and then later another 500 bytes, this may arrive on the recipient side as 1000 bytes or as 900 bytes and then 100 bytes. Implementing sessions on top of TCP is trivial: all that is required is to match a session to a connection. All messages sent over a single connection are considered to be a single session. If UDP is similar to the postal system, a TCP connection is a lot like a telephone conversation: a continuous stream of information that requires no extra effort to tie together.
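
Since TCP does not preserve the boundaries between writes, a protocol that wants distinct messages has to mark them itself. One common approach, used by line-oriented protocols such as POP3, is to end each message with CRLF. The Python sketch below shows that idea; the class name is hypothetical, and the chunks passed to `feed` stand in for whatever a socket read happens to return.

```python
class LineFramer:
    """Recovers CRLF-terminated messages from an unframed byte stream."""

    def __init__(self):
        self.buffer = b""  # bytes received so far that end mid-message

    def feed(self, chunk):
        """Add newly received bytes; return the complete lines so far."""
        self.buffer += chunk
        # Everything before the last CRLF is complete; the remainder
        # (possibly empty) waits for more data.
        *lines, self.buffer = self.buffer.split(b"\r\n")
        return lines
```

The same three commands come out regardless of whether they arrive in one read or split across several at arbitrary points.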

Consider the following transcript of a session in POP3, a TCP-based protocol used to retrieve email. The "APOP" command is used to authenticate the user. The other commands, such as "LIST", cannot be run until the user has authenticated. If the authentication is successful, the session's commands (i.e. all commands sent over this specific TCP connection) are assumed to refer to the specific user's mailbox.

S:    +OK POP3 server ready <1896.697170952@dbc.mtview.ca.us>

C:    LIST

S:    -ERR Invalid command

C:    APOP mrose c4c9334bac560ecc979e58001b3e22fb

S:    +OK mrose's maildrop has 2 messages (320 octets)

C:    LIST

S:    +OK 2 messages (320 octets)

S:    1 120

S:    2 200

S:    .

C:    RETR 1

S:    +OK 120 octets

S:    <the POP3 server sends message 1>

S:    .

C:    DELE 1

S:    +OK message 1 deleted

C:    QUIT

S:    +OK dewey POP3 server signing off (1 message left)

POP3 is a stateful protocol. The session (tied to the connection) is a state machine, and the server will support different commands depending on the state. In the initial state only "APOP" or some other command that authenticates the user is accepted. Once the user is in an authenticated state, "LIST", "RETR" and so on will be accepted by the server.
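
The state machine can be sketched in Python as a table of which commands each state allows. This is a simplified illustration, not a POP3 implementation: the state names, the reply texts other than the error line from the transcript, and the `check_credentials` callback are assumptions, and the commands do no real mailbox work.

```python
class Pop3Session:
    """One session per TCP connection; commands allowed depend on state."""

    ALLOWED = {
        "UNAUTHENTICATED": {"APOP", "QUIT"},
        "AUTHENTICATED": {"LIST", "RETR", "DELE", "QUIT"},
        "CLOSED": set(),
    }

    def __init__(self, check_credentials):
        # check_credentials(user, digest) -> bool is a hypothetical hook.
        self.check_credentials = check_credentials
        self.state = "UNAUTHENTICATED"
        self.user = None

    def handle(self, line):
        """Process one command line and return the server's reply."""
        parts = line.split()
        if not parts or parts[0].upper() not in self.ALLOWED[self.state]:
            return "-ERR Invalid command"
        command, args = parts[0].upper(), parts[1:]
        if command == "APOP":
            if len(args) == 2 and self.check_credentials(*args):
                self.user = args[0]
                self.state = "AUTHENTICATED"
                return f"+OK {self.user} authenticated"
            return "-ERR permission denied"
        if command == "QUIT":
            self.state = "CLOSED"
            return "+OK signing off"
        # LIST, RETR and DELE would act on self.user's mailbox here.
        return f"+OK {command} accepted"
```

Because the session object lives exactly as long as the connection, opening a new connection means starting again from the unauthenticated state.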

Just because it's easy to support sessions using TCP doesn't mean it's obligatory. Unlike POP3, HTTP does not tie the concept of a session to TCP connections. A POP3 client cannot send an "APOP" command, get a response, open a new connection and send "LIST". That just won't work, since the new connection will be considered a separate session. HTTP clients, on the other hand, can send requests over a single TCP connection or over multiple connections, and the only real difference will be speed (opening multiple connections is slower). Further, HTTP servers can't assume that all requests in the same connection are part of the same session; they might be arriving from different clients via an HTTP proxy. Without the concept of a session, HTTP is a stateless protocol, which is to say that it does not change its protocol-level behavior based on previous commands. Of course, changes to the underlying data storage (e.g. an HTTP request causing a database entry on the server to be deleted) might affect the results of future requests.

When downloading static files, the lack of sessions is not important. Which file was downloaded before the current one is not going to affect the contents of the current download. Many HTTP applications do need session support. Shopping carts, for example, need to remember which items the user added to the cart, which means matching multiple requests to a single session. In order to make this possible, the concept of "cookies" was introduced to the protocol, essentially session IDs sent along with each request. Other mechanisms for implementing sessions are occasionally used as well. Rejecting the concept of sessions tied to connections, HTTP developers were forced to reintroduce sessions in other ways, all with drawbacks (e.g. browsers can choose to disable cookie support) and all requiring extra effort to use.
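
A cookie-based shopping cart can be sketched in Python as follows. Requests and responses are modeled as plain dictionaries rather than real HTTP messages, and the class name, the `session_id` cookie name, and the `add` field are all assumptions made for illustration.

```python
import secrets


class CartServer:
    """Matches requests to carts using a session ID carried in a cookie."""

    def __init__(self):
        self.carts = {}  # session ID -> list of items in that cart

    def handle(self, request):
        """Handle one request, which may arrive on any connection."""
        session_id = request.get("cookies", {}).get("session_id")
        response = {"set_cookie": None}
        if session_id not in self.carts:
            # No cookie, or one we don't recognize: start a new session
            # and ask the client to send the ID back with later requests.
            session_id = secrets.token_hex(16)
            self.carts[session_id] = []
            response["set_cookie"] = ("session_id", session_id)
        cart = self.carts[session_id]
        if "add" in request:
            cart.append(request["add"])
        response["cart"] = list(cart)
        return response
```

Note that the server has to do on every request what a connection-bound session gets for free: look up the session, and cope with clients that never send the cookie back.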

HTTP is an established protocol for web browsing, and changes need to be within the framework of the existing protocol. New protocols that require tight coupling, and which can't be modeled using one-off request/response transactions, would in many cases be better off not using HTTP, nor should such protocols follow its model. The need to support sessions, required by many common types of interactions, involves extra complexity that would not be necessary were a better underlying protocol chosen. For these protocols, tying the session to the TCP connection is the natural way to implement the requirement for sessions.

The concept of a session, a series of ordered, connected messages, is fundamental to protocols that require long-term conversations. Sessions can be implemented on a number of levels: in the protocol itself or using the transport layer's concept of sessions, typically TCP connections. When the transport layer's session support is not used, or the transport does not support sessions, extra effort is required to add them, leading to a protocol that is more complex and harder to implement.

The next installment in this series will discuss a related design issue: dealing with parallel requests, including their interactions with sessions.