October 16, 2002
This article is the last in a series examining how one might go about sending binary data as part of a SOAP message. This month we look at BEEP, the Blocks Extensible Exchange Protocol.
The primary inventor of BEEP is Marshall Rose, a long-time "protocol wonk" within the IETF community. Marshall has authored more than 60 RFCs, covering everything from core SNMP details to a DTD for IETF RFCs.
Like SOAP 1.2, BEEP is described in a transport-neutral manner and is defined in RFC 3080. The most common transport is TCP, and the TCP mapping is found in RFC 3081. If you are at all interested in network protocols, you should read RFC 3117, "On the Design of Application Protocols". At the BEEP site you can find HTML versions of BEEP and related RFCs generated from the DTD mentioned above.
BEEP actually addresses a wider range of problems than just SOAP and binary data. According to the RFC, BEEP is "a generic application protocol kernel for connection-oriented, asynchronous interactions." In other words, it's like an application-level transport protocol. Unlike many application protocols, it supports more than just lockstep request-response interchanges, which makes it suitable for applications such as event logging (see RFC 3195, for "Reliable Syslog"), and general peer-to-peer interactions.
One of the more interesting implications of BEEP's design principles is that it supports multiplexing -- that is, multiple channels communicating over a single transport stream. BEEP allows different applications, or multiple instances of the same application, to use the same stream for independent activities, including independent security mechanisms. For example, a single BEEP-over-TCP link between a browser and a web server would efficiently allow the browser to fetch a page over SSL-encrypted HTTP, while also streaming in multiple images over unencrypted HTTP.
A BEEP message is the complete set of data to be exchanged. BEEP defines three styles of message exchange, distinguished by the type of reply the server (or, more accurately, the message recipient) returns:
- Message/reply: the server performs a task and sends a positive reply back.
- Message/error: the server does not perform the task and sends a negative reply back.
- Message/answer: the server sends back zero or more answers, followed by a termination indicator. This style is appropriate for streaming data, for example.
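The three styles can be told apart purely from the types of the replies that come back. Here is a minimal sketch in Python; the function name exchange_style is my own invention, and the three-letter codes are the reply types from RFC 3080 (RPY for a positive reply, ERR for a negative one, ANS for an answer, and NUL for the terminator).

```python
def exchange_style(reply_types):
    """Classify the complete list of reply types received for one message.

    Hypothetical helper for illustration; reply_types is a list such as
    ["ANS", "ANS", "NUL"], in the order the replies arrived.
    """
    if reply_types == ["RPY"]:
        return "message/reply"
    if reply_types == ["ERR"]:
        return "message/error"
    # Zero or more answers, then exactly one terminator at the end.
    if (reply_types and reply_types[-1] == "NUL"
            and all(t == "ANS" for t in reply_types[:-1])):
        return "message/answer"
    raise ValueError(f"not a valid reply sequence: {reply_types!r}")
```

Note that a lone NUL is a legal message/answer exchange: it simply means the server had zero answers to give.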
Messages are divided into frames. The specification says that a message is normally sent in a single frame. Frames consist of a header, a payload, and a trailer. The header and trailer are encoded in printable ASCII and terminated with a CRLF -- in other words, typical IETF protocol style. The payload is an arbitrary set of octets.
The frame headers are almost identical and are structured as follows:
type SP channel SP msgno SP more SP seqno SP size CR LF
where SP, CR, and LF stand for the ASCII space, carriage-return, and linefeed characters respectively.
type specifies the message type and is one of the three-byte strings -- MSG (message), RPY (reply), ANS (answer), ERR (error), or NUL (terminator) -- which correspond to the message types listed in the exchange patterns above.
channel identifies the multiplex channel of the communication and is the printed form of a number between zero and 2^31 - 1. BEEP reserves channel zero for management tasks, like creating new channels. So the simplest BEEP application will use two channels and is, therefore, likely to end up being multiplex-ready.
msgno uniquely identifies the message. It's a number in the same format as channel and acts like a session identifier: a reply to a given message will have the same msgno. A msgno cannot be reused until the final response packet -- RPY, ERR, or NUL -- has been received.
The more indicator is a period if this is the only or last frame of a message. It's an asterisk if at least one other frame follows for this message. Note that frames may have a zero-length payload, so that a final EOF can be sent by itself.
seqno is an unsigned 32-bit number (i.e., twice the range of the other number fields) and specifies the offset of the first octet in the current payload. This is different from the conventional use of the term "sequence number"; perhaps "offset" would have been a better choice. In most cases, the seqno of frame n will be the sum of the payload lengths of the prior n - 1 frames on the channel. Some applications, however, might want to efficiently use BEEP to omit large sets of default values. For example, a distributed file system protocol could avoid sending a large number of such values.
If a server is sending multiple replies to the client -- the message/answer message style -- there will be an ansno field to identify each answer. The combination of ansno and more is similar to nested messages within DIME.
The final field is the size, which has the obvious meaning of specifying the number of octets in the frame's payload. While BEEP is often used for XML or other text data, payloads can be arbitrary. Payloads are MIME objects in that they may have MIME headers describing the type and encoding of the payload. BEEP adds two rules: the default Content-Type is application/octet-stream, and the default transfer encoding is binary. This seems a reasonable trade-off: binary data has no overhead, and text data has a standard, but small, overhead to describe its type.
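To make the header fields concrete, here is a rough Python sketch that splits one message payload into frames. The function name build_frames and the max_size parameter are assumptions of mine, not part of any BEEP library. The sketch covers the common six-field header only (no ansno, so no ANS frames) and does no flow control.

```python
def build_frames(frame_type, channel, msgno, payload, seqno=0, max_size=4096):
    """Split payload (bytes) into BEEP frames: header, payload, "END" trailer.

    Hypothetical helper for illustration. seqno is the channel's current
    octet offset; each frame advances it by that frame's payload length.
    """
    # An empty message still needs one frame with a zero-length payload.
    chunks = [payload[i:i + max_size]
              for i in range(0, len(payload), max_size)] or [b""]
    frames = []
    for index, chunk in enumerate(chunks):
        # "." marks the only or last frame; "*" means more frames follow.
        more = "." if index == len(chunks) - 1 else "*"
        header = f"{frame_type} {channel} {msgno} {more} {seqno} {len(chunk)}\r\n"
        frames.append(header.encode("ascii") + chunk + b"END\r\n")
        seqno = (seqno + len(chunk)) % 2**32  # seqno is an unsigned 32-bit value
    return frames


frames = build_frames("MSG", 1, 0, b"abcdefgh", max_size=3)
```

With a deliberately tiny max_size of 3, the eight-octet payload goes out as three frames whose seqno values are 0, 3, and 6, and only the last one carries the period.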
There is a set of simple constraints among the various fields that lets implementations do a large number of error checks. Over a dozen are documented in the RFC.
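As an illustration, a header parser can apply several of these checks as it goes. The sketch below is mine, not the RFC's: FrameHeader and parse_header are made-up names, and it implements only a handful of the documented checks (known frame type, field count, numeric ranges, and the rule that a NUL frame must be final and empty).

```python
from typing import NamedTuple, Optional

MAX_31_BIT = 2**31 - 1  # upper bound for channel, msgno, size, and ansno
MAX_32_BIT = 2**32 - 1  # upper bound for seqno
FRAME_TYPES = ("MSG", "RPY", "ANS", "ERR", "NUL")


class FrameHeader(NamedTuple):
    type: str
    channel: int
    msgno: int
    more: str
    seqno: int
    size: int
    ansno: Optional[int] = None  # present only on ANS frames


def parse_header(line: bytes) -> FrameHeader:
    """Parse one CRLF-terminated header line, raising ValueError if malformed."""
    if not line.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    fields = line[:-2].decode("ascii").split(" ")
    frame_type = fields[0]
    if frame_type not in FRAME_TYPES:
        raise ValueError(f"unknown frame type: {frame_type!r}")
    if len(fields) != (7 if frame_type == "ANS" else 6):
        raise ValueError("wrong number of header fields")
    more = fields[3]
    if more not in (".", "*"):
        raise ValueError("continuation indicator must be '.' or '*'")
    numeric = [fields[1], fields[2]] + fields[4:]
    if not all(f.isdigit() for f in numeric):
        raise ValueError("numeric fields must be unsigned decimal numbers")
    channel, msgno, seqno, size, *rest = (int(f) for f in numeric)
    ansno = rest[0] if rest else None
    if max(channel, msgno, size, ansno or 0) > MAX_31_BIT:
        raise ValueError("channel, msgno, size, or ansno out of range")
    if seqno > MAX_32_BIT:
        raise ValueError("seqno out of range")
    if frame_type == "NUL" and (more != "." or size != 0):
        raise ValueError("NUL frame must be final and have an empty payload")
    return FrameHeader(frame_type, channel, msgno, more, seqno, size, ansno)
```

A real implementation would also track per-channel state, for example checking that each seqno matches the expected offset and that a reply's msgno refers to an outstanding message.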
BEEP's initial connection follows standard IETF practice: the client (or initiator) connects to a remote port, and the server (or listener) responds with a greeting message. The greeting message includes profile elements in which the server lists its supported facilities. For example, after receiving a connection, the following greeting indicates SOAP and TLS (SSL) support:
RPY 0 0 . 0 143
Content-Type: application/beep+xml

<greeting>
   <profile uri="http://iana.org/beep/TLS"/>
   <profile uri="http://iana.org/beep/soap"/>
</greeting>
END
From the frame header, we see that this is a reply (to the initial TCP connection), that it is the first message on the management channel, and that it is completely contained in this frame. This is a simple greeting; optional elements include codeset localization and identification of management features.
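On the receiving side, pulling the advertised profiles out of a greeting payload takes only a few lines. This is a sketch under my own assumptions: the function name advertised_profiles is invented, the payload is taken to be already decoded text, and the MIME headers are simply skipped rather than interpreted.

```python
import xml.etree.ElementTree as ET


def advertised_profiles(payload):
    """Return the profile URIs listed in a greeting frame's payload.

    Hypothetical helper for illustration; payload is the decoded text
    between the frame header and the END trailer.
    """
    text = payload.replace("\r\n", "\n")
    if text.startswith("\n"):
        # No MIME headers at all: the payload opens with the blank line.
        body = text[1:]
    else:
        _headers, separator, body = text.partition("\n\n")
        if not separator:
            raise ValueError("no blank line between MIME headers and body")
    root = ET.fromstring(body)
    if root.tag != "greeting":
        raise ValueError(f"expected <greeting>, got <{root.tag}>")
    return [profile.get("uri") for profile in root.findall("profile")]


GREETING = (
    "Content-Type: application/beep+xml\r\n"
    "\r\n"
    "<greeting>\r\n"
    "   <profile uri='http://iana.org/beep/TLS'/>\r\n"
    "   <profile uri='http://iana.org/beep/soap'/>\r\n"
    "</greeting>\r\n"
)
profiles = advertised_profiles(GREETING)
```

A client would then check that the profile it wants, such as the SOAP one, appears in the returned list before asking to start a channel.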
The client now acknowledges the greeting message, immediately indicating that it wants to start a SOAP channel:
RPY 0 0 . 0 51
Content-Type: application/beep+xml

<greeting/>
END
MSG 0 1 . 0 138
Content-Type: application/beep+xml

<start number='1' serverName='www.example.com'>
   <profile uri='http://iana.org/beep/soap'/>
</start>
END
The number attribute identifies the channel, and the serverName attribute identifies the "virtual server" the client wishes to work with. A client can specify multiple profiles; the server will determine which one it can support and send it back in its reply:
RPY 0 1 . 221 79
Content-Type: application/beep+xml

<profile uri='http://iana.org/beep/soap'/>
END
Now both parties are using a single TCP connection for management and plain text SOAP communication. BEEP is a fairly simple protocol, allowing you to build powerful communication infrastructures with modest work.
The full SOAP-over-BEEP specification is RFC 3288. It says that, once the channel has been opened, both sides enter a "boot" phase. During this phase, information equivalent to the classic SOAP-over-HTTP headers is sent -- that is, the URI that would be POSTed to. It's also a placeholder for future feature negotiation and has a trivial format: a bootmsg element that names the resource, answered by a bootrpy element.
As an optimization, BEEP allows a boot message and its reply to be piggy-backed into the channel start message, avoiding extra network round-trips (and latency). In practice, then, here is what a real start message and its reply would look like:
MSG 0 1 . 0 197
Content-Type: application/beep+xml

<start number='1' serverName='www.example.com'>
   <profile uri='http://iana.org/beep/soap'>
      <![CDATA[<bootmsg resource='StockQuote'/>]]>
   </profile>
</start>
END
RPY 0 1 . 0 112
Content-Type: application/beep+xml

<profile uri='http://iana.org/beep/soap'>
   <![CDATA[<bootrpy/>]]>
</profile>
END
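Because the piggy-backed boot message travels as character data inside the profile element, the listener has to parse it a second time. A sketch of that step follows; piggybacked_boot is a name I made up, and the code assumes the XML body has already been separated from its MIME headers.

```python
import xml.etree.ElementTree as ET


def piggybacked_boot(start_xml):
    """Map each requested profile URI to the boot element piggy-backed in it.

    Hypothetical helper for illustration. Returns a dict of
    uri -> (element name, attributes); profiles without piggy-backed
    content are left out.
    """
    start = ET.fromstring(start_xml)
    boots = {}
    for profile in start.findall("profile"):
        # The parser hands back CDATA content as ordinary element text.
        text = (profile.text or "").strip()
        if text:
            boot = ET.fromstring(text)
            boots[profile.get("uri")] = (boot.tag, dict(boot.attrib))
    return boots


START = """<start number='1' serverName='www.example.com'>
   <profile uri='http://iana.org/beep/soap'>
      <![CDATA[<bootmsg resource='StockQuote'/>]]>
   </profile>
</start>"""
boots = piggybacked_boot(START)
```

For the start message shown above, the result maps the SOAP profile URI to a bootmsg element whose resource attribute is StockQuote.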
Having established communication, simple attachments become trivial: send the message as multi-part MIME, as specified by the SOAP with Attachments Note. This is, in fact, the only attachment scheme currently specified in RFC 3288, which is unfortunate. SOAP should be able to take full advantage of the powerful asynchronous multiplex communications core provided by BEEP. It would seem that only two things are needed: first, a URN scheme for naming BEEP channels; second, that a SOAP message could refer to data coming in over separate channels, streaming in as it becomes available. Now, that would be cool.
I'll close with the following plea, which I'll explain: Don, pick BEEP. Back in February, Don Box pointed out that HTTP has a number of limitations for use as a web services protocol. His criticisms are right on. It's my sincere hope that we don't get a new protocol: BEEP seems to meet all his requirements.