Brother, Can You Spare a DIME?

September 18, 2002

Direct Internet Message Encapsulation

Last month we talked about the reasons for associating attachments with SOAP messages, and we looked at the initial SOAP Messages with Attachments (SwA) note. This month we look at Direct Internet Message Encapsulation (DIME), a binary message format; and we'll also look briefly at the WS-Attachments specification, which provides a generic framework for SOAP attachments, and a definition for a DIME-based instantiation of that framework.

Both specifications are being developed through the IETF and are available as internet drafts (see the list of archive locations). Interestingly, they are not being developed through an official IETF Working Group but are being published as the work of individuals. Microsoft created DIME and was the original promoter for its adoption; the current DIME draft has IBM and Microsoft authors, as does the WS-Attachments document.

If we read between the lines, we can conclude that DIME is a part of the global XML Architecture effort led by IBM and Microsoft. In addition to the IETF archives, you can find (in a prettier format than the historic IETF lineprinter output) copies at IBM's developerWorks and Microsoft's MSDN sites. While the absence of IETF WG support might slow their adoption as IETF standards, DIME and WS-Attachments will be de facto, if not de jure standards, which makes them well worth a look.

Like SwA or MIME, DIME is a message format -- it is not a network protocol like HTTP. The biggest surprise to most XML developers will be that DIME is a binary format: fields have fixed sizes rather than being terminated by a newline character, numbers are written as fixed-width integers, bytes are written in a specified order -- the common "network byte order": most significant byte first -- and so on.
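For readers who live mostly in text formats, here is a small sketch of what "network byte order" means in practice. It uses Python's standard struct module; the value 0x01020304 is just an illustrative number, not anything taken from the DIME draft.

```python
import struct

# An arbitrary 32-bit value, chosen so each byte is easy to spot.
value = 0x01020304

# ">" selects big-endian ("network") order: most significant byte first.
big_endian = struct.pack(">I", value)
# "<" selects little-endian order, shown here only for contrast.
little_endian = struct.pack("<I", value)

print(big_endian.hex())     # 01020304
print(little_endian.hex())  # 04030201

# Reading the value back requires knowing which order was used.
decoded = struct.unpack(">I", big_endian)[0]
```

A text format sidesteps the question entirely, which is part of the trade-off discussed next.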

I'm not entirely sure, but I think this may have been a mistake. If the record header had been defined as two or three Unicode text lines, each a series of space-separated fields, padding could still have been defined, maintaining the current requirement that the total header size be a multiple of four bytes. This would have let developers build and deploy DIME tools using standard text-processing and scripting languages to handle all of the DIME overhead.

If nothing else, DIME is an interesting case study of engineering trade-offs.

A DIME message consists of a series of one or more records joined together to make a single application message. Records aren't numbered: their order is implied by their position in the data stream. Thus DIME requires a stream protocol like TCP and is unsuitable for UDP, a datagram protocol. That rules out some uses: many multimedia protocols are "lossy", and for them a datagram approach makes sense.

Multiple units of application data, or payloads, can be encapsulated in a single DIME message stream. Payloads that don't fit into a single DIME record can be divided into chunks and sent in pieces. The following diagram, cribbed from the DIME draft, shows the layout of a DIME record; each row is 32 bits wide.


  0                   1                   2                   3 

  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 

 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 

 |         |M|M|C|       |       |                               | 

 | VERSION |B|E|F| TYPE_T| RESRVD|         OPTIONS_LENGTH        | 

 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 

 |            ID_LENGTH          |           TYPE_LENGTH         | 

 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 

 |                          DATA_LENGTH                          | 

 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 

 |                            OPTIONS                            |

 .              ...PADDED TO MULTIPLE OF FOUR BYTES...           . 

 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 

 |                          PAYLOAD ID                           | 

 .              ...PADDED TO MULTIPLE OF FOUR BYTES...           . 

 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 

 |                         TYPE OF DATA                          |  

 .              ...PADDED TO MULTIPLE OF FOUR BYTES...           . 

 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 

 |                              DATA                             | 

 .              ...PADDED TO MULTIPLE OF FOUR BYTES...           . 

 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
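To make the diagram concrete, here is a minimal sketch of decoding the twelve fixed bytes at the start of a record. It assumes the bit positions are exactly as drawn above (five bits of VERSION, then MB, ME, CF, a four-bit TYPE_T, and four reserved bits). The names DimeHeader, parse_fixed_header and padded are my own inventions for illustration, not part of any DIME toolkit.

```python
import struct
from typing import NamedTuple

class DimeHeader(NamedTuple):
    """Hypothetical container for the fixed part of a DIME record header."""
    version: int
    mb: bool
    me: bool
    cf: bool
    type_t: int
    options_length: int
    id_length: int
    type_length: int
    data_length: int

# Two flag bytes, three 16-bit lengths, one 32-bit length: 12 bytes, big-endian.
FIXED_HEADER = struct.Struct(">BBHHHI")

def parse_fixed_header(buf: bytes) -> DimeHeader:
    b0, b1, opt_len, id_len, type_len, data_len = FIXED_HEADER.unpack_from(buf)
    return DimeHeader(
        version=b0 >> 3,        # top five bits of the first byte
        mb=bool(b0 & 0x04),     # Message Begin
        me=bool(b0 & 0x02),     # Message End
        cf=bool(b0 & 0x01),     # Chunk Flag
        type_t=b1 >> 4,         # top four bits of the second byte
        options_length=opt_len,
        id_length=id_len,
        type_length=type_len,
        data_length=data_len,
    )

def padded(length: int) -> int:
    """Round a field length up to the next multiple of four bytes."""
    return (length + 3) & ~3

# A hand-built header: version 1, MB and ME set, TYPE_T 0x2 (URI),
# no options, no ID, a 41-byte type and a 5-byte payload.
sample = bytes([0x0E, 0x20, 0, 0, 0, 0, 0, 41, 0, 0, 0, 5])
header = parse_fixed_header(sample)
# Size of the whole record, including the padded variable-length fields.
record_size = FIXED_HEADER.size + sum(
    padded(n) for n in (header.options_length, header.id_length,
                        header.type_length, header.data_length))
```

Because every length is known before any variable-length field is read, a receiver can skip a record it doesn't care about without looking inside it.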

The five-bit VERSION field identifies the message format version. Only Version 1 has been defined so far. Some Microsoft toolkits supported an earlier version of the protocol, which didn't have a version number. A message can be split into multiple records: the first record of a message will have the MB (Message Begin) bit set; the last record of a message will have the ME (Message End) bit set. If the message fits in a single record, both bits are set.

A DIME message cannot carry another DIME message unchanged. The specification requires that the MB and ME bits don't nest -- if you find a second record with MB set before a record that has ME set, it's an error. When embedding DIME in DIME, therefore, you must make sure to clear the internal MB/ME bits.
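The no-nesting rule is simple enough to check mechanically. The sketch below takes only the MB and ME bits of each record, in stream order, and complains about the error cases described above. The function name and the choice of ValueError are my assumptions; the draft doesn't prescribe how a receiver should report the error.

```python
def check_message_framing(flags):
    """Validate MB/ME bits for a stream of records.

    flags is an iterable of (mb, me) pairs, one per record, in stream order.
    Raises ValueError on the first framing violation.
    """
    in_message = False
    for index, (mb, me) in enumerate(flags):
        if mb and in_message:
            # A second MB before the previous message saw its ME: nesting.
            raise ValueError(f"record {index}: MB set inside an open message")
        if not mb and not in_message:
            raise ValueError(f"record {index}: record outside any message")
        # The message stays open until a record with ME arrives.
        in_message = not me
    if in_message:
        raise ValueError("stream ended before a record with ME set")

# A three-record message, then a single-record message: both are fine.
check_message_framing([(True, False), (False, False), (False, True)])
check_message_framing([(True, True)])
```

An embedded DIME message that kept its own MB bit would trip the first check, which is why the inner bits have to be cleared.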

But what if the message doesn't fit in a single record? DIME supports chunking, a means of splitting a message over multiple records. But before we can understand chunking, we need to look at some other fields.

DIME supports typed application data or payloads. Data can be typed by using either a registered MIME media type (such as application/soap+xml for a SOAP message) or by a URI. When using a URI, the specification recommends that it be the namespace URI of the top-level element of the XML message, if the payload is XML. That's a clever hook for early message dispatching.

In order to determine how the message is typed, the TYPE_T (the "type type") field is used. It's a four-bit field that includes the following definitions: 0x0 for "same as before" (we'll see what that means shortly), 0x01 for a MIME type, and 0x02 for an absolute URI -- relative URIs are not supported. The TYPE_LENGTH field specifies how many bytes are in the type; padding, of course, is not counted. The TYPE field is an octet string -- essentially always a UTF-8 text string -- which specifies the actual payload type.
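Here is a small sketch of how the three type-related fields fit together. The constant names are mine, and I'm assuming the padding bytes are zeros; the text above only says that padding isn't counted in TYPE_LENGTH.

```python
# TYPE_T values as described above (the constant names are hypothetical).
TYPE_UNCHANGED = 0x0   # "same as before"
TYPE_MEDIA = 0x1       # MIME media type
TYPE_URI = 0x2         # absolute URI

def pad4(data: bytes) -> bytes:
    """Append zero bytes (an assumption) up to a multiple of four."""
    return data + b"\x00" * (-len(data) % 4)

def encode_type(type_t: int, value: str):
    """Return (TYPE_T, TYPE_LENGTH, padded TYPE field) for a payload type."""
    raw = value.encode("utf-8")
    # TYPE_LENGTH counts the real bytes only, never the padding.
    return type_t, len(raw), pad4(raw)

type_t, type_length, type_field = encode_type(TYPE_MEDIA, "image/jpeg")
# type_length is 10, while the field on the wire occupies 12 bytes.
```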

So, if a payload doesn't fit into a single record, here's what we do:

The first record puts the full typing information into the TYPE_T, TYPE_LENGTH and TYPE fields, sets the CF bit (the chunk flag), and includes as much of the data as it wants. If the message has an identifier, then it can be put into the ID field, and the ID_LENGTH field must be set to indicate the identifier length (again, ignoring any padding).
Subsequent records have the CF bit set, TYPE_T field set to 0x0 -- now we see what "same as before" means -- and the TYPE_LENGTH and ID_LENGTH fields must be zero, implying that the TYPE and ID fields are omitted from the record. (The spec doesn't say but certainly implies that padding must be to the lowest multiple of four not less than the true field size.)
The last record is treated the same as the intermediate records, except that the CF bit is cleared.
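The three steps above can be sketched in a few lines. This is an illustration under stated assumptions, not a conforming implementation: it assumes the payload is the entire message (so MB goes on the first record and ME on the last), it writes no options, and it pads with zero bytes. The helper names build_record and chunk_payload are hypothetical.

```python
import struct

def pad4(data: bytes) -> bytes:
    """Pad with zero bytes (an assumption) to a multiple of four."""
    return data + b"\x00" * (-len(data) % 4)

def build_record(data, *, mb, me, cf, type_t=0, type_=b"", id_=b""):
    """Serialize one version-1 record with no options."""
    b0 = (1 << 3) | (int(mb) << 2) | (int(me) << 1) | int(cf)
    b1 = type_t << 4                       # reserved bits left at zero
    header = struct.pack(">BBHHHI", b0, b1, 0, len(id_), len(type_), len(data))
    return header + pad4(id_) + pad4(type_) + pad4(data)

def chunk_payload(data, chunk_size, type_t, type_, id_=b""):
    """Split one payload into chunked records, following the steps above."""
    pieces = [data[i:i + chunk_size]
              for i in range(0, len(data), chunk_size)] or [b""]
    records = []
    for n, piece in enumerate(pieces):
        first, last = n == 0, n == len(pieces) - 1
        records.append(build_record(
            piece,
            mb=first, me=last,
            cf=not last,                      # cleared only on the last chunk
            type_t=type_t if first else 0,    # 0x0 means "same as before"
            type_=type_ if first else b"",    # TYPE omitted after the first
            id_=id_ if first else b"",        # ID omitted after the first
        ))
    return records

# Ten bytes of payload in chunks of four: three records.
records = chunk_payload(b"x" * 10, 4, 0x1, b"text/plain")
```

Note that nothing in the first record says how many chunks will follow, which is exactly what lets a sender start transmitting before it knows the total size.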

Because there's no requirement on the size of each chunk, DIME is well suited for encapsulating content whose size isn't known in advance, such as dynamically generated content, and for cases where it's undesirable to have the sender or receiver buffer the entire message. Not surprisingly, this is the second design goal enumerated in section 1.3 of the DIME spec. It seems well suited, then, to web servers (or web services servers) that will send such content to clients.

On the down side, it's not as well suited for clients that want to send this type of content to a web server. Imagine Apple's iMovie as a set of web services. Since there's content, the HTTP POST command must be used; if the content length isn't known in advance, HTTP/1.1's chunked encoding must be used. This means requiring HTTP/1.1, which might not be feasible, and further requiring a server that accepts chunked POSTs. Few servers do, particularly if the URL refers to an external CGI.

Requiring the Content-Length header on POST messages allows servers to be predictable in their use of resources, helps protect them against over-consumption and denial of service attacks, and in general seems like a good trade-off. (When the server does send this type of data, it's going to use HTTP's chunked encoding to send chunked DIME records...everyone's chunking up all over the place.)
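For comparison, here is roughly what HTTP/1.1's chunked transfer coding puts on the wire: each chunk is its size in hexadecimal, a CRLF, the bytes, and another CRLF, with a zero-size chunk marking the end. The function name is mine, and the sketch leaves out chunk extensions and trailers. Notice that the HTTP chunk boundaries need not line up with the DIME record boundaries at all.

```python
def http_chunked(pieces):
    """Frame an iterable of byte strings using HTTP/1.1 chunked coding."""
    out = bytearray()
    for piece in pieces:
        if not piece:
            continue  # a zero-length chunk would signal end of body
        out += f"{len(piece):x}\r\n".encode("ascii")
        out += piece + b"\r\n"
    out += b"0\r\n\r\n"  # terminating chunk, no trailers
    return bytes(out)

body = http_chunked([b"hello", b"dime!!"])
# body == b"5\r\nhello\r\n6\r\ndime!!\r\n0\r\n\r\n"
```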

Unfortunately, trade-offs often conflict, so it's impossible to have generic support for DIME in a POST body. We can rationalize that by pointing out that dynamic content is more likely to flow from server to client than the other way 'round.

WS-Attachments

The WS-Attachments specification is fairly simple, and if you've read the SwA note you'll find much of it familiar. Simply put, the spec says that a SOAP message with attachments is sent as a multi-payload DIME message. Again, each payload corresponds to a (possibly chunked) DIME record: the first record has the MB bit set, and the last record has the ME bit set.

The first record is the SOAP message. For SOAP 1.1 the type must be the namespace identifier of the SOAP 1.1 namespace, which as I'm sure we've all committed to memory is http://schemas.xmlsoap.org/soap/envelope/. For SOAP 1.2 it should be the MIME type being developed as an internet-draft, application/soap+xml.

Each subsequent record is used to encapsulate a single attachment. Attachments will have a name that corresponds to the href attribute referent in the initial SOAP message. This corresponds to the Content-ID header in the SwA note, and names like URLs, a UUID, or an architected Message-ID, as I described last month, are all good choices.
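Putting the pieces together, a SOAP 1.1 message with attachments might be assembled along these lines. This is a sketch under several assumptions: no payload is chunked, no options are written, padding is zero bytes, and the attachment identifier "uuid:1234" is a made-up placeholder. The helper names are hypothetical too.

```python
import struct

SOAP11_NS = b"http://schemas.xmlsoap.org/soap/envelope/"
TYPE_MEDIA, TYPE_URI = 0x1, 0x2

def pad4(data: bytes) -> bytes:
    """Pad with zero bytes (an assumption) to a multiple of four."""
    return data + b"\x00" * (-len(data) % 4)

def build_record(data, *, mb, me, type_t, type_, id_=b""):
    """Serialize one unchunked version-1 record with no options."""
    b0 = (1 << 3) | (int(mb) << 2) | (int(me) << 1)   # CF left clear
    header = struct.pack(">BBHHHI", b0, type_t << 4, 0,
                         len(id_), len(type_), len(data))
    return header + pad4(id_) + pad4(type_) + pad4(data)

def ws_attachments_message(envelope, attachments):
    """attachments is a list of (id, media_type, data) byte-string triples."""
    # The SOAP envelope always travels in the first record, typed by URI.
    parts = [(b"", TYPE_URI, SOAP11_NS, envelope)]
    # Each attachment gets its own record, named so an href can find it.
    parts += [(id_, TYPE_MEDIA, mtype, data) for id_, mtype, data in attachments]
    out = b""
    for n, (id_, type_t, type_, data) in enumerate(parts):
        out += build_record(data, mb=(n == 0), me=(n == len(parts) - 1),
                            type_t=type_t, type_=type_, id_=id_)
    return out

message = ws_attachments_message(
    b"<e/>",  # stand-in for a real SOAP envelope
    [(b"uuid:1234", b"image/jpeg", b"\xff\xd8")],
)
```

A receiver can hand the first record straight to its SOAP processor and resolve each href against the ID fields of the records that follow.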