Protocol Design: Reliability and Security
August 25, 2004
Protocols are part of the infrastructure used to implement a service, such as delivering an email message or requesting a web page. These services have security and reliability requirements that need to be implemented at different layers of the infrastructure. For example, emails should not be dropped along the way, and web pages should be returned without being tampered with by malicious third parties. When designing a protocol it is important to understand what guarantees the protocol and the infrastructure layers beneath it must provide, and at what layer they should be implemented.
Reliability, Security, and Other Guarantees
Some of the guarantees that protocols might want to provide include privacy, integrity, identity, reliability, and freshness. Privacy requires that third parties won't be able to get information about the contents of the messages, their destination, or other similar properties that depend on the requirements of the protocol. Integrity means the contents of messages will not get changed or corrupted. Identity lets the various sides of a conversation have some knowledge of who they are communicating with.
Reliability means that data arrives and is not dropped along the way. Freshness means that data that arrives is known to be up-to-date and not, for example, a copy of an old message. Depending on the service the protocol is intended to provide, some or all of these guarantees may be required or supported as options.
What guarantees does TCP provide? TCP detects data that was accidentally corrupted in transit and retransmits it, and it makes sure that data arrives at the destination IP address only once and in the correct order. TCP provides almost no privacy, and its checksum offers no protection against deliberate tampering. At the very least, any router along the path between the two IP addresses can read the contents of the TCP connection.
The Transport Layer Security (TLS) protocol, the successor to the SSL standard, was developed in order to provide stronger guarantees than TCP alone can. TLS was designed as a layer on top of TCP that has similar semantics to TCP (i.e. a reliable, ordered stream of data) while providing additional guarantees. As a result, protocols that run on top of TCP can be modified to run on top of TLS with minimal or no changes. For example, HTTPS URLs on the Web are loaded by running HTTP over SSL or TLS (which in turn run over TCP).
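To make the "minimal or no changes" point concrete, here is a minimal sketch using Python's standard socket and ssl modules. The helper names (make_tls_context, open_stream, fetch_head) are hypothetical and chosen for illustration only. The same HTTP request bytes are written to either a plain TCP socket or a TLS-wrapped one, and the protocol code does not care which it was given.

```python
import socket
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Return a client-side context that verifies the server's certificate and hostname."""
    return ssl.create_default_context()


def open_stream(host: str, port: int, use_tls: bool) -> socket.socket:
    """Open a reliable, ordered byte stream to host:port, optionally wrapped in TLS."""
    sock = socket.create_connection((host, port), timeout=10)
    if not use_tls:
        return sock
    # wrap_socket performs the TLS handshake. The returned object is still
    # used through sendall() and recv(), just like the plain TCP socket.
    return make_tls_context().wrap_socket(sock, server_hostname=host)


def fetch_head(host: str, use_tls: bool) -> bytes:
    """Send the same HTTP request over either transport; return the start of the reply."""
    port = 443 if use_tls else 80
    request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    with open_stream(host, port, use_tls) as stream:
        stream.sendall(request.encode("ascii"))
        return stream.recv(4096)
```

Calling fetch_head with use_tls set to True or False exercises the same request-building and reading code. Only the function that opens the stream knows whether TLS is involved.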
What additional guarantees does TLS provide? TLS provides some amount of privacy, as the contents of messages are encrypted. It provides a more robust concept of identity, by using public/private keys to identify the end points of the connection. TLS uses cryptographically secure checksums (message authentication codes) to detect data that has been tampered with, corrupted, or replayed within the connection, thus providing integrity and freshness beyond what TCP provides.
Even though TLS provides extra services that TCP cannot, there are services that even basic protocols require which TLS doesn't and can't provide. The problem is that both TCP and TLS are tied to a connection, but protocols may implement services that need to work beyond the span of a connection.
A file transfer protocol, for example, should verify that the file was actually stored correctly on the destination machine's storage, as delivering the non-corrupted file to the destination is the goal of the protocol. Unfortunately this is something that TLS cannot possibly provide, because storing a file to disk happens outside the scope of the TCP connection.
The protocol, on the other hand, can provide integrity and reliability by asking, once the transfer has completed, for a checksum of the stored file and comparing it to the checksum of the local file. If the checksums don't match, the file can simply be transferred again. Even though the communication is done over a reliable connection, TCP and TLS can't provide these guarantees; they can only make guarantees about the data they transmit within the scope of their connection.
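The check-and-retransfer loop described above can be sketched as follows. This is an illustration under stated assumptions, not a real file transfer protocol: send_file and remote_checksum are hypothetical callbacks standing in for the protocol's "store this file" and "report the checksum of what you stored" operations, SHA-256 is an arbitrary choice of checksum, and FlakyStore is an invented stand-in for a remote machine whose first write is damaged after the data arrived intact.

```python
import hashlib
from typing import Callable


def sha256_hex(data: bytes) -> str:
    """Checksum used by both sides to describe a file's contents."""
    return hashlib.sha256(data).hexdigest()


def transfer_with_verification(
    data: bytes,
    send_file: Callable[[bytes], None],
    remote_checksum: Callable[[], str],
    max_attempts: int = 3,
) -> int:
    """Send data, compare checksums, and resend on mismatch.

    Returns the number of attempts that were needed.
    """
    expected = sha256_hex(data)
    for attempt in range(1, max_attempts + 1):
        send_file(data)
        # Ask the remote side what it actually stored, not what it received.
        if remote_checksum() == expected:
            return attempt
    raise IOError(f"checksum mismatch after {max_attempts} attempts")


class FlakyStore:
    """Hypothetical remote side: the first write is corrupted on its way to disk."""

    def __init__(self) -> None:
        self.stored = b""
        self.writes = 0

    def store(self, data: bytes) -> None:
        self.writes += 1
        # Simulate a storage fault that occurs after the connection has
        # already delivered the bytes correctly.
        self.stored = data[:-1] + b"?" if self.writes == 1 else data

    def checksum(self) -> str:
        return sha256_hex(self.stored)
```

With a FlakyStore, transfer_with_verification(b"hello world", store.store, store.checksum) needs two attempts: the first stored copy fails the comparison, and the second one matches. The connection layer would have reported success both times.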
Even when it comes to privacy, TLS is not always sufficient. When transferring email using SMTP, messages get passed from SMTP server to SMTP server until they reach the destination server, which delivers them to the destination user's mailbox. The SMTP mail-delivery model is similar to that used by postal mail, where the letter gets passed by truck to the mail sorting center and passed on by stages until it reaches the postal carrier, who will deliver it to the appropriate mailbox.
The problem is that even if the connections between the SMTP servers use TLS rather than plain TCP, the SMTP servers will still be able to read the contents of the email, because each message is decrypted before it reaches the protocol layer. TLS provides privacy from third parties: they can't read the email because the TLS protocol encrypts its data. On the other hand, TLS cannot provide privacy from the SMTP servers themselves. The privacy requirements for transferring an email are not tied to a specific TCP connection, and protecting a specific connection is all that TLS can do.
Besides privacy, TLS cannot provide other services we might want for email. TLS cannot indicate the identity of the sender of an email; it can only prove the identity of the end points of a specific connection. As with file transfer, TLS can't provide reliability to email delivery. Nor can TLS provide freshness or integrity, as SMTP servers can duplicate, modify, or delay emails they are relaying if they wish.
Layers Upon Layers
How, then, would identity and privacy be added to SMTP? Or, more specifically, how would one ensure that the recipient can verify the identity of the sender of an email and that only the intended recipient can read it? The common solution is not, interestingly enough, implemented in SMTP. Instead, it is implemented on top of SMTP, in the contents of the messages.
PGP and S/MIME, the two main alternatives, use public-key-based identities to encrypt and sign the email itself, unlike TLS, which encrypts the communication that transfers the emails. The email is already encrypted before delivery to the SMTP server. As a result, privacy, identity, and integrity are all implemented in such a way that the SMTP protocol, and the servers that implement it, need not be involved. Unlike with TLS, these services are preserved even in the face of malicious or malfunctioning SMTP servers.
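The idea of protecting the message rather than the connection can be illustrated with a small sketch. Note the substitution: PGP and S/MIME use public-key signatures and encryption, which Python's standard library does not provide, so this example swaps in an HMAC computed with a key shared by sender and recipient. It demonstrates only end-to-end integrity across untrusted relays, not privacy or public-key identity. All function names and the relay model are hypothetical.

```python
import hashlib
import hmac
from typing import Callable, List, Tuple

# A relay receives the message body and forwards it, possibly altered.
Relay = Callable[[bytes], bytes]
Message = Tuple[bytes, bytes]  # (body, authentication tag)


def seal(key: bytes, body: bytes) -> Message:
    """Sender side: attach a tag computed over the body before handing it to any server."""
    return body, hmac.new(key, body, hashlib.sha256).digest()


def deliver(message: Message, relays: List[Relay]) -> Message:
    """Pass the body through each relay in turn, as a chain of mail servers would."""
    body, tag = message
    for relay in relays:
        body = relay(body)
    return body, tag


def verify(key: bytes, message: Message) -> bool:
    """Recipient side: recompute the tag over what arrived and compare."""
    body, tag = message
    expected = hmac.new(key, body, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


def honest_relay(body: bytes) -> bytes:
    return body


def tampering_relay(body: bytes) -> bytes:
    # A malicious or malfunctioning server that rewrites part of the message.
    return body.replace(b"100", b"900")
```

A message sealed with seal and carried only by honest relays passes verify at the recipient, while the same message routed through tampering_relay fails it. The relays never need to cooperate or even know the check exists, which is the property the message-level approach relies on.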
Given the use of PGP or S/MIME, using TLS for SMTP communication does not add much. All it does is prevent third parties that are eavesdropping on the SMTP servers from discovering the sender and recipient email addresses of emails, at the price of putting a potentially large computational burden on the servers in order to implement TLS's cryptographic functionality.
As with privacy and integrity, reliability (knowing whether an email was delivered) can't be implemented at the connection level. Notifications of failed delivery (known as "bounces") and message receipts are sent as additional, specially formatted email messages forwarded on top of the SMTP delivery mechanisms.
Protocols need to provide a number of guarantees regarding reliability, integrity, and so on. When designing a protocol it is important to understand where these services are best implemented, what problems and potential attackers or failures they are intended to guard against, and what tradeoffs are involved in implementing them.
While some of these services can be implemented at the lower levels of the protocol stack, in many cases they are better implemented or must be implemented at higher levels. This principle is known as the end-to-end argument and is fundamental to the design of the Internet.