httplib2: HTTP Persistence and Authentication
Last time we covered HTTP caching and how it can improve the performance of your web service. This time we'll cover some other aspects of HTTP that, if fully utilized, can also speed up your web service.
Persistent connections are critical to performance. In early versions of HTTP, connections from the client to the server were built up and torn down for every request. That's a lot of overhead on the client, on the server, and on any intermediaries. The persistent connection approach, that is, keeping the same socket connection open for multiple requests, is the default behavior in HTTP 1.1.
Now if all HTTP 1.1 connections are considered persistent, then there must be a mechanism to signal that the connection is to be closed, right? That is handled by the Connection: header:
Connection: close
The header signals that the connection is to be closed after the current request-response is finished. Note that either the client or the server can send such a header.
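To make the rule concrete, here is a minimal sketch of the decision a client or server makes after each message. The function name connection_should_close and its arguments are hypothetical and are not part of httplib2; the sketch also ignores the Keep-Alive extension that some HTTP 1.0 implementations used.

```python
def connection_should_close(http_version, headers):
    """Return True if the socket should be closed after this message.

    http_version is a string such as "1.0" or "1.1" and headers is a
    plain dict of header names to values (both are assumptions made
    for this sketch).
    """
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}

    # The Connection header may carry a comma-separated list of tokens.
    raw = normalized.get("connection", "")
    tokens = {token.strip().lower() for token in raw.split(",") if token.strip()}

    if "close" in tokens:
        # Either side asked for the connection to be torn down.
        return True

    # HTTP 1.1 connections are persistent by default; in this simplified
    # sketch anything older is torn down after every request.
    return http_version != "1.1"
```

For example, connection_should_close("1.1", {}) is False, while adding a Connection: close header from either side flips the answer to True.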
If you allow persistent connections, then the next obvious optimization is pipelining: stuffing a bunch of requests down a socket without waiting for the response from the first request to be returned before sending subsequent requests. Now this only works for certain types of requests; at a minimum, those requests have to be idempotent. Now aren't you glad you made all your GETs idempotent when you designed your RESTful web service?
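As a rough illustration of what pipelining puts on the wire, the sketch below serializes several requests into one byte string that could be written to a socket in a single go. The helper build_pipeline is hypothetical, not an httplib2 function, and the set of methods it accepts follows the idempotent methods named in RFC 2616; reading the responses back in order is left out.

```python
# Methods RFC 2616 treats as idempotent; anything else is refused here.
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"}


def build_pipeline(host, requests):
    """Serialize (method, path) pairs into one pipelined request payload.

    host and requests are illustrative parameters for this sketch; the
    requests carry no bodies and no headers other than Host.
    """
    chunks = []
    for method, path in requests:
        method = method.upper()
        if method not in IDEMPOTENT_METHODS:
            raise ValueError("%s is not idempotent, so it cannot be pipelined" % method)
        # Each request is complete on its own; they are simply sent
        # back to back without waiting for the earlier responses.
        chunks.append("%s %s HTTP/1.1\r\nHost: %s\r\n\r\n" % (method, path, host))
    return "".join(chunks).encode("ascii")
```

Calling build_pipeline("example.org", [("GET", "/a"), ("GET", "/b")]) yields two complete GET requests in a single payload, while a POST in the list raises ValueError.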
So now we're saving time and bandwidth by using caching to avoid retrieving content if it hasn't changed, and using persistent connections to avoid the overhead of tearing down and rebuilding sockets. If you have an entity to transfer, then you can still speed things up by transferring fewer bytes over the wire--that is, by using compression.
Though RFC 2616 specifies three types of compression, the values are actually tracked in an IANA registry and could in theory be supplemented by others. But it's been nine years since HTTP 1.1 was released and the registry hasn't been added to yet. Even at that, with three types of compression specified, only two, gzip and deflate, are regularly seen in the wild.
The way compression normally works is that the client announces the types of compression it can handle by using the Accept-Encoding: request header:
Accept-Encoding: gzip;q=1.0, identity; q=0.5, *;q=0
Those are weighted parameters with resolution rules similar to the mime-types in Accept: headers. I covered parsing and interpreting those in Just Use Media Types?, which you should read if you missed it the first time.
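The weighting can be sketched in a few lines. The two functions below are hypothetical helpers, not httplib2 code: one turns an Accept-Encoding: value into a table of q values, and the other picks the best coding a server could use. Giving an unlisted identity coding a tiny positive weight is an assumption of this sketch, meant only to keep uncompressed responses available as a fallback.

```python
def parse_accept_encoding(header):
    """Map each content coding in the header to its q value."""
    weights = {}
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        coding = parts[0].lower()
        if not coding:
            continue
        q = 1.0  # a coding listed without a q parameter gets full weight
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        weights[coding] = q
    return weights


def choose_encoding(header, supported):
    """Pick the supported coding with the highest weight, or None.

    supported is a list in the server's order of preference; on a tie
    the earlier entry wins.
    """
    weights = parse_accept_encoding(header)
    best, best_q = None, 0.0
    for coding in supported:
        if coding in weights:
            q = weights[coding]
        elif "*" in weights:
            q = weights["*"]  # the wildcard covers anything not listed
        elif coding == "identity":
            q = 0.001  # assumption: unlisted identity stays a last resort
        else:
            q = 0.0
        if q > best_q:
            best, best_q = coding, q
    return best
```

With the header shown above, a server that supports deflate, gzip, and identity would pick gzip; a server that only supports deflate gets None back, because *;q=0 rules out everything that is not listed.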
If the server supports any of the listed compression types, it can compress the response entity and announce how a response was compressed so that the client can decompress it correctly. That information is carried by the Content-Encoding: response header, for example Content-Encoding: gzip.
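On the client side, undoing the compression comes down to looking at the Content-Encoding: value and picking the matching decompressor. The sketch below uses only Python's standard gzip and zlib modules. The function decode_body and the assumption that headers is a dict with lowercase keys are illustrative and are not a description of how httplib2 does it internally.

```python
import gzip
import zlib


def decode_body(headers, body):
    """Return the entity body with any content coding removed.

    headers is assumed to be a dict keyed by lowercase header names and
    body a bytes object, both as received from the server.
    """
    encoding = headers.get("content-encoding", "identity").strip().lower()

    if encoding in ("", "identity"):
        return body  # nothing to undo
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        try:
            # The specified form is a zlib-wrapped deflate stream.
            return zlib.decompress(body)
        except zlib.error:
            # Some servers send a raw deflate stream without the zlib
            # wrapper, so retry with the header check disabled.
            return zlib.decompress(body, -zlib.MAX_WBITS)
    raise ValueError("unsupported Content-Encoding: %r" % encoding)
```

A response with no Content-Encoding: header passes through untouched, and an unrecognized coding raises ValueError rather than handing back bytes the caller cannot interpret.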
In the process of implementing httplib2 I also discovered some rough spots in HTTP implementations.