XML.com: XML From the Inside Out

Doing HTTP Caching Right: Introducing httplib2

February 01, 2006

You need to understand HTTP caching. No, really, you do. I have mentioned repeatedly that you need to choose your HTTP methods carefully when building a web service, in part because GET gives you the performance benefits of caching. If you want the real advantages of GET, you need to understand caching and how to use it effectively to improve the performance of your service.

This article will not explain how to set up caching for your particular web server, nor will it cover the different kinds of caches. If you want that kind of information, I recommend Mark Nottingham's excellent tutorial on HTTP caching.

Goals

First you need to understand the goals of the HTTP caching model. One objective is to let both the client and the server have a say over when to return a cached entry. As you can imagine, letting both sides influence when a cached entry is considered stale introduces some complexity.

The HTTP caching model is based on validators, which are bits of data that a client can use to check that a cached response is still valid. They are fundamental to the operation of caches because they allow a client or intermediary to query the status of a resource without transferring the entire response again: the server returns an entity body only if the validator indicates that the cache has a stale response.

Validators

One of the validators for HTTP is the ETag. An ETag is like a fingerprint for the bytes in the representation; if a single byte changes, the ETag also changes.
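To make the fingerprint idea concrete, here is a minimal sketch that derives an ETag from the bytes of a representation. Hashing the body is an assumption made for illustration; servers are free to generate ETags however they like, and the helper name make_etag is hypothetical.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Return a quoted ETag derived from the representation bytes.

    Assumption: the tag is a SHA-1 digest of the body. Real servers may
    use any scheme, as long as the tag changes when the bytes change.
    """
    digest = hashlib.sha1(body).hexdigest()
    # ETag values are quoted strings on the wire.
    return '"' + digest + '"'


original = make_etag(b"-- binary data --")
changed = make_etag(b"-- binary data -+")  # one byte differs
```

With this scheme the same bytes always produce the same tag, and changing a single byte produces a different one.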

Using validators requires that you have already done a GET on the resource. The cache stores the value of the ETag header, if present, and then uses that value in later requests to the same URI.
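The bookkeeping described above could look something like the following sketch. The class name ValidatorCache and its methods are hypothetical, and it only remembers the validator, not the cached body.

```python
from typing import Dict


class ValidatorCache:
    """Remembers the last ETag seen for each URI (illustrative only)."""

    def __init__(self) -> None:
        self._etags: Dict[str, str] = {}

    def remember(self, uri: str, response_headers: Dict[str, str]) -> None:
        # Header names are case-insensitive, so normalize before the lookup.
        lowered = {name.lower(): value for name, value in response_headers.items()}
        etag = lowered.get("etag")
        if etag is not None:
            self._etags[uri] = etag

    def conditional_headers(self, uri: str) -> Dict[str, str]:
        # On a later request to the same URI, send the stored validator back.
        etag = self._etags.get(uri)
        return {"If-None-Match": etag} if etag is not None else {}
```

A client would call remember() after every successful GET and merge conditional_headers() into the next request for that URI.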

For example, if I send a request to example.org and get back this response:

HTTP/1.1 200 OK
Date: Fri, 30 Dec 2005 17:30:56 GMT
Server: Apache
ETag: "11c415a-8206-243aea40"
Accept-Ranges: bytes
Content-Length: 33286
Vary: Accept-Encoding,User-Agent
Cache-Control: max-age=7200
Expires: Fri, 30 Dec 2005 19:30:56 GMT
Content-Type: image/png 

-- binary data --

Then the next time I do a GET, I can include the validator. Note that the value of the ETag header is placed in the If-None-Match header.

GET / HTTP/1.1
Host: example.org
If-None-Match: "11c415a-8206-243aea40"

If there was no change in the representation then the server returns a 304 Not Modified.

HTTP/1.1 304 Not Modified 
Date: Fri, 30 Dec 2005 17:32:47 GMT
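On the server side, the decision between a 304 and a full response can be sketched roughly as follows. The function respond and its signature are hypothetical, and the sketch assumes the request carries at most one ETag; a real If-None-Match header may also list several tags or contain "*".

```python
from typing import Dict, Tuple


def respond(
    request_headers: Dict[str, str],
    current_etag: str,
    body: bytes,
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET, honoring If-None-Match."""
    client_etag = request_headers.get("If-None-Match")
    if client_etag == current_etag:
        # The client's copy is still good: no entity body is sent.
        return 304, {"ETag": current_etag}, b""
    # No validator, or a stale one: send the full representation.
    headers = {"ETag": current_etag, "Content-Length": str(len(body))}
    return 200, headers, body
```

The saving comes from the first branch: the server answers with a status line and a few headers instead of the whole 33286-byte image.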

If there was a change, the new representation is returned with a status code of 200 and a new ETag.

HTTP/1.1 200 OK
Date: Fri, 30 Dec 2005 17:32:47 GMT
Server: Apache
ETag: "0192384-9023-1a929893"
Accept-Ranges: bytes
Content-Length: 33286
Vary: Accept-Encoding,User-Agent
Cache-Control: max-age=7200
Expires: Fri, 30 Dec 2005 19:32:47 GMT
Content-Type: image/png 

-- binary data --
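Putting the two outcomes together, a client-side cache might handle the response along these lines. This is a sketch under stated assumptions: CachedResponse and update_cache are hypothetical names, and only the 200 and 304 cases from the examples above are handled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedResponse:
    etag: Optional[str]
    body: bytes


def update_cache(
    cached: Optional[CachedResponse],
    status: int,
    headers: Dict[str, str],
    body: bytes,
) -> CachedResponse:
    """Return the entry to serve after a (possibly conditional) GET."""
    if status == 304:
        if cached is None:
            raise ValueError("304 received but nothing is cached")
        # Not modified: keep serving the stored representation.
        return cached
    if status == 200:
        # New representation: replace the entry and remember the new ETag.
        return CachedResponse(etag=headers.get("ETag"), body=body)
    raise ValueError("unhandled status %d" % status)
```

For example, a 304 returns the existing entry untouched, while a 200 carrying ETag "0192384-9023-1a929893" yields a fresh entry with the new body and validator.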
