Introducing WSGI: Python's Secret Web Weapon, Part Two
October 4, 2006
Web Server Gateway Interface Part II: Making Use of Middleware
In Part I we discussed how the Web Server Gateway Interface (WSGI) is helping to unify the Python web-framework world by providing a standard API through which different web applications can operate with different servers. We also looked at the HTTP protocol and how to write and deploy WSGI applications. In the second part of this article, we'll look at how to make use of existing middleware components to add functionality to your WSGI applications.
What is Middleware?
WSGI middleware is software that behaves like a server to an application, passing an environ dictionary and start_response callable to the application in the same way a server would. Middleware components also expect the application to return an iterable, such as a list of strings, or to be iterable itself.
Importantly, middleware also behaves like an application to the server, expecting to receive an environ dictionary and start_response callable itself, and returning an iterable back to the server.
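As a minimal sketch of that two-sided behaviour, here is a middleware component that does nothing except pass the request through. The names PassThroughMiddleware and call_app are made up for this illustration, and the body is a byte string so the sketch also runs on current Python releases, unlike the plain strings used in the article's own listings.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    # A trivial WSGI application used only for this illustration.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello from the application\n']


class PassThroughMiddleware(object):
    """Hypothetical middleware that forwards every request unchanged."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Server-facing side: we are called exactly like an application.
        # Application-facing side: we call the wrapped app like a server.
        return self.app(environ, start_response)


def call_app(app):
    """Call a WSGI app once, roughly as a server would, and collect the result."""
    environ = {}
    setup_testing_defaults(environ)  # fill in a plausible request environ
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body


wrapped = PassThroughMiddleware(application)
status, headers, body = call_app(wrapped)
```

Because wrapped takes the same arguments and returns the same kind of iterable as application, a server cannot tell the two apart.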
Middleware effectively sits between a server and an application, isolating one from the other, and can therefore do any of the following or a combination thereof:
- Provide more functionality by adding a key to the environ dictionary
- Change the status
- Intercept an error
- Add, remove, or change headers
- Change a response
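Two of these operations, adding a key to environ and adding a header, can be sketched in a few lines. GreetingMiddleware, the example.greeting key and the X-Example header are all invented for this illustration; the pattern of wrapping start_response is the part that matters.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    # Reads the key that the middleware below is assumed to provide.
    greeting = environ['example.greeting']
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [greeting.encode('utf-8')]


class GreetingMiddleware(object):
    """Hypothetical middleware: adds an environ key and a response header."""

    def __init__(self, app, greeting):
        self.app = app
        self.greeting = greeting

    def __call__(self, environ, start_response):
        # Provide more functionality by adding a key to environ.
        environ['example.greeting'] = self.greeting

        def custom_start_response(status, headers, exc_info=None):
            # Add a header before passing the call on to the server.
            headers = list(headers) + [('X-Example', 'middleware')]
            return start_response(status, headers, exc_info)

        return self.app(environ, custom_start_response)


def call_app(app):
    """Call a WSGI app once and collect status, headers and body."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = headers

    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body


status, headers, body = call_app(GreetingMiddleware(application, 'Hi there'))
```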
Middleware is therefore extremely powerful and can be used to build a broad range of discrete components that work with different WSGI servers and applications. For example, a middleware component can:
- Produce error documents when certain status codes are received (typically responding to 404 and 500 codes)
- Email error reports to a developer if a problem occurs
- Provide interactive debugging facilities
- Forward requests to other parts of the application
- Test the API compliance of applications and servers with the WSGI specification
- Authenticate a user
- Cache pages
- Provide a session store
- Gzip the response
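To give a feel for the first item, here is a rough sketch of an error-document component that swaps in its own page whenever the application answers with a 404 status. NotFoundDocumentMiddleware is a hypothetical name, and the sketch makes two simplifying assumptions: it buffers the whole response before deciding what to send, and it does not support applications that use the write() callable returned by start_response.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    # Illustrative app: only the root path exists.
    if environ.get('PATH_INFO') == '/':
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'Home page\n']
    start_response('404 Not Found', [('Content-Type', 'text/plain')])
    return [b'nothing here\n']


class NotFoundDocumentMiddleware(object):
    """Hypothetical middleware that replaces the body of 404 responses."""

    def __init__(self, app, document):
        self.app = app
        self.document = document  # bytes to send instead of the app's body

    def __call__(self, environ, start_response):
        recorded = {}

        def recording_start_response(status, headers, exc_info=None):
            # Hold the status and headers until we know whether to replace them.
            recorded['args'] = (status, headers, exc_info)

        app_iter = self.app(environ, recording_start_response)
        try:
            body = list(app_iter)  # buffer the application's response
        finally:
            if hasattr(app_iter, 'close'):
                app_iter.close()

        status = recorded['args'][0]
        if status.startswith('404'):
            headers = [('Content-Type', 'text/plain'),
                       ('Content-Length', str(len(self.document)))]
            start_response(status, headers)
            return [self.document]
        # Any other status is passed through untouched.
        start_response(*recorded['args'])
        return body


def call_app(app, path):
    """Call a WSGI app once for the given path and collect the result."""
    environ = {}
    setup_testing_defaults(environ)
    environ['PATH_INFO'] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status

    body = b''.join(app(environ, start_response))
    return captured['status'], body


wrapped = NotFoundDocumentMiddleware(application, b'Sorry, page not found.\n')
home = call_app(wrapped, '/')
missing = call_app(wrapped, '/missing')
```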
Have a look at the middleware and utilities page on the wsgi.org site to get an idea of some of the middleware that already exists. Paste, one of the packages mentioned, contains many middleware components of its own that are not listed separately on the wsgi.org page, but are worth taking the time to investigate.
Getting Started
As an example, we'll create an application that uses session middleware to store the value of a variable between requests. Here is the application that needs session support:
    def application(environ, start_response):
        # The session middleware is expected to have placed the session here.
        session = environ['beaker.session']
        if 'value' not in session:
            session['value'] = 0
        session['value'] += 1
        session.save()
        start_response('200 OK', [('Content-type', 'text/plain')])
        return ['The current value is: %d' % session['value']]
The application stores a variable called value in the session store. On each request, the variable is incremented and a message stating its current value is returned. For this application to work, the environ dictionary needs to contain the beaker.session key, which is provided by the session middleware from the beaker package.
Here's how you would wrap the application in beaker's session middleware:
    # Newer Beaker releases provide this class in beaker.middleware instead.
    from beaker.session import SessionMiddleware

    application = SessionMiddleware(
        application,
        key='mysession',
        secret='randomsecret',
    )
The new application object behaves just like a normal WSGI application. When the combined application and middleware object is called, the SessionMiddleware adds the beaker.session key to the environ dictionary and calls the original application with the modified environ dictionary and start_response() callable. The application then calls start_response() as normal and returns an iterable to the middleware. The middleware returns this information to the server so that, from the server's point of view, the combined middleware and application can be treated in exactly the same way as a normal WSGI application.
You can test the example above by serving the finished application with the following code and visiting http://localhost:8000 on your local machine once the server is running:
    def application(environ, start_response):
        session = environ['beaker.session']
        if 'value' not in session:
            session['value'] = 0
        session['value'] += 1
        session.save()
        start_response('200 OK', [('Content-type', 'text/plain')])
        return ['The current value is: %d' % session['value']]

    # Wrap the application in the session middleware.
    from beaker.session import SessionMiddleware
    application = SessionMiddleware(
        application,
        key='mysession',
        secret='randomsecret',
    )

    # Serve the combined object on port 8000.
    from wsgiref import simple_server
    httpd = simple_server.WSGIServer(
        ('', 8000),
        simple_server.WSGIRequestHandler,
    )
    httpd.set_app(application)
    httpd.serve_forever()
If you are running a version of Python prior to 2.5, you will need to download and install the wsgiref package as described in Part I of this article. You will also need to download and install the beaker package, which provides SessionMiddleware.
If you test the example above, you may find the count goes from 1 to 3, missing 2. This is because many web browsers try to retrieve a /favicon.ico file the first time a site is visited, and this request also results in value being incremented. If you have the LiveHTTPHeaders extension for the Firefox web browser installed, you'll be able to see the request being made when you visit a site not already in the browser's cache.
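One way to stop the favicon request from disturbing the count is to answer it before the session is touched. The sketch below is only an illustration of that idea: StandInSession and the request helper are made up so the example runs without Beaker installed, and the responses are byte strings so it also runs on current Python releases.

```python
from wsgiref.util import setup_testing_defaults


class StandInSession(dict):
    """Stand-in for the Beaker session, only so this sketch runs on its own."""

    def save(self):
        pass  # the real session would persist its data here


def application(environ, start_response):
    # Answer favicon requests before touching the session.
    if environ.get('PATH_INFO') == '/favicon.ico':
        start_response('404 Not Found', [('Content-Type', 'text/plain')])
        return [b'Not Found']
    session = environ['beaker.session']
    if 'value' not in session:
        session['value'] = 0
    session['value'] += 1
    session.save()
    start_response('200 OK', [('Content-Type', 'text/plain')])
    message = 'The current value is: %d' % session['value']
    return [message.encode('utf-8')]


def request(path, session):
    """Simulate one request for path, sharing the given session object."""
    environ = {}
    setup_testing_defaults(environ)
    environ['PATH_INFO'] = path
    environ['beaker.session'] = session
    result = {}

    def start_response(status, headers, exc_info=None):
        result['status'] = status

    body = b''.join(application(environ, start_response))
    return result['status'], body


session = StandInSession()
first = request('/', session)
icon = request('/favicon.ico', session)
second = request('/', session)
```

With the early return in place, the two page requests report 1 and 2 even though a favicon request arrives between them.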
Middleware Chains
In the previous example, we saw how adding a single middleware component to an application gave it powerful new functionality. In fact, you don't have to stop at one middleware component. Since a combined middleware and application object is also a valid WSGI application, you can also wrap the combined application and middleware object in another middleware component. This leads to the idea of middleware chains, where you have a number of pieces of middleware between the server and the application. Below is an example using some fictional middleware components:
    MyEnvironMiddleware(
        MyStatusMiddleware(
            SessionMiddleware(
                application,
                key='mysession',
                secret='randomsecret',
            )
        ),
        'Some Configuration Option',
    )
In situations such as the one above where you are using a number of middleware components, it is often more convenient to structure your code like this:
    application = SessionMiddleware(application, key='mysession', secret='randomsecret')
    application = MyStatusMiddleware(application)
    application = MyEnvironMiddleware(application, 'Some Configuration Option')
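It is worth being clear about the order in which a chain runs: the component applied last is the outermost one, so it sees the request first. The following sketch demonstrates this with a made-up TagMiddleware that records its name in a hypothetical example.trace key; neither name comes from a real package.

```python
from wsgiref.util import setup_testing_defaults


class TagMiddleware(object):
    """Hypothetical middleware that records the order in which it is called."""

    def __init__(self, app, name):
        self.app = app
        self.name = name

    def __call__(self, environ, start_response):
        environ.setdefault('example.trace', []).append(self.name)
        return self.app(environ, start_response)


def application(environ, start_response):
    # Report the path the request took through the chain.
    trace = ' -> '.join(environ['example.trace'] + ['application'])
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [trace.encode('utf-8')]


# Built in the same step-by-step style as the listing above.
app = TagMiddleware(application, 'inner')
app = TagMiddleware(app, 'outer')

environ = {}
setup_testing_defaults(environ)
body = b''.join(app(environ, lambda status, headers, exc_info=None: None))
```

Here body contains outer -> inner -> application, showing that the request passes through the most recently added component first.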
In a similar way, it's possible to create an entire web-framework stack just out of individual WSGI middleware components; indeed, the popular Pylons web framework, used to build production sites worldwide, already takes this approach.
Having a stack made entirely from WSGI middleware has a huge advantage: developers are free to pick and choose the components they need, or even to replace the parts of the stack they don't like by simply changing which middleware they use. If you've ever tried changing parts of the application stack in other framework architectures, you understand how hard it can sometimes be.
Error Handling
During development of an application, it's really useful to be able to debug errors. The first step is to display an error report. To do this, you can use the CgitbMiddleware component from the Paste project:
    from paste.cgitb_catcher import CgitbMiddleware

    application = CgitbMiddleware(application, {'debug': True})
Now if an exception is raised in your application code, a full error report--similar to the one shown below--will be displayed.
Figure 1. CgitbMiddleware middleware in action
For production deployment, you'd want to disable this facility to prevent a visitor from accidentally being shown the values of important variables, such as passwords that might otherwise be displayed if an error occurred.
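A production setup might instead log the traceback privately and show visitors a generic message. The sketch below is one possible way to do that with the standard library alone; ErrorLoggingMiddleware and the example.errors logger name are assumptions for this illustration, not part of Paste.

```python
import logging
import sys
from wsgiref.util import setup_testing_defaults

logger = logging.getLogger('example.errors')  # hypothetical logger name


class ErrorLoggingMiddleware(object):
    """Hypothetical middleware: log the traceback, show a generic 500 page."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        try:
            app_iter = self.app(environ, start_response)
            try:
                # Consume the body so errors raised while iterating are caught.
                return list(app_iter)
            finally:
                if hasattr(app_iter, 'close'):
                    app_iter.close()
        except Exception:
            # The details go to the log, not to the visitor.
            logger.exception('Unhandled error in WSGI application')
            # Passing exc_info lets start_response be called a second time
            # if the application had already called it before failing.
            start_response('500 Internal Server Error',
                           [('Content-Type', 'text/plain')],
                           sys.exc_info())
            return [b'An internal error occurred.\n']


def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    raise Exception('Something went wrong!')


environ = {}
setup_testing_defaults(environ)
captured = {}


def start_response(status, headers, exc_info=None):
    captured['status'] = status


body = b''.join(ErrorLoggingMiddleware(application)(environ, start_response))
```

The visitor sees only the generic message, while the full traceback is available to whoever reads the log.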
While it is undoubtedly useful to have a traceback report, it would be even more useful to be able to interactively debug each part of the call stack up to the point at which the error occurred by using a web-based command prompt. Such a solution already exists and can be added in exactly the same way:
    from paste.evalexception import EvalException

    application = EvalException(application)
The EvalException middleware will not work in a multiprocess environment, such as a WSGI application deployed as a CGI script, because the middleware must be able to store information about the error in memory. This is not possible if the whole application is restarted on each request. If you want to test the EvalException middleware, you could use this code:
    def application(environ, start_response):
        start_response('200 OK', [('Content-type', 'text/plain')])
        response = []
        variable1 = 'All local variables will be displayed'
        response.append('Everything is going fine...\n')
        raise Exception('Something went wrong!')
        response.append("We won't get to here!")
        return response

    from paste.evalexception import EvalException
    application = EvalException(application)

    from wsgiref import simple_server
    httpd = simple_server.WSGIServer(
        ('', 8000),
        simple_server.WSGIRequestHandler,
    )
    httpd.set_app(application)
    httpd.serve_forever()
If you run the example above and visit http://localhost:8000, you will see the error report. Clicking on the icons in the report will give you the interactive debug prompt and a representation of the code at the point where the error occurred. Try entering this at the prompt and pressing Enter:
print "Hello World!"
You will see Hello World! printed exactly as if it were entered at a normal Python prompt because the middleware acts like a full Python interpreter. You can even use the up and down arrows on your keyboard to scroll through the command history, just as you can at a real Python prompt.
The EvalException middleware also displays the values of local variables and has an extra data button that displays extra information about the environment.
The screenshot below shows the middleware in action with the interactive prompt and local variables, including variable1, displayed:
Figure 2. EvalException middleware in action
Having this much power makes it much easier to debug applications, but makes it even more important that you disable debugging for production use so that malicious visitors can't execute destructive commands through your debug screen if an error occurs.
Once again, this example shows just how much useful functionality can be added to an application using a single middleware component.
Configuration
When the Web Server Gateway Interface was being drawn up, there were a number of discussions about how best to deploy a finished application with middleware. Clearly developers couldn't expect non-technical users to directly modify the middleware chains themselves. The most widely adopted solution is to use PasteDeploy. Users edit a configuration file in a familiar INI-style format, and the desired application--with any necessary middleware--is created by PasteDeploy from that file. PasteDeploy is used like this:
    from paste.deploy import loadapp

    application = loadapp('config:/path/to/config.ini')
The configuration file can be used to specify server settings, including middleware, and even combine multiple WSGI applications together into a composite application. All the options are well documented on the PasteDeploy site.
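To make this more concrete, the sketch below shows the kind of factory functions such a configuration file can point to, with an illustrative file held in a string for reference. The module name mysite, the section names and the greeting option are invented, and the file layout reflects my understanding of PasteDeploy's pipeline, filter and app sections, so check it against the PasteDeploy documentation before relying on it. The last lines assemble the same chain by hand, so the sketch runs without PasteDeploy installed.

```python
from wsgiref.util import setup_testing_defaults

# Illustrative configuration only; "mysite" is a hypothetical module that
# would contain the two factory functions defined below.
EXAMPLE_CONFIG = """
[DEFAULT]
debug = false

[pipeline:main]
pipeline = greeting hello

[filter:greeting]
paste.filter_app_factory = mysite:greeting_filter
greeting = Hi there

[app:hello]
paste.app_factory = mysite:hello_app_factory
"""


def hello_app_factory(global_conf, **local_conf):
    """App factory: receives the global and section options, returns an app."""
    def hello(environ, start_response):
        greeting = environ.get('example.greeting', 'Hello')
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [('%s, world\n' % greeting).encode('utf-8')]
    return hello


def greeting_filter(app, global_conf, greeting='Hello', **local_conf):
    """Filter-app factory: receives the wrapped app plus its options."""
    def middleware(environ, start_response):
        environ['example.greeting'] = greeting
        return app(environ, start_response)
    return middleware


# Roughly what loading the pipeline above would produce, assembled by hand.
global_conf = {'debug': 'false'}
application = greeting_filter(
    hello_app_factory(global_conf), global_conf, greeting='Hi there')

environ = {}
setup_testing_defaults(environ)
body = b''.join(application(environ, lambda s, h, exc_info=None: None))
```

The benefit of this arrangement is that a user can reorder or remove entries on the pipeline line without touching any Python code.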
The Many Frameworks Problem Revisited
At the start of Part I of this article, we looked at how the WSGI was created to help solve the fragmentation of the Python web community. The WSGI specification also had wider ambitions. PEP 333 states:
"If middleware can be both simple and robust, and WSGI is widely available in servers and frameworks, it allows for the possibility of an entirely new kind of Python web application framework: one consisting of loosely coupled WSGI middleware components. Indeed, existing framework authors may even choose to refactor their frameworks' existing services to be provided in this way, becoming more like libraries used with WSGI, and less like monolithic frameworks. This would then allow application developers to choose "best-of-breed" components for specific functionality, rather than having to commit to all the pros and cons of a single framework."
With today's powerful middleware, this vision is fast becoming a reality. Emerging projects such as Clever Harold provide just such a framework of loosely-coupled middleware components. Projects such as Pylons go further still, providing a ready-made configuration of WSGI middleware. We have also seen existing projects like Myghty refactored to work better with WSGI configurations.
The WSGI has shifted the point of reuse from the framework itself to individual middleware components. While developers can still create their own solutions to web development problems, as long as they're creating, using, and improving middleware components, the whole Python community now benefits.
I hope this article has demonstrated some of the power of WSGI middleware and will encourage you to make use of the specification and the various projects that already implement it. Here are some useful places to start if you wish to learn more about WSGI programming.