Introducing WSGI: Python's Secret Web Weapon, Part Two

October 4, 2006

James Gardner

Web Server Gateway Interface Part II: Making Use of Middleware

In Part I we discussed how the Web Server Gateway Interface (WSGI) is helping to unify the Python web-framework world by providing a standard API through which different web applications can operate with different servers. We also looked at the HTTP protocol and how to write and deploy WSGI applications. In the second part of this article, we'll look at how to make use of existing middleware components to add functionality to your WSGI applications.

What is Middleware?

WSGI middleware is software that behaves like a server to an application, passing an environ dictionary and start_response callable to the application in the same way a server would. Middleware components also expect the application to return an iterable, such as a list of strings, or to be iterable itself. Importantly, middleware also behaves like an application to the server, expecting to receive an environ dictionary and start_response callable itself, and returning an iterable back to the server.
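As a rough illustration of this dual role, here is a minimal pass-through middleware. The names PassThroughMiddleware and hello_app are made up for this sketch, and it is written for Python 3, where WSGI response bodies are byte strings rather than ordinary strings.

```python
class PassThroughMiddleware:
    """Hypothetical middleware that changes nothing (illustration only)."""

    def __init__(self, app):
        # The wrapped object can be an application or more middleware.
        self.app = app

    def __call__(self, environ, start_response):
        # Application side: the server calls us with environ and start_response.
        # Server side: we call the wrapped application in exactly the same way
        # and hand its iterable straight back.
        return self.app(environ, start_response)


def hello_app(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [b'Hello from the application\n']


wrapped = PassThroughMiddleware(hello_app)
```

A server can be given wrapped anywhere it would accept hello_app, because both are callables taking environ and start_response.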

Middleware effectively sits between a server and an application, isolating one from the other, and can therefore do any of the following or a combination thereof:

  • Provide more functionality by adding a key to the environ dictionary
  • Change the status
  • Intercept an error
  • Add, remove, or change headers
  • Change a response
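Two of the items above, adding an environ key and adding a header, can be sketched as follows. The class name, the example.greeting key, and the X-Example header are all invented for illustration; the code assumes Python 3.

```python
class AddKeyAndHeaderMiddleware:
    """Hypothetical middleware: adds an environ key and a response header."""

    def __init__(self, app, greeting='hello'):
        self.app = app
        self.greeting = greeting

    def __call__(self, environ, start_response):
        # Provide extra functionality through a new environ key.
        environ['example.greeting'] = self.greeting

        def custom_start_response(status, headers, exc_info=None):
            # Add a header before passing the call on to the server.
            headers = list(headers) + [('X-Example', 'middleware')]
            return start_response(status, headers, exc_info)

        return self.app(environ, custom_start_response)


def greeting_app(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [environ['example.greeting'].encode('utf-8')]


wrapped = AddKeyAndHeaderMiddleware(greeting_app, greeting='hi there')
```

Wrapping start_response in this way is the usual route to changing the status or headers: the application believes it is talking to the server, while the middleware sees every value first.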

Middleware is therefore extremely powerful and can be used to build a broad range of discrete components that work with different WSGI servers and applications. For example, a middleware component can:

  • Produce error documents when certain status codes are received (typically responding to 404 and 500 codes)
  • Email error reports to a developer if a problem occurs
  • Provide interactive debugging facilities
  • Forward requests to other parts of the application
  • Test that applications and servers comply with the WSGI specification
  • Authenticate a user
  • Cache pages
  • Provide a session store
  • Gzip the response
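To make the first item in this list concrete, here is a simplified sketch of an error-document component for 404 responses. It is not taken from any existing package: the class name and message are invented, it buffers the whole response for simplicity, and it ignores the legacy write() callable. It assumes Python 3.

```python
class NotFoundDocumentMiddleware:
    """Hypothetical middleware that replaces 404 responses with its own page."""

    def __init__(self, app, body=b'Sorry, that page does not exist.\n'):
        self.app = app
        self.body = body

    def __call__(self, environ, start_response):
        captured = {}

        def capture(status, headers, exc_info=None):
            # Hold the status and headers back until we have inspected them.
            captured['status'] = status
            captured['headers'] = headers

        result = self.app(environ, capture)
        try:
            chunks = list(result)  # buffers the whole body: fine for a sketch
        finally:
            if hasattr(result, 'close'):
                result.close()

        if captured['status'].startswith('404'):
            headers = [('Content-type', 'text/plain'),
                       ('Content-Length', str(len(self.body)))]
            start_response(captured['status'], headers)
            return [self.body]

        # Any other status is passed through unchanged.
        start_response(captured['status'], captured['headers'])
        return chunks


def missing_app(environ, start_response):
    start_response('404 Not Found', [('Content-type', 'text/html')])
    return [b'<h1>raw 404</h1>']


wrapped = NotFoundDocumentMiddleware(missing_app)
```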

Have a look at the middleware and utilities page on the wsgi.org site to get an idea of some of the middleware that already exists. Paste, one of the packages mentioned, contains many middleware components of its own that are not listed separately on the wsgi.org page, but are worth taking the time to investigate.

Getting Started

As an example, we'll create an application that uses session middleware to store the value of a variable between requests. Here is the application that needs session support:

    def application(environ, start_response):
        session = environ['beaker.session']
        if 'value' not in session:
            session['value'] = 0
        session['value'] += 1
        session.save()
        start_response('200 OK', [('Content-type', 'text/plain')])
        return ['The current value is: %d' % session['value']]

The application stores a variable called value in the session store. On each request, the variable is incremented and a message stating its current value is returned. For this application to work, the environ dictionary needs to contain the beaker.session key, which is provided by the session middleware from the beaker package.

Here's how you would wrap the application in beaker's session middleware:

    from beaker.session import SessionMiddleware

    application = SessionMiddleware(
        application,
        key='mysession',
        secret='randomsecret',
    )

The new application object behaves just like a normal WSGI application. When the combined application and middleware object is called, the SessionMiddleware adds the beaker.session key to the environ dictionary and calls the original application with the modified environ dictionary and start_response() callable. The application then calls start_response() as normal and returns an iterable to the middleware. The middleware returns this information to the server so that, from the server's point of view, the combined middleware and application can be treated in exactly the same way as a normal WSGI application.
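To see the flow described above in isolation, here is a toy session middleware that keeps its data in a dictionary in memory. It is only a sketch of the idea and is not how beaker is implemented: the class name, the toy.session key, and the cookie handling are all assumptions made for this example, and it is written for Python 3.

```python
import uuid
from http.cookies import SimpleCookie


class ToySessionMiddleware:
    """Hypothetical in-memory session middleware (not how beaker works)."""

    def __init__(self, app, key='toysession'):
        self.app = app
        self.key = key      # name of the cookie holding the session id
        self.sessions = {}  # session id -> dict; lost when the process exits

    def __call__(self, environ, start_response):
        cookie = SimpleCookie(environ.get('HTTP_COOKIE', ''))
        morsel = cookie.get(self.key)
        session_id = morsel.value if morsel else None
        if session_id not in self.sessions:
            session_id = uuid.uuid4().hex
            self.sessions[session_id] = {}

        # Add the session to environ before calling the application.
        environ['toy.session'] = self.sessions[session_id]

        def session_start_response(status, headers, exc_info=None):
            # Send the session id back so the next request can be matched up.
            headers = list(headers)
            headers.append(('Set-Cookie', '%s=%s; Path=/' % (self.key, session_id)))
            return start_response(status, headers, exc_info)

        return self.app(environ, session_start_response)


def counter_app(environ, start_response):
    session = environ['toy.session']
    session['value'] = session.get('value', 0) + 1
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [('The current value is: %d' % session['value']).encode('utf-8')]


wrapped = ToySessionMiddleware(counter_app)
```

Calling wrapped twice with the cookie from the first response in HTTP_COOKIE increments the same counter, which is all a server would need to do to exercise it.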

You can test the beaker example by serving the finished application with the following code and visiting http://localhost:8000 on your local machine once the server is running:

    def application(environ, start_response):
        session = environ['beaker.session']
        if 'value' not in session:
            session['value'] = 0
        session['value'] += 1
        session.save()
        start_response('200 OK', [('Content-type', 'text/plain')])
        return ['The current value is: %d' % session['value']]

    from beaker.session import SessionMiddleware

    application = SessionMiddleware(
        application,
        key='mysession',
        secret='randomsecret',
    )

    from wsgiref import simple_server

    httpd = simple_server.WSGIServer(
        ('', 8000),
        simple_server.WSGIRequestHandler,
    )
    httpd.set_app(application)
    httpd.serve_forever()

If you are running a version of Python prior to 2.5, you will need to download and install the wsgiref package as described in Part I of this article. You will also need to download and install the beaker package, which provides SessionMiddleware.

If you test the example above, you may find the count goes from 1 to 3, missing 2. This is because many web browsers try to retrieve a /favicon.ico file the first time a site is visited, and this request also results in value being incremented. If you have the LiveHTTPHeaders extension for the Firefox web browser installed, you'll be able to see the request being made when you visit a site not already in the browser's cache.
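One way to keep the favicon request away from the counter is a small middleware that answers /favicon.ico itself. This is only a sketch for Python 3; FaviconFilter and counting_app are names invented here.

```python
class FaviconFilter:
    """Hypothetical middleware that answers /favicon.ico requests itself."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if environ.get('PATH_INFO') == '/favicon.ico':
            # Respond here so the wrapped application is never called.
            start_response('404 Not Found', [('Content-type', 'text/plain')])
            return [b'No favicon here\n']
        return self.app(environ, start_response)


hits = []


def counting_app(environ, start_response):
    # Stand-in for the session example: records every request it receives.
    hits.append(environ.get('PATH_INFO'))
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [b'counted\n']


wrapped = FaviconFilter(counting_app)
```

With the filter in place, a request for /favicon.ico never reaches the application, so a counter held by the application would go from 1 to 2 as expected.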

Middleware Chains

In the previous example, we saw how adding a single middleware component to an application gave it powerful new functionality. In fact, you don't have to stop at one middleware component. Since a combined middleware and application object is also a valid WSGI application, you can also wrap the combined application and middleware object in another middleware component. This leads to the idea of middleware chains, where you have a number of pieces of middleware between the server and the application. Below is an example using some fictional middleware components:

    MyEnvironMiddleware(
        MyStatusMiddleware(
            SessionMiddleware(
                application,
                key='mysession',
                secret='randomsecret',
            )
        ),
        'Some Configuration Option',
    )

In situations such as the one above where you are using a number of middleware components, it is often more convenient to structure your code like this:

    application = SessionMiddleware(application, key='mysession', secret='randomsecret')
    application = MyStatusMiddleware(application)
    application = MyEnvironMiddleware(application, 'Some Configuration Option')

In a similar way, it's possible to create an entire web-framework stack just out of individual WSGI middleware components; indeed, the popular Pylons web framework, used to build production sites worldwide, already takes this approach.

Having a stack made entirely from WSGI middleware has a huge advantage: developers are free to pick and choose the components they need, or even to replace the parts of the stack they don't like by simply changing which middleware they use. If you've ever tried changing parts of the application stack in other framework architectures, you understand how hard it can sometimes be.
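The sketch below shows one way such a swappable stack might be assembled, assuming Python 3. The build_stack helper and TagMiddleware are hypothetical; the point is only that replacing a component means changing one entry in a list, and that the last component added is the first one called.

```python
def build_stack(app, middleware):
    """Wrap app in each (factory, kwargs) pair; the last pair ends up outermost."""
    for factory, kwargs in middleware:
        app = factory(app, **kwargs)
    return app


class TagMiddleware:
    """Hypothetical middleware that records the order in which it is called."""

    def __init__(self, app, name):
        self.app = app
        self.name = name

    def __call__(self, environ, start_response):
        environ.setdefault('example.order', []).append(self.name)
        return self.app(environ, start_response)


def order_app(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    return [', '.join(environ['example.order']).encode('utf-8')]


stack = build_stack(order_app, [
    (TagMiddleware, {'name': 'inner'}),
    (TagMiddleware, {'name': 'outer'}),
])
```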

Error Handling

During development of an application, it's really useful to be able to debug errors. The first step is to display an error report. To do this, you can use the CgitbMiddleware middleware from the Paste project:

    from paste.cgitb_catcher import CgitbMiddleware
    application = CgitbMiddleware(application, {'debug':True})

Now if an exception is raised in your application code, a full error report--similar to the one shown below--will be displayed.

Figure 1. CgitbMiddleware middleware in action

For production deployment, you'd want to disable this facility to prevent a visitor from accidentally being shown the values of important variables, such as passwords that might otherwise be displayed if an error occurred.
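For comparison, a production-style error handler might log the traceback privately and show visitors only a generic page. The following is a sketch under that assumption, not part of Paste: ProductionErrorMiddleware is an invented name, the response is buffered for simplicity, and the code targets Python 3.

```python
import sys
import traceback


class ProductionErrorMiddleware:
    """Hypothetical middleware: log the traceback, show a generic error page."""

    def __init__(self, app, log=sys.stderr):
        self.app = app
        self.log = log

    def __call__(self, environ, start_response):
        try:
            # Build the full body inside the try block so that errors raised
            # while iterating are caught too (this buffers the response).
            result = self.app(environ, start_response)
            try:
                return list(result)
            finally:
                if hasattr(result, 'close'):
                    result.close()
        except Exception:
            # The details go to the log, never to the visitor.
            traceback.print_exc(file=self.log)
            # Passing exc_info lets the server replace headers already set.
            start_response('500 Internal Server Error',
                           [('Content-type', 'text/plain')],
                           sys.exc_info())
            return [b'An error occurred. Please try again later.\n']


def broken_app(environ, start_response):
    start_response('200 OK', [('Content-type', 'text/plain')])
    raise Exception('Something went wrong!')


wrapped = ProductionErrorMiddleware(broken_app)
```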

While it is undoubtedly useful to have a traceback report, it would be even more useful to be able to interactively debug each part of the call stack up to the point at which the error occurred by using a web-based command prompt. Such a solution already exists and can be added in exactly the same way:

    from paste.evalexception import EvalException
    application = EvalException(application)

The EvalException middleware will not work in a multiprocess environment, such as a WSGI application deployed as a CGI script, because the middleware must be able to store information about the error in memory. This is not possible if the whole application is restarted on each request. If you want to test the EvalException middleware, you could use this code:

    def application(environ, start_response):
        start_response('200 OK',[('Content-type','text/plain')])
        response = []
        variable1 = 'All local variables will be displayed'
        response.append('Everything is going fine...\n')
        raise Exception('Something went wrong!')
        response.append("We won't get to here!")
        return response

    from paste.evalexception import EvalException
    application = EvalException(application)

    from wsgiref import simple_server

    httpd = simple_server.WSGIServer(
        ('', 8000),
        simple_server.WSGIRequestHandler,
    )
    httpd.set_app(application)
    httpd.serve_forever()

If you run the example above and visit http://localhost:8000, you will see the error report. Clicking on the plus icon will give you the interactive debug prompt, and clicking on >> will give you a representation of the code at the point that the error occurred. Try entering this at the prompt and pressing Enter:

    print "Hello World!"

You will see Hello World! printed exactly as if it were entered at a normal Python prompt because the middleware acts like a full Python interpreter. You can even use the up and down arrows on your keyboard to scroll through the command history, just as you can at a real Python prompt.

The EvalException middleware also displays the values of local variables and has an extra data button that displays extra information about the environment. The screenshot below shows the middleware in action with the interactive prompt and local variables, including variable1, displayed:

Figure 2. EvalException middleware in action

Having this much power makes it much easier to debug applications, but makes it even more important that you disable debugging for production use so that malicious visitors can't execute destructive commands through your debug screen if an error occurs.

Once again, this example shows just how much useful functionality can be added to an application using a single middleware component.
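Because an in-memory debugger only makes sense in a single long-running process, you might guard it with a check on the wsgi.multiprocess and wsgi.run_once keys that PEP 333 requires servers to set. The require_single_process function below is a hypothetical helper written for this article, not something Paste provides, and it assumes Python 3.

```python
def require_single_process(app):
    """Hypothetical guard for middleware that keeps state in memory."""

    def guarded(environ, start_response):
        # Both keys are set by the server, as required by PEP 333.
        if environ.get('wsgi.multiprocess') or environ.get('wsgi.run_once'):
            start_response('500 Internal Server Error',
                           [('Content-type', 'text/plain')])
            return [b'In-memory debugging needs a single long-running process.\n']
        return app(environ, start_response)

    return guarded
```

During development you could then write application = require_single_process(EvalException(application)), and leave both wrappers out of the production configuration altogether.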

Configuration

When the Web Server Gateway Interface was being drawn up, there were a number of discussions about how best to deploy a finished application with middleware. Clearly developers couldn't expect non-technical users to directly modify the middleware chains themselves. The most widely adopted solution is to use PasteDeploy. Users configure a config file in a familiar INI-style format, and the desired application--with any necessary middleware--is created by PasteDeploy from that file. PasteDeploy is used like this:

    from paste.deploy import loadapp
    application = loadapp('config:/path/to/config.ini')

The configuration file can be used to specify server settings and middleware, and even to combine multiple WSGI applications into a composite application. All the options are well documented on the PasteDeploy site.
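To give a feel for what sits behind such a file, here is a sketch of the factory functions a config file can point at, following PasteDeploy's paste.app_factory and paste.filter_factory conventions. The module name mymodule, the section names, and the greeting option are assumptions for illustration; check the PasteDeploy documentation before relying on the details. The code assumes Python 3.

```python
# Hypothetical module 'mymodule.py', referenced from a config file such as:
#
#   [pipeline:main]
#   pipeline = shout hello
#
#   [app:hello]
#   paste.app_factory = mymodule:app_factory
#   greeting = Hello, world
#
#   [filter:shout]
#   paste.filter_factory = mymodule:filter_factory


def app_factory(global_conf, **local_conf):
    """Return a WSGI application configured from the [app:hello] section."""
    greeting = local_conf.get('greeting', 'Hello')

    def application(environ, start_response):
        start_response('200 OK', [('Content-type', 'text/plain')])
        return [greeting.encode('utf-8')]

    return application


def filter_factory(global_conf, **local_conf):
    """Return a function that wraps an application in upper-casing middleware."""

    def make_filter(app):
        def middleware(environ, start_response):
            return [chunk.upper() for chunk in app(environ, start_response)]
        return middleware

    return make_filter
```

The user edits only the INI file; the pipeline line decides which middleware wraps which application, so the chain can change without touching any Python code.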

The Many Frameworks Problem Revisited

At the start of Part I of this article, we looked at how the WSGI was created to help solve the fragmentation of the Python web community. The WSGI specification also had wider ambitions. PEP 333 states:

"If middleware can be both simple and robust, and WSGI is widely available in servers and frameworks, it allows for the possibility of an entirely new kind of Python web application framework: one consisting of loosely coupled WSGI middleware components. Indeed, existing framework authors may even choose to refactor their frameworks' existing services to be provided in this way, becoming more like libraries used with WSGI, and less like monolithic frameworks. This would then allow application developers to choose "best-of-breed" components for specific functionality, rather than having to commit to all the pros and cons of a single framework."

With today's powerful middleware, this vision is fast becoming a reality. Emerging projects such as Clever Harold provide just such a framework of loosely-coupled middleware components. Projects such as Pylons go further still, providing a ready-made configuration of WSGI middleware. We have also seen existing projects like Myghty refactored to work better with WSGI configurations.

The WSGI has shifted the point of reuse from the framework itself to individual middleware components. While developers can still create their own solutions to web development problems, as long as they're creating, using, and improving middleware components, the whole Python community now benefits.

I hope this article has demonstrated some of the power of WSGI middleware and will encourage you to make use of the specification and the various projects that already implement it. Here are some useful places to start if you wish to learn more about WSGI programming.