Advanced XML Applications in Zope

February 23, 2000

Amos Latteier



Zope is an open source application server that allows you to develop web applications quickly. With it you can develop network services that interoperate via XML.

In this article we'll look at how to build a web application that reads and writes XML. This article further develops themes begun in "Creating XML Applications with Zope" and "Internet Scripting: Zope and XML-RPC."

Design Patterns for XML Applications

Zope publishes objects on the Web. We've seen how you can import XML into Zope, give it behavior, and publish it on the Web.

This model of using XML as Zope objects is very appealing because of its simplicity. You can explore the elements of your XML and call methods on them directly. However, in more complex XML applications, this sort of scheme may not work well. In real-world XML applications you may find the following:

  • You need to adapt heterogeneous data into a coherent object model. For example, you may need to work with XML from different DTDs, or you may need to work with both XML and non-XML data.

  • Your object model may not map well onto the DTD. For example, your DTD may describe documents as containing authors while your object model may see an author as the container of a document.

  • You may wish to use different DTDs to represent the same type of objects. If different DTDs reveal different information about your objects, how can you decide which is the authoritative description of your object?

These problems and others lead us away from using XML directly as application objects.

XML as a View

XML is not necessarily valuable in and of itself; it is useful for providing interoperability between applications. Just because XML is on the wire doesn't mean it is an appropriate internal construct for the objects that provide the network services.

XML provides a view of an object, rather than defining an object directly. An XML application doesn't exist to process XML. Instead it provides value by offering network services that are exposed via XML.

In this article we'll work from a general design pattern for Zope/XML applications that presumes that XML is consumed and produced by application objects, but not stored internally. Of course, this pattern is not appropriate for all XML applications. For example, applications that edit XML documents or manage XML archives would have good reason to store XML internally. (For further information on XML and design patterns see XML Design Patterns.)

Example Application: RSS Channel Manager

Let's build a simple XML application in Zope to demonstrate how it's done.

For our sample application, we'll construct a web content aggregator. It should be able to gather web content from diverse sources, manipulate and search the content, and serve it back up in XML.

For simplicity's sake we'll restrict ourselves to RSS, which is an XML DTD for the exchange of simple descriptions of web content, such as stories and news items.

The RSS DTD

RSS stands for Rich Site Summary. It is an XML DTD that describes pieces of content, commonly stories or news items, on a web portal such as Slashdot. Netscape uses RSS to present users with a personalized home page on my.netscape.com.

The RSS DTD is defined in terms of channels and items. A channel is a container for items and items are short summaries of web content. Here's a short RSS file.

<?xml version="1.0"?>
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
              "http://my.netscape.com/publish/formats/rss-0.91.dtd">
<rss version="0.91">
  <channel>
    <title>Python Dot Org</title>
    <link>http://www.python.org</link>
    <description>The Python language web site. Your source for all
    things Python!</description>
    <item>
      <title>PySol 3.20</title>
      <link>http://wildsau.idv.uni-linz.ac.at/mfx/pysol.html</link>
      <description>new version of Python Solitaire Games (using
      Tkinter); now supports 151 (!) distinct solitaire card game
      variants.</description>
    </item>
    <item>
      <title>Cryptography modules now available worldwide</title>
      <link>ftp://starship.python.net/pub/crew/amk/crypto/</link>
      <description>re-release of (historical) crypto modules; may now
      be downloaded world-wide due to relaxed US export control
      policies; however, please use mxCrypto or M2Crypto for new
      projects instead.</description>
    </item>
    <item>
      <title>Stackless Python 1.0 + Continuations 0.6</title>
      <link>http://www.tismer.com/research/stackless/</link>
      <description>a version of Python 1.5.2 that does not need space
      on the C stack, and first-class callable continuation objects for
      Python.</description>
    </item>
  </channel>
</rss>
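
To make the channel/item structure concrete, here is a minimal sketch that reads a trimmed copy of the file above using `xml.dom.minidom` from the Python standard library. It is not the code used by the RSS Channel Product; the helper name `child_text` and the shortened sample are our own assumptions for illustration.

```python
from xml.dom import minidom

# Trimmed copy of the sample above (DOCTYPE left out so nothing is fetched).
RSS_SAMPLE = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Python Dot Org</title>
    <link>http://www.python.org</link>
    <description>The Python language web site.</description>
    <item>
      <title>PySol 3.20</title>
      <link>http://wildsau.idv.uni-linz.ac.at/mfx/pysol.html</link>
      <description>new version of Python Solitaire Games</description>
    </item>
    <item>
      <title>Stackless Python 1.0 + Continuations 0.6</title>
      <link>http://www.tismer.com/research/stackless/</link>
      <description>a version of Python 1.5.2</description>
    </item>
  </channel>
</rss>"""


def child_text(element, tag):
    """Hypothetical helper: text of the first direct child named `tag`, or ''."""
    for node in element.childNodes:
        if node.nodeType == node.ELEMENT_NODE and node.tagName == tag:
            return "".join(
                n.data for n in node.childNodes if n.nodeType == n.TEXT_NODE
            ).strip()
    return ""


doc = minidom.parseString(RSS_SAMPLE)

# A channel is the container; items are its <item> children.
channel = doc.getElementsByTagName("channel")[0]
items = channel.getElementsByTagName("item")

channel_title = child_text(channel, "title")
item_titles = [child_text(item, "title") for item in items]
```

Looking only at direct children matters here: both the channel and each item have a `title` element, so a document-wide search for `title` would mix the two.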

Architecture

Our basic strategy will be to build Zope objects from remote RSS files. We can then manipulate and query the Zope objects directly, without having to reparse any XML. The Zope objects will leverage the Zope framework to provide facilities for persistence, "through-the-web" management, security, and searching. Finally, the Zope objects will have templates to allow them to represent themselves as HTML and RSS.

Implementation

We will implement these Zope objects in Python. Building Zope objects in Python is unfortunately still not well documented. For background on this subject see the following:

In general it will not be necessary to understand all the details of extending Zope in Python to understand the basic working of the application.

Channel and Item Classes

Our channel class will mimic, to a large degree, the RSS description of a channel. Channels will contain items. Channels and items will both have attributes that are determined by the RSS data.

Figure 1: Channel and Item Class Diagram. (For help in interpreting this diagram, see our simple UML class diagram guide.)

In our implementation, we'll skip many of the optional attributes of channels and items for the sake of simplicity.
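
As a rough picture of this object model, the following sketch uses plain Python classes with only the three attributes we keep (title, link, description). The real product classes are Zope objects with persistence and management support; the dataclass layout here is an assumption made for illustration, and only the `addItem` name is taken from the article's own code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A short summary of one piece of web content."""
    title: str
    link: str
    description: str = ""


@dataclass
class Channel:
    """A container for items, carrying the same three attributes."""
    title: str
    link: str
    description: str = ""
    items: List[Item] = field(default_factory=list)

    def addItem(self, title, link, description=""):
        # Create an item inside this channel and return it.
        item = Item(title, link, description)
        self.items.append(item)
        return item


python_org = Channel("Python Dot Org", "http://www.python.org")
python_org.addItem(
    "PySol 3.20",
    "http://wildsau.idv.uni-linz.ac.at/mfx/pysol.html",
    "new version of Python Solitaire Games",
)
```

Nothing in these classes knows about XML. That is the point of the pattern: RSS is read to fill them and written to display them, but it is not what they are made of.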

Instead of simply destroying and rebuilding our channel and item objects each time we fetch an RSS description of the channel, we may want the channel to update itself intelligently. This approach lets the channel archive old items, so our channels can hold vast numbers of items rather than the standard 15 items per channel. This extension provides a good example of why simply storing the RSS directly in Zope would be limiting: we would not be able to easily archive channel items.
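
One way such an update could work is sketched below. This is an assumption about the approach, not the product's actual logic: items are treated as `(title, link, description)` tuples, the link is used as the identity of an item, and `merge_items` is a hypothetical name.

```python
def merge_items(existing, fetched):
    """Return the freshly fetched items followed by archived ones.

    Items are (title, link, description) tuples. An item already in the
    channel is kept unless the new feed carries an item with the same link,
    in which case the fetched version wins.
    """
    seen = set()
    merged = []
    for item in list(fetched) + list(existing):
        link = item[1]
        if link not in seen:
            seen.add(link)
            merged.append(item)
    return merged


old = [("A", "http://example.org/a", ""), ("B", "http://example.org/b", "")]
new = [("B (updated)", "http://example.org/b", ""), ("C", "http://example.org/c", "")]
archive = merge_items(old, new)
```

Because item A is no longer in the feed but stays in the merged list, the channel grows past whatever the remote site publishes at any one time.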

Working with Channels

Installing the RSS Channel Product

If you want to follow along as we exercise and explore RSS channels in Zope, you'll need to download and install the RSS Channel Product and XML Document version 1.0a5 or later.

First download the products. Place the "tarballs" (.tgz files) in your Zope directory. Then ungzip and untar the files. On Windows, WinZip should be able to handle this for you.

Now restart Zope, and you should be able to use RSS Channels.

Creating a Channel

To create a channel, choose RSS Channel from the product add list. Then specify SlashDot as the Id and http://www.slashdot.org/slashdot.rdf as the URL. Then click Add.

There should be a short pause during which Zope fetches the RSS and constructs the channel and its items.

If all went well, you should be returned to the Zope management screen. To examine your channel, click on the new channel object named SlashDot. Then click the View tab to see an HTML representation of the channel.

Figure: SlashDot Channel View.

Parsing RSS

So how did the channel get built? The fundamental step in constructing a channel is parsing the RSS data.

Zope ships with Expat, a well-respected XML parser. We could use Expat directly to build the channel, but in our implementation we have chosen to use a higher-level interface: XML Document. XML Document parses XML and builds a DOM tree of Zope objects. Here's a method of the channel class that uses XML Document to parse RSS:

# module-level imports used by this method
import urllib
import Products.XMLDocument.XMLDocument
from DateTime import DateTime

def update(self, REQUEST=None):
    """
    Fetch the RSS content from a remote URL and update
    the channel.
    """
    # retrieve RSS file
    f = urllib.urlopen(self.url)

    # parse RSS file to DOM using XML Document
    d = Products.XMLDocument.XMLDocument.Document()
    d.parse(f)

    # get channel attributes
    c = d.getElementsByTagName("channel")[0]
    self.title = c.text_content("title")
    self.link = c.text_content("link")
    self.description = c.text_content("description")

    # get channel items
    items = d.getElementsByTagName("item")
    for item in items:
        # get information about the item
        title = item.text_content("title")
        link = item.text_content("link")
        description = item.text_content("description")

        # add the item
        self.addItem(title, link, description)

    # remember when the channel was last fetched
    self.update_last = DateTime()

    # return a management screen if called from the web
    if REQUEST is not None:
        return self.statusForm(self, REQUEST)

Let's look at how this method works. First, it retrieves the remote RSS file using Python's standard urllib module. With the XML in hand, it next creates a DOM tree using XML Document. Using the DOM getElementsByTagName() method, it locates the channel element in the DOM and queries its subelements to set the channel object's attributes. Next, the method examines all the item elements in turn. It collects some information about each item and then calls another method to create an item instance inside our channel object.

We then do some housekeeping to keep track of when we last fetched the RSS file. This allows us to fetch the remote RSS file only when we need to. Finally, the method returns a Zope management screen if necessary.
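
The article's code records the fetch time but does not show how that time is used. The sketch below is one plausible check, under the assumption that a channel is considered stale after a fixed interval; the function name `needs_update` and the one-hour default are hypothetical, and standard `datetime` objects stand in for Zope's DateTime.

```python
from datetime import datetime, timedelta


def needs_update(update_last, now, max_age=timedelta(hours=1)):
    """Decide whether the remote RSS file should be fetched again.

    update_last is the time of the previous fetch, or None if the
    channel has never been fetched.
    """
    if update_last is None:
        return True
    return now - update_last >= max_age


last_fetch = datetime(2000, 2, 23, 12, 0)
stale = needs_update(last_fetch, datetime(2000, 2, 23, 14, 0))
fresh = not needs_update(last_fetch, datetime(2000, 2, 23, 12, 30))
```

Passing the current time in as an argument, rather than reading the clock inside the function, keeps the check easy to test.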

Displaying HTML and XML

To a channel object, the tasks of displaying itself in HTML and XML are very similar. Both views of the channel are created by methods on the channel. These display methods use Zope's template reporting language, DTML.

Part of the Zope philosophy is that objects should be able to represent themselves in multiple ways. By creating new templates, you can extend the formats in which your channels can be displayed.

You can examine the template files channelHTML.dtml and channelRSS.dtml in the lib/python/Products/RSSChannel directory to see how they work.
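
To illustrate the idea of two views over one object without reproducing the DTML files, here is a sketch in plain Python. The channel is a dictionary rather than a Zope object, the function names are hypothetical, the example.org URLs are placeholders, and the output is much simpler than what the product's templates produce.

```python
from xml.sax.saxutils import escape


def channel_as_rss(channel):
    """Render a channel dict as an RSS 0.91 document."""
    out = ['<?xml version="1.0"?>', '<rss version="0.91">', "  <channel>"]
    for key in ("title", "link", "description"):
        out.append("    <%s>%s</%s>" % (key, escape(channel[key]), key))
    for item in channel["items"]:
        out.append("    <item>")
        for key in ("title", "link", "description"):
            out.append("      <%s>%s</%s>" % (key, escape(item[key]), key))
        out.append("    </item>")
    out.extend(["  </channel>", "</rss>"])
    return "\n".join(out)


def channel_as_html(channel):
    """Render the same channel dict as an HTML fragment."""
    out = ["<h1>%s</h1>" % escape(channel["title"]), "<ul>"]
    for item in channel["items"]:
        # Also escape double quotes because the link goes into an attribute.
        href = escape(item["link"], {'"': "&quot;"})
        out.append('  <li><a href="%s">%s</a></li>' % (href, escape(item["title"])))
    out.append("</ul>")
    return "\n".join(out)


sample = {
    "title": "Q & A",
    "link": "http://example.org/",
    "description": "Demo channel",
    "items": [
        {"title": "First <item>", "link": "http://example.org/1?a=1&b=2",
         "description": "d"},
    ],
}
rss_text = channel_as_rss(sample)
html_text = channel_as_html(sample)
```

Adding a third format means adding a third function (or, in Zope, a third template); the channel data itself does not change.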

Using Channels in Zope

Now that we have Zope channel objects, what can we do with them besides simply look at them in HTML and RSS format? You can index and search them, manage them over the Web, protect them with a comprehensive security system, integrate them with relational databases, and integrate them with network services such as FTP, WebDAV, XML-RPC, and more.

To see some examples, import the sample file that comes with the RSSChannel.tgz distribution. Click Import/Export in the Zope management screen, type RSSSample.zexp in the Import file name field, and click Import.

Now you should have an RSSSample folder. Take a look inside it. You'll find a couple of Channel objects, a ZCatalog that indexes the channels, and some DTML Methods that exercise the channels. To try things out, click on the View tab. Experiment with some of the options and look at the DTML methods to see how they perform their actions.

The RSSSample folder includes examples that demonstrate how to use Zope's ZCatalog to search channels. The indexChannels DTML method takes care of registering the channels and their items with the ZCatalog. The searchResults DTML method performs the ZCatalog search. The searchAsRSS DTML method shows how to represent a ZCatalog search as an RSS channel.
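
The idea behind presenting a search as a channel can be sketched without ZCatalog. In the example below, a plain substring match stands in for the catalog query, channels are dictionaries, and `search_channels` is a hypothetical name; the real searchAsRSS method is a DTML method and works differently.

```python
def search_channels(channels, query):
    """Return a new channel dict whose items match `query`.

    A case-insensitive substring test on title and description stands in
    for a real ZCatalog search.
    """
    needle = query.lower()
    hits = []
    for channel in channels:
        for item in channel["items"]:
            text = (item["title"] + " " + item["description"]).lower()
            if needle in text:
                hits.append(item)
    return {
        "title": "Search results for %s" % query,
        "link": "",
        "description": "Items matching %s" % query,
        "items": hits,
    }


channels = [
    {"title": "Python Dot Org", "link": "http://www.python.org",
     "description": "", "items": [
         {"title": "PySol 3.20", "link": "http://example.org/pysol",
          "description": "Python Solitaire Games"},
         {"title": "Cryptography modules", "link": "http://example.org/crypto",
          "description": "historical crypto modules"},
     ]},
    {"title": "Other", "link": "http://example.org/", "description": "", "items": [
        {"title": "Stackless Python 1.0", "link": "http://example.org/stackless",
         "description": "continuations"},
    ]},
]
results = search_channels(channels, "Python")
```

Since the result has the same shape as any other channel, whatever renders a channel as RSS can render a search as RSS too.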

Where To Go From Here

This article only begins to demonstrate the potential of Zope and XML.

In the case of our RSS channel example, you might want to build an application in which different users are given access to different channels depending on their credentials. You could create new channels that are composed of searches of many other channels. You could convert RSS channels to another format such as email, and mail items to users at regular intervals according to their preferences. You could use Cybercash to charge users or other sites for retrieving information via RSS. You could enable remote channel management over XML-RPC.

Conclusion

Real-world XML application development requires more than just the ability to retrieve and store XML. Zope provides a host of resources that can be useful for turning XML data into a web application. Zope gives you searching, security, persistence, over-the-web management, support for many network protocols, rapid application development, and more. Add to this the ability to read and write XML over the network, and you have a good environment for XML application development.