
The Document is the Database
by Jon Udell
July 09, 2003
When you need to store and display a modest amount of structured or semistructured data, it's tempting to store it directly in an HTML file. I've used this strategy many times; undoubtedly you have too. The advantages and disadvantages of working directly with a presentation format are pretty clear. It's handy that the "database" is a self-contained package that can be updated using any text editor, emailed, read directly from a file system, or served by any web server. But it's awkward to share the work of updating with other people or to isolate and edit parts of the file as it grows. When we convert to a database-backed web application in order to solve these problems, we trade away the convenience of the file-oriented approach. Can we have our cake and eat it too? This month's column explores the idea that a complete web application can be wrapped around an XHTML document, using XSLT for search, insert, and update functions.
I've been developing the idea in the context of the Zope application server, so the first order of business was to come up with an XSLT wrapper for Zope. Since Zope is written in Python, my first inclination was to use a Python binding to an XSLT library. But which one? libxslt is a popular choice, but on one particular FreeBSD system -- where I lack the root privileges needed to install libxslt -- I use Sablotron instead. On Windows, meanwhile, MSXML is the incumbent. So I settled on the more basic strategy of wrapping a command-line XSLT processor -- such as libxslt's xsltproc, Sablotron's sabcmd, or MSXML's msxsl.exe -- in a Zope method. Because this method calls OS functions to create and remove temporary files, it has to be deployed in Zope as an External Method rather than a Python Script. To do that, I put the code in a file called xslt.py, put the file in Zope's Extensions directory, added an External Method called 'xslt' to a folder, and used 'xslt' as both its module name and function name.
xsltproc = 'xsltproc' # or 'sabcmd' or 'msxsl.exe'

# see http://www.linuxjournal.com/article.php?sid=5821
def formatExceptionInfo(maxTBlevel=5):
    import traceback, sys
    cla, exc, trbk = sys.exc_info()
    excName = cla.__name__
    try:
        excArgs = exc.__dict__["args"]
    except KeyError:
        excArgs = "<no args>"
    excTb = traceback.format_tb(trbk, maxTBlevel)
    return (excName, excArgs, excTb)

def xslt(self,xsl,idXml,update=0):
    try:
        import os, tempfile
        xmldata = self.findFileInFolder(idXml)
        xslfile = tempfile.mktemp()
        xmlfile = tempfile.mktemp()
        outfile = tempfile.mktemp()
        errfile = tempfile.mktemp()
        open(xslfile,"w").write(xsl)
        open(xmlfile,"w").write(xmldata.data)
        cmd = "( %s %s %s) > %s 2> %s" % (
            xsltproc, xslfile, xmlfile, outfile, errfile)
        errlev=os.system(cmd) >> 8
        out=open(outfile,"r").read()
        err=open(errfile,"r").read()
        os.remove(xslfile)
        os.remove(xmlfile)
        os.remove(outfile)
        os.remove(errfile)
        # overwrite the stored document only if the transform succeeded
        if ( update == 1 and errlev == 0 ):
            self.manage_delObjects(idXml)
            self.manage_addFile(idXml,out,'')
    except:
        return formatExceptionInfo()
    if ( errlev > 0 ):
        return err
    else:
        return out
The method has three required arguments and an optional fourth, but the first argument, self, is supplied automatically by Zope. It's the folder from which the method is called and, through the magic of Zope acquisition, it can be any folder below the one containing the External Method. The second argument is the XSLT data which, as we'll see, is produced by other scripts that interpolate values into templates. The third argument is the name (in Zope lingo, the id) of the Zope File object containing the XML data to be transformed. In this case, that File has an .html extension, contains XHTML, and is served with a text/html content type. The optional fourth argument, update, defaults to false, but when true causes the output of the XSLT transformation to overwrite the XML data.
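For comparison, here is a minimal sketch of the same wrap-a-command-line-processor idea in present-day Python. It is not the article's xslt.py: the name run_xslt is hypothetical, Python 3.7 or later is assumed, and the processor is passed in as a command so that anything accepting "stylesheet, then document" arguments (as xsltproc does) can be substituted.

```python
import os
import subprocess
import tempfile


def run_xslt(xsl, xml, command=("xsltproc",)):
    """Write xsl and xml to temp files, run the processor, return (code, out, err)."""
    paths = []
    try:
        for text in (xsl, xml):
            # mkstemp creates the file atomically, unlike mktemp.
            fd, path = tempfile.mkstemp(suffix=".xml")
            paths.append(path)
            with os.fdopen(fd, "w", encoding="utf-8") as handle:
                handle.write(text)
        # Stylesheet first, then document; no shell is involved.
        result = subprocess.run(list(command) + paths,
                                capture_output=True, text=True)
        return result.returncode, result.stdout, result.stderr
    finally:
        # Clean up the temp files even if the processor could not be started.
        for path in paths:
            os.remove(path)
```

The caller decides what to do with a non-zero return code, for example declining to overwrite the stored document.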
To set the stage, let's suppose we're collecting and displaying data about speakers at a conference. Here's the shell of our XHTML data:
<?xml version="1.0"?>
<body>
  <style>
    .speaker { margin-bottom: 10px }
    .speakerName { font-weight: bold }
    .speakerTitle { font-style: italic }
  </style>
  <speakers>
  </speakers>
</body>
And let's assume that we're dealing with multiple conferences, so the Zope namespace looks like this:
/Conferences/OSCON
/Conferences/ETech
Our xslt External Method, installed in the /Conferences folder, can be acquired by any subfolder, as can the other scripts we'll use to add, find, and update speaker data. If the data are stored in a file called speakers.html, there can be multiple instances of it -- for example, /Conferences/OSCON/speakers.html and /Conferences/ETech/speakers.html.
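The lookup rule described here can be illustrated with a toy model in plain Python. This is only a sketch of the idea (search the current folder, then its ancestors); the Folder class and its lookup method are hypothetical and are not Zope's actual acquisition machinery.

```python
class Folder:
    """Toy stand-in for a Zope folder, for illustration only."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.items = {}

    def path(self):
        prefix = self.parent.path() if self.parent else ""
        return prefix + "/" + self.name

    def lookup(self, item_id):
        """Return (owner, item), searching this folder and then its ancestors."""
        folder = self
        while folder is not None:
            if item_id in folder.items:
                return folder, folder.items[item_id]
            folder = folder.parent
        raise KeyError(item_id)


conferences = Folder("Conferences")
conferences.items["xslt"] = "External Method"
oscon = Folder("OSCON", conferences)
oscon.items["speakers.html"] = "OSCON data"
etech = Folder("ETech", conferences)
etech.items["speakers.html"] = "ETech data"
```

Calling oscon.lookup("xslt") finds the method in /Conferences, while oscon.lookup("speakers.html") and etech.lookup("speakers.html") each find their own folder's copy of the data.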
Now let's add a speaker to /Conferences/OSCON/speakers.html. This script, called add, kicks off the process:
form = '''
<script>
function insertSpeaker(){
  speaker = document.insertSpeaker.speaker.value;
  location = 'insert?speakerEmail=' + speaker;
}
</script>
<form name="insertSpeaker" method="post" action="javascript:insertSpeaker()">
  <div>new speaker's email address: <input name="speaker"/> </div>
  <div><input type="submit" value="insertSpeaker"/></div>
</form>
'''
return context.showMenu() + form
The add script is a Python Script, not an External Method, which means that it's subject to security restrictions but is more convenient to update. It's located in /Conferences, but when called as /Conferences/OSCON/add it sets up a context that will cause /Conferences/OSCON/speakers.html to be updated. The script simply displays a form that collects the speaker's email address -- which will serve as the key into our XHTML database -- and passes it (by way of JavaScript) to another Python Script, insert:
speakerEmail = context.REQUEST.form['speakerEmail']
xsl = '''%s
%s
<xsl:template match="//speakers">
  <speakers>
    <xsl:if test="count(//div[@email='%s'])=0" >
      <xsl:text> </xsl:text>
      <div class="speaker" email="%s"><xsl:text> </xsl:text>
        <div class="speakerTitle"/><xsl:text> </xsl:text>
        <div class="speakerName"/><xsl:text> </xsl:text>
        <div class="speakerTitle"/><xsl:text> </xsl:text>
        <div class="speakerBio"><p>bio</p></div><xsl:text> </xsl:text>
      </div>
    </xsl:if>
    <xsl:apply-templates />
  </speakers>
</xsl:template>
</xsl:stylesheet>
''' % (context.xsltPreamble(), context.xsltIdentityTransform(),
       speakerEmail, speakerEmail)
try:
    context.acquireLock()
except:
    return "insert: exception acquiring lock"
try:
    context.xslt(xsl, 'speakers.html', update=1)
except:
    return "insert: exception updating"
try:
    context.releaseLock()
except:
    return "insert: exception releasing lock"
return context.REQUEST.RESPONSE.redirect('select?key='+speakerEmail)
In a Zope Python Script, all the interesting stuff hangs off the context variable. In this case, we'll use it to get to the HTTP request with the caller's form data, to locate some convenience scripts that supply XSLT boilerplate, and to locate our xslt External Method.
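Because insert splices the form value straight into an XPath expression and an attribute value, a quote or angle bracket in the submitted address could change the meaning of the stylesheet. One way to guard against that, sketched here in plain Python, is to reject keys that don't look like simple email addresses before interpolating them. The name checked_key and the pattern are assumptions, not part of the article's scripts, and where such a check would live in Zope (a Python Script may not be allowed to import re) is left open.

```python
import re

# Deliberately conservative: quotes, angle brackets, and whitespace never match.
_KEY_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+")


def checked_key(value):
    """Return value unchanged if it is safe to interpolate, else raise ValueError."""
    if not _KEY_PATTERN.fullmatch(value):
        raise ValueError("unsafe speaker key: %r" % (value,))
    return value
```

A script would then call checked_key(speakerEmail) once, before building the stylesheet text.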
The XSLT script that's created is a filter for speakers.html. It locates the <speakers> node in that file. If no speaker div with the given email address exists, it inserts one. The XSLT identity transform, i.e.:
<xsl:template match="node() | @*">
  <xsl:copy>
    <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
</xsl:template>
passes the rest of the XML data through the filter unchanged. When the insert script makes this call:
context.xslt(xsl, 'speakers.html', update=1)
the xslt External Method receives an implicit first argument, self, which represents the context folder, in this case /Conferences/OSCON. It uses that handle three times:
| self.findFileInFolder | Convert the name (e.g. speakers.html) to a ZODB object reference. |
| self.manage_delObjects | Delete speakers.html. |
| self.manage_addFile | Recreate speakers.html. |
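The listing for findFileInFolder is not reproduced above. One plausible shape for such a helper, offered only as a guess, walks the folder's File objects and compares ids. It assumes the folder offers Zope's objectValues and that each object offers getId; the FakeFile and FakeFolder classes exist purely so the sketch can be run outside Zope.

```python
def findFileInFolder(self, idXml):
    """Return the File object in this folder whose id is idXml, or None."""
    for obj in self.objectValues("File"):
        if obj.getId() == idXml:
            return obj
    return None


class FakeFile:
    """Hypothetical stand-in for a Zope File object."""

    def __init__(self, file_id, data):
        self._id = file_id
        self.data = data

    def getId(self):
        return self._id


class FakeFolder:
    """Hypothetical stand-in for a Zope folder holding File objects."""

    def __init__(self, files):
        self._files = list(files)

    def objectValues(self, spec=None):
        return list(self._files)


folder = FakeFolder([FakeFile("index.html", "<p/>"),
                     FakeFile("speakers.html", "<body/>")])
```

With these stand-ins, findFileInFolder(folder, "speakers.html").data yields the stored text, which is how the xslt method reads xmldata.data.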
The result is a modified speakers.html with a newly-added speaker:
<?xml version="1.0"?>
<body>
  <style>
    .speaker { margin-bottom: 10px }
    .speakerName { font-weight: bold }
    .speakerTitle { font-style: italic }
  </style>
  <speakers>
    <div class="speaker" email="dj.adams@pobox.com">
      <div class="speakerTitle"/>
      <div class="speakerName"/>
      <div class="speakerTitle"/>
      <div class="speakerBio"><p>bio</p></div>
    </div>
  </speakers>
</body>
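To make the logic of the insert step concrete, here is a rough equivalent written with Python's standard xml.etree.ElementTree instead of XSLT. It is a swapped-in technique for illustration, not what the article's scripts do; insert_speaker is a hypothetical name, and the placeholder children are simplified to one div per field.

```python
import xml.etree.ElementTree as ET


def insert_speaker(xhtml, email):
    """Add an empty speaker div keyed by email unless one already exists."""
    root = ET.fromstring(xhtml)
    speakers = root.find(".//speakers")
    exists = any(div.get("email") == email for div in speakers.iter("div"))
    if not exists:
        speaker = ET.Element("div", {"class": "speaker", "email": email})
        for css_class in ("speakerName", "speakerTitle"):
            ET.SubElement(speaker, "div", {"class": css_class})
        bio = ET.SubElement(speaker, "div", {"class": "speakerBio"})
        ET.SubElement(bio, "p").text = "bio"
        # New entries go first, pushing older speakers down the page.
        speakers.insert(0, speaker)
    return ET.tostring(root, encoding="unicode")
```

Everything outside the speakers element passes through untouched, which is the job the identity transform does in the XSLT version.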
Finally, the insert script redirects to another script, select, which locates the newly-added node and presents it for editing. Here's that script:
xsl = '''%s
<xsl:template match="//div[@email='%s']">
  <xsl:for-each select=".">
    <xsl:sort/>
    <form method="post" action="update">
      <div><input type="hidden" name="speakerEmail" value="{@email}"/>
        <xsl:value-of select="@email"/></div>
      <div>speakerName:
        <input name="speakerName" value="{normalize-space(*[@class='speakerName'])}"/>
      </div>
      <div>speakerTitle:
        <input name="speakerTitle" value="{normalize-space(*[@class='speakerTitle'])}"/>
      </div>
      <div>speakerBio: <textarea name="speakerBio">
        <xsl:copy-of select="*[@class='speakerBio']/*"/>
      </textarea></div>
      <div><input value="update" type="submit"/></div>
    </form>
  </xsl:for-each>
</xsl:template>
<xsl:template match="text()">
</xsl:template>
</xsl:stylesheet>'''
xsl = xsl % (context.xsltPreamble(),context.REQUEST.form['key'])
return context.showMenu() + context.xslt(xsl,'speakers.html')
The select script uses XSLT to find a speaker node and generate an update form. The work of updating is handled by another script, update:
speakerEmail = context.REQUEST.form['speakerEmail']
speakerName = context.REQUEST.form['speakerName']
speakerTitle = context.REQUEST.form['speakerTitle']
speakerBio = context.REQUEST.form['speakerBio']
xsl = '''%s
%s
<xsl:template match="//div[@email='%s']">
  <div class="speaker" email="%s">
    <xsl:text> </xsl:text>
    <div class="speakerName">
      %s
    </div>
    <xsl:text> </xsl:text>
    <div class="speakerTitle">
      %s
    </div>
    <xsl:text> </xsl:text>
    <div class="speakerBio">
      %s
    </div>
    <xsl:text> </xsl:text>
  </div>
</xsl:template>
</xsl:stylesheet>
''' % (context.xsltPreamble(), context.xsltIdentityTransform(),
       speakerEmail, speakerEmail, speakerName, speakerTitle, speakerBio)
try:
    context.acquireLock()
except:
    return "update: exception acquiring lock"
try:
    context.xslt(xsl, 'speakers.html', update=1)
except:
    return "update: exception updating"
try:
    context.releaseLock()
except:
    return "update: exception releasing lock"
return context.REQUEST.RESPONSE.redirect('select?key='+speakerEmail)
After the update, speakers.html might look like this:
<?xml version="1.0"?>
<body>
  <style>
    .speaker { margin-bottom: 10px }
    .speakerName { font-weight: bold }
    .speakerTitle { font-style: italic }
  </style>
  <speakers>
    <div class="speaker" email="dj.adams@pobox.com">
      <div class="speakerName">
        DJ Adams
      </div>
      <div class="speakerTitle">
        SAP hacker
      </div>
      <div class="speakerBio">
        <p>
        DJ Adams is an old SAP hacker who still thinks JCL and S/370 assembler
        is pretty cool. In recent years he's been successfully combining Open
        Source software with R/3 to produce hybrid systems that show off the
        power of free software.
        </p>
        <p>
        He is the author of O'Reilly's <a
        href="http://www.oreilly.com/catalog/jabber/"><i>Programming
        Jabber</i></a>, contributes <a
        href="http://www.oreillynet.com/pub/au/139">articles</a> to
        O'ReillyNet's P2P site, and has to own up to being responsible for the
        Jabber::Connection, Jabber::RPC and Jabber::Component::Proxy modules
        on CPAN.
        </p>
      </div>
    </div>
  </speakers>
</body>
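The update script places the submitted name, title, and bio into the stylesheet text as-is, so a stray ampersand in a name, or an unclosed tag in a bio, would make the stylesheet itself ill-formed. A minimal sketch of a pre-check, using only the standard library, might escape the plain-text fields and confirm the bio parses as a fragment. The function name prepare_fields is an assumption, not part of the article's code.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def prepare_fields(name, title, bio):
    """Escape the plain-text fields and verify that bio is well-formed markup."""
    # Wrap the bio in a div so a run of several <p> elements parses as one tree;
    # ET.ParseError is raised if the markup is not well-formed.
    ET.fromstring("<div>%s</div>" % bio)
    return escape(name), escape(title), bio
```

The bio is returned unchanged because it is meant to carry markup such as paragraphs and links, as in the example document.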
As new speaker nodes are added to the file, they push down the older ones. In this naive implementation, there's no effort to sort the nodes stored in the XHTML file. But here's another script, find, that uses XSLT to produce an HTML select element whose options are sorted by speakers' email addresses. The selected item is fed to the select script for updating.
xsl = '''%s
<xsl:template match="//speakers">
  <script>
  function chooseSpeaker(){
    var list = document.chooseSpeaker.speakers;
    speaker = list[list.selectedIndex].value;
    location = 'select?key=' + speaker;
  }
  </script>
  <form name="chooseSpeaker" method="post" action="javascript:chooseSpeaker()">
    <select name="speakers">
      <xsl:apply-templates select="./div[@class='speaker']">
        <xsl:sort select="@email"/>
      </xsl:apply-templates>
    </select>
    <div><input value="chooseSpeaker" type="submit" /></div>
  </form>
</xsl:template>
<xsl:template match="//div[@class='speaker']">
  <option value="{@email}"><xsl:value-of select="@email"/></option>
</xsl:template>
<xsl:template match="text()">
</xsl:template>
</xsl:stylesheet>''' % (context.xsltPreamble())
return context.showMenu() + context.xslt(xsl,'speakers.html')
As speakers are added and updated, the speakers.html file remains immediately viewable in the browser. The file can also be searched in a structured way, using the technique I explored last month. Here, for example, is a query that finds speakers whose biographies contain 'JCL':
//div[@class='speaker'][contains(./div[@class='speakerBio'], 'JCL')]
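For readers without an XPath engine to hand, roughly the same search can be sketched with Python's standard ElementTree. Its XPath subset has no contains() function, so the text test is done in Python instead; speakers_with_bio_term is a hypothetical name and the sample document in the usage note is made up.

```python
import xml.etree.ElementTree as ET


def speakers_with_bio_term(xhtml, term):
    """Return sorted emails of speakers whose bio text contains term."""
    root = ET.fromstring(xhtml)
    matches = []
    for speaker in root.iterfind(".//div[@class='speaker']"):
        bio = speaker.find("div[@class='speakerBio']")
        # itertext() gathers text from nested elements such as <p> and <a>,
        # which mirrors the string value that contains() would test.
        if bio is not None and term in "".join(bio.itertext()):
            matches.append(speaker.get("email"))
    return sorted(matches)
```

Given a speakers.html in which only dj.adams@pobox.com mentions JCL, the call speakers_with_bio_term(text, "JCL") would return a one-item list holding that address.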
Is this really a practical way to manage a collection of semistructured data? Frankly, I'm undecided. But it's an interesting preview of how things will be when native XML storage, and node-level update capability, are standard features of all databases. Meanwhile, the ability to use Python to generate and run XSLT transformations, in a Zope context, seems like a useful pattern.
Comment on this Article
- I have an idea...is this right?
2004-03-01 15:08:03 Michael Kelley [Reply]
Hello I am trying to figure all this out. I do not have much experience with building programs but I found a place to work out to one. I have several questions and maybe someone can point me in the right direction.
My idea is to come up with a database for managing people and associated files using the same ideas as iTunes. Think of it as, instead of music files, having personnel files. Maybe something like this is already done. I would think XML is what is used for this to work.
I would like multiple people to edit information and have it reflected in a semi real time form. All using the same application...change info about the person...it changes the associated data for you and manages all the back directory structure for you.
So here are my questions...
1. What is iTunes? Is it a glorified database that can write to files for cataloging?
2. What language is it written in?
3. Would it be somewhat easy to write a program similar to it except with people's names and personnel info?
Thank you for everybody's help.
Mike
- I have an idea...is this right?
2005-12-31 16:22:16 www.tuinadvies.be [Reply]
This is a test from Belgium
Regards,
http://www.tuinadvies.be
- remember transquery
2003-07-20 10:59:41 James Fuller [Reply]
The idea that XML files can be 'the database' is not new; check out (from the querying aspect) Transquery http://www.xmlportfolio.com/transquery/.
I think what this article highlights is the need for an update language, not unlike XML:DB XUpdate...but something simpler.
jim fuller
- Interesting approach
2003-07-15 10:11:05 Douwe Osinga [Reply]
This is certainly an interesting approach and a good example of using the tools available to get something done.
However, storing XML in DTML documents would only be my second choice; if you're using Zope, using a product like ParsedXML (http://www.zope.org/Members/faassen/ParsedXML) seems more logical (but you would need some privileges to install the product, of course). This product will allow you to have XML documents right in your Zope tree. These XML documents can then be easily updated from Zope, by DTML method, Python Script or External Method.
By the way, using self[idXml] would be more efficient than findFileInFolder. It raises an exception when the object is not found, but an exception would be raised anyway.
Douwe Osinga
http://douweosinga.com
