Generating XML and HTML using XQuery
by Per Bothner
Things start to get interesting when we get to the call to
make-rows. This is a recursive helper used to divide a
sequence of <picture> elements into rows of at most
three. It's quite straightforward if you're comfortable with recursive
functions. (If you're not comfortable using recursion, it may seem
mysterious, but it does work.) The $pictures parameter
is a sequence of the <picture> elements we haven't
dealt with yet. If the list is empty, we return nothing (the empty
sequence ()). If there are 3 or fewer pictures, we pass them
to the format-row function, which makes a single row of
thumbnail pictures. If there are exactly 4 pictures total, we take the
first two pictures and pass them to the format-row function,
which puts 2 thumbnail pictures in one row, and then we do the same with
the remaining 2 pictures. Otherwise, there are more than 3 pictures, so we
take the first 3 pictures, put them in a row using
format-row, then we take the rest of the pictures (starting
at picture 4), and recursively call make-rows. This processes
the rest of the pictures in the same way, putting 3 pictures in a row each
time, until we come to the end.
This is the make-rows function:
{-- Process a sequence of <picture> elements, grouping them into
 -- rows of at most 3, for the thumbnail page.
 -- $prev: An integer giving the number of <picture> elements in
 --   the current sequence that precede those in $pictures.
 -- $pictures: Remaining <picture> elements to process.
 -- Returns formatted HTML for the input $pictures.
 --}
define function make-rows($prev, $pictures) {
  let $count := count($pictures) return
  if ($count = 0) then ()
  else if ($count <= 3) then
    format-row($pictures)
  {-- A special case: If there are exactly 4 pictures in total, group
   -- them as 2 rows of 2 rather than 3 + 1. --}
  else if ($count = 4 and $prev = 0) then
    (format-row(sublist($pictures, 1, 2)),
     format-row(sublist($pictures, 3, 2)))
  else
    (format-row(sublist($pictures, 1, 3)),
     make-rows($prev + 3, sublist($pictures, 4)))
}
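To see which rows this recursion produces, here is a small Python model of the logic as the text describes it. It is only a sketch: Python stands in for XQuery, lists stand in for sequences, the function returns lists of pictures rather than formatted HTML, and row_sizes is a helper invented for this example.

```python
def make_rows(prev, pictures):
    """Split pictures into rows, mirroring the make-rows logic.

    prev: number of pictures already placed in earlier rows.
    Returns a list of rows, each row being a list of pictures.
    """
    count = len(pictures)
    if count == 0:
        return []
    if count <= 3:
        return [pictures]
    if count == 4 and prev == 0:
        # Exactly four pictures in total: 2 + 2 instead of 3 + 1.
        return [pictures[:2], pictures[2:]]
    # Take the first three, then recurse on the rest.
    return [pictures[:3]] + make_rows(prev + 3, pictures[3:])


def row_sizes(n):
    """Row lengths produced for n pictures (helper for this example only)."""
    return [len(row) for row in make_rows(0, list(range(1, n + 1)))]
```

In this model, row_sizes(4) gives [2, 2], while row_sizes(7) gives [3, 3, 1]: the 2 + 2 split applies only when no pictures precede the four, which is what the $prev parameter is for.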
The format-row function loops over a sequence of
<picture> elements and calls
make-thumbnail on each one. If the
<picture> has a <caption> child, it
is placed underneath the thumbnail. We wrap each thumbnail plus its
caption inside an <a> HTML link wrapped inside an HTML
<table>, finally wrapping the entire row in another
HTML <table>.
define function format-row($row) {
  {-- emit a newline for readability --} "
",<table width="90%" class="row"><tr>{
    for $pic in $row return
    <td>
      <table bgcolor="black" cellpadding="0" frame="border"
          border="0" rules="none" vspace="10">
        <tr>
          <td align="center"><a href="{$pic/@id}.html">{
            make-thumbnail($pic)}</a></td>
        </tr>
        { if ($pic/caption) then
          <tr>
            <td bgcolor="#FFFF99" align="center"><a class="textual"
                href="{$pic/@id}.html">{$pic/caption/node()}</a></td>
          </tr>
          else ()}
      </table>
    </td>
  }</tr></table>
}
Finally, make-thumbnail looks for a
<small-image> child that contains the actual name of
the JPEG file and calls make-img to emit the
<img> image link. If there is no
<small-image>, we look for an <image>
or a <full-image> element and have the browser scale
the image to a smaller size. (In that case the browser has to download the
bigger image, only to throw away most of the detail, so this is not
something you want to do normally.)
{-- Emit the thumbnail <img>, falling back to a scaled-down larger image. --}
define function make-thumbnail($pic) {
  if ($pic/small-image) then
    make-img($pic/small-image, 1.0)
  else if ($pic/image) then
    make-img($pic/image, 0.5)
  else if ($pic/full-image) then
    make-img($pic/full-image, 0.2)
  else
    ( "(missing small-image: ", string($pic), ")" )
}
{-- Emit an <img> for $picture, with width and height scaled by $scale. --}
define function make-img($picture, $scale) {
  <img border="1" src="{$picture}" width="{number($picture/@width) * $scale}"
    height="{number($picture/@height) * $scale}" />
}
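The fallback order and the scaling arithmetic can be traced with a short Python model. This is a sketch under assumptions, not the article's code: pick_thumbnail and the dictionary layout (a picture as a dict whose values hold 'src', 'width' and 'height') are invented here to stand in for the <picture> element and its image children.

```python
def pick_thumbnail(pic):
    """Return (src, width, height) for the thumbnail, or None if no image.

    pic is a hypothetical dict that may map 'small-image', 'image' or
    'full-image' to a dict with 'src', 'width' and 'height' entries.
    """
    # Preference order and scale factors follow make-thumbnail.
    choices = (("small-image", 1.0), ("image", 0.5), ("full-image", 0.2))
    for kind, scale in choices:
        img = pic.get(kind)
        if img is not None:
            return (img["src"], img["width"] * scale, img["height"] * scale)
    return None
```

For a picture that only has a 640 by 480 <image>, the model yields a 320 by 240 thumbnail that still points at the medium-sized file, which is the extra download cost mentioned above.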
A Slight Refinement
The code so far ignores any children of the top-level
<group> except for
<picture>. However, if you look at the original
index.xml example, you'll see that the
<group> has a <text> child. Let's
place any such <text> contents in the appropriate place
on the overview page. Another sometimes useful feature is to be able to
explicitly specify how the pictures are to be organized into rows, rather
than depend on the default maximum of three per row. You can do that by
editing the index.xml file to place a
<row> element around one or more
<picture> elements to specify that these should go in a
row by themselves:
define function make-group-page($group) {
<html>
... rest as before ..
<h2>{$group/title/node()}</h2>
{ find-rows((), $group/*)}
</body>
</html>
}
{-- Process the children of a <group>, grouping thumbnails into rows.
-- $pictures: Sequence of <picture>s that need to be split into rows.
-- $unseen: sequence of remaining children we have not processed yet.
--}
define function find-rows($pictures, $unseen) {
  if (empty($unseen)) then make-rows(0, $pictures)
  else
    let $next := item-at($unseen, 1),
        $rest := sublist($unseen, 2)
    return
      typeswitch ($next)
      case element row return
        (make-rows(0, $pictures), format-row($next/*), find-rows((), $rest))
      case element date return {-- ignore <date> children here. --}
        (make-rows(0, $pictures), find-rows((), $rest))
      case element title return {-- ignore <title> children here. --}
        (make-rows(0, $pictures), find-rows((), $rest))
      case element text return {-- format <text> as a paragraph. --}
        (make-rows(0, $pictures), <p>{$next/node()}</p>, find-rows((), $rest))
      default return
        find-rows(($pictures, $next), $rest)
}
The initial call to find-rows sets $pictures
to the empty sequence and $unseen to the sequence of all the
child elements of the top-level <group>. If
$unseen is the empty sequence, we're done: we just call
make-rows to wrap up the last row. Otherwise, we look at the
first element of $unseen.
We use a typeswitch expression to do the appropriate thing
depending on the type of that first element. A typeswitch
evaluates an expression (in this case $next, the first value
of the $unseen sequence). Then it searches through the
case clauses, each of which specifies a type. It selects the
first case clause such that the $next value is
an instance of the type and evaluates the corresponding
return expression. If there is no matching case,
it evaluates the default return instead.
If the $next value is a <row> element,
we first pass any previously seen $pictures to
make-rows so it can split those into rows. Then we pass the
children of the <row> element to
format-row to create a single row. And then we recursively
call find-rows to process the $rest of the
sequence. The next two cases are to just skip any
<title> and <date> elements since
they are handled elsewhere. The logic for handling a
<text> element is similar to that for
<row>, except that we wrap the contents of the
<text> in a <p> paragraph. Finally,
the default case handles <picture> elements with a
recursive call that moves the $next element over to the
$pictures sequence.
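The same dispatch can be traced in a short Python model. It is a sketch under assumptions: the children of <group> are represented as (tag, value) pairs, the output is a list of ("row", pictures) and ("p", text) items instead of HTML, and the walk is written as a loop rather than as a recursive function. The make_rows helper repeats the row-splitting rule so that the block stands alone.

```python
def make_rows(prev, pictures):
    """Rows of at most three, with the 2 + 2 special case for four in total."""
    if not pictures:
        return []
    if len(pictures) <= 3:
        return [pictures]
    if len(pictures) == 4 and prev == 0:
        return [pictures[:2], pictures[2:]]
    return [pictures[:3]] + make_rows(prev + 3, pictures[3:])


def find_rows(children):
    """Walk (tag, value) children of a group and return the output items."""
    out = []
    pending = []  # pictures seen since the last flush, like $pictures

    def flush():
        # Split a copy of the collected pictures into rows, then start over.
        out.extend(("row", row) for row in make_rows(0, list(pending)))
        pending.clear()

    for tag, value in children:
        if tag == "row":
            flush()
            out.append(("row", list(value)))  # explicit row, kept as given
        elif tag in ("date", "title"):
            flush()  # handled elsewhere, nothing emitted here
        elif tag == "text":
            flush()
            out.append(("p", value))
        else:
            pending.append(value)  # default case: collect the picture
    flush()
    return out
```

Note that every non-picture child flushes the pictures collected so far, so a <text> or <row> in the middle of a group always starts a fresh set of rows after it.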
Generating the Picture Pages
Now let's look at how we can generate a web page for each picture, something like that shown in Figure 2.
Figure 2: A web page for viewing a single picture.
The first tricky part is dealing with the links for the previous and next picture. The other tricky part is that we want to support multiple styles. The existing code supports three styles that each image can be displayed in:
"Medium image" is the default style. It displays the image at around 640 pixels wide, which is fine for most screens and browsers.
"Large image" gives you a higher resolution, about 1024 or 1280 pixels wide. (For most of the pictures I take, 1280 pixels wide is the original camera resolution: from which fact you can infer that my camera is a few years old.)
The "Info" style shows the thumbnail image, EXIF camera information from the JPEG, and links to the original JPEG files.
Thus we need to generate 3 HTML pages times the number of
<picture> elements. Exactly how to write all these
files is somewhat implementation dependent. There are at least three ways
to do it:
1. Write a script or driver program (perhaps a batch script) that loops through the pictures, calling the XQuery implementation once for each desired output file. You need to pass the desired file as some kind of parameter that XQuery can access. The standard function inputs is one way to tell the XQuery program which file to generate. The output from XQuery is redirected to the intended HTML file.
2. Generate all the HTML output files in a single XQuery run, by putting them in a single large XML object, like this:
<outputs>
  <output-file filename="picture1.html">
    <html>contents of picture1.html</html>
  </output-file>
  <output-file filename="picture2.html">
    <html>contents of picture2.html</html>
  </output-file>
  ... and so on ...
</outputs>
It is then easy to write a post-processor to split this into separate XML files.
3. Generate all the HTML output files in a single XQuery run, but use an implementation-specific function to write each HTML separately. While nonstandard, it is the simplest and most efficient, so we will use that approach here.
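For comparison, the post-processor mentioned in the second approach could look roughly like the following Python sketch. It assumes the combined document has exactly the <outputs>/<output-file filename="..."> shape shown above, with one <html> child per entry. The names split_outputs, SAMPLE and demo are invented for this illustration, and the standard-library serializer writes XML-style markup, which may need adjusting for strict HTML output.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# A tiny stand-in for the combined document produced by a single XQuery run.
SAMPLE = """<outputs>
  <output-file filename="picture1.html"><html><body>one</body></html></output-file>
  <output-file filename="picture2.html"><html><body>two</body></html></output-file>
</outputs>"""


def split_outputs(combined_xml, out_dir):
    """Write each <output-file> child of <outputs> to its own file.

    Returns the list of file names written, in document order.
    """
    root = ET.fromstring(combined_xml)
    written = []
    for entry in root.findall("output-file"):
        # basename() keeps a stray path in @filename from escaping out_dir.
        name = os.path.basename(entry.get("filename"))
        html = entry.find("html")
        with open(os.path.join(out_dir, name), "w", encoding="utf-8") as f:
            f.write(ET.tostring(html, encoding="unicode"))
        written.append(name)
    return written


def demo():
    """Split SAMPLE into a temporary directory and report what was written."""
    with tempfile.TemporaryDirectory() as tmp:
        names = split_outputs(SAMPLE, tmp)
        with open(os.path.join(tmp, names[0]), encoding="utf-8") as f:
            first = f.read()
    return names, first
```

The extra pass over one large document is the cost of this approach, which is part of why writing each file directly from the query is the more efficient choice.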