XML.com: XML From the Inside Out

Generating XML and HTML using XQuery
by Per Bothner

Things start to get interesting when we get to the call to make-rows. This is a recursive helper that divides a sequence of <picture> elements into rows of at most three. It's quite straightforward if you're comfortable with recursive functions. (If you're not comfortable with recursion, it may seem mysterious, but it does work.) The $pictures parameter is the sequence of <picture> elements we haven't dealt with yet, and $prev is the number of pictures already placed in earlier rows. If the sequence is empty, we return nothing (the empty sequence ()). If there are 3 or fewer pictures, we pass them to the format-row function, which makes a single row of thumbnail pictures. If there are exactly 4 pictures in total (that is, 4 remain and $prev is 0), we pass the first 2 pictures to format-row, which puts them in one row, and then do the same with the remaining 2. Otherwise, there are more than 3 pictures, so we put the first 3 in a row using format-row, then take the rest of the pictures (starting at picture 4) and recursively call make-rows. This processes the rest of the pictures in the same way, putting 3 pictures in a row each time, until we come to the end.

This is the make-rows function:

{-- Process a sequence of <picture> elements, grouping them into
 -- rows of at most 3, for the thumbnail page.
 -- $prev:  An integer giving the number of <picture>s in
    this current sequence that precede those in $pictures.
 -- $pictures:  Remaining <picture> elements to process.
 -- Returns formatted HTML for the input $pictures.
 --}
define function make-rows($prev, $pictures) {
  let $count := count($pictures) return
  if ($count = 0) then ()
  else if ($count <= 3) then
      format-row($pictures)
  {-- A special case:  If there are exactly 4 pictures in total, then
   -- group them as 2 rows of 2 rather than 3 + 1. --}
  else if ($count = 4 and $prev = 0) then
     (format-row(sublist($pictures, 1,2)),
      format-row(sublist($pictures, 3,2)))
  else
     (format-row(sublist($pictures, 1,3)),
      make-rows($prev+3,sublist($pictures,4)))
}
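
To see what the recursion produces, here is a small sketch in Python (not XQuery) that mirrors the same branching but returns only the row sizes. The name row_sizes is hypothetical and exists only for this illustration. It assumes the first non-empty branch covers three or fewer pictures, as the description above says.

```python
def row_sizes(count, prev=0):
    """Return the sizes of the rows make-rows would emit for `count` pictures.

    `prev` is the number of pictures already placed in earlier rows
    of the same sequence, like $prev in the XQuery version.
    """
    if count == 0:
        return []
    if count <= 3:
        return [count]
    if count == 4 and prev == 0:
        # Special case: exactly 4 pictures in total become 2 + 2.
        return [2, 2]
    # General case: a full row of 3, then recurse on the remainder.
    return [3] + row_sizes(count - 3, prev + 3)


for n in (3, 4, 5, 7):
    print(n, row_sizes(n))
```

Four pictures give rows of 2 and 2, but seven give 3, 3, and 1: the special case applies only when the whole sequence has exactly 4 pictures, which is what the $prev = 0 test checks.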

The format-row function loops over a sequence of <picture> elements and calls make-thumbnail on each one. If the <picture> has a <caption> child, the caption is placed underneath the thumbnail. Both the thumbnail and the caption are wrapped in <a> HTML links to the picture's own page and placed inside a small HTML <table>; the entire row is then wrapped in another HTML <table>.

define function format-row($row) {
  {-- emit a newline for readability --} "
",<table width="90%" class="row"><tr>{
  for $pic in $row return
  <td>
   <table bgcolor="black" cellpadding="0" frame="border"
      border="0" rules="none" vspace="10">
      <tr>
        <td align="center"><a href="{$pic/@id}.html">{
          make-thumbnail($pic)}</a></td>
      </tr>
      { if ($pic/caption) then
      <tr>
        <td bgcolor="#FFFF99" align="center"><a class="textual"
          href="{$pic/@id}.html">{$pic/caption/node()}</a></td>
      </tr>
      else ()}
    </table>
  </td>
  }</tr></table>
}

Finally, make-thumbnail looks for a <small-image> child that contains the actual name of the JPEG file and calls make-img to emit the <img> image link. If there is no <small-image>, we look for an <image> or a <full-image> element and have the browser scale the image to a smaller size. (In that case the browser has to download the bigger image, only to throw away most of the detail, so this is not something you want to do normally.)

define function make-thumbnail($pic) {
  if ($pic/small-image) then
    make-img($pic/small-image, 1.0)
  else if ($pic/image) then
    make-img($pic/image, 0.5)
  else if ($pic/full-image) then
    make-img($pic/full-image, 0.2)
  else
  ( "(missing small-image: ", string($pic), ")" )
}

define function make-img($picture, $scale) {
  <img border="1" src="{$picture}" width="{number($picture/@width) * $scale}"
    height="{number($picture/@height) * $scale}" />
}
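
The fallback order and scale factors can also be read as a small lookup table. The following Python sketch (not XQuery) illustrates that idea with the standard xml.etree.ElementTree module. The name thumbnail_attrs and the sample <picture> are hypothetical, and the sketch assumes that every image element carries width and height attributes.

```python
import xml.etree.ElementTree as ET

# (child element name, scale factor), in the order make-thumbnail tries them.
FALLBACKS = [("small-image", 1.0), ("image", 0.5), ("full-image", 0.2)]


def thumbnail_attrs(pic):
    """Return src/width/height for the thumbnail, or None if no image child exists."""
    for name, scale in FALLBACKS:
        child = pic.find(name)
        if child is not None:
            return {
                "src": (child.text or "").strip(),
                # Assumes width and height attributes are present.
                "width": float(child.get("width")) * scale,
                "height": float(child.get("height")) * scale,
            }
    return None


# Hypothetical sample data: a <picture> with no <small-image> child.
pic = ET.fromstring(
    '<picture id="Lake"><image width="640" height="480">Lake.jpg</image></picture>'
)
print(thumbnail_attrs(pic))
```

Here the 640x480 <image> is halved to 320x240, matching the 0.5 factor used for the <image> branch.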

A Slight Refinement

The code so far ignores any children of the top-level <group> except for <picture>. However, if you look at the original index.xml example, you'll see that the <group> has a <text> child. Let's place any such <text> contents in the appropriate place on the overview page. Another feature that is sometimes useful is the ability to specify explicitly how the pictures are organized into rows, rather than depending on the default maximum of three per row. You can do that by editing the index.xml file to place a <row> element around one or more <picture> elements, which specifies that these should go in a row by themselves. The revised make-group-page calls a new find-rows function that handles both cases:

define function make-group-page($group) {
<html>
  ... rest as before ..
    <h2>{$group/title/node()}</h2>
{   find-rows((), $group/*)}
  </body>
</html>
}

{-- Process the children of a <group>, grouping thumbnails into rows.
 -- $pictures:  Sequence of <picture>s that need to be split into rows.
 -- $unseen: sequence of remaining children we have not processed yet.
 --}
define function find-rows($pictures, $unseen) {
  if (empty($unseen)) then make-rows(0, $pictures)
  else
    let $next := item-at($unseen, 1),
        $rest := sublist($unseen, 2)
    return
      typeswitch ($next)
      case element row return
        (make-rows(0, $pictures),format-row($next/*),find-rows((), $rest))
      case element date return {-- ignore <date> children here. --}
        (make-rows(0, $pictures),find-rows((), $rest))
      case element title return {-- ignore <title> children here. --}
        (make-rows(0, $pictures),find-rows((), $rest))
      case element text return {-- format <text> as a paragraph. --}
        (make-rows(0, $pictures),<p>{$next/node()}</p>,find-rows((), $rest))
      default return
        find-rows(($pictures,$next), $rest)
}

The initial call to find-rows sets $pictures to the empty sequence and $unseen to the sequence of all the child elements of the top-level <group>. If $unseen is the empty sequence, we're done: we just call make-rows to wrap up the remaining pictures. Otherwise, we look at the first element of $unseen.

We use a typeswitch expression to do the appropriate thing depending on the type of that first element. A typeswitch evaluates an expression (in this case $next, the first value of the $unseen sequence). Then it searches through the case clauses, each of which specifies a type. It selects the first case clause such that the $next value is an instance of the type and evaluates the corresponding return expression. If there is no matching case, it evaluates the default return instead.

If the $next value is a <row> element, we first pass any previously seen $pictures to make-rows so it can split those into rows. Then we pass the children of the <row> element to format-row to create a single row. And then we recursively call find-rows to process the $rest of the sequence. The next two cases just skip any <date> and <title> elements, since they are handled elsewhere. The logic for handling a <text> element is similar to that for <row>, except that we wrap the contents of the <text> in a <p> paragraph. Finally, the default case handles <picture> elements with a recursive call that moves the $next element over to the $pictures sequence.
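
The same dispatch can be sketched as a plain loop. The following Python code (not XQuery) is only an illustration: it walks the children iteratively rather than recursively, and instead of producing HTML it returns a list of events showing where the pending pictures get flushed. The name find_rows, the event tuples, and the sample <group> are all hypothetical.

```python
import xml.etree.ElementTree as ET


def find_rows(children):
    """Walk the children of a <group> and return a list of layout events.

    ("auto", ids) marks pictures that make-rows would split into rows,
    ("row", ids) marks an explicit <row>, and ("text", s) marks a <text>.
    """
    events, pending = [], []

    def flush():
        # Corresponds to make-rows(0, $pictures) on the pictures seen so far.
        if pending:
            events.append(("auto", [p.get("id") for p in pending]))
            pending.clear()

    for child in children:
        if child.tag == "row":
            flush()
            events.append(("row", [p.get("id") for p in child]))
        elif child.tag in ("date", "title"):
            flush()  # handled elsewhere; nothing is emitted for the element
        elif child.tag == "text":
            flush()
            events.append(("text", (child.text or "").strip()))
        else:
            pending.append(child)  # default case: collect the <picture>
    flush()
    return events


# Hypothetical sample data.
group = ET.fromstring(
    "<group><title>Trip</title><text>Some notes.</text>"
    '<picture id="a"/><picture id="b"/>'
    '<row><picture id="c"/></row>'
    '<picture id="d"/></group>'
)
for event in find_rows(group):
    print(event)
```

Pictures a and b are flushed as one automatic group when the <row> is reached, picture c forms its own explicit row, and picture d is flushed at the end of the sequence.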

Generating the Picture Pages

Now let's look at how we can generate a web page for each picture, something like that shown in Figure 2.

Figure 2: A web page for viewing a single picture.

The first tricky part is dealing with the links for the previous and next picture. The other tricky part is that we want to support multiple styles. The existing code supports three styles that each image can be displayed in:

  • "Medium image" is the default style. It displays the image at around 640 pixels wide, which is fine for most screens and browsers.

  • "Large image" gives you a higher resolution, about 1024 or 1280 pixels wide. (For most of the pictures I take, 1280 pixels wide is the original camera resolution: from which fact you can infer that my camera is a few years old.)

  • The "Info" style shows the thumbnail image, EXIF camera information from the JPEG, and links to the original JPEG files.

Thus we need to generate 3 HTML pages times the number of <picture> elements. Exactly how to write all these files is somewhat implementation dependent. There are at least three ways to do it:

  • Write a script or driver program (perhaps a batch script) that loops through the pictures, calling the XQuery implementation once for each desired output file. You need to pass the desired file as some kind of parameter that XQuery can access. The standard function inputs is one way to tell the XQuery program which file to generate. The output from XQuery is redirected to the intended HTML file.

  • Generate all the HTML output files in a single XQuery run, by putting them in a single large XML object, like this:

    <outputs>
      <output-file filename="picture1.html">
        <html>contents of picture1.html</html>
      </output-file>
      <output-file filename="picture2.html">
        <html>contents of picture2.html</html>
      </output-file>
      ... and so on ...
    </outputs>

    It is then easy to write a post-processor to split this into separate XML files.

  • Generate all the HTML output files in a single XQuery run, but use an implementation-specific function to write each HTML separately. While nonstandard, it is the simplest and most efficient, so we will use that approach here.
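
For comparison, the post-processor mentioned under the second approach could look something like the following Python sketch. It is an assumption-laden illustration, not part of the article's code: the name split_outputs is hypothetical, each <output-file> is assumed to wrap exactly one <html> element, and the sample input is made up.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def split_outputs(combined_xml, out_dir):
    """Write the child of each <output-file> to out_dir; return the file names."""
    written = []
    for entry in ET.fromstring(combined_xml).iter("output-file"):
        # Drop any directory components from the requested file name.
        name = os.path.basename(entry.get("filename"))
        page = entry[0]  # the <html> element wrapped by <output-file>
        ET.ElementTree(page).write(
            os.path.join(out_dir, name), encoding="utf-8", method="html"
        )
        written.append(name)
    return written


# Hypothetical sample input in the <outputs> format shown above.
combined = (
    "<outputs>"
    '<output-file filename="picture1.html"><html><body>one</body></html></output-file>'
    '<output-file filename="picture2.html"><html><body>two</body></html></output-file>'
    "</outputs>"
)
out_dir = tempfile.mkdtemp()
print(split_outputs(combined, out_dir))
```

The extra pass over one large document is the cost of staying within standard XQuery, which is part of why the implementation-specific approach is simpler.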
