
BumbleBee, the XQuery Test Harness

March 10, 2004

Jason Hunter

Will XQuery be the key that unlocks a new generation of data and content? My money, and the vendor money, says yes. Nearly every vendor, from the well-known old guard (IBM, Oracle, and Microsoft) to the plucky upstarts (Cerisent, X-Hive, and Qizx), has expressed support for XQuery and is actively collaborating in its standardization. Under development by the W3C and in Last Call, XQuery looks poised to become the standard query language by which companies access and manipulate semi-structured data and merge disparate data and content repositories.

If you're unfamiliar with XQuery, see the Resources section below for some introductory articles to get you started.

Using XQuery, however, can be quite frustrating, as you're faced with choosing from a variety of XQuery vendors that support different versions and interpretations of the XQuery specification. Then once you've selected the XQuery engine that's best for you, it can be hard to know if the queries you write today will produce reliable results tomorrow after you upgrade your engine or make changes to your queries.

The BumbleBee XQuery test harness (available at XQuery.com) addresses these frustrations and takes the pain and uncertainty out of learning and using XQuery. Named because it buzzes around FLWORs, BumbleBee provides a cross-platform, vendor-neutral automated testing environment for XQuery development. In other words, BumbleBee is to XQuery what JUnit is to Java. Write your query, define the expected result, and let the tool do the rest. With BumbleBee you can automate regression testing, quantify vendor compatibility, and solidify your understanding of XQuery by running the same query across numerous vendors with the push of a button.

I'm one of the folks behind BumbleBee, and probably its biggest fan since it's helped me so much with my own XQuery authoring.

Automated Testing, Vendor Selection, and Test-First Learning

Before we discuss the details of BumbleBee, let's put its role in context. Regular testing, especially automated testing, has proven essential to producing high-quality software. One can, of course, write XQuery code without testing, but that carries extra risk. Unlike GUI programs where buggy code usually produces obvious errors, a miswritten query can execute fine and produce results that appear correct, while in reality the results may be incomplete or slightly erroneous.

Using BumbleBee as your testing tool, you can craft automated tests to run your queries against fixed data sets and verify the results of the queries, so you can trust the results of production queries run against less controlled input. Then when you change your query or upgrade your XQuery engine, the automated tests let you perform a quick verification before deployment. What if a query bug escapes initial testing? Add a new BumbleBee test case to the test suite ensuring the bug never reappears.

To take an example from my own life, I teach a weeklong XQuery course. I've found it valuable to load all the course examples and homework into a BumbleBee test suite. That way, before teaching the course, I can test everything against the specific XQuery engine we'll be using in the course. Any bugs pop out immediately with little effort. No longer must students act as guinea pigs.

Testing can also help the process of selecting an XQuery vendor. BumbleBee includes a set of several hundred tests based on the W3C Use Cases and NIST conformance test suite. These tests can exercise an XQuery engine and quantify its conformance. You can also extend these tests with your own conformance tests to more exhaustively cover the specific areas of importance to you. Contributions from users are welcome, so if you write such tests, please consider submitting them for others to use.

Some have also found BumbleBee useful in learning XQuery, employing it to support "test-first learning." In this mode of learning, you present yourself a challenge: "I want to take this input and produce this output." Then you work to code the query that produces the desired result. When you succeed, the test passes. Over time your suite of learning tests becomes a knowledge base you can draw from later when writing production code. As new XQuery specification versions come out, you can rerun the tests and examine any new failures to see whether anything you learned is out of date. This style of learning can also be useful to a teacher or course instructor for handing out and grading homework. Student grades are almost printed for you. You just judge code for style and take the afternoon off. If you're looking for new challenges, the XQuery.com Wiki has a Challenges area.

Using BumbleBee

Now that we understand the advantages of XQuery automated testing, let's put BumbleBee to work buzzing around some tests. BumbleBee executes from a command-line script (bumblebee.bat or bumblebee.sh depending on platform). By default, if you don't specify any tests on the command line, BumbleBee runs a user-configurable default suite of tests. As each test is run, its name and pass or fail status are printed to the console. For example:

Passed   -> Test (Vendor 1): Test 1 in 0.015 sec
Failed!  -> Test (Vendor 1): Test 2
Expected attribute value '1992' but was '1994' - comparing
<book year="1992"...> at /BumbleBee_Result[1]/bib[1]/book[1]/@year
to <book year="1994"...> at /BumbleBee_Result[1]/bib[1]/book[1]/@year

When all the tests have run to completion, you see a summary of test results printed to the console for each XQuery engine that was tested. For example:

Time: 39.598 seconds

FAILURES!!!
Vendor 1: Tests Run: 72, Failures: 22, Disabled: 0 (69.4% passed)
Vendor 2: Tests Run: 72, Failures: 1, Disabled: 0 (98.6% passed)
Total   : Tests Run: 144, Failures: 23, Disabled: 0 (84% passed)

(See log/bumblebee.log for failure details.)
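The percentages in a summary like this are simple arithmetic: tests passed divided by tests run. As a quick check of the sample figures, here is a minimal Python sketch. The function name pass_rate and the dictionary layout are my own choices for illustration and are not part of BumbleBee.

```python
def pass_rate(tests_run, failures):
    """Return the percentage of tests that passed, rounded to one decimal."""
    if tests_run == 0:
        return 0.0
    return round(100.0 * (tests_run - failures) / tests_run, 1)


# Figures copied from the sample summary: engine -> (tests run, failures).
summary = {"Vendor 1": (72, 22), "Vendor 2": (72, 1)}

# Per-engine pass rates.
rates = {engine: pass_rate(run, failed) for engine, (run, failed) in summary.items()}

# The "Total" line pools the counts across engines before dividing.
total_run = sum(run for run, _ in summary.values())
total_failed = sum(failed for _, failed in summary.values())
total_rate = pass_rate(total_run, total_failed)

for engine, rate in rates.items():
    print(f"{engine}: {rate}% passed")
print(f"Total: {total_rate}% passed")
```

The total is computed from the pooled counts (121 of 144), not by averaging the two per-engine percentages. The two happen to agree here only because both engines ran the same number of tests.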

To run BumbleBee against specific directories containing BumbleBee test files, list the directories on the command line:

bumblebee directory1 [directory2 [...]]

For example, to run the November 2003 Use Case tests distributed with BumbleBee, type:

bumblebee tests/2003-11/usecases

Depending on your server's compliance level, you may want to use the August 2003 or May 2003 tests instead.

After you run BumbleBee, the log/bumblebee.log file contains a comprehensive report of all tests run. For each failed test, the test report includes the XQuery expression that was run, the actual query result returned by the XQuery engine under test, the query result that was expected by the test, and the failure message. The console output is always fairly short; the log output is always comprehensive.

Writing BumbleBee Tests

To learn how to write a BumbleBee test, let's write a query against the following XML file, named tunes.xml, representing a collection of songs:

<Tunes>
  <Tracks>
    <Track>
      <Name>Ready, Steady, Go</Name>
      <Artist>Paul Oakenfold</Artist>
      <Album>Bunkka</Album>
      <Genre>Electronic</Genre>
      <MyRating>10</MyRating>
      <Time>254</Time>
    </Track>
    <Track>
      <Name>Battle</Name>
      <Artist>Hans Zimmer and Lisa Gerrard</Artist>
      <Album>Gladiator Soundtrack</Album>
      <Genre>Instrumental</Genre>
      <MyRating>8</MyRating>
      <Time>193</Time>
    </Track>
    <Track>
      <Name>Orange Wedge</Name>
      <Artist>The Chemical Brothers</Artist>
      <Album>Surrender</Album>
      <Genre>Electronic</Genre>
      <MyRating>7</MyRating>
      <Time>254</Time>
    </Track>
  </Tracks>
</Tunes>

We place this XML file in the tests/2003-05/examples directory assuming our chosen vendor supports the May 2003 draft. It's good practice to organize tests by XQuery specification version.

We want the query to generate a new XML document representing a play list of our favorite songs sorted by song name. Using any text editor, we create the following BumbleBee test file named MyFirstTest.bee in the tests/2003-05/examples directory:

!name My First Test

# What is my favorite music?

!load tests/2003-05/examples/tunes.xml

!query
<Playlist>
  {
    for $t in doc("tests/2003-05/examples/tunes.xml")//Track
    where $t/Genre = "Electronic" and $t/MyRating > 5
    order by $t/Name
    return
        <Track>
          { $t/Name, $t/Artist, $t/Genre, $t/MyRating }
        </Track>
  }
</Playlist>
!end

!result
<Playlist>
  <Track>
    <Name>Orange Wedge</Name>
    <Artist>The Chemical Brothers</Artist>
    <Genre>Electronic</Genre>
    <MyRating>7</MyRating>
  </Track>
  <Track>
    <Name>Ready, Steady, Go</Name>
    <Artist>Paul Oakenfold</Artist>
    <Genre>Electronic</Genre>
    <MyRating>10</MyRating>
  </Track>
</Playlist>
!end

Test directives begin with an exclamation (!) symbol. Comments begin with a hash (#).

The first test directive is the !name directive. This directive specifies an arbitrary name for the test. You'll see the test name used in the console and file output to uniquely identify the test. The second test directive is !load. This requests that the server load the file at the given path under the same URI as the path. On some engines that drive off the file system directly, this isn't technically necessary, but it's always good to have. The path should be relative to the directory from which we run BumbleBee.

Next we find the !query directive. This directive contains the text of the XQuery expression to be run. The !query directive must end with a single line containing the !end directive. The last test directive is the !result directive. This directive contains the text of the result we expect to be produced by the XQuery expression specified in the !query directive. The !result directive must end with a single line containing the !end directive. If desired, you can add additional test cases within the same .bee file; just start the additional tests with a new !name directive.
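To make the file layout concrete, here is a minimal Python sketch of how a .bee file could be read into test records. It is not BumbleBee's own parser, which I am not reproducing here. The function name parse_bee and the dictionary keys are hypothetical, and the sketch handles only the directives described above (!name, !load, !query, !result, !end) plus # comments.

```python
def parse_bee(text):
    """Parse .bee text into a list of test dicts (illustrative sketch only)."""
    tests = []
    current = None   # the test currently being built
    block = None     # "query" or "result" while inside a block, else None
    body = []        # lines collected for the open block

    for raw in text.splitlines():
        stripped = raw.strip()

        if block is not None:
            # Inside !query or !result: everything up to !end is content.
            if stripped == "!end":
                content = "\n".join(body).strip("\n")
                if block == "query":
                    current["query"] = content
                else:
                    current["results"].append(content)
                block, body = None, []
            else:
                body.append(raw.rstrip())
            continue

        if not stripped or stripped.startswith("#"):
            continue  # blank line or comment between directives

        directive, _, argument = stripped.partition(" ")
        if directive == "!name":
            # A new !name starts a new test case in the same file.
            current = {"name": argument.strip(), "loads": [],
                       "query": None, "results": []}
            tests.append(current)
        elif current is None:
            raise ValueError("directive before !name: " + stripped)
        elif directive == "!load":
            current["loads"].append(argument.strip())
        elif directive in ("!query", "!result"):
            block = directive[1:]
        else:
            raise ValueError("unsupported directive: " + stripped)

    if block is not None:
        raise ValueError("missing !end for !" + block)
    return tests


SAMPLE = """\
!name A Compound Test

# Either outcome is acceptable.

!query
1 eq 2 and 3 idiv 0 = 1
!end

!result
false
!end

!result
ERROR
!end
"""

tests = parse_bee(SAMPLE)
print(tests[0]["name"], tests[0]["results"])
```

Keeping results as a list means a test with several !result blocks needs no special handling: each block just appends one more acceptable outcome.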

As an aside, some may wonder why the hierarchical .bee files aren't written using XML. Turns out it's especially difficult for humans to author tests in XML -- not just because XML is more verbose, but because the query and result contents themselves use XML. Thus the tests have to be escaped or placed in CDATA sections, and neither of those solutions is helpful for test authoring. What's more, when the query or result use CDATA sections themselves, then it gets extremely difficult to decipher what's what.

After placing the MyFirstTest.bee file in the tests/2003-05/examples directory, you can run the test individually by typing:

bumblebee tests/2003-05/examples/MyFirstTest.bee

Alternatively, if there were multiple test files in the examples directory, you could run them all in one fell swoop using:

bumblebee tests/2003-05/examples

On running the test, you'll see the following console output:

BumbleBee: The XQuery Test Harness

Test script: bumblebee/tests/2003-05/examples/MyFirstTest.bee

Passed   -> Test (Qizx): My First Test in 2.902 sec

Test script: bumblebee/tests/2003-05/examples/MyFirstTest.bee

Passed   -> Test (Saxon): My First Test in 1.285 sec

Time: 4.394 seconds

OK!
Qizx : Tests Run: 1, Failures: 0, Disabled: 0 (100% passed)
Saxon: Tests Run: 1, Failures: 0, Disabled: 0 (100% passed)
Total: Tests Run: 2, Failures: 0, Disabled: 0 (100% passed)

Notice that in this example BumbleBee ran the test against two XQuery engines: Qizx and Saxon. The test passed in both cases. That is, as a result of running our test through BumbleBee we know that our XQuery expression produces the expected result when run against either of these engines. The XQuery engines used by default are selectable in the external configuration of BumbleBee.
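If you want to double-check why the expected result lists "Orange Wedge" before "Ready, Steady, Go" and leaves out "Battle", the same filter and sort can be reproduced outside XQuery. The following is a rough Python equivalent using the standard xml.etree.ElementTree module. The function name favorite_electronic is hypothetical, the data is a trimmed copy of tunes.xml, and Python's default string ordering stands in for XQuery's default collation, which is an assumption that happens to hold for these names.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of tunes.xml (Album and Time omitted; the query ignores them).
TUNES = """
<Tunes>
  <Tracks>
    <Track>
      <Name>Ready, Steady, Go</Name>
      <Artist>Paul Oakenfold</Artist>
      <Genre>Electronic</Genre>
      <MyRating>10</MyRating>
    </Track>
    <Track>
      <Name>Battle</Name>
      <Artist>Hans Zimmer and Lisa Gerrard</Artist>
      <Genre>Instrumental</Genre>
      <MyRating>8</MyRating>
    </Track>
    <Track>
      <Name>Orange Wedge</Name>
      <Artist>The Chemical Brothers</Artist>
      <Genre>Electronic</Genre>
      <MyRating>7</MyRating>
    </Track>
  </Tracks>
</Tunes>
"""


def favorite_electronic(xml_text):
    """Mirror the query: Electronic tracks rated above 5, sorted by name."""
    root = ET.fromstring(xml_text)
    picked = [
        track for track in root.iter("Track")
        if track.findtext("Genre") == "Electronic"
        and float(track.findtext("MyRating")) > 5   # compare as a number
    ]
    picked.sort(key=lambda track: track.findtext("Name"))
    return [(t.findtext("Name"), t.findtext("MyRating")) for t in picked]


playlist = favorite_electronic(TUNES)
for name, rating in playlist:
    print(name, rating)
```

"Battle" has a high enough rating but fails the genre condition, and the sort is by name rather than by rating, which is why the track rated 7 comes first.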

Negative and Compound Tests

Our first BumbleBee test was a positive test. It passed only when the XQuery engine under test produced the expected result and not an error condition. Sometimes you want to test that an XQuery engine produces an error when an error is appropriate. The following example ensures the engine reports a divide by zero error:

!name A Negative Test

!query
3 idiv 0 = 1
!end

!result
ERROR
!end

The !result directive uses a special ERROR keyword to indicate that any error reported by the XQuery engine is the expected result. Any non-error condition is a failure.

BumbleBee also allows you to specify multiple results for a single query. Why would you want this? Because sometimes two answers are possible. For example, in XQuery the operands of an expression can be evaluated in any order, and some orderings may short circuit to success while others may legitimately return errors. The following BumbleBee test demonstrates:

!name A Compound Test

!query
1 eq 2 and 3 idiv 0 = 1
!end

!result
false
!end

!result
ERROR
!end

Notice the use of two !result directives. The first !result directive indicates that false may be returned (if the expression is evaluated left to right and it short circuits). The second !result directive allows an error as another legal possibility (if the expression is evaluated right to left). If either false or an error condition is returned by the XQuery engine, then the test will pass. An arbitrary number of possible results can be declared in any BumbleBee test using this format.
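The pass rule for negative and compound tests can be summarized as "the outcome must match at least one declared result." Here is a small Python sketch of that rule. The function name matches_any_result and its parameters are hypothetical, and the sketch compares results as plain trimmed strings, which is a simplifying assumption; the failure message shown earlier suggests the real harness compares XML results node by node.

```python
ERROR = "ERROR"  # the special keyword used inside a !result block


def matches_any_result(expected_results, outcome, is_error):
    """Return True if the engine's outcome matches any declared result.

    expected_results: list of strings, one per !result block
    outcome:          text returned by the engine (ignored when is_error)
    is_error:         True if the engine reported an error instead
    """
    for expected in expected_results:
        if expected.strip() == ERROR:
            # ERROR accepts any error the engine reports.
            if is_error:
                return True
        elif not is_error and expected.strip() == outcome.strip():
            return True
    return False


compound = ["false", "ERROR"]   # results declared by the compound test
negative = ["ERROR"]            # results declared by the negative test

print(matches_any_result(compound, "false", False))  # short-circuited to false
print(matches_any_result(compound, "", True))        # engine raised an error
print(matches_any_result(negative, "false", False))  # no error: test fails
```

A positive test is then just the special case with a single non-ERROR result, so one rule covers all three kinds of test.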

XQuery Vendor Options

BumbleBee supports any XQuery engine accessible from Java. Currently that list stands at seven, listed in alphabetical order: Cerisent, Ipedo, IPSI-XQ, Qexo, Qizx/open, Saxon, and X-Hive. BumbleBee has adapter code for each of these vendors that maps from a standard interface to each vendor-specific implementation.

When you run a BumbleBee test, or suite of tests, the tests run against all enabled XQuery engines. If you run two tests with three XQuery engines enabled, then you will see six test results, a pair of results for each XQuery engine. You can control which engines BumbleBee uses by editing a list in the bumblebee.properties external configuration file.
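The number of results is simply tests multiplied by enabled engines. A tiny Python sketch of that run matrix follows. The function name planned_runs is hypothetical, the engine names are examples taken from the supported list, and grouping the runs by engine is an assumption about ordering rather than documented behavior.

```python
from itertools import product


def planned_runs(test_names, enabled_engines):
    """Pair every enabled engine with every test, grouped by engine."""
    return [(engine, test) for engine, test in product(enabled_engines, test_names)]


runs = planned_runs(["Test 1", "Test 2"], ["Qizx", "Saxon", "X-Hive"])

for engine, test in runs:
    print(f"Test ({engine}): {test}")
print(len(runs), "results in total")
```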

Power users can enable and disable tests for specific engines using the !enable enginename and !disable enginename directives. These directives let you author a vendor-specific test that won't be run against other vendors, or skip tests on a particular server that doesn't support the optional XQuery feature under test.

Conclusion

BumbleBee provides a powerful, portable, vendor-neutral automated test environment for XQuery. With BumbleBee you can automate your regression testing, compare multiple XQuery engines, and learn the language through structured challenges.

The latest BumbleBee release, version 1.2, includes support for seven vendors and numerous specification draft releases. The easy-to-write .bee file format allows for quick development of tests, including negative tests and compound tests.

Future BumbleBee versions may include a graphical query execution environment and test authoring tool. If this sparks your interest, write to "buzz" at xquery.com so it'll be sure to get done. If it doesn't spark your interest, write in anyway with what would.

Getting BumbleBee

A free evaluation download of BumbleBee can be found at http://xquery.com/bumblebee. Non-expiring licenses are available for commercial use; email your usage requirements to request one. Licenses are also available free of charge for developers of open source XQuery implementations and for qualified non-profit and educational use. Email "buzz" to discuss qualifying for such a license.