Some Background on XML Conformance Testing

September 15, 1999

David Brownell


Part 1: Conformance Testing for XML Processors
Part 2: Background on Conformance Testing
Part 3: Non-Validating XML Processors
Part 4: Validating XML Processors
Part 5: Summary

This article evaluates the conformance of XML processors in Java using two freely available tools, described in the next two sections: the OASIS XML conformance test suite, and a test driver that applies it.

Note that this article uses the term XML Processor, as found in the XML specification. In effect this is what is informally called an XML Parser.

OASIS XML Conformance Test Suite

In order to improve the overall quality of XML processors available in the industry, OASIS (the Organization for the Advancement of Structured Information Standards) formed a working group to provide an organizational home for a suite of XML conformance tests. That work produced its first draft in July 1999.

The goals of this working group were initially limited to providing and documenting a set of test cases covering the testable requirements found in the XML 1.0 specification. The work of applying those tests, and evaluating their results, was intentionally left separate. (Vendors can create test drivers specific to their APIs, and use them as part of their quality control processes.) Accordingly, the results provided by the OASIS group were as follows:

  • A suite of test cases, XML documents for which any conformant XML processor has a known result. These test cases came from several sources:
    • XMLTEST cases, which have been widely available since early in the life of the XML specification, and focus on non-validating processors.
    • Fuji Xerox Japanese XML Tests, available for slightly less time, consisting of two documents with many Japanese characters, in a variety of encodings and syntactic contexts.
    • Sun Microsystems Tests, designed to augment XMLTEST and the Fuji Xerox tests by adding coverage of all validity constraints in the XML specification, and by incorporating tests for various customer-reported problems not previously tested.
    • OASIS Tests, written by the OASIS group with direct reference to the XML specification. They were intended to provide rigorous coverage of many of the grammar productions.
  • A valid XML document acting as a test database, describing each test in terms of: (a) what part of the XML specification it tests; (b) any requirements it places on the processor, such as what types of external entities it must process; (c) an interpretation of the intent of that test; and (d) a test identifier, used to uniquely identify each test.
  • Preliminary XSL/T style sheets to process that test database and produce documentation covering all of the tests. (One conformed to a recent draft of the XSL/T specification; another was usable in the Microsoft IE5 browser.)

That test database was designed to be used for at least two specific applications: producing the test documentation described above, and generating an easily interpreted testing report in conjunction with a test driver. (Such testing reports are the focus of this article.)

It is important to understand the structure of the tests provided by OASIS. This structure came from basic principles of software testing, and from the content of the XML specification itself. Read the test documentation for full information, and for a brief summary just understand that:

  • Different kinds of processors (validating, non-validating, etc.) have slightly different testing criteria, as identified in the testing reports discussed here;
  • Tests include positive test documents which the processor must accept (producing at most a warning);
  • Tests include negative test documents for which the processor must report an appropriate type of error (the error must always relate to the construct being tested by that document; it must also be continuable for validity errors, and fatal for well-formedness errors);
  • Associated with some positive tests are output documents which hold a normalized representation of the data that type of processor is required to report when parsing the input document.

The information in the test report for a given XML processor can accordingly be quite complex.
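To illustrate the "output document" comparison described above, the following minimal sketch echoes a processor's reported parse events in a simple normalized text form, which could then be compared byte-for-byte against a reference output. The textual format here is invented for illustration only; it is not the normalized format the OASIS tests actually use.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class NormalizedEcho {
    public static void main(String[] args) throws Exception {
        String doc = "<root a=\"1\"><child>text</child></root>";
        final StringBuilder out = new StringBuilder();

        // Echo each parse event the processor reports, in a simple
        // normalized text form (one token per event).
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes atts) {
                out.append('(').append(qName);
                for (int i = 0; i < atts.getLength(); i++)
                    out.append(' ').append(atts.getQName(i))
                       .append('=').append(atts.getValue(i));
                out.append(')');
            }
            @Override
            public void endElement(String uri, String local, String qName) {
                out.append("(/").append(qName).append(')');
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                out.append(ch, start, length);
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new ByteArrayInputStream(doc.getBytes("UTF-8")), handler);
        System.out.println(out);
    }
}
```

Two processors that report the same events produce identical normalized output, so the comparison reduces to a simple text diff.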

XML Conformance Test Driver

This article presents test results produced by a test driver I wrote, available under an Open Source license at the URL given above. The driver reads the test database, applies each test to the processor under test, and uses a report template mechanism to write out a detailed report. Such reports are valuable to developers, who can use them to identify bugs that must be fixed without examining perhaps hundreds of individual test cases. They are also useful to anyone evaluating a parser for use in an application, and they are invaluable for interpreting negative test results, as described later.
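The driver's core classification logic can be sketched roughly as follows. The two inline test cases and the pass/fail labels are simplified stand-ins for the real test database and report templates, but the rule is the one the article relies on: a positive test passes when the processor accepts the document, and a negative test passes when the processor reports an error.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MiniDriver {
    // Parse the document; report whether the processor accepted it.
    static boolean accepted(String xml) {
        try {
            SAXParser p = SAXParserFactory.newInstance().newSAXParser();
            p.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")),
                    new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;                   // the processor reported an error
        } catch (Exception e) {
            throw new RuntimeException(e);  // driver problem, not a test result
        }
    }

    public static void main(String[] args) {
        // A positive test passes when the document is accepted ...
        System.out.println("p01: " + (accepted("<ok/>") ? "PASS" : "FAIL"));
        // ... and a negative test passes when it is rejected.
        System.out.println("n01: " + (accepted("<a><b></a>") ? "FAIL" : "PASS"));
    }
}
```

As the article notes, a negative test that "passes" this way still needs interpretation: the processor must have rejected the document for the right reason.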

This test driver uses the SAX API to perform its testing. SAX is the most widely supported XML processor invocation API in Java, as evidenced by the fact that this article can examine a dozen such processors. Of necessity, the driver tests SAX conformance as much as conformance to the XML specification, since the SAX API is how the processor reports the information that the XML specification requires it to report. For example, it is this API that offers the choice between continuing after a validity error or not; and it is this API that exposes the characters, elements, and attributes that a processor reports to an application.
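The choice between continuing or not is made through the SAX ErrorHandler callbacks: validity errors arrive through error() and may be continued past, while well-formedness errors arrive through fatalError() and terminate the parse. A minimal sketch, using only the standard SAX classes:

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class ErrorDemo {
    public static void main(String[] args) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            // Validity errors arrive here; returning normally lets the
            // processor continue parsing.
            @Override
            public void error(SAXParseException e) {
                System.out.println("validity error, continuing: "
                                   + e.getMessage());
            }
            // Well-formedness errors arrive here; the parse cannot continue.
            @Override
            public void fatalError(SAXParseException e)
                    throws SAXParseException {
                System.out.println("fatal (well-formedness) error");
                throw e;
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try {
            // Mismatched tags: a well-formedness violation, hence fatal.
            parser.parse(new ByteArrayInputStream(
                             "<a><b></a>".getBytes("UTF-8")), handler);
        } catch (SAXParseException e) {
            System.out.println("parse aborted");
        }
    }
}
```

A test driver built on these callbacks can thus check not only that an error was reported, but that it was reported with the severity the XML specification requires.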

(Note that DOM can't be an API to an XML processor in its current form. One basic issue is the lack of a standard for getting a DOM tree from a given document; there is also no standard way to report parsing errors, or to choose whether to continue processing after a validity error. DOM is also less widely supported than SAX.)

The first draft of the OASIS report has some bugs in its test database. Specifically, the relative paths to the test cases are incorrect (the file locations changed during the publication process without their URIs being updated), and a few output test cases were omitted. Read the documentation in the test driver for information about how the test database was manually patched to fix those problems for the test runs documented here.

Analyzing XML Conformance Test Reports

Using the test driver and the OASIS tests, a report is generated for each processor. That report summarizes the results for over one thousand test cases. For brevity, only errors and cases needing further interpretation are presented in each report.

The first step in analyzing each processor was to identify any patterns in the clear failures (both positive and negative). Some processors have strong patterns, such as one or two particularly severe bugs which skew the results. Others have more scattershot patterns of bugs; in such cases, I have tried to provide a summary that indicates the types of errors which were most commonly reported.

As a practical matter, I had to put a time limit on this analysis: processors with more than about two hundred clear failures got less quality time invested in their analysis. That limit was somewhat arbitrary, but enough processors fell within it, and their analysis still took substantial time. Failing more than twenty percent of these tests is at best a marginal grade, and deeper analysis would be better done by the implementor in preparation for fixing the processor's bugs.

The second step was to interpret the negative tests to determine whether any of those "passed" tests were actually failures, and to adjust the count of passed tests down accordingly. This work can't easily be done without the test interpretation data from the test documentation, since it requires comparing the reason each test failed (its diagnostic) with the reason it was supposed to fail (as recorded in both the test documentation and the test report). If the reasons don't match, the parser has a problem: each test case is only supposed to contain the one error. Typically, mismatches have been conformance bugs, but they can also be incorrect diagnostics.

This work is not at all straightforward, and is strongly influenced by the quality of diagnostics that a processor provides. Processors with poor diagnostics (such as the classic "syntax error on line 23") can't be effectively analyzed. The analysis is also quite time-consuming: for some (good) processors, well over six hundred tests need such analysis, far beyond the arbitrary limit of two hundred mentioned above, so the better processors often need more work than the worse ones. (Note that there is a clear end-user benefit to such work: clear diagnostics make products easier to use, and this analysis is unlikely to be done for a processor with bad diagnostics.)

Accordingly, only the most egregious errors of this type were normally identified. If a more thorough analysis was performed, perhaps because the diagnostics were notably good or bad, this is noted.

What about Non-Java XML Processors?

Just because this article looks only at Java processors, don't assume that such testing can only be done for processors in Java!

In fact, other XML processors deserve the same kind of testing (and consequent bug fixing), particularly the widely available processors found in XML-enabled web browsers (Mozilla, IE5) and in desktop environments like Gnome and KDE. It should be straightforward to create a Java wrapper with a SAX API for the Mozilla and Gnome processors (which have event driven APIs) and that may also be possible for the processors in KDE and in IE5. That would let this test driver be used, automatically providing a usable test report format. (And having that SAX API to the native C code might be handy too.) Alternatively, custom test drivers could be written for each of those specific XML processor APIs.


These results are not in any sense "official", although they will be enlightening nonetheless. The test cases themselves are under review, for content as well as completeness of coverage, as is the driver used to generate these results. Some of the XML specification interpretations associated with the test cases are dependent on errata from the W3C XML team that have not yet been published; when those get published, some of the categorizations of individual tests may need to change.

Comments about those interpretations of the XML specification should be directed to the OASIS XML Conformance working group, or to the W3C in cases where the XML specification is unclear. Comments about the test driver itself should be directed to the author.

Also, not only am I the author of this article and the test driver, but I was the developer of one of the processors (Sun's, although I no longer work there) and was also a significant contributor to the OASIS work. I attribute the good ranking of the processor I wrote to paying careful attention to testing against the XML specification, in the same way I attribute the corresponding ranking of XP to such attention. (The author of XP is the author of the XMLTEST cases, and also contributed to the XML specification itself.) I have attempted to avoid factual bias.

With all of that said, these testing results are still the best that are generally available at this time, and should be an effective guide to anyone needing to choose (or bug-fix) an XML processor in Java.