XML.com: XML From the Inside Out

Microsoft XML Parser Conformance

November 17, 1999

Contents

Part 2: Non-validating mode
Part 3: Default mode
Part 4: Summary of findings

Last September, David Brownell conducted a review of XML parsers for XML.com, testing them for conformance to the XML 1.0 specification. In this follow-up article, he tests Microsoft's MSXML.DLL parser, as found in Internet Explorer 5. Unlike the previously tested parsers, the Microsoft parser does not provide a SAX interface, which the original testing procedure relied on. In collaboration with Microsoft, the author therefore constructed a JavaScript DOM-based test harness. The test results earned the Microsoft parser a "pretty good" rating, placing it in the top 25% for conformance. They did, however, reveal a serious flaw in DTD handling and validation, for which Brownell presents a workaround.

In my earlier conformance review for Java XML Processors, I evaluated a dozen XML processors written in Java and using the SAX API. Feedback I got from that article was generally positive, and several readers suggested I provide a corresponding evaluation of another widely available XML processor: the Microsoft XML Parser (MSXML.DLL), which is the one bundled with the Internet Explorer 5 web browser. This article provides such an evaluation.

Some readers were also confused about Microsoft's Java XML processor, called "MSXML" in that earlier review. Briefly, Microsoft has had several implementations of XML processor technology. While today one tends to only hear about the latest version of such technologies, they have all been called "MSXML," or "MS XML," in common usage, by numerous people, including some Microsoft staff. Since the Java processor hasn't been updated in well over a year, some confusion seems inevitable. The Java processor was formally called the Microsoft XML Parser for Java. I hope that helps to clarify the distinctions between the various packages; the details of the two reviews should also help.

The version of the Microsoft XML (MSXML) processor reviewed here is the one that has been bundled with Microsoft's Internet Explorer 5.0 web browser. It can be accessed as "MSXML.DLL," and can be redistributed with other software, as part of Win32 applications. Since it provides a COM API, it can be used from JavaScript, C/C++, Visual Basic, and other COM-aware programming languages. It can even be used from Java, but for most Java developers, that support is not particularly useful since it requires using Microsoft's JVM, and does not support the standard SAX or W3C DOM APIs (org.w3c.dom.*).

Another Test Harness for JavaScript, DOM, and MSXML.DLL

I encourage you to read my earlier article for more background on testing XML conformance. Briefly, there are several kinds of tests, supported by test cases (not yet in final form) that were collected and organized by a joint OASIS/NIST working group. These tests need to be run through a test harness, which uses some particular API to access the XML processor under test. The earlier review used SAX as that API, but that would not work for the MSXML.DLL processor, so a new harness was needed. The harness produces a testing report. This article includes the raw test reports, which are in an HTML format that should be easy to use.
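To make the "several kinds of tests" concrete, here is a minimal sketch of how a harness might decide whether one test case passed. It is not the harness distributed with this article. It assumes the test type labels used in the OASIS/NIST suite ("valid", "invalid", "not-wf", "error"), and it simplifies "accepted" to mean that the processor reported no error. The function names `expectedToAccept` and `judge` are hypothetical.

```javascript
// Returns true if a conforming processor should accept the document,
// false if it should report an error, or null if either is allowed.
function expectedToAccept(testType, validating) {
  switch (testType) {
    case "valid":
      return true;             // well-formed and valid: always accepted
    case "invalid":
      return !validating;      // well-formed, but breaks a validity constraint
    case "not-wf":
      return false;            // not well-formed: always an error
    default:
      return null;             // e.g. "error" cases: reporting is optional
  }
}

// Compares what the processor did with what was expected.
function judge(testType, validating, accepted) {
  var expected = expectedToAccept(testType, validating);
  if (expected === null) {
    return "informational";
  }
  return expected === accepted ? "pass" : "fail";
}
```

Under these assumptions, a non-validating processor that rejects an "invalid" document is scored as a failure, because it is checking validity constraints it was not asked to check.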

I was pleased to receive queries from Chris Lovett, a Program Manager in the XML Group at Microsoft, about those test cases. After some email back and forth, I had a basic JScript test harness in my mailbox, which was good, since I usually stick to Java, and it's always a lot easier to improve something that already works! That version has been substantially enhanced, and you can see the reports it now generates in the review below, or run the tests yourself and see what turns up on your own system.

As before, that test harness is provided here as an Open Source tool for general use. In this case, I've put it under the GNU General Public License. I hope the various DOM portability issues will get resolved so that the same code can be used with the XML processors in Mozilla (in some beta version soon) and in Internet Explorer.

Also as before, I'd like to emphasize that these reports are in no way official. They don't represent anyone's opinion but my own.

You may recall comments in the earlier review about problems using DOM as a standard XML processor API. Those still hold true. This harness had to use Microsoft-proprietary APIs to acquire a DOM Document object, to populate it with the contents of an XML file, and to detect and report parsing errors. I still remain hopeful that those issues, shared by all bindings of DOM, can be fixed in some upcoming version of the DOM API so that applications using DOM can use any vendor's implementation, in the same way that SAX currently provides an OS-independent API.
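For readers who have not seen those Microsoft-specific steps, the following is a minimal sketch of them, not an excerpt from the actual harness. It uses the `Microsoft.XMLDOM` COM object with its `async`, `validateOnParse`, `load`, and `parseError` members. The function names and the optional `createDocument` parameter are hypothetical; the parameter exists only so the logic can be exercised where MSXML is not installed.

```javascript
// Acquires a DOM Document through COM. This only works in Windows script
// hosts (Internet Explorer, Windows Script Host) that define ActiveXObject.
function createMsxmlDocument() {
  return new ActiveXObject("Microsoft.XMLDOM");
}

// Loads the document at `url` and reports what the processor did.
// `createDocument` is a hypothetical hook for substituting a stand-in object.
function parseWithMsxml(url, validating, createDocument) {
  var doc = (createDocument || createMsxmlDocument)();
  doc.async = false;                 // make load() block until parsing ends
  doc.validateOnParse = validating;  // false selects non-validating mode
  var loaded = doc.load(url);        // returns false if parsing failed
  var err = doc.parseError;
  if (loaded && err.errorCode === 0) {
    return { accepted: true, document: doc, diagnostic: null };
  }
  return {
    accepted: false,
    document: null,
    diagnostic: url + ":" + err.line + ":" + err.linepos + ": " + err.reason
  };
}
```

None of these three steps (creating the document, loading it, and reading the error) is covered by the W3C DOM API itself, which is why code like this cannot simply be pointed at another vendor's implementation.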

Conformance of Today's MSXML.DLL

In order to ensure that these results can be accurately compared against those in the earlier review, I did two things:

  1. As noted above, the testing report format is the same; it uses almost the same template, though some updates were needed. Since that template was (X)HTML, there was basically no problem here.
  2. The same patched version of the OASIS/NIST XML test database was used. This was done even though issues have turned up with some of the individual tests. (Eight in total, these cases did not particularly affect the testing results.)

Note that the source code distributed with the earlier review describes how the July version of that test database needed to be patched.

This table provides a quick reference to the results of the testing:

Processor Name and Version: MSXML.DLL (non-validating), 5.00.2314.1000
Passed Tests: 931
Rating (Out of 5): [rating graphic]
Summary: Overall this processor is above average, though some of its problems have a broad impact. In addition to a variety of problems which should be readily fixed, it (wrongly) tests validity constraints in many cases.

Processor Name and Version: MSXML.DLL (default mode), 5.00.2314.1000
Passed Tests: 895
Rating (Out of 5): [rating graphic]
Summary: Since it accepts documents as "valid" that don't even have a DTD, all applications need to apply a workaround.

More detailed analysis of each processor mode can be found in the following sections, with links to the complete testing reports.
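As an illustration of the kind of workaround the default-mode summary calls for, an application could refuse to treat a document as valid unless it actually declares a document type. The sketch below is my assumption about what such a check could look like, and is not necessarily the workaround presented later in this article. The name `isReallyValid` is hypothetical; the check relies only on the `parseError` and `doctype` properties of a document that has already been loaded with validation turned on.

```javascript
// Returns true only when the load succeeded AND the document has a
// document type declaration. A document without one cannot be valid,
// whatever the processor reported.
function isReallyValid(doc) {
  if (doc.parseError.errorCode !== 0) {
    return false;                    // the processor itself reported an error
  }
  // doctype is null when the document has no DOCTYPE declaration
  return doc.doctype !== null && doc.doctype !== undefined;
}
```

An application would call this right after loading and treat a `false` result the same way it treats any other validity error.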
