XML.com: XML From the Inside Out


XML Parser Benchmarks: Part 1

StAX Parser Benchmark

Many recent Java XML applications (for example, Apache's AXIOM) let you choose which StAX implementation they run on. Since a handful of StAX implementations are already available, we compared their reading performance in the following benchmarks.
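To make "reading performance" concrete, the following is a minimal sketch of how such a measurement could look. It is not the harness used for the figures in this article: the class name, the tiny sample document, and the number of passes are assumptions chosen for illustration, and only the standard `javax.xml.stream` API is used.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical class name; a sketch, not the benchmark code behind the figures.
public class StaxReadBenchmark {

    /** Pulls every event from the document and returns how many were read. */
    public static int countEvents(XMLInputFactory factory, byte[] xml)
            throws XMLStreamException {
        XMLStreamReader reader =
                factory.createXMLStreamReader(new ByteArrayInputStream(xml));
        int events = 0;
        try {
            while (reader.hasNext()) {
                reader.next();
                events++;
            }
        } finally {
            reader.close();
        }
        return events;
    }

    /** Returns the reading throughput in MB/s over the given number of timed passes. */
    public static double throughputMBps(XMLInputFactory factory, byte[] xml, int passes)
            throws XMLStreamException {
        // Untimed warm-up passes so that JIT compilation does not dominate the result.
        for (int i = 0; i < passes; i++) {
            countEvents(factory, xml);
        }
        long start = System.nanoTime();
        for (int i = 0; i < passes; i++) {
            countEvents(factory, xml);
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        double megabytes = (xml.length * (double) passes) / (1024.0 * 1024.0);
        return megabytes / seconds;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Assumed sample input; a real run would load documents of several sizes.
        byte[] xml = "<order><item id=\"1\">book</item><item id=\"2\">pen</item></order>"
                .getBytes(StandardCharsets.UTF_8);
        XMLInputFactory factory = XMLInputFactory.newInstance();
        System.out.printf("%s: %.2f MB/s%n",
                factory.getClass().getName(), throughputMBps(factory, xml, 10000));
    }
}
```

Parsing from an in-memory byte array keeps disk I/O out of the measurement, so the number reflects the parser rather than the file system.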


Figure 3: Benchmark results for the StAX parsers and small documents

Figure 4: Benchmark results for the StAX parsers and medium-sized documents

Figure 5: Benchmark results for the StAX parsers and large documents

Figures 3-5 show the benchmarks of the five different StAX implementations. In all but the last benchmark, the Javolution and Woodstox parsers deliver the best results. Sun's SJSXP lags behind for small documents but outperforms all other parsers on the very large 4 MB XML file. The BEA implementation is slightly better than the SJSXP for small documents, but for XML files bigger than 10 KB it is overtaken by the SJSXP. Oracle's StAX implementation ranks last on the two biggest documents, where it performs on par with the BEA implementation.

Conclusions

The benchmark results show big performance differences between the parser implementations. Overall, the SAX-like implementation of LIBXML2 in C performs best in all benchmarks. For most document sizes, its throughput was between one-third higher and twice as high as that of its competitors. This is interesting because, as we will see in the next part of this series, the LIBXML2 DOM implementation in C uses this parser to read in data and therefore already has a performance advantage over the other object model parsers in Java. A definite drawback of this parser is the complexity of its interface. Its callback interface requires handling void pointers and double pointers, which is a far cry from the rather intuitive Java StAX interfaces.
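As an illustration of why the StAX interfaces feel more direct than a callback interface, here is a small sketch of pull-style reading. The class name, method name, and the `item` element are assumptions made up for this example. The application drives the loop itself and keeps its state in ordinary local variables instead of passing it through a user-data pointer.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class; element names are illustrative only.
public class StaxPullExample {

    /** Collects the text content of every <item> element in the document. */
    public static List<String> itemTexts(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<String> texts = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                // The caller asks for the next event; nothing is pushed at it.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    // getElementText() reads text-only content and leaves the
                    // reader positioned on the matching end tag.
                    texts.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return texts;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(itemTexts("<order><item>book</item><item>pen</item></order>"));
    }
}
```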

Javolution and Woodstox are the winners among the StAX parsers. Woodstox has the advantage of being a JSR 173-conforming StAX parser, which makes it usable in more applications.
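The practical benefit of JSR 173 conformance is that application code can stay on the standard `javax.xml.stream` factory while the implementation behind it is swapped. The sketch below only reports which implementation the standard lookup resolves to. The class name is hypothetical, and the Woodstox factory class mentioned in the comment is an assumption that should be checked against the Woodstox version in use.

```java
import javax.xml.stream.XMLInputFactory;

// Hypothetical helper class; a sketch of the standard factory lookup.
public class StaxImplementationLookup {

    /** Returns the class name of the StAX input factory the standard lookup selects. */
    public static String implementationName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // The lookup honors the system property "javax.xml.stream.XMLInputFactory".
        // Assuming Woodstox is on the classpath, a run such as
        //   java -Djavax.xml.stream.XMLInputFactory=com.ctc.wstx.stax.WstxInputFactory StaxImplementationLookup
        // would be expected to select it (the class name is an assumption here).
        System.out.println(implementationName());
    }
}
```

A parser that does not implement the standard interfaces cannot be selected this way, so code has to be written against its own API.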

In the next part of this series we will look at the results of the object model parser benchmarks and see whether any Java parser can beat the performance of the LIBXML2 object model parser in C. This will lead to our final conclusion about which XML parser to use for our high-performance web service gateway.

Additional Resources