XML Parser Benchmarks: Part 1
StAX Parser Benchmark
Many recent Java XML applications (for example Apache's AXIOM) let you choose the StAX implementation they run on. Since a handful of StAX implementations are already available, we compared their reading performance in the following benchmarks.
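As an illustration of how that choice is usually made, the following minimal sketch asks the standard `javax.xml.stream` factory lookup which implementation it resolved to and parses a tiny document with it. The class name `StaxFactoryLookup` and the sample document are made up for this example; they are not part of the benchmark code.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, used only for illustration.
final class StaxFactoryLookup {

    /** Returns the class name of the XMLInputFactory that the standard lookup resolved to. */
    static String resolvedFactoryName() {
        // newInstance() consults the system property "javax.xml.stream.XMLInputFactory" first,
        // then configuration files and service-provider entries, then the platform default.
        return XMLInputFactory.newInstance().getClass().getName();
    }

    /** Parses a document with the resolved implementation and returns the root element name. */
    static String rootElementName(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        try {
            reader.nextTag(); // advance from the start of the document to the root start tag
            return reader.getLocalName();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println("StAX implementation: " + resolvedFactoryName());
        System.out.println("Root element: " + rootElementName("<doc><item/></doc>"));
    }
}
```

With a different implementation on the classpath, the same program can typically be switched over without code changes, for example by starting the JVM with `-Djavax.xml.stream.XMLInputFactory=com.ctc.wstx.stax.WstxInputFactory` to select Woodstox (assuming the Woodstox JAR is available).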

Figure 3: Benchmark results for the StAX parsers and small documents

Figure 4: Benchmark results for the StAX parsers and medium-sized documents

Figure 5: Benchmark results for the StAX parsers and large documents
Figures 3-5 show the benchmark results for the five StAX implementations. In all but the last benchmark, the Javolution and Woodstox parsers achieve the best results. Sun's SJSXP lags behind for small documents but outperforms all other parsers on the very large 4 MB XML file. The BEA implementation is slightly faster than the SJSXP for small documents, but for XML files larger than 10 KB the SJSXP overtakes it. Oracle's StAX implementation ranks last on the two biggest documents, where it performs on par with the BEA implementation.
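For readers who want to reproduce this kind of reading-throughput comparison, here is a rough sketch of a timing loop. It is not the harness used for the figures above: the class name `StaxThroughputSketch`, the synthetic document, and the warm-up and iteration counts are all assumptions chosen for illustration, and a serious measurement would need more care (JIT warm-up, garbage collection, representative documents).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical timing sketch; not the benchmark tool used in the article.
final class StaxThroughputSketch {

    /** Builds a synthetic document with the given number of <item> elements (made-up test data). */
    static byte[] syntheticDocument(int items) {
        StringBuilder sb = new StringBuilder("<items>");
        for (int i = 0; i < items; i++) {
            sb.append("<item id=\"").append(i).append("\">value ").append(i).append("</item>");
        }
        return sb.append("</items>").toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Pulls every event of the document once and returns how many events were read. */
    static long readAllEvents(XMLInputFactory factory, byte[] doc) throws XMLStreamException {
        XMLStreamReader reader = factory.createXMLStreamReader(new ByteArrayInputStream(doc));
        long events = 0;
        try {
            while (reader.hasNext()) {
                reader.next();
                events++;
            }
        } finally {
            reader.close();
        }
        return events;
    }

    /** Returns the reading throughput in megabytes per second over the timed iterations. */
    static double megabytesPerSecond(byte[] doc, int warmup, int iterations)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        for (int i = 0; i < warmup; i++) {
            readAllEvents(factory, doc); // untimed runs so the JIT can compile the hot path
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            readAllEvents(factory, doc);
        }
        long elapsedNanos = Math.max(1, System.nanoTime() - start);
        double megabytes = (double) doc.length * iterations / (1024.0 * 1024.0);
        return megabytes / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) throws XMLStreamException {
        byte[] doc = syntheticDocument(1000);
        System.out.printf("%d bytes, %.1f MB/s%n", doc.length, megabytesPerSecond(doc, 200, 1000));
    }
}
```

Running the same loop once per implementation (selected through the factory lookup) gives numbers that are comparable with each other, though not necessarily with the figures in this article.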
Conclusions
The benchmark results show big performance differences between the parser implementations. Overall, the SAX-like implementation of LIBXML2 in C performs best in all benchmarks. For most document sizes it had one-third to twice as much throughput as its competitors. This is interesting because, as we will see in the next part of this series, the LIBXML2 DOM implementation in C uses this parser to read in data and therefore already has a performance advantage over the other object model parsers in Java. A definite drawback of this parser is the complexity of its interface. Its callback interface requires handling void pointers and double pointers, which differs greatly from the rather intuitive Java StAX interfaces.
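To give a sense of what "intuitive" means on the Java side, the following minimal sketch uses the StAX cursor API to count elements by name. The application pulls events in an ordinary loop instead of registering callbacks. The class name `StaxElementCounter` and the sample document are invented for this illustration.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class; it only illustrates the pull-style StAX cursor API.
final class StaxElementCounter {

    /** Counts how often each element name occurs in the document, sorted by name. */
    static Map<String, Integer> countElements(String xml) throws XMLStreamException {
        Map<String, Integer> counts = new TreeMap<>();
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            // The application drives the parser: it asks for the next event when it is ready.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    counts.merge(reader.getLocalName(), 1, Integer::sum);
                }
            }
        } finally {
            reader.close();
        }
        return counts;
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(countElements("<order><item/><item/><note>hi</note></order>"));
    }
}
```

All parser state lives in local variables of the loop, so there is no need to pass a user-data pointer through callbacks as in a SAX-style C interface.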
Javolution and Woodstox are the winners among the StAX parsers. Woodstox has the advantage of being a JSR 173-conforming StAX parser, which makes it usable in more applications.
In the next part of this series we will look at the results of the object model parser benchmarks and see whether any Java parser can beat the performance of the LIBXML2 object model parser in C. This will lead to our final conclusion on which XML parser to use for our high-performance web service gateway.
Additional Resources
- The StAX specification JSR 173
- Sun's XMLTest XML parser benchmark tool
- xmlbench, an XML parser benchmark tool in C
- Sun's StAX benchmark with XMLTest