
Article:
 Non-Extractive Parsing for XML
Subject: MPWXmlKit has done this for over 5 years
Date: 2004-05-22 13:34:36
From: mweiher

Along with a very fast low-level scanner and special optimizations for CDATA sections, this approach allowed a parser with a fairly high-level interface (all data provided as Objective-C objects) to compare favorably in speed with the fastest plain C parsers.


The primary reason I wanted this was to be able to do partial processing: dealing only with certain sections and effectively and correctly ignoring the rest, with both very high speed and bit-by-bit fidelity.
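
To illustrate the idea (this is not MPWXmlKit code, just a rough Python sketch with hypothetical names `Span`, `scan` and `text_of`): the scanner records only an offset and a length for each piece of the undecoded buffer, and bytes are decoded only for the sections the caller asks for. The toy scanner assumes no `>` inside attribute values and ignores comments, CDATA and processing instructions.

```python
from typing import Iterator, List, NamedTuple


class Span(NamedTuple):
    kind: str    # "tag" or "text"
    offset: int  # start position in the original byte buffer
    length: int  # number of bytes covered


def scan(buf: bytes) -> Iterator[Span]:
    """Yield offset/length spans over buf without copying or decoding."""
    pos, n = 0, len(buf)
    while pos < n:
        if buf[pos:pos + 1] == b"<":
            end = buf.find(b">", pos)
            end = n if end == -1 else end + 1
            yield Span("tag", pos, end - pos)
        else:
            end = buf.find(b"<", pos)
            end = n if end == -1 else end
            yield Span("text", pos, end - pos)
        pos = end


def text_of(buf: bytes, spans: List[Span], name: bytes) -> List[str]:
    """Decode only the text found inside <name>...</name>; skip the rest."""
    open_tag, close_tag = b"<" + name + b">", b"</" + name + b">"
    inside, out = False, []
    for s in spans:
        if s.kind == "tag":
            # Compare in place against the buffer; no slice is created.
            if s.length == len(open_tag) and buf.startswith(open_tag, s.offset):
                inside = True
            elif s.length == len(close_tag) and buf.startswith(close_tag, s.offset):
                inside = False
        elif inside:
            # Assumption for this sketch: the document is UTF-8.
            out.append(buf[s.offset:s.offset + s.length].decode("utf-8"))
    return out


doc = b"<a><skip>x &amp; y</skip><keep>hello</keep></a>"
spans = list(scan(doc))

# Bit-by-bit fidelity: the spans tile the buffer exactly.
assert b"".join(doc[s.offset:s.offset + s.length] for s in spans) == doc
assert text_of(doc, spans, b"keep") == ["hello"]
```

The ignored `<skip>` section is never decoded or unescaped, so writing the spans back out reproduces the input byte for byte.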


The encoding issues can be dealt with very elegantly, the same way the NSString class cluster deals with them: specific subclasses handle different encodings (or encoding styles) and remember what their encoding is. Strings with compatible or identical encodings can just ignore the issue; incompatible strings need to go through a canonical encoding (Unicode).
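
A rough Python sketch of that comparison rule (not how NSString or MPWXmlKit implement it; `EncodedText` and `same_as` are names made up for illustration): each string keeps its raw bytes plus the name of its encoding, compares bytes directly when the encodings match, and falls back to Unicode only when they differ.

```python
import codecs
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodedText:
    raw: bytes      # undecoded bytes, exactly as found in the document
    encoding: str   # encoding the bytes are in, e.g. "utf-8"

    def _codec(self) -> str:
        # Normalize aliases such as "UTF8" and "utf-8" to one codec name.
        return codecs.lookup(self.encoding).name

    def same_as(self, other: "EncodedText") -> bool:
        if self._codec() == other._codec():
            # Identical encodings: compare bytes, no decoding needed.
            return self.raw == other.raw
        # Incompatible encodings: go through Unicode as the canonical form.
        return self.raw.decode(self.encoding) == other.raw.decode(other.encoding)


a = EncodedText("caf\u00e9".encode("utf-8"), "utf-8")
b = EncodedText("caf\u00e9".encode("latin-1"), "latin-1")
c = EncodedText("caf\u00e9".encode("utf-8"), "UTF8")

assert a.same_as(c)      # same encoding, byte comparison only
assert a.same_as(b)      # different encodings, compared via Unicode
assert a.raw != b.raw    # the underlying bytes really do differ
```

This sketch only treats identical codecs as compatible; a fuller version could also take the fast path for, say, pure-ASCII content in ASCII-compatible encodings.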


MPWXmlKit can be downloaded from www.metaobject.com.


Keep in mind, though, that this is mostly an internal parser built to deal with specialized requirements. It doesn't have all the features today's parsers have, because I have no need for them.


Marcel


  • MPWXmlKit has done this for over 5 years
    2004-05-22 15:26:38 jimmy_z

    Thanks for the posting.


    With the offset/length approach, one usually loads the entire document into memory undecoded and doesn't use a byte stream to access it.


    I hope the offset/length approach can complement DOM/SAX in some situations.




