
Article:
 Non-Extractive Parsing for XML
Subject: How you deal with encoding
Date: 2004-05-20 10:52:57
From: jimmy_z
Response to: How you deal with encoding

Thanks for posting this question.


One way to deal with character encoding is to build the "intelligence" directly into the various non-extractive string comparison functions.


Most people are used to working with UCS-2 strings in their code, so a "non-extractive" comparison function needs to compare UTF-8 (or UTF-16) tokens in the source document against UCS-2 strings.
In addition, it may also resolve entity references on the fly during the comparison.
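
To make the idea concrete, here is a minimal sketch in Python, assuming the token is described by a byte offset and length into a UTF-8 source buffer. The names token_equals, _next_char and ENTITIES are hypothetical, not any parser's actual API. Python strings are sequences of code points rather than UCS-2 units, so the comparison here is per code point. Only the five predefined XML entities are resolved; numeric character references are left out, and the input is assumed to be well-formed UTF-8.

```python
# Hypothetical sketch: compare a UTF-8 token inside the source buffer
# against an in-memory string without copying the token out.

# The five predefined XML entities, written as the bytes that follow '&'.
ENTITIES = (
    (b"lt;", "<"),
    (b"gt;", ">"),
    (b"amp;", "&"),
    (b"apos;", "'"),
    (b"quot;", '"'),
)


def _next_char(doc, pos, end):
    """Decode one character starting at pos; return (char, next_pos).

    Assumes well-formed UTF-8. A predefined entity reference is resolved
    to the character it stands for; any other '&' is returned literally.
    """
    b0 = doc[pos]
    if b0 == 0x26:  # '&'
        for name, ch in ENTITIES:
            if doc.startswith(name, pos + 1, end):
                return ch, pos + 1 + len(name)
    if b0 < 0x80:  # 1-byte sequence (ASCII)
        n, cp = 1, b0
    elif b0 >> 5 == 0b110:  # 2-byte sequence
        n, cp = 2, b0 & 0x1F
    elif b0 >> 4 == 0b1110:  # 3-byte sequence
        n, cp = 3, b0 & 0x0F
    else:  # 4-byte sequence
        n, cp = 4, b0 & 0x07
    for i in range(1, n):
        cp = (cp << 6) | (doc[pos + i] & 0x3F)
    return chr(cp), pos + n


def token_equals(doc, offset, length, s):
    """Return True if the token doc[offset:offset+length] spells s."""
    pos, end, i = offset, offset + length, 0
    while pos < end:
        ch, pos = _next_char(doc, pos, end)
        if i >= len(s) or s[i] != ch:
            return False
        i += 1
    return i == len(s)


doc = "<a>café &amp; tea</a>".encode("utf-8")
# The text token starts after "<a>" and stops before "</a>".
print(token_equals(doc, 3, len(doc) - 7, "café & tea"))
```

The comparison stops at the first mismatch, so a non-matching token costs only as many bytes as it takes to find the difference.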


The same applies to converting text to numerical data. A non-extractive version of "parseInt" needs to convert a UTF-8 (or UTF-16) token into an integer without "extracting" it out of the source document.
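
A rough sketch of such a conversion, again in Python and again assuming a UTF-8 buffer with the token given as offset and length. The name parse_int_token is hypothetical. It accepts an optional sign followed by ASCII digits only; whitespace handling and overflow checks are left out. For UTF-16 the same loop would read two bytes per code unit instead of one.

```python
# Hypothetical sketch: turn a UTF-8 token into an integer by reading
# its bytes in place, without building an intermediate string.

def parse_int_token(doc, offset, length):
    """Parse doc[offset:offset+length] as a signed decimal integer."""
    pos, end = offset, offset + length
    sign = 1
    if pos < end and doc[pos] in (0x2B, 0x2D):  # '+' or '-'
        if doc[pos] == 0x2D:
            sign = -1
        pos += 1
    if pos >= end:
        raise ValueError("no digits in token")
    value = 0
    while pos < end:
        digit = doc[pos] - 0x30  # distance from ASCII '0'
        if not 0 <= digit <= 9:
            raise ValueError("non-digit byte at offset %d" % pos)
        value = value * 10 + digit
        pos += 1
    return sign * value


doc = b"<qty>-1234</qty>"
# The token "-1234" starts at byte 5 and is 5 bytes long.
print(parse_int_token(doc, 5, 5))
```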


Hope I answered your question.

