Encodings: Details for Programmers

If you're not a computer programmer, you can skip the next few paragraphs, down to the one that begins "Like any self-labeling system..." For non-programmers, the message is this: even if the normal mechanisms for signalling character encoding break down, XML gives you an excellent chance of figuring out what's going on, and thus of interchanging documents, using the techniques described (elegantly, by my co-editor Michael Sperberg-McQueen) in the following section.

If you are a computer programmer, these techniques are actually pretty easy to implement. The really disgusting code is the part that picks apart UTF-8, but that's documented fairly well in the Unicode standard, and the CD-ROM comes with example code in C.
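To give a feel for how little code is involved, here is a minimal sketch in C of both pieces: guessing the encoding family from the first few bytes of an XML entity, and picking apart a single UTF-8 sequence. It is not the sample code from the Unicode CD-ROM. The names `enc_guess`, `sniff_encoding` and `utf8_decode` are hypothetical, the byte signatures are the commonly cited ones for a document that starts with `<?xml` or a byte-order mark, and the decoder assumes the modern definition of UTF-8 (at most four bytes, no overlong forms, no surrogates).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
typedef enum {
    ENC_UCS4_BE,
    ENC_UCS4_LE,
    ENC_UTF16_BE,
    ENC_UTF16_LE,
    ENC_ASCII_COMPATIBLE, /* UTF-8, ISO 8859-x, ...: go read the encoding declaration */
    ENC_EBCDIC,
    ENC_ASSUME_UTF8       /* no recognizable signature */
} enc_guess;

/* Guess the encoding family from the first bytes of an XML entity. */
enc_guess sniff_encoding(const unsigned char *b, size_t n)
{
    if (n >= 4) {
        /* "<" as a four-byte UCS-4 character */
        if (b[0] == 0x00 && b[1] == 0x00 && b[2] == 0x00 && b[3] == 0x3C) return ENC_UCS4_BE;
        if (b[0] == 0x3C && b[1] == 0x00 && b[2] == 0x00 && b[3] == 0x00) return ENC_UCS4_LE;
    }
    if (n >= 2) {
        /* UTF-16 byte-order mark */
        if (b[0] == 0xFE && b[1] == 0xFF) return ENC_UTF16_BE;
        if (b[0] == 0xFF && b[1] == 0xFE) return ENC_UTF16_LE;
    }
    if (n >= 4) {
        /* "<?" in UTF-16 without a byte-order mark */
        if (b[0] == 0x00 && b[1] == 0x3C && b[2] == 0x00 && b[3] == 0x3F) return ENC_UTF16_BE;
        if (b[0] == 0x3C && b[1] == 0x00 && b[2] == 0x3F && b[3] == 0x00) return ENC_UTF16_LE;
        /* "<?xm" in an ASCII-compatible encoding */
        if (b[0] == 0x3C && b[1] == 0x3F && b[2] == 0x78 && b[3] == 0x6D) return ENC_ASCII_COMPATIBLE;
        /* "<?xm" in EBCDIC */
        if (b[0] == 0x4C && b[1] == 0x6F && b[2] == 0xA7 && b[3] == 0x94) return ENC_EBCDIC;
    }
    return ENC_ASSUME_UTF8;
}

/* Decode one UTF-8 sequence starting at s, with n bytes available.
   Stores the code point in *cp and returns the number of bytes consumed,
   or 0 if the sequence is malformed or truncated. */
size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t len, i;
    uint32_t c, min;

    if (n == 0) return 0;
    if (s[0] < 0x80) { *cp = s[0]; return 1; }

    /* The lead byte gives the sequence length and the high-order bits. */
    if ((s[0] & 0xE0) == 0xC0)      { len = 2; c = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; c = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; c = s[0] & 0x07; min = 0x10000; }
    else return 0; /* stray continuation byte or invalid lead byte */

    if (n < len) return 0;
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0; /* continuation bytes look like 10xxxxxx */
        c = (c << 6) | (uint32_t)(s[i] & 0x3F);
    }

    /* Reject overlong forms, surrogates, and values beyond U+10FFFF. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return 0;
    *cp = c;
    return len;
}
```

A caller would typically run `sniff_encoding` once on the first four bytes, use the result to read as far as the encoding declaration, and only then commit to a decoder. `utf8_decode` is meant to be called in a loop, advancing by its return value and treating a return of 0 as a fatal encoding error.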


Copyright © 1998, Tim Bray. All rights reserved.