XML is Case-Sensitive!

This is a real change from what we've all become used to with HTML and SGML (well, SGML can be set up to be case-sensitive, but hardly anybody actually does that). Most people who've worked with either SGML or HTML find XML's case-sensitivity unnerving at first, but you get used to it quickly, just as we've all become used to case-sensitivity in our programming languages and file names.
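To see what this means in practice, here is a minimal sketch using the XML parser in Python's standard library. The element names `notes`, `Note` and `note`, and the helper `is_well_formed`, are made up for illustration; they do not come from the spec.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# <Note> and <note> are two different element types to an XML parser.
doc = ET.fromstring("<notes><Note>first</Note><note>second</note></notes>")
upper_matches = [e.text for e in doc.findall("Note")]
lower_matches = [e.text for e in doc.findall("note")]
print(upper_matches)  # only the element spelled "Note"
print(lower_matches)  # only the element spelled "note"

# A start-tag and end-tag that differ only in case do not match each other.
print(is_well_formed("<note>text</note>"))
print(is_well_formed("<note>text</Note>"))
```

Under these assumptions, each lookup finds exactly one element, and the document whose tags differ only in case is rejected outright rather than quietly repaired.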

The reason for the case-sensitivity is simple: internationalization. English is one of the few languages in the world where it's easy and straightforward to map upper- and lower-case letters onto each other. The majority of the world's population uses languages that don't even have case, and don't see why one class of characters should arbitrarily be regarded as the same as another.

But once you've left the safe bounds of English, you don't have to go all the way to Asia to get in trouble with case-folding. What is the upper-case version of the character "é"? It turns out the answer is sometimes different depending on whether you're in Québec or France. Then there are the problems with the German "ß" and the dotted and dotless forms of the Turkish "i", and in general... well, you want to stay away from case-folding. So XML does.
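A couple of these traps can be reproduced in a few lines. The sketch below uses Python's built-in string methods, which apply the default Unicode case mappings and ignore locale; the helper name `survives_round_trip` is hypothetical, chosen just for this illustration. It does not attempt the "é" case, since that one depends on local convention rather than on the character tables.

```python
def survives_round_trip(text):
    """True if upper-casing and then lower-casing gives back the original."""
    return text.upper().lower() == text


DOTLESS_I = "\u0131"         # Turkish lower-case dotless i
DOTTED_CAPITAL_I = "\u0130"  # Turkish upper-case dotted I

print(survives_round_trip("note"))       # True: plain ASCII is the easy case
print("ß".upper())                       # SS: one letter becomes two
print(survives_round_trip("ß"))          # False: comes back as "ss"
print(DOTLESS_I.upper())                 # I: same capital as ordinary "i"
print(survives_round_trip(DOTLESS_I))    # False: comes back as dotted "i"
print(len(DOTTED_CAPITAL_I.lower()))     # 2: "i" plus a combining dot above
```

A Turkish-aware mapping would give different answers again for the two forms of "i", which is the sort of locale dependence that makes case-folding a poor basis for comparing names.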


Copyright © 1998, Tim Bray. All rights reserved.