Using the namespace prefixes instead of proper namespace processing is dirty. Getting a namespace-aware XML parser is not that hard. Please don't break namespaces by using prefix-based guessing.
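To make the point concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree, which is namespace-aware. The two feed snippets and the creators() helper are made up for illustration; the Dublin Core namespace URI is the real one. The same element is found under either prefix because the lookup goes by namespace URI, not by whatever prefix the feed author happened to pick.

```python
import xml.etree.ElementTree as ET

# The real Dublin Core namespace URI; the prefix bound to it is arbitrary.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Two hypothetical feeds that bind the same namespace to different prefixes.
FEED_A = (
    '<rss xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><item>'
    "<dc:creator>Alice</dc:creator>"
    "</item></channel></rss>"
)
FEED_B = (
    '<rss xmlns:dublin="http://purl.org/dc/elements/1.1/"><channel><item>'
    "<dublin:creator>Bob</dublin:creator>"
    "</item></channel></rss>"
)


def creators(feed_text):
    """Return the text of every Dublin Core creator element in the feed."""
    root = ET.fromstring(feed_text)
    # ElementTree expands prefixed names to "{namespace-uri}localname",
    # so the search never looks at the prefix at all.
    return [el.text for el in root.iter("{%s}creator" % DC_NS)]


print(creators(FEED_A))
print(creators(FEED_B))
```

A prefix-based match on the literal string "dc:creator" would find Alice and miss Bob, even though both feeds say exactly the same thing.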
Even more worrying is the last sentence: "Next month we'll tackle the thorny problem of how to handle RSS feeds that are almost, but not quite, well-formed XML."
What's there to tackle? The only correct way to handle ill-formed XML is to firmly reject it. Please enforce the XML well-formedness requirements in order to protect XML from degenerating into tag soup.
Oh my freakin' god, XML is not a sacred standard. Shit happens, and the official W3C docs encourage browser/user-agent creators to attempt to properly render imperfect HTML/XHTML.
The spirit of the RFCs and protocols that has made the internet work(able) is:
"Be liberal in what you accept and conservative in what you send" and its variations by Jon Postel.
Also, TOG et al would probably assail a system that was so anal and rigid and non-resilient (and they would say lazy) that it couldn't route around some minor formatting transgressions and give the user 50%, 80% or whatever percent of the feed that it could decipher.
But this brings up the elephant that no one is allowed to talk about: XML and its main parsers are extremely brittle and complex.
Jeez... I just re-read your posts and found more garbage.
"The spirit of the RFCs and protocols that has made the internet work(able) is: "Be liberal in what you accept and conservative in what you send" and its variations by Jon Postel."
Wow. You completely misinterpreted that one. Being liberal in what you accept does not mean you have to accept corrupt data. It simply means you have to fail gracefully. Trying to interpret the meaning of corrupted XML documents is dangerously stupid. Instead of just letting both the consumer and producer know that something is wrong, you've risked propagating a bug or possibly corrupting the meaning of data.
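As a rough sketch of what "fail gracefully" could look like, assuming Python's standard xml.etree.ElementTree parser: the parse_feed() helper and its message format are invented for illustration. It refuses to guess at the content of a broken document, but it does say where the document broke, so the consumer and the producer both learn that something is wrong.

```python
import xml.etree.ElementTree as ET


def parse_feed(text):
    """Return (root, None) on success, or (None, message) if the
    document is not well-formed. Nothing is salvaged or guessed."""
    try:
        return ET.fromstring(text), None
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple.
        line, column = err.position
        message = "not well-formed XML at line %d, column %d: %s" % (
            line, column, err)
        return None, message


# Hypothetical inputs: one well-formed, one with a missing </title>.
good = "<rss><channel><title>ok</title></channel></rss>"
bad = "<rss><channel><title>broken</channel></rss>"

for doc in (good, bad):
    root, error = parse_feed(doc)
    print(error if error else "parsed <%s>" % root.tag)
```

Whether the caller then shows the message to the user, logs it, or mails it to the feed's owner is a separate design choice; the point is only that rejection and a useful error report are not mutually exclusive.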
I wish more people would write sites like this that are actually helpful to read. With all the garbage floating around on the net, it is refreshing to read a site like yours instead.
Any XML that is improperly syntacizated should be rejected. If we are to maintain the purity of XML, such transgressions should never be permitted. Indeed, the offenders of The Standard should be punished, if not purged. Any lesser response would encourage the dilution of The Master Protocol.
In fact, there should be a web application that immediately punishes, through, say, electrical shock, any deviations in XML format in a document and, if the non-conforming coder insists on publishing such to the web -- then, well, there is no saving the author and he/she should be abended and the document DES-wiped, before he can do any further damage to the structure of the internet.
Since people may not get the "TOG" reference:
TOG is a usability expert who puts much more responsibility on system creators for making systems that "Just Work++" than many programmers would like. After reading too much of his stuff, you start thinking, "Wow, programs should do a much better job for the user in many cases." http://www.asktog.com/Bughouse/10MostPersistentBugs.html
You've said so many things that are simply wrong or demonstrate that you don't know what you're talking about that I'm not sure where to begin.
XML was designed to be strictly parsed. It is a *requirement* that all XML documents be well-formed. This is part of the standard, look it up. Any program that parses malformed XML documents is broken. That's a bug, not a feature. All conforming XML parsers MUST reject a corrupted file. If a TCP/IP packet is corrupted, does your router try to fix it? No, it drops it and leaves the sender to resend. XML is a format predominantly for machine intercommunication, it is NOT LIKE HTML. It is unacceptable for an XML document to be improperly formed. What the hell is the receiving computer supposed to do with it? Guess what was intended? The problem is, you don't understand that (chorus) XML IS NOT LIKE HTML. XML marks the *meaning* of data, so how in hell do you intend to deal with corruption? Allowing malformed XML documents would be the equivalent of lossy-compressing all your business documents, that is, colossally bone-headed.
HTML can be loosely interpreted, because it's just formatting. XML must be strictly interpreted because it's information.