
Nobody Asked Me, But...
I've been writing the "XML Q&A" column for a year now. An anniversary seems a good occasion to think about the questions I wish I'd been able to answer during that time -- questions no one ever asks.
Let's start with an easy one.
Q: Where do I post a question for consideration in "XML Q&A"?
A: All questions I answer here are posted to O'Reilly Network's XML Forum. (Editor's note: The XML Forum is no longer available. We maintain this content for the sense of historical interest.) You have to register to participate in the forum (and yes, posting questions is considered "participating"). The registration is free, though. If you just want to read messages and replies, no registration is required.
It's not exactly that simple. As of this writing, the forum includes over 1800 posts. I don't try to read all of them. Instead, I scroll through those posted in the previous three or four weeks, looking for message subjects which are intriguingly worded or seem to cover areas I haven't covered before. Another criterion is whether someone else on the Forum has already adequately answered the question. (Forum participants don't have to wait for me to add my two cents to the discussion; they can jump in and reply to any message they want.)
I shy away from some subjects: questions about processing XML with Java, Python, COBOL, Perl, and so on -- programming languages and APIs either outside my area of expertise or, heck, just outside of XML itself; questions about obscure XML vocabularies; questions in languages other than English; questions posted to other venues, like the XML-L, XML-DEV, or XSL-List mailing lists; and questions about server configuration.
That leaves plenty of room for things to wonder about, though, from microscopic nooks and crannies in DTD syntax on up to big hairy issues like W3C-sponsored future directions for XML. And if I don't get around to selecting your question for a given month's list of two or three, don't despair: you can always fall back on the collective wisdom of the Internet, on other participants in the O'Reilly Network XML Forum, or on any of the subject-specific mailing lists.
Now let's ratchet up the complexity a bit.
Q: I'm designing my own simple XML vocabulary, but I don't understand either DTDs or XML Schema. What can I do?
A: This really drives me crazy -- maybe even crazier than it drives you. There's a terrific, often overlooked answer. Forget validation. Stick with well-formedness.
A little background first: XML shares with its SGML parent the notion of validating a document against some formal structure. This formal structure can be specified in the form of a Document Type Definition (DTD) -- or, more recently, an XML Schema. By declaring a formal structure, you can declare which elements may fit inside which other elements, how many times the contained element may occur, what attributes the element may or must have, and so on.
But the XML 1.0 Recommendation also parted company with SGML by introducing the concept of well-formedness. A well-formed XML document is to a valid document as a simple e-mail message is to a spell-checked one. There's nothing at all wrong with the former for many (I'd argue most) purposes. A simple, non-spell-checked e-mail message still follows some important structural rules, especially in the form of its headers, attachments, and so on. Checking the spelling makes the message content more reliable, more rigorous if you will. But it doesn't make the document necessarily any better.
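To make the distinction concrete, here is a minimal sketch in Python (picked purely for illustration; any language with an XML parser would do). It checks two documents for well-formedness with the standard library's `xml.etree.ElementTree` parser, which does not validate against a DTD. The `recipe` vocabulary and the `is_well_formed` helper are invented for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as well-formed XML, False otherwise.

    Hypothetical helper for illustration. No DTD or schema is consulted:
    the parser only checks XML's own syntax rules (a single root element,
    properly nested and closed tags, and so on).
    """
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A made-up vocabulary with no DTD or XML Schema behind it.
good = "<recipe><title>Toast</title><step>Toast the bread.</step></recipe>"

# The same content, except that the <title> element is never closed.
bad = "<recipe><title>Toast<step>Toast the bread.</step></recipe>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

The first document passes even though nothing anywhere declares what a `recipe` may contain; the second fails on syntax alone. What this check cannot tell you is whether a `recipe` is supposed to have a `title` at all. That is the part validation would add.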
I don't mean to downplay reliability and rigor, of course. If your proposed vocabulary involves the movement of money, private or confidential information, state or corporate secrets, and so on, then, yes, you definitely will need to validate your documents at some point.
However, it's not true that you must "get" DTDs or XML Schema in order to "get" XML itself; you don't even need them to design a perfectly functional, elegant vocabulary. You'd never believe it, though, based on a cursory scan of many of the "introductory" XML references and tutorials. The W3C has contributed to the confusion, in a way, by insisting that XHTML documents (for instance) "must" at some point validate against a DTD -- or they're not true XHTML. There are sound reasons for this insistence in a language intended for general-purpose Web use, having to do (for instance) with platform independence and reducing browser bloat. Just don't generalize from XHTML's example to conclude that your own special-purpose vocabularies are somehow illegitimate if they can't be validated because there isn't a DTD or Schema document lying around.
In a slightly different context, The XML FAQ, edited by Peter Flynn, says,
XML allows groups of people or organizations to create their own customized markup applications for exchanging information in their domain (music, chemistry, electronics, hill-walking, finance, surfing, petroleum geology, linguistics, cooking, knitting, stellar cartography, history, engineering, rabbit-keeping, mathematics, genealogy, etc).
Doesn't that sound like a great world -- one in which nearly every imaginable data management and interchange purpose is served by a single markup standard? Unfortunately, the media's tireless emphasis on large-scale, sprawling B-to-B XML applications -- complex (and yes, fully validated) as they must be -- has dimmed the likelihood of such a world ever coming to pass. Here's what I think: The focus on DTDs and XML Schema as the hallmark of so-called real XML has done more to damage XML's widespread use and popularity than all the usual culprits (proliferation of XML-related standards, proprietary extensions, and so on) combined. Maybe that's just me, though.
Now let's move on to one final, less serious (verging on the loopy) question.