XML and Web Sites

October 30, 2002

John E. Simpson

In the USA this week's calendar includes Halloween: a time when spooks, zombies, and assorted denizens of the realm of the undead are said to walk the earth. It seems an appropriate moment to tackle one of the most gruesome questions recently posted to the XML forum.

How do I build a Web site using XML?

Q: I need help learning how to construct a Web site using tools to allow dynamic content creation. Specifically, I am searching for books, tutorials, how-to-bang-my-head-against-the-wall instructions, that sort of thing. This is what I know: XHTML, some basic XML, and CSS. I understand concepts of programming...

This is what I want to know: how to build a site using open-source (i.e. free) software that allows me to provide dynamic content. This dynamic content would include trivial things such as reporting the weather for a zip code I specify to not-so-trivial things such as allowing a user to change the associated stylesheet so that they could specify font, font-size, background color, and other properties through an interface form and these attributes would be remembered the next time the user visits the site.

I'm also interested in understanding more about using databases to construct content for Web pages.

Technologies that I find fascinating and seem (to me) useful in implementing what I'd like to do are PHP/mySQL, SOAP, XML-RPC, and Perl.

My operating system is Mac OS 10.2. I have webspace that does not allow a command-line interface connection. Is this needed?

A: What a mouthful! Let's see if I can tackle your questions one at a time; and let's hope that you have plenty of time on your hands.

Books and tutorials

A good starting point for this kind of question is this site's own XML Resources section. (Look down the menu at the left side of this page; it's listed under "Guides".) Once there, for instance, follow the link to the page on Tutorials.

A notable recent addition to the list is Simon St. Laurent's "Monastic XML" site. It's not a tutorial in the traditional sense; there are no code fragments, no detailed discussion of Java or the DOM, no demonstrations (annotated or otherwise) of server- or client-side processing. Instead, it's simply "an ascetic view of XML best practices," directed at developers who want to know not the best way to achieve Result Z with XML, but rather (as St. Laurent says) "what markup is best at." This wouldn't be a bad place to drop by if you're already familiar with XML -- especially if you're so excited by the technology that you want to use it for everything.

Excellent tutorials of the more conventional sort, laden with how-to information and examples, include the following:

  • Zvon tutorials: presented by Miloslav Nic (lecturer in the Department of Organic Chemistry Education, ICT Prague) and the Zvon information exchange. A wealth of comprehensive tutorials on XLink, namespaces, RDF, XML Schema, XPath, XSLT, and other important topics.
  • The Web Developer's Virtual Library (WDVL) tutorials: like the Zvon site, above, the WDVL site includes copious lessons on a wide range of XML-related subjects. While there, also check the numerous links to other resources. Don't overlook the resources on some of the other topics you mentioned, like PHP, Perl, databases, and CSS; often, the most pragmatic advice on implementing XML with such tools can be found on sites devoted to the tool itself, not on sites focused on XML as such.
  • The W3Schools XML tutorial: especially recommended for beginners to XML, with plenty of examples.
  • The XML tutorial (by Mike J. Brown): delves into a subject often overlooked in other tutorials -- Unicode, ISO/IEC 10646-1, and the Unicode/UCS character encoding model.

As for books, I'm partial (perhaps for obvious reasons) to O'Reilly's list. Books listed there have sample chapters posted on-line and are also available in full via subscription to the O'Reilly Safari venture. If you're familiar with O'Reilly titles on other topics, you know what to expect from their XML titles: clear writing, no beating around the bush, and a reputation for technical accuracy and thoroughness.

Another publisher to consider is Prentice Hall-PTR, home to the Charles F. Goldfarb "Definitive XML Series." Goldfarb, as you probably know, is generally credited with shepherding XML's predecessor, SGML, into widespread acceptance. Goldfarb has edited the books in the PH-PTR series; you can be sure therefore that their content will be accurate and authoritative.

(Regardless of which book(s) you think you're interested in, always be sure to read not only what the publisher has to say, but also how reviewers and other readers have responded to them. Buying a book on the sheer strength of its title, the author's name, the pictures on the cover, and so on, is a sure prescription for disappointment -- not just your own, but the author's.)

XML software

One of the first (and, I think, still best) sites for general information about software for processing XML is the eponymous site, begun by James Tauber and kept very up-to-date by Tauber and Linda van den Brink. The site is organized according to application type, such as XML parsers, XML editors, database systems, content-management applications, and the like. Since each package is described not only in a brief capsule summary, but also in terms of version, platform, developer, and revision date, finding what you're looking for is simply a matter of using your browser's "search" feature. For instance, while preparing this column I checked the page of information about XML editors; searching on the string "mac" yielded seven Macintosh-compatible editors (as well as a handful of Emacs-based or Emacs-like packages).

Also check Lars Marius Garshol's site for a well-organized list specifically of free XML tools (the listings at Tauber's site include commercial as well as shareware/freeware products). Among the ways in which software is categorized here is by platform -- especially important for users, document authors, and developers occupying the non-Windows/non-*nix portion of the landscape.

XML and databases

By far the best starting point on this subject is Ron Bourret's XML and Databases site. The main page links to specific sections/pages on topics (like "Why use a database?" and "Data versus documents"), but especially to a wonderful sub-site, XML Database Products, which lists (albeit with minimal commentary) a host of these important applications, by category (e.g., XML application servers, XML query engines).

Don't overlook what the big database vendors have to say about XML, either -- in many ways, perhaps surprisingly, they've led the effort to break away from the constraints of proprietary solutions. (Of course they're not wholly altruistic; each has adopted specific strategies for keeping its customers "locked in," while encouraging open and platform-independent data interchange via XML.) IBM and Oracle are the leaders here. Just remember to consult Bourret's XML Database Products, above, for information on other vendors as well.

Providing dynamic content with XML

This is starting to get into the nebulous-buzzword corner of the XML universe. Still, you've supplied some idea of specifically what you're looking for on this subject.

As I mentioned above, the site itself has loads of information -- columns, features, resource guides, and so on -- on just about anything XML-related. You can search the site on "dynamic content" to get an up-to-date list of relevant material. (Note: when I searched on that string just now, I got over 1500 hits. Not all of these are truly full-blown features, however; many are links to pages of simple descriptions of products -- open-source or otherwise -- which then just link to the vendor's own site.)

A different (and maybe forgotten) approach to getting your feet wet here is to check recent (at least reasonably so) issues of technical periodicals. Much of the content for these magazines is available on-line, especially their back issues, and directly addresses the needs of developers. For instance, the April 12, 2001, issue of Network Computing included a brief, clearly written overview of the main products (Microsoft, Oracle, and the Apache Project's open-source Cocoon): "Dishing Up Dynamic Content," by Ahmad Abualsamid. Visual Basic Web Magazine (VBWM) also covered the subject in a brief (but undated) column, "XML Corner: Dynamic content for your applications." I myself have often turned to New Architect (formerly WebTechniques) for this kind of information. Their back-issues search page, as of a few moments ago, turned up 63 articles when fed the string "dynamic AND content AND xml."

Remember one thing when it comes to "providing dynamic content": what makes it dynamic isn't the tool, or set of tools, which you as the developer employ to create the XML document(s) and the code which delivers it. The developer can use a simple text editor for any of that work. No, what makes it dynamic is the way the content is served, and/or the way the content is processed by a so-called "user agent" (like a browser or other client package). Your operating system -- be it Mac OS 10.2, or Windows 2000, or Linux, or whatever, whether it has a command-line interface and so on -- needn't be a consideration when you start worrying about delivering dynamic content. Successfully delivering the content depends on the capabilities of the server you choose and/or those of a client. A command-line interface to the server itself isn't generally necessary.
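To make "dynamic in the way it is served" a little more concrete, here is a minimal sketch in Python, using only its standard library. It is an illustration, not a recommendation of any particular tool: the FAKE_WEATHER table, the element names, and the function names are all hypothetical stand-ins (a real site would consult a database or a Web service). The point is simply that the XML is assembled at request time from whatever zip code the user agent asks about.

```python
from urllib.parse import parse_qs
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a real data source (database, Web service, ...).
FAKE_WEATHER = {"32301": ("Tallahassee", 78), "10001": ("New York", 55)}


def build_weather_xml(zip_code):
    """Assemble a small XML document for one zip code at request time."""
    root = ET.Element("weather", {"zip": zip_code})
    entry = FAKE_WEATHER.get(zip_code)
    if entry is None:
        ET.SubElement(root, "error").text = "unknown zip code"
    else:
        city, temp_f = entry
        ET.SubElement(root, "city").text = city
        ET.SubElement(root, "temperature", {"unit": "F"}).text = str(temp_f)
    return ET.tostring(root, encoding="unicode")


def app(environ, start_response):
    """WSGI application: the response body depends on the query string."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    zip_code = query.get("zip", [""])[0]
    body = build_weather_xml(zip_code).encode("utf-8")
    headers = [
        ("Content-Type", "application/xml; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the app function could be handed to the standard library's wsgiref.simple_server.make_server and queried with a URL ending in ?zip=32301. Note that nothing here depends on the developer's own operating system, and no command-line access to the hosting server is assumed beyond whatever mechanism the host offers for running server-side code.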