Introduction
The hypertext markup language is an SGML format. The result of that design decision is something of a collision between the World Wide Web development community and the SGML community--between the quick-and-dirty software community and the formal ISO standards community. It also creates a collision between the interactive, online hypermedia technology and the bulk, batch print publications technology.
XML: Principles, Tools, and Techniques heralds the appearance of Extensible Markup Language (XML), a simple, powerful subset of Standard Generalized Markup Language (SGML). In this issue of the World Wide Web Journal you'll find the complete technical specifications, primers, implementation case studies, applications, and even historical and philosophical reflections on the emerging role of XML.
When Tim Berners-Lee designed HTML, he chose to base it on SGML because SGML was an open, extensible technology that facilitated sound information management techniques. He knew extensibility was crucial because data formats on the Web would have to evolve as the system grew and changed. And openness was critical because he didn't want any single company or group to be able to prevent good ideas from other places from taking off.
Anyone who is familiar with HTML knows that its evolution has been far from graceful. What Tim didn't realize was that SGML was so complex and obscure that developers would guess what the standard said rather than looking it up--and they wouldn't always guess right.
The result is that our hard-won interoperability is based not on an open specification, but on the costly, primitive black art of reverse-engineering.
Another result is that today's Web software does not benefit from the extensibility of SGML: the market is full of specialized tools and applications that add tags to HTML for specific purposes--but the Web infrastructure as a whole does not accommodate these extensions.
XML is destined to remedy all that. It is a clean, simple dialect of SGML that developers can understand and implement consistently, and it provides extensibility--room to grow--beyond the centrally maintained set of HTML tags.
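As a rough illustration of that extensibility, the sketch below reads a small document whose tags were invented by its author rather than drawn from HTML. The element names (recipe, title, ingredient), the attribute names, and the helper list_ingredients are all hypothetical, chosen only for this example; the parser is the general-purpose one in Python's standard library and knows nothing about recipes.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: these tag and attribute names are invented for
# illustration and belong to no published vocabulary.
DOCUMENT = """\
<recipe>
  <title>Flatbread</title>
  <ingredient quantity="2" unit="cup">flour</ingredient>
  <ingredient quantity="1" unit="cup">water</ingredient>
</recipe>
"""


def list_ingredients(xml_text):
    """Return (name, quantity, unit) tuples for every <ingredient> element."""
    # The generic parser accepts any well-formed document, whatever its tags.
    root = ET.fromstring(xml_text)
    return [
        (item.text, item.get("quantity"), item.get("unit"))
        for item in root.iter("ingredient")
    ]


if __name__ == "__main__":
    for name, quantity, unit in list_ingredients(DOCUMENT):
        print(quantity, unit, name)
```

The point of the sketch is that no change to the parser was needed to handle the new tags; only the small function that interprets them is specific to this made-up vocabulary.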
"Make the easy things easy and the hard things possible," is an established maxim in computer language design. HTML is successful in making the easy things easy, but if you have something that can't be done with the existing idioms in HTML, you might have to develop a completely new browser from scratch. Component technologies like plug-ins and Java are lowering the bar somewhat, true. But with the deployment of XML and stylesheets, the option of developing your own look and even your own document structures makes it easy enough to appeal to just about any information provider who is willing to tinker around.
Beyond the engineering details of new Web technology, you can look to the Web Journal to provide context about the community behind the ideas and the broader role of standards bodies and industrial developments. In the first section of this book, we present a round table interview with members of the W3C design committees behind XML and a lively first-person argument from David Siegel on the virtues of structured information. In "W3C Reports," you'll also hear about Mathematical Markup Language, a flagship XML application, as well as the Document Object Model's hooks for client-side scripting to animate those elements.
In "Technical Papers," several contributors posit how XML is a natural complement to Java and Web automation technologies. In the same way that URLs are shared technology for pointing to and accessing resources in distributed applications, XML provides an interchange format with extremely broad appeal. Each time a common programming task is institutionalized as shared technology, it lowers the cost of applications development and increases efficiency and innovation.
This is not to say that HTML will fade away: not everyone wants to develop a new document structure, stylesheet, or Java applet just to put up a Web page. HTML will always be there making the easy things easy. But with XML, if you want to go beyond the boundaries of HTML, it will be straightforward to do just that.
We'd like to thank those who made it possible for so many in the Web community--ourselves included--to study SGML and structured documents via the Internet by releasing software and documentation, maintaining FTP archives and hypertext bibliographies, and answering countless questions in USENET forums: Robin Cover, Erik Naggum, Dave Hollander, Conleth O'Connell, Darrell Raymond, Eliot Kimber, Lou Burnard, C.M. Sperberg-McQueen, and James Clark. And of course, a tip o' the hat to the dedicated O'Reilly staff for riding this bronco.
Dan Connolly, Austin, TX
Rohit Khare, Ellicott City, MD
September 19, 1997
XML.com Copyright © 1998-2006 O'Reilly Media, Inc.