Developers Driving XML in Montreal
August 28, 1998, by Liora Alschuler and Lisa Rein
For all you developers and non-developers who need to know where the cutting edge lies, but who couldn't enjoy the friendly ambiance of Montreal in the summer time--well, we couldn't either because we were deep in the innards of Le Centre Sheraton--here is a rundown of the Developers Conference convened by XML WG Chair Jon Bosak and sponsored by the GCA.
Interest in Python and Perl Abounds
In case any doubt remained, XML is now certified hacker-safe and programmer-approved. Instead of debating whether XML was a suitable data format for a given programming language, programmers were debating which language is best for working with XML. Paul Prescod gave an inspired pep rally about Python, an object-oriented scripting language that is becoming popular among XML programmers. An open source development platform with a growing community of developers supporting it, Python is noted for its straightforward syntax and extensibility.
XML.com's Tim Bray reported on Perl and XML, repeating a keynote speech he had recently delivered to 1,500 wildly enthusiastic Perl mongers at the O'Reilly Perl Conference in San Jose. Bray got across the value of XML by pointing out the insanely variable syntax of Linux configuration files. He demonstrated the use of a new XML parser Perl module.
Documents in the Driver's Seat
That developers like XML is not so startling, but the way XML is being used is changing some fundamental assumptions about the relationship between content and software: "documents" (whatever they are) are driving the interface between executable programs, the interface between the executable programs and the user of such, and the behavior of the programs themselves.
In Montreal, evidence of the new document-driven interface paradigm was everywhere:
- RDF is driving Mozilla 5.0 in both appearance and functionality. Mozilla's navigation control is an RDF tree, as are the history file, bookmarks, site maps and the configuration of the browser itself. Toolbars and other features can be customized with an RDF file. For a complete description, see http://www.mozilla.org/rdf/doc/api.html.
- GroveWare, Inc. showed a scheduler where the GUI is driven from an XML document repository.
- UI configuration was one of three possible applications shown for the XML Testbed in Java (aka XXX or eXpandable XML eXploitation, [ed.: his rendering, not ours]). Steve Withall's full testbed is available from XML.com.
- Xtenit showed XML filters in which XML provides content, context and some rules for document filtering. Applications include information routing, refined text retrieval, and agent servers defined by an XML profile.
- InDelv, Inc.'s editor/browser lets users modify tool bars and menus from XML documents, and the .ini file is replaced with XML. According to the developer, namespaces go another step toward blurring the line between program and document because they use XML "like an event stream".
- VEO's use of schemata in program generation.
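The document-driven pattern behind these examples can be sketched in a few lines. This is only an illustration under stated assumptions, not any vendor's actual format: the `ui`, `toolbar`, and `button` element names, their attributes, and the `load_toolbars` helper are all hypothetical. The point is that the program reads its toolbar layout from an XML document rather than hard-coding it.

```python
import xml.etree.ElementTree as ET

# Hypothetical UI description; element and attribute names are invented for this sketch.
UI_DOCUMENT = """
<ui>
  <toolbar name="navigation">
    <button label="Back" command="go-back"/>
    <button label="Forward" command="go-forward"/>
  </toolbar>
  <toolbar name="bookmarks">
    <button label="XML.com" command="open-bookmark"/>
  </toolbar>
</ui>
"""


def load_toolbars(xml_text):
    """Return {toolbar name: [(label, command), ...]} read from the UI document."""
    root = ET.fromstring(xml_text)
    toolbars = {}
    for toolbar in root.findall("toolbar"):
        # Each button contributes one (label, command) pair, in document order.
        buttons = [(b.get("label"), b.get("command")) for b in toolbar.findall("button")]
        toolbars[toolbar.get("name")] = buttons
    return toolbars


if __name__ == "__main__":
    for name, buttons in load_toolbars(UI_DOCUMENT).items():
        print(name, buttons)
```

Changing the interface then means editing the document, not the program: adding a `button` element adds a button the next time the file is loaded.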
What VEO is up to
Matthew Fuchs of Veo Systems gave an enlightening presentation about the XML schema language being developed by Veo Systems. Although Fuchs did not elaborate on the syntactical nature of Veo's schema language or the company's Common Business Library (CBL), another eagerly awaited XML-based e-commerce-enabling technology, he did comment on general design issues for a schema language.
Fuchs explained that while XML provides the syntax to mark up information, it does not provide sufficient means for modeling it. Fuchs discussed how a simple schema syntax with a mechanism for extension could solve this problem.
Schemas become useful when they make it easier for a computer to understand a markup language and to translate that language into code. Linguistic principles dictate that "the language can determine what you can think". Humans can understand the intention of the designer and can interpret markup through comments and documentation, but a computer needs a schema.
Fuchs stressed that validation, whether it is accomplished by using schemas or DTDs, is essential to making the XML/Schema/Java mapping evident and for communicating in a meaningful way.
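To make the idea of a machine-readable schema concrete, here is a minimal sketch in Python. It is not Veo's schema language, whose syntax was not presented; the `SCHEMA` table, the `order` and `item` vocabulary, and the `validate` function are assumptions made up for illustration. It shows how a simple description of required attributes and allowed children lets a program check a document before acting on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: for each element, its required attributes and allowed children.
SCHEMA = {
    "order": {"attrs": {"id"}, "children": {"item"}},
    "item": {"attrs": {"sku", "quantity"}, "children": set()},
}


def validate(element, schema=SCHEMA):
    """Return a list of error strings; an empty list means the tree conforms."""
    rule = schema.get(element.tag)
    if rule is None:
        return [f"unknown element <{element.tag}>"]
    errors = []
    # Report every required attribute that the element lacks.
    for name in sorted(rule["attrs"] - set(element.attrib)):
        errors.append(f"<{element.tag}> is missing attribute '{name}'")
    # Check each child against the allowed set, then recurse into it.
    for child in element:
        if child.tag not in rule["children"]:
            errors.append(f"<{child.tag}> is not allowed inside <{element.tag}>")
        else:
            errors.extend(validate(child, schema))
    return errors


if __name__ == "__main__":
    doc = ET.fromstring('<order><item sku="a"/><note/></order>')
    for problem in validate(doc):
        print(problem)
```

A human reader could guess what a stray `note` element means from comments or documentation; the program can only reject it, which is the gap a schema is meant to close.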
When asked about plans for Veo's XML schema language, Maloney stated that it will be submitted to the W3C for consideration in the upcoming round of schema language discussions. Maloney said to expect an announcement from Veo soon on CBL.
No New Developments on XML in the Browser
There was very lukewarm support for direct display of XML in both major browsers: neither Netscape nor Microsoft is giving us a date when we can render XML directly using either CSS or XSL. Microsoft is committed only to supporting the "XML in HTML" that will presumably be spewing forth from countless Office2000-enabled desktops in time for the release of IE5.
During Thursday's lunch, Intervista's Dan Ancona led an informal Birds of a Feather (BOF) session on the future of XML and 3D technologies. Ancona's talk focused on demos of his work on 3DXML, a technology that offers ways of rendering and linking XML via advanced 3D user interface widgets. He added a new piece of information to the space described in the XML, re-ran the Perl script that generated the VRML, and displayed the new VRML showing the added information. Neil Kipp of Virginia Tech also gave a brief reprise of the demonstration of the Visual Library work he'd presented earlier in the week. Neil discussed some of the difficulties he'd had with VRML and the group had some ideas for resolving them.
Other technologies, such as Perspecta and Inxight's tools, were briefly discussed, although the group never got into a discussion of Microsoft's XML-based Chrome.
Here are a few things we expect to happen:
- Expect several (read: more than 2) new submissions to the W3C. We can't be more specific, but we believe this will happen within weeks. At least two are supplementing/complementing/competing with submissions we have already seen.
- XTech '99 and XIO (stands for XML Interoperability) '99 were announced by the GCA. To be co-chaired by Jon Bosak of Sun Microsystems and Tim Bray of Textuality and XML.com, the conferences will be held in San Jose March 8-11, 1999. They will include presentations in two tracks for "strong backs" and "pretty faces" (core implementation issues, back end processing and rendering, and usability issues); a vendor exhibition; and an interoperability demonstration based on supplied XML documents. The conference is co-sponsored by GCA and Sun Microsystems.
XML for Content Developers
Content developers need to track the software developer community for answers to these two burning questions: First, Is it safe to put my content into XML today? And second, Is it safe not to?
Putting business-critical information into XML looks like a better bet all the time. While there is still no strong commitment to direct rendering of XML in Web browsers, there was near-unanimous support for XSL, which wasn't hard to offer since an implementable working draft won't be available for a while yet (although the sentiment among conference goers was that the August draft was a "good start"). At least one new tool is dedicated to simultaneous production of Web-ready content and high-quality print. InDelv expects to have a "technology introduction" (sort of a prolonged alpha state) ready for download within weeks, supporting XSL screen rendering and DSSSL-like print capability.
Further encouragement should be taken from the large number of implementations described and demonstrated:
- IBM's Grand Central Station, a completely customizable "information gatherer" using structured hierarchical information as well as keywords built on an RDF engine. Try this out from the IBM site.
- Sun's XML Library developed by the Java Software Division includes validating and non-validating parsers, preliminary support for XML Beans, and examples of a validation service. The Library will appear on the Java Developer Connection in the "Early Access" section after August 31.
- HP's InfoWorks is extending their document management to "chunks" using XML and their current Documentum document management server. Yes, the "chunk" is back. Dave Hollander explained that small bits of data are well managed by databases and large files by traditional document management, but the optimum size for reuse can run the gamut from data bit to long document and the XML chunk covers the range. The HP mandate is to convert all business-critical documents to this system within the year.
- CWI's SMIL implementations include an authoring and playback environment for multimedia and a "storage-to-presentation generation environment."
- VHG Consulting's Virtual HyperGlossary, Peter Murray-Rust's integration of terminology with document semantics. See more below and the VHG home page.
- XML.com's Annotated Spec was explicated by Tim Bray. He gave a presentation on the conceptual design and syntactical execution of our Annotated XML Specification, how the annotations were parsed, how he traversed back and forth between the spec and the annotations, and his use of extended links. As Jon Bosak pointed out, this was a rare opportunity to hear an exegesis of an exegesis; that is, an explanation of a method used on an annotation of a specification. We've also got an article explaining the process on XML.com.
Peter Murray-Rust is the creator of CML, Chemical Markup Language, and Jumbo, the world's first XML-based browser, which can render CML. He has also just joined XML.com as our XML:Geek columnist. Essentially, the VHG system combines the power of DTDs with the Data Categories and Terminology ISO 12620 standard to assign semantics to document markup. A "HyperGlossary", as Murray-Rust defines it, refers to any semantic resource composed of standardized subcomponents, such as data sheets or catalogs.
Like Bray's annotated spec, VHG uses the power of XLink to associate structured data with semantic meaning by pairing up the structural hierarchies. Terms are grouped into different classifications using xml:link "extended" links with "locator" references to the linked terms. Multilingual equivalencies (providing definitions in different languages) are defined through externally linked databases, allowing different curators to develop their glossaries in parallel and then link to the associated definitions using ID attributes.
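The grouping of terms through extended links can be pictured with a short Python sketch. The `xml:link` values "extended" and "locator" come from the description above; everything else here (the `glossary`, `classification`, and `term` element names, the `title` attribute, the `terms.xml#...` hrefs, and the `collect_links` helper) is hypothetical and not taken from the VHG system itself.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is always bound to this namespace, so the parser reports
# xml:link under this expanded name.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

# Hypothetical glossary fragment; element names, titles, and hrefs are invented.
GLOSSARY = """
<glossary>
  <classification xml:link="extended" title="solvents">
    <term xml:link="locator" href="terms.xml#water"/>
    <term xml:link="locator" href="terms.xml#ethanol"/>
  </classification>
  <classification xml:link="extended" title="metals">
    <term xml:link="locator" href="terms.xml#iron"/>
  </classification>
</glossary>
"""


def collect_links(xml_text):
    """Map each extended link's title to the hrefs of its locator children."""
    root = ET.fromstring(xml_text)
    groups = {}
    for element in root.iter():
        if element.get(XML_NS + "link") == "extended":
            groups[element.get("title")] = [
                child.get("href")
                for child in element
                if child.get(XML_NS + "link") == "locator"
            ]
    return groups


if __name__ == "__main__":
    for title, hrefs in collect_links(GLOSSARY).items():
        print(title, hrefs)
```

Because each locator points at a fragment identifier in another file, the definitions themselves can live in separately maintained documents, which is what lets curators work in parallel.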
Content and Knowledge Management
The SingleSource Knowledge Management System presented by Kurt D. Fenstermacher from the University of Chicago's Computer Science Department's Intelligent Information Laboratory is also of interest to content providers. Fenstermacher described the design process for constructing an information capture and access program that provides a front end for searching multiple online sources, viewing clustered search results, and summarizing the best pickings of the requested information.
Fenstermacher described "Knowledge Management" as the "storing and retrieving of information at a level of abstraction people are comfortable with and find useful." SingleSource uses XML in its knowledge management system to aid researchers in finding information and then store that information for strategic use in later searches.
Moving Ahead Despite Compliance Headaches
Every developer who presented over the two days in Montreal had weighed the risk of aiming at a moving target against the risk of waiting too long to take aim. It would be inappropriate to chide these folks who are making a good faith effort to implement open standards for not being in compliance with the latest and greatest. The new, small players in this field--those who are trying their best to keep up--are staking their future products on compliance with what is a set of open standards in flux.
The considerable changes to XSL between the original member submission and the first Working Draft will present problems for some of the pioneer implementors, but there was little public whining over these difficulties.
Netscape has indicated that they will update their RDF implementation to comply with the July RDF Model and Syntax Working Draft, and "we'll update to comply with the latest draft" was a refrain sung by nearly all presenters.
Microsoft, however, added a coda to this ditty, stating that they will preserve backwards-compatibility with existing, pre-standard implementations. They will retain a second interface to a second object model that will be "in compliance with an earlier version of XML"--a statement that is nonsensical from a standards perspective. This will allow users to continue to support illegal XML, such as null end tags (</>), by choosing the non-DOM object model. They also announced that they were partnering with DataChannel to provide a Java-based XML parser later this fall that will be DOM-compliant and XML 1.0-compliant, while maintaining the same backwards-compatibility with pre-existing implementations to protect early adopters.
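Why a null end tag counts as illegal XML is easy to check with any conforming parser. The sketch below uses Python's standard `xml.etree.ElementTree` module; the `is_well_formed` helper and the sample documents are ours, written only for illustration. XML 1.0 requires every end tag to name the element it closes, so a parser that follows the spec must reject `</>`.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed("<greeting>hello</greeting>"))  # named end tag
    print(is_well_formed("<greeting>hello</>"))          # null end tag
```

A document that relies on the SGML-style shorthand therefore cannot be exchanged with XML 1.0 tools, whatever a vendor's second object model chooses to accept.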
Acknowledging the need for some coherence in these claims of compliance, OASIS (the Organization for the Advancement of Structured Information Standards, formerly known as SGML Open) announced that it is forming a technical committee to explore XML conformance and solicited input from the developer community on what it wants and expects in conformance testing and monitoring.
You can explore the presentations online from the XML Developers Conference at the GCA site.