Report from Montreal

August 25, 1999

Lisa Rein

Tools Introduced
Topic Map Navigational Demo

Online Topic Map Demonstration
Documentation about the Topic Map Demo

Natrificial LLC's "The Brain"

The Brain's Information and Download Site

Data Descriptors by Example

DDbe Download page

Semantic Networks Python Module

Joe Strout's Artificial Intelligence in Python Page


XED Information and Download
LT PyXML Information and Download


DATAX Documentation
Download and Beta Test DATAX

FOP (Formatting Object to PDF)

FOP Documentation
Download FOP

Expat C++ Wrapper a la SamXa

Direct link to Expat C++ Wrapper
Information and Documentation of SamXa and Expat C++ Wrapper

Last week's MetaStructures 99 and XML Developers' Days conferences left quite an impression on those who attended. Over the course of the week, many new concepts and technologies were presented, familiar issues were expanded upon, and numerous XML-related software applications were demonstrated live. By week's end it was clear that many technologies once considered purely theoretical and impractical to implement were finally coming of age and ready to prove themselves in the marketplace.

Many of the tools presented at both conferences are not yet available online, but we’ve collected some links to the software tools that were available when this article was published.

Topic Map Applications A Reality

Topic Maps are an approved ISO standard, to be published this fall, for describing semantic linking relationships between web resources, and associations between links. Although the current Topic Map standard uses HyTime syntax, a "Topic Maps for the Web" spec will soon come out of the W3C that will implement Topic Maps using XLink syntax.

The power of topic maps lies in their simplicity. Even an application that implements less than 10 percent of the specification can still be very useful. Another point, demonstrated repeatedly, is that topic map designs are inherently intuitive, so an end user needs no documentation to navigate a topic-mapped web site.

A number of Topic Map applications were demonstrated at MetaStructures 99. Kal Ahmed, a Solutions Architect at Chrystal Software, demonstrated how topic maps are less about different views of the same document (a la XSL) and more about different views of the same corpus of documents. (As Henry Thompson explained in his presentation, "corpora" is the computational linguistics term for large text collections.)

In other words, topic maps provide an intelligent route around a large body of information. "This excites me because I can see technologies such as topic maps providing a means of moving away from the traditional volume\folder\document hierarchy to a more web-like structure of connected information," explains Ahmed.

As Ahmed explained during his presentation, the Topic Map standard also defines a precise way to merge two topic maps. This opens up a scenario in which multiple users each have their own view of the same corpus and can swap views with each other, publish their own view, and take the interesting bits of someone else's view and merge them into their own.
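The merging scenario can be made concrete with a small sketch. It is not the merge procedure defined by the standard. It simply assumes that each topic is keyed by a subject identifier and carries a set of names and a set of occurrences, and that topics sharing an identifier are unified. The `merge_topic_maps` function, the dictionary layout, and the sample data are all hypothetical.

```python
def merge_topic_maps(map_a, map_b):
    """Return a new topic map holding the union of both inputs.

    Assumption: a topic map is a dict from subject identifier to a dict
    with 'names' and 'occurrences' sets. Topics that share an identifier
    are treated as the same topic and their properties are unioned.
    """
    merged = {}
    for topic_map in (map_a, map_b):
        for subject, topic in topic_map.items():
            target = merged.setdefault(
                subject, {"names": set(), "occurrences": set()}
            )
            target["names"] |= topic["names"]
            target["occurrences"] |= topic["occurrences"]
    return merged


# Two users' views of the same corpus (illustrative data only).
alice = {
    "xml": {"names": {"XML"}, "occurrences": {"doc1.xml"}},
    "xlink": {"names": {"XLink"}, "occurrences": {"doc2.xml"}},
}
bob = {
    "xml": {"names": {"Extensible Markup Language"}, "occurrences": {"doc3.xml"}},
}

combined = merge_topic_maps(alice, bob)
print(sorted(combined["xml"]["names"]))
```

Because the function builds new sets rather than reusing the inputs, each user's original view is left untouched and can still be published or swapped on its own.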

Ahmed demonstrated these concepts using a tool called "The Brain" (from Natrificial LLC; see tools sidebar) as a GUI sitting on top of Chrystal Software's Astoria Web Services suite, which provided HTTP access to its repository. The topic map itself was stored in the repository, along with all of the information that it accessed. Ahmed developed a Java applet which parsed the topic map and then populated "The Brain" with "thoughts" representing the topics in his topic map.

Although no demo version of the applet is available yet, one should be released in the next few months, along with a white paper detailing its features. For more information about the application after its release, subscribe to the Topic Map mailing list.

Several presentations suggested that the current mechanisms for associating links within Topic Maps have a few problems, but in general the consensus was very positive.

"I was happy to see that topic maps moved from the specification stage to the implementation and concrete example stage," said Didier P.H. Martin, CEO of Talva Corporation. "Many different topic map interfaces were demonstrated that were intuitive and easy to use."

Python Tools Make a Strong Showing

Eric Freese, Isogen Corporation's Director of Consulting Services - Midwest, gave an informative presentation on integrating Topic Maps with semantic networks. Freese described how the scopes and themes of Topic Maps can disambiguate topics, and demonstrated that concept using a Python-based semantic networks program (see tools sidebar).
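The idea of disambiguating by scope can be sketched in a few lines of Python. This is not Freese's program or the semantic networks module; it only assumes that each topic name is recorded together with the theme in which it is valid. The `resolve` function and the sample topics are hypothetical.

```python
def resolve(name, theme, topics):
    """Return the ids of topics that carry `name` within scope `theme`.

    Assumption: `topics` maps a topic id to a set of (name, theme) pairs.
    """
    return [
        topic_id
        for topic_id, scoped_names in topics.items()
        if (name, theme) in scoped_names
    ]


# The same name points at different topics depending on the theme.
topics = {
    "paris-city": {("Paris", "geography")},
    "paris-hero": {("Paris", "mythology")},
}

print(resolve("Paris", "geography", topics))
print(resolve("Paris", "mythology", topics))
```

Without the theme, a lookup for "Paris" would match both topics; with it, each lookup matches exactly one.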

Another promising Python tool, introduced by Henry Thompson on day one of the developers' conference, was LT PyXML, an embedding in Python of a fast validating XML parser. Thompson also explained and demonstrated XED, his free XML-specific text editor built on top of LT PyXML, which automatically maintains the well-formedness of a document and supports fast keyboard-only authoring (see tools sidebar for both). XED's use of LT PyXML allows it to exploit DTDs to assist authoring, with Schema-based authoring and validation coming soon.

Hedge Automata Universally Applauded

One of the most popular presentations at the MetaStructures conference, cited by attendees and presenters alike, was Paul Prescod's talk on "Hedge Automata". Prescod's Hedge Automata theory, which is largely based on the previous work of Makoto Murata, provides a formal framework for describing, comparing, and evaluating document representation in schema languages. In theory, such a formalism would enable multiple schemas to be integrated dependably, and could potentially provide schemas with the validatability of DTDs that they currently lack.
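For readers new to the idea, a hedge is an ordered sequence of trees, and a hedge automaton assigns a state to each node based on its label and on whether the sequence of its children's states matches a regular expression. The sketch below is a simplified, deterministic illustration of that idea, not Prescod's or Murata's formalism. The rule table, the state names, and the `state_of` and `accepts` functions are hypothetical, and text content is ignored.

```python
import re
import xml.etree.ElementTree as ET

# Each rule: (element name, regular expression over child states, state).
# A child state sequence is written as state names each followed by a space.
RULES = [
    ("title", r"", "T"),
    ("para", r"", "P"),
    ("section", r"T (P |S )*", "S"),  # a title, then paras or subsections
]


def state_of(element, rules):
    """Assign a state to `element` bottom-up, or None if no rule applies."""
    child_states = [state_of(child, rules) for child in element]
    if None in child_states:
        return None
    sequence = "".join(state + " " for state in child_states)
    for name, pattern, state in rules:
        if element.tag == name and re.fullmatch(pattern, sequence):
            return state
    return None


def accepts(xml_text, rules, final_state):
    """True if the document's root element is assigned `final_state`."""
    return state_of(ET.fromstring(xml_text), rules) == final_state


good = "<section><title/><para/><section><title/></section></section>"
bad = "<section><para/></section>"
print(accepts(good, RULES, "S"), accepts(bad, RULES, "S"))
```

Because the content rule for `section` is an ordinary regular expression over states rather than over element names, two differently named elements could share a state, which is the kind of comparison across vocabularies that a formal model makes possible.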

Talva's Didier P.H. Martin said Prescod's presentation was "a positive direction for schemas by providing them with a more formal ground. I've been shopping for a good query notation, and what Paul presented is definitely inspiring."

Schema Scandal?

Conference attendees unanimously found it ironic that no members of the XML Schema Working Group were able to attend Prescod's presentation, because their face-to-face meeting had been scheduled downstairs on the same day.

"It's scandalous that the XML Schema Working Group met at exactly the same time as the MetaStructures conference," exclaimed Steven Newcomb, co-chair of MetaStructures. "Members of the Schema Group could have been exposed to some ideas, such as hedge automata, that are needed to handle the oncoming glut of conflicting industrial vocabularies."

Luckily, since lunch for the XML Schema Working Group Face-to-Face took place in the same room as the MetaStructures Conference, some Schema Group members did get the idea that they were missing something relevant.

Apparently some of the presenters could have benefited from the schema group's expertise as well.

"I was struck by the apparent confusion about the roles of schemas and namespaces, and by the emerging importance of the information set," said Murray Maloney, independent consultant and member of both the XML Schema and XML Linking Working Groups.

Importance of InfoSet Emerging

Many at the conference noted that a majority of the talks on the first day were more or less focused on integration issues surrounding the emerging Information Set specification.

"A principled provision for the declarative specification of the DocInfoSet<->ApplInfoSet mapping is a crucial requirement on ongoing W3C standardization efforts in the XML area," explained Henry Thompson, of Edinburgh University's Language Technology Group. "Not surprisingly, the XML InfoSet is not what application developers want. They want Java instances/relational database rows/RDF graphs/UML structures, etc."

"A Java/C++/... XML standard binding is definitely something we are looking for in the near future," confirmed Fabio Arciniegas, a graduate student at the Universidad de los Andes in Bogota, Colombia who presented his SamXa Academic Tests Manager at the dev day conference. Arciniegas presented his own Expat C++ wrapper (largely based on an early version of Andy Dent's Expat C++ wrapper) during his presentation of SamXa. The Expat C++ wrapper is freely available for download (see tools sidebar).

Others pondered the changing role of object-oriented programming with regard to the emerging importance of XML in application development.

"Some people at the conference were willing to go so far as to talk about the death of OO programming," observed Ron Lake, President of Galdos Systems, a software development firm in Vancouver, Canada. "While I do not expect this to happen, it is clear that XML programming will shift our emphasis from data encapsulation to data transformation."

"The future of the OO + XML union is still an open question: I don't believe XML will mean a death to everything OO stands for, but definitely it poses some very interesting questions on otherwise holy topics like encapsulation," said University of the Andes' Arciniegas.

Java Tools Introduced at Dev Day

James Tauber, an independent software developer and XML consultant based in Perth, Australia, provided an update on FOP (Formatting Object to PDF), his open-source application that formats the XSL formatting objects produced by an XSLT transformation into a PDF document (see tools sidebar).

Len Burman, from IBM, presented a detailed look at AlphaWorks' DDbE application (Data Descriptors by Example), which is a Java-based component library that can automatically generate an XML DTD or schema from a set of document instances (see tools sidebar).
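To give a feel for schema generation by example, here is a deliberately naive sketch that infers loose DTD element declarations from sample documents. It is not DDbE's algorithm. It ignores attributes, ordering, and occurrence counts, and the `infer_dtd` function and sample memos are hypothetical.

```python
import xml.etree.ElementTree as ET


def infer_dtd(documents):
    """Infer loose <!ELEMENT> declarations from a list of XML strings."""
    children = {}  # element name -> set of child element names seen
    has_text = {}  # element name -> True if non-whitespace text was seen
    order = []     # element names in first-seen order

    def visit(element):
        if element.tag not in children:
            children[element.tag] = set()
            has_text[element.tag] = False
            order.append(element.tag)
        if element.text and element.text.strip():
            has_text[element.tag] = True
        for child in element:
            children[element.tag].add(child.tag)
            # Text following a child still belongs to the parent's content.
            if child.tail and child.tail.strip():
                has_text[element.tag] = True
            visit(child)

    for text in documents:
        visit(ET.fromstring(text))

    lines = []
    for name in order:
        kids = sorted(children[name])
        if kids and has_text[name]:
            model = "(#PCDATA | " + " | ".join(kids) + ")*"
        elif kids:
            model = "(" + " | ".join(kids) + ")*"
        elif has_text[name]:
            model = "(#PCDATA)"
        else:
            model = "EMPTY"
        lines.append(f"<!ELEMENT {name} {model}>")
    return "\n".join(lines)


samples = [
    "<memo><to>Ann</to><body>Hi</body></memo>",
    "<memo><to>Bob</to><cc/><body>Hello</body></memo>",
]
print(infer_dtd(samples))
```

The result accepts every sample it was built from, but it is far more permissive than a hand-written DTD would be; a real tool would need to generalize ordering and repetition much more carefully.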

Another Java-based tool introduced during the Developers' Conference that turned more than a few heads was David Megginson's DATAX. Megginson, the chair of the Information Set Working Group, gave an informative presentation about a real-world RDF implementation currently under development that inspired many to take another look at RDF's potential.

DATAX is a Java 1.2-based library designed to simplify the exchange of structured data records written in any RDF-compliant XML format. Funding for Megginson's DATAX package was provided by Muze, Inc., a music, books, and video database provider. Although the DATAX software is currently available only in beta form for testing purposes, downloading is encouraged (see tools sidebar). A stable version is likely to be released as free open source software later in the fall.

"David Megginson's DATAX RDF processor gave me my first real hope that I might someday use RDF for something genuinely useful," said Simon St. Laurent, an author of several books on XML who gave a presentation on XPDL (XML Processing Description Language), a syntax for providing machine readable and extensible descriptions of document types at the XML dev day conference. "This is the first attempt I've seen to implement it in a way that is both generic and useful."


Overall, everyone had a good time, even "outsiders" who were just getting their first taste of XML technology.

"I attended the conference to find out how I could apply this technology to our current and upcoming projects in corporate web development," explained Jed Lewis, Manager of Media Development for Design Trust, a Connecticut-based consulting firm. "I was also interested in seeing current applications of XML in the real world, and where XML was standards-wise. The conference satisfied me completely in all respects."