If Ontology, Then Knowledge: Catching Up With WebOnt
May 1, 2002
There are at least two broad plans for the direction in which the Web may evolve and, significantly, each of them has XML as a keystone.
The first, known colloquially as "web services", is largely the domain of the largest corporate IT vendors, most notably Microsoft and IBM. The web services movement is often criticized by its detractors for having, in the end, very little to do with the Web, a criticism most recently leveled in these pages in Edd Dumbill's essay, "Kicking Out the Cuckoo".
The second, the "Semantic Web", is largely the domain of the W3C, academic, government, and some industry researchers. Curiously, the W3C seems to have taken the position that the Semantic Web, or something very much like it, is inevitable if the Web is to mature fully: "For the Web to reach its full potential, it must evolve into a Semantic Web, providing a universally accessible platform that allows data to be shared and processed by automated tools as well as by people" (Semantic Web Activity Statement). The Semantic Web is most often criticized by its detractors for having, in the end, very little to do with reality; or, put less pointedly, for being easier to dream about than to implement. Mike Champion suggested recently that the "conventional wisdom" criticizes the W3C's Semantic Web efforts on three grounds: first, that achieving web services interoperability has a higher priority for W3C member corporations than Semantic Web efforts; second, that Semantic Web efforts have not yet borne sufficient practical fruit; third, that the Semantic Web as a technological program is unlikely ever to live up to its promise.
In what follows I introduce one of the major elements of the W3C's Semantic Web initiative, the Web Ontology Working Group (hereafter, "WebOnt"), explaining its goals, deliverables, and progress to date.
What is WebOnt Supposed to Do?
In short, WebOnt, co-chaired by Professor Jim Hendler (of the University of Maryland) and Professor Guus Schreiber (of the University of Amsterdam), has been given the task of developing an ontology vocabulary for use in the Semantic Web (which is to be distinguished from an ontology of the Web, i.e., a formal schema of what there is on the Web: hosts, resources, media types, and the like). This ontology vocabulary or ontology language corresponds to the ontology vocabulary stratum of Tim Berners-Lee's Web Architecture layer cake diagram, the layer that sits above RDF and RDF Schema. But what is an ontology vocabulary? It is a formal schema (in this sense, having little to do with W3C XML Schema) which, as the WebOnt Charter puts it, allows for the "explicit representation of term vocabularies and the relationships between entities in these vocabularies".
Less formally, an ontology language is a markup language -- presumably in XML, but RDF is possible, too -- that allows users to define formal ontologies or schemas (a perfectly unobjectionable synonym of "ontology" before the W3C again co-opted a very generic term for a very specific standard) for particular problem domains. WebOnt is not going to deliver ontologies for particular problem domains; rather, WebOnt intends to create the standardized markup language within which users can formally define specific ontologies for use on the Web. The important point to come to terms with is, whether DAML+OIL or WebOnt or some other project, if Web-scale interoperability is going to be achieved in the area of knowledge representation -- an achievement that the Semantic Web absolutely presupposes -- there needs to be one, preferably well-engineered way to declare and define formal schemas, such that tools which function at Web-scale can be easily implemented and deployed.
WebOnt's ontology language will make it possible to represent, in a machine-readable form, assertions about class and property relationships between logical entities, as well as provide a "means to limit the properties of classes with respect to number and type, means to infer that items with various properties are members of a particular class, [as well as] a well-defined model of property inheritance" (WebOnt Charter).
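To make those three capabilities concrete, here is a minimal sketch in Python rather than in any ontology syntax. Everything in it is an assumption made for illustration: the names OntClass, violations, and infer_member, the Document and Article classes, and the rule that an individual having every defining property counts as a class member. It is not DAML+OIL and not anything WebOnt has specified.

```python
from dataclasses import dataclass, field


@dataclass
class OntClass:
    """A hypothetical class in a toy ontology (illustrative only)."""
    name: str
    parents: list = field(default_factory=list)
    # property name -> (required value type, maximum number of values)
    restrictions: dict = field(default_factory=dict)

    def all_restrictions(self):
        """Property inheritance: collect ancestor restrictions, then apply our own."""
        merged = {}
        for parent in self.parents:
            merged.update(parent.all_restrictions())
        merged.update(self.restrictions)
        return merged

    def is_subclass_of(self, other):
        return self is other or any(p.is_subclass_of(other) for p in self.parents)


def violations(cls, individual):
    """List the number and type restrictions of `cls` that an individual breaks."""
    problems = []
    for prop, (value_type, max_count) in cls.all_restrictions().items():
        values = individual.get(prop, [])
        if len(values) > max_count:
            problems.append(f"{prop}: more than {max_count} value(s)")
        if not all(isinstance(v, value_type) for v in values):
            problems.append(f"{prop}: expected {value_type.__name__}")
    return problems


def infer_member(individual, defining_props):
    """Assumed rule: having every defining property implies class membership."""
    return all(individual.get(p) for p in defining_props)


# Hypothetical classes: every Article is a Document and inherits its restrictions.
document = OntClass("Document", restrictions={"title": (str, 1)})
article = OntClass("Article", parents=[document], restrictions={"author": (str, 5)})

good_item = {"title": ["An Example Title"], "author": ["A. Writer"]}
bad_item = {"title": ["First Title", "Second Title"]}
```

Here Article inherits the one-title limit from Document, so violations(article, bad_item) reports the extra title, while infer_member(good_item, ["title", "author"]) treats the well-formed item as an Article. A real ontology language would state such rules declaratively and leave the checking to a reasoner.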
In even shorter, more mundane form, WebOnt is tasked with cleaning up and otherwise standardizing the DAML+OIL ontology language, which was submitted to the W3C as a NOTE in late 2001, and which in turn came out of the DARPA Agent Markup Language and the Ontology Inference Layer projects.
One of the first substantive work products to come out of WebOnt thus far is a Working Draft, "Requirements for a Web Ontology Language", which makes for rather interesting reading, including use cases, design goals, requirements, and objectives.
Among the use cases for a Web Ontology language, the Requirements WD lists web portals, multimedia collections, corporate web site management, design documentation, intelligent agents, and ubiquitous computing. A web portal, for example, could use a formal ontology covering the knowledge domain the portal focuses on, such as medical research about the origin of AIDS. In order to collect, analyze, share, and structure information, it would be helpful if the portal had a public ontology available for its own use and the use of its information-sharing partners, including medical publishers, university researchers, health organizations, and the like. An appropriately-encoded and rigorous ontology "can provide an expressive terminology for describing content, and inferences sanctioned by the ontology can be used to improve the quality of search on the portal" (WebOnt Requirements WD). In the use case of intelligent agents, to take another common example, an ontology can be useful in allowing software agents to manipulate and make inferences about the knowledge (as opposed to mere data) they retrieve from various Web resources.
The WebOnt Requirements WD also lists eight design goals of a Web Ontology language. The first is that it should provide for ontology sharing, including the ability of one public ontology to extend another public ontology by, perhaps, specializing some of its terms or properties.
The second design goal is to support the change or drift over time of ontologies and their constitutive parts. As knowledge domains evolve, ontologies must be able to evolve in order to continue to formally represent their domain. Revisions and versions of ontologies must be supported by the underlying ontology language.
Third, a web ontology language must support interoperability, which means that it must offer some way to map similarities between disparate ontologies, at least insofar as two ontologies which ostensibly formalize the same problem domain have significant conceptual similarities. It is important to note, however, that adding mapping primitives to the standard web ontology language will never of itself be sufficient to achieve interoperability across any particular instance of disparate, competing ontologies. It may be the case that no such interoperability is possible; that is, there is little, if any, theoretical assurance that any two competing ontologies are practically reconcilable. (For an argument which addresses the issue of competing schemas, see "The Politics of Schemas", a two-part XML.com article series from early 2001.)
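A small sketch may help show both what a mapping primitive buys and where it stops. The Python below assumes the simplest possible mapping, a table of equivalent property names between two invented publisher vocabularies; the function translate, the table PUBLISHER_A_TO_B, and all of the terms are hypothetical.

```python
def translate(record, term_map):
    """Rewrite a record's property names from one vocabulary into another.

    Returns the translated record plus the source terms that had no mapping.
    """
    translated, unmapped = {}, []
    for term, value in record.items():
        if term in term_map:
            translated[term_map[term]] = value
        else:
            unmapped.append(term)
    return translated, unmapped


# Hypothetical equivalences between two publishers' vocabularies.
PUBLISHER_A_TO_B = {"author": "creator", "headline": "title"}

record = {"author": "A. Writer", "headline": "Example", "embargoDate": "2002-05-01"}
translated, unmapped = translate(record, PUBLISHER_A_TO_B)
```

The two mapped terms carry over cleanly, but embargoDate ends up in the unmapped list because the second vocabulary has no corresponding concept. No mapping syntax can repair that kind of gap; it has to be settled by the people who maintain the ontologies.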
The fourth design goal -- which seems presently underspecified and vague, in my view -- says that "Different ontologies or data sources may be contradictory. It should be possible to detect these inconsistencies". It does not, however, specify what kind of inconsistency -- logical, factual, or some other? -- should be detectable, or whether the fact of a kind of inconsistency is to be part of a resolved ontology, which seems to be the implication of noting that "RDF and RDFS do not allow inconsistencies to be expressed". It is not very clear, then, whether this design goal is about the detection of some kind of inconsistency among ontologies (or data sources? Or both?) or whether it is also about the inclusion in a web ontology language of some way to represent some kind of inconsistency in an ontology.
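To illustrate just one possible reading of this goal, the following Python sketch detects a purely logical inconsistency: two data sources that, once combined, place the same individual in two classes declared disjoint. The notion of disjointness used here, the function names, and the sample data are all assumptions for the sake of the example; the Requirements WD does not say that this is the kind of inconsistency it has in mind.

```python
def merge_memberships(*sources):
    """Combine class-membership assertions from several data sources."""
    merged = {}
    for source in sources:
        for individual, classes in source.items():
            merged.setdefault(individual, set()).update(classes)
    return merged


def find_disjointness_conflicts(memberships, disjoint_pairs):
    """Report individuals asserted to belong to two classes declared disjoint."""
    conflicts = []
    for individual, classes in sorted(memberships.items()):
        for first, second in disjoint_pairs:
            if first in classes and second in classes:
                conflicts.append((individual, first, second))
    return conflicts


# Hypothetical assertions from two independent sources.
source_one = {"doc42": {"PeerReviewed"}}
source_two = {"doc42": {"Preprint"}, "doc7": {"Preprint"}}

# Hypothetical ontology statement: nothing is both PeerReviewed and a Preprint.
disjoint = [("PeerReviewed", "Preprint")]

merged = merge_memberships(source_one, source_two)
conflicts = find_disjointness_conflicts(merged, disjoint)
```

Each source is consistent on its own; the conflict over doc42 appears only after the merge. Whether the ontology language should merely allow tools to detect such a clash, or should also be able to record it, is exactly the question the design goal leaves open.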
Fifth, a web ontology language should strike a balance between, on the one hand, expressiveness of knowledge representation and, on the other, scalability of the processing or reasoning model.
The sixth design goal is that a web ontology language should be easy to use, and both syntax and semantics are mentioned. There is some implication that more human-friendly syntax might be warranted.
Seventh, a web ontology language should be serializable in XML.
Eighth, a web ontology language must pay due attention to the formally global character of the Web, which, one may dare to hope, will mean more than simply falling back on the claim that XML is Unicode and internationalization-friendly.
So far, so good. There have been unexpected benefits due to the increased attention being paid, by the members of WebOnt and by many others under the auspices of the Semantic Web Coordinating Group, to relevant existing W3C standards, particularly RDF and RDF Schema. Problems and unresolved questions have been identified which must be addressed if the work of WebOnt is to move forward and prosper.
If the Semantic Web is worth pursuing at all, something like a W3C-blessed Web Ontology Language is not only desirable but necessary. Though I lament yet another generically named W3C specification as unhelpful and confusing, the WebOnt WG, which has been assembled to produce this crucial element of the Semantic Web, is so far proving to be determined and capable. But only time will tell.