The Semantic Web is Closer Than You Think

August 20, 2003

Kendall Grant Clark

In order to make due enquiries into all these, and many other particulars which go toward the complete and comprehensive idea of any being, the science of ontology is exceeding necessary. This was what was wont to be called the first part of metaphysics in the peripatetic schools. -- Isaac Watts, Logic, or the Right Use of Reason

Last year I wrote an article for XML.com, "If Ontology, Then Knowledge: Catching Up With WebOnt", in which I introduced the W3C's web ontology language effort to the XML developer community. After a long journey filled with hard work, that language, now called OWL, was advanced to W3C Candidate Recommendation on 19 August.

While there is a lot of talk these days about the Semantic Web being the crack-addled pipe dream of a few academic naifs, in reality it's a lot closer to realization than you might think. I want to be clear about this point: I'm not suggesting that we stand on the brink of a fully achieved, widespread Semantic Web. I am suggesting that some of the major pieces of the puzzle are now or will soon be in place. OWL and RDF, upon which it builds, are two such major pieces of the Semantic Web puzzle.

An Ontology Language

So what is OWL anyway? The first thing to mention is that OWL is specified in six W3C documents.

OWL is an ontology language for the Web, which builds on a rich technical tradition (of both formal research and practical implementation), including SHOE, OIL, and DAML+OIL. The technical basis for much of OWL is the part of the formal knowledge representation field known as Description Logic (aka "DL"). DL is the main formal underpinning of such diverse knowledge representation formalisms as semantic nets, frame-based systems, and others. (If you're interested in learning more, consider looking at The Description Logic Handbook.) While I won't say much more about the parts of OWL in this article, I should mention that OWL includes an RDF/XML interchange syntax, an abstract, non-XML syntax, and three sublanguages or variants, each of different expressivity and implementational complexity (OWL Lite, OWL DL, and OWL Full).

The takeaway point is simple: OWL is real stuff; whether it's the right real stuff, whether it can gain critical mass, whether it can or will operate at web scale -- these are and will remain open questions for the foreseeable future. But the foundation is solid.

Yet you may be asking: what can be done with an ontology language for the Web? In short, you can formally specify a knowledge domain, describing its most salient features and constituents, then use that formal specification to make assertions about what there is in that domain. You can feed all of that to a computer, which will reason about the domain and its knowledge for you. And here's the most tantalizing bit: you can do all of this on, in, and with the Web, in both interesting and powerful ways.

Let's look at the logically prior question for a moment: what is an ontology language? What is an ontology? As you probably know, "ontology" is an English word derived (by way of Latin and French) from the ancient Greek words for "being" and "inquiry". Originally ontology was an aspect of metaphysics, the one which studied not particular things in themselves but the being-ness of all things which exist or could conceivably exist.

Today an ontology language is a means by which one can formally describe a knowledge domain, with the goal of enabling computers to provide various kinds of reasoning services about that domain, and about the knowledge described by an ontology for that domain. In our current, technical usage, an ontology is a formal specification of a knowledge domain: what individuals and classes of individuals there are in that domain, the relationships which obtain between these individuals and classes, their proper and apparent parts, and so on.
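To make the definition above a little more concrete, here is a minimal sketch in Python of the idea of classes, individuals, and a simple reasoning service over them. It is not OWL and uses no OWL tooling; the domain (Dog, Mammal, Animal, rex) and the function names are hypothetical, chosen only for illustration. The one inference it performs is that an individual asserted to belong to a class also belongs to every superclass of that class.

```python
# Hypothetical toy domain; none of these names come from OWL itself.
# Each pair reads "the first class is a subclass of the second".
SUBCLASS_OF = {
    ("Dog", "Mammal"),
    ("Mammal", "Animal"),
}

# Each pair reads "the individual is asserted to be a member of the class".
INSTANCE_OF = {
    ("rex", "Dog"),
}


def superclasses(cls, subclass_of):
    """Return every class reachable from cls by following subclass links."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for child, parent in subclass_of:
            if child == current and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def inferred_types(individual, instance_of, subclass_of):
    """Return the asserted classes of individual plus all their superclasses."""
    types = set()
    for ind, cls in instance_of:
        if ind == individual:
            types.add(cls)
            types |= superclasses(cls, subclass_of)
    return types


print(sorted(inferred_types("rex", INSTANCE_OF, SUBCLASS_OF)))
# prints ['Animal', 'Dog', 'Mammal']
```

Nobody asserted that rex is an Animal; the program derives it from the formal description of the domain. A real OWL reasoner draws far richer conclusions, but the division of labor is the same: a human specifies the domain, and the machine works out what follows.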

Two quick points are worth making here. First, we all spend some amount of our brain power -- almost entirely without consciously knowing that this is what we are doing -- dealing with informal, implicit ontologies. In order to act meaningfully at all within particular social contexts, we need to have understood something roughly like an ontology of that context. In any situation or context there will be features which we attend to, because they just are the salient features of that context, and an even larger number of things about the situation which we do not attend to, which we cannot even call features, because they are the background noise against which salience emerges. The Homo sapiens form of the mammalian brain is very good at doing this. It's so good, in fact, that it has figured out how to get computers to do something (very) roughly like this, too.

Second, unlike humans, computers can provide reasoning services over a knowledge domain only because the domain and the knowledge have been formally and rigorously specified in advance, and because some human has implemented various reasoning algorithms in a way that the computer can apply.

From these two points we may be able to conclude that ordinary people, with the right support and motivation, can learn to use the formal tools of computerized ontology languages, like OWL, to represent the things which they already know in a way which computers can then reason about, as a supplement and aid to human interests.

An Ontology Language For the Web

So far nothing I have said about ontology languages and reasoning systems is specific to OWL as an ontology language for the Web. What is it about OWL that makes it different? OWL has been specifically crafted out of its Webbish forerunners, particularly SHOE and DAML+OIL, to take advantage of some of the interesting things about the Web. What is interesting about the Web? Lots of things, including its scale, its distributedness, and its relatively low barriers to access. OWL is intended to be an ontology language that has some of these features: it should operate at the scale of the Web; it should be distributed across many systems, allowing people to share ontologies and parts of ontologies; it should be compatible with the Web's ways of achieving accessibility and internationalization; and it should be, relative to most prior knowledge representation systems, easy to get started with, non-proprietary, and open.

Insofar as OWL accomplishes or will accomplish these goals, it will do so because it was designed by a collection of DL and Web experts, using the Web's foundational knowledge representation tools, RDF and XML, with the explicit goal of making a formal knowledge representation (KR) language work on the world's first globally distributed hypermedia system. This is a relatively new thing to aim at in the history of KR systems. In some ways, the OWL Working Group is among the most ambitious of the W3C's many WGs. It is often said of W3C WGs that they are not meant to do new work, that is, to do new research into some field; rather, they are meant to standardize and specify things which are already known, in a way that makes open computing possible and proprietary vendor lock-in improbable. In the case of the OWL WG, however, this general rule was broken. While OWL has precursors, the most important of which is DAML+OIL, it took a non-trivial amount of real, new technical work to make OWL into a practical ontology language for the Web.

A Tempered Enthusiasm

Despite my enthusiasm for the intellectual achievement constituted by OWL, I have to temper it with a dose of realism. OWL can be and probably is everything good which people have said about it; even so, that in and of itself will not mean that the Semantic Web vision will be widely achieved. Whether or not the Semantic Web ever happens, in as robust and important a sense as the original Web happened, depends on a complex set of factors and their interactions, only some of which are under anyone's direct control.

Having OWL means a few things are no longer true. First, it is no longer true that the Semantic Web can be dismissively written off as a bit of magical, wishful thinking on the part of some Utopian-leaning technologists. OWL provides a real foundation, rooted in the rich research and engineering tradition of KR and DL, for the Semantic Web. Second, it is no longer true that RDF and RDFS are the obvious choices for a certain class of Web applications. OWL must soon be considered in some cases a better choice than RDF alone; it is more expressive and, in the OWL Full variant, upwardly compatible with RDF.
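One way to picture the extra expressiveness mentioned above: OWL's vocabulary includes a way to declare two classes disjoint (owl:disjointWith), whereas RDFS has no construct for stating that two classes cannot share members. The following is a rough sketch in plain Python, not an OWL reasoner, of what a disjointness declaration lets a machine detect. All class and individual names are hypothetical, and the function names are my own.

```python
# Hypothetical toy data; the class and individual names are illustrative only.
SUBCLASS_OF = {("Dog", "Mammal"), ("Trout", "Fish")}

# Pairs of classes declared to have no members in common.
DISJOINT = {frozenset({"Mammal", "Fish"})}

# Asserted class memberships; "rex" is deliberately given conflicting types.
INSTANCE_OF = {("rex", "Dog"), ("rex", "Trout"), ("nemo", "Trout")}


def all_types(individual):
    """Return asserted classes plus superclasses, following subclass links."""
    types = {cls for ind, cls in INSTANCE_OF if ind == individual}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBCLASS_OF:
            if child in types and parent not in types:
                types.add(parent)
                changed = True
    return types


def clashes(individual):
    """Return the disjoint class pairs that individual belongs to at once."""
    types = all_types(individual)
    return [sorted(pair) for pair in DISJOINT if pair <= types]


print(clashes("rex"))   # prints [['Fish', 'Mammal']]
print(clashes("nemo"))  # prints []
```

With subclass statements alone, the conflicting assertions about rex would simply both be accepted. Once disjointness can be stated, the contradiction becomes something a program can find and report.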

The real achievement of OWL, then, at least as I see it, is to provide a solid foundation, both formally and implementationally, for the Semantic Web. It satisfies one of the necessary conditions of the possibility of there being a Semantic Web at all. And for that, all of us who appreciate the Web, as it is and as it could be, should be grateful.