Ontology Building: A Survey of Editing Tools
Editor's Note: An update to this article has been posted here on 7/14/04.
As the hype of past decades fades, the current heir to the artificial intelligence legacy may well be ontologies. Evolving from semantic network notions, modern ontologies are proving quite useful. And they are doing so without relying on the jumble of rule-based techniques common in earlier knowledge representation efforts. These structured depictions or models of known (and accepted) facts are being built today to make a number of applications more capable of handling complex and disparate information. They appear most effective when the semantic distinctions that humans take for granted are crucial to the application's purpose. This may mean handling the common sense lurking in natural language excerpts or the expertise embedded in domain-specific explications and working repositories.
The semantic structuring achieved by ontologies differs from the superficial composition and formatting of information (as data) afforded by relational and XML databases. With databases, virtually all of the semantic content has to be captured in the application logic. Ontologies, however, are often able to provide an objective specification of domain information by representing a consensual agreement on the concepts and relations characterizing the way knowledge in that domain is expressed. This specification can be the first step in building semantically aware information systems to support diverse enterprise, government, and personal activities.
Examples span several areas including: Semantic Web research; the creation of medical guidelines for managing patient health; mapping the genomes of plants and animals; searching for specific public information resources; collaborative engineering design; in-depth security analysis; and the automated exchange of electronic information among commercial trading partners.
In the Semantic Web vision, unambiguous sense in a dialog among remote applications or agents can be achieved through shared reference to the ontologies available on the network, albeit in an ever-changing combination of upper level and domain ontologies. We just have to assume that each ontology is consensual and congruent with the other shared ontologies (e.g., ontologies routinely include one another). The result is a common domain of discourse that can be interpreted further by rules of inference and application logic. Note that ontologies put no constraints on publishing (possibly contradictory) information on the Web, only on its (possible) interpretations.
Kinds of Ontologies
Ontologies may vary not only in their content, but also in their structure and implementation.
Level of description
Building an ontology means different things to different practitioners. How one goes about describing something reflects a progression in ontologies from simple lexicons or controlled vocabularies, to categorically organized thesauri, to taxonomies where terms are related hierarchically and can be given distinguishing properties, to full-blown ontologies where these properties can define new concepts and where concepts have named relationships with other concepts, like "changes the effect of" or "buys from".
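The progression described above can be made concrete with a small sketch. The terms, property names, and relation names here are hypothetical and chosen only for illustration; plain Python containers stand in for what a real ontology language would express.

```python
# Illustrative only: every term and relation name below is hypothetical.

# 1. Controlled vocabulary: an agreed set of terms.
vocabulary = {"vehicle", "automobile", "truck", "dealer", "customer"}

# 2. Thesaurus: terms grouped with related or synonymous terms.
thesaurus = {"automobile": {"car", "motorcar"}}

# 3. Taxonomy: terms arranged hierarchically (child -> parent), with properties.
taxonomy = {"automobile": "vehicle", "truck": "vehicle"}
properties = {"vehicle": {"wheel_count"}, "truck": {"payload_capacity"}}

# 4. Ontology: concepts are also linked by named relationships.
relations = {("customer", "buys from", "dealer"), ("dealer", "sells", "vehicle")}


def ancestors(term):
    """Return every broader term above `term` in the taxonomy, nearest first."""
    found = []
    while term in taxonomy:
        term = taxonomy[term]
        found.append(term)
    return found


def inherited_properties(term):
    """Collect the properties of a term plus those of its ancestors."""
    collected = set(properties.get(term, set()))
    for parent in ancestors(term):
        collected |= properties.get(parent, set())
    return collected
```

Each level adds structure to the one before it: the vocabulary only names things, the taxonomy lets `inherited_properties("truck")` pick up properties from `vehicle`, and the relation triples carry the named links that distinguish a full ontology.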
Ontologies also differ with respect to the scope and purpose of their content. The most prominent distinction is between the domain ontologies describing specific fields of endeavor, like medicine, and upper level ontologies describing the basic concepts and relationships invoked when information about any domain is expressed in natural language. The synergy among ontologies, which a vertical application can exploit, springs from the cross-referencing between upper level ontologies and various domain ontologies.
All ontologies have a part that historically has been called the terminological component. This is roughly analogous to what we know as the schema for a relational database or XML document. It defines the terms and structure of the ontology's area of interest. The second part, the assertional component, populates the ontology further with instances or individuals that manifest that terminological definition. This extension can be separated in implementation from the ontology and maintained as a knowledge base. The dividing line between treating a thing as a concept and treating it as an individual, however, is usually an ontology-specific decision. Two valid automotive ontologies may differ on this point: one may treat the 1965 Ford Mustang GT as an individual Ford automobile, while the other treats 1965 Ford Mustang GT as a subclass and the vehicle with license plate number AXL429 as an individual instance of it.
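A minimal sketch of the two components, and of the two modelling choices in the Mustang example, might look like the following. The `Ontology` class and its `is_a` method are hypothetical names invented for this illustration, not part of any ontology toolkit.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    # Terminological component: class name -> parent class name (None for a root).
    classes: dict = field(default_factory=dict)
    # Assertional component: individual name -> the class it instantiates.
    individuals: dict = field(default_factory=dict)

    def is_a(self, individual, cls):
        """True if the individual belongs to cls directly or via a superclass."""
        current = self.individuals[individual]
        while current is not None:
            if current == cls:
                return True
            current = self.classes.get(current)
        return False


# Choice A: the model itself is treated as an individual Ford automobile.
coarse = Ontology(
    classes={"FordAutomobile": None},
    individuals={"1965 Ford Mustang GT": "FordAutomobile"},
)

# Choice B: the model is a subclass, and one specific car is the individual.
fine = Ontology(
    classes={"FordAutomobile": None, "1965FordMustangGT": "FordAutomobile"},
    individuals={"AXL429": "1965FordMustangGT"},
)
```

Both structures answer the question "is this a Ford automobile?" correctly; they differ only in where the line between concept and individual is drawn. Because the `individuals` mapping is separate from `classes`, it could be stored and maintained apart from the terminological part, as a knowledge base would be.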
Ontologies are not all built the same way. A number of possible languages can be used, including general logic programming languages like Prolog. More common, however, are languages that have evolved specifically to support ontology construction. The Open Knowledge Base Connectivity (OKBC) model and languages like KIF (and its emerging successor, Common Logic, or CL) are examples that have become the bases of other ontology languages. There are also several languages based on description logics, a form of logic thought to be especially computable. These include Loom and DAML+OIL, which is currently being evolved into the Web Ontology Language (OWL) standard. When comparing ontology languages, what is given up for computability and simplicity is usually language expressiveness, which isn't always a bad deal. A language need only be as rich and expressive as is necessary to represent the nuance and intricacy of knowledge that the ontology's purpose and its developers demand.
The wide array of information residing on the Web has given ontology use an impetus, and ontology languages increasingly rely on W3C technologies like RDF Schema as a language layer, XML Schema for data typing, and RDF to assert data.
The basic steps in building an ontology are straightforward. Various methodologies exist to guide the theoretical approach taken, and numerous ontology building tools are available. The problem is that these procedures have not coalesced into popular development styles or protocols, and the tools have not yet matured to the degree one expects in other software practices. Further, full support for the latest ontology languages is lacking.
An ontology is typically built in more or less the following manner:
Acquire domain knowledge
Assemble appropriate information resources and expertise that will define, with consensus and consistency, the terms used formally to describe things in the domain of interest. These definitions must be collected so that they can be expressed in a common language selected for the ontology.
Organize the ontology
Design the overall conceptual structure of the domain. This will likely involve identifying the domain's principal concrete concepts and their properties, identifying the relationships among the concepts, creating abstract concepts as organizing features, referencing or including supporting ontologies, distinguishing which concepts have instances, and applying other guidelines of your chosen methodology.
Flesh out the ontology
Add concepts, relations, and individuals to the level of detail necessary to satisfy the purposes of the ontology.
Check your work
Reconcile syntactic, logical, and semantic inconsistencies among the ontology elements. Consistency checking may also involve automatic classification that defines new concepts based on individual properties and class relationships.
Commit the ontology
Any ontology development effort should end with a final verification of the ontology by domain experts, followed by commitment of the ontology: publishing it within its intended deployment environment.
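To make the "Check your work" step more tangible, here is a simplified sketch of two checks an ontology tool might run. It is a stand-in under stated assumptions, not how any particular editor or reasoner works: the function names, the child-to-parent dictionary, and the property-based class definitions are all hypothetical. The first function flags a logical inconsistency (a class that ends up as its own ancestor); the second performs a very reduced form of automatic classification by assigning individuals to every class whose required property values they satisfy.

```python
def find_cycles(subclass_of):
    """Return the classes that are (wrongly) their own ancestors.

    `subclass_of` maps each class name to its parent class name, or None.
    """
    cyclic = set()
    for start in subclass_of:
        seen = set()
        current = subclass_of.get(start)
        while current is not None and current not in seen:
            if current == start:
                cyclic.add(start)
                break
            seen.add(current)
            current = subclass_of.get(current)
    return cyclic


def classify(individual_props, definitions):
    """Assign each individual to every defined class whose required
    property values it satisfies; class names are returned sorted."""
    result = {}
    for name, props in individual_props.items():
        result[name] = sorted(
            cls
            for cls, required in definitions.items()
            if all(props.get(key) == value for key, value in required.items())
        )
    return result


# Hypothetical data for illustration.
hierarchy = {"A": "B", "B": "A", "C": "A", "D": None}
definitions = {
    "FordAutomobile": {"make": "Ford"},
    "FordMustang": {"make": "Ford", "model": "Mustang"},
}
individuals = {
    "AXL429": {"make": "Ford", "model": "Mustang", "year": 1965},
    "XYZ123": {"make": "Ford", "model": "Falcon"},
}

problems = find_cycles(hierarchy)
memberships = classify(individuals, definitions)
```

With this data, `problems` contains the two classes caught in the loop, and `memberships` places AXL429 under both defined classes while XYZ123 falls only under the broader one. Description-logic reasoners do considerably more, including inferring relationships between the classes themselves.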
Software tools are available to accomplish most aspects of ontology development. While ontology editors are useful during each step outlined above, other types of ontology building tools are also needed along the way.
Development projects often involve solutions using numerous ontologies from external sources as well as existing and newly developed in-house ontologies. Ontologies from any source may progress through a series of versions. In the end, careful management of this heterogeneous collection becomes necessary. Tools also help to map and link between ontologies, compare them, reconcile and validate them, merge them, and convert them into other forms. Ontologies may be derived from or transformed into forms such as W3C XML Schemas, database schemas, and UML to achieve integration with associated enterprise applications.
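As one illustration of converting an ontology into another form, the sketch below derives relational table definitions from a small set of classes. The mapping (one table per class, one column per property, a surrogate `id` key) and all class and property names are assumptions made for this example; real conversion tools apply richer and more varied mappings.

```python
import sqlite3

# Hypothetical ontology fragment: class name -> {property name: SQL type}.
ontology_classes = {
    "Dealer": {"name": "TEXT"},
    "Automobile": {"model": "TEXT", "year": "INTEGER"},
}


def to_ddl(classes):
    """Emit one CREATE TABLE statement per ontology class."""
    statements = []
    for cls, props in classes.items():
        columns = ["id INTEGER PRIMARY KEY"]
        columns += [f"{prop} {sql_type}" for prop, sql_type in props.items()]
        statements.append(f"CREATE TABLE {cls} ({', '.join(columns)});")
    return statements


ddl = to_ddl(ontology_classes)

# Confirm that an in-memory SQLite database accepts the generated statements.
connection = sqlite3.connect(":memory:")
for statement in ddl:
    connection.execute(statement)

table_names = sorted(
    row[0]
    for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
)
```

A transformation like this keeps the terms and their structure but drops named relationships and class hierarchy unless the mapping is extended to carry them, which is one reason conversion tools need care.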
Still other tools can help acquire, organize, and visualize the domain knowledge before and during the building of a formal ontology.
When starting out on an ontology project, a reasonable first reaction is to look for a suitable ontology editor. It's hoped this broad summary of available editors will give prospective ontology developers a head start.