XML.com: XML From the Inside Out


Introduction to DAML: Part I

January 30, 2002

RDF was developed by the W3C at about the same time as XML, and it turns out to be an excellent complement to XML, providing a language for modeling semi-structured metadata and enabling knowledge-management applications. The RDF core model is successful because of its simplicity. The W3C also developed a purposefully lightweight schema language, RDF Schema (RDFS), to provide basic structures such as classes and properties.

As the ambitions of RDF and XML have expanded to include things like the Semantic Web, the limitations of this lightweight schema language have become evident. Accordingly, a group set out to develop a more expressive schema language, DARPA Agent Markup Language (DAML). Although DAML is not a W3C initiative, several familiar faces from the W3C, including Tim Berners-Lee, participated in its development.

This article series introduces DAML, including practical examples and basic design principles. This first article presents basic DAML concepts and constructs, explaining the most useful modeling tools DAML puts into the designer's hands. In the next article we shall take a more in-depth look, introducing more advanced features and outlining a few useful rules of thumb for designers. Keeping the concepts straight between RDF, RDFS and DAML+OIL can be difficult, so the third article will serve as a reference of constructs, describing each, and pointing to the relevant spec where each is defined.

Introducing DAML

RDF and RDF Schema provide a basic feature set for information modeling. RDF is very similar to a basic directed graph, a very well understood data structure in computer science. (Note that RDF makes some important variations on graph theory [RDFLG].) This simplicity serves RDF very well, making it a sort of assembly language on top of which almost every other information-modeling method can be overlaid. However, users have desired more from RDF and RDF Schema, including data types, a consistent expression for enumerations, and other facilities. Logicians, some of whom see RDF as a possible tool for developing long-promised practical AI systems, have bemoaned the rather thin set of facilities provided by RDF.

In response, the DARPA Agent Markup Language (DAML) [DAMLHOME] sprang from a U.S. government-sponsored effort in August 2000, which released DAML-ONT, a simple language for expressing more sophisticated RDF class definitions than permitted by RDFS. The DAML group soon pooled efforts with the Ontology Inference Layer (OIL) [OILHOME], another effort providing more sophisticated classification, using constructs from frame-based AI. The result of these efforts is DAML+OIL, a language for expressing far more sophisticated classifications and properties of resources than RDFS. The most recent release was in March 2001 [DAMLOIL]; it also adds facilities for data typing based on the type definitions provided in the W3C XML Schema Definition Language (XSDL).

And the W3C itself is getting into the act. This DAML+OIL specification and its relationship to RDF and RDFS are also available as a series of W3C notes [DAMLOILNOTES], which are used as the base specifications in this article and the following parts. Furthermore, the newly commissioned Web Ontology (WebONT) Working Group ([WEBONT]) has taken on the task of producing an ontology language, with DAML+OIL as its basis.

Readers of these articles are expected to be familiar with RDF and RDF Schema. See [RDFINTRO] and [RDFTUT] to gain some of this background.

More Facilities for Properties

Let's dive right in by looking at some examples of how DAML+OIL might be used in practice. We'll use an imaginary online sports store, Super Sports Inc., as an example.

RDF Schema allows you to define classes by direct declaration:

<rdfs:Class rdf:ID="Product">
  <rdfs:comment>An item sold by Super Sports Inc.</rdfs:comment>
</rdfs:Class>

You can make similar definitions of properties:

<rdf:Property rdf:ID="productNumber">
  <rdfs:label>Product Number</rdfs:label>
  <rdfs:domain rdf:resource="#Product"/>
  <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
</rdf:Property>

You define instances of these classes by defining resources to be of the relevant RDF type, and then give them relevant properties:

<Product rdf:ID="WaterBottle">
  <rdfs:label>Water Bottle</rdfs:label>
</Product>

Data typing, multiple ranges

This is all very well, as far as it goes. The problem is that it doesn't go very far. For one thing, based on the name, and the example above, one would reasonably assume that the values of the productNumber property are numbers. We did specify that they must be literals, but literals can be any string, including those we can't interpret as numbers. DAML+OIL allows property values to be restricted to the data types defined in XSDL or to user-defined data types. One does this by using a specialization of RDF properties: DatatypeProperty.

<daml:DatatypeProperty rdf:ID="productNumber">
  <rdfs:label>Product Number</rdfs:label>
  <rdfs:domain rdf:resource="#Product"/>
  <rdfs:range rdf:resource="http://www.w3.org/2000/10/XMLSchema#nonNegativeInteger"/>
</daml:DatatypeProperty>

We use the "daml" prefix here to represent the DAML+OIL namespace http://www.w3.org/2001/10/daml+oil#. DAML+OIL adds primitives, like DatatypeProperty, as needed, and it defers to RDFS otherwise. There are some changes to the semantics of rdfs:domain and rdfs:range in DAML+OIL systems, the most important of which is that a property can have multiple ranges. Note that DAML+OIL effectively considers every literal used as the value of a property to have some data type. It's worth considering James Clark's speech at the XML 2001 conference [JJCOV], where he exhorted designers of XML-based languages not to lock users into narrow sets of data types. In DAML+OIL, any property that is not a daml:DatatypeProperty is a daml:ObjectProperty, whose range must be a class defined in DAML+OIL or RDF. This provides the sort of flexibility James Clark calls for.
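
As a rough sketch of the other kind of property, the following declares a daml:ObjectProperty whose values are resources rather than literals. The manufacturer property and the Manufacturer class are hypothetical names invented for this illustration; they are not part of the Super Sports examples. The enclosing rdf:RDF element shows the namespace declarations that the snippets in this article assume.

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:daml="http://www.w3.org/2001/10/daml+oil#">

  <!-- The Product class from the earlier example -->
  <rdfs:Class rdf:ID="Product"/>

  <!-- Hypothetical class, introduced only for this sketch -->
  <rdfs:Class rdf:ID="Manufacturer"/>

  <!-- Hypothetical property: its range is a class, not a data type -->
  <daml:ObjectProperty rdf:ID="manufacturer">
    <rdfs:label>Manufacturer</rdfs:label>
    <rdfs:domain rdf:resource="#Product"/>
    <rdfs:range rdf:resource="#Manufacturer"/>
  </daml:ObjectProperty>

</rdf:RDF>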

A property can also be defined as identical to another. DAML+OIL provides two ways to express this: with either the daml:equivalentTo or daml:samePropertyAs property. We shall use only the latter because it has the advantage of being subclassed from rdfs:subPropertyOf, which allows some measure of backward compatibility. Suppose that, after Super Sports Inc. has deployed its RDF-based system, an online retailer consortium comes up with a standardized vocabulary for merchandise information. If this new vocabulary uses a property called productID to indicate product number, Super Sports does not have to hack all their code to use the new property. Thanks to DAML+OIL, they can simply extend the definition of productNumber as follows:

<rdf:Description rdf:about="#productNumber">
  <daml:samePropertyAs rdf:resource="http://consortium-of-shoppers.org/vocab/productID"/>
</rdf:Description>
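
To see what this buys, consider a description written in the consortium's vocabulary. In the sketch below, the "cs" prefix is our own choice, bound to the namespace of the productID property above, and the value 1234 is made up for illustration. A DAML+OIL-aware agent that knows about the samePropertyAs statement could treat this as if it had been written with productNumber.

<Product rdf:about="#WaterBottle"
    xmlns:cs="http://consortium-of-shoppers.org/vocab/">
  <!-- cs:productID expands to the URI named in the samePropertyAs statement -->
  <cs:productID>1234</cs:productID>
</Product>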

Unique properties

One can also specify that a property be unique, meaning that there can only be one value of the property for each instance. Note that there can be more than one statement with each instance as subject and the unique property as predicate, as long as the objects of all these statements are identical. We can require that each Super Sports product have a unique product number by modifying the property definition to the following:

<daml:DatatypeProperty rdf:ID="productNumber">
  <rdfs:label>Product Number</rdfs:label>
  <rdfs:domain rdf:resource="#Product"/>
  <rdfs:range rdf:resource="http://www.w3.org/2000/10/XMLSchema#nonNegativeInteger"/>
  <rdf:type rdf:resource="http://www.w3.org/2001/10/daml+oil#UniqueProperty"/>
</daml:DatatypeProperty>

You may notice that we are stating that the productNumber property has two types: daml:DatatypeProperty and daml:UniqueProperty. While this might seem odd if you're coming from the object-oriented world, it's perfectly natural in knowledge representation. But it's not the complete expression we want. It still permits a product to have more than one productNumber statement, as long as they all give the same value, and it does not by itself require a product to have a product number at all. DAML+OIL also allows us to express the cardinality of a property, i.e. the number of statements of this property that can be made for a given instance. This requirement, however, is primarily associated with the class rather than the property, and we shall discuss it, along with some other property predicates used infrequently, in the next article.
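
The uniqueness rule is easiest to see with instance data. The following sketch uses made-up product numbers and assumes productNumber has been declared a daml:UniqueProperty. The first description is acceptable, since both statements carry the same value. The second is the kind of description that a DAML+OIL-aware agent could flag as inconsistent.

<!-- Acceptable: two statements, one value -->
<Product rdf:about="#WaterBottle">
  <productNumber>1234</productNumber>
  <productNumber>1234</productNumber>
</Product>

<!-- Conflicts with uniqueness: two different values for one product -->
<Product rdf:about="#WaterBottle">
  <productNumber>1234</productNumber>
  <productNumber>5678</productNumber>
</Product>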

Broadening the Concept of Class

The most important facilities provided by DAML+OIL are those that give designers more expressiveness in classifying resources. The class daml:Class is defined as a subclass of rdfs:Class, and it adds many new facilities.

One example of an added feature in daml:Class is built-in support for enumerations, a very common need in RDF design. An enumeration defines a class by giving an explicit list of its members. So for instance, "US State" is a class with 50 instances. It isn't necessarily hard to come up with a way to express enumerations in RDF. The RDFS spec even hints at the most common approach:

<rdfs:Class rdf:ID="MaritalStatus"/>

<MaritalStatus rdf:ID="Married"/>
<MaritalStatus rdf:ID="Divorced"/>
<MaritalStatus rdf:ID="Single"/>
<MaritalStatus rdf:ID="Widowed"/>

Basically, this is just the definition of the instances of the class in the schema. The main problem is that the enumeration is not closed. Anyone can merrily add items to the enumeration without restriction. In a few cases, this might not be a problem, but usually there needs to be tight control over the items in an enumeration. DAML+OIL allows you to state that a class is defined by being one of a given set of instances.

<daml:Class rdf:ID="Availability">
  <daml:oneOf rdf:parseType="daml:collection">
    <daml:Thing rdf:ID="InStock">
      <rdfs:label>In stock</rdfs:label>
    </daml:Thing>
    <daml:Thing rdf:ID="BackOrdered">
      <rdfs:label>Back ordered</rdfs:label>
    </daml:Thing>
    <daml:Thing rdf:ID="SpecialOrder">
      <rdfs:label>Special order</rdfs:label>
    </daml:Thing>
  </daml:oneOf>
</daml:Class>

The daml:oneOf element defines an enumeration, using a DAML+OIL construct that allows us to define closed lists, the daml:collection parse type.

The daml:collection parse type allows a DAML+OIL agent to interpret the body of a property element as a special form of list, which is made up of each of the instances that appear in the element body. This list has a special representation in the RDF model: it is actually a set of statements that recursively breaks the list down into the first element and the sub-list consisting of the rest of the elements (when the last element is reached, the sublist of remaining elements is a special DAML resource daml:nil). Figure 1 illustrates this data structure. This approach should be familiar to functional language programmers, and it has the benefit that you cannot add items to this list without replacing one of the statements that make up the list, unlike RDF containers, to which elements can be added without disrupting the existing list statements.

Figure 1: Structure of a DAML+OIL list
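
For readers who prefer markup to diagrams, the same structure can be written out by hand. The following is a sketch of the longhand that the daml:collection shorthand stands for, using the daml:List, daml:first, daml:rest and daml:nil terms from the DAML+OIL namespace. It assumes the three availability instances are declared elsewhere, and it is an alternative spelling of the Availability enumeration, not something to add alongside the shorthand version.

<daml:Class rdf:ID="Availability">
  <daml:oneOf>
    <daml:List>
      <!-- first item, then the rest of the list -->
      <daml:first rdf:resource="#InStock"/>
      <daml:rest>
        <daml:List>
          <daml:first rdf:resource="#BackOrdered"/>
          <daml:rest>
            <daml:List>
              <daml:first rdf:resource="#SpecialOrder"/>
              <!-- daml:nil marks the end of the list -->
              <daml:rest rdf:resource="http://www.w3.org/2001/10/daml+oil#nil"/>
            </daml:List>
          </daml:rest>
        </daml:List>
      </daml:rest>
    </daml:List>
  </daml:oneOf>
</daml:Class>

Adding a fourth item would mean replacing the final daml:rest statement, which is why the list is effectively closed.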

Each item in the enumeration is defined as an instance of daml:Thing, a special DAML+OIL type which universally includes all instances of all Classes. We don't define each item in the enumeration as an instance of the Availability class directly, because this would be a circular definition.
