Introduction to DAML: Part I

January 30, 2002

Roxane Ouellet and Uche Ogbuji

RDF was developed by the W3C at about the same time as XML, and it turns out to be an excellent complement to XML, providing a language for modeling semi-structured metadata and enabling knowledge-management applications. The RDF core model is successful because of its simplicity. The W3C also developed a purposefully lightweight schema language, RDF Schema (RDFS), to provide basic structures such as classes and properties.

As the ambitions of RDF and XML have expanded to include things like the Semantic Web, the limitations of this lightweight schema language have become evident. Accordingly, a group set out to develop a more expressive schema language, DARPA Agent Markup Language (DAML). Although DAML is not a W3C initiative, several familiar faces from the W3C, including Tim Berners-Lee, participated in its development.

This article series introduces DAML, including practical examples and basic design principles. This first article presents basic DAML concepts and constructs, explaining the most useful modeling tools DAML puts into the designer's hands. In the next article we shall take a more in-depth look, introducing more advanced features and outlining a few useful rules of thumb for designers. Keeping the concepts straight between RDF, RDFS and DAML+OIL can be difficult, so the third article will serve as a reference of constructs, describing each, and pointing to the relevant spec where each is defined.

Introducing DAML

RDF and RDFS provide a basic feature set for information modeling. RDF is very similar to a basic directed graph, which is a very well understood data structure in computer science. (Note that RDF makes some important variations on graph theory [RDFLG].) This simplicity serves RDF very well, making it a sort of assembly language on top of which almost every other information-modeling method can be overlaid. However, users have desired more from RDF and RDF Schema, including data types, a consistent expression for enumerations, and other facilities. Logicians, some of whom see RDF as a possible tool for developing long-promised practical AI systems, have bemoaned the rather thin set of facilities provided by RDF.

In response, the DARPA Agent Markup Language (DAML) [DAMLHOME] sprang from a U.S. government-sponsored effort in August 2000, which released DAML-ONT, a simple language for expressing more sophisticated RDF class definitions than permitted by RDFS. The DAML group soon pooled efforts with the Ontology Inference Layer (OIL) [OILHOME], another effort providing more sophisticated classification, using constructs from frame-based AI. The result of these efforts is DAML+OIL, a language for expressing far more sophisticated classifications and properties of resources than RDFS. The most recent release, from March 2001 [DAMLOIL], also adds facilities for data typing based on the type definitions provided in the W3C XML Schema Definition Language (XSDL).

And the W3C itself is getting into the act. This DAML+OIL specification and its relationship to RDF and RDFS are also available as a series of W3C notes [DAMLOILNOTES], which are used as the base specifications in this article and the following parts. Furthermore, the newly commissioned Web Ontology (WebONT) Working Group ([WEBONT]) has taken on the task of producing an ontology language, with DAML+OIL as its basis.

Readers of these articles are expected to be familiar with RDF and RDF Schema. See [RDFINTRO] and [RDFTUT] to gain some of this background.

More Facilities for Properties

Let's dive right in by looking at some examples of how DAML+OIL might be used in practice. We'll use an imaginary online sports store, Super Sports Inc. as an example.

RDF Schema allows you to define classes by direct declaration:


<rdfs:Class rdf:ID="Product">

  <rdfs:label>Product</rdfs:label>

  <rdfs:comment>An item sold by Super Sports Inc.</rdfs:comment>

</rdfs:Class>

You can make similar definitions of properties:


<rdf:Property rdf:ID="productNumber">

  <rdfs:label>Product Number</rdfs:label>

  <rdfs:domain rdf:resource="#Product"/>

  <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>

</rdf:Property>

You define instances of these classes by defining resources to be of the relevant RDF type, and then give them relevant properties:


<Product rdf:ID="WaterBottle">

  <rdfs:label>Water Bottle</rdfs:label>

  <productNumber>38267</productNumber>

</Product>

Data typing, multiple ranges

This is all very well, as far as it goes. The problem is that it doesn't go very far. For one thing, based on the name, and the example above, one would reasonably assume that the values of the productNumber property are numbers. We did specify that they must be literals, but literals can be any string, including those we can't interpret as numbers. DAML+OIL allows property values to be restricted to the data types defined in XSDL or to user-defined data types. One does this by using a specialization of RDF properties: DatatypeProperty.


<daml:DatatypeProperty rdf:ID="productNumber">

  <rdfs:label>Product Number</rdfs:label>

  <rdfs:domain rdf:resource="#Product"/>

  <rdfs:range rdf:resource="http://www.w3.org/2000/10/XMLSchema#nonNegativeInteger"/>

</daml:DatatypeProperty>

We use the "daml" prefix here to represent the DAML+OIL namespace http://www.w3.org/2001/10/daml+oil#. DAML+OIL adds primitives, like DatatypeProperty, as needed, and it defers to RDFS otherwise. There are some changes to the semantics of rdfs:domain and rdfs:range in DAML+OIL systems, the most important of which is that a property can have multiple ranges. Note that DAML+OIL effectively considers every literal used as the value of a property to have some data type. It's worth considering James Clark's speech at the XML 2001 conference [JJCONV], where he exhorted designers of XML-based languages not to lock users into narrow sets of data types. In DAML+OIL, any property that is not a daml:DatatypeProperty is a daml:ObjectProperty, whose range must be a class defined in DAML+OIL or RDF. This provides the sort of flexibility James Clark calls for.
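To make the distinction concrete, here is a minimal sketch that places an object property next to the data-type property shown above. The soldIn property and the Department class are assumptions made up for this illustration; only productNumber comes from the running example.

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:daml="http://www.w3.org/2001/10/daml+oil#">

  <daml:Class rdf:ID="Product"/>

  <!-- Hypothetical class, used here only as the range of soldIn -->
  <daml:Class rdf:ID="Department"/>

  <!-- Object property: its values are resources that belong to a class -->
  <daml:ObjectProperty rdf:ID="soldIn">
    <rdfs:label>Sold in</rdfs:label>
    <rdfs:domain rdf:resource="#Product"/>
    <rdfs:range rdf:resource="#Department"/>
  </daml:ObjectProperty>

  <!-- Data-type property: its values are literals of an XSDL type -->
  <daml:DatatypeProperty rdf:ID="productNumber">
    <rdfs:label>Product Number</rdfs:label>
    <rdfs:domain rdf:resource="#Product"/>
    <rdfs:range rdf:resource="http://www.w3.org/2000/10/XMLSchema#nonNegativeInteger"/>
  </daml:DatatypeProperty>

</rdf:RDF>
```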

A property can also be defined as identical to another. DAML+OIL provides two ways to express this: with either the daml:equivalentTo or daml:samePropertyAs property. We shall use only the latter because it has the advantage of being subclassed from rdfs:subPropertyOf, which allows some measure of backward compatibility. Suppose that, after Super Sports Inc. has deployed its RDF-based system, an online retailer consortium comes up with a standardized vocabulary for merchandise information. If this new vocabulary uses a property called productID to indicate product number, Super Sports does not have to hack all their code to use the new property. Thanks to DAML+OIL, they can simply extend the definition of productNumber as follows:


<rdf:Description rdf:about="#productNumber">

  <daml:samePropertyAs rdf:resource="http://consortium-of-shoppers.org/vocab/productID"/>

</rdf:Description>
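As a rough illustration of what this equivalence buys, the sketch below shows an existing statement and the statement that a DAML+OIL-aware agent could treat as equally true. The ss and cs prefixes and the base URI are assumptions chosen for this example; whether an agent actually materializes the second statement depends on the tool.

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:ss="http://rdfinference.org/eg/supersports/metadata#"
  xmlns:cs="http://consortium-of-shoppers.org/vocab/"
  xml:base="http://rdfinference.org/eg/supersports/metadata">

  <!-- Existing Super Sports data, written with the original property -->
  <rdf:Description rdf:ID="WaterBottle">
    <ss:productNumber>38267</ss:productNumber>
  </rdf:Description>

  <!-- The same fact in the consortium vocabulary. Given the
       samePropertyAs statement, an agent may conclude this
       without anyone having to rewrite the data. -->
  <rdf:Description rdf:about="#WaterBottle">
    <cs:productID>38267</cs:productID>
  </rdf:Description>

</rdf:RDF>
```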

Unique properties

One can also specify that a property be unique, meaning that there can only be one value of the property for each instance. Note that there can be more than one statement with each instance as subject and the unique property as predicate, as long as the objects of all these statements are identical. We can require that each Super Sports product have a unique product number by modifying the property definition to the following:


<daml:DatatypeProperty rdf:ID="productNumber">

  <rdfs:label>Product Number</rdfs:label>

  <rdfs:domain rdf:resource="#Product"/>

  <rdfs:range rdf:resource="http://www.w3.org/2000/10/XMLSchema#nonNegativeInteger"/>

  <rdf:type rdf:resource="http://www.w3.org/2001/10/daml+oil#UniqueProperty"/>

</daml:DatatypeProperty>

You may notice that we are stating that the productNumber property has two types: daml:DatatypeProperty and daml:UniqueProperty. While this might seem odd if you're coming from the object-oriented world, it's perfectly natural in knowledge representation. But it's not the complete expression we want. It limits each product to at most one product number, but it does not require a product to have one at all. DAML+OIL also allows us to express the cardinality of a property, i.e., the number of statements of this property that can be made for a given instance. This requirement, however, is primarily associated with the class rather than the property, and we shall discuss it, along with some other infrequently used property predicates, in the next article.
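The following sketch shows instance data that agrees with a unique productNumber and instance data that conflicts with it. The IceAxe instance, its product numbers, the ss prefix, and the base URI are assumptions invented for this illustration.

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:ss="http://rdfinference.org/eg/supersports/metadata#"
  xml:base="http://rdfinference.org/eg/supersports/metadata">

  <!-- Consistent: the statement is repeated, but the value is identical -->
  <ss:Product rdf:ID="WaterBottle">
    <ss:productNumber>38267</ss:productNumber>
    <ss:productNumber>38267</ss:productNumber>
  </ss:Product>

  <!-- Conflicting: two different values for a unique property
       on the same instance (hypothetical product) -->
  <ss:Product rdf:ID="IceAxe">
    <ss:productNumber>40001</ss:productNumber>
    <ss:productNumber>40002</ss:productNumber>
  </ss:Product>

</rdf:RDF>
```

How a given tool reacts to the second case (rejecting the data, flagging an inconsistency, or ignoring it) is up to the tool.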

Broadening the Concept of Class

The most important facilities provided by DAML+OIL are those that give designers more expressiveness in classifying resources. The class daml:Class is defined as a subclass of rdfs:Class, and it adds many new facilities.

One example of an added feature in daml:Class is built-in support for enumerations, a very common need in RDF design. An enumeration defines a class by giving an explicit list of its members. So for instance, "US State" is a class with 50 instances. It isn't necessarily hard to come up with a way to express enumerations in RDF. The RDFS spec even hints at the most common approach:


<rdfs:Class rdf:ID="MaritalStatus"/>



<MaritalStatus rdf:ID="Married"/>

<MaritalStatus rdf:ID="Divorced"/>

<MaritalStatus rdf:ID="Single"/>

<MaritalStatus rdf:ID="Widowed"/>

Basically, this is just the definition of the instances of the class in the schema. The main problem is that the enumeration is not closed. Anyone can merrily add items to the enumeration without restriction. In a few cases, this might not be a problem, but usually there needs to be tight control over the items in an enumeration. DAML+OIL allows you to state that a class is defined by being one of a given set of instances.


<daml:Class rdf:ID="Availability">

  <daml:oneOf rdf:parseType="daml:collection">

    <daml:Thing rdf:ID="InStock">

      <rdfs:label>In stock</rdfs:label>

    </daml:Thing>

    <daml:Thing rdf:ID="BackOrdered">

      <rdfs:label>Back ordered</rdfs:label>

    </daml:Thing>

    <daml:Thing rdf:ID="SpecialOrder">

      <rdfs:label>Special order</rdfs:label>

    </daml:Thing>

  </daml:oneOf>

</daml:Class>

The daml:oneOf element defines an enumeration, using a DAML+OIL construct that allows us to define closed lists, the daml:collection parse type.

The daml:collection parse type allows a DAML+OIL agent to interpret the body of a property element as a special form of list, which is made up of each of the instances that appear in the element body. This list has a special representation in the RDF model: it is actually a set of statements that recursively breaks the list down into the first element and the sub-list consisting of the rest of the elements (when the last element is reached, the sublist of remaining elements is a special DAML resource daml:nil). Figure 1 illustrates this data structure. This approach should be familiar to functional language programmers, and it has the benefit that you cannot add items to this list without replacing one of the statements that make up the list, unlike RDF containers, to which elements can be added without disrupting the existing list statements.

Figure 1: structure of a DAML+OIL list
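The same structure can be written out by hand. The sketch below is the longhand form of the Availability enumeration, using the DAML+OIL list vocabulary (daml:List, daml:first, daml:rest, and daml:nil) in place of the daml:collection shorthand. It assumes the three instances are declared elsewhere in the same document.

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:daml="http://www.w3.org/2001/10/daml+oil#">

  <daml:Class rdf:ID="Availability">
    <daml:oneOf>
      <!-- Each list node has a first item and the rest of the list -->
      <daml:List>
        <daml:first rdf:resource="#InStock"/>
        <daml:rest>
          <daml:List>
            <daml:first rdf:resource="#BackOrdered"/>
            <daml:rest>
              <daml:List>
                <daml:first rdf:resource="#SpecialOrder"/>
                <!-- The end of the list is marked by daml:nil -->
                <daml:rest rdf:resource="http://www.w3.org/2001/10/daml+oil#nil"/>
              </daml:List>
            </daml:rest>
          </daml:List>
        </daml:rest>
      </daml:List>
    </daml:oneOf>
  </daml:Class>

</rdf:RDF>
```

Adding a fourth item would mean replacing the statement that points the last daml:rest at daml:nil, which is what keeps the enumeration closed.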

Each item in the enumeration is defined as an instance of daml:Thing, a special DAML+OIL type which universally includes all instances of all Classes. We don't define each item in the enumeration as an instance of the Availability class directly, because this would be a circular definition.
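An enumeration becomes useful once a property draws its values from it. The sketch below assumes a hypothetical availability property, along with an ss prefix and base URI chosen for this example, and assumes the Product and Availability classes are defined as shown earlier.

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:daml="http://www.w3.org/2001/10/daml+oil#"
  xmlns:ss="http://rdfinference.org/eg/supersports/metadata#"
  xml:base="http://rdfinference.org/eg/supersports/metadata">

  <!-- Hypothetical property whose values must come from the enumeration -->
  <daml:ObjectProperty rdf:ID="availability">
    <rdfs:label>Availability</rdfs:label>
    <rdfs:domain rdf:resource="#Product"/>
    <rdfs:range rdf:resource="#Availability"/>
  </daml:ObjectProperty>

  <!-- The value is one of the enumerated instances -->
  <ss:Product rdf:ID="WaterBottle">
    <ss:availability rdf:resource="#InStock"/>
  </ss:Product>

</rdf:RDF>
```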

The Little Shop of Knowledge

Let us look at an example illustrating the DAML features we have learned so far. We've acquired a trendy new online store for sports products, Super Sports. It's not enough to have cutting-edge merchandise; we also want to use the best technology for running the store, and so we shall come up with a DAML+OIL system for describing and classifying the products we sell.

The first step is to define a user-defined type that we wish to use in the product descriptions. The following listing is an XML schema definition for our data type.


<?xml version="1.0" encoding="UTF-8"?>

<xsd:schema

  targetNamespace="http://rdfinference.org/eg/supersports/dt"

  xmlns:dt="http://rdfinference.org/eg/supersports/dt"

  xmlns:xsd="http://www.w3.org/2001/XMLSchema"

>



  <!-- Pack capacity (in liters)-->

  <xsd:simpleType name="packCapacity">

    <xsd:restriction base="xsd:positiveInteger">

      <xsd:maxExclusive value="50"/>

    </xsd:restriction>

  </xsd:simpleType>

</xsd:schema>

We define a back pack's capacity as a positive integer below 50, that is, a value from 1 to 49, using a restriction on the core positive integer type from XSDL. Note that xsd:maxExclusive excludes the bound itself.

You can see how we use this customized data type, and other constructs we've introduced, in the following DAML+OIL schema for the Super Sports product catalog.


<?xml version="1.0" encoding="UTF-8"?>



<rdf:RDF

  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 

  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" 

  xmlns:daml="http://www.w3.org/2001/10/daml+oil#" 

  xmlns:dt="http://rdfinference.org/eg/supersports/dt"

  xmlns:ss="http://rdfinference.org/eg/supersports/metadata#"

  xmlns:xsd="http://www.w3.org/2000/10/XMLSchema#" 

  xml:base="http://rdfinference.org/eg/supersports/metadata"

>



  <daml:Ontology rdf:about="">

    <daml:versionInfo>1.0</daml:versionInfo>

    <rdfs:comment>An ontology of Super Sports Inc. store products

    </rdfs:comment>

    <daml:imports rdf:resource="http://www.w3.org/2001/10/daml+oil"/>

  </daml:Ontology>

  

  <daml:Class rdf:ID="Product">

    <rdfs:label>Product</rdfs:label>

    <rdfs:comment>An item sold by Super Sports Inc.</rdfs:comment>

  </daml:Class>

  

  <daml:Class rdf:ID="Department">

    <rdfs:label>Department</rdfs:label>

    <rdfs:comment>A Super Sports Inc. department</rdfs:comment>

  </daml:Class>

  

  <!-- *****************SIMPLE INHERITANCE***************** -->

  

  <daml:Class rdf:ID="Tool">

    <rdfs:label>Tool</rdfs:label>

    <rdfs:comment>Tools used in sports, 

        ice axe for instance.</rdfs:comment>

    <rdfs:subClassOf rdf:resource="#Product"/>

  </daml:Class>

  

  <daml:Class rdf:ID="Shoe">

    <rdfs:label>Shoe</rdfs:label>

    <rdfs:subClassOf rdf:resource="#Product"/>

  </daml:Class>

  

  <daml:Class rdf:ID="SleepingBag">

    <rdfs:label>Sleeping Bag</rdfs:label>

    <rdfs:subClassOf rdf:resource="#Product"/>

  </daml:Class>

  

  <daml:Class rdf:ID="BackPack">

    <rdfs:label>Back Pack</rdfs:label>

    <rdfs:subClassOf rdf:resource="#Product"/>

  </daml:Class>

  

  <!-- *****************ENUMERATION*****************-->

  

  <daml:Class rdf:ID="Activity">

    <rdfs:label>Activity</rdfs:label>

    <rdfs:comment>A sport activity</rdfs:comment>

    <daml:oneOf rdf:parseType="daml:collection">

      <daml:Thing rdf:ID="Hiking">

        <rdfs:label>Hiking</rdfs:label>

      </daml:Thing>

      <daml:Thing rdf:ID="Travel">

        <rdfs:label>Travel</rdfs:label>

      </daml:Thing>

      <daml:Thing rdf:ID="Camping">

        <rdfs:label>Camping</rdfs:label>

      </daml:Thing>

     <daml:Thing rdf:ID="Mountaineering">

        <rdfs:label>Mountaineering</rdfs:label>

      </daml:Thing>

    </daml:oneOf>

  </daml:Class>

  

  <daml:Class rdf:ID="Availability">

    <rdfs:label>Availability</rdfs:label>

    <rdfs:comment>The availability of a product</rdfs:comment>

    <daml:oneOf rdf:parseType="daml:collection">

      <daml:Thing rdf:ID="InStock">

        <rdfs:label>In stock</rdfs:label>

      </daml:Thing>

      <daml:Thing rdf:ID="BackOrdered">

        <rdfs:label>Back ordered</rdfs:label>

      </daml:Thing>

      <daml:Thing rdf:ID="SpecialOrder">

        <rdfs:label>Special order</rdfs:label>

      </daml:Thing>

    </daml:oneOf>

  </daml:Class>

  

  <!-- *****************DATATYPE PROPERTY*****************-->

  

  <daml:DatatypeProperty rdf:ID="productNumber">

    <rdfs:label>Product Number</rdfs:label>

    <daml:samePropertyAs rdf:resource="http://rosettanet.org/FundamentalBusinessDataEntities#ProprietaryProductIdentifier"/>

    <rdfs:domain rdf:resource="#Product"/>

    <rdfs:range rdf:resource="http://www.w3.org/2000/10/XMLSchema#nonNegativeInteger"/>

    <rdf:type rdf:resource="http://www.w3.org/2001/10/daml+oil#UniqueProperty"/>

  </daml:DatatypeProperty>

  

  <daml:DatatypeProperty rdf:ID="packCapacity">

    <rdfs:label>capacity</rdfs:label>

    <rdfs:comment>The capacity of a back pack</rdfs:comment>

    <rdfs:domain rdf:resource="#BackPack"/>

    <rdfs:range rdf:resource="http://rdfinference.org/eg/supersports/dt#packCapacity"/>

  </daml:DatatypeProperty>

  

  <!-- *****************OBJECT PROPERTY*****************-->

  

  <daml:ObjectProperty rdf:ID="usedFor">

    <rdfs:label>usedFor</rdfs:label>

    <rdfs:comment>The activity for which a product is used</rdfs:comment>

    <rdfs:domain rdf:resource="#Product"/>

    <rdfs:range rdf:resource="#Activity"/>

  </daml:ObjectProperty>

  

  <!-- *****************INSTANCE***************** -->



  <ss:BackPack rdf:ID="ReadyRuck">

    <rdfs:label>Ready Ruck back pack</rdfs:label>

    <rdfs:comment>The ideal pack for your most rugged adventures</rdfs:comment>

    <ss:productNumber>23456</ss:productNumber>

    <ss:packCapacity>45</ss:packCapacity>

    <ss:usedFor rdf:resource="#Hiking"/>

  </ss:BackPack>

  

</rdf:RDF>

At the top level of the document is the RDF envelope element, as in most RDF/XML files. We define special namespaces to be used to construct URIs in our descriptions. Also note that we use an xml:base attribute [XMLBASE] to set the base URI of this document. This affects the actual URI to which RDF IDs are mapped in the model. For instance, given our declared base URI, rdf:ID="Product" yields a resource with URI http://rdfinference.org/eg/supersports/metadata#Product. This reduces the element of surprise when an RDF/XML document is retrieved from different URIs. The problem with this facility is that not all RDF or even XML processors support XML Base, which is a recent W3C Recommendation. We recommend for RDF usage in general that either XML Base is always used in documents with RDF IDs or that only rdf:about with fully resolved URIs is used.
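For comparison, here is a minimal sketch of the second option: no xml:base and no rdf:ID, only rdf:about with a fully resolved URI. It describes the same Product resource as in our schema, and the resulting URI does not depend on where the document is retrieved from or on whether the processor supports XML Base.

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:daml="http://www.w3.org/2001/10/daml+oil#">

  <!-- The full URI is spelled out, so no base URI is needed -->
  <daml:Class rdf:about="http://rdfinference.org/eg/supersports/metadata#Product">
    <rdfs:label>Product</rdfs:label>
    <rdfs:comment>An item sold by Super Sports Inc.</rdfs:comment>
  </daml:Class>

</rdf:RDF>
```

The trade-off is verbosity: every reference to the class elsewhere in the document must repeat the full URI as well.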

Next up is the DAML+OIL header. This provides certain metadata for the ontology itself. In many cases, you should just be able to cut and paste our example header, changing the few fields as necessary. By having an empty rdf:about, we are treating the document itself as the resource, identified by the XML Base we declared. daml:versionInfo is an arbitrary value that describes versioning information for the document. It can be a simple string, as we have it here, or even a complex RDF or literal XML structure (such as revhistory from DocBook [DOCBOOK]).

The daml:imports property allows you to incorporate another RDF model into the current one. This is similar to xsl:include in XSLT rather than xsl:import, because there is really no concept of import precedence in DAML+OIL. It is conventional to import the DAML+OIL specification itself, as we do, although the specifications are not clear on whether this is necessary (i.e., whether such an import is implied).
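A header can carry more than one import. The sketch below adds a second daml:imports for the made-up consortium vocabulary mentioned earlier; that URL is purely illustrative and is not part of our actual schema.

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:daml="http://www.w3.org/2001/10/daml+oil#"
  xml:base="http://rdfinference.org/eg/supersports/metadata">

  <daml:Ontology rdf:about="">
    <daml:versionInfo>1.0</daml:versionInfo>
    <!-- The conventional import of DAML+OIL itself -->
    <daml:imports rdf:resource="http://www.w3.org/2001/10/daml+oil"/>
    <!-- Hypothetical second import: the consortium vocabulary -->
    <daml:imports rdf:resource="http://consortium-of-shoppers.org/vocab/"/>
  </daml:Ontology>

</rdf:RDF>
```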

Next we define a few simple classes, which should hold no surprises at this point. Then we define an enumeration of activities associated with Super Sports products and an enumeration of product availability codes. We next define some properties, using standard XSD data types, as well as the custom type we defined. One of these properties, productNumber, is a unique property: each product can only have one product number. Again, we define the product number as semantically equivalent to another property, the ProprietaryProductIdentifier from RosettaNet [ROSETTANET], an organization for standardization of business-to-business interchange. Note that this is just a made-up ID for the actual entry from the RosettaNet business dictionary.

And, finally, we define a single instance, as an example. It shows the use of a data-type property and an object property with an enumerated range.

More to come

In the next part of this series, we shall look at more complex DAML+OIL features, including restrictions, which are a fundamental provision of the language.

References

DAMLHOME: The DARPA Agent Markup Language (DAML) Program home page

DAMLOIL: DAML+OIL (March 2001)

DAMLOILNOTES-1, DAMLOILNOTES-2, DAMLOILNOTES-3, DAMLOILNOTES-4: A series of notes covering DAML+OIL as W3C technical reports

DOCBOOK: The home page of the DocBook Technical Committee

JJCONV: Five challenges for XML, a capsule of James Clark's keynote at the XML 2001 conference

OILHOME: The Ontology Inference Layer (OIL) home page

RDFHOME: Resource Description Framework (RDF): W3C Home Page

RDFINTRO: An Introduction to RDF, by Uche Ogbuji

RDFLG: Peter F. Patel-Schneider clarifies the common formula that RDF defines simple edge-labeled graphs

RDFSPEC: RDF Model and Syntax Specification, W3C Recommendation, 22 February 1999

RDFSSPEC: RDF Schema Specification 1.0 (W3C Candidate Recommendation, 27 March 2000)

RDFTUT: Pierre-Antoine Champin's RDF Tutorial

ROSETTANET: The RosettaNet home page

WEBONT: The W3C Web-Ontology (WebOnt) Working Group home page

XMLBASE: XML Base (W3C Recommendation, 27 June 2001)