Introduction to DAML: Part I
RDF was developed by the W3C at about the same time as XML, and it turns out to be an excellent complement to XML, providing a language for modeling semi-structured metadata and enabling knowledge-management applications. The RDF core model is successful because of its simplicity. The W3C also developed a purposefully lightweight schema language, RDF Schema (RDFS), to provide basic structures such as classes and properties.
As the ambitions of RDF and XML have expanded to include things like the Semantic Web, the limitations of this lightweight schema language have become evident. Accordingly, a group set out to develop a more expressive schema language, DARPA Agent Markup Language (DAML). Although DAML is not a W3C initiative, several familiar faces from the W3C, including Tim Berners-Lee, participated in its development.
This article series introduces DAML, including practical examples and basic design principles. This first article presents basic DAML concepts and constructs, explaining the most useful modeling tools DAML puts into the designer's hands. In the next article we shall take a more in-depth look, introducing more advanced features and outlining a few useful rules of thumb for designers. Keeping the concepts straight between RDF, RDFS and DAML+OIL can be difficult, so the third article will serve as a reference of constructs, describing each, and pointing to the relevant spec where each is defined.
RDF and RDF Schema provide a basic feature set for information modeling. RDF is very similar to a basic directed graph, which is a very well understood data structure in computer science. (Note that there are some important variations RDF makes on graph theory [RDFLG]). This simplicity serves RDF very well, making it a sort of assembly language on top of which almost every other information-modeling method can be overlaid. However, users have desired more from RDF and RDF Schema, including data types, a consistent expression for enumerations, and other facilities. Logicians, some of whom see RDF as a possible tool for developing long-promised practical AI systems, have bemoaned the rather thin set of facilities provided by RDF.
In response the DARPA Agent Markup Language (DAML) [DAMLHOME] sprang from a U.S. government-sponsored effort in August 2000, which released DAML-ONT, a simple language for expressing more sophisticated RDF class definitions than permitted by RDFS. The DAML group soon pooled efforts with the Ontology Inference Layer (OIL) [OILHOME], another effort providing more sophisticated classification, using constructs from frame-based AI. The result of these efforts is DAML+OIL, a language for expressing far more sophisticated classifications and properties of resources than RDFS. The most recent release was March 2001 [DAMLOIL], which also adds facilities for data typing based on the type definitions provided in the W3C XML Schema Definition Language (XSDL).
And the W3C itself is getting into the act. This DAML+OIL specification and its relationship to RDF and RDFS are also available as a series of W3C notes [DAMLOILNOTES], which are used as the base specifications in this article and the following parts. Furthermore, the newly commissioned Web Ontology (WebONT) Working Group ([WEBONT]) has taken on the task of producing an ontology language, with DAML+OIL as its basis.
More Facilities for Properties
Let's dive right in by looking at some examples of how DAML+OIL might be used in practice. We'll use an imaginary online sports store, Super Sports Inc., as an example.
RDF Schema allows you to define classes by direct declaration:
<rdfs:Class rdf:ID="Product">
  <rdfs:label>Product</rdfs:label>
  <rdfs:comment>An item sold by Super Sports Inc.</rdfs:comment>
</rdfs:Class>
You can make similar definitions of properties:
<rdf:Property rdf:ID="productNumber">
  <rdfs:label>Product Number</rdfs:label>
  <rdfs:domain rdf:resource="#Product"/>
  <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
</rdf:Property>
You define instances of these classes by defining resources to be of the relevant RDF type, and then give them relevant properties:
<Product rdf:ID="WaterBottle">
  <rdfs:label>Water Bottle</rdfs:label>
  <productNumber>38267</productNumber>
</Product>
Data typing, multiple ranges
This is all very well, as far as it goes. The problem is that it
doesn't go very far. For one thing, based on the name, and the example
above, one would reasonably assume that the values of the
productNumber property are numbers. We did specify that
they must be literals, but literals can be any string, including those
we can't interpret as numbers. DAML+OIL allows property values to be
restricted to the data types defined in XSDL or to user-defined data
types. One does this by using a specialization of RDF properties:
<daml:DatatypeProperty rdf:ID="productNumber">
  <rdfs:label>Product Number</rdfs:label>
  <rdfs:domain rdf:resource="#Product"/>
  <rdfs:range rdf:resource="http://www.w3.org/2000/10/XMLSchema#nonNegativeInteger"/>
</daml:DatatypeProperty>
We use the "daml" prefix here to represent the DAML+OIL namespace http://www.w3.org/2001/10/daml+oil#. DAML+OIL adds constructs such as daml:DatatypeProperty where needed, and it defers to RDFS otherwise. There are some changes to the semantics of rdfs:range in DAML+OIL systems, the most important of which is that a property can have multiple ranges. Note that DAML+OIL effectively considers every literal used as the value of a property to have some data type. It's worth considering James Clark's speech at the XML 2001 conference [JJCOV], where he exhorted designers of XML-based languages not to lock users into narrow sets of data types. In DAML+OIL, any property that is not a daml:DatatypeProperty is a daml:ObjectProperty, whose range must be a class defined in DAML+OIL or RDF. This provides the sort of flexibility James Clark called for.
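To make the distinction between the two kinds of property concrete, here is a minimal sketch of a complete RDF/XML document that declares the namespaces used in this article and defines one property of each kind. The manufacturer property and the Manufacturer class are hypothetical names introduced only for illustration; they are not part of the Super Sports schema discussed above.

```xml
<?xml version="1.0"?>
<!-- Sketch only: "manufacturer" and "Manufacturer" are hypothetical names. -->
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:daml="http://www.w3.org/2001/10/daml+oil#">

  <rdfs:Class rdf:ID="Product"/>
  <rdfs:Class rdf:ID="Manufacturer"/>

  <!-- Datatype property: values are literals constrained by a data type -->
  <daml:DatatypeProperty rdf:ID="productNumber">
    <rdfs:domain rdf:resource="#Product"/>
    <rdfs:range rdf:resource="http://www.w3.org/2000/10/XMLSchema#nonNegativeInteger"/>
  </daml:DatatypeProperty>

  <!-- Object property: values are resources that are instances of a class -->
  <daml:ObjectProperty rdf:ID="manufacturer">
    <rdfs:domain rdf:resource="#Product"/>
    <rdfs:range rdf:resource="#Manufacturer"/>
  </daml:ObjectProperty>
</rdf:RDF>
```

The only difference an author sees is the element used to declare the property and the kind of resource named in rdfs:range: a data type in the first case, a class in the second.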
A property can also be defined as identical to another. DAML+OIL provides two ways to express this: with either the daml:equivalentTo or the daml:samePropertyAs property. We shall use only the latter because it has the advantage of being a subproperty of rdfs:subPropertyOf, which allows some measure of backward compatibility. Suppose that, after Super Sports Inc. has deployed its RDF-based system, an online retailer consortium comes up with a standardized vocabulary for merchandise information. If this new vocabulary uses a property called productID to indicate product number, Super Sports does not have to rewrite all its code to use the new property. Thanks to DAML+OIL, it can simply extend the definition of productNumber as follows:
<rdf:Description rdf:about="#productNumber">
  <daml:samePropertyAs rdf:resource="http://consortium-of-shoppers.org/vocab/productID"/>
</rdf:Description>
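As a rough illustration of what this equivalence buys, the sketch below shows product data written with the consortium's vocabulary instead of the Super Sports one. The "cos" prefix is an arbitrary choice made here, and the example assumes the consortium namespace is the part of the URI above that precedes productID. A DAML+OIL-aware agent that has read the samePropertyAs statement could treat this as if it were a productNumber statement.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the "cos" prefix is an assumption made for this example. -->
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:cos="http://consortium-of-shoppers.org/vocab/">

  <!-- Same product, described with the consortium's property -->
  <rdf:Description rdf:about="#WaterBottle">
    <cos:productID>38267</cos:productID>
  </rdf:Description>
</rdf:RDF>
```

Existing queries written against productNumber need not change, which is the backward compatibility the text refers to.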
One can also specify that a property be unique, meaning that there can only be one value of the property for each instance. Note that there can be more than one statement with each instance as subject and the unique property as predicate, as long as the objects of all these statements are identical. We can require that each Super Sports product have a unique product number by modifying the property definition to the following:
<daml:DatatypeProperty rdf:ID="productNumber">
  <rdfs:label>Product Number</rdfs:label>
  <rdfs:domain rdf:resource="#Product"/>
  <rdfs:range rdf:resource="http://www.w3.org/2000/10/XMLSchema#nonNegativeInteger"/>
  <rdf:type rdf:resource="http://www.w3.org/2001/10/daml+oil#UniqueProperty"/>
</daml:DatatypeProperty>
You may notice that we are stating that the productNumber property has two types: daml:DatatypeProperty and daml:UniqueProperty. While this might seem odd if you're coming from the object-oriented world, it's perfectly natural in
knowledge representation. But it's not the complete expression we want. It still permits more than one product number statement for a product, as long as all of them carry the same value. DAML+OIL also allows us to express the cardinality of a property, i.e., the number of statements of this property that can be made for a given instance. This requirement, however, is primarily associated with the class rather than the property, and we shall discuss it, along with some other infrequently used property predicates, in the next article.
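The effect of declaring productNumber unique can be sketched with two instance descriptions. The "ss" prefix, its namespace URI, and the TennisRacket resource are hypothetical and used only for illustration; how a particular agent reports the conflict is not something this article specifies.

```xml
<?xml version="1.0"?>
<!-- Sketch only: the "ss" namespace and TennisRacket are hypothetical. -->
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:ss="http://supersports.example/schema#">

  <!-- Acceptable: two statements, but both have the same object -->
  <rdf:Description rdf:about="#WaterBottle">
    <ss:productNumber>38267</ss:productNumber>
    <ss:productNumber>38267</ss:productNumber>
  </rdf:Description>

  <!-- Conflict: two different values for a unique property -->
  <rdf:Description rdf:about="#TennisRacket">
    <ss:productNumber>40112</ss:productNumber>
    <ss:productNumber>40113</ss:productNumber>
  </rdf:Description>
</rdf:RDF>
```

Note that neither description is required to have a product number at all; that kind of requirement belongs to the cardinality constraints deferred to the next article.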
Broadening the Concept of Class
The most important facilities provided by DAML+OIL are those that
give designers more expressiveness in classifying resources. The
daml:Class is defined as a subclass of
rdfs:Class, and it adds many new facilities.
One example of an added feature in daml:Class is built-in support for enumerations, a very common need in RDF design. An enumeration defines a class by giving an explicit list of its members. So for instance, "US State" is a class with 50 instances. It isn't necessarily hard to come up with a way to express enumerations in RDF. The RDFS spec even hints at the most common approach:
<rdfs:Class rdf:ID="MaritalStatus"/>
<MaritalStatus rdf:ID="Married"/>
<MaritalStatus rdf:ID="Divorced"/>
<MaritalStatus rdf:ID="Single"/>
<MaritalStatus rdf:ID="Widowed"/>
Basically, this is just the definition of the instances of the class in the schema. The main problem is that the enumeration is not closed. Anyone can merrily add items to the enumeration without restriction. In a few cases, this might not be a problem, but usually there needs to be tight control over the items in an enumeration. DAML+OIL allows you to state that a class is defined by being one of a given set of instances.
<daml:Class rdf:ID="Availability">
  <daml:oneOf rdf:parseType="daml:collection">
    <daml:Thing rdf:ID="InStock">
      <rdfs:label>In stock</rdfs:label>
    </daml:Thing>
    <daml:Thing rdf:ID="BackOrdered">
      <rdfs:label>Back ordered</rdfs:label>
    </daml:Thing>
    <daml:Thing rdf:ID="SpecialOrder">
      <rdfs:label>Special order</rdfs:label>
    </daml:Thing>
  </daml:oneOf>
</daml:Class>
The daml:oneOf element defines an enumeration, using a DAML+OIL construct that allows us to define closed lists: the daml:collection parse type. The daml:collection parse type allows a DAML+OIL agent to interpret the body of a property element as a special form of list, which is made up of each of the instances that appear in the element body. This list has a special representation in the RDF model: it is actually a set of statements that recursively breaks the list down into the first element and the sub-list consisting of the rest of the elements (when the last element is reached, the sub-list of remaining elements is a special DAML resource, daml:nil). Figure 1 illustrates this data structure. This approach should be familiar to functional language programmers, and it has the benefit that you cannot add items to this list without replacing one of the statements that make up the list, unlike RDF containers, to which elements can be added without disrupting the existing list statements.
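The first/rest structure described above can be written out by hand. The following sketch shows roughly what the Availability enumeration looks like without the daml:collection shorthand, assuming the daml:List, daml:first and daml:rest vocabulary from the DAML+OIL specification. It is meant as an aid to reading the model, not as the form you would normally author.

```xml
<?xml version="1.0"?>
<!-- Sketch only: a hand-expanded form of the three-item Availability list. -->
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:daml="http://www.w3.org/2001/10/daml+oil#">

  <daml:Class rdf:ID="Availability">
    <daml:oneOf>
      <daml:List>
        <!-- first item, then the sub-list holding the remaining items -->
        <daml:first rdf:resource="#InStock"/>
        <daml:rest>
          <daml:List>
            <daml:first rdf:resource="#BackOrdered"/>
            <daml:rest>
              <daml:List>
                <daml:first rdf:resource="#SpecialOrder"/>
                <!-- the list ends with the special resource daml:nil -->
                <daml:rest rdf:resource="http://www.w3.org/2001/10/daml+oil#nil"/>
              </daml:List>
            </daml:rest>
          </daml:List>
        </daml:rest>
      </daml:List>
    </daml:oneOf>
  </daml:Class>
</rdf:RDF>
```

Adding a fourth item means replacing the final daml:rest statement that points to daml:nil, which is why the list is effectively closed to anyone who cannot change the existing statements.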
Each item in the enumeration is defined as an instance of daml:Thing, a special DAML+OIL type which universally includes all instances of all classes. We don't define each item in the enumeration as an instance of the Availability class directly, because this would be a circular definition.
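Once the enumeration exists, it can serve as the range of a property. The sketch below assumes a hypothetical availability property and an "ss" prefix for the Super Sports schema, neither of which appears in the article, and uses the WaterBottle product from the earlier example.

```xml
<?xml version="1.0"?>
<!-- Sketch only: "availability" and the "ss" namespace are hypothetical. -->
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:daml="http://www.w3.org/2001/10/daml+oil#"
  xmlns:ss="http://supersports.example/schema#">

  <!-- Values must be members of the closed Availability enumeration -->
  <daml:ObjectProperty rdf:ID="availability">
    <rdfs:domain rdf:resource="#Product"/>
    <rdfs:range rdf:resource="#Availability"/>
  </daml:ObjectProperty>

  <!-- The water bottle is currently in stock -->
  <rdf:Description rdf:about="#WaterBottle">
    <ss:availability rdf:resource="#InStock"/>
  </rdf:Description>
</rdf:RDF>
```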