Introducing SKOS
SKOS (Simple Knowledge Organization System), recently introduced by the W3C, is a model for expressing knowledge organization systems in a machine-understandable way, within the framework of the Semantic Web. The SKOS Core Vocabulary is an RDF (Resource Description Framework) application. Using RDF allows data to be linked and merged with other RDF data by Semantic Web applications. SKOS Core provides a model for expressing the basic structure and content of concept schemes, including thesauri, classification schemes, subject heading lists, taxonomies, terminologies, glossaries, and other types of controlled vocabulary. This article provides some examples of using SKOS and discusses the general principles of building such knowledge bases.
The Semantic Web is a vision for the future of the Web in which information is given explicit meaning, making it easier for machines to process and integrate information available on the Web. The Semantic Web relies on XML's ability to define customized tagging schemes and RDF's flexible approach to representing data. The next element required for the Semantic Web is OWL, the Web Ontology Language, which can formally describe the semantics of classes and properties used in Web documents, most commonly by means of a logical formalism known as Description Logic.
OWL adds a layer of expressive power to RDF and provides powerful tools for defining complex conceptual structures, which can be used to generate, among other things, rich metadata. However, the class-oriented, logically precise modeling required to construct useful web ontologies is demanding in terms of expertise, effort, and therefore cost. In many cases this type of modeling may be unnecessary or unsuited to requirements. So there is a need for a language to express vocabularies of concepts for use in semantically rich metadata, which is powerful enough to support semantically enhanced search, but simple enough to be undemanding in terms of the cost and expertise required to use it.
The SKOS Core Vocabulary is a set of RDF properties and RDFS classes that can be used to express the content and structure of a concept scheme as an RDF graph.
As an example of the kind of structure SKOS was designed to represent, let's look at the definition of the word "canals" from the Alexandria Digital Library Thesaurus:
canals: A feature type category for places such as the Erie Canal.
Used for: canal bends, canalized streams, ditch mouths, ditches, drainage canals, drainage ditches (the category "canals" is used instead of any of these terms).
Broader Terms: hydrographic structures (canals is a sub-type of "hydrographic structures").
Related Terms: channels, locks, transportation features, tunnels (other categories related to canals through non-hierarchical relationships).
Scope Note: Manmade waterway used by watercraft or for drainage, irrigation, mining, or water power.
Now let's represent this complex structure using the SKOS Core Vocabulary:
The corresponding machine-readable representation in RDF/XML (source code):
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:definition>A feature type category for places such as the Erie Canal</skos:definition>
    <skos:prefLabel>canals</skos:prefLabel>
    <skos:altLabel>canal bends</skos:altLabel>
    <skos:altLabel>canalized streams</skos:altLabel>
    <skos:altLabel>ditch mouths</skos:altLabel>
    <skos:altLabel>ditches</skos:altLabel>
    <skos:altLabel>drainage canals</skos:altLabel>
    <skos:altLabel>drainage ditches</skos:altLabel>
    <skos:broader rdf:resource="http://www.my.com/#hydrographic%20structures"/>
    <skos:related rdf:resource="http://www.my.com/#channels"/>
    <skos:related rdf:resource="http://www.my.com/#locks"/>
    <skos:related rdf:resource="http://www.my.com/#transportation%20features"/>
    <skos:related rdf:resource="http://www.my.com/#tunnels"/>
    <skos:scopeNote>Manmade waterway used by watercraft or for drainage, irrigation, mining, or water power</skos:scopeNote>
  </skos:Concept>
</rdf:RDF>
The current edition of the SKOS Vocabulary replaces the earlier SKOS Core 1.0 Guide published by the SWAD-Europe Thesaurus Activity. The origins and background of the technologies preceding SKOS are well described in a SKOS report in the XTech 2005 proceedings, so let's skip the history and go directly to the language definition.
Let's look at the RDF/XML more closely. The skos:Concept
class asserts that a resource is a conceptual resource. This sounds
vague, but the terms have fairly precise meanings in the RDF
Semantics standard: an assertion is any expression that is claimed
to be true; a class is a general concept, category, or
classification; a resource is an entity, that is, anything in the
universe. In practice, skos:Concept is used to define an atomic
conceptual resource. In the example above, the SKOS document
defines a thesaurus entry for the entity "canals".
skos:Concept is not the only class available in SKOS. There are also other top-level classes:
skos:Collection is a meaningful collection of concepts. Labelled collections can be used with collectable semantic relation properties (such as skos:narrower), where you would like a set of concepts to be displayed under a node label in the hierarchy;
skos:CollectableProperty is a property which can be used with a skos:Collection;
skos:ConceptScheme is a set of concepts, optionally including statements about semantic relationships between those concepts. Thesauri, classification schemes, subject-heading lists, taxonomies, terminologies, glossaries and other types of controlled vocabulary are all examples of concept schemes;
skos:OrderedCollection is an ordered collection of concepts, where both the grouping and the ordering are meaningful.
SKOS Core uses labeling properties to assign tokens to a resource,
where the token is intended to denote the resource in natural language
or other representations intended for human
consumption. The skos:prefLabel
and skos:altLabel properties allow you to assign
preferred and alternative lexical labels to a resource. Under normal
circumstances prefLabel and altLabel values
can be considered synonyms. However, when labeling resources of
type skos:Concept, it is not necessary to restrict
preferred and alternative lexical labels to precise synonyms. For
example, the following is valid:
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#good">
    <skos:prefLabel>good</skos:prefLabel>
    <skos:altLabel>bad</skos:altLabel>
  </skos:Concept>
</rdf:RDF>
Abbreviations and acronyms may also be used to label concepts, and the
choice of whether to use them as preferred or alternative terms is
unconstrained. Misspelled words, however, normally belong among
the hidden labels. A hidden lexical label is a lexical label for a
resource that you would like to be accessible to applications
performing text-based indexing and search operations, but that
should not be visible otherwise. To assign a hidden lexical label
to a resource, use the skos:hiddenLabel property; its most common
use is to record misspelled variants of other lexical labels.
The values of the skos:prefLabel and skos:altLabel properties
should be plain literals. A plain literal is a character string
with an optional language tag, and the language tag may be used to
restrict the scope of a lexical label to a particular language. The
values permissible as language tags are given by RFC 3066. Here's an
example:
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#good">
    <skos:prefLabel xml:lang="en">good</skos:prefLabel>
    <skos:altLabel xml:lang="en">bad</skos:altLabel>
    <skos:prefLabel xml:lang="fr">bon</skos:prefLabel>
    <skos:altLabel xml:lang="fr">mauvais</skos:altLabel>
  </skos:Concept>
</rdf:RDF>
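Hidden labels can be sketched in the same style. The fragment below reuses the concept URI from the canals example; the misspelled strings are invented purely for illustration. An indexing application could match them during search, while a display application would leave them out.

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:prefLabel xml:lang="en">canals</skos:prefLabel>
    <!-- Hypothetical misspellings: searchable, but not meant for display -->
    <skos:hiddenLabel xml:lang="en">cannals</skos:hiddenLabel>
    <skos:hiddenLabel xml:lang="en">canels</skos:hiddenLabel>
  </skos:Concept>
</rdf:RDF>
```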
Symbolic labeling means labeling a concept with an image. To
assign preferred and alternative symbolic labels to a concept, use
the skos:prefSymbol and skos:altSymbol
properties.
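As a rough sketch of symbolic labeling, the fragment below attaches one preferred and one alternative symbol to the canals concept. Both image URIs are hypothetical and stand in for wherever the images would actually be published.

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:prefLabel>canals</skos:prefLabel>
    <!-- Hypothetical image URIs, used only for illustration -->
    <skos:prefSymbol rdf:resource="http://www.my.com/symbols/canal.png"/>
    <skos:altSymbol rdf:resource="http://www.my.com/symbols/canal-lock.png"/>
  </skos:Concept>
</rdf:RDF>
```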
There are eight properties you can use to add human-readable documentation
to the description of a concept. The properties
are skos:publicNote, skos:privateNote, skos:definition, skos:scopeNote, skos:example, skos:historyNote, skos:editorialNote
and skos:changeNote. Descriptive notes for a concept can
be public or private
(skos:publicNote, skos:privateNote). Only skos:editorialNote
and skos:changeNote are private notes; the others are
public. Thus a skos:definition is also
a skos:publicNote, a skos:editorialNote is
also a skos:privateNote and so on. To clarify the
difference between skos:definition
and skos:scopeNote, a definition should be an attempt to
completely explain the meaning of a concept, whereas a scope note may
consist of partial information about what is or is not included within
the meaning (or scope) of a concept. To clarify the difference between
a skos:historyNote and a skos:changeNote, a
history note is a piece of information intended for users of the
scheme, documenting significant changes to the meaning, form, or state
of a concept, whereas a change note is intended for documenting
fine-grained changes to a concept for the purposes of administration
and management.
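To make the distinctions above more concrete, here is a sketch that puts several documentation properties on one concept. The definition and scope note come from the canals example; the example value and the wording of the history and change notes are invented for illustration only.

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <!-- Definition: attempts to explain the meaning completely -->
    <skos:definition>A feature type category for places such as the Erie Canal</skos:definition>
    <!-- Scope note: partial information about what is included -->
    <skos:scopeNote>Manmade waterway used by watercraft or for drainage, irrigation, mining, or water power</skos:scopeNote>
    <skos:example>Erie Canal</skos:example>
    <!-- Hypothetical history note, aimed at users of the scheme -->
    <skos:historyNote>Formerly labeled "artificial waterways".</skos:historyNote>
    <!-- Hypothetical change note, aimed at scheme administrators -->
    <skos:changeNote>Alternative label "drainage ditches" added.</skos:changeNote>
  </skos:Concept>
</rdf:RDF>
```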
There are three recommended usage patterns for the SKOS Core documentation properties:
An RDF Literal is the simplest pattern for using the SKOS Core documentation properties, where the property value (i.e. the object of the triple) is an RDF literal. This is the way we used it in our example SKOS document:
<skos:scopeNote>Manmade waterway used by watercraft or for
drainage, irrigation, mining, or water power</skos:scopeNote>
Related Resource Description allows you to structure documentation as
a related resource description. Here the note is a resource in its own
right, with its text typically carried by an rdf:value property, so the
RDF looks a bit more complicated than the literal form above.
Document Reference is a pattern that allows you to refer to
documentation that is itself a document, via the URI of that
document. For example,
<skos:scopeNote rdf:resource="http://www.my.com/note.txt"/>
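Of the three patterns, the related resource description is the only one not illustrated above. A minimal sketch follows, assuming the Dublin Core elements namespace for the extra metadata; the creator name and the date are hypothetical.

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <!-- The note is an anonymous resource with its own properties -->
    <skos:scopeNote rdf:parseType="Resource">
      <rdf:value>Manmade waterway used by watercraft or for drainage, irrigation, mining, or water power</rdf:value>
      <!-- Hypothetical metadata about the note itself -->
      <dc:creator>A. Editor</dc:creator>
      <dc:date>2005-06-01</dc:date>
    </skos:scopeNote>
  </skos:Concept>
</rdf:RDF>
```

The extra nesting is the price of being able to say who wrote a note and when, which a plain literal cannot express.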
The SKOS Core Vocabulary includes the following properties for
asserting semantic relationships between
concepts: skos:semanticRelation, skos:broader, skos:narrower
and skos:related. In the property
hierarchy, skos:semanticRelation is the top-level property
and the others are its sub-properties. To assert that one
concept is broader in meaning (i.e. more general) than another, where
the scope (meaning) of one falls completely within the scope of the
other, use the skos:broader property. To assert the
inverse, that one concept is narrower in meaning (i.e. more specific)
than another, use the skos:narrower property. This is how
we used it in our example document:
<skos:Concept rdf:about="http://www.my.com/#canals">
  <skos:broader rdf:resource="http://www.my.com/#hydrographic%20structures"/>
</skos:Concept>
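The same relationship can also be stated from the other side with skos:narrower. The sketch below uses the two concept URIs from the example above and simply reverses the direction of the statement.

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <!-- "hydrographic structures" has "canals" as a narrower concept -->
  <skos:Concept rdf:about="http://www.my.com/#hydrographic%20structures">
    <skos:narrower rdf:resource="http://www.my.com/#canals"/>
  </skos:Concept>
</rdf:RDF>
```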
The properties skos:broader and skos:narrower are inverses of each
other, and both are transitive. To assert an associative relationship
between two concepts, use the skos:related
property:
<skos:Concept rdf:about="http://www.my.com/#canals">
  <skos:related rdf:resource="http://www.my.com/#channels"/>
  <skos:related rdf:resource="http://www.my.com/#locks"/>
</skos:Concept>
You can create and define a meaningful group of concepts. Note,
however, that the vocabulary for meaningful collections of concepts
is still unstable and may change in the future. SKOS Core has its
own vocabulary for handling collections, even though RDF already
provides generic vocabulary (rdf:Bag and rdf:Seq) for unordered and
ordered groups of resources. While the W3C Working Draft was being
prepared, there was extended discussion on the mailing lists as to
whether these should be used, and the provisional decision was not
to use rdf:Bag and rdf:Seq for this purpose. (See the explanation
if you're curious.)
To define a meaningful collection of concepts, use
the skos:Collection class and
the skos:member property. To assign a lexical label to a
collection, use the rdfs:label property. The most common
use of a labelled collection is to enhance a hierarchical display. You
can describe narrower and broader relationships between a concept and
a collection. The class skos:CollectableProperty supports
a generic mechanism by which collections can be involved in semantic
relationships (and other sorts of statements). To define an ordered
collection of concepts, use the skos:OrderedCollection
class with the skos:memberList property. An ordered
collection may also have a label
(use rdfs:label). Ordered collections can be used with
semantic relation properties in the same way as unordered collections
(skos:OrderedCollection is a subclass
of skos:Collection).
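A sketch of both kinds of collection follows. The member concepts reuse URIs from the canals example, while the collection URIs, the labels, and the particular ordering are hypothetical. Note that the value of skos:memberList is an RDF list, written here with rdf:parseType="Collection".

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <!-- Unordered, labelled collection (hypothetical URI and label) -->
  <skos:Collection rdf:about="http://www.my.com/#waterwayFeatures">
    <rdfs:label>waterway features</rdfs:label>
    <skos:member rdf:resource="http://www.my.com/#canals"/>
    <skos:member rdf:resource="http://www.my.com/#channels"/>
    <skos:member rdf:resource="http://www.my.com/#locks"/>
  </skos:Collection>
  <!-- Ordered collection: the order of the list items is significant -->
  <skos:OrderedCollection rdf:about="http://www.my.com/#displayOrder">
    <rdfs:label>waterway features in display order</rdfs:label>
    <skos:memberList rdf:parseType="Collection">
      <skos:Concept rdf:about="http://www.my.com/#locks"/>
      <skos:Concept rdf:about="http://www.my.com/#canals"/>
      <skos:Concept rdf:about="http://www.my.com/#tunnels"/>
    </skos:memberList>
  </skos:OrderedCollection>
</rdf:RDF>
```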
Usually concepts are defined in relation to other concepts, as part of
an internally coherent concept scheme. As mentioned in the
introduction, a concept scheme is defined here as a set of concepts,
optionally including statements about semantic relationships between
those concepts. The skos:ConceptScheme class allows you
to assert that a resource is a concept scheme.
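As a sketch of how a concept scheme might be declared, the fragment below defines a scheme and ties the canals concept to it. It relies on two SKOS Core properties that this article does not otherwise cover, skos:inScheme and skos:hasTopConcept. The scheme URI and title are hypothetical, and "hydrographic structures" is treated as a top concept purely for illustration.

```xml
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <!-- Hypothetical scheme URI and title -->
  <skos:ConceptScheme rdf:about="http://www.my.com/thesaurus">
    <dc:title>Example feature type thesaurus</dc:title>
    <skos:hasTopConcept rdf:resource="http://www.my.com/#hydrographic%20structures"/>
  </skos:ConceptScheme>
  <!-- Each concept states which scheme it belongs to -->
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:prefLabel>canals</skos:prefLabel>
    <skos:inScheme rdf:resource="http://www.my.com/thesaurus"/>
    <skos:broader rdf:resource="http://www.my.com/#hydrographic%20structures"/>
  </skos:Concept>
</rdf:RDF>
```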
There are still some open issues, where no firm consensus has been reached and where readers can potentially help to improve future SKOS W3C Recommendations.
XML.com Copyright © 1998-2006 O'Reilly Media, Inc.