RDF: Ready for Prime Time
July 30, 2003
Shelley Powers is the author of Practical RDF
Not long ago, Marc Canter, one of the early founders of Macromedia, talked about RDF and the Semantic Web in his weblog. Specifically, he wrote:
"I've been spending more and more time trying to grok the RDF folks. I have to say I like what I see and hear, but what I don't see are many apps and services actually up and running and working.
We have a saying over here: "put up or shut up." I'm still looking for two different RDF apps or services to work together in some meaningful way. Then bring on the books."
Considering that I'm "bringing on a book" on RDF this month, I thought it appropriate to answer Marc's plea for meaningful, working examples of RDF apps and services, especially those that work with other RDF-based services. My problem, though, is that I have only a limited amount of time and space in this article, so I can cover only a few of them. Best to just start; but first, a little digression into RDF and XML.
RDF/XML: The Syntax That Could
You probably know that RDF has both a defined model and a preferred serialization, RDF/XML. In many ways there's been far less criticism of RDF than there has been of the RDF/XML syntax. Tim Bray, one of the creators of XML, has said:
"Speaking only for myself, I have never actually managed to write down a chunk of RDF/XML correctly, even when I had the triples laid out quite clearly in my head. Furthermore, once again speaking for myself, I find most existing RDF/XML entirely unreadable. And I think I understand the theory reasonably well."
Tim even went so far as to offer his own version of RDF/XML, which he called RPV.
I've found that the more a person works with markup such as XML, the more they dislike RDF/XML. I've also found that no matter the alternative proposed, someone else will dislike it just as much, which makes RDF/XML a bit of a "damned if you do, damned if you don't" proposition.
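To make the distinction between the model and the syntax concrete, here is a minimal Python sketch, not a conforming RDF/XML parser. It assumes only the simplest RDF/XML pattern (rdf:Description elements carrying rdf:about, with property elements that hold either text or an rdf:resource attribute). The example.org URIs and the function name simple_triples are invented for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Illustrative document; the example.org URIs are made up for this sketch.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/articles/rdf-prime-time">
    <dc:title>RDF: Ready for Prime Time</dc:title>
    <dc:creator>Shelley Powers</dc:creator>
    <dc:relation rdf:resource="http://example.org/books/practical-rdf"/>
  </rdf:Description>
</rdf:RDF>
"""


def simple_triples(rdf_xml):
    """Return (subject, predicate, object) tuples for plain rdf:Description nodes.

    Assumption: only the simplest RDF/XML form is handled. Typed nodes,
    blank nodes, containers, and property attributes are ignored.
    """
    root = ET.fromstring(rdf_xml)
    triples = []
    for node in root.findall(f"{{{RDF_NS}}}Description"):
        subject = node.get(f"{{{RDF_NS}}}about")
        for prop in node:
            # ElementTree reports tags as "{namespace}local"; joining the two
            # parts gives the predicate URI.
            namespace, local = prop.tag[1:].split("}")
            predicate = namespace + local
            resource = prop.get(f"{{{RDF_NS}}}resource")
            obj = resource if resource is not None else (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


TRIPLES = simple_triples(RDF_XML)
for triple in TRIPLES:
    print(triple)
```

Each printed tuple is one statement in the model: a subject, a predicate, and an object. The XML is only one way of writing those statements down, which is why the model can be judged separately from the syntax.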
Ultimately, if RDF is ready for prime time, then so is RDF/XML. Regardless of our views of it, it's official, it's real, and it's here now. So on to the RDF applications, starting with the basics: the APIs.
For every programming language you're interested in, there's most likely an RDF API and a library implementing it. If you're interested in Java, one of the most popular Java RDF libraries is Jena, from HP's Semantic Web Research Lab. The current version of Jena is 1.6.1, which is the one I've used, but there is a beta release of a new version (Jena2), and it's the one you'll most likely want to investigate. As you'll see later, Jena is used for several utilities and applications.
For those interested in Python, the most popular RDF library -- which also includes a triplestore with several different backends -- is Daniel Krech's RDFLib. Want something a little more unusual? Try Wilbur, a Common Lisp RDF library, written by Ora Lassila, one of the creators of RDF.
For those who work primarily with Microsoft development environments, there is a C# RDF Parser called Drive, which provides an API to parse RDF/XML into an in-memory RDF graph for manipulation. It's fully compatible with the .NET platform, and it can also be used with the open source variant of .NET, Mono.
If Perl is more your thing, there's Ginger Alliance's PerlRDF, a library I've used in several small applications at my site. Other popular applications, such as Six Apart's weblogging application, Movable Type, also use it. Six Apart extended the PerlRDF module by creating a new module, XML::FOAF, which enables autodiscovery and processing of FOAF files. FOAF, or Friend-of-a-Friend, is an RDF vocabulary for defining hierarchies of acquaintances and is now one of the most popular uses of RDF/XML.
If you want support for multiple programming languages as well as a more sophisticated framework and data persistence, you'll want to check out Dave Beckett's Redland. In addition to providing a persistent data store and multiple language support (Python, Perl, Java, Tcl, and Ruby), Redland also provides support for an independent RDF parser called Raptor. Raptor has been used, independently, in other applications, including several FOAF apps, as well as RDF Gateway, a commercial product I'll discuss later in this article.
FOAF is one of the more popular vocabularies of RDF/XML. Just a quick perusal of the FOAF web site will show dozens of uses of FOAF in tools ranging from a FOAFBot, created by Edd Dumbill and used to provide services within chat forums, to uses of FOAF in desktop tools within the OS X environment for managing contacts. My own FOAF file is at http://weblog.burningbird.net/foaf.rdf, and consists of pointers to friends I know online, though the list is incomplete.
The beauty of FOAF lies in its simple way of describing personal information, including our work and academic affiliations. The power of FOAF lies in its ability to list acquaintances who themselves may have FOAF files. Over time, this interlinked network can expand until it's a simple matter of mapping out who is connected, directly or indirectly, to whom.
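The idea of mapping out who is connected to whom can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names and links are made up, each dictionary key stands in for one person's FOAF file, and each list stands in for the foaf:knows entries in that file. A real tool would fetch and parse the files with an RDF library.

```python
from collections import deque

# Hypothetical data: each key stands for one person's FOAF file, and the list
# holds the people named in that file's foaf:knows properties.
KNOWS = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
    "erin": [],
}


def connection_path(knows, start, goal):
    """Breadth-first search; return the shortest chain of acquaintances or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        person = path[-1]
        if person == goal:
            return path
        for friend in knows.get(person, []):
            if friend not in seen:
                seen.add(friend)
                queue.append(path + [friend])
    return None


print(connection_path(KNOWS, "alice", "erin"))  # ['alice', 'bob', 'dave', 'erin']
print(connection_path(KNOWS, "erin", "alice"))  # None: links are followed one way
```

The second call returns None because this sketch follows links only in the direction they are stated. Whether to treat an acquaintance link as two-way is a design choice for the application.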
Another RDF vocabulary in popular use is RSS 1.0. Webloggers and other online publications use RSS 1.0 to provide information about updates at their web sites, including the date of the update, the author, an excerpt of the material and so on.
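As a rough sketch of what an aggregator does with such a feed, the Python below reads the update details out of a small RSS 1.0 document using only the standard library. The feed itself is invented for illustration (the example.org addresses, titles, and author are placeholders), and read_items is a hypothetical helper name. The namespace URIs are the published ones for RDF, RSS 1.0, and Dublin Core.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# A made-up feed; every URL, title, and name here is a placeholder.
FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="http://example.org/index.rdf">
    <title>Example Weblog</title>
    <link>http://example.org/</link>
    <description>A made-up feed for illustration</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.org/archives/000001.html"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://example.org/archives/000001.html">
    <title>First post</title>
    <link>http://example.org/archives/000001.html</link>
    <description>An excerpt of the entry.</description>
    <dc:creator>Example Author</dc:creator>
    <dc:date>2003-07-30T09:00:00-06:00</dc:date>
  </item>
</rdf:RDF>
"""


def read_items(feed_xml):
    """Collect title, link, excerpt, author, and date for each item in the feed."""
    root = ET.fromstring(feed_xml)
    items = []
    # In RSS 1.0, item elements are children of rdf:RDF, beside the channel.
    for item in root.findall("rss:item", NS):
        items.append({
            "title": item.findtext("rss:title", default="", namespaces=NS),
            "link": item.findtext("rss:link", default="", namespaces=NS),
            "excerpt": item.findtext("rss:description", default="", namespaces=NS),
            "author": item.findtext("dc:creator", default="", namespaces=NS),
            "date": item.findtext("dc:date", default="", namespaces=NS),
        })
    return items


ITEMS = read_items(FEED)
for entry in ITEMS:
    print(entry["date"], entry["author"], entry["title"])
```

Treating the feed as plain XML works for this simple case. Processing it as RDF, with one of the libraries described earlier, is the more robust route once other vocabularies are mixed in.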
A third RDF vocabulary is the RDF/XML used to describe Creative Commons licenses, a new way to provide more detailed information about use of copyrighted material.
All three vocabularies use, in one way or another, elements from the Dublin Core Metadata Initiative (DCMI), as defined in RDF/XML. However, these vocabularies aren't the only ones available using RDF/XML. In fact, the W3C uses RDF/XML to define the underlying syntax for its own Web Ontology Language (OWL) effort. With RDF providing the underlying model, and OWL adding higher-level ontology support, it's only a matter of time before a host of sophisticated, domain-specific ontologies spring up, all of them interoperable because of the underlying use of RDF/XML.
In fact, there's a host of tools and utilities people can use right now to work with RDF/XML directly or with OWL.
Tools and Utilities to Work with RDF/XML
As much as I like RDF/XML, even I'll admit that it requires time to understand and work with, and not everyone has either a desire or an inclination for this effort. Thankfully, there are plenty of tools available to allow people to manually create or read RDF/XML.
The most commonly used RDF utility is the RDF Validator, a tool to check your RDF/XML to ensure that it's valid, as well as to generate different views of the model data. I find that when working with an API, I'll use the Validator to validate my sample RDF/XML, view the model to ensure I've created the appropriate one, and then create the triples to use as a pattern with my RDF/XML API calls, in whatever language I'm coding.
Another handy utility for working with RDF/XML is the BrownSauce RDF Browser. This web application uses Jena. It can open an RDF/XML document and provide easily readable and hypertext-linked pages of the RDF data contained in the document. Best of all, the browser also opens any associated RDF Schema documents that provide information about the RDF elements themselves, through the relationships described in the schema, and through comments provided with the schema elements.
A long-time advocate of RDF and a friend of mine, Danny Ayers, has been busy at work on Ideagraph, a tool for visually mapping ideas and then generating RDF/XML from the results. In addition to this effort, the tool can also act as an RDF-based weblogging tool, as well as an RSS aggregator.
Isaviz is another popular visual-editing tool for creating, importing, and working with RDF documents in RDF/XML, and within other serialization formats such as Notation 3 and N-Triples. It's particularly useful when you're creating a new RDF vocabulary and want to use a visual tool for this effort rather than trying to create the vocabulary in RDF/XML manually. However, I prefer to use the tool to work with existing RDF/XML documents, particularly larger ones, because it lets you zoom in on components of a model, create snapshots of particular paths, and query on specific elements. In particular, if you're documenting an existing RDF/XML vocabulary, Isaviz can be useful for providing snapshots of particular instances of data.
Most of these tools are geared more for working directly with RDF/XML vocabularies. If you're working with an ontology instead, then you must look at Protege, from Stanford University. This tool not only allows you to define an ontology using an easy-to-use user interface, but also lets you create forms to capture the ontology data. Once the forms are defined, the tool can then be used to capture instances of data based on the ontology. Currently, an effort is underway to provide support for OWL files, and mapping between Protege's own ontology language and the W3C language. Regardless, the data captured by Protege can be output in multiple formats, most particularly RDF/XML.
Peripheral RDF Support in Other Tools and Utilities
Of course, tools that focus purely on RDF, whether to create RDF or to consume RDF, are handy when you're starting work with RDF--but what about RDF in the real world?
Probably one of the first uses of RDF/XML was by those involved in the Mozilla effort, which still uses RDF/XML for all of its automated Table of Contents data and processing. In fact, it was through my interest in the Mozilla development environment that I became exposed to RDF/XML (see www.mozilla.org/rdf/doc/).
If you've worked with Linux then you're most likely familiar with RPM, a way of packaging Linux applications for easy installation. What you may not know is that RDF has been used with RPM to provide metadata about the package being installed. A utility created by Daniel Veillard, rpmfind, uses RDF to discover RPM installations on Rpmfind.Net, a database of RPM packages maintained by the W3C. Though the original creator of the product is no longer maintaining rpmfind directly, the source is now located at sources.redhat.com, and I'm still using rpmfind for my own server.
Earlier I mentioned Movable Type and its use of RDF for autodiscovery of FOAF files. The application also uses RDF/XML to annotate weblog entries with trackback information, which can be used to document links from one weblog to another and provide reverse link information. This same functionality has been isolated for use by other tools, weblogging or otherwise.
Spring, a Mac OS X desktop tool created by Robb Beal, provides support for dragging and dropping FOAF files. Find a FOAF link in a web page? Click on it and drag it to Spring in order to automatically import the FOAF contents into the tool.
As ubiquitous as RDF is becoming, creeping its way into a favorite tool or utility near you, the power of the RDF model's inferential capability is particularly apparent when you look at some of the larger applications that are being built on RDF.
People at MIT are working on an application, called DSpace, which will maintain a digital repository of information. The application is geared to any larger organization, such as a college or university, that wants to maintain a searchable index of publications from its members. DSpace is a freely available, open source application that makes use of an ontology, Harmony/ABC, and RDF to maintain the historical subsystem.
RDF Gateway is a Semantic Web application server that uses RDF as the core of all of its services. With the Gateway, you get access to a persistent data store that can be queried using an inferential engine that goes beyond normal SQL-like queries. Included with the application is support for server-side scripting similar in nature to both ASP (Active Server Pages) and JSP (JavaServer Pages, the Java counterpart).
Siderean Software's Seamark is another commercial application that makes use of RDF and a persistent data source, but Seamark focuses primarily on site navigation. Plugged In Software's Tucana Knowledge Store provides sophisticated knowledge-based querying of large stores of data, again based on RDF.
These companies are just the first to start looking at RDF and the RDF data model for use in large-scale, sophisticated applications. And then there's the Semantic Web.
The Semantic Web
It's funny in a way, but I can sit down and rattle off a dozen uses for the RDF data model and the associated RDF/XML without once mentioning its primary purpose, which is to provide support for the Semantic Web efforts. All uses of RDF for any purpose are good because they increase our familiarity with the specification as well as the syntax. In addition, applications that increase the level of RDF/XML out on the web add to the pool of accessible data on which we are slowly building the Semantic Web. Through the use of RDF, we know that all of the vocabularies are compatible.
Beyond the good and practical uses of RDF I've described earlier in the article, and unlike XML or HTML or XHTML, the RDF model, and its associated syntax, brings with it the ability to define statements about data, rather than to just record pieces of data. Add to this the use of OWL, and we begin to have the ability to mine for knowledge, not just words.
Consider poetry. My favorite poem is Walt Whitman's "Song of the Open Road", with its friendly and positive imagery of life as an adventure, a road to follow with glee. In fact, the use of "road" as a metaphor for life and life's journey is quite common in poetry. (See an excellent article, "Poetry of the Open Road.") However, it's the very use of imagery and metaphor in poetry that defeats traditional web discovery techniques.
Currently we have the ability to use keyword searches within search engines such as Google, and with this we can find poems that mention the word "road". This is all well and good, but in the future, as the use of RDF and RDF/XML expands, we'll be able to do searches that not only provide links to poems that have used "road", but also know which poems use the word as a metaphor for life, which have used it metaphorically to describe freedom, and which are just talking about roads as roads.
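The difference between a keyword match and a search over statements can be shown with a toy Python sketch. Everything in it is an assumption made for illustration: the ex: property names are an invented vocabulary, two of the three poems are placeholders, and the statements live in a plain list rather than a real RDF store with a query language.

```python
# Hypothetical statements as (subject, predicate, object) triples.
# The "ex:" properties are an invented vocabulary, not a published one.
STATEMENTS = [
    ("poem:song-of-the-open-road", "ex:usesWord", "road"),
    ("poem:song-of-the-open-road", "ex:roadStandsFor", "life"),
    ("poem:example-freedom", "ex:usesWord", "road"),
    ("poem:example-freedom", "ex:roadStandsFor", "freedom"),
    ("poem:example-literal", "ex:usesWord", "road"),
]


def match(statements, subject=None, predicate=None, obj=None):
    """Return the statements that agree with every part of the pattern given."""
    return [
        (s, p, o)
        for (s, p, o) in statements
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Keyword-style search: every poem that mentions "road" at all.
keyword_hits = sorted(
    s for (s, _, _) in match(STATEMENTS, predicate="ex:usesWord", obj="road")
)

# Statement-based search: only poems where "road" stands for life.
life_hits = sorted(
    s for (s, _, _) in match(STATEMENTS, predicate="ex:roadStandsFor", obj="life")
)

print(keyword_hits)
print(life_hits)
```

The first search returns all three poems. The second returns only the one whose recorded statements say the road stands for life, and that kind of distinction is what keyword matching cannot make.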
Eventually as RDF insinuates itself throughout the web, as it has already started, you'll be able to search on "road" and "poem" and "metaphor for life" and not get this article back as a result. As much as I like the thought of people reading this article, that search result will be a good thing because this article is not about poems, metaphors, and life. It's about RDF and how it is now more than ready for prime time.
O'Reilly & Associates recently released (July 2003) Practical RDF.