XML.com: XML From the Inside Out

Being Resourceful

March 08, 2000


Table of Contents

Really, Darn Frustrating?
Model versus Syntax
Implementation Difficulties
Different Viewpoints
Simply Very Groovy

Resource Description Framework (RDF) is the latest W3C specification to be shaken down by the members of XML-DEV. The discussion captures the current perception of RDF in the XML community. XML Deviant has been following the debate....

Really, Darn Frustrating?

"XML is a cinch—but with RDF you have to make yourself a choice; Either RDF is stupid—or you are!"

The above comment, made during a W3C/WAP Forum workshop, was repeated in a message from Greg FitzPatrick outlining his disappointment with the RDF specifications. Other contributors were quick to echo the observation. Frank Boumphrey, author and editor of several XML books, commented that, in comparison to other specifications,

...the RDF spec is particularly obtuse, and every time I have to write something on RDF my heart sinks, because I know it will take me a good 1-3 days research before I am sure I have got it right!

At first sight, it would appear that this is yet another case of a specification being less than accessible, a topic we've seen crop up repeatedly over the last few months. Michael Champion was not impressed by this prospect:

"Sigh, now we have Groves, Architectural Forms, and RDF as technologies that could save the world if only people could look past their complexity and the turgidity of their specifications. The world does need saving, so no one wants to overlook potentially helpful technologies. On the other hand, most of us have more lucrative ways of spending our time than deconstructing these specs."

(You can read more about the Groves debate in a previous issue of XML Deviant, "A Road Well-Travelled".)

Other XML-DEV members believed that the root cause of the problem wasn't the RDF specification itself. RDF requires data modeling to produce useful vocabularies: something hard to get right first time. Bill dehOra pointed out that modeling requires certain skills:

It seems naive to expect that people will automatically generate good to reasonable RDF models just because RDF exists, in the same [way] we can't expect well designed web pages because HTML exists. Not that RDF doesn't have its faults, it does, but RDF is only a language, it won't teach you how to speak.

Lack of experience may be an important factor. Additional documentation and illustrative examples would go some way towards fixing this problem. Hopefully the W3C is listening. It's encouraging to see the Schema Working Group taking great pains to answer criticisms of the Schema Drafts. (See "Spotlight on Schemas" for some background.) The latest Schema Working Drafts included a "primer," which concisely describes the Schema facilities. An RDF Primer would be an excellent addition.

Model versus Syntax

One aspect of the specification repeatedly criticized was the mixture of modeling and syntax. Without some RDF experience, it's hard to separate the two. William Grosso pointed out that this makes understanding the RDF model harder.

What's more, forcing people to focus on syntax (*any* serialization syntax, not just XML) makes modeling orders of magnitude more difficult (it forces people to think at the conceptual level and pay attention to things like "did I get the tags right" at the same time).

The RDF syntax includes several variations, including "Abbreviated" forms. While this provides flexibility that allows almost any XML document to be interpreted as RDF, it complicates matters. Jeff Sussna, worried that RDF will be neglected, commented that

...the frustration with RDF comes primarily from the casting of the model into XML syntax(es), not from the writing of the spec.

David Megginson disagreed, believing that the deeper problem lay in the RDF data model itself, which the specification presents as simpler than it really is:

...the XML syntax for RDF has too many annoying variations, granted, but the main problem is that the underlying RDF data model is much, much more complicated than the spec suggests.

Megginson believed that the only way to understand the RDF data model was to reverse-engineer it from the XML syntax. He later commented that:

The problem is that the model as presented is naively simple, and the WG failed to notice that the XML syntax is not based directly on that simple model.
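The "variations" in the syntax can be made concrete with a small sketch. The Python below is illustrative only: the document URI and the author name are invented, and the `triples` function is a toy that handles just the two forms shown, not a general RDF/XML parser. It writes one statement twice, once with the property as a child element and once in an abbreviated form with the property as an attribute, and checks that both reduce to the same (subject, predicate, object) triple.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Basic serialization: the property is a child element.
# The resource URI and the name are made-up example values.
BASIC = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}">
  <rdf:Description rdf:about="http://example.org/doc">
    <dc:creator>Jane Doe</dc:creator>
  </rdf:Description>
</rdf:RDF>
"""

# Abbreviated serialization: the same property written as an attribute.
ABBREVIATED = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}">
  <rdf:Description rdf:about="http://example.org/doc" dc:creator="Jane Doe"/>
</rdf:RDF>
"""


def expand(qname):
    """Turn ElementTree's '{namespace}local' form into a plain URI."""
    return qname[1:].replace("}", "")


def triples(xml_text):
    """Extract (subject, predicate, object) triples from the two forms above.

    Handles only rdf:Description with literal-valued properties.
    """
    found = set()
    root = ET.fromstring(xml_text)
    for desc in root.findall(f"{{{RDF}}}Description"):
        subject = desc.get(f"{{{RDF}}}about")
        # Properties written as child elements.
        for child in desc:
            found.add((subject, expand(child.tag), (child.text or "").strip()))
        # Properties written as attributes (rdf:* attributes are skipped).
        for name, value in desc.attrib.items():
            if not name.startswith(f"{{{RDF}}}"):
                found.add((subject, expand(name), value))
    return found


# Both documents should yield the same single statement.
print(triples(BASIC) == triples(ABBREVIATED))
```

Even this toy needs two code paths for one statement, which hints at why the full set of syntax variations was an annoyance for implementors.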

Implementation Difficulties

There are few tools available for manipulating RDF and, more importantly for the developer, no standardized APIs. But why? The RDF Model and Syntax Specification has been a W3C Recommendation since February 1999, but the RDF Schema specification has been languishing as a Proposed Recommendation since March 1999.

David Megginson, having written a Java RDF API called DATAX, recounted his experience:

My biggest problem in writing an RDF library was trying to puzzle out what the real RDF model was, and my second biggest problem was trying to figure out how to support it. The RDF XML syntax itself was an annoyance, but it wasn't a major problem once I figured out how to manage the different states.

Mark Birbeck suggested that RDF is difficult to implement, and this has limited the widespread development of useful tools.

I would suggest that the biggest problem is that it is very difficult to implement many of the truly radical aspects of RDF/S, and so people find it hard to picture how it would work. It's also a bit odd because the applications of RDF are not really "advertised" in the RDF spec.

This kind of problem should now be less frequent. The recently introduced W3C Candidate Recommendation phase allows time for a specification to be implemented. This should highlight these kinds of ambiguities in a model or syntax. Dealing with the complexity of RDF has led to many implementors omitting support for difficult features, effectively subsetting the specification.

Different Viewpoints

One thing that became clear in the discussion was that there were different opinions about what RDF is actually for. Suggestions came from two apparently disparate perspectives. David Megginson saw RDF as a means for sharing data objects:

To exchange serialized objects independent of protocols or programming language (forget about the semantic web hooey). RDF is suboptimal for this, but it gets a lot of things right (i.e., extensibility) and there doesn't seem to be another reasonable candidate out there yet.

Megginson also believed that explaining RDF from this perspective was a better approach than using Knowledge Representation (KR).

When they realize that RDF is just a way to serialize objects in XML, and that they can safely ignore all of the bizarre pseudo-grammatical and pseudo-KR terminology (sometimes after several wasted days puzzling over it), they warm up to RDF a little.

Mark Birbeck agreed that KR wasn't the best starting point. Birbeck thought that anyone with good relational database skills already understood the core RDF concepts. Birbeck explained RDF as follows.

  1. The internet is a massive relational database.
  2. The brilliant invention of URIs allows us to give everything a primary key.
  3. We have no control over someone else's 'tables' (resources) so we can't add to their stuff, so we have to add to our own tables and do joins to theirs (in other words, make statements).
RDF is just like those many-to-many tables you've been doing for years in your databases.
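
Birbeck's analogy can be sketched in a few lines of Python. This is one reading of his three points, not anything from the RDF specification: statements are rows in a three-column table, URIs serve as primary keys, and adding information about someone else's resource means making our own statements and joining on the shared URI. Every URI, property name and value below is invented for illustration.

```python
# Statements as rows of a three-column table: (subject, predicate, object).
# URIs act as the primary keys. All URIs and values here are made up.

# "Their table": statements published by someone else, which we cannot edit.
THEIR_STATEMENTS = [
    ("http://example.org/books/1", "http://example.org/terms/title", "An XML Book"),
    ("http://example.org/books/2", "http://example.org/terms/title", "Another Book"),
]

# "Our table": statements we make about their resources.
OUR_STATEMENTS = [
    ("http://example.org/books/1", "http://example.net/review/rating", "5"),
]


def join_on_subject(left, right):
    """Inner-join two statement lists on the subject URI."""
    by_subject = {}
    for subject, predicate, obj in right:
        by_subject.setdefault(subject, []).append((predicate, obj))
    joined = []
    for subject, predicate, obj in left:
        for other_predicate, other_obj in by_subject.get(subject, []):
            joined.append((subject, predicate, obj, other_predicate, other_obj))
    return joined


rows = join_on_subject(THEIR_STATEMENTS, OUR_STATEMENTS)
for row in rows:
    print(row)
```

Only the first book appears in the result, because it is the only resource that both sets of statements describe. Neither "table" had to be modified to relate the two.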

Birbeck and Megginson differed, however, on the basic utility of RDF. From Birbeck's viewpoint, RDF is actually the key to building a true web of information, and will deliver on much of the promise of XML. (See Tim Bray's article, "RDF and Metadata".) Birbeck believed that RDF is about more than serialization.

...explaining RDF in terms of serialization does not actually explain RDF. I'm sure many people on these lists have grappled with the notion "why should I serialize with RDF when I can just use a straight XML document?" The answer is unfortunately "you don't have to." If you remove the distributed nature of RDF, then it cannot represent any more than an XML DTD can (to put it another way, I can easily interpret XML documents as 'nodes and arcs').

This exchange illustrates the difficulty in grasping RDF. What many miss is that RDF is capable of describing data structures that "vanilla" XML syntax cannot: structures other than the hierarchical ones inherent in the XML syntax. Without a clear idea of the purpose of a specification, it's harder to see how to apply it. The saying goes that when you have a hammer, everything looks like a nail. Extending the analogy, RDF is a Swiss army knife: more flexible, but it takes some additional thought to work out how to use it.
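
A minimal Python sketch of a non-hierarchical structure, using invented names (`ex:alice`, `ex:knows` and so on are placeholders, not real vocabulary): three statements that form a cycle. Element nesting alone produces a tree, so a structure like this cannot be written as nested XML elements without some form of cross-reference, whereas a set of statements holds it directly.

```python
# Three statements forming a cycle. All names are hypothetical.
STATEMENTS = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:carol", "ex:knows", "ex:alice"),
]

# A purely hierarchical set of statements, for comparison.
TREE = [
    ("ex:root", "ex:child", "ex:a"),
    ("ex:root", "ex:child", "ex:b"),
]


def has_cycle(statements):
    """Return True if following arcs from some node leads back to it."""
    arcs = {}
    for subject, _predicate, obj in statements:
        arcs.setdefault(subject, set()).add(obj)

    def reachable(start):
        # Depth-first walk collecting every node reachable from start.
        seen, stack = set(), list(arcs.get(start, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(arcs.get(node, ()))
        return seen

    return any(node in reachable(node) for node in arcs)


print(has_cycle(STATEMENTS))
print(has_cycle(TREE))
```

The first set of statements contains a cycle and the second does not. Only the second could be expressed by element nesting alone.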

Simply Very Groovy

In contrast to RDF, one W3C specification is currently enjoying a great deal of good publicity. Scalable Vector Graphics (SVG) is an XML-based language for describing two-dimensional graphics and text. The last Working Draft garnered a lot of good feedback from the developer community. Peter Murray-Rust publicly congratulated the members of the W3C Working Group:

In practice the members have produced a high-quality spec, with very exciting early proof-of-concept from several members (and non-members). We can reasonably see SVG being transparently incorporated into the browsers and becoming part of web life. At that stage many of the Web folk will have to learn XML to be able to use SVG to its full extent.

It's probably too early to describe SVG as a "Killer App," but it's definitely the cool tool of the moment. One the W3C got right.