Super Model

September 20, 2000

Leigh Dodds

The RDF Interest Group has recently been gathering momentum, and the XML-Deviant takes a look at the progress they're making towards improving understanding of RDF.

A Look Back

Regular XML-Deviant readers will recall that RDF has received scrutiny from the XML developer community on more than one occasion. Back in March, we reported on an XML-DEV debate on the complexity of RDF (see "Being Resourceful"). There was uncertainty about how best to use RDF, and general agreement that the mixture of modeling and syntax descriptions in the RDF Model and Syntax specification hindered, rather than helped, understanding.

More recently, the Deviant explored discussion on the RDF Interest mailing list about the RDF syntax (see "Instant RDF"). Developers disagreed about the importance of focusing on syntactic details and the degree to which this syntax should affect XML documents.

Recognizing that these points, among others, are recurring themes in RDF circles, Dan Brickley has started compiling an RDF issues list.

As you all know, discussion threads on this list tend to revisit old themes, and we're dealing with a rather complex web of overlapping problems and puzzles.

I've finally put up a skeletal RDF Interest Group 'Issue Tracking' page as an effort towards gathering common issues, strategies and resolution proposals from the RDF community.

Brickley had hoped that the RDF community would add to the list and help flesh out the details of each issue. To date, the issues include problems with both the syntax and the model. Cataloging these problems has led to further discussion of the specification.

Removing the Syntax

Lee Jonas suggested that moving the description of the RDF model into its own, separately maintained document would be the best way forward.

I suggest it would be useful to put together a document describing the RDF model aspects of the M&S spec without any reference to syntax. Not only would this be useful in its own right, it can become the focal point for resolving model issues and subsequently driving out the design of the new syntax (or else for eva[lu]ating other web data graph syntax for the job).

Other RDF Interest Group members agreed, although Brian McBride cautioned that any such document would have to become the definitive description of the RDF model.

Speaking for myself, I would find a model only specification very useful, PROVIDED the result was a clarification of the model.

We must ensure that we don't add to the confusion by having two documents telling different stories and folks not sure which to go by. Any new document must be authoritative.

James Tauber suggested that stripping the syntax details from the current Model and Syntax specification would be a good way to start the process.

[A] useful thing to start with might be a version of the rdf m&s spec that is just the m, without any changes to the model. Once we have that as a straw man, we can discuss the issues relating purely to the model and keep them orthogonal to the syntax.

Contributions were not long in coming. Dan Brickley obligingly took a knife to the Model and Syntax specification, to produce an initial discussion document for review.

I'm circulating this as-is in the hope we'll collectively get some sense of whether there's enthusiasm and resources for progressing this work here or in some future W3C Working Group. The job, informally characterized, is that of extracting the core RDF Model from the M+S 1.0 REC, and figuring out where, if at all, it needs clarification and/or refinement. My personal opinion is that any effort to come up with an improved RDF syntax, or bug fixes for the RDF 1.0 XML grammar, will need to have a clean sense of where syntax stops and model starts.

This discussion document has been well received. Lee Jonas argued that the document proves the simplicity of the RDF model.

The RDF model document proves that the basic tenet of RDF Model is sound and quite straightforward, once the complexity the syntax brings has been removed. I feel confident that such an overview would be far easier to comprehend by the newcomer. Hopefully any alternative syntax we can devise will be equally straightforward.

It's hard to argue with this view. Despite being an early draft, the document manages to convey succinctly the central concepts of RDF. The removal of syntax details reduces the size of the document considerably, making it much more manageable. A separate syntax-only document would make a useful companion. Budding RDF developers could do worse than to read this document first.
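To give a sense of how small the core model is once syntax is set aside, here is a minimal sketch in Python. It assumes only what the Model and Syntax specification states: a statement is a triple of resource, property, and value. The `Statement` type, the `values_of` helper, and the example.org URIs are hypothetical names invented for this illustration and do not come from the discussion document.

```python
from collections import namedtuple

# One RDF statement: a resource (subject), a property (predicate),
# and a value (object). The type name is an assumption of this sketch.
Statement = namedtuple("Statement", ["subject", "predicate", "object"])

# Hypothetical example data; the URIs are placeholders.
DOC = "http://example.org/doc"
CREATOR = "http://example.org/terms/creator"
TITLE = "http://example.org/terms/title"

# A model is just a set of statements; no syntax is involved.
model = {
    Statement(DOC, CREATOR, "Jane Doe"),
    Statement(DOC, TITLE, "An Example Document"),
}


def values_of(model, subject, predicate):
    """Return, sorted, every value stated for the given subject and predicate."""
    return sorted(
        s.object
        for s in model
        if s.subject == subject and s.predicate == predicate
    )


if __name__ == "__main__":
    print(values_of(model, DOC, CREATOR))
```

Nothing in this sketch depends on XML, which is the point the model-only document makes: questions about the model can be discussed on their own terms.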

Answering concerns about how the discussion document stands in relation to the official specification, Dan Brickley stressed that the model-only document does not supplant the existing specification.

The RDF Model and Syntax document is a W3C Recommendation and remains as such. No RDF Interest Group discussion document(s) can affect that. Implementors should keep on implementing.

The document comes with an "implementors health warning" to discourage its use as a reference by implementors (at least in its current incarnation).

Brickley continued by suggesting that further work on this document could involve collaboration between the RDF Working Group and the members of the Interest Group mailing list.

Regarding W3C Process, the main [thing] I want to say at this point is that while a W3C Interest Group such as this is not the place to do Recommendation-track work (that's what Working Groups do), we certainly can use the RDF IG to work collaboratively on documents that refine our understanding of the technology and specifications. These could/should set things up for spec-oriented Working Groups to finish the job.

This is a nice illustration of how a community of users can help guide W3C process.

Syntax is Serialization

Syntax is the other side of the RDF coin. While progress on the model may now be easier with a more manageable description, the syntax issue still has to be addressed. Alternative syntax proposals have been steadily accumulating.

"Alternative" is the key word here. RDF has always supported the notion of having multiple syntaxes: there is no need to deprecate the existing RDF syntax, so existing applications will not be affected by an exercise to simplify the syntax.

Essentially the RDF syntax provides a way to serialize an RDF model as XML. Serialization is a process which is not unique to RDF. Dan Brickley has begun compiling a list of relevant references, and the introduction to his "XGraph" document outlines its intent:

There seems to be some consensus around the claim that RDF has a useful data model but a problematic XML syntax. This document is an attempt to gather together the various discussion documents and proposals that relate this topic to the broader context of XML-based graph serialization systems.
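The idea that syntax is only serialization can be sketched in a few lines of Python. The example below writes the same set of triples in two different forms and reads one of them back. Both formats are assumptions made up for this illustration: the XML layout (`statements`/`statement` elements) is not the RDF 1.0 XML grammar, and the line-based form is not any proposal from the mailing list. The function names and the example.org URIs are likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# The model: a list of (subject, predicate, object) triples.
# Placeholder data, not taken from any real vocabulary.
triples = [
    ("http://example.org/doc", "http://example.org/terms/creator", "Jane Doe"),
]


def to_lines(triples):
    """Serialize each triple as one line of text (illustrative format only)."""
    return "\n".join(f'<{s}> <{p}> "{o}" .' for s, p, o in triples)


def to_xml(triples):
    """Serialize the triples as XML (illustrative layout, NOT the RDF grammar)."""
    root = ET.Element("statements")
    for s, p, o in triples:
        node = ET.SubElement(root, "statement", {"subject": s, "predicate": p})
        node.text = o
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Rebuild the list of triples from the XML form produced by to_xml."""
    root = ET.fromstring(text)
    return [(n.get("subject"), n.get("predicate"), n.text) for n in root]


if __name__ == "__main__":
    print(to_lines(triples))
    print(to_xml(triples))
    # Round trip: the model survives a change of syntax unchanged.
    print(from_xml(to_xml(triples)) == triples)
```

Because the triples are the same whichever function writes them out, an alternative syntax can be added alongside an existing one without disturbing applications that already read the existing one.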

One area in which serialization of object models is important is in XML-based protocols. The SOAP specification, for example, includes a set of serialization rules used to encode data as a SOAP message.

This week the W3C announced the charter of a new Protocols Working Group whose scope is to build a foundation for XML-based protocols. The Working Group will use SOAP as an initial starting point. Of particular relevance here is the following note on data representation:

The Working Group should propose a mechanism for serializing data representing non-syntactic data models in a manner that maximizes the interoperability of independently developed Web applications. Furthermore, as data models change, the serialization of such data models may also change. Therefore it is important that the data encapsulation and data representation mechanisms are designed to be orthogonal.

There appears to be an opportunity for cross-pollination of ideas between the RDF and Protocols activities in the area of data serialization.