XML.com: XML From the Inside Out

Introducing RDFa, Part Two
by Bob DuCharme

Reification (Sort of)

Reification is the assignment of metadata to metadata. This sounds pretty abstract, but if you consider that metadata is data to track, just like any other, it's easier to see the value of reification. For example, if a document has an RDF triple saying, "this document was created by Richard Mutt," another triple saying that the triple about the document's creator was created on 2007-04-19 would be metadata about that metadata.
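To make the idea concrete, here is a minimal Python sketch of what proper RDF reification of that example could look like, using plain tuples rather than an RDF library. The rdf:Statement, rdf:subject, rdf:predicate, and rdf:object terms come from the RDF vocabulary. The example.org URIs, the reify helper, and the choice of dc:creator and dc:date as predicates are assumptions made for illustration.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"


def reify(statement_id, subject, predicate, obj):
    """Return the four triples that describe one statement as a resource."""
    return [
        (statement_id, RDF + "type", RDF + "Statement"),
        (statement_id, RDF + "subject", subject),
        (statement_id, RDF + "predicate", predicate),
        (statement_id, RDF + "object", obj),
    ]


DOC = "http://example.org/doc"       # hypothetical document URI
STMT = "http://example.org/doc#s1"   # hypothetical identifier for the statement

# "This document was created by Richard Mutt", described as a resource.
triples = reify(STMT, DOC, DC + "creator", "Richard Mutt")

# Metadata about the metadata: when the creator statement itself was made.
triples.append((STMT, DC + "date", "2007-04-19"))

for triple in triples:
    print(triple)
```

Once the statement has its own identifier, the date triple can use that identifier as its subject, which is what makes it metadata about metadata.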

RDFa's designers had reification on the original list of RDF features that RDFa would eventually be able to represent, but they're having second thoughts, and the latest version of the RDFa Primer no longer mentions it. The plan for RDFa was always to make it a subset of RDF, and reification may not make the cut. (XML came to exist via a similar cutting out of potentially complex and confusing features, as its designers were creating a subset of SGML.) Still, I couldn't resist demonstrating a reification-like technique with RDFa that can be useful in web or other hypertext applications.

An HTML a linking element describes a relationship between the document containing the a element and the resource that it points to. If you're really interested in tracking metadata about your hypertext links, you can add an about attribute to the a element and add empty span element children, as shown here, to store metadata about the linking element.

<p>Mr. Breakfast has a nice
  <a about="link23"
     href="http://www.mrbreakfast.com/article.asp?articleid=17">
    <span property="fb:addedBy" content="BD"/>
    <span property="fb:lastChecked" content="2007-03-15"/>
    scrambled eggs recipe</a>.</p>

This is not really reification because it's not metadata about metadata. In this case, it's metadata about a specific HTML element: the a element with an about value of "link23", which happens to link to another resource. It's still useful, and may whet your appetite for proper reification as a feature of more full-featured RDF syntaxes.
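As a rough illustration of how an application might read that link metadata back out, the following Python sketch parses the markup with the standard library's ElementTree and collects the property and content values attached to each a element that has an about attribute. It is not an RDFa extractor: it leaves the fb: prefix unexpanded and handles only this one pattern, and the link_metadata function name is made up for the example.

```python
import xml.etree.ElementTree as ET

# The link markup from the example above, as well-formed XML.
SNIPPET = """<p>Mr. Breakfast has a nice
  <a about="link23"
     href="http://www.mrbreakfast.com/article.asp?articleid=17">
    <span property="fb:addedBy" content="BD"/>
    <span property="fb:lastChecked" content="2007-03-15"/>
    scrambled eggs recipe</a>.</p>"""


def link_metadata(xml_text):
    """Return (subject, property, value) tuples for every a element
    that carries an about attribute."""
    triples = []
    root = ET.fromstring(xml_text)
    for a in root.iter("a"):
        subject = a.get("about")
        if subject is None:
            continue  # an ordinary link with no metadata attached
        for span in a.iter("span"):
            prop = span.get("property")
            value = span.get("content")
            if prop is not None and value is not None:
                triples.append((subject, prop, value))
    return triples


for triple in link_metadata(SNIPPET):
    print(triple)
```

The subject of each tuple is the link itself ("link23"), not the page that the link points to, which is the distinction the paragraph above draws.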

Showing Some Class

In addition to specifying properties and values of a resource, RDFa can identify the resource as an individual of a particular class. When you have an ontology of information about a set of classes, you have additional information about individuals of those classes, so knowing an individual's class membership lets you do more with it. For example, if you know that a resource is a widgetShipment, the ontology's description of this class may include relevant storage and safety information.

This is a nice example of RDFa building on an obvious bit of HTML syntax to add some RDF power: you simply use the class attribute, which has been around since HTML 2.0.

For example, the class attribute in the following example tells us that the fbi:x432 resource is an individual of the fb:widgetShipment class:

<tr about="[fbi:x432]" class="fb:widgetShipment">
  <td><span property="fb:shipmentID">x432</span></td>
  <td><span property="fb:date"
            datatype="xs:date">2007-04-23</span></td>
  <td><span property="fb:amount"
            datatype="xs:integer">34</span></td>
</tr>

Extracting the triples and converting them to RDF/XML would result in something like this:

<fb:widgetShipment rdf:about="http://www.foobarco.com/ns/ID#x432">
  <fb:shipmentID>x432</fb:shipmentID>
  <fb:amount rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">34</fb:amount>
  <fb:date rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2007-04-23</fb:date>
</fb:widgetShipment>

(Because this is a newer aspect of RDFa, no RDFa extractors support it as of this writing, but I'm looking forward to it being supported in the future.)
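Until extractors catch up, the mapping from the class attribute to an rdf:type triple is simple enough to sketch by hand. The Python below is a minimal illustration for this one table row, not a general RDFa parser. The fbi and xs namespace URIs match the RDF/XML output shown above; the fb namespace URI, the expand helper, and the row_triples function are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

PREFIXES = {
    "fb": "http://www.foobarco.com/ns/",  # assumed; not given in the text
    "fbi": "http://www.foobarco.com/ns/ID#",
    "xs": "http://www.w3.org/2001/XMLSchema#",
}

ROW = """<tr about="[fbi:x432]" class="fb:widgetShipment">
  <td><span property="fb:shipmentID">x432</span></td>
  <td><span property="fb:date"
            datatype="xs:date">2007-04-23</span></td>
  <td><span property="fb:amount"
            datatype="xs:integer">34</span></td>
</tr>"""


def expand(curie):
    """Turn prefix:local (optionally wrapped in [ ]) into a full URI."""
    prefix, local = curie.strip("[]").split(":", 1)
    return PREFIXES[prefix] + local


def row_triples(xml_text):
    """Return (subject, predicate, object, datatype) tuples for one row.
    datatype is None for the type triple and for untyped values."""
    row = ET.fromstring(xml_text)
    subject = expand(row.get("about"))
    # The class attribute becomes the object of an rdf:type triple.
    triples = [(subject, RDF_TYPE, expand(row.get("class")), None)]
    for span in row.iter("span"):
        datatype = span.get("datatype")
        triples.append((
            subject,
            expand(span.get("property")),
            span.text,
            expand(datatype) if datatype else None,
        ))
    return triples


for triple in row_triples(ROW):
    print(triple)
```

The first tuple carries the class membership; the remaining three correspond to the property elements inside the fb:widgetShipment element in the RDF/XML.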

Auto-Generation of RDFa Metadata

All of my examples so far have been hand-coded, but when you consider the huge infrastructure of HTML-generating systems, it's not difficult to find opportunities for automatically generating large amounts of useful, machine-readable RDF triples inside of web pages. Templating languages typically give you a way to add HTML (or, if you prefer, XHTML) markup around the templating language's codes that indicates which values to plug in from another data source.

For example, the rhtml template files of a Ruby on Rails application let you specify the markup for one row of an HTML table, and then tell the Ruby interpreter to generate a row with that markup for each row of a table retrieved as part of a database query. You can add about attributes and span wrapper elements to the table markup as easily as you can add td elements and align attributes, and pretty soon your Ruby on Rails application is automatically generating triples of machine-readable typed values similar to those in the widget shipment table shown above. The same principle works with PHP scripts, Active Server Pages, and HTML generated by XQuery servers.
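The same idea can be sketched outside of any particular framework. The Python below uses the standard library's string.Template in place of an rhtml file, and a hard-coded list of dictionaries in place of a database query; the sample records (including the made-up x433 row) and the render_rows function are assumptions for illustration only.

```python
from html import escape
from string import Template

# One table row of markup, with RDFa attributes alongside the usual HTML.
ROW_TEMPLATE = Template(
    '<tr about="[fbi:$id]" class="fb:widgetShipment">\n'
    '  <td><span property="fb:shipmentID">$id</span></td>\n'
    '  <td><span property="fb:date" datatype="xs:date">$date</span></td>\n'
    '  <td><span property="fb:amount" datatype="xs:integer">$amount</span></td>\n'
    '</tr>'
)


def render_rows(records):
    """Render one RDFa-annotated table row per record."""
    rows = []
    for record in records:
        # Escape every value so that database content cannot break the markup.
        safe = {key: escape(str(value)) for key, value in record.items()}
        rows.append(ROW_TEMPLATE.substitute(safe))
    return "\n".join(rows)


# Stand-in for the result of a database query; the second row is invented.
shipments = [
    {"id": "x432", "date": "2007-04-23", "amount": 34},
    {"id": "x433", "date": "2007-04-24", "amount": 12},
]

print(render_rows(shipments))
```

Because the RDFa attributes live in the template, every row the application ever renders carries its triples with no extra work per record.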

Weblogging platforms also provide customizable templates to control the HTML that they generate. My host provider offers Movable Type as a weblogging platform, so I've been using it for a few years. When I insert RDFa markup into a template with Movable Type tags such as <$MTEntryPermalink$> and <$MTSubCategoryPath$> inside that markup, the Movable Type engine replaces its tags with the appropriate values for each weblog entry page being generated. For example, I added some RDFa markup with Movable Type tags in the head section of the template, like this:

<meta about= "<$MTEntryPermalink$>">
  <link rel="trackback:ping" href="http://madskills.com/public/xml/rss/module/trackback/"/>
  <link rel="dc:identifier" href="<$MTEntryPermalink$>"/>
  <link rel="dc:subject" href='http://www.snee.com/bobdc.blog/<$MTSubCategoryPath$>'/>
</meta>

and I wrapped some span elements around body content, like this:

<h3 class="entry-header"><span property="dc:title"><$MTEntryTitle$></span></h3>

For one recent weblog entry, Movable Type generated this for the header:

<meta about= "http://www.snee.com/bobdc.blog/2007/03/new_eric_van_der_vlist_book_on.html">
  <link rel="trackback:ping" href="http://madskills.com/public/xml/rss/module/trackback/"/>
  <link rel="dc:identifier" href="http://www.snee.com/bobdc.blog/2007/03/new_eric_van_der_vlist_book_on.html"/>
  <link rel="dc:subject" href='http://www.snee.com/bobdc.blog/xml'/>
</meta>

and it generated this for the h3 part shown above:

<h3 class="entry-header"><span property="dc:title">New Eric van der Vlist book on 
Schematron out</span></h3>

An RDFa extractor gets (among other triples) the following RDF out of the document, shown here in RDF/XML:

<rdf:Description rdf:about="http://www.snee.com/bobdc.blog/2007/03/new_eric_van_der_vlist_book_on.html">
  <trackback:ping rdf:resource="http://madskills.com/public/xml/rss/module/trackback/"/>
  <dc:subject rdf:resource="http://www.snee.com/bobdc.blog/xml"/>
  <dc:identifier rdf:resource="http://www.snee.com/bobdc.blog/2007/03/new_eric_van_der_vlist_book_on.html"/>
  <dc:title rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">New Eric van der 
   Vlist book on Schematron out</dc:title>
</rdf:Description>

Movable Type creates the RDFa I've shown here for each new file that it creates, and for each old file as well, because it's easy enough to tell Movable Type to regenerate all of them. So shortly after I made this change to the template, I had nice RDFa metadata in all the weblog entries I'd ever written on this system. To harvest that metadata, I could use a script with a single wget or curl call for each weblog entry to combine that metadata into a single file, and then I could create specialized tables of contents, reports, Topic Maps, and other applications around this content collection.
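A harvesting script along those lines might look like the following Python sketch, which swaps the wget or curl calls for urllib and uses the standard library's HTMLParser instead of a real RDFa extractor. It recognizes only the two patterns used in the template above (link elements with a prefixed rel value, and span elements with a property attribute), does not handle nested span elements, leaves prefixes unexpanded, and assumes UTF-8 pages. The RDFaHarvester class and the harvest and fetch_page functions are names invented for the example.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class RDFaHarvester(HTMLParser):
    """Collect (subject, predicate, object) tuples from one page."""

    def __init__(self, page_url):
        super().__init__()
        self.subject = page_url   # default subject is the page itself
        self.triples = []
        self._property = None     # property of the span currently open
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("about"):
            self.subject = attrs["about"]
        elif tag == "link" and ":" in (attrs.get("rel") or "") and attrs.get("href"):
            # Only prefixed rel values; skips rel="stylesheet" and the like.
            self.triples.append((self.subject, attrs["rel"], attrs["href"]))
        elif tag == "span" and attrs.get("property"):
            self._property = attrs["property"]
            self._text = []

    def handle_data(self, data):
        if self._property is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._property is not None:
            # Collapse line breaks and runs of spaces in the element content.
            value = " ".join("".join(self._text).split())
            self.triples.append((self.subject, self._property, value))
            self._property = None


def fetch_page(url):
    """Download one page; assumes the page is UTF-8 encoded."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def harvest(urls, fetch=fetch_page):
    """Return the combined tuples from every page in urls."""
    combined = []
    for url in urls:
        parser = RDFaHarvester(url)
        parser.feed(fetch(url))
        parser.close()
        combined.extend(parser.triples)
    return combined
```

Calling harvest with a list of entry permalinks would return one combined list of tuples, which could then be written out in whatever RDF syntax the downstream application expects. Passing a different fetch function makes it possible to run the same code against locally saved copies of the pages.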

Whenever you see HTML being generated automatically, you have an opportunity to create RDFa. Movie timetables, price lists, and so many other web pages where we look up information are generated from a backend database. This is fertile ground for easy RDFa generation, which could make RDFa's ease of incorporating proper RDF triples into straightforward HTML one of the great milestones in the building of the semantic web.