Thinking out loud: Wordform and Dynamic RDF

Recovered from the Wayback Machine.

One problem with attaching metadata recorded as RDF/XML to a web object, particularly a web page, is that there is no clean way to embed the XML into an (X)HTML document; at least, no way to embed the data and still have the page validate.

Yet creating separate files just for the RDF/XML can get messy, as I found when I generated PostCon data for my weblog post entries a while back.

However, with a dynamic page application, such as WordPress, another approach is to have the application provide the appropriate data, based on passed parameters.

For instance, with WordPress, attaching “/trackback/” at the end of the post name doesn’t serve up the post page; instead it triggers the trackback web services. Doing the same with “/feed/” returns the RSS syndication feed, and so on.

WordPress also has a way of attaching keyword-value pairs to a specific post. I’ve used this data to provide sidebar meta information about a post, here and at Burningbird, and I plan on using this in more depth within Wordform, my customized fork of WordPress.

I’ve been asked whether I would be using this capability to generate PostCon entries. I could, but a slight modification of an RSS syndication feed could do this just as easily. What interests me more is the ability to support RDF/XML generation for a variety of models (i.e. specific vocabularies), architected using built-in utility functions within the weblogging tool. These then would map the data to a structure that could be used to drive out RDF/XML when attaching the specific model name to the post, such as “/postcon/” for PostCon, and “/poetry/” for the Poetry Finder.
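To make the idea concrete, here is a minimal sketch of recognizing a model name at the end of a post URL. It is written in Python purely for illustration (the real thing would live in the PHP weblogging code), and the function name and the set of model names are my own assumptions, not anything in WordPress or Wordform.

```python
# Hypothetical set of supported models; nothing here is WordPress API.
KNOWN_MODELS = {"postcon", "poetry"}


def split_model_suffix(path):
    """Split a request path into (post_slug, model_name).

    '/my-post/poetry/' -> ('my-post', 'poetry')
    If the last segment is not a known model, model_name is None and
    the ordinary post page would be served instead.
    """
    segments = [s for s in path.split("/") if s]
    if len(segments) >= 2 and segments[-1] in KNOWN_MODELS:
        return "/".join(segments[:-1]), segments[-1]
    return "/".join(segments), None
```

A request for “/my-post/poetry/” would then be handed to whatever generates the Poetry Finder RDF/XML, while “/my-post/” falls through to the normal page.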

Yeah, easier said than done.

What would be nice would be to integrate existing RDF tools and applications to handle as much of this extended semantic modeling and metadata management as possible. A PHP-based API, such as RAP (RDF API for PHP), could be used to handle much of this, and should integrate nicely into the PHP-based weblogging functionality–but how do you simplify modeling relationships when your user is barely conversant with HTML, much less something more complex?

The best approach would be to use a plug-in architecture to provide simplified, user-friendly front-ends to collect the metadata based on a specific model. Based on this there would be an RDF Poetry Finder plug-in to collect the poetry metadata, which would then incorporate this data into triples in the database. Associated with the plug-in would be a backend process that maps to a ‘data type’ passed to the tool (that previously mentioned ‘/poetry/’) and generates the RDF/XML for that model.
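As a rough sketch of what the plug-in front-end would do behind the scenes, the following turns plain form fields into triples and stores them. It is Python rather than PHP, it uses SQLite in place of RAP's database-backed models, and the namespace URI, table layout, and function names are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical namespace for the poetry vocabulary; not a published URI.
POETRY_NS = "http://example.org/poetry#"


def fields_to_triples(post_uri, fields):
    """Turn simple form fields into (subject, predicate, object) triples.

    Empty fields are skipped, so the user only fills in what they know.
    """
    return [
        (post_uri, POETRY_NS + name, value)
        for name, value in sorted(fields.items())
        if value
    ]


def store_triples(conn, triples):
    """Persist triples in a single three-column table (an assumed layout)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS triples "
        "(subject TEXT, predicate TEXT, object TEXT)"
    )
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    conn.commit()
```

The point is that the person annotating the post only ever sees a form with fields such as “poet” and “title”; the mapping to the model happens in the plug-in.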

Wordform is based on a cut of the code of WordPress 1.3, which I believe will be incorporating the capability of adding plug-ins to the administration pages–another piece of the puzzle provided. If not, extending the admin UI is functionality that should be added, and without using DHTML.

So the workflow for Poetry Finder would be:

Create the post using basic weblogging functionality.
Annotate the post with poetry metadata, using the Poetry Finder administrative plug-in.
Use RAP to add the data to the database.

When the poetry metadata is requested by an application passing “/poetry/” as an extension to the post name, the poetry plug-in intercepts the request via a Wordform/WordPress filter, uses RAP to pull the data from the database, and generates valid RDF/XML to return.
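The generation step might look something like the sketch below. Again this is Python for illustration, not RAP: it serializes triples with the standard library's ElementTree, treats every object as a plain literal, and assumes each predicate URI ends in “#name” or “/name”. The function name and the prefix mapping are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def triples_to_rdfxml(triples, prefixes):
    """Serialize (subject, predicate, object) triples as RDF/XML.

    prefixes maps a prefix such as 'poetry' to its namespace URI so the
    output uses readable element names. Objects are written as literals.
    """
    ET.register_namespace("rdf", RDF_NS)
    for prefix, uri in prefixes.items():
        ET.register_namespace(prefix, uri)

    root = ET.Element("{%s}RDF" % RDF_NS)
    descriptions = {}
    for subject, predicate, obj in triples:
        desc = descriptions.get(subject)
        if desc is None:
            # One rdf:Description per subject, identified by rdf:about.
            desc = ET.SubElement(root, "{%s}Description" % RDF_NS)
            desc.set("{%s}about" % RDF_NS, subject)
            descriptions[subject] = desc
        # Split the predicate at the last '#' or '/' into namespace + name.
        cut = max(predicate.rfind("#"), predicate.rfind("/")) + 1
        prop = ET.SubElement(desc, "{%s}%s" % (predicate[:cut], predicate[cut:]))
        prop.text = obj
    return ET.tostring(root, encoding="unicode")
```

In the plug-in, the triples passed in would be whatever was pulled from the database for the requested post, and the returned string would be sent back in place of the post page.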

The same workflow should work with category data, and even at the weblog level. For instance, this could be used to generate a FOAF file if one wished. The strength of this approach, though, is for individual and category archives.

To make the data useful, it would then need to be aggregated, but we have successful examples of how this can be done with RSS and FOAF. A centralized, searchable store of the collected data would need to be created, but that’s for another late night brainstorming session.