Myths about RDF/RSS

Lots of discussion about the direction that RSS is going to take, which I think is good. However, the first thing that happens any time a conversation about RSS occurs is people start questioning the use of RDF within the RSS 1.0 specification, and the necessity of keeping RSS “simple”.

Mark Pilgrim writes:

Many people in the RSS community feel that, while the lack of extensibility in RSS 0.9x is too limiting, the full-blown RDF syntax of RSS 1.0 is overkill for the purposes of syndicating weblogs.

Jon Udell writes:

RSS is becoming too complex. It needs to remain simple, human-readable and -writable.

Well, this just plain peeves me. Not Mark or Jon’s statements, but the idea that a) RSS must be human readable and writeable and b) RDF makes RSS overly complex.

Specifically, there are three myths I want to address:

Myth 1. RDF adds complexity to RSS because the RDF Seq element is unnatural and adds an extra layer of processing.

Hanging from a tree dressed in orange, purple, and lime green while reciting the Gettysburg Address and drinking a glass of water dyed blue at the same time is unnatural. The use of RDF containers (which is what the Seq element is) in RSS is to provide some structure to the data. (See my RDF/RSS file, generated by Movable Type for examples during this dicussion.)

The RDF Seq container provides an explicit ordering — top down — to all the elements contained within the tag. Without the Seq element, there is only an implicit assumption that all items are processed in a certain order.

I’m not fond of RDF containers myself principally because there is built-in processing associated with them, though I understand their use in maintaining relationships between elements; however if I was a tool builder, I would at least understand what Seq means, and that helps eliminate confusion about the specification. If you didn’t have the RDF Seq container, there might be an assumption that the item ordering is important, but there’s nothing enforcing this assumption.

Not using the Seq container is as bad as the defining the <em> element in HTML — exactly what are we, as tool builders, supposed to provide with this element?

Joe: Well, I’m building my browser to use italic font, same weight and line height as the surrounding text. That’s emphasis.

Sara: Well, I’m building my browser to use a bolder font, and to increase the size as well. This is emphasis we’re trying to define here.

Dubya: Em? Auntie Em?

Myth 2. RSS must be human readable/writeable

Let’s get real about markup: markup is not human readable and writeable. I don’t care if you’re talking SGML, HTML, or XML, markup is not meant to be created and consumed by humans. Now, we may adapt and learn to work with markup. However, we can also adapt to spending 8 or more hours a day in a small, cramped, walled in, windowless, artificially lighted and ventilated environment, too, and that’s no more human than markup reading and writing. Markup exists to be generated by automated processes and consumed by automated processes.

All you webloggers out there that create your RSS feeds by hand, raise your hands. Now, those with their hands in the air, dump whatever tools you’re using to build your weblogs and get Moveable Type and let the machines do what we pay them to do.

Myth 3: RDF doesn’t add anything to RSS

I remember a debate several years back about how the relational data model was too complex and didn’t add any value to a company’s business.

RDF is the relational data model of XML. Now, it’s true, I’m writing a book on the subject and am biased. However, I’m writing the book because I believe in the concepts of RDF, I don’t believe in RDF because I’m writing a book on it.

RDF provides a structured meta-data language that can be used to define any XML vocabulary, providing rules to ensure that all instances of the XML that use the vocabulary are consistent with one another. In addition, with RDF you have a host of pre-built tools and APIs that allow you to access the data from many different business vocabularies with little or no change to the underlying technology. May not seem like much, but believe me, this will get you buy in on new technology at a company faster than whether there’s a version tag in the specification. After all, it worked for Oracle.

I’ll have more to say on this debate but it’s late, and I’m tired. Another day.