Categories
Semantics Technology

RSS: Proof is in the implementation

Sam Ruby had taken a first shot at RSS 2.0 with an RSS document demonstrating the new, simplified RSS syntax. No evidence of RDF, RSS version, no RDF Seq.

Mark expanded on this with what looks to be the same specification, different examples, and the use of included HTML (parseType="Literal" in RDF terms). (Correct me if I misread this, Mark.)

Since Sam has published an example of his version, allow me to work with the assumption that whatever works with his proposed RSS 2.0 should work with Mark’s, with the addition of HTML literals.

In this weblog page, I have PHP processing for the Book recommendation list. I copied the page and modified it to process Sam’s new proposed RSS file. You can see it in action here. The process took me about 10 minutes because the SHIFT key on my laptop doesn’t work well, and I am using vi to make the edits.

Now, I want to show you something. Here is my MT generated RDF/RSS file. Taking this and Sam’s and Mark’s proposed RSS 2.0, I came up with a simplified RDF/RSS syntax, seen in this file and also duplicated here:

<?xml version="1.0"?>

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns="http://purl.org/rss/1.0/">

<channel rdf:about="http://weblog.burningbird.net/">
<title>Burningbird</title>
<link>http://weblog.burningbird.net/</link>
<description></description>

<item>
<rdf:Description rdf:about="http://weblog.burningbird.net/archives/000514.php">
<link>http://weblog.burningbird.net/archives/000514.php</link>
<title>Myths about RDF/RSS</title>
<description>Lots of discussion about the direction that RSS is going to take, which I think is good. However, the first thing that
happens any time a conversation about RSS occurs is people start questioning the use of RDF within the…</description>
<dc:subject>Technology</dc:subject>
<dc:creator>shelley</dc:creator>
<dc:date>2002-09-06T00:53:16-06:00</dc:date>
</rdf:Description>
</item>

<item>
<rdf:Description rdf:about="http://weblog.burningbird.net/archives/000515.php">
<link>http://weblog.burningbird.net/archives/000515.php</link>
<title>ThreadNeedle Status</title>
<description>I provided a status on ThreadNeedle at the QuickTopic discussion group. I wish I had toys for you to play with, but no
such luck. To those who were counting on this technology, my apologies for not having it for…</description>
<dc:subject>Technology</dc:subject>
<dc:creator>shelley</dc:creator>
<dc:date>2002-09-06T00:19:28-06:00</dc:date>
</rdf:Description>
</item>

</channel>

</rdf:RDF>

Differences are:

  1. RDF element rather than RSS
  2. No versioning – not necessary with the concept of namespaces
  3. Use of namespaces to differentiate modules
  4. Surrounding the ITEM’s properties with an RDF:Description. The ITEM can have either literal data or XML elements that should be parsed. By using RDF:Description, I’m giving a hint to the processors that what follows is XML data to be parsed for new elements, so turn off literal text processing optimization and use the more memory- and CPU-intensive XML parser, please.

Notice that there is no RDF:Seq in this RDF/RSS version. Why? You don’t have to use the Seq element for valid RDF. I believe Seq was used with RSS 1.0 because the originators of RSS 1.0 wanted to provide ordering information to the tool builders. However, this really seems to be an absolute sticking point with everyone. Fine. Dump it.

Run my new RDF/RSS through the RDF validator (here), and you’ll see it’s valid RDF.
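To make "valid RDF" a little more concrete: an RDF parser reads each rdf:Description as a set of subject, predicate, object statements about the resource named in rdf:about. The following is a minimal sketch in Python of that reading, not what the validator itself runs. The function name description_triples and the trimmed SAMPLE are assumptions for illustration, and the sketch only handles plain literal properties like the ones in the file above.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Trimmed, single-item copy of the simplified RDF/RSS file (illustrative only).
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns="http://purl.org/rss/1.0/">
<channel rdf:about="http://weblog.burningbird.net/">
<title>Burningbird</title>
<item>
<rdf:Description rdf:about="http://weblog.burningbird.net/archives/000514.php">
<title>Myths about RDF/RSS</title>
<dc:creator>shelley</dc:creator>
</rdf:Description>
</item>
</channel>
</rdf:RDF>"""


def description_triples(xml_text):
    """Return (subject, predicate, object) tuples for every rdf:Description.

    Hypothetical helper: it covers literal-valued child elements only and
    ignores the rest of the RDF/XML syntax.
    """
    root = ET.fromstring(xml_text)
    triples = []
    for node in root.iter("{%s}Description" % RDF_NS):
        subject = node.get("{%s}about" % RDF_NS)
        for prop in node:
            # ElementTree names look like "{namespace}local"; the RDF
            # predicate is the namespace and local name joined together.
            namespace, local = prop.tag[1:].split("}", 1)
            triples.append((subject, namespace + local, (prop.text or "").strip()))
    return triples


for triple in description_triples(SAMPLE):
    print(triple)
```

Each namespaced element name turns into a full predicate URI, which is why the modules can be told apart without a version attribute.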

Now, I created a third copy of my weblog page with the PHP processing and had it parse and print out this new RSS file. The changes necessary? I changed DC:DATE to DC:CREATOR — I wanted to print out the latter not the former. Here’s the new page.

Next, I copied the PHP page and had the code process my original RDF/RSS 1.0 file, the one that’s generated automatically from MovableType. Changes to the code? Nada. Not one single change other than the name of the RDF file. Time to make change? 4 seconds. See the new page here.

Now, all of these pages (including this one) use PHP-based XML processing to process the data (xml_parser). No specialized RSS or RDF APIs. Pure XML processing. And it took me about, well, honestly, probably a couple of hours to write the original code for my Books RDF/RSS application. That darn shift key you know.
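For readers who want to see what "pure XML processing" of this file can look like, here is a minimal sketch. It is written in Python rather than PHP and is not the code behind the pages linked above. It uses Python's expat bindings, which follow the same event-handler style as PHP's xml_parser functions. The function name read_items and the trimmed SAMPLE are assumptions for illustration.

```python
import xml.parsers.expat

RSS_NS = "http://purl.org/rss/1.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Trimmed copy of the simplified RDF/RSS file (descriptions omitted).
SAMPLE = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns="http://purl.org/rss/1.0/">
<channel rdf:about="http://weblog.burningbird.net/">
<title>Burningbird</title>
<link>http://weblog.burningbird.net/</link>
<description></description>
<item>
<rdf:Description rdf:about="http://weblog.burningbird.net/archives/000514.php">
<link>http://weblog.burningbird.net/archives/000514.php</link>
<title>Myths about RDF/RSS</title>
<dc:creator>shelley</dc:creator>
</rdf:Description>
</item>
<item>
<rdf:Description rdf:about="http://weblog.burningbird.net/archives/000515.php">
<link>http://weblog.burningbird.net/archives/000515.php</link>
<title>ThreadNeedle Status</title>
<dc:creator>shelley</dc:creator>
</rdf:Description>
</item>
</channel>
</rdf:RDF>
"""


def read_items(xml_text):
    """Collect title, link and dc:creator for each item from parser events."""
    # With namespace_separator=" ", expat reports names as "namespace local".
    wanted = {
        RSS_NS + " title": "title",
        RSS_NS + " link": "link",
        DC_NS + " creator": "creator",
    }
    items = []
    state = {"item": None, "field": None}

    def start(name, attrs):
        if name == RSS_NS + " item":
            state["item"] = {}
        elif state["item"] is not None and name in wanted:
            state["field"] = wanted[name]
            state["item"][state["field"]] = ""

    def chars(data):
        # Text can arrive in several chunks, so append rather than assign.
        if state["field"] is not None:
            state["item"][state["field"]] += data

    def end(name):
        if name == RSS_NS + " item":
            items.append(state["item"])
            state["item"] = None
        elif name in wanted:
            state["field"] = None

    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)
    return items


for entry in read_items(SAMPLE):
    print(entry["title"], entry["link"], entry["creator"])
```

Because the handlers key only on item, title, link and dc:creator, the same function should also pick up items that carry rdf:about directly and sit outside the channel, as in RSS 1.0. That is the point of the experiment: the parser never needs to know it is reading RDF.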

I’m not trying to downplay others’ concerns or existing work or effort, and I realize that I have a better understanding of RDF than most of you (not bragging, but grant me this as a given for discussion purposes at the moment) and that this gives me an edge when working with RDF.

What I’m trying to show is that keeping RDF in the RSS specification doesn’t necessarily mean that simplified processing is impossible, that we can’t use ‘regular’ XML tools, or that there will be a huge burden on tool writers.

We don’t have to keep Seq if it really bothers everyone. Let’s work this change. Let’s. Let us work this change. I like that phrase, don’t you?

By keeping RDF in RSS now (and really, are those changes I made to the proposed RSS 2.0 so hard to swallow?), we keep the door open for the benefits that will accrue some day when RDF does have broader use.

I guess what I’m trying to show, demonstrate, prove is that RDF doesn’t have to make things arbitrarily complicated, or confusing. That we can write documentation that clarifies those few bits of RDF in the specification so that it isn’t complicated for folks writing or reading this stuff by hand (or processing it with various languages).

I’m hoping with this demonstration that I’ll convince a few of you that we can keep the door open on this discussion rather than arbitrarily throwing RDF out. RDF, I’d like to gently remind you all, is a specification that’s been in the works for years by some of the best markup minds in the business. And as easy as it is to criticize the RDF working group for taking time, remember that they’re trying to create a specification that will stand the test of time, rather than break with every version, as we had with HTML.

Mark provided a summary of the RSS issue, and I know that this discussion has been going on for years. And I know that there are a lot of people who say, let’s just fork. But folks, this didn’t work for SQL and QUEL (remember QUEL?) years ago when the decision was being made about which query format to use when accessing relational database data. I really do want to see these specs come together, with members and players from all sides.

And I’ll also be honest and say that I really don’t want to see this owned by any private company or person. Sorry, but I just can’t accept this; it goes against everything I believe in. I am not belittling Dave’s and Userland’s contribution to RSS. I realize that Userland popularized RSS and a debt is owed.

What I am asking is that Dave become part of a team working on this, a team that’s open to people who literally have something to contribute on this issue, each with an equal vote. Yes, people like me, like Mark, like Sam, Jon, Joe, Bill — all the people who have something to contribute to make this specification rock. And hopefully prevent something like this from happening again in the future.

Am I too late though? Is the decision made? Can’t we talk?

Where’s the fire?

(Archived page and comments at Wayback Machine)

Categories
Technology

Threadneedle status

Recovered from the Wayback Machine.

I provided a status on ThreadNeedle at the QuickTopic discussion group. I wish I had toys for you to play with, but no such luck. To those who were counting on this technology, my apologies for not having it for you, and unless someone can point out an obvious solution to the problems I’ve recounted that I’m too dense to see (a good possibility that), chances are ThreadNeedle will remain an RDF schema without an implementation.

On a related note, I also wanted to specifically mention that Ben & Mena’s (of MovableType) development of the standalone TrackBack server was a remarkably generous gesture, one that I hadn’t given its proper due. I believe that the Trotts have gone above and beyond in how much they’ve given the weblogging community, and deserve kudos from me, not tired, cranky grumblings.

I don’t know if you all have taken the time to read what the Trotts have said, but they’re putting this server out under the Artistic License and encouraging people to use their technology, no charge. This means developers can incorporate this technology into their own applications, such as ThreadNeedle, which is one thing I am examining. Or into other weblogging tools, for that matter.

Damned if I’ve seen anyone thank Ben & Mena for this. I didn’t. I’ll amend this now: thank you, Ben and Mena.

Categories
Technology

Myths about RDF/RSS

Lots of discussion about the direction that RSS is going to take, which I think is good. However, the first thing that happens any time a conversation about RSS occurs is people start questioning the use of RDF within the RSS 1.0 specification, and the necessity of keeping RSS “simple”.

Mark Pilgrim writes:

Many people in the RSS community feel that, while the lack of extensibility in RSS 0.9x is too limiting, the full-blown RDF syntax of RSS 1.0 is overkill for the purposes of syndicating weblogs.

Jon Udell writes:

RSS is becoming too complex. It needs to remain simple, human-readable and -writable.

Well, this just plain peeves me. Not Mark’s or Jon’s statements, but the idea that a) RSS must be human readable and writeable and b) RDF makes RSS overly complex.

Specifically, there are three myths I want to address:

Myth 1. RDF adds complexity to RSS because the RDF Seq element is unnatural and adds an extra layer of processing.

Hanging from a tree dressed in orange, purple, and lime green while reciting the Gettysburg Address and drinking a glass of water dyed blue at the same time is unnatural. The use of RDF containers (which is what the Seq element is) in RSS is to provide some structure to the data. (See my RDF/RSS file, generated by Movable Type, for examples during this discussion.)

The RDF Seq container provides an explicit ordering — top down — to all the elements contained within the tag. Without the Seq element, there is only an implicit assumption that all items are processed in a certain order.
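The difference between explicit and implicit ordering can be shown with a small sketch in Python. The feed below is a hypothetical RSS 1.0 fragment (the example.org URLs and titles are made up) in which the rdf:Seq lists the items in a different order from the one in which the item elements happen to appear. The function names are likewise assumptions for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RSS = "{http://purl.org/rss/1.0/}"

# Hypothetical RSS 1.0 fragment: the Seq order and document order disagree.
FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
<channel rdf:about="http://example.org/">
<title>Example</title>
<link>http://example.org/</link>
<description>Example feed</description>
<items>
<rdf:Seq>
<rdf:li rdf:resource="http://example.org/2"/>
<rdf:li rdf:resource="http://example.org/1"/>
</rdf:Seq>
</items>
</channel>
<item rdf:about="http://example.org/1"><title>First written</title><link>http://example.org/1</link></item>
<item rdf:about="http://example.org/2"><title>Second written</title><link>http://example.org/2</link></item>
</rdf:RDF>"""


def titles_in_seq_order(xml_text):
    """Explicit ordering: follow the rdf:li entries inside the rdf:Seq."""
    root = ET.fromstring(xml_text)
    by_uri = {item.get(RDF + "about"): item for item in root.iter(RSS + "item")}
    order = [li.get(RDF + "resource") for li in root.iter(RDF + "li")]
    return [by_uri[uri].findtext(RSS + "title") for uri in order if uri in by_uri]


def titles_in_document_order(xml_text):
    """Implicit ordering: trust whatever order the item elements appear in."""
    root = ET.fromstring(xml_text)
    return [item.findtext(RSS + "title") for item in root.iter(RSS + "item")]


print(titles_in_seq_order(FEED))
print(titles_in_document_order(FEED))
```

A tool that honors the Seq has a stated order to follow. A tool that reads the items top down is relying on an assumption that nothing in the file enforces.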

I’m not fond of RDF containers myself principally because there is built-in processing associated with them, though I understand their use in maintaining relationships between elements; however if I was a tool builder, I would at least understand what Seq means, and that helps eliminate confusion about the specification. If you didn’t have the RDF Seq container, there might be an assumption that the item ordering is important, but there’s nothing enforcing this assumption.

Not using the Seq container is as bad as defining the <em> element in HTML: exactly what are we, as tool builders, supposed to provide with this element?

Joe: Well, I’m building my browser to use italic font, same weight and line height as the surrounding text. That’s emphasis.

Sara: Well, I’m building my browser to use a bolder font, and to increase the size as well. This is emphasis we’re trying to define here.

Dubya: Em? Auntie Em?

Myth 2. RSS must be human readable/writeable

Let’s get real about markup: markup is not human readable and writeable. I don’t care if you’re talking SGML, HTML, or XML, markup is not meant to be created and consumed by humans. Now, we may adapt and learn to work with markup. However, we can also adapt to spending 8 or more hours a day in a small, cramped, walled in, windowless, artificially lighted and ventilated environment, too, and that’s no more human than markup reading and writing. Markup exists to be generated by automated processes and consumed by automated processes.

All you webloggers out there that create your RSS feeds by hand, raise your hands. Now, those with their hands in the air, dump whatever tools you’re using to build your weblogs and get Movable Type and let the machines do what we pay them to do.

Myth 3: RDF doesn’t add anything to RSS

I remember a debate several years back about how the relational data model was too complex and didn’t add any value to a company’s business.

RDF is the relational data model of XML. Now, it’s true, I’m writing a book on the subject and am biased. However, I’m writing the book because I believe in the concepts of RDF; I don’t believe in RDF because I’m writing a book on it.

RDF provides a structured meta-data language that can be used to define any XML vocabulary, providing rules to ensure that all instances of the XML that use the vocabulary are consistent with one another. In addition, with RDF you have a host of pre-built tools and APIs that allow you to access the data from many different business vocabularies with little or no change to the underlying technology. May not seem like much, but believe me, this will get you buy-in on new technology at a company faster than whether there’s a version tag in the specification. After all, it worked for Oracle.

I’ll have more to say on this debate but it’s late, and I’m tired. Another day.

Categories
Technology Weblogging

Threadneedle meets BlogMD

Recovered from the Wayback Machine.

I spent some time today hanging around at the BlogMD discussion group, talking about RDF, RSS, embedding problems, data models and so on.

As much of a lone wolf as I must seem to people, I prefer working these types of problems as a team. There is something about multiple heads working together that can make the most complicated problem seem solvable.

Unfortunately, it doesn’t seem as if any of the weblogging tool builders are involved in this effort. Too bad. The only way something like ThreadNeedle, or TrackBack, is really going to work is if we can get buy-in from, at the least, Userland, Movable Type, and Blogger.

Categories
Technology

News Readers

Recovered from the Wayback Machine.

Ben Hammersley, the author of the upcoming O’Reilly book on RSS, Content Syndication with XML and RSS, has a new article out in the Guardian about RSS newsreaders. A nice read on the subject.

I don’t know what it is about Ben’s writing, but he makes technology seem so approachable and folksy. Put the water on to boil for coffee and glance through the RSS newsfeeds as you wait for the whistle. Probably spreads a bit of Marmite on his toast as he reads the full articles, but we’ll forgive him that.