Categories
Technology

Myths about RDF/RSS

Lots of discussion about the direction that RSS is going to take, which I think is good. However, the first thing that happens any time a conversation about RSS occurs is people start questioning the use of RDF within the RSS 1.0 specification, and the necessity of keeping RSS “simple”.

Mark Pilgrim writes:

Many people in the RSS community feel that, while the lack of extensibility in RSS 0.9x is too limiting, the full-blown RDF syntax of RSS 1.0 is overkill for the purposes of syndicating weblogs.

Jon Udell writes:

RSS is becoming too complex. It needs to remain simple, human-readable and -writable.

Well, this just plain peeves me. Not Mark or Jon’s statements, but the idea that a) RSS must be human readable and writeable and b) RDF makes RSS overly complex.

Specifically, there are three myths I want to address:

Myth 1. RDF adds complexity to RSS because the RDF Seq element is unnatural and adds an extra layer of processing.

Hanging from a tree dressed in orange, purple, and lime green while reciting the Gettysburg Address and drinking a glass of water dyed blue at the same time is unnatural. The use of RDF containers (which is what the Seq element is) in RSS is to provide some structure to the data. (See my RDF/RSS file, generated by Movable Type, for examples during this discussion.)

The RDF Seq container provides an explicit ordering — top down — to all the elements contained within the tag. Without the Seq element, there is only an implicit assumption that all items are processed in a certain order.
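A minimal sketch of the ordering point, in Python. The feed is a made-up sample with hypothetical example.com URLs, and `items_in_seq_order` is an illustrative name, not part of any RSS toolkit. It assumes each `rdf:li` carries an `rdf:resource` attribute that matches an item's `rdf:about`. The items appear in one order in the file, but the Seq states the intended order explicitly.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by RSS 1.0 and RDF.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NS = {"rss": "http://purl.org/rss/1.0/", "rdf": RDF}

# Hypothetical sample feed: the Seq lists post 2 before post 1,
# while the item elements appear in the opposite order.
FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.com/index.rdf">
    <title>Example weblog</title>
    <link>http://example.com/</link>
    <description>Made-up feed for illustration</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.com/post/2"/>
        <rdf:li rdf:resource="http://example.com/post/1"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://example.com/post/1">
    <title>First post</title>
    <link>http://example.com/post/1</link>
  </item>
  <item rdf:about="http://example.com/post/2">
    <title>Second post</title>
    <link>http://example.com/post/2</link>
  </item>
</rdf:RDF>"""


def items_in_seq_order(feed_xml):
    """Return item titles in the order given by the channel's rdf:Seq."""
    root = ET.fromstring(feed_xml)
    # Index every item by its rdf:about URI.
    by_uri = {
        item.get(f"{{{RDF}}}about"): item
        for item in root.findall("rss:item", NS)
    }
    titles = []
    # Walk the Seq top down; its order is explicit, not implied by file layout.
    for li in root.findall("rss:channel/rss:items/rdf:Seq/rdf:li", NS):
        item = by_uri.get(li.get(f"{{{RDF}}}resource"))
        if item is not None:
            titles.append(item.findtext("rss:title", default="", namespaces=NS))
    return titles


print(items_in_seq_order(FEED))
```

A tool that ignored the Seq and simply read the items in file order would see them the other way around, which is exactly the kind of unstated assumption the container removes.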

I’m not fond of RDF containers myself, principally because there is built-in processing associated with them, though I understand their use in maintaining relationships between elements. However, if I were a tool builder, I would at least understand what Seq means, and that helps eliminate confusion about the specification. If you didn’t have the RDF Seq container, there might be an assumption that the item ordering is important, but there’s nothing enforcing this assumption.

Not using the Seq container is as bad as defining the <em> element in HTML: exactly what are we, as tool builders, supposed to provide with this element?

Joe: Well, I’m building my browser to use italic font, same weight and line height as the surrounding text. That’s emphasis.

Sara: Well, I’m building my browser to use a bolder font, and to increase the size as well. This is emphasis we’re trying to define here.

Dubya: Em? Auntie Em?

Myth 2. RSS must be human readable/writeable

Let’s get real about markup: markup is not human readable and writeable. I don’t care if you’re talking SGML, HTML, or XML, markup is not meant to be created and consumed by humans. Now, we may adapt and learn to work with markup. However, we can also adapt to spending 8 or more hours a day in a small, cramped, walled in, windowless, artificially lighted and ventilated environment, too, and that’s no more human than markup reading and writing. Markup exists to be generated by automated processes and consumed by automated processes.

All you webloggers out there that create your RSS feeds by hand, raise your hands. Now, those with their hands in the air, dump whatever tools you’re using to build your weblogs and get Movable Type and let the machines do what we pay them to do.

Myth 3: RDF doesn’t add anything to RSS

I remember a debate several years back about how the relational data model was too complex and didn’t add any value to a company’s business.

RDF is the relational data model of XML. Now, it’s true, I’m writing a book on the subject and am biased. However, I’m writing the book because I believe in the concepts of RDF, I don’t believe in RDF because I’m writing a book on it.

RDF provides a structured meta-data language that can be used to define any XML vocabulary, providing rules to ensure that all instances of the XML that use the vocabulary are consistent with one another. In addition, with RDF you have a host of pre-built tools and APIs that allow you to access the data from many different business vocabularies with little or no change to the underlying technology. May not seem like much, but believe me, this will get you buy in on new technology at a company faster than whether there’s a version tag in the specification. After all, it worked for Oracle.

I’ll have more to say on this debate but it’s late, and I’m tired. Another day.

Categories
Technology Weblogging

ThreadNeedle meets BlogMD

Recovered from the Wayback Machine.

I spent some time today hanging around at the BlogMD discussion group, talking about RDF, RSS, embedding problems, data models and so on.

As much of a lone wolf as I must seem to people, I prefer working these types of problem as a team. There is something about multiple heads working together that can make the most complicated problem seem solvable.

Unfortunately, it doesn’t seem as if any of the weblogging tool builders are involved in this effort. Too bad. The only way something like ThreadNeedle, or TrackBack, is really going to work is if we can get buy in from, at the least, Userland, Movable Type, and Blogger.

Categories
Technology

News Readers

Recovered from the Wayback Machine.

Ben Hammersley, the author of the upcoming O’Reilly book on RSS, Content Syndication with XML and RSS, has a new article out in the Guardian about RSS Newsreaders. A nice read on the subject.

I don’t know what it is about Ben’s writing, but he makes technology seem so approachable and folksy. Put the water on to boil for coffee and glance through the RSS newsfeeds as you wait for the whistle. Probably spreads a bit of Marmite on his toast as he reads the full articles, but we’ll forgive him that.

Categories
Technology Weblogging

Tech stuff

Recovered from the Wayback Machine.

Back is still quite painful, and has now been joined by cable modem. A case of new technology on old wires — for the modem, that is, not the back. Until the repair person comes out a week from tomorrow, my online access is going to be sporadic.

Sam Ruby has provided a very abbreviated introduction to RSS. I appreciate Sam’s effort, though I think it’s important to note that the RSS 0.9x and the RSS 1.0 efforts are following two separate and not necessarily parallel paths. Small correction — I believe the original expansion for RSS was “Rich Site Summary”. (Thanks to Mark for link.)

There’s a new effort for defining weblogging data with the BlogMD initiative. I’m not sure whether the group would be interested in the RDF vocabulary I designed for ThreadNeedle. From current discussion in the associated forum, probably not.

Speaking of which: the active effort of embedding RDF (data) for ThreadNeedle in each weblog posting is out, because it doesn’t work with existing weblogging tools. I’m now working on a webbot that scans for links and builds discussions from them, which will then be stored in a repository. From this I will then generate RDF documents of a discussion.
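The webbot itself isn't shown here, so the following is only a rough sketch, in Python, of what "scanning for links and building discussions" could look like. Everything in it is an assumption made for illustration: the `LinkCollector` and `build_discussions` names, the in-memory `PAGES` dictionary standing in for fetched weblog postings, and the example.com URLs. Storing the result in a repository and generating RDF from it are left out.

```python
from collections import defaultdict
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def build_discussions(pages):
    """Map each known posting to the other known postings that link to it.

    pages: dict of posting URL -> HTML source (hypothetical, already fetched).
    """
    discussions = defaultdict(list)
    for url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for target in collector.links:
            # Keep only links between postings we know about, skipping self-links.
            if target in pages and target != url:
                discussions[target].append(url)
    return dict(discussions)


# Made-up postings: b and c both respond to a.
PAGES = {
    "http://example.com/a": "<p>Original posting, no links.</p>",
    "http://example.com/b": '<p>I disagree with <a href="http://example.com/a">this</a>.</p>',
    "http://example.com/c": '<p>See <a href="http://example.com/a">a</a> and <a href="http://example.org/x">x</a>.</p>',
}

print(build_discussions(PAGES))
```

The point of the design is that nothing has to be embedded in the postings themselves: the discussion is reconstructed from ordinary links, so it works regardless of which weblogging tool produced the page.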

Frankly, after the rather unenthusiastic response I’m seeing with TrackBack, I’m not sure weblogging really needs or wants some of the technology the techies keep wanting to provide.

Categories
Technology

A few points of clarification on RSS

Recovered from the Wayback Machine.

Dave has a long multi-part posting today about RSS, as well as about an article that covers RSS and aggregators, which he blasts but won’t link to or give us any means of discovering.

He writes:

A note to people writing articles about RSS-based news aggregators. UserLand wrote and deployed the first one, in the spring of 1999. It was called My.UserLand.Com and was quite popular.

The concept of news aggregators, as well as RSS, had roots that extended back beyond Dave and My.Userland.Com. The concept of using XML to provide news feeds had implementations as channels with both Netscape and Microsoft (and other specialized companies that didn’t survive the dot-com implosion).

Don’t believe me? Then read an article I wrote and managed to salvage through the Wayback Machine from the now defunct Netscape Enterprise Developer magazine, January, 1998. In it, I showed how the XML provided for IE channels, known as CDF, could actually be picked up and used in other “aggregators”, except they were called “applications”. Jon Udell also writes about this, and references the use of RDF for describing channels in this Webbuilder article.

To provide more background material, Dan Brickley did a very nice overview of the history of RSS at the Yahoo-based RSS Development discussion group.

So, technically, Dave did not invent the concept of using XML for aggregation. Nor did he write the first implementation of a “news aggregator”. And the examples I just cited are what is known in legal circles as prior art, which means that Dave should use caution when he throws around “patent” with the implication that he’s the inventor of RSS or aggregation.

What Dave did do was help provide an implementation that gained popularity for the idea, especially when Netscape dropped out of the picture. For this, the RSS folks do owe Dave a debt of gratitude. However, at this point in time, it’s time for the concepts to slip out of one person’s hands and into the public domain where it belongs.

The RSS 0.9x family of RSS has been and will always be under the dominion of Userland. Debt of gratitude or not, I would rather put my money on a specification that isn’t owned by any one person or any one company.

The RSS 1.0 specification has two advantages. First, it’s based on RDF, which means much of the existing work and APIs and technologies that can be used with other RDF applications can be used with RSS. Secondly, it’s a team effort, with no one person ‘owning’ the specification at any one time. In fact, the RSS development team just voted to allow several new members on the team due to their outstanding contributions to the specification.

Dave asked us last week about what the RSS in RSS 1.0 means. He then printed up a page of our efforts, and then…nothing. Why did he do this? The only reason I can see — the only one — was to look for something with which to discredit the RSS 1.0 effort. And since those few of us who took the bait didn’t give him a weapon he could use against us, he somehow latched on to an email sent to him from an RSS user with little markup experience who believes that somehow RSS 0.9x is simpler than RSS 1.0.

News for those who think RSS 1.0 is too ‘complicated’ to work with: The RSS functionality I built into this weblog page uses straight XML processing to parse the RSS 1.0 page, and incorporate the contents. It was built using PHP, and took me about, oh, 20 minutes to write.

Piece of cake.
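The PHP in question isn't reproduced here. As a rough illustration of what "straight XML processing" of an RSS 1.0 file can look like, here is a sketch in Python using only the standard library's XML parser and no RDF toolkit. The sample feed, the example.com URLs, and the `headlines` and `render_list` names are all made up for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Only the RSS 1.0 namespace is needed to pull out titles and links.
RSS_NS = {"rss": "http://purl.org/rss/1.0/"}

# Hypothetical sample feed.
SAMPLE_FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.com/index.rdf">
    <title>Example weblog</title>
    <link>http://example.com/</link>
    <description>Made-up feed for illustration</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.com/post/1"/>
        <rdf:li rdf:resource="http://example.com/post/2"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://example.com/post/1">
    <title>First post</title>
    <link>http://example.com/post/1</link>
  </item>
  <item rdf:about="http://example.com/post/2">
    <title>Second post</title>
    <link>http://example.com/post/2</link>
  </item>
</rdf:RDF>"""


def headlines(feed_xml):
    """Return (title, link) pairs for every item element in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        (
            item.findtext("rss:title", default="", namespaces=RSS_NS),
            item.findtext("rss:link", default="", namespaces=RSS_NS),
        )
        for item in root.findall("rss:item", RSS_NS)
    ]


def render_list(entries):
    """Turn (title, link) pairs into an HTML list for a weblog sidebar."""
    rows = [
        f'<li><a href="{escape(link, quote=True)}">{escape(title)}</a></li>'
        for title, link in entries
    ]
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


print(render_list(headlines(SAMPLE_FEED)))
```

The only concession to RDF in the whole thing is the namespace on the element names. An ordinary XML parser handles the rest.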

I’m not going to repeat my reasons for supporting an RDF base with RSS 1.0. I am going to ask my own question, instead:

Dave, don’t you think it’s about time you stopped fostering the split between the two specifications and work with a team of folks — a team — in making sure that the RSS 1.0 specification maintains its simplicity, even with the use of RDF?

In addition, Dave, you might as well know right now that I wrote about aggregators in my RDF book and didn’t include Radio. Why? Radio’s only one application, it’s not the first, and aggregation is only one part of its functionality.

Regardless, don’t you think it’s a bit bald to write the statement “Sloppy habits that come from working in a corrupt industry” when referencing both a publication and a reporter who can’t then defend themselves because you won’t even provide a link or a name?

Second Update: I received an email from a friend that perhaps Dave didn’t link to the article because he was trying not to be directly confrontational, and was trying to be nice.

Any professional author and publication can take the heat — I know I have for my articles more than once. Dave withholding the link for these reasons just doesn’t wash, though I will acknowledge that his intentions could have been ‘good’.

However, by criticizing the article, saying it was ‘wrong’ for not including Radio, and then not providing a link to this article, Dave’s preventing us from reading it and judging for ourselves. Now all we’re left with is rumor and innuendo.

Update: In the comments associated with this posting, b!x pointed out a thread on RSS and the ethics of republication at Blogroots.com.

In my January 1998 article, mentioned above, I covered the problems associated with providing a data feed using XML on the web and how the material could be republished in ways not intended.

There’s also more of a story with this article. I proved this concept and associated problem out with Wired’s CDF file at the time, republishing their data at my web site as an example of the problems associated with published XML feeds. Wired discovered this, along with the awful page I plastered their data into, and started protecting their XML feed at that point.

Now, almost five years later, we’re starting to question the ethics of republishing RSS feeds.

My darling webloggers, what did you think people were doing with all that data — printing it out on pretty paper and using it to paper the baby’s room?