Categories
RDF

Threadneedle and RSS

The problem with a developer being around during the design phase of an application is that the developer tends to pull things back to an implementation viewpoint – we can’t help ourselves.

However, a discussion about ThreadNeedle and RSS is, I feel, important at this time.

Why am I not creating ThreadNeedle as a new module on RSS (Rich Site Summary)? After all, as webloggers we’re familiar with RSS, weblogging tools already generate RSS files, and we’re used to using aggregation tools that process RSS. Why am I not piggy-backing ThreadNeedle on to the RSS specification?

RSS started as a way of recording information about channels – sources of information of interest. The adoption of RSS within the weblogging community grew out of Dave Winer’s and Userland’s support of RSS as an XML vocabulary to describe individual weblog postings. With RSS, news aggregators can grab this information, providing it for quick perusal.

RSS 1.0 is based on RDF – Resource Description Framework. RDF is, in reality, a meta-language, a way to describe languages so that any vocabulary can be described in RDF. One aspect of RDF is that it can be used to describe XML vocabularies, something we’ve desperately needed since the inception of XML.

In a manner similar to the relational data model being used to describe different business data within commercial database systems, with RDF you can create different vocabularies for different business uses, and the same tools and technology can work with each. So, I can create an RDF vocabulary for a post-content management system, and a vocabulary for ThreadNeedle, and process both with the exact same Java and Perl APIs as I can use with RSS 1.0. For instance, I’ve processed RDF from all three types of XML documents using Jena (Java API) with absolutely no change to the code I used.
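To make the "same code, different vocabularies" point concrete, here is a minimal Python sketch. It is not the Jena code mentioned above, and it only understands a small subset of RDF/XML (nodes with `rdf:about` whose properties are plain literals or `rdf:resource` references). The `http://example.org/threadneedle#` namespace and the `respondsTo` property are hypothetical names invented for the illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def simple_triples(rdf_xml):
    """Return (subject, predicate, object) tuples from a small RDF/XML subset."""
    triples = []
    for node in ET.fromstring(rdf_xml):
        subject = node.get(f"{{{RDF_NS}}}about")
        if subject is None:
            continue  # this sketch skips nodes without rdf:about
        for prop in node:
            # ElementTree spells qualified names as "{namespace}local";
            # the sketch assumes every property element is namespaced.
            namespace, _, local = prop.tag[1:].partition("}")
            resource = prop.get(f"{{{RDF_NS}}}resource")
            value = resource if resource is not None else (prop.text or "").strip()
            triples.append((subject, namespace + local, value))
    return triples

# An RSS 1.0 item (the implied rdf:type of <item> is ignored by the sketch).
RSS_DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <item rdf:about="http://example.com/post/1">
    <title>Identity</title>
    <link>http://example.com/post/1</link>
  </item>
</rdf:RDF>"""

# A made-up thread vocabulary; the namespace and property are assumptions.
THREAD_DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:tn="http://example.org/threadneedle#">
  <rdf:Description rdf:about="http://example.com/post/2">
    <tn:respondsTo rdf:resource="http://example.com/post/1"/>
  </rdf:Description>
</rdf:RDF>"""

rss_triples = simple_triples(RSS_DOC)
thread_triples = simple_triples(THREAD_DOC)
```

The function knows nothing about RSS or about threads; both documents go through the same loop and come out as subject, predicate, object statements.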

Very powerful. Very handy. What’s been missing from XML since day one.

Best of all, through the use of “namespaces” – ways of identifying which elements belong to what vocabulary – I can combine different vocabularies in one document and the namespace designation prevents element collision: two elements with the same name from two different vocabularies combined in one document.
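A small sketch of how a namespace keeps two same-named elements apart, using Python's standard library. The RSS 1.0 namespace URI is the real one; the `tn` namespace and the wrapper `doc` element are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

RSS = "http://purl.org/rss/1.0/"
THREAD = "http://example.org/threadneedle#"  # hypothetical vocabulary

# Two <title> elements in one document, one from each vocabulary.
DOC = """<doc xmlns:rss="http://purl.org/rss/1.0/"
     xmlns:tn="http://example.org/threadneedle#">
  <rss:title>Burningbird</rss:title>
  <tn:title>Identity</tn:title>
</doc>"""

root = ET.fromstring(DOC)

# Each lookup names the namespace as well as the element, so there is no clash.
channel_title = root.find(f"{{{RSS}}}title").text
thread_title = root.find(f"{{{THREAD}}}title").text
```

Here `channel_title` and `thread_title` come from elements that share the local name `title` but are never confused with each other.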

Within RSS, namespaces are being used to add “modules” to the RSS specification – new additions to the vocabulary to record information about new types of sites, such as WikiWeb. These modules are, in reality, new vocabularies that can stand alone, but are meant to be used with RSS. With this, the core RSS specification doesn’t need to be modified to meet new business requirements (i.e. aggregate information from WikiWeb sites).

Good stuff.

However, RSS has a specific business purpose – to aggregate information from various sources of information, including weblogs, and to allow subscription to same. The point of focus of RSS is a specific news source – a weblog or a WikiWeb or a web site (technically referred to as “channel” within RSS) – and vocabulary elements become adjectives of same.

ThreadNeedle has a different business purpose. For instance, its main entity of interest is the discussion thread, which transcends any one source or any one point on the dialog thread. In addition, there is a connectivity between thread points that is critical information to capture – again something that’s not important from a business requirement standpoint for RSS.

Bottom line: trying to add blogthreading as a module to RSS would be the same as trying to use a banking database for an insurance company application. Yes, both are financial applications and both support customers and have to meet certain levels of accountability (government, stock holders, and so on). However, at this point the similarity ends – the business models differ.

More information:

RSS 1.0 spec
W3C RDF
RDF Primer

Categories
RDF Technology Weblogging

Technology to enable community

Recovered from the Wayback Machine.

Serendipity is such a major component of my life, never more so than when I read Gary’s attempt to manually connect the multiple threads to the whole discussion about Identity.

While I’m on my long journey through distance and time, I’m working on a new application that will provide a means to track cross-blog discussions, such as those my own virtual neighborhood (and others) participate in. The specs for the application are:

 

Project is called Thread the Needle, or “Needley” for short. Its purpose is to track cross-blogging threads.

How it works:

You register your weblog, once, with an online application I’ll provide (i.e. provide your weblog location, name of weblog, email). Frequently throughout the day, the Needle service bot will visit the weblog looking for RDF (an XML meta-language, used for RSS and other applications) embedded within the weblog page. Note that this may change to scan weblogs.com for changed weblogs that are registered, or based on the first time a person clicks the link or some other procedure – testing these out as you read this.

The RDF will be generated by the service now and copied and pasted into the posting; hopefully someday it will be generated automatically by the weblogging tools.

The RDF either starts a weblogging subject thread – starts a new subject – or continues an existing thread. The bot pulls this information in and when someone clicks on a small graphic/link attached to the posting, a page opens showing all related threads and their association with each other.

Example:

AKMA writes a posting on Identity. Because he starts the discussion thread he creates and embeds RDF “thread start” XML into the posting (generated by the tool using a very simple-to-use form, with the results cut and pasted into the posting). Included in this RDF is thread title, brief description, posting permalink, weblog name, and posting category, selected from a pulldown list.
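As a rough illustration of what the form might generate, the following Python sketch builds a "thread start" block from the five fields listed above. The actual vocabulary is not given in this post, so the namespace, the `ThreadStart` element, and the property names are all hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TN_NS = "http://example.org/threadneedle#"  # hypothetical vocabulary namespace

def thread_start_rdf(permalink, title, description, weblog, category):
    """Build a thread-start RDF/XML string; element names are illustrative only."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("tn", TN_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    # The posting permalink identifies the resource being described.
    start = ET.SubElement(
        root, f"{{{TN_NS}}}ThreadStart", {f"{{{RDF_NS}}}about": permalink}
    )
    fields = (
        ("title", title),
        ("description", description),
        ("weblog", weblog),
        ("category", category),
    )
    for name, value in fields:
        ET.SubElement(start, f"{{{TN_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")

generated = thread_start_rdf(
    "http://example.com/akma/identity",  # placeholder permalink
    "Identity",
    "What does identity mean online?",
    "AKMA's weblog",
    "Philosophy",
)
```

The returned string is what would be pasted into the posting; a response would carry a similar block that also points at the posting being answered.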

The generated code also contains a small graphic and link that a person clicks to get to the Needley page. Clicking another small graphic/link opens up a second form for a person wanting to respond to this posting, with key information already filled in.

The posting would look like:

 

This is posting stuff, posting stuff, words, more words more words
more words and so on.

link/graphic to view Needle thread page,
link/graphic to respond to current posting

Posted by person, date, comment

 

The embedded RDF is invisible.

David Weinberger creates his own posting related to AKMA’s posting, and clicks AKMA’s “respond” link and a form opens with pre-filled fields. He adds his own permalink info, pushes a button and a second page opens with generated RDF that David then embeds into his posting.

Stavros comes along wanting to continue on David’s discussion and follows the same process. Jeneane responds directly to AKMA, and Jonathon responds to Stavros, and Mike responds to David, and Steve responds to Jeneane and AKMA responds to David and Steve, who responds back to AKMA.

The Needle page for this thread shows:

AKMA
    David
        Stavros
            Jonathon
        AKMA
        Mike

    Jeneane
        Steve
            AKMA
Each of the above names is a hypertext link to the discussion posting. Some visual cue will probably be added to assist in the reading of the hierarchy of discussion. (I’ll also work to make sure that this page and its contents are fully accessible.)

If a person is responding to two or more of the threaded postings, they can add the generated RDF for each posting they’re responding to – there’s no limit. So Dorothea responds to Jonathon’s and AKMA’s original posting:

AKMA
    David
        Stavros
            Jonathon
                Dorothea*
        AKMA
        Mike

    Jeneane
        Steve
            AKMA

    Dorothea*

The asterisk shows that the posting is one response to multiple postings.
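The thread page described above can be sketched as a small tree walk in Python. The post ids are invented, AKMA's replies to David and to Steve are treated as two separate postings (the listing shows them without an asterisk), and the data is assumed to contain no reply loops.

```python
from collections import defaultdict

# (post id, author, ids of the postings it responds to); all ids are made up.
POSTS = [
    ("akma-1", "AKMA", []),
    ("david-1", "David", ["akma-1"]),
    ("stavros-1", "Stavros", ["david-1"]),
    ("jonathon-1", "Jonathon", ["stavros-1"]),
    ("akma-2", "AKMA", ["david-1"]),
    ("mike-1", "Mike", ["david-1"]),
    ("jeneane-1", "Jeneane", ["akma-1"]),
    ("steve-1", "Steve", ["jeneane-1"]),
    ("akma-3", "AKMA", ["steve-1"]),
    ("dorothea-1", "Dorothea", ["jonathon-1", "akma-1"]),
]

def render_thread(posts):
    """Return indented lines, one per place a posting appears in the thread."""
    authors = {pid: author for pid, author, _ in posts}
    # A posting that answers several others is listed under each, with "*".
    multi = {pid for pid, _, parents in posts if len(parents) > 1}
    children = defaultdict(list)
    roots = []
    for pid, _, parents in posts:
        if not parents:
            roots.append(pid)
        for parent in parents:
            children[parent].append(pid)

    lines = []

    def walk(pid, depth):
        mark = "*" if pid in multi else ""
        lines.append("    " * depth + authors[pid] + mark)
        for child in children[pid]:
            walk(child, depth + 1)

    for root in roots:
        walk(root, 0)
    return lines

thread_lines = render_thread(POSTS)
```

Printing `"\n".join(thread_lines)` gives the hierarchy-ordered view; a time-ordered view would need a timestamp on each entry, which this sketch leaves out.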

It will take approximately 30 seconds to click, complete, generate, cut and paste the RDF for a response; about 1 minute for starting a thread.

The results can either be hierarchy ordered, by response, or time ordered. The thread page starts with the thread title, category, description, date started, date of last update and each weblog entry is associated with a link that will take a person directly to the specific posting.

With this, people can see all those who’ve responded, can reply with a new posting, and the conversation can continue cross-blog, many-threaded.

I’ll probably try to add in graphics to create a flow diagram, similar to the RDF validation tool (see it at http://www.w3.org/RDF/Validator/ and use http://burningbird.net/example12f.rdf as a test RDF file to demonstrate).

Discussion thread titles and associated descriptions and categories will go on a main page that is continuously updated, with a link to the main thread page for each discussion. I’d like to add search capability by category, weblog, and keyword.

(e.g. “Show me all discussions that AKMA has originated that feature Identity”)

 

I’ve already incorporated RDF into Movable Type postings and have been able to successfully scrape and process the information.
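A minimal sketch of the scraping step in Python. The post does not say how the RDF is hidden in the page, so wrapping it in an HTML comment is an assumption, as are the sample page and the `tn` vocabulary; the function simply pulls out every `rdf:RDF` block it finds in the page text.

```python
import re

# Match each <rdf:RDF ...> ... </rdf:RDF> block, across line breaks.
RDF_BLOCK = re.compile(r"<rdf:RDF\b.*?</rdf:RDF>", re.DOTALL)

def extract_rdf_blocks(html):
    """Return the embedded RDF/XML blocks found in a weblog page, in page order."""
    return RDF_BLOCK.findall(html)

# A made-up weblog page with one block hidden inside an HTML comment.
PAGE = """<html><body>
<p>This is posting stuff, words, more words.</p>
<!--
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:tn="http://example.org/threadneedle#">
  <tn:ThreadStart rdf:about="http://example.com/akma/identity"/>
</rdf:RDF>
-->
</body></html>"""

blocks = extract_rdf_blocks(PAGE)
```

Each extracted block is a standalone XML document that can then be handed to whatever RDF parser the service uses.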

I’ll be asking for beta testers of this new technology in July, and will be hosting the discussion server at first. My wish is to distribute this application rather than centralize it, and I will look at ways this can occur (one major reason why I went with embedded RDF).

Update: AKMA and Gary Turner are collecting suggestions and requirements from the weblogging community for this application. A basic infrastructure is in place, but the user community needs to provide information about how this product will work, and what it will do. Please see AKMA’s posting to get additional information.


 

Just read Meg’s What we’re doing when we blog article. Though I can agree with many of Meg’s sentiments, I totally disagree with Meg’s philosophy that the weblogging format is the key to weblogging. Last time I looked, I thought it was the people. Meg truly missed the boat on this one. In fact, she wasn’t even at the dock to wave her handkerchief good-bye when the boat left.

The Thread the Needle application will help weblogger discussions, but it’s just an enabler – weblogging discussions can continue without it. We are connecting because of what we say, not the technology we use. Weblogging tools help, but they don’t create community.

Another instance of serendipity because the same day Meg’s article appears, I stated in the Pixelview interview:

 

Too many people focus on the technology of the web, forgetting that technology is nothing more than a gateway to wondrous things. The web introduces us to beauty, creativity, truth, new people and new ideas. I genuinely believe there are no limits to what we can accomplish given this connectivity.

Categories
RDF Weblogging

Doing my part: RSS auto-discovery

Since weblogging is all about RSS and aggregation, I’ve added the Mark Pilgrim RSS auto-discovery code to my weblog’s template.

Note: In the interests of disclosing any bias, be aware that I am writing a book on RDF, and that I support RSS 1.0 based on the RDF specification.
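For readers who haven't seen it, auto-discovery works by placing a `link` element with `rel="alternate"` and `type="application/rss+xml"` in the page head, so that a tool given only the weblog's address can locate the feed. The Python sketch below shows the reading side under that assumption; the sample page and feed address are made up.

```python
from html.parser import HTMLParser

class FeedLinkFinder(HTMLParser):
    """Collect href values of RSS auto-discovery <link> elements."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        mime_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rel_values and mime_type == "application/rss+xml" and href:
            self.feeds.append(href)

def find_feeds(html):
    """Return the feed addresses advertised by a page, in document order."""
    finder = FeedLinkFinder()
    finder.feed(html)
    return finder.feeds

# Made-up template fragment; the stylesheet link should be ignored.
SAMPLE = """<html><head>
<title>Example weblog</title>
<link rel="stylesheet" type="text/css" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="RSS"
      href="http://example.com/index.rdf" />
</head><body></body></html>"""

feeds = find_feeds(SAMPLE)
```

An aggregator would fetch the weblog's front page, run something like `find_feeds` over it, and subscribe to whatever address comes back.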

Categories
Semantics

Search engine

Since Google is going to the birds we should check out this new search engine that Allan found, Teoma.

I tried it and have found some really fascinating results based on burningbird, burning bird, and Shelley Powers.

For instance, with “burningbird”, Phoenix Systems, which owns burningbird.com (note that I own burningbird.net), shows on the first page. In Google, I have no idea where this poor company shows.

This new search engine promises hours of new fun. We’ll have to see how resistant it is to search engine bombs and assorted other weblogging games.

Categories
RDF

RSS debate

Well, now. Do you think that Dave is talking about the recent RSS posts from myself, Jonathon Delacour, and Jon Udell when he writes the following under the title of Meta-Blogging:

Aggregation: Is goodness. Think of it as a way of upping the bandwidth of people whose minds are sponges and want to learn as much as possible. In time of crisis think of it as the Web’s Emergency Broadcast System.

I won’t get into issues of quantity versus quality, and indiscriminate sucking up of data regardless of worth, but Emergency Broadcast System?

In my area there’s a horn that goes off every Tuesday at noon as a test of the emergency notification system. So tell me, Dave: are you saying that aggregation is equivalent to a loud, raucous noise that drives all intelligent thought from your mind so that you’ll react instinctively?

NOISE! Argghhh! Click the next link!

NOISE! Argghhh! Click the next link!

As we saw with the events of September 11th, there were few weblogs that weren’t focused on what was happening in the Eastern part of the US. In times of crisis the very act of aggregation negates the usefulness of aggregation because all links lead to one event, one act.

As a compliment I will say that in a crisis, I first turn to Scripting News for information because I know that most webloggers will point out new and breaking information directly to Dave and he’ll pass the information along. Aggregation, yes. But intelligent aggregation.

And whatever happened to the art of debate?