Categories
Technology

Tagback seed is sprouting

As you can see from the initial Technorati tagback page, created for yesterday’s post, several people have added weblog posts that tagback to the original item. In addition, a new del.icio.us tag, tagback, was created, and since neither the original del.icio.us bbintroducingtagback nor the tagback tag entries are being pulled into the Technorati tagback page (anyone know why?), I used furl to add links to both delicious tag pages, as reference. And others have added the Technorati tagback page to the del.icio.us tagback page, as cross-reference.

Now when you access the page, you’ll find weblog posts that respond to the original post, my funky photos, as well as cross-references to related but not directly linked material, including material from a rival bookmarking site.

There has been considerable, and good, discussion about using a tag, or even the name I used yesterday, and I’m going to cover these in more detail in my long awaited (you are waiting for it, still, I hope) sequel to tags and folksonomies, which should be out late tonight. However, before I expanded on the concept of tagbacks, I wanted to see Technorati’s reaction first. After all, I am proposing to utilize more, perhaps considerably more, of the organization’s resources. However, from Dave Sifry’s early response, the company is cool with the concept.

Speaking of such heavy utilization of Technorati, Kaf asked the question in my comments about whether I am changing my mind about centralized services. After all, Technorati is centralized, and trackback is distributed. My answer is that once a resource has been corrupted by outside interests, as trackback and comments have been, then I would rather centralize that resource in the care of skilled technicians who are motivated to keep the resource clean, than put the burden on all the poor souls who don’t know SQL and don’t understand XML-RPC and pagerank, or who don’t have the tools to easily clean up their sites.

There is a risk that Technorati may go away someday, or put up a costwall between us and the data, especially if investment companies urge this. However, by making use of many resources, such as del.icio.us, furl, even flickr (and other tag based entities sure to pop up), and cross referencing the material, we should be able to pick up the threads if need be. And I am making an assumption about Technorati: that the organization doesn’t intend to cause harm. They might put ads into the tagback pages, but we’ve seen ads embedded in all of the facilities we use, and they don’t cause harm. Still, to repeat: we are backing up the threads by using cross-references in other tag-enabled tools, no offense Dave and Kevin and other folk at Technorati.

In addition, if I understand the documentation with Technorati tags correctly, the URIs we use don’t have to be to Technorati, though I’m not sure how this works yet, especially in regards to tagback–still experimenting around.

Another personal refinement is that I decided not to generate new tagbacks automatically with each post, because some posts, such as this, are an addition to one published previously. I’ll use the original tagback for all posts on the same thread. In addition, not every one of my posts needs a tagback page, though if I don’t add one, with tag systems such as delicious and furl, flickr, and other systems sure to spring up, as well as webloggers ready to wield that mighty link to create a tagback page, someone can always create one for me if they disagree.

The tag for this post is bbintroducingtagback. To add an item to the discussion surrounding this post, you can use this tag with a flickr photo or as a del.icio.us or furl bookmark tag. You can also include the following Technorati tag in your post: .
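The tag markup itself appears to have been lost from the line above. As a rough sketch of what such a link might look like, the following assumes the Technorati convention of a link carrying rel="tag" whose final URL path segment is the tag name. The function name technorati_tag_link is made up for illustration and is not part of any tool mentioned here.

```php
<?php
// Hypothetical helper: builds a Technorati-style tag link.
// Assumes the convention of rel="tag" with the tag name as the last URL path segment.
function technorati_tag_link(string $tag): string
{
    $url = 'http://technorati.com/tag/' . rawurlencode($tag);
    return sprintf(
        '<a href="%s" rel="tag">%s</a>',
        htmlspecialchars($url, ENT_QUOTES),
        htmlspecialchars($tag, ENT_QUOTES)
    );
}

echo technorati_tag_link('bbintroducingtagback'), "\n";
```

Since the link is plain HTML, it can be pasted into a post by hand just as easily; the helper only saves retyping when several posts share a thread.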

Categories
Technology Weblogging

Bad Webloggers. Bad.

Recovered from the Wayback Machine.

As you can see, I’m still getting pingbacks, even after removing the link to the pingback server from my page header. The most likely reason is that somewhere in the WordPress code, my site is responding affirmatively to an XML-RPC request, and the pingback is then sent. I’ve since moved the xmlrpc.php file elsewhere, though this means I can’t remotely post for now. But I rarely do anyway.

The pingbacks are from a post that Jonathon Delacour wrote on the recent trackback and nofollow issues, over at Writable Web, the new weblog he’s writing in conjunction with Marius Coomans. In this writing, Jonathon provides a nicely done comparison of pingbacks and trackbacks and how the two have become somewhat synonymous in most webloggers’ minds, primarily because of trackback autodiscovery. He also covers the new nofollow attribute, the automatic addition of which in weblog tools such as TypePad has led the spammers this last week to basically hit webloggers across the nose with a rolled up newspaper, going “Bad, webloggers. Bad.”

In the meantime, here’s a surefire method of preventing comment spam:

Open up robots.txt, or create one, and add the following two lines:

User-agent: *
Disallow: /

It could take a couple of months, but eventually you’ll find you’ll have no more comment spam. Of course, you’ll have no Google or other search engine pagerank, either. But why bleed pagerank out of the weblogs slowly with nofollow, when we can do it quickly with robots.txt?

Seriously, bite the bullet, cut the cord, and be comment spam free. Isn’t this what everyone wants?

Categories
Technology

Daily hits via Technorati

Through Technorati I found a post where Roland Tanglao referenced my post on trackback being dead. There was a discussion in comments about Technorati opening up Watchlists and API queries.

Hmmm.

I then created a watchlist of my base URL, http://weblog.burningbird.net, which you can access directly with this URL. This returns an RSS feed of the watchlist for the entire weblog — a watchlist being all links to my weblog on any specific day.

I took my old Backtrack application, which used to backtrack trackbacks and print out who else has trackbacked a specific post, and modified it to consume the RSS that Technorati provides, instead. I then posted a link to this at the top of my sidebar, and you can also check it out here.

If you want to do the same, create a watchlist for your weblog, copy the source code for Backtrack, and then modify the look and feel to match whatever you want. You’ll want to leave the PHP bits in the body alone, except to replace my watchlist URL with your own.

This will give you a list of links to your weblog, tracked by Technorati, on a daily basis. The question remains, though, how this alternative to trackbacks will scale, because Technorati is a centralized service, and one that can get sloggy at times.
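For anyone without the Backtrack source to hand, the core of the idea can be sketched in a few lines. This is not the Backtrack code itself, only a minimal illustration that assumes the watchlist is ordinary RSS 2.0 and that PHP's SimpleXML extension is available. The function name and the sample feed are invented; a real page would fetch the watchlist URL instead of using the inline sample.

```php
<?php
// Sketch only: extracts title/link pairs from an RSS 2.0 document.
// The sample feed below is invented; a real page would load the watchlist URL instead.
function watchlist_links(string $rssXml): array
{
    $feed = simplexml_load_string($rssXml);
    if ($feed === false) {
        return [];   // unparseable feed: show nothing rather than fail the page
    }
    $links = [];
    foreach ($feed->channel->item as $item) {
        $links[] = [
            'title' => trim((string) $item->title),
            'link'  => trim((string) $item->link),
        ];
    }
    return $links;
}

$sample = <<<XML
<?xml version="1.0"?>
<rss version="2.0"><channel><title>Watchlist</title>
<item><title>Example post</title><link>http://example.com/post</link></item>
</channel></rss>
XML;

foreach (watchlist_links($sample) as $entry) {
    printf(
        "<li><a href=\"%s\">%s</a></li>\n",
        htmlspecialchars($entry['link'], ENT_QUOTES),
        htmlspecialchars($entry['title'], ENT_QUOTES)
    );
}
```

Escaping the title and link before printing matters here, since the feed content comes from other people's weblogs.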

Update: to add Technorati and Bloglines links to your posts

I’ve added Technorati and Bloglines links to each of my posts.

For WordPress, the Technorati link is:

<a href="http://www.technorati.com/cosmos/search.html?rank=fresh&url=<?php echo get_permalink() ?>" >Technorati Links</a>

If you’re not using WordPress, you’ll need to replace the function call to print out the permalink with whatever your tool supports. Just see what the tool uses for your permalink and copy this into the placeholder of the Technorati link.

For Bloglines citations (thanks to Dare for pointing this out):

<a href="http://www.bloglines.com/citations?url=<?php echo get_permalink() ?>&submit=Search" >Bloglines Citation</a>

Again, replace the WordPress permalink function call with whatever your tool uses.
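For tools other than WordPress, both links can be built from any permalink string. The sketch below is illustrative only: the function name citation_links is hypothetical, and the two endpoints are simply the ones quoted above. It also URL-encodes the permalink, on the assumption that a permalink with its own query string (such as ?p=42) would otherwise run into the parameters of the lookup URL.

```php
<?php
// Hypothetical helper: builds the Technorati and Bloglines lookup links for one permalink.
// The two endpoint URLs are the ones quoted in the post; everything else is illustrative.
function citation_links(string $permalink): array
{
    $encoded = urlencode($permalink);   // keeps ?, & and = in the permalink out of the outer query
    $urls = [
        'Technorati Links'   => 'http://www.technorati.com/cosmos/search.html?rank=fresh&url=' . $encoded,
        'Bloglines Citation' => 'http://www.bloglines.com/citations?url=' . $encoded . '&submit=Search',
    ];
    $html = [];
    foreach ($urls as $label => $url) {
        // htmlspecialchars turns & into &amp; so the attribute value is valid HTML
        $html[] = sprintf('<a href="%s">%s</a>', htmlspecialchars($url, ENT_QUOTES), $label);
    }
    return $html;
}

echo implode("\n", citation_links('http://example.com/?p=42')), "\n";
```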

These will return the links, in Technorati or in Bloglines, for a specific post. Now, Bloglines was just bought out by Ask Jeeves, so who knows how long this functionality will last. And I’m sure someone somewhere is about to buy out Technorati, so ditto. But might as well make use of the functionality for now.

Categories
Technology

Throttling the Trackback

Recovered from the Wayback Machine.

I was hit with 781 trackbacks last night, all of which went into moderation, but all of which triggered my comment throttle (trackbacks are stored in the same table as comments in WordPress), so if you tried to comment and couldn’t, you’ll know why.

I have now added throttles to the trackback code, which allow only ten trackbacks in a minute and 30 in a day. My site is using customized code, but I created a customized wp-trackbacks.php file for WordPress 1.22, which you can access here. Note, I’ve not done a thorough job testing the throttle code on trackbacks (it has been in use for months at Burningbird for comments), so use at your own risk. If someone spots a bug, let me know.

Search in the code for the Burningbird throttle comment, and change the 10 or 30 to whatever value you want.
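The throttle file itself is not reproduced here, but the idea of "ten in a minute, 30 in a day" can be sketched as a sliding-window count. This is an illustration under assumptions, not the Burningbird code. The function name, the constants, and the $timestamps input (Unix times of trackbacks already received, which in WordPress would presumably come from the comments table) are all hypothetical.

```php
<?php
// Sketch of a sliding-window throttle; not the code from the customized WordPress file.
// $timestamps holds the Unix times of trackbacks already received (hypothetical input).
const MAX_PER_MINUTE = 10;
const MAX_PER_DAY    = 30;

function trackback_allowed(array $timestamps, int $now): bool
{
    $lastMinute = 0;
    $lastDay    = 0;
    foreach ($timestamps as $t) {
        $age = $now - $t;
        if ($age < 0) {
            continue;   // ignore entries stamped in the future (clock skew)
        }
        if ($age < 60) {
            $lastMinute++;
        }
        if ($age < 86400) {
            $lastDay++;
        }
    }
    // A new trackback is accepted only while both windows are under their limits.
    return $lastMinute < MAX_PER_MINUTE && $lastDay < MAX_PER_DAY;
}

$now = 1000000;
var_dump(trackback_allowed(array_fill(0, 9, $now - 5), $now));    // nine in the last minute: still allowed
var_dump(trackback_allowed(array_fill(0, 10, $now - 5), $now));   // ten in the last minute: throttled
```

Changing the two constants corresponds to changing the 10 and 30 mentioned above.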

I imagine that this is notice being given by the comment spammers that nofollow won’t stop them. Contrary to what you read in the Register though, pagerank is the primary reason for comment spams, not click through. While I am not making the issue into a religion, as Scoble asserts, I don’t agree that nofollow is going to be a solution for comment spam. However, I’m also not going to ignore spammer FUD: I imagine the only reason that “Sam” agreed to the interview with the Register was to cast additional doubt on nofollow. This isn’t because he’s concerned about nofollow driving him out of business, but because he knows he’ll have to send that much more spam to make up for sites that are using nofollow.

Categories
RDF Technology Weblogging

A credible coder

Recovered from the Wayback Machine.

I’ve been silent in this weblog, primarily because I’ve been working on a couple of other projects. I had talked with a good, and wise, friend of mine about this effort and he made a point that I felt was valid: that I should implement those applications or functionalities I’ve talked about previously first, before taking on a new project. It’s all about credibility you see.

Of course, I could point out that I’ve delivered numerous tricks, tips, code fixes, not to mention how-tos and tutorials and what not, as well as helping to install 109 weblogs, and answering whatever questions others have asked. I had assumed that this gave me some credibility. Still, I understand what he was saying: complete applications shout where help and tips and fixes only whisper.

So I’ve been working on three major projects. The first is a ‘comment package’ that has much of the nifty comment functionality out at Burningbird, wrapped up into a package that could be used by people using other weblogging tools. This includes live preview, spell check, and post-edit functionalities. I have it finished for WordPress, but I’m also creating a Movable Type and Textpattern backend. These latter are for a couple of friends that have asked for the functionality. More to the point, though, I wanted to demonstrate that one can extend the tools outside of the traditional plug-in environment, and with that extension, make the functionality available to a variety of tools because the data and functionality between the tools is so similar.

This one is almost finished, but incorporating these changes into the templates for the other tools is a bit tricky because they are a-changing, not the least because of that nofollow nonsense.

The second project is to provide updates to the material in my chapters for the Practical RDF book. This has been interesting and fun, and I am pleased to see a maturity in both specs and data. I’d like to see the technology a little more easily embeddable, which means lightweight language specific frameworks. But they’re coming.

I’m actually using the technology that I’m covering above in the final project, which is a variation of the Poetry Finder I talked about long ago. However, rather than just poetic annotation, this application will allow one to specify any field of data, such as legal, political, genetics, whatever, and annotate it using RDF statements, which are then added into the MySQL database for a specific weblog post.

It’s not going to require that users understand RDF, or even that developers understand RDF. The developers will be able to define a set of statements they want to capture as a model, and this will be used to generate form statements of the nature of _______________ issomething _____________________. An example would be “bird” IS METAPHOR FOR “freedom”, with the issomething provided by the model developer, and the values provided by the user.

Then, when a post is accessed (either WordPress, or Movable Type implemented as a dynamic PHP weblog) with an “/rdf” extension, .htaccess rules will trigger functionality that will deliver a complete RDF/XML output of all the data that is defined for a resource defined with the URL of the article. So, a specific article could have poetic and political annotation, and both would be combined into one model and returned when the URL is accessed with the “/rdf” extension.
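To make the idea a little more concrete, here is one possible shape for the "/rdf" output. It is a sketch under heavy assumptions: the ex: namespace, the property names (annotation, subject, relation, object), and the function name are all invented, and each filled-in form row is modelled as a blank node hanging off the post's URL. The real application may model the statements quite differently. It uses PHP's XMLWriter extension.

```php
<?php
// Sketch only: the ex: vocabulary, property names and function name are invented for illustration.
const RDF_NS = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#';
const EX_NS  = 'http://example.org/annotation#';   // hypothetical model namespace

// $statements: list of [subject value, relation name, object value] captured from the form
function post_to_rdfxml(string $postUrl, array $statements): string
{
    $w = new XMLWriter();
    $w->openMemory();
    $w->setIndent(true);
    $w->startDocument('1.0', 'UTF-8');
    $w->startElement('rdf:RDF');
    $w->writeAttribute('xmlns:rdf', RDF_NS);
    $w->writeAttribute('xmlns:ex', EX_NS);
    $w->startElement('rdf:Description');
    $w->writeAttribute('rdf:about', $postUrl);
    foreach ($statements as [$subject, $relation, $object]) {
        $w->startElement('ex:annotation');
        $w->writeAttribute('rdf:parseType', 'Resource');   // blank node holding one filled-in form row
        $w->writeElement('ex:subject', $subject);
        $w->writeElement('ex:relation', $relation);
        $w->writeElement('ex:object', $object);
        $w->endElement();   // ex:annotation
    }
    $w->endElement();   // rdf:Description
    $w->endElement();   // rdf:RDF
    $w->endDocument();
    return $w->outputMemory();
}

echo post_to_rdfxml('http://example.com/archives/some-post', [
    ['bird', 'isMetaphorFor', 'freedom'],
]);
```

Because every annotation hangs off the same rdf:Description, statements from different models (poetic, political) could be passed in one list and would come back combined in a single document.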

Sure, the statements defined using this extension are simple, but most models will consist of simple assertions. A case in point is the example data that I’m using at Practical RDF’s original Query-o-Matic: a listing of all terrorist acts since 1988. Simple, yes; but still managing to capture a lot more context than the other folksonomy/tags implementations I’ve seen.

The tricky part on this is getting the PHP together to maintain the backend services. There is RAP (RDF API in PHP), but it doesn’t implement SPARQL (the W3C RDF query language spec) yet, and I had hoped to use this query language.

My hope is that by the time I’m finished with these projects, WordPress 1.5 will be out. I’ll then follow through on Wordform, but based on the final 1.5 product, not one in process. It will also be less ambitious than my original intent, primarily because it’s for my own use and as a curiosity to others; I don’t expect much interest in it.

What it will have is:

1. All data operations are pulled into a separate file rather than have bits and pieces of SQL scattered about. This makes it a whole lot easier to make changes, especially to the data model.

2. I’ll most likely be altering the comment and trackback spam prevention to incorporate my own ideas, which have shown themselves to be working relatively nicely in Burningbird. I’ve talked about these previously in this weblog.

3. I’m going to change the conditional checks in the code. All of them are as follows:

if ( ’spam’ != $approved ) {

In other words, the literal is first, the variable second. In all my years of programming, you put the variable first, because if it’s null (hasn’t been assigned a value) the conditional fails at that point without having to check the second value. I wasn’t aware that PHP differed in this regard, and I have no idea why the developers of WordPress do it this way. But it bugs me, so I’m changing them for no other reason than it bugs me, unless someone pops in with the reason for this, in which case I won’t. Who knows, maybe PHP does handle this all differently.

4. I’m making the admin more dynamic. Well, I’ve already made this change. With this, you can add a new comment or post status, high level menu item, and individual post menu item by updating tables, as these will now be table driven. In line with this, the semantic data extension talked about earlier will be incorporated into Wordform’s administration pages, as well as my existing fullpage preview functionality, per comment moderation, and post status of ‘insert’.
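On item 3 above: the literal-first style is usually explained as a guard against typing = where == or != was meant, rather than as anything to do with null handling. Whether that is why the WordPress developers adopted it is not stated here, so treat this as the common explanation only. The sketch below reuses $approved from the quoted line; the rest is illustrative.

```php
<?php
// Illustration only; $approved mirrors the variable in the line quoted from WordPress.
$approved = null;

// Operand order does not change the result, even when the variable is null.
var_dump('spam' != $approved);   // bool(true)
var_dump($approved != 'spam');   // bool(true)

// The difference shows up with a typo: = instead of ==.
$approved = '1';
if ($approved = 'spam') {        // valid PHP: assigns 'spam', then tests the truthy string
    echo "typo went unnoticed, \$approved is now '$approved'\n";
}
// The literal-first form of the same typo, if ('spam' = $approved), is a parse error,
// so the mistake is caught before the code ever runs.
```

Either ordering gives the same answer for a correctly typed comparison, so switching them around is a matter of taste, at the cost of losing that one safety net.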

I won’t be adding multiple weblog support, primarily because it’s a lot of work, I won’t be using it, and Wordform is mainly for my use. The separation of SQL into a separate file should help with this if I ever get energetic about this application again.

When finished with all the various applications, I will put them online as GPL open source for others to do with as they will. I’ll be posting on these changes, as well as links to the code, at Burningbird rather than this weblog; except for the Practical RDF book updates, of course, which will go to the book weblog.