Categories
Weblogging

Morphing URLs

Recovered from the Wayback Machine.

I signed up at Blogrolling.com to manage my blogroll, and you can see the results on this page. Scroll down and you’ll see the ten most recently active webloggers in my virtual neighborhood. Click the “more…” link and you’ll go to my Blogroll page.

I’m using the blogrolling.com feed a couple of different ways. I’m using the raw PHP feed in this page, because it’s simple to process. However, I modified the code of the feed to only display the recent ten updates. I’d create another list, instead, and limit it to the most recent ten (a feature at blogrolling.com), but that’s only for those who have paid, and money is in short supply at the moment. So I tweaked the code on my own.

In the Blogroll page, I’m accessing the feed as RSS, and then using the PHP XML classes to process the data. By doing so, I can access the individual elements of the feed, such as the URL of the weblog, which I then use with my new Talkback feature.
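To illustrate what pulling individual elements out of an RSS feed looks like, here is a minimal sketch. It uses Python rather than the PHP XML classes mentioned above, and the sample feed and the function name extract_links are invented for the example; the real blogrolling.com feed may be laid out differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, standing in for whatever the real service returns.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example blogroll</title>
    <item>
      <title>My Weblog</title>
      <link>http://www.myweblog.com/</link>
    </item>
    <item>
      <title>Another Weblog</title>
      <link>http://another.example.com</link>
    </item>
  </channel>
</rss>"""


def extract_links(rss_text):
    """Return (title, link) pairs for every <item> in an RSS document."""
    root = ET.fromstring(rss_text)
    pairs = []
    for item in root.iter("item"):
        # findtext returns the default when the child element is missing.
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        pairs.append((title, link))
    return pairs
```

Once the link is available as a separate value, it can be handed to something else, which is how the weblog URL ends up feeding Talkback.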

(I’m thinking of accessing the RSS feed in this page and then caching the feed locally, to be used by the blogroll page, and lower the number of hits against the blogrolling.com site. We’ll see.)
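A local cache along those lines might look something like the following sketch, again in Python. The function name cached_fetch, the one-hour default, and the idea of passing the download step in as a callable are all assumptions made for illustration, not a description of anything actually running here.

```python
import os
import time


def cached_fetch(cache_path, fetch, max_age_seconds=3600):
    """Return the cached text if it is fresh enough, else call fetch() and store it.

    fetch is any zero-argument callable returning the feed as a string;
    in real use it would download the RSS feed from the remote site.
    """
    if os.path.exists(cache_path):
        age = time.time() - os.path.getmtime(cache_path)
        if age < max_age_seconds:
            with open(cache_path, encoding="utf-8") as handle:
                return handle.read()
    # Cache is missing or stale: fetch a new copy and save it for next time.
    text = fetch()
    with open(cache_path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return text
```

Both pages could then read the same cached file, so the remote site would be hit at most once per cache lifetime.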

Blogrolling.com makes use of changes.xml from weblogs.com to check for recently updated weblogs, a feature I incorporated into my blogroll. I really appreciate this, because it lets me see who’s updated without having to use an RSS aggregator, something I’m not fond of.

The problem, though, is that we’re inconsistent in how we format URLs. For instance, a person might update weblogs.com as “http://www.myweblog.com/”, but a blogrolling.com customer adds them as “http://myweblog.com”. These are two different URLs, syntactically, even though they point to the same weblog. Unfortunately, then, when the person updates their weblog, they’re not floating to the top of my blogroll.

The problem is that we all have different understandings of how a URL works, and what we need to use in a URL, and what not. Time for URL 101, I think.

First, the ‘www’ that is so common in most URLs today. Originally, the ‘www’ part of a URL stood for the hostname of the server on which the website lived. The complete name, ‘www.myweblog.com’ then translated into a specific IP (via DNS lookup of the domain) and a specific server.

Things have changed quite a bit and we now have something called virtual hosting. What this is, among other things, is the ability to create a sub-directory, such as (web server basepath)/weblog, and have the web server map weblog.domainname.com to that sub-directory. For instance, I have the following sub-directories, each of which is paired with the mapped subdomain:

 

basepath/weblog – weblog.burningbird.net
basepath/rdf – rdf.burningbird.net
basepath/articles – articles.burningbird.net
basepath/www – www.burningbird.net
and so on..

 

The last one in the listing shows www.burningbird.net, but I don’t have to use “http://www.burningbird.net” to get to my top-level web site; I can use “http://burningbird.net”. The reason is that, within my web server configuration files, the URLs “http://burningbird.net” and “http://www.burningbird.net” map to the exact same sub-directory, the one named ‘www’. You’ll find with most modern web installations that “http://www.domainname.com” and “http://domainname.com” map to the same sub-directory on the server (something you can easily check through your browser).

Just think: All that time when you’ve been typing in ‘www’, when you could have saved keystrokes. Why, you probably could have saved enough time to go and buy a Krispy Kreme.

(Note, though, that this mapping isn’t consistent, and you may actually get errors if you omit the ‘www’. Don’t you love individualism in web access?)

So the use of ‘www’ isn’t mandatory. Neither is the use of the trailing forward slash (‘/’) at the end of the URL, as you’ll see some people use.

In olden times, when you used the trailing slash at the end of the URL, the browser knew that you were accessing a directory, not a file, and you saved the browser a second trip to the server to determine this. All modern browsers now treat “http://yourdomain.com” and “http://yourdomain.com/” as the same, so you don’t get any performance benefit from the slash there. If your weblog is off of a sub-directory, though, such as “http://yourdomain.com/somedirectory/”, you will still, usually, get a performance benefit from using the trailing slash.

However, the use of the trailing slash is one more difference in our URLs. At this point we have the following variations all pointing to the same web page:

 

http://www.yourdomain.com
http://www.yourdomain.com/
http://yourdomain.com
http://yourdomain.com/

 

But there’s yet another variation — specifying a file, explicitly.

For most of us, our weblogs are located in a page named ‘index.someextension’. It could be ‘index.html’ or ‘index.htm’ or ‘index.php’ and so on, but it is the index file, which is the default file to load when a directory is specified without a file name (this differs slightly based on web server and configuration).

To load my weblog, you can access “http://weblog.burningbird.net”, and you’ll get “http://weblog.burningbird.net/index.php”, because my web server is configured to look for files in the following order:

 

index.html
index.htm
index.php
and so on

 

As long as I don’t accidentally include an ‘index.html’ file in my directory, you’ll get the index.php page instead.
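The lookup order can be pictured with a small sketch. This is only an illustration in Python of the first-match rule described above; the name pick_index_file is made up, and a real web server handles this in its own configuration rather than in code like this.

```python
# The order listed above; other servers may be configured differently.
DEFAULT_INDEX_ORDER = ["index.html", "index.htm", "index.php"]


def pick_index_file(files_in_directory, index_order=DEFAULT_INDEX_ORDER):
    """Return the first name in index_order that exists in the directory, or None."""
    present = set(files_in_directory)
    for candidate in index_order:
        if candidate in present:
            return candidate
    return None
```

With only index.php in the directory, that file is served; drop an index.html alongside it and the index.html wins, which is the accident to avoid.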

By not specifically giving the file name extension, what I can do is change the type of file, from index.html to index.php, and you all don’t have to change your links to me because you’re only specifying the directory, not explicitly the file name. In fact, if a person is using the default ‘index’ file name, you shouldn’t use this in your blogroll link to them, because it will break if they go to a new file format.

However, we now have yet another variation of the URL:

 

http://www.yourdomain.com
http://www.yourdomain.com/
http://www.yourdomain.com/index.html
http://yourdomain.com
http://yourdomain.com/
http://yourdomain.com/index.html

 

All in all, our use of URLs is about as distinct as we are, and I’m amazed that the bubble up feature of blogrolling.com works, at all.
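One way a tool could cope with these variations is to reduce each URL to a comparison key before matching. The sketch below is in Python and is only a guess at how that might be done; the function name normalize_weblog_url and the set of index file names are assumptions. Dropping ‘www’ is a heuristic that will be wrong for the sites where the two names don’t map to the same place.

```python
from urllib.parse import urlsplit

# Assumed set of default index file names; real servers are configured differently.
INDEX_NAMES = {"index.html", "index.htm", "index.php"}


def normalize_weblog_url(url):
    """Reduce common spellings of a weblog URL to one comparison key.

    The result is meant for matching only, not for building links:
    the scheme, port, and query string are ignored.
    """
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    # Heuristic: treat "www.example.com" and "example.com" as the same site.
    if host.startswith("www."):
        host = host[4:]
    path = parts.path
    # Drop a default index file name from the end of the path.
    head, _, last = path.rpartition("/")
    if last.lower() in INDEX_NAMES:
        path = head
    # Drop trailing slashes so "/" and "" compare equal.
    path = path.rstrip("/")
    return host + path
```

All six variations listed above reduce to the single key yourdomain.com, so a ping recorded under one spelling could be matched to a blogroll entry stored under another.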

To attempt to work around these challenges, I added people to my blogrolling.com list when they showed on weblogs.com, using the URL format they used with their pings. In addition, I checked the person’s perma-links, to see if they used ‘http://www.domainname.com’ or ‘http://domainname.com’, and so on. It became a treasure hunt in a way, but the golden egg in this hunt is a correctly bubbled-up URL when the person updates.

BUT…

This has left my Talkback feature in a difficult state. The reason is that the URL you use to ping weblogs.com, usually generated by your weblogging tool, isn’t the same URL you used in my comments. So, you might bubble up to the top of my blogroll, but querying for the blogrolling.com supplied URL in Talkback results in no comments showing.

Pain in the butt.

What we need is consistency. Perhaps we need a URL cleanup day, to clean up the URLs we use in our blogrolls. And a common guideline for URL usage, such as the following:

 

  • Use ‘www’ only if you need to. You don’t need to use ‘www’ unless your page doesn’t resolve without it.
  • Use the default ‘index.extension’ filename for your weblog main page.
  • If the default filename is used, don’t include it in the blogroll link. You’re putting a burden on the weblogger to have to use redirection if they want to change to a different page format.
  • Use the same URL in your comments that you use when pinging weblogs.com or blo.gs. In fact, be consistent with your weblog URL regardless of where you use it.

 

Categories
Weblogging

Skeletons in the closet

I had not looked at the negative consequences of Talkback, and appreciate those who have taken the time to point them out.

Geodog wrote:

 

But I think of comments as ephemeral, and strongly contextual. Plus, as Gibbon might say, some things are meant to remain veiled in the decent obscurity of a obscure format. The last thing I want when someone puts my name in Google is to have the first thing come up be some stupid late night comment I put on a popular (dare I say A-list?) weblog. So will this cut down on stupid late night comments? Or just increase the number of anonymous cowards?

 

(<quote> Burningbird is NOT A-List</quote>)

In Geodog’s comments, the question of identity was also raised: if there’s no sign on process, anyone can come in and write as anyone else.

Good points. Ones that John also discussed:

 

In order for this approach to be more fully developed we would need to implement a security model like PGP, which would allow me to “prove” who I am when posting a comment. While I applaud Shelley’s effort to expose a history of comments, it won’t take long before people start spoofing them. Which is a shame because I’m not sure that level of complex security model will be implemented for some time, and with a network of webloggers like Shelley providing scripts like this for their individual weblogs it wouldn’t take much to build a consolidating engine like Technorati to group them together and give me a global view of my comments across all participating blogs.

If people are uncomfortable about having their comments archived on Burningbird, how about for all sites, forever? Scary.

 

Dorothea also followed through on concerns about Talkback, but from a different perspective. She wrote:

 

Irrelevance, impermanence, mortality—these are my feeble defenses against a potentially crippling sense of worthlessness, futility. I cling to a false nihilism to save myself from the genuine article. Illogical, probably stupid, but that’s how I function.

Which brings me back to Shell (who, I feel compelled to say, is of course utterly innocent of any intent to harm, and who has not really harmed me at all in any case). Now my comments, even more ephemeral in intent and execution than my own blog, are becoming solid, persistent, potentially permanent records. I guess I can live with it; I have to. But I’ll still whimper.

 

About the last thing I want to do is make people uncomfortable with commenting here at Burningbird. Comments are an integral part of this weblog, and I’ve taken care to nurture an environment where everyone is comfortable speaking out. I am extremely hesitant about implementing any technology that impacts this in any way.

But then Monica wrote, in response to Talkback and the posting about spell-checking and writing formally:

 

To me, both the idea of anyone being able to read all the comments we posted on a site and the idea of spell-checking our writings and even elaborating them, in a way have to do with how we want – or don’t want – to be seen. Half-assed comments written in the heat of the moment are the ones on which we can be seen between the lines.

 

And Stavros writes:

 

This latest innovation from her is a really cool idea, and one that might help to combat that feeling of impermanence and evancescence of weblog comments.

Dorothea was right, and I intended no harm to come from Talkback. It’s another instance typical of so many applications of technology: the social impacts of a new innovation far outweigh the effort to actually create it. However, my intent was to keep the comments (many of which are thoughtful, compelling, and more interesting than the posts themselves) from fading into obscurity.

More than that, though, I wanted a way to introduce people who might be new to this weblog to those others who have been kind enough to come around for some time. You can get a good idea of who I am from my archives, but who are all these strangers leaving all these comments, and what are all these obscure references about? I considered Talkback the digital equivalent of a block party to introduce a new neighbor to those who have lived on the block for a while.

But then, a block party isn’t the same as pulling a new neighbor into each house in the block and shoving their face into the closets to look at all the skeletons hanging there, either. Perhaps some things are best left to accidental discovery over time.

Categories
Weblogging

Globble globble

And this week’s award for best irony goes to…

…Me!

For being underjoyed about the Blogger + Google deal; for discussing some of the negative consequences of the deal; for demonstrating, visually, a sense of perspective regarding “world” and “world with blog” and…

…still managing to capture the top search position at Google for the terms Google Blogger. (screenshot)

I want to thank my readers, my linkers, and my headers. Without them, I wouldn’t be there today.

Categories
Weblogging

This is your world on blog…

The excitement about Google and Blogger continues, though I wonder if we’re not drifting to the extreme goodness end of the spectrum in our view about what this will mean in the long term.

Ben Vierk wrote:

 

Noone can ignore the increasing space weblogs take in search results on Google. Weblogs are becoming Google’s primary content source. What if Blogger had built it’s own search engine? With direct access to the data it could have yielded search results on content as soon as the content was posted rather than waiting 1 to 2 days for a Google spider. Why wait for Google when you can get the information you want now from somewhere else? In the first 6 hours of any new story people can’t go to Google for relevant content. Google is too slow.

 

Tom Matrullo follows this same belief when he writes:

 

The linking of Google and Pyra fired a teensy synapse felt around the world: The advent of the blog as where events happen and are reported, and travel through the network nervous system that Bill Gates could never quite imagine but once dreamed he could own

 

Jeneane also continued this theme:

 

A one-stop-shop for voice? Maybe. Weighing search results in favor of the common-voice news and opinion and entertainment offered by us bloggers (as opposed to big media)? I hope so! Google already does this–they’ve been doing it for at least a year. God bless them.

 

She also writes in my comments in the Cut the Ribbon post:

 

I will be busy fighting the anti-greed war over in my neck of the woods for the near future. I figure it’s the least I can do to help change the world. With the Google/Prya deal, we have a real opportunity–if they enable us, and I believe that’s the whole point–to flip the power structure globally. Dislodge the greed model and be surprised what else follows. I’m not insane. I think it could work.

 

Others have also spoken eloquently about the impact Blogger can have on ensuring that the news Google News reports is more timely: through Blogger, Google will now have direct access to the data in the weblog posts of a couple of hundred thousand webloggers, as soon as they are published. Heady stuff.

Ignoring the fact that this still excludes the majority of webloggers, now is a good time to remember the incident between Google and the Church of Scientology before we become so sanguine about Google buying Blogger and other centralized weblogging tools. Rather than censorship at the server, after material has been posted, there is the potential of censorship at the source. Rather than wait for content to be published and ask for it to be pulled, don’t allow its publication in the first place. And if you control the tool, then it doesn’t matter where the content will be published: the source is controlled, not the destination.

I can’t help thinking the Scientologists are already preparing briefs to force Google into searching for so-called copyrighted material about the church in weblog posts before they’re published, and preventing such posts. And before you call me alarmist, look in the news at what’s happening to the country, to the US. Anything’s possible now.

This is, of course, the view at the extreme badness end of the spectrum, and this vulnerability exists for all centralized tools; however, it’s important to be aware that centralization can close doors as much as it can open them.

Reality check time: Perhaps we webloggers also need to remember that though it seems crowded out here on the boards, we are but a speck in the world. We are growing, our numbers are increasing by leaps and bounds, and we are having an impact, but we’re still a speck. Or, for a more visual demonstration:

This is your world

 


[Image: apollo17_earth.jpg]
 

And this is your world, on weblogs

 


[Image: apollo17_earth.jpg]
 

Any questions?

Categories
Weblogging

By their own words shall they be known

Recovered from the Wayback Machine.

I’m keeping my neighborhood links but will be moving them to a separate page. However, the blogroll won’t just sit there, passively. A couple of tweaks:

First, I was thinking about accessing changes.xml from weblogs.com and checking for recent updates; however, blogrolling.com does this for me and has a PHP service, so why am I giving myself grief? When I incorporate blogrolling.com, I’ll do like so many others and order the links by most recently updated. However, a little extra code will skim the top five most recent blogs and put them on the front page.

But wait! That’s not all.

Originally, I was going to look for favorite posts from the weblogger, list them in a separate page, and then link to this page in addition to the person’s weblog link. I still may do this, but it is labor intensive. To be honest, I’d rather let you speak for yourselves. And so you are.

Next to the weblog link in the new neighborhood page will be a second link opening a page listing the complete text of all the comments you’ve ever made, in descending order, to my posts. Above the comment is the author’s name, and a link to the original comment in the posting page.

We’ll be able to see, at a glance, everything you’ve ever said here at the Burningbird since the day I started using MT comments. I call this new sticky strand “Talkback”.

Now when people read my comments and ask themselves “Who is this guy?”, Talkback will tell them. By your own words shall you be known.

The Talkback page is up and running at this time. You can try it yourself with any URL that’s associated with at least one comment. Type the following into the browser address/URL field, changing yoururl to your URL:

 

http://weblog.burningbird.net/speakback.php?url=http://yoururl

 

Provide the exact same URL you use in comments now, and all your comments should show.

To demonstrate more fully, comments from some weblogging neighbors who have ‘talked back’:

 

Dorothea Salo (BTW, Happy Birthday David!!)
Stavros the Wonder Chicken
Jonathon Delacour
Ruzz
Gary Turner
Mark Pilgrim
Dave Winer

 

Unfortunately, if you haven’t used your weblog url with your comments, Talkback doesn’t work. However, I’ll be quite happy to add your url to your comments in the database if you ask. Nothing more than a simple database update.

I have a few other things I want to do with the Neighborhood page, but they’ll keep for another day.

Updated Talkback: I added the capability of searching by name. The format for this is:

 

http://weblog.burningbird.net/speakback.php?name=DD

With this, those who don’t have a web site, or who didn’t leave a URL can still track their comments by the name they’ve used. The name you use in the URL must match the name showing in the comments.
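For anyone generating these links from a script rather than typing them, a small helper could build either form. The sketch below is in Python; the function name talkback_link is made up, and percent-encoding the value is an assumption on my part (web servers normally decode it back to the original text, but the examples above pass the URL unencoded).

```python
from urllib.parse import urlencode

# Script address taken from the two link formats shown above.
TALKBACK_SCRIPT = "http://weblog.burningbird.net/speakback.php"


def talkback_link(url=None, name=None):
    """Build a Talkback link for either a commenter URL or a commenter name."""
    # Exactly one of the two lookups must be requested.
    if (url is None) == (name is None):
        raise ValueError("pass exactly one of url or name")
    params = {"url": url} if url is not None else {"name": name}
    # urlencode escapes characters such as ":" and "/" and turns spaces into "+".
    return TALKBACK_SCRIPT + "?" + urlencode(params)
```

Encoding matters most for names containing spaces and for URLs that carry their own query strings, both of which would otherwise be cut short or misread.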