Categories
Technology Weblogging

NoFollow

Six Apart has announced what Dave Winer only hinted at and what we've been expecting: Google and the other search companies have partnered with the weblogging companies to introduce rel="nofollow" as a way of dealing with comment spam.

When added to links in a weblog template, the attribute instructs search bots not to count those links toward page ranking. The point being that once the spammers realize their effort is futile, they'll go away, like the professional business people we all know they are.
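For anyone adding this to a template by hand, the markup itself is trivial. A sketch, with a made-up author link, showing a comment author's link before and after:

    <!-- Before: search bots follow the link and credit it in page ranking -->
    <a href="http://example.com/">Comment Author</a>

    <!-- After: rel="nofollow" tells the bots not to count the link -->
    <a href="http://example.com/" rel="nofollow">Comment Author</a>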

This might have worked…three years ago, when we webloggers first called out for Google to help. When comment spam started to become a problem, one of the suggestions was for Google to get involved and come up with a way to mark links so that they would carry no value for the Google webbot.

Now, all these years later, we read the following at Six Apart:

Recently, we’ve reached out to other blog tool vendors to try to coordinate information about comment spam techniques and behaviors. As part of these efforts, we’ve also begun to talk to search companies about enriching linking semantics to better indicate visitor-submitted content (like comments or TrackBacks).

Others are jumping up and down about this now, such as Scoble. I’m not quite jumping up and down. But I’ll add it to my template, and hope for the best.

If you do implement this, you need to implement it not only in your comment listing but also in the sidebar ‘recent comments’ plugin or code that you’re using. Your legitimate commenters or trackbacks won’t get any link rank for their entries, but I imagine people are so desperate they don’t care anymore.

Remembered another

WordPress is going to have to change how it automatically creates hypertext links for URLs inside the comment text, so that those links get the same treatment. Otherwise spammers will just include their links within the comment itself.
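A rough sketch of the sort of thing I mean, in PHP: a filter that tacks rel="nofollow" onto anchors in the comment body before it's printed. The 'comment_text' hook name is an assumption about your tool's plugin API; check your own version for the right hook, and note the regex is naive (it will double up the attribute on anchors that already carry a rel):

    <?php
    // Sketch: add rel="nofollow" to every anchor in the comment body.
    // Naive on purpose: assumes no existing rel attribute on the tag.
    function nofollow_comment_links($text) {
        return preg_replace('/<a\s/i', '<a rel="nofollow" ', $text);
    }
    // Hook name is an assumption; WordPress runs comment output
    // through named filters like this one.
    add_filter('comment_text', 'nofollow_comment_links');
    ?>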

Categories
Technology Weblogging

Take your hands off the tech and back away slowly

Recovered from the Wayback Machine.

Several people have linked to Martin Schwimmer and his indignation over the fact that Bloglines re-prints the content of his posts, without attribution and with the possibility of future advertisements (…or guilty until proven innocent). This violates the cc license, he says, because he can only be republished if proper attribution is given, and in a non-commercial setting.

This sounded familiar, and sure enough, digging around in my archives turned up this, where another person reacted in outrage on finding that their feed was being re-published:

What was a surprise is that Mitch reversed himself and now offers a Creative Commons license on his material, though the license information isn’t duplicated in Mitch’s RSS feed directly. Mitch also brings up the ‘commercial’ aspect of re-publishing the material at LiveJournal, and what’s to stop someone from grabbing the content and putting it behind password protected sites that charge money for access.

Easy: don't publish your entire post in your RSS feed; keep the feeds to excerpts only. Remove the content:encoded field and just leave the description, and adjust your blogging tool to publish excerpts only. If your weblogging tool doesn't allow this adjustment, ask the tool builder to provide the capability. The RSS feeds are there to help promote your ideas, not promote their theft. But you have to control the technology, not let the technology control you.
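In feed terms, the difference looks roughly like this (element names from RSS 2.0 and the content module; the text and URLs are made up):

    <!-- Full content: everything is there for the taking.
         (content:encoded comes from the content module namespace,
         declared on the feed's root element.) -->
    <item>
      <title>My Post</title>
      <link>http://example.com/2005/01/my-post</link>
      <description>A short excerpt of the post.</description>
      <content:encoded><![CDATA[The entire post, markup and all.]]></content:encoded>
    </item>

    <!-- Excerpt only: drop content:encoded, keep the description -->
    <item>
      <title>My Post</title>
      <link>http://example.com/2005/01/my-post</link>
      <description>A short excerpt of the post.</description>
    </item>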

Wait until he discovers the other online services, such as 2rss.com, that do add ads into the feed if you use them to subscribe within any aggregator, Bloglines or not.

update

Also, see this about creative commons licenses and RSS feeds back in 2002.

Question to Mr. Schwimmer — is your cc license attached to your feed?
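If memory serves, the Creative Commons RSS module that dates from that 2002 discussion attaches the license to the feed itself with a single element at the channel level; a sketch, with the feed details made up:

    <rss version="2.0"
         xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule">
      <channel>
        <title>Example Weblog</title>
        <link>http://example.com/</link>
        <description>A weblog with a license attached to its feed</description>
        <creativeCommons:license>http://creativecommons.org/licenses/by-nc/1.0/</creativeCommons:license>
        <!-- items follow -->
      </channel>
    </rss>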

Categories
Technology

Be Stingy

Regarding Dave Winer's idea for some form of centralized syndication feed system, I got a chuckle out of the comment, "What problem am I having and how is a centralized service going to help?" in Phil Ringnalda's post Centralized Subscription? Not that way thanks. You see now the great benefit of being exposed to us techs through weblogging: you get to experience, with us, the joy of uncertainty that comes from knowing that you're always on the edge of failure.

Dave does have a point in that if you provide one-click subscription for one aggregator, such as a Subscribe via Bloglines button, it won't work for other aggregators; you either have to blow off the others, or you end up with a trail of buttons down your page, like stepping stones across a vast sea of syndication.

You could be like me, and provide the bare minimum to aid in subscription: auto-discovery enabled via my weblog tool, and a couple of links to feeds in my sidebar. However, I will be the first to admit that clicking a link to open an XML file isn’t the friendliest way to get people to subscribe to your site’s syndication feeds.
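Auto-discovery, for anyone unfamiliar with it, is nothing more than link elements in the page head that aggregators know to scan for; something like this, with made-up feed addresses:

    <head>
      <!-- Aggregators look for rel="alternate" links to find the feeds -->
      <link rel="alternate" type="application/atom+xml"
            title="Atom feed" href="http://example.com/atom.xml" />
      <link rel="alternate" type="application/rdf+xml"
            title="RSS 1.0 feed" href="http://example.com/index.rdf" />
    </head>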

I am open to alternatives to this arrangement, but not necessarily Dave’s approach. Though he hastens to say that his approach isn’t a centralized directory, it is a centralized source of data, one with consequences beyond the intended purpose.

Dave’s solution would require that you pass to the service a link to an OPML file, which contains a listing of sites to which you subscribe, and then click a link to add a new subscription. In return the service would provide the list in a format specific to whatever aggregator you use. Your subscription list would then be merged with other subscription lists, and made public; the data contained being accessible for other purposes.
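For those who haven't peeked inside one, an OPML subscription list is just an outline of feed URLs; a minimal, made-up example:

    <?xml version="1.0" encoding="UTF-8"?>
    <opml version="1.1">
      <head>
        <title>My Subscriptions</title>
      </head>
      <body>
        <!-- one outline element per subscribed feed -->
        <outline title="Example Weblog" type="rss"
                 xmlUrl="http://example.com/index.rdf"
                 htmlUrl="http://example.com/" />
        <outline title="Another Weblog" type="rss"
                 xmlUrl="http://another.example.com/atom.xml"
                 htmlUrl="http://another.example.com/" />
      </body>
    </opml>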

With this approach, not only would I be able to more easily subscribe to your writing, I could also take a look at who you read, and don’t read. Would your subscription list be the same as your blogroll? If not, are you prepared to answer questions from those who you link to, but don’t read? How about those who you read, but don’t link? I could even use your subscription list as my own, so that I can read the exact same sites you read every day; more, I could follow you around in comments, adding my own following yours, just to let you know I’m near and thinking of you.

Phil wrote his own scenario, about subscribing to a site that provides information about spastic colons, which can then get Googled by the hot new love of your life. We say we’re an open book, but do we really want to be that open?

As Phil demonstrates so effectively, the service that works best is the one that requires the minimum of information. This follows from a known paradigm in designing relational databases or class systems in languages such as PHP: more data means more overhead and increased complexity, so you keep the data requirements as simple, and as specific to the problem being solved, as possible.

In fact, though the needs of aggregation aren't the same as those of identity, we could apply Kim Cameron's second law of identity, the Law of Minimal Disclosure, to this problem: the solution that discloses the least identifying information is the most stable, long-term solution.

In the case of too many subscription buttons, Phil recommends the Syndication Subscription Service as a solution. The service doesn't require anything more than a link to your syndication feed, and when accessed, returns a set of buttons for many different aggregators. In fact, I liked this service so much that I've pulled the links to my two feeds, Atom and RDF, from the sidebar and replaced them with a link to it instead.
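The handoff, for the curious, is a simple one: you hand the service your feed URL and it hands back the buttons. Something along these lines (the service address below is a stand-in, not the real one):

    <!-- Stand-in service address; the real service works on the same pattern -->
    <a href="http://subscribe.example.org/?feed=http://example.com/atom.xml">
      Subscribe to this weblog
    </a>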

Though it is also a centralized service, it's one that requires a minimum of data and effort, and since the code to support it is open source, it could be duplicated if need be. Best of all, it's something I can use now for this newly discovered problem I didn't know I had, but which has now been solved, and so no longer exists.

Much of the discussion is about handling feeds like audio files, and the so-called feed protocol. I like what Seth Dillingham wrote on this long ago:

The feed protocol was originally designed for farms. Cattle, for example, just have to click a button to access a feed: url on the farmer’s server, which causes grain to be dropped in the trough.

In a bizarre misuse of this important technology, the feed protocol can also be used to request an RSS or Atom file, to “feed your brain.”

I’m with the cows on this one — if I can’t poke a button with my nose and have it give me food right now, I’m moving to a different barn.
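For the record, the scheme really is that thin: a feed: link is just the feed's address with the scheme swapped or prefixed, and support varies by aggregator. Roughly:

    <!-- Both forms were in circulation; neither delivers grain -->
    <a href="feed://example.com/index.xml">Subscribe</a>
    <a href="feed:http://example.com/index.xml">Subscribe</a>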

Categories
Technology

Mega Meta Mommy Backplane

Kim Cameron responded to my I, URL posting with a very gracious reply. Gracious, in particular, considering that I was a 'bit rough' on him, as well as on the Liberty Alliance.

Years working with multi-mega-corporate data model efforts at various organizations, plus my own studies and writings on RDF, have left me impatient, I will have to admit. We are at a stage now where we are ripe, and beyond ripe, for some form of digital identity system that we can all be comfortable using. Whatever it is, it has to be something that focuses on the people, not the corporations. It must be our champion, not the champion of kings and queens, or of major technology companies.

I acknowledge the work that Liberty has done in the field, but there's never been any doubt about who the 'customer' of this group's efforts is. However, just because the focus of the Alliance is on the companies doesn't mean that's the focus of all the people in the Alliance, or of the crafters of the work. Still, the specification from Liberty that I quoted in my original post is at version 1.2; long past time for such 'placeholders' to exist.

Ah well me, I’m just a coder looking for a solution, though I am glad of this discussion. Otherwise I would never have heard that priceless phrase, about the …emerging “mega meta momma backplane”. Not even the RDF folks could come up with that one–it’s lovely. Or have heard that there is now a sixth law of identity on its way.

Thou shalt not centralize…thou shalt not centralize…thou shalt not centralize…

Categories
Technology

Self-documenting technology

Danny Ayers points to a Jon Udell article about dynamic documentation managed by the folks who use a product, rather than relying on stuffy old material provided by the organization making the software. In it, Jon writes:

Collectively, we users know a lot more about products than vendors do. We eventually stumble across every undocumented feature or quirk. We like to maintain the health of the products we’ve bought and we’re happy to discuss how to do that with other users.

The problem is that vendors, for the most part, do a lousy job of encouraging and organizing those discussions. Here’s an experiment I’d like to see someone try: Start a Wikipedia page for your product. Populate it with basic factual information, point users there, then step back and let the garden grow. Intervene only to repair vandalism, make corrections, and contribute useful new facts.

I had to pause when I read the words …we users know a lot more about products than vendors do. I was reminded of finding information on how to convert my Nikon 995 camera to RAW format, based on the helpful advice of just such a user, only to find warnings at the Nikon site that doing so voids the warranty, because this could have Serious Consequences in the continued Usability of the Product.

I remembered reading a weblogger, brand spanking new to SQL, telling everyone how they could fix a problem in their weblog just by running a certain SQL command, and then frantically sending the person an email saying that if their readers ran it, there was a good chance they'd lose half their data. Then I was reminded that there is nothing more dangerous than a user who just knows they have the answer, and who also has absolutely no stake in whether you break your copy of the product or not.

Still, I have been helped numerous times by other users when I get into situations not documented in the product manual, and I can agree with Jon, and with Tim Bray that it’s much easier to interactively look for help than to read through static documentation when you run into problems.

Jon Udell suggests a new documentation strategy for technology vendors; rather than going on publishing incomplete, out-of-date, poorly written manuals, they could just set up a per-product Wiki and let the customer base fill it up with problems, fixes, workarounds, tips & tricks.

Which is probably why most company-provided formal documentation isn't focused on problem resolution as much as on problem prevention. For instance, all the interactive help in the world isn't going to help a new user set up a Movable Type weblog if they have to go query for each stage of the installation process. That's why a company like Six Apart provides a quite nice installation guide that covers 99% of the situations most people will run into. By providing step-by-step instructions, the majority of people are able to get their weblogs up and running without too many problems.

On the other hand, osCommerce, a heavily used free open source product for managing ecommerce sites, has little or no formal documentation other than that provided by users in its forums. However, other sites have sprung up providing documentation, including a wiki, and multiple sites that provide, ta dah!, formally written, structured documentation for how to use the product.

Why the need for the latter? Because for the most part, osCommerce is used by people who don't have much technical background or experience, and they are, for the most part, very uncomfortable without structured documentation they can follow, step by step, on how to actually use the product. Not troubleshoot, but actually use the app.

Still, Jon and Tim aren't recommending that companies not provide this information; they're saying provide it (for all those who can't connect the dots through Google, I imagine), but then also offer areas where users can contribute additional documentation and help each other.

Jon mentioned a wiki, and Danny pointed to the WordPress wiki as an example of this type of documentation project. Once upon a time, I also thought that a wiki would be a good tool for open documentation efforts. However, that was before I tried to get several non-technical people interested in providing information at a wiki I set up. They were willing, but many were also intimidated by the environment. It was then that I realized that a wiki requires not only a good deal of knowledge about how to edit the pages, but also familiarity with the culture. In other words, wiki is for those who are wiki-primed.

In fact, this has become a problem out at Wikipedia — most of the editors are technical people, and therefore the skew of information tends to be towards technical subjects. The organization has taken to actively promoting non-tech topics in hopes of attracting enough contributors to these other topics to start achieving some balance in coverage.

But then, Wikipedia has the necessary, critical element to make this work — it has achieved enough momentum to be able to direct attention to obscure topics and know that there should be enough members of the audience with knowledge of this topic, and willingness to dive into what is a fairly structured culture, to provide at least a bare minimum coverage of the topic. Most wikis will not.

That’s why when Jon proposes that vendors provide a hands-off wiki for users, and then uses Wikipedia as an example of how well a wiki can work, I winced. Too many people point to the Wikipedia as a demonstration of how a wiki can work, without realizing that the Wikipedia is unique in its use, purpose, and community. In other words, if the only wiki we can point to as a demonstration of how wikis work is Wikipedia, then perhaps what we’re finding is that wikis don’t work. Or don’t work without a great deal of organization on the part of the wiki administrators, and an already existing community of willing contributors.

Of course, technology users can be heavily motivated to support the products they use, as we’ve seen with weblogging technology. There is nothing more loyal than a weblog tool user–unless it’s a Mac user. You take your life in your hands when you take a critical bite out of Apple.

Based on the assumption of interest on the part of tech users, let's return to the WordPress wiki that Danny pointed out. Checking out the recent changes, we find that on January 6th a user who calls himself GooSa added a bunch of spam pages. I imagine the pages will be removed by the time you look at this, but I copied this person's 'user' page entry:

Goo Sa is an evil, evil spammer. Plus he smells like eggs.

Other than that, there isn't much activity on this wiki, because it's no longer the WordPress wiki. No, that's now the Codex wiki. As you can see in the recent changes at this site, there is a great deal of work being done, as well as less spam content. Of course, I don't believe this is linked anywhere from the main WordPress site, so it could be that the spammers haven't found it yet.

If they do, there seems to be enough organization to keep the site clean of the obvious spam, but what about the not-so-obvious destructive actions? For instance, the malicious editor who adds a helpful-sounding tidbit that could actually harm users, harm that only a very experienced technical person would be able to recognize? (Or the user who has just enough knowledge about a topic to make them scary as hell.)

If this tip were out on a forum or email list, the user might be (should be) wary enough to get it vetted first; but this is the 'approved' wiki for the product, which implies trust in the material it contains. Does this mean, then, that the WordPress developers vet every bit of information in the wiki? If the developers are busy providing code for the product, this is unlikely.

Of course, the thing about wikis is that they are self-correcting. However, it can take time for a correction to take place, and in the meantime, I’m fielding emails from half a dozen WordPress users about why their comments have stopped working, or what happened to their data, all because they followed information at the ‘official’ WordPress wiki.

Now, the Wikipedia doesn't have many of these problems, because there is a good, formal procedure in place, with enough people monitoring the site, to route around damage. However, as the administrators of the site warn on a fairly regular basis: believe what you read there at your own risk. A wiki that's 'authorized' by a vendor to provide documentation for a product can't afford to be this loose about what information gets released under its corporate umbrella.

Security and validity of the data aside, wikis foster a certain form of organization that may not be comfortable for all people. The information contained tends to float about in pieces, rather than flow smoothly, as more formal documentation does. Now, this might suit many of the more technical folks who want to know what’s going on with WordPress; I have a feeling, though, that those less technical folks who read it are going to feel cut adrift at times, as they read a discrete bit of information here, and one there, but without the experience to understand how the two pieces of information are related.

Of course, again, that's where the organizers come in, by helping to move things about and pointing out gaps in the flow. But at what point does it become obvious that if the organizers had just spent their time writing the documentation themselves, it would have taken less time than continuously preventing harm to the material?

Issues of wiki aside, let's return to Jon Udell's request for a new way of managing community documentation. He talks about not being able to find information at the vendor site to solve his problem, so he searched Google and the user forums, and found the answer he needed. He then says something new needs to be done to enable user access to community information.

Now, go back and re-read this last paragraph a couple of times.

As Danny points out, much of this user information is already available, but what might be missing is a way of accessing it:

But wait, those discussions with considerable information is already available – the end-user’s don’t really need any extra encouragement, they’re motivated as it is. But what’s lacking is the organisation of that information. Google is a very blunt instrument. Yes, a company Wiki could act as a focus for people, but they’re still be plenty of info on mailing lists and blogs that could be far more accessible.

But as Jon points out, whether intentionally or not, Google does work, and worked for him in this context. More, what he and Tim Bray, and to a lesser extent Danny, are looking for is the type of documentation that favors the geek, when it's the geeks who don't need the help: they know how to dig this information out, and how to weigh its usefulness against its potential for harm.

What about the non-tech? The non-geek? The very documentation that Udell and Bray scathingly reject is exactly the documentation most non-techs need: well constructed, clearly written, stable, vetted user documentation about how to use the tools. Throw in searchable user support forums for troubleshooting, plus Google and blogs and online interest sites, and Babs is your aunt. So when Tim writes:

Like many great ideas, it’s obvious once you think of it. I’m quite sure it’ll happen.

I have to scratch my head in confusion, because it seems to me that the mechanisms behind this 'new idea' are already in place, and have been for some time. Now if we could just convince the open source community, and supporters like Jon and Tim and Danny, that rather than spending time creating a new style of hammer, they need to provide nails and just start hammering away, we might be set.

Still, I don’t want to completely discount Jon’s wiki suggestion: it could be humorous to see what happens out at, say, a security wiki for Windows software. Might be better than a raree show, indeed.