Recovered from the Wayback Machine.
There’s nothing that will bring me off my bed faster than the word, “blacklisted”. That and getting 22 trackback pings in the last week having to do with my old comment spam quick fix. I guess the spammers have paid a visit and you’re all mad as hell and aren’t going to take it anymore.
Except for this weekend when I turned all comments off, I haven’t used any comment spam protection, including my own suggestion that was so heavily pinged. Reason? I was curious about Mr. or Ms. Comment Spammer and wanted to see how they operated.
There are at least two different types of spammers operating: the smart spammers and the hit-or-missers.
The recent Lolita blitz is the work of a hit-or-miss spammer who simply posts to guessed entry URLs, relying on known weblogs that use Movable Type and the fact that Movable Type numbers weblog posts sequentially. My simple solution of a hidden form field could have blocked this spammer; I wish I had it in place when I had to delete 57 comment spams from the little buggers, as soon as I turned comments back on.
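For anyone who wants to picture the hidden form field idea, here is a minimal sketch in Python. It is not the code from my quick fix and not anything Movable Type ships; the names SECRET_KEY, make_token and accept_comment are made up for illustration, and the keyed hash is my assumption about how such a field might be filled in.

```python
import hashlib
import hmac

# Hypothetical per-site secret; a real install would keep this in configuration.
SECRET_KEY = b"change-me"

def make_token(entry_id: int) -> str:
    """Value to embed in a hidden field on the entry's comment form."""
    return hmac.new(SECRET_KEY, str(entry_id).encode(), hashlib.sha256).hexdigest()

def accept_comment(entry_id: int, form: dict) -> bool:
    """Accept a post only if it carries the hidden field the real form contains."""
    supplied = str(form.get("token", ""))
    expected = make_token(entry_id)
    return hmac.compare_digest(supplied.encode(), expected.encode())

# A blind post that only guessed the sequential entry id carries no token.
print(accept_comment(1234, {"text": "buy now"}))
# A post that came through the rendered form carries the right token.
print(accept_comment(1234, {"text": "hello", "token": make_token(1234)}))
```

A check like this only stops posts that never loaded the form. A spammer who fetches the page first and replays the field gets straight through.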
The other type of spammer is smarter, more devious, and a lot more interesting. This one tests our parameters and also changes code to fit our discussions and modifications. They listen to us. They are out there.
I mention a hidden form field used to protect against ad-hoc spammers, and then I’m hit with spam posts that pull the form data and use it with the comment post. Someone else mentions putting timers between when the page is accessed and when the comment is posted, and the code soon reflects this. This spammer sometimes redirects to a porn site, but most often leaves just a calling card: a domain that doesn’t exist.
I have really enjoyed watching the smart spammer operate, but the ante was upped when the primitive hacker hit a comment post 57 times in a row; I had to discontinue my little experiment and implement whatever anti-comment-spam solutions I could find, primarily because there is no way in Movable Type to deal with this type of comment.
When you receive a comment spam, you have to delete the comment directly using SQL, or manually by deleting each in turn from within Movable Type. Then you have to regenerate all the pages to get them to disappear. Multiply that by 57? Ugh.
Hark, though, a knight in shining armor, Jay Allen, gives up all sleep for what sounds like a week to hack through a comment de-spammer that uses sophisticated regular expression processing to block known keywords and relative URLs when a comment is posted. It also blocks duplicate comments. Best of all, it gives you a little link in the notification email that lets you delete the comment and rebuild the page in one fell swoop.
This is cool stuff, and Jay deserves a big damn gummy bear to munch in appreciation. However, it wasn’t this that brought me out of my sickbed, with holes in my gut and feeling achy, to comment. It was this casual chit chat about blacklisting. Oh, you know I don’t like that word. It’s a Bad Word.
It never fails to amaze me that webloggers will cry foul at the slightest hint of partiality or censorship in mainstream publications, but willingly, happily, blindly adopt any and all thought of blacklisting without a second thought. It seems with Jay’s tool that you can not only list keywords and URLs you want to block comments for; you can also export your list so that others can import your items. Wow, web of trust.
Lesse now. Well, Dave Winer has said some pretty nasty things to me in the past so I think I’ll add ‘harvard’ to the list to block Dave. And you know, Mark Pilgrim has been on my back for six months now, so I think diveintomark goes. Wait a sec — I’ll just put ‘mark’ on the list.
Anyone want to use my list now? What’s the matter? Don’t you trust me?
The thing is that Dave Winer, for all of his willingness to explain our faults in infinite detail, is a real person posting as himself and I opened the comments to him to talk. There’s been a couple of times when I’ve been mad enough to block him, but I can’t believe in ‘free speech’ if I block people from speaking freely with me, and he’s been unblocked and free to comment for months now.
As for Mark, these ‘A Year Ago’ posts I’ve been running at Burningbird have shown comment after comment from Mark when we did get along, or at least were neutral, and I miss those times. However, I’ve crossed Mark’s line and am therefore told to Dive out of Mark, and I’m not necessarily fond of some of his newer comments. Still, I can’t bitch about Mark’s inflexibility as regards differences of opinion if I block him from making comments, can I?
So I guess I’ll remove these two items — harvard and mark. Now, do you want my list? Trust me. I wouldn’t lead you wrong. Besides, I know you all know how to use regular expressions to check to make sure I haven’t snuck a block in against a friend of yours among the foes. I wouldn’t do that. No siree. I’s good, I is.
But speaking of ‘good’ and opposite thereof, does anyone want to have the blacklist.txt file from Little Green Footballs? Would you trust it? How about other more extreme folks who have shown themselves less than amenable to disagreement?
Of course, you don’t have to know that you’re getting Little Green Footballs’ items. You could get someone else’s 3560 entries, and LGF’s items could be a part of this. That’s the problem with non-signed and non-identified entries in a mega-list of blacklisted items: you lose some good with the bad.
No biggie. Right?
You all know Allan Moult and Jonathon Delacour. I’ve known both of them through weblogging for going on two years now. From time to time, I send both an email to say hi, let them know the minute and uninteresting details of my life, or maybe send a link to an interesting article. At least, I used to send them emails before a week ago. I can’t send either of them an email now, because the IP address for my SMTP server is part of an entire block of IP addresses that have been blacklisted by SPEWS. And when I went to SPEWS and said that I can’t be held responsible for my ISP renting out IP addresses to spammers, I’m not a spammer, the response was basically, “Tough. Change ISPs.” Sure, as if I have an extra few bucks to forgo what I’ve paid for and move just because SPEWS decided to punish my ISP using me as the weapon.
(My ISP’s response? “Tell your friends not to use SPEWS for filtering.” Pot, meet kettle. Kettle, meet Pot.)
Blacklisting is never going to be an effective, long-term solution for any, and I mean any, internet-based problem. Period.
I had an email conversation about comment spam earlier today with Dorothea on this issue. In addition to the Bad Word, my conversation with D also sparked glimmers of weblogging interest deep within this tired old body.
Dorothea mentioned that SPEWS is different from the comment spammer thing because it’s centralized. My response was:
Actually, the problem with SPEWS is that it’s not centralized — there are no people you can contact directly to say, you’re hurting me by your blanket IP block blacklisting. There are no faces taking responsibility. There is no accountability, no compassion, no individuality. It is group behavior at its worst.
Group behavior at its worst. Hmmm. Sometimes when things like this comment spammer hit, you can feel the world tilt by the movement of webloggers in one direction. See what you did? You all made me fall over.
I trust in the individual, which means each person should consciously decide on what is, or is not, acceptable, when it comes to the flow of information to them or from them. Filters are non-discriminating in their ruthless discrimination. Communication, and the so-called freedom of speech we rant about, is based on work and deliberate determination — not quick fix global blacklisting.
Still, my concerns about blacklisting are just so much paranoia — nothing like this could ever happen in weblogging. Could it? Nah, not a chance. About as silly as comment spamming.
My preferred solution for comment spam? Close the barn door. Comments were added into Movable Type with a lot of openings and it’s time to provide better functionality for managing them — not comment spam, comments.
Ben and Mena Trott of Movable Type ask, what can we do? Well for starters:
Give me the ability to list all comments by a specific IP, URL, email, or name.
Give me the ability to mark all or part of them, using bulk update techniques, for deletion.
Give me the ability to then rebuild just those pages where the comments were deleted.
Give me the ability to turn off new comments temporarily for those days when I may not be around to deal with the baddies, and to provide information to people automatically about why they can’t post comments momentarily.
Finally, give me the ability to add Jay’s functionality, and others like it, to keep spam comments out in the first place if I want that additional protection. Of course, we have this now, but it doesn’t take the place of the other items on this list.
I want all of this — greedy bugger that I am — and following through on Jay’s excellent ideas, give me the ability to do so with one push of the button. Don’t give me new functionality such as user registration and fancy uses of RegEx processing. Give me the ability to manage the data I already have. Give me better comment management.
If I had this with the 57 items for Lolita, I could have selected all the comments based on the one IP or URL, marked them for deletion, and rebuilt the pages that contained them in one click of a button. End of problem, minor irritation.
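To show how little is being asked for, here is a rough sketch in Python of the select, delete and rebuild-list steps. The comments table and its columns are invented for the example and are not Movable Type's actual schema; the function name delete_comments_by_ip is likewise hypothetical, and the rebuild itself is left as a list of entry ids to regenerate.

```python
import sqlite3

def delete_comments_by_ip(db: sqlite3.Connection, ip: str) -> list:
    """Delete every comment from one IP; return the entry ids needing a rebuild."""
    rows = db.execute(
        "SELECT DISTINCT entry_id FROM comments WHERE ip = ? ORDER BY entry_id",
        (ip,),
    ).fetchall()
    db.execute("DELETE FROM comments WHERE ip = ?", (ip,))
    db.commit()
    return [entry_id for (entry_id,) in rows]

# Hypothetical schema and sample data, held in memory for the demonstration.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE comments "
    "(id INTEGER PRIMARY KEY, entry_id INTEGER, ip TEXT, body TEXT)"
)
db.executemany(
    "INSERT INTO comments (entry_id, ip, body) VALUES (?, ?, ?)",
    [
        (10, "192.0.2.7", "spam"),
        (11, "192.0.2.7", "spam"),
        (10, "198.51.100.3", "nice post"),
        (11, "192.0.2.7", "more spam"),
    ],
)

# Only these entries' pages would need regenerating, not the whole weblog.
to_rebuild = delete_comments_by_ip(db, "192.0.2.7")
print(to_rebuild)
```

The same shape would work for selecting by URL, email, or name; only the WHERE clause changes.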
Now what happens is that I have to add Jay’s perl-based Regex handling into my system for all comments that come in (yes, take a serious pause with this), slowing what is an already very slow process at times. I have to punish the many for actions of the few, rather than being provided a way to clean up after the few so that the many can happily chat away. And then I have to make sure my regular expressions don’t accidentally filter a friend. Or foe. Accidentally. Of course.
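The accidental filtering is easy to demonstrate. The sketch below, in Python, assumes a naive filter that treats each blacklist term as a case-insensitive substring; that is my assumption for illustration, not a description of how Jay's tool matches. The helper names build_filter and audit are hypothetical.

```python
import re

def build_filter(terms):
    """Compile blacklist terms into one case-insensitive substring pattern."""
    return re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)

def audit(terms, known_good):
    """Return the known-good comments that the list would block."""
    if not terms:
        return []  # an empty pattern would match everything, so stop early
    pattern = build_filter(terms)
    return [text for text in known_good if pattern.search(text)]

blacklist = ["mark", "harvard"]
known_good = [
    "A remarkable post, thanks.",
    "I bookmarked this for later.",
    "See you at the conference.",
]

# Comments from friends that an imported list would silently drop.
for text in audit(blacklist, known_good):
    print("would block:", text)
```

Running an imported list against a sample of comments you know are good is about the only way to see what it would take out with the spam.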
Tech solutions to social software problems. I mentioned in the email earlier to Dorothea that most of these automated approaches aren’t social in nature, and therefore not compatible with social software. How come, then, I was asked, that my approach is better? I responded with:
Because they force the individual to take responsibility for the material that is deleted or not from their weblogs.
I wrote you and you didn’t respond.
I didn’t get it. Must have been blocked by email spam filter.
I commented on your weblog but it didn’t show.
I didn’t get it. Must have been blocked by the comment spam filter.
I had something important to say, but you didn’t hear it.
IT MUST NOT HAVE PASSED THE FRIGGEN’ FILTER!
“Oh say can you see,
by the dawn’s early light…” Only if you speak just right!
As for the Google thing or Technorati or Blogdex, or most recently commented lists — sure the URL might get pushed up momentarily. But it’s just as likely to fall off when all of the links disappear. These are dynamic entities, and thus, are self-repairing. So they’re on top for a minute. Who cares?
If we’re that concerned, one solution for the most recently commented list is to point to the entry with the most recent comments rather than list the individual’s URL, like I do now. As for Google and the comments, create a second individual page template that doesn’t have comments and have it built when the other new page is built. Allow Google access to this page, but not the one with comments.
(Send email if you want instructions — maybe I’ll be able to reply if you’re not in Australia, and I’m not blocked.)
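One way the second-template idea might be wired up is with a robots.txt rule, checked here with Python's standard robots parser. The directory names /archives/ (pages with comments) and /plain/ (the comment-free copies) and the example.com URLs are hypothetical, and robots.txt is my assumption about the mechanism; it only works for crawlers that honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout: /archives/ holds the pages with comments,
# /plain/ holds the second, comment-free copy of each entry.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archives/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The comment-free copy is open to crawlers; the commented page is not.
print(parser.can_fetch("Googlebot", "http://example.com/plain/000123.html"))
print(parser.can_fetch("Googlebot", "http://example.com/archives/000123.html"))
```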
Ben and Mena say, “We don’t know what to do”, and we should be saying back, “Well, for starters, you can do this and this and this.” And no, the solutions aren’t using clever coding techniques, as much as I admire them (and Jay’s one smart puppy); but they are using good, common programming sense and practices, which state that a better use of time is to close the friggen door rather than figure out fancy new knots to catch the horses that escape. I respect what Ben and Mena have accomplished with Movable Type to this point, but if they give me comment management, I’ll send them chocolates for Christmas.
Most of all, though, we should push back any time someone even remotely mentions ‘blacklist’ and ‘weblog’, or ‘blacklist’ and ‘internet’ in one breath. Always. These words, they don’t go together.
They never will.
I like wKen’s approach to the whole problem. He loves the comment spammers — gives him an ability to slide on posting, figures he could just let the spammers do it for him. Now that’s a social software solution.
And instead of hating the spammers, maybe we should learn from them, as I wrote Dorothea:
I admire this spammer enormously and have had a wonderful time tracking him/her the last month or so. It’s fascinating to watch someone with this person’s adept understanding of the social aspect of ‘social’ software, as they counter and move around obstacles we clever techs put in their way. Personally, I think Tim O’Reilly should have him or her as a featured speaker at the Emerging Tech conference.
Update: Winds of Change has had to disable mt-blacklist because the processing is too extreme for the site. Winds of Change is a pretty popular place.
We talked about this issue before, the last time comment spamming was a hot topic: anything clever enough to catch most comment spammers will be too complex for regular use.
Now, if we had good comment management in MT….