Making a Deliberate Choice

Recovered from the Wayback Machine.

It must seem at times as if we webloggers have become the target of every prankster, spammer, virus writer, cracker, and general wacko that exists on the Internet. However, before you dismiss your vague feelings of insecurity as paranoia, remember that old chestnut: Just because you’re paranoid, doesn’t mean someone’s not after you.

Webloggers are an extremely tempting target for all the Bad Guys that exist on the Internet. We inject more of our personal selves online and into our web sites than most other Internet users. Additionally, our sites are some of the more active online, as well as the most interconnected. We’ve long had a disproportionate influence on search engines such as Google; lately, we’re having the same impact on major publications, and even politics. And we’re volatile: we spare no effort to draw attention to ourselves or to a cause, the more controversial the better. So yes, if it seems like we have a bright red bullseye drawn on our collective body, we do.

Of course being the active Web participants that we are, we’re not passive victims of abuse — far from it. Unfortunately, though, many of the means we take to protect ourselves end up hurting ourselves more than the abusers ever did.

Lolita and Viagra sitting in the tree, K-I-S-S-I-N-G

Most Movable Type users have experienced the recent comment spammers that have been putting their links into our weblog comments. Until a month or so ago, these comments were relatively manageable, usually consisting of a couple of comments that had to be deleted and the entries rebuilt. However, when Lolita hit, things changed.

The Lolita comment spammer didn’t just automatically send out comments to a few posts in each weblog; some of us had over 50 posts that received spam comments. Additionally, the Lolita spammer posted comments in a massive number of weblogs, enough to send the links included in the posts to the top of the Blogdex and Daypop buzz sheets.

The purpose behind these spam comments isn’t to be mean to the weblogger — there is nothing personal in these attacks. The purpose also isn’t to con weblog post visitors into clicking through to the site — our traffic isn’t that heavy.

(No, not even the top sites: 10,000 unique visitors a day isn’t even considered a medium-traffic site outside of weblogging circles.)

The Lolita comment spammer’s only purpose is to get links into weblog posts for that infamous and most cherished little web crawler — the Google bot. By doing so, the URL that gets put into the weblog post achieves a higher Google PageRank based on the active links in our weblogs, and consequently, the URL is going to show towards the front of search results when searching on a particular keyword. Such as porn. Such as Viagra.

(The Lolita comment spammer is also sometimes known as the viagra-rx comment spammer — depends on which mass comment spammer you were hit with, first.)

What made us vulnerable is the very nature of weblogging — the openness, the invitation to communication. Our comment forms are wide open, requiring no login to post a comment. In addition, the form structure and element names are consistent across MT installations, making it easy to use any number of tools to post a comment without even having to access the comment form. You don’t even have to have programming skills: I downloaded a command-line utility from the W3C last week that allowed me to post comments from my Linux server to a few weblogs I know without having to go through the form.

(No worries. I did each post specifically to a posting, and signed my name. That recent Lolita/viagra-rx was not me.)

As for finding the posts, well that’s dead simple, too. When the viagra-rx comment spammer hit, I wrote that the spammers are actually using Google to search for high ranking weblogs that have MT-style comments enabled. But then, the spammers can also use the recently updated feeds from weblogs.com or blo.gs if they so desired.

The interconnectivity that brings people to your weblog is the same interconnectivity that spammers use to find your weblog.

Another aspect of Movable Type that makes it easy for spammers is the fact that entries are given sequential numbered identifiers. Once you find one entry you can keep posting to other entries just by incrementing the count. Changing how the files are named won’t change this because MT sees the entries by their identifiers, not by how the files are named. However, from what I can see of the usage patterns of Lolita, our posts are being discovered via Google, not through this other MT vulnerability.

Unfortunately, because of people like myself, who use the default MT number generated web page names rather than some other naming sequence, Ben and Mena of Movable Type are constrained to continue supporting sequential number identifiers or risk creating all sorts of broken links when an entire site is regenerated.

That’s the problem with weblogging and our use of technology in the past: we were such a naïve and open group of people, protected from our folly because we used to fly under the Bad Guys’ radar and weren’t much of a target. Because of this, we’ve built software and we’ve added functionality that’s left all sorts of holes in our sites, and closing these holes is going to be difficult, at best; impossible, at worst. If it seems like Movable Type webloggers are the favorite herd animal, it’s only because the creators of Movable Type, Ben and Mena, have listened to what we’ve asked for and given it to us. None of us knew or imagined how a combination of our openness and our growing influence could be used against us. Or we knew, but we were having such fun, we didn’t care. Let tomorrow’s problems occur tomorrow, we thought to ourselves, and when they do, we’ll whip up a technical solution quick as a blink.

So now we’re blinking. Like mad.

Who was that masked man?

When we were first hit with the comment spammers several months back, we instituted some simple changes, but they were easily overcome by today’s semi-sophisticated comment spammers. For instance, my own implementation of a hidden form field to catch generic comment posts was easily circumvented: the spammers would read the web page, find the hidden field and its value, and include it in the comment post.
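The weakness of a fixed hidden field is easy to see in a few lines. What follows is only a minimal Python sketch of the idea, not the code that ran on any real weblog: the field name `mt_check`, its value, and every function name are hypothetical.

```python
import re

# Hypothetical hidden field that the comment form embeds in every page.
HIDDEN_NAME = "mt_check"
HIDDEN_VALUE = "a1b2c3"

FORM_HTML = (
    '<form method="post" action="/comments">'
    f'<input type="hidden" name="{HIDDEN_NAME}" value="{HIDDEN_VALUE}">'
    '<textarea name="text"></textarea>'
    "</form>"
)


def accept_comment(fields):
    """Accept a post only if it carries the expected hidden value."""
    return fields.get(HIDDEN_NAME) == HIDDEN_VALUE


def generic_bot_post():
    """A generic script that posts without ever loading the form."""
    return {"text": "buy pills"}


def scraping_bot_post(page_html):
    """A script that reads the page and copies every hidden field it finds."""
    fields = {"text": "buy pills"}
    pattern = r'<input type="hidden" name="([^"]+)" value="([^"]*)">'
    for name, value in re.findall(pattern, page_html):
        fields[name] = value
    return fields


if __name__ == "__main__":
    # The generic post lacks the field and is rejected.
    print(accept_comment(generic_bot_post()))
    # The scraped post carries the copied value and gets through.
    print(accept_comment(scraping_bot_post(FORM_HTML)))
```

Because the value is the same for every visitor and sits in plain sight in the page source, any script willing to fetch the page once can replay it.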

We didn’t seriously pursue anything more complicated because the issue was manageable. After all, not every weblog was hit with comment spammers, and we could use MT’s comment deletion functionality to manage it.

Well, Lolita and viagra-rx changed all this. When you wake up in the morning and check your email and find that you’ve had 75 comment spams attached to 75 different posts, MT’s support for comment management is no longer viable. For most of us, the only approach was to delete the comments in the database using SQL, and then rebuild the entire site.
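For the curious, a bulk cleanup of that sort might look roughly like the Python sketch below. It is an illustration only: it uses an in-memory SQLite database so it can run anywhere, and the `comments` table, its columns, and the function names are hypothetical rather than MT’s real schema.

```python
import sqlite3


def delete_spam(conn, url_fragment):
    """Delete comments whose URL or body contains url_fragment.

    Returns the ids of the entries that lost comments, so that only
    those pages need rebuilding. Note that LIKE treats % and _ in the
    fragment as wildcards, so this is for plain domain names only.
    """
    like = f"%{url_fragment}%"
    rows = conn.execute(
        "SELECT DISTINCT entry_id FROM comments "
        "WHERE url LIKE ? OR text LIKE ? ORDER BY entry_id",
        (like, like),
    ).fetchall()
    conn.execute(
        "DELETE FROM comments WHERE url LIKE ? OR text LIKE ?",
        (like, like),
    )
    conn.commit()
    return [entry_id for (entry_id,) in rows]


def demo_db():
    """Build a tiny in-memory database with one good comment and three spams."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE comments ("
        "id INTEGER PRIMARY KEY, entry_id INTEGER, url TEXT, text TEXT)"
    )
    conn.executemany(
        "INSERT INTO comments (entry_id, url, text) VALUES (?, ?, ?)",
        [
            (1, "http://example.org/me", "hey nice pic"),
            (1, "http://spam.example.com/a", "cheap pills"),
            (2, "http://spam.example.com/b", "cheap pills"),
            (3, "", "see http://spam.example.com/c"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_db()
    print(delete_spam(conn, "spam.example.com"))
```

Running the SELECT on its own first, before the DELETE, is a cheap way to see what would be removed.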

Another approach is to use IP banning within MT to ban IPs from being able to post comments. Now, this works against individuals who use static IP addresses to access your site, but doesn’t do a thing to protect you from dial-up users, or users who have multiple IP addresses. It definitely won’t help you against the comment spammers who either use a different dial-up account each time they do a comment spam run, or spoof the IP address, making it seem as if the comment spam originated from another IP address.

In addition, it’s so easy to make mistakes with IP banning. For instance, I couldn’t post a comment in Loren Webster’s comments last week and wrote Loren to ask, what’s up? After all, I know that Loren would never ban me; I’ve never once tried to push either viagra or porn on him. Neither have I asked his help to do my school report on some sucky old poet.

What happened (according to Loren, who gave me permission to recount his experience) was that when Loren was putting an IP address into the ban list, he accidentally added a blank entry. This, in effect, blocked every IP address from posting comments — including Loren.
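One plausible way a blank entry can ban the whole world is prefix matching. The Python sketch below assumes that banned entries are compared against the start of the visitor’s address, which may not be how MT actually does it; the function names and the addresses (taken from the ranges reserved for documentation) are hypothetical.

```python
def is_banned(ip, ban_list):
    """Return True if ip starts with any entry in ban_list.

    Prefix matching lets one entry such as "192.0.2." cover a whole
    block, but an empty entry is a prefix of every address.
    """
    return any(ip.startswith(entry) for entry in ban_list)


def is_banned_safe(ip, ban_list):
    """Same check, ignoring blank or whitespace-only entries."""
    cleaned = [entry.strip() for entry in ban_list if entry.strip()]
    return any(ip.startswith(entry) for entry in cleaned)


if __name__ == "__main__":
    ban_list = ["192.0.2.", ""]  # the second entry was added by accident
    visitor = "198.51.100.7"     # an address nobody meant to ban
    print(is_banned(visitor, ban_list))
    print(is_banned_safe(visitor, ban_list))
```

Dropping empty entries before matching costs one line and closes this particular hole.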

My being blocked was an error, but what about comment blocking by deliberate act? Originally the Lolita/viagra-rx comment spammers were using IP addresses from China for their posts and people were blocking entire sections of addresses from China. As I wrote previously, not only were the Chinese people blocked from reading weblogs hosted at sites such as Blogspot, but they couldn’t post comments even when they could get into the weblogs.

So if IP banning won’t work, and my simple fix is too simple, what’s one to do? Enter the fray: Jay Allen and his sophisticated MT-blacklist plugin and software.

What Jay has done is develop a multi-pronged attack against the comment spammers. First, his software will add a link to the bottom of every comment email, and clicking on it will give you the option to add any URLs found in the email to a list (already pre-installed with over 450 entries). The software will also delete the comment from the database and rebuild the comment — in one easy step.

Next, if you so choose, you can block all comments with that URL from that day on, which means that the comment won’t even be added to the system. The only record you’ll have of a blocked comment is in MT’s log.

With new additions to the software, you can control individual comment spams, block future spams, and also traverse your existing system and remove old spam that’s hanging around.

Of course, to do all this there were some drawbacks. For instance, Jay had to overwrite the existing MT code for comment and trackback management. This means that those of us who hacked the MT code to do things such as republish our pages after a trackback, had to now add this code to Jay’s Blacklist.pm file, and update this code every time Jay puts out a new release. Additionally, there are some software requirements to run the code, and the extra processing does add to the overhead every time a comment is posted, good or bad.

However, most of us installed MT-Blacklist even with the drawbacks, primarily because of that one click comment deletion, because comment management in MT is not effective against comment spammers.

Well and good…except that this fix adds its own potential problems.

Nuclear powered flyswatters and other myths of Man’s invulnerability

I installed MT-Blacklist but I didn’t activate it, which means that I’m not blocking comments, only using the email link to delete comments already made. I wanted to see what happens with comment spam in light of the new comment prevention techniques. Within the first week I spotted a pretty serious problem with MT-Blacklist that has repercussions for the innocent commenter who just wants to say, hey nice pic.

Within the first week, URLs added to the list included anything to do with the word ‘academia’, as well as the domains of ‘hotmail.com’, and ‘yahoo.com’. Now, there are academic people who frequent my comments from time to time, and the word academia isn’t that uncommon. In fact, when I checked my comments, I found it mentioned four times. If this URL (it was a faked URL, but ended up as ‘academia’ in the list) had been used to block comments, these four comments would not have been allowed. Worse, if I had run the utility to remove old comments with this value, these four comments would have been deleted.

As for hotmail.com — this isn’t that unusual an email address to use when making comments to people’s weblogs. Most of us have a ‘throwaway’ email address we use for weblogs. Or a spam faked one. If hotmail.com had been left on the list, this would have impacted on 155 comments. If I had run the cleanup utility, these 155 comments would have been deleted.

This is nothing compared to yahoo.com: a whopping 455 comments are in my system related to or using the yahoo.com URL in one way or another. 455 comments! I may have over 8000 comments in my system, but that doesn’t mean I want to delete 455 good ones, or block anyone who uses a yahoo.com email account. After all, I use a yahoo.com email account.
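Counts like these can be gathered without deleting anything, by doing a dry run of the list against existing comments. The Python sketch below assumes plain case-insensitive substring matching across the email, URL, and text of each comment; MT-Blacklist’s real matching rules may well differ, and the sample comments, field names, and function names are all made up for illustration.

```python
def matches(comment, blacklist):
    """Return the blacklist entries found anywhere in the comment's fields."""
    haystack = " ".join(
        [comment.get("email", ""), comment.get("url", ""), comment.get("text", "")]
    ).lower()
    return [entry for entry in blacklist if entry.lower() in haystack]


def dry_run(comments, blacklist):
    """Count how many existing comments each entry would hit, deleting nothing."""
    hits = {entry: 0 for entry in blacklist}
    for comment in comments:
        for entry in matches(comment, blacklist):
            hits[entry] += 1
    return hits


# Hypothetical data: two legitimate comments and one spam.
COMMENTS = [
    {"email": "reader@yahoo.com", "url": "", "text": "hey nice pic"},
    {"email": "prof@example.edu", "url": "", "text": "Academia has its own spam."},
    {"email": "x@example.net", "url": "http://viagra-rx.example.com", "text": "pills"},
]
BLACKLIST = ["viagra-rx", "academia", "yahoo.com"]

if __name__ == "__main__":
    for entry, count in dry_run(COMMENTS, BLACKLIST).items():
        print(entry, count)
```

In this toy data, two of the three entries hit only legitimate comments, which is exactly the kind of surprise a dry run is meant to surface before anything is blocked or removed.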

These entities ending up in the list is not an error in the technology of MT-blacklist, but is a consequence of using technology as a blanket solution to social software problems. (Now, where have I heard that before?) MT-blacklist does give you the ability to review URLs before they’re added to the list, and these obviously good URLs (or faked keyword-as-URL) could have been filtered out. However, this implies that you review each and every URL in a comment spam to make sure good URLs aren’t being included. Each and every one.

Considering that some of these comment spams have upwards of 100 different URLs listed, one can see how something like ‘academia’ made it into the list and stayed there until I was reviewing it one day and spotted the word.

The potential for abuse with something like URL blocking, as it was with IP blocking, is cause for concern. For instance, if you want to deliberately censor me in other weblogs, just add a weblog comment that has 200 spam URLs, and then sneak my URL among them. Post this manually or automatically at any site using MT-blacklist and there’s a very good chance my URL won’t get caught. Won’t happen, you say? Where do you think the yahoo.com URL came from in my example?

Or the comment spammers can get clever: post a comment spam with a spam URL, but embed it among several hundred ‘good’ URLs, forcing the person to review each and every one carefully to find the bad URL to add to the list.

Of course, if the comment spammer reads this I’ve given him or her, or them, ideas. We know they are listening — one spam comment I had recently actually talked about our conversations, and then added links to pharmacy sites to add salt to the wounds. But it’s not just the comment spammers who are listening.

Update: I get so carried away with the focus of my writing that I sometimes forget my manners. I wanted to make a strong point here that there is no ‘fault’ in Jay Allen’s software leading to the issues just mentioned. Jay has worked hard on this product, it is a remarkably sophisticated and useful product, and it is one that Jay provides free of charge for people to use, a very generous act and one for which I am grateful.

In Denial

If your weblog is currently hosted with Hosting Matters, chances are your web site suffered some serious downtime in the last week due to a Distributed Denial of Service (DDoS). The fact that a site was hit with a DDoS wasn’t unusual — it’s become a common event nowadays. What was unusual was all the conjecture that this DDoS was a deliberate act to take down certain high profile warbloggers.

I first read the conjecture about this being a ‘warblogger attack’ at Winds of Change. In the comments and posts, the discussion focused on a so-called claim of responsibility posted at another web site, and on the idea that the attack was aimed at Internet Haganah. A bit of irony enters the picture here because Internet Haganah is a site devoted exclusively to bringing down what it terms Islamist terrorist sites.

(I took a look at some of the sites still up, and all I’ll say is that the term ‘terrorism’ is extremely relative. So is ‘freedom of speech’.)

Other sites also jumped on the DDoS-as-Jihad bandwagon, including the ever-committed Mr. Reynolds.

In the Winds of Change post, Jace from Bloghosts even brought up the accountability of webloggers to each other in our actions. He wrote in comments:

I think all war-bloggers need to be smart about the content they post and the activities they are involved in. There is a real war going on, there is no need for us to contribute to a virtual one as well. Everyone should recognize the difference between reporting and sticking their noses in places they do not belong. It should not the responsibility of any blogger to see that Al Qaida sites or their supporters are shut down or exposed. By doing this you are putting your own site and the sites of others at risk from not only DDoS attacks but also harassment, identity theft, and possibly worse.

If I write something that brings attack on me, I’m bringing that attack on to others on my server. Do I stay quiet? Jace isn’t advocating that we muzzle ourselves, but he is telling us to be aware of our actions, and to make sure they’re deliberate.

Of course, there’s no real way of knowing where this DDoS originated, and what its purpose was. DDoS is a way of life, and as with the comment spammers, most DDoS aren’t necessarily personal, though they do tend to be deliberate.

There is no guaranteed technique or tool that will stop all DDoS attacks. The only way not to be attacked is to not have your web server machine online, which tends to defeat the purpose of having the web server machine in the first place.

The resulting behavior and reactions from the webloggers must have been enormously satisfying to the attackers — all this talk of conspiracy and jihad, and virtual wars. “I regret that I have but one weblog to give to my country.” I can hear the script kiddies now, “Oh. That was fun! Let’s do it again!”

That’s not to say damage wasn’t done by this DDoS attack, and that I don’t take it seriously. For instance, the reactions against Hosting Matters ranged from threats of physical violence against the HM support personnel, to people leaving the company for other providers, usually because some pundit in comments somewhere makes casual statements that “a DDoS attack is easily preventable. You should dump your host.”

I can guarantee that whoever says this has never supported a network. In their life. I’ll write more about the mechanics of DoS and DDoS in a separate essay but the point to make here is that animals hunt where there is noise and there’s no bunch of people noisier than webloggers.

When in doubt…

It must have seemed, especially to people hosted on Hosting Matters, as if they were settlers in a new territory having to draw their wagons into a circle to keep out the bandits. First, there were the comment spammer attacks, which have increased exponentially in the last few weeks. This was then compounded by the DDoS attacks, which knocked sites offline, for days in some cases. Contrary to conjecture, though, there isn’t a conspiracy to get all warbloggers on Hosting Matters. The ISP who hosted our coop server was also attacked during this time, as were other ISPs throughout the world. The SCO Unix site was also attacked through a vulnerability in older unpatched OS software, which was rather embarrassing.

People getting angry about these events didn’t surprise me. What did surprise me about all of this is how personally people are taking these ‘attacks’. Joe Duemer originally wrote:

Spam is bad & while I think pornography has been & will always be with us, comment spammers are the lowest of the low. Bottom feeders, the catfish of the internet, eaters of rotten feces.

Later he expanded on his reaction, writing:

What struck me as interesting is that most people who posted something more than grrrrrrr on the subject were more bothered by the invasion of privacy than by the explicit nature of the links. The fact that the links led to unsavory websites was an added irritant, but in most cases the pr0n link was not the primary objection. As I said in my original post, I have a pretty tolerant view of explicit material, though I find the exploitation of children despicable. Beyond that, what grownups do with their bodies in making or consuming pr0n is pretty much up to the people involved. I take a libertarian view of the industry.

What I strongly object to is the appropriation of my bandwidth to game Google. And I even more strongly object to the presumption required on the part of the spammer to barge into my little corner of the net.

Today, as Jay Allen works extremely hard on a new version of MT-Blacklist, he issues a warning: “Have your fun, lolita, because soon you will have none….”

Marie wrote:

When I came home a few days later I discovered that lolita has scrawled “her” soulless signature all over my blog, and I was enraged. I say “enraged” because I could not find a reasonable explanation for what was bothering me so much about this incident. It wasn’t the p0rn issue that set me off, that’s for sure — and yet, I felt a bit sullied, as if somehow my person (who I thought I was) has been violated.
…
But, as Coetzee goes on to argue, the affront, which is real, is an attack on a construct by which we live, and not on our essential being as such. This is why we need to use our heads, not just our guts, as Shelley has suggested, and fight back not with our wounded egos and their urgent demands for censorship, but with other constructs that recognize these intrusions for what they are. This is why Shelley throws the challenge to the Trotts, asking them to step up and play a better game when it comes to designing the comments system.

Taking offense, then — and I am reminding myself here, not preaching to you all — is not the answer to the lolita problem.

This sense of personal attack isn’t limited to just comment spamming, as witness the reactions and rumors of conspiracy among the warbloggers when HM was attacked with the DDoS.

Instapundit anxiously reminds his readers about his weblog backup site just in case he’s taken out, as if he’s the only source of news we have and he’ll get it to us, or die trying.

When Boing Boing was also attacked by a DoS and moved to a new server (according to the kind answer from AKMA) the question blazed across the aether — where’s Boing Boing? Where’s Boing Boing? We posted a direct link to the IP address on MetaFilter and in sites until the DNS name change could propagate, as if we couldn’t live without Boing Boing for that day or two.

(Do I envy the hits that Boing Boing gets? I used to until this event. Then all I could think of is I don’t ever want such popularity that I can’t go offline for a day or two, or three or four or twenty, when I don’t feel like posting.)

When I returned from my short break, I wrote at the time:

Next year is going to be a very bad year for the Net, and every weblogger, no matter who you’re hosted with, had better be ready to have your site down an average of 2-4 days every month. Yes, days.

Pretty extreme prediction, isn’t it? Random Bytes thought it was extreme and wrote:

I don’t buy it solely based on Rader’s First Law of Statistical Analysis – “A prediction with an outcome that contemplates an order of magnitude increase over current state must be accompanied by some damn good evidence supporting the prediction.”

Is it even remotely possible that the internet is going to get as bad as Shelley predicts as quickly as she predicts?

But you see, I wasn’t making a prediction. What I said was that webloggers need to be prepared to have their sites down 2 to 4 days a month. By this I mean that webloggers are going to have to come to terms with the technology that supports them, and that this technology will never completely be able to protect them from comment spammers and DDoS and whatever other electronic things that go bump in the night.

What could possibly bring about such a violent change in the aether so as to violate the laws of statistics? People. People, that’s who.

Perspective.

We aren’t being violated by comment spammers, and I refuse to give them that power over me. They annoy me, and sometimes they even intrigue me; but at no time do I feel as if my personal space has been violated. How can it? They don’t have this type of power over me. Nor am I going to close down even one legitimate comment in order to trap the Bad Guys. That also gives them power over me.

If our sites go down, they go down. When you can’t access a weblog for a day or two, unless you have reason to believe that the person is fed up with the whole thing and has quit and run off to Tibet or something, assume that Technical Difficulties are happening, and that the site will return soon. If it’s your weblog that’s down, it’s down. Face the fact that your site is going to go down, and instead of issuing threats of violence to the ISP, or screaming into the phone, or sending emails to everyone you know that your site is down due to technical problems (which they can deduce anyway), why don’t you use the time for a walk? Bake a cake. Pet your kitty. Write something on paper.

Unless your weblog is necessary for your business, or your life is at risk, why are you stressing?

That’s not to say you should be passive, but your actions should be deliberate rather than reactive.

Today I received a comment in my Shinto Commandments post that read, “Your site blows. I am going to kill you.” This wasn’t a comment spammer, but it was, in some ways, far worse — it was a person hiding behind anonymity to issue the most casual of threats: I am going to kill you. Funny, haha.

Blow it off as kids? Not a chance.

I traced the IP address to a school system for a small town that happens to be in Missouri, gathered up my log entries and the comment and sent it off to their network administrator. He was able to use his own proxy logs to determine that the person who submitted the comment was one of two culprits and agreed with me: there is nothing funny about a comment such as this. He said that the person who wrote it would be, in his words, “…severely punished.”

Just a kid you say? I don’t care. If he or she was old enough to type the words into the computer, they were old enough to accept the consequences. They made a deliberate choice, and so did I.

That’s the point of this long rambling discourse. The longer you’re going to be online, the more you’re going to have to make deliberate choices about your environment.

Choices such as taking the time to learn as much as you can about the technology that runs your site so that you know if your ISP is doing the best it can, or if it’s time to abandon ship for a new ISP. Someone somewhere along the way fed a line to webloggers that they don’t have to know anything about the Internet to have a weblog. Well, that’s a load of bullpuckie.

Choices about the battles you fight, and knowing when to make a stand, and when to walk away. Mad at a spammer? Then by all means, take the fight to them — but be aware of the laws and rules governing the internet and make your fight deliberate. And follow it to the end.

Choices about the technology you use to protect your site, and to be aware of the consequences of it being abused; to question the so-called experts when they tell you what you must or must not do with your site. If you don’t want comments, then don’t have comments. If you don’t want an RSS feed, then don’t. Turn your comments off on posts thirty days old, or block comments from specific people.

Just don’t edit my comments, or I’ll have to hurt you.

Joking.

You’re not a leaf floating in a stream with no control over your movements — if you have enough control over your life to decide to have a weblog, then you have enough control to know that whatever happens to your weblog, it’s not happening to you.