Recovered from the Wayback Machine.
In the last few weeks, I’ve been hit not only by comment spammers but also by a new player who doesn’t seem to like our party: the crapflooders, people who use automated applications (you may have heard of the program called “MTFlood” or some variation) to literally flood comments or trackbacks. At one point I was hit with over 1000 comments on one of my posts; another time, over 500 trackbacks. If you add in rebuilds and email, this can put real stress on the web server, not to mention being annoying to clean up.
Several people have looked at this issue, but two, Phil Ringnalda and Jacques Distler, have provided code as well as technical expertise, looking at the problem and deriving solutions for Movable Type users.
(Others also have code solutions, but I’m primarily familiar with Phil and Jacques’ work.)
One solution looked at was the use of a ‘nonce’ with forced Preview on comments, which should help hinder automated posting. The idea for this came from Sam Ruby, though Sam’s software does differ from what the rest of us, who are Movable Type users, run. A nonce is a value, either a random number or one based on the machine clock, that is issued in Preview mode and verified when the form is submitted. It’s a good idea, works for Sam, and Phil took the idea and has been working with it. However, as he found out, this type of solution can be cracked, and each time it is, the nonce has to be altered, which means changing the code. We (Phil, Jacques, and I) felt that a solution requiring frequent tweaks to the code would not be viable to release to the non-tech MT users. So instead, we’re focusing on throttles.
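To make the nonce idea concrete, here is a minimal sketch in Python. It is not Sam’s or Phil’s code (Phil’s isn’t published, and Movable Type itself is Perl). It only shows the general shape of a clock-based nonce handed out at Preview and checked at submit. The names `SECRET_KEY`, `MAX_AGE_SECONDS`, `make_nonce` and `verify_nonce` are all made up for the illustration, and the keyed hash is my assumption about one reasonable way to keep the value from being forged.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"   # hypothetical server-side secret, not from any real install
MAX_AGE_SECONDS = 15 * 60   # hypothetical: how long a Preview stays valid


def make_nonce(entry_id, now=None):
    """Issue a nonce at Preview time: the clock value plus a keyed hash of it."""
    issued = int(time.time() if now is None else now)
    message = f"{entry_id}:{issued}".encode()
    digest = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return f"{issued}:{digest}"


def verify_nonce(nonce, entry_id, now=None):
    """Check the nonce when the comment form is finally submitted."""
    current = int(time.time() if now is None else now)
    try:
        issued_text, digest = nonce.split(":", 1)
        issued = int(issued_text)
    except ValueError:
        return False  # malformed: the poster never went through Preview
    message = f"{entry_id}:{issued}".encode()
    expected = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    # Constant-time comparison of the submitted hash against the expected one.
    if not hmac.compare_digest(digest.encode(), expected.encode()):
        return False
    # Reject nonces from the future or older than the allowed Preview lifetime.
    return 0 <= current - issued <= MAX_AGE_SECONDS
```

The Preview page would embed `make_nonce(entry_id)` in a hidden form field, and the post handler would refuse any comment where `verify_nonce` returns false. Note that this sketch doesn’t stop a valid nonce being replayed within its lifetime, and a flooder who scripts the Preview step gets past it, which is the kind of cracking that forces the code changes mentioned above.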
(Note, if you are into tweaking code, check with Phil about his efforts. The code is not published online in order to impede the efforts of our interesting new challengers.)
Six Apart released one throttle with Movable Type 2.661. Unfortunately, though, it focuses on IP address, and both the comment spammers and the crapflooders have gone beyond single IP addresses now. If you look at the MTFlood code (ironically enough, the code used to create the crapflooder’s application is actually open source) you’ll see that the system uses a series of calls to proxies to get proxy IP addresses and uses these to alter the IP associated with each post. It’s very unlikely that IP-based solutions will be at all viable either now or in the foreseeable future.
Enter Jacques Distler, who back in January released a patch for the Comments module in Movable Type that throttles comment flooding. The throttle works like this: if a threshold of comments is exceeded within a single hour, comments are shut down and an error is returned for any additional comment. In addition, there is a broader throttle in effect for a 24-hour period.
(He found that a value of 20 comments per hour, 100 per day seems to work for most folks. That’s the value we have used with the patch files you’ll be able to download later. Unless you’re one of the higher ranked political pundits, these values should be effective. They can also be changed in the code.)
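For those who want to see the logic rather than just read about it, here is a rough sketch of this kind of throttle in Python. It is not Jacques’ patch, which is Perl inside Movable Type. The class name `CommentThrottle` and its methods are invented for the illustration, and I’m assuming a rolling one-hour and 24-hour window; the real patch may count differently. Only the 20 and 100 values come from the text above.

```python
import time

HOURLY_LIMIT = 20    # value quoted in the text
DAILY_LIMIT = 100    # value quoted in the text
HOUR = 60 * 60
DAY = 24 * HOUR


class CommentThrottle:
    """Site-wide throttle: counts accepted comments, ignoring the IP address."""

    def __init__(self, hourly_limit=HOURLY_LIMIT, daily_limit=DAILY_LIMIT):
        self.hourly_limit = hourly_limit
        self.daily_limit = daily_limit
        self.timestamps = []  # arrival times of accepted comments

    def _count_since(self, cutoff):
        return sum(1 for t in self.timestamps if t > cutoff)

    def allow(self, now=None):
        """Return True and record the comment, or False if a limit is hit."""
        now = time.time() if now is None else now
        # Forget anything older than a day; it no longer counts for either limit.
        self.timestamps = [t for t in self.timestamps if t > now - DAY]
        if self._count_since(now - HOUR) >= self.hourly_limit:
            return False  # hourly threshold reached: return an error instead
        if len(self.timestamps) >= self.daily_limit:
            return False  # daily threshold reached
        self.timestamps.append(now)
        return True
```

Because nothing here looks at who is posting, rotating through proxy IP addresses doesn’t help the flooder. The cost is that legitimate commenters are locked out too until the window rolls past.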
When we were hit with Trackback crapflooding last week, Jacques also wrote a patch for Trackback crapflooding. It operates in the same manner as the comment throttle–only so many per hour, so many per day.
The benefit of this type of throttle is that your site cannot be overwhelmed by getting hit with hundreds of comments or trackback pings at a time. Again, when you add in the peripheral processing such as rebuilding and emailing, this can be a strain on the server.
Now, once the throttle is in effect, it is automatically reset in either an hour or the next day, depending on which threshold you hit. Additionally, if you delete the bad comments or trackbacks, this resets the trap. Unfortunately, throttles act just as they sound–they throttle out-of-control action, but they don’t stop it completely. You can still get hit with up to 20 comments or trackbacks at a time. Though this is easier to take care of than hundreds, it’s still not trivial within Movable Type. Enter the next aspect of this overall solution: Jay Allen’s MT-Blacklist.
I’ve talked about MT-Blacklist before, and blacklisting in general. I don’t like blacklisting, and I never will. However, Jay also wrote a nice interface for managing removal of both comments and trackbacks, as well as a very nice utility that attaches a link to each email to delete the comment or trackback. In addition, a lot of people have been helped by the blacklisting action of MT-Blacklist, which has stopped our original friend, the comment spammer.
(The problem with blacklisting is the lists of blacklisted items that get passed around, which can include legitimate URLs–such as fda.gov. There’s also concern about scaling some day if the list begins to number into the thousands.)
Be aware that MT-Blacklist’s blacklisting functionality would not stop the comment or trackback crapflooder, who alternates real weblog URLs with fake URLs made up of random words and letters. Blacklisting is based on combating comment spammers, who use real URLs to real sites, but not weblogs.
In addition, Bayesian filtering, which you may have heard about in connection with email spam, won’t be effective either, because the comments themselves are built from random entries from various publications (or by stringing together unrelated words). Bayesian filtering is based on filters that learn from what is ‘good’ and what is ‘bad’ text, and adjust accordingly. There is little rhyme or reason to weblog commenting anyway, much less comment spam or comment flooding–weblogs by their very nature generate esoteric conversation.
Another suggested approach with trackbacks is to follow the link associated with the ping to the originating site and see if it exists. However, during one of the trackback attacks initiated against me, another weblogger’s posts were used as the source of the ping. In fact, the attack against me was in actuality an attack against the other weblogger.
You actually don’t have to use the blacklisting component of MT-Blacklist–you can just use the management aspect of the tool, which is what I am now doing. And for that, its help is priceless.
Between the two–crapflooder and spammer throttling and MT-Blacklist–you can, at the least, keep your site from being overwhelmed by attacks, not to mention clean up afterwards more easily. And if you use blacklisting, you can eliminate some or even most of the comment spammers’ efforts. In fact, it is the merging of several different people’s efforts that is now protecting this site, and which I will detail here.
The steps are:
- Upgrade to Movable Type 2.661. The reason for this is to add that IP throttling and the redirect if you want to deny Google access to the URLs of commenters. It’s also a good, common synching point for our efforts. If you’re concerned about the redirect operation, later on I’ll describe a plugin written by yet another contributor that will allow you to work around redirects. You can download MT 2.661 at Movable Type’s web site. In addition, find the documentation associated with this upgrade and follow it to upgrade your installation.
- Once upgraded to MT 2.661, install or upgrade to Allen’s MT-Blacklist v1.63 beta. I would hesitate to have you upgrade to a beta release, but it’s the only one that works with MT 2.661. If Jay has to change the impacted patched files, which I’ll provide later, I’ll provide updates to these and they’re very easy to install. Now, I did a fresh install of MT-Blacklist 1.63, and have had no problems using it. If you’re upgrading from MT-Blacklist 1.62, you’ll need to use the 1.63 beta upgrade package. Otherwise, use the fresh install. Jay has provided installation instructions for this, which should be trouble free. If you run into problems, check to see if Jay has provided a troubleshooting solution to your particular problem. You can also ask questions here.
If you can’t run this application, later I’ll provide patched versions of the 2.661 code itself.
- Once you’ve installed MT-Blacklist, you’ll need to download two files that have incorporated Jacques Distler’s throttle code. Once downloaded and unzipped, copy the two files–MTBlPing.pm and MTBlPost.pm–to Jay’s extended library location: /MTinstalldirectory/extlib/jayallen. Unless you want to change the throttle values–20 per hour, 100 per day–that’s it to add throttling. If you do want to change these values, open the files, search for the word ‘Throttle’, find the 20 and 100, and modify accordingly.
- Now, I don’t like the Movable Type 2.661 redirect, so what I’ve done is download and install David Raynes’ Optional-Redirect plugin. How do you install it? Download the file, unzip it and drop it into the plugins directory of your MT installation: /MTinstallationdirectory/plugins/. (There is one code change associated with this plugin – commenting out a duplicate line. I created a temporary copy of this for download for those of you who are not comfortable hacking around with Perl code.)
Also, as noted in comments associated with David’s post, if you use “spam_protect” in your individual comment template code, you’ll need to replace this with “show_email” instead. You could also alter the code, but I think the template change is a better option.
(Note, though, that you only need to use this plugin if you don’t want redirects; adding it has nothing to do with the throttling code. There is an alternative method to protect your comments from Google and thus ‘starve’ the comment spammers, which is detailed in three of Jacques’ posts: here, here, and here. However, using redirects and starving comment URLs won’t stop the crapflooders–they don’t care about Google.)
This seems like a lot of code and I would have liked to pull all this together into one installation package, but that would violate both Movable Type’s and Jay Allen’s license restrictions. Still, if you have already installed some or all of these updates, your job should be that much easier.
Hopefully these steps will help you protect your site as well as add improved comment and trackback management. They don’t provide perfect protection, but they do provide control, and right now, comments and trackbacks are out of control.
In addition, unless you get many valid comments on older posts, I still recommend turning comments off on posts 30 days old or older (adjust time to your liking). I detail how to do this with SQL here. You can also use this to turn trackbacks off by changing the column to entry_allow_pings and setting the value to zero (0).
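As a rough idea of what that update looks like, here is a sketch in Python using its built-in sqlite3 module so it can be tried safely against a scratch database. Treat the specifics as assumptions: the table name `mt_entry`, the `entry_allow_comments` and `entry_created_on` columns, the timestamp format, and the helper name `close_old_entries` are my guesses for the illustration. Only `entry_allow_pings` and the value zero come from the text above, and a real Movable Type install will most likely be on MySQL, so check your own schema (and back up) before running anything like this.

```python
import sqlite3
from datetime import datetime, timedelta

CUTOFF_DAYS = 30  # adjust to your liking

# Assumed column names; only entry_allow_pings is named in the text.
ALLOWED_COLUMNS = ("entry_allow_comments", "entry_allow_pings")


def close_old_entries(conn, now, column="entry_allow_comments"):
    """Set the given flag to 0 on entries created CUTOFF_DAYS or more ago."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unexpected column: {column}")
    cutoff = (now - timedelta(days=CUTOFF_DAYS)).strftime("%Y-%m-%d %H:%M:%S")
    # The column name is checked against a fixed list above; the date is bound
    # as a parameter rather than pasted into the SQL string.
    cursor = conn.execute(
        f"UPDATE mt_entry SET {column} = 0 WHERE entry_created_on <= ?",
        (cutoff,),
    )
    conn.commit()
    return cursor.rowcount  # number of entries matched by the update
```

Calling `close_old_entries(conn, datetime.now())` closes comments, and passing `"entry_allow_pings"` as the column does the same for trackbacks.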
These changes will not be compatible with Movable Type 3.0. When 3.0 releases, your options are: use whatever throttle and protections are included as part of that installation; just continue using the older version of Movable Type; or move to a different weblogging software package.
Until then, though, hopefully this will help. Holler if you have questions.
More discussions at Phil’s:
Throttling Down
Confidential to my Crapflooder
Also, another fix for comment XHTML for 2.66 from Jacques.
Because some people can’t run MT-Blacklist, you can also access a copy of Comments.pm and Trackback.pm from MT 2.661 that have had throttling added. Unzip and copy them to the MTinstallationdirectory/lib/MT/App/ directory. Unfortunately, though, you won’t have the comment and trackback management that MT-Blacklist provides. However, with no more than 20 comment spams at a time, you also won’t have the burden of deleting hundreds of comment spams.