Technology Weblogging

Comment spam prevention in Wordform

I believe that, eventually, most comment spam strategies will need a system-wide component to truly combat this problem: something that watches for comment spam patterns happening on a server, and throttles accordingly. However, that’s something that can’t really be handled within the application. So, I’ll focus on what I can do in Wordform.

My comment spam protections are not going to include a blacklist, in any shape or form. Blacklists require too much processing, and are too vulnerable to corruption. Instead, I’ll use a variety of techniques that, combined, should protect a site, even a heavily hit site.

First, I’ve added individual comment moderation so that you can turn moderation on for a specific post, or a group of posts. When this is turned on, a message will show near the comment form stating that the comment is currently moderated.

Next, I’m adding the ability to search for comments that fall within a range of dates, and then delete all comments that match the search criteria. With this, if you do get hit, it should be easier to delete the spam.
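Here is a minimal sketch of the date-range search and bulk delete, written in Python for illustration rather than in Wordform's own code. The Comment class, the function names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Comment:
    comment_id: int
    author: str
    body: str
    posted_at: datetime


def find_in_range(comments, start, end):
    """Return the comments posted between start and end, inclusive."""
    return [c for c in comments if start <= c.posted_at <= end]


def delete_matching(comments, matches):
    """Return the comments that remain once every match is removed."""
    doomed = {c.comment_id for c in matches}
    return [c for c in comments if c.comment_id not in doomed]


# Hypothetical sample data: one real comment and a two-comment spam burst.
comments = [
    Comment(1, "reader", "Nice post", datetime(2004, 12, 1, 9, 0)),
    Comment(2, "spammer", "Buy pills", datetime(2004, 12, 5, 3, 0)),
    Comment(3, "spammer", "Buy more pills", datetime(2004, 12, 5, 3, 1)),
]
hits = find_in_range(comments, datetime(2004, 12, 5), datetime(2004, 12, 6))
remaining = delete_matching(comments, hits)
```

Searching first and deleting second keeps the two steps separate, so the matches can be reviewed before anything is removed.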

(I’m also adding a one-touch button to globally approve, or delete, all moderated comments.)

The comment posting page will have a throttle that can be configured in options. This throttle will check the number of comments received within a certain period of time and, if the count exceeds a value that the user can specify, will either moderate the comment or deny it (again, something that can be configured). At Burningbird, the throttles are no more than ten comments in a minute (a WordPress option), and no more than 50 comments in a day (my option). These two values can be changed, and I’m also adding a maximum count for the number of comments allowed in an hour. All of this will prevent ‘crapfloods’, which can overwhelm a site, and even a server.

Currently I’m using database queries for the comment throttle I have at Burningbird, but for Wordform, I’ll be using other caching methods to hold timestamps and comment counts. This should make the throttle lightweight and robust.
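Below is a rough sketch of how such a throttle might look, in Python for illustration and holding timestamps in memory rather than querying the database. The class name, the limits dictionary, and the action strings are assumptions, not Wordform code; the ten-per-minute and 50-per-day limits are the Burningbird values described above.

```python
from collections import deque


class CommentThrottle:
    """Hypothetical in-memory throttle keyed on comment arrival times."""

    def __init__(self, limits, on_exceed="moderate"):
        # limits maps a window length in seconds to the maximum number
        # of comments accepted within that window.
        self.limits = dict(limits)
        self.on_exceed = on_exceed  # "moderate" or "deny"
        self.stamps = deque()       # arrival times of accepted comments

    def check(self, now):
        """Return "accept" or the configured action for a comment at `now`."""
        longest = max(self.limits)
        # Forget timestamps that are older than the longest window.
        while self.stamps and now - self.stamps[0] >= longest:
            self.stamps.popleft()
        for window, max_count in self.limits.items():
            recent = sum(1 for t in self.stamps if now - t < window)
            if recent >= max_count:
                return self.on_exceed
        self.stamps.append(now)
        return "accept"


# Ten comments a minute, 50 a day; an hourly limit would be one more entry.
throttle = CommentThrottle({60: 10, 86400: 50}, on_exceed="deny")
results = [throttle.check(second) for second in range(11)]
later = throttle.check(100)
```

In this sketch only accepted comments count toward the limits; counting throttled comments as well would be an equally reasonable choice.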

I’m also adding a configurable option to either close or moderate all comments on posts over a certain number of days old. I use this with Burningbird, whereby the first comment to a post over so many days old gets moderated, and then the post gets closed. This has eliminated probably about 98% of my comment spam, while still giving me the option of determining (from this last comment) whether I want to keep the post open, but moderated.
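The Burningbird behavior described above could be sketched roughly as follows, again in Python for illustration. The post dictionary, the function name, and the 14-day cutoff are assumptions; the actual number of days is a configured value.

```python
from datetime import datetime, timedelta


def handle_comment_on_post(post, now, max_age_days):
    """Return "publish", "moderate", or "reject"; closes an aged post."""
    if post["closed"]:
        return "reject"
    if now - post["published"] > timedelta(days=max_age_days):
        # First comment on an aged post: hold it for review and
        # close the post to any further comments.
        post["closed"] = True
        return "moderate"
    return "publish"


post = {"published": datetime(2004, 11, 1), "closed": False}
fresh = handle_comment_on_post(post, datetime(2004, 11, 5), max_age_days=14)
first_late = handle_comment_on_post(post, datetime(2004, 12, 1), max_age_days=14)
second_late = handle_comment_on_post(post, datetime(2004, 12, 2), max_age_days=14)
```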

A new functionality for Wordform not currently implemented at Burningbird is the ability to close a discussion. By closing a discussion, the post (or the web site) is temporarily put into a lock-down form, where only those people who have previously written published comments can add new comments. When they do, the comment is posted immediately. If a person hasn’t added a comment previously (based on the person’s email, which is a requirement for lock-down, though it’s not printed), their comment will be put into moderation.
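A minimal sketch of the lock-down check, in Python for illustration. The function name and the set of known addresses are hypothetical, and rejecting a comment with no email is my assumption about how the email requirement would be enforced.

```python
def lockdown_action(email, published_emails):
    """Decide what happens to a comment while a discussion is locked down."""
    normalized = email.strip().lower()
    if not normalized:
        # Assumption: a missing email is refused outright, since the
        # email is required during lock-down.
        return "reject"
    if normalized in published_emails:
        return "publish"   # known commenter: posted immediately
    return "moderate"      # newcomer: held for review


# Emails of people with previously published comments (sample data).
known = {"regular@example.com"}
from_regular = lockdown_action(" Regular@Example.com ", known)
from_newcomer = lockdown_action("new@example.com", known)
```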

Finally, I’m experimenting with a new comment spam prevention method that I’m calling “Stealth Mode”. However, this is one item I am leaving for a “Ta Da!” moment when I put out Wordform’s first alpha release.

(Most of these comment spam moderation techniques will also apply to trackbacks. I’m currently wavering on my support of pingback, which is really nothing more than recording a link, and this is accessible via the vanity sites.)

Between all of these–Throttle, Lock-down, individual and weblog moderation, better comment management, closing older posts, and Stealth Mode–the comment spam problem should end up being no more than a minor irritation in Wordform. Then if I can just get people to accept that comment spam is not an invasion of a person’s personal space, and that it’s a way of life and to not spend so much time fretting about it, we’ll have the comment spam problem managed.

Technology Weblogging

Progress Report

I am in the midst of converting Wordform’s architecture to support multiple weblogs. The procedure I worked out, over coffee at Borders, is the following:

Pull the SQL statements out of all the application files and incorporate them into one file. The reason for this is to help me identify all of the SQL bits the application is using, and make sure none are missed. This also makes it easier to make changes to the underlying SQL in the future — as all of the database accesses will be in one spot.

Update the database. I’m adding blog identifier to most of the tables, but I’m also splitting the options table into a weblog information table and an options table, and adding some foreign key relationships. At this time, I’m using a default weblog identifier until the program pieces are in place to add weblogs.

Modify the program to set a default blog identifier, and then adjust all the functions accordingly.
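The three backend steps above might be sketched like this, using Python and an in-memory SQLite database as stand-ins for Wordform's own code and database. The table layout, the query names, and the helper functions are all hypothetical.

```python
import sqlite3

# Assumed placeholder, used until weblogs can actually be added.
DEFAULT_BLOG_ID = 1

# Every SQL statement lives in one place, so none are missed
# when the schema changes.
QUERIES = {
    "create_posts": (
        "CREATE TABLE posts ("
        "id INTEGER PRIMARY KEY, blog_id INTEGER NOT NULL, title TEXT)"
    ),
    "insert_post": "INSERT INTO posts (blog_id, title) VALUES (?, ?)",
    "titles_for_blog": "SELECT title FROM posts WHERE blog_id = ? ORDER BY id",
}


def run(conn, name, params=()):
    """Execute a named statement from the central registry."""
    return conn.execute(QUERIES[name], params)


def add_post(conn, title, blog_id=DEFAULT_BLOG_ID):
    run(conn, "insert_post", (blog_id, title))


def titles(conn, blog_id=DEFAULT_BLOG_ID):
    return [row[0] for row in run(conn, "titles_for_blog", (blog_id,))]


conn = sqlite3.connect(":memory:")
run(conn, "create_posts")
add_post(conn, "Hello")
add_post(conn, "Other weblog", blog_id=2)
```

Because every function falls back to the default identifier, existing single-weblog calls keep working while the multiple weblog pieces are added.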

Once the backend components are in place, I’ll add the front-end pieces. The first will be a section, in what used to be the Dashboard, to pick a weblog from an existing list or choose to create a new one. Picking an existing weblog will set the globally accessible weblog identifier within the administration tool.

Creating a new weblog is a bit tricky with PHP, because the application doesn’t have general write permissions. A new .htaccess file, index.php, and word-contents subdirectory need to be created in the new location of the weblog. Either the create-weblog routine will provide the how-tos or, more likely, it will have the person make the subdirectory writeable temporarily.
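The write-permission check could look something like the sketch below, shown in Python rather than PHP. The function name, the returned messages, and the placeholder file contents are assumptions; only the three names being created come from the description above.

```python
import os
import tempfile
from pathlib import Path


def create_weblog_files(target):
    """Create the starter files for a weblog, or explain what to fix."""
    target = Path(target)
    if not os.access(target, os.W_OK):
        # Missing or read-only location: ask the person to fix it.
        return "Make %s writeable temporarily, then retry." % target
    # Placeholder contents; the real files would hold the weblog setup.
    (target / ".htaccess").write_text("# rewrite rules go here\n")
    (target / "index.php").write_text("<?php // weblog entry point\n")
    (target / "word-contents").mkdir(exist_ok=True)
    return "created"


demo_dir = tempfile.mkdtemp()  # stands in for the new weblog location
result = create_weblog_files(demo_dir)
```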

Other than the tricky bits, the rest of the weblog creation is simple — just data collection.

The base installation of Wordform is very simple, meeting the needs of most users. The multiple weblog capability is being added to the code, but the actual front-end pieces are being created as a Wordform extension: pages that can be dropped into the tool’s administrative interface. The capability for this also requires several backend code changes.

First, the post status and comment status are being pulled in from the database, to make these adjustable via extension or plugin. Next, the menu data that runs the top navigation tags for the application is also being pulled in from a database, again so these can be easily updated with administration extensions. Finally, the former dashboard is being modified in a couple of different ways.

First, the list of extensions is displayed, with an option to uninstall each. (Unlike plugins, administration extensions can be installed or uninstalled, but can’t be turned on and off.) Next, the main area of the page is dynamic, just like the weblog posts themselves. With this, extension developers can create content for this area, hooking their extension in as needed. For instance, the multi-weblog extension will add code to list the weblogs, allowing the person to select from the list. This list will be filtered to just those weblogs the person has been given access to.
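One way to picture the dynamic dashboard area is as a list of hooks that extensions register with. The sketch below is in Python for illustration; the registry, the function names, and the weblog and user names are all hypothetical.

```python
# Callables that each contribute lines to the dashboard's main area.
dashboard_hooks = []


def register_dashboard_content(render):
    """Let an extension hook its own content into the dashboard."""
    dashboard_hooks.append(render)


def render_dashboard(user):
    """Collect the lines every registered extension produces for a user."""
    return [line for hook in dashboard_hooks for line in hook(user)]


# Sample data: which users have been given access to which weblog.
WEBLOG_ACCESS = {
    "Main": {"admin"},
    "Photos": {"admin", "guest"},
}


def list_weblogs(user):
    """Multi-weblog extension: list only the weblogs this user can access."""
    return [name for name, users in sorted(WEBLOG_ACCESS.items()) if user in users]


register_dashboard_content(list_weblogs)
```

With this in place, render_dashboard("guest") would list only the Photos weblog, while an administrator would see both.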

The multiple weblog extension itself will consist of a couple of files that are copied to the administration subdirectory, plus one page that, when loaded, makes the appropriate database updates. Refreshing the admin site in the browser will then show the new extension, with all the appropriate backend goodies in place to use it.

My plan is to have these bits in place by New Year’s and then release a first cut of the code. All of this should be enough to make Wordform a unique product by that time.

Technology Weblogging


In case you’re curious, or see odd behavior now and again with this weblog, I’m making the code changes for Wordform directly on the source running this site.

By working on a ‘live’ site, I get to test the changes as they’re made. More than that, this forces me to be very careful with my changes — to make sure that I don’t remove one bit of code until another is in place to replace it. This, in turn, ensures that I’m less likely to introduce bugs, though there may be an odd–but soon fixed–break now and again.

Besides — it makes life more interesting.