Categories
Political Weblogging Writing

Like to Like

Recovered from the Wayback Machine.

The new group weblog Open Source Politics had its first week, and one can’t help but applaud the quality of much of the writing that’s come out of this effort. Of course, I know two of the people contributing, Mike Golby and Loren Webster, and have enjoyed their writing for some time so this isn’t surprising.

As much as I applaud any effort that gets Mike and Loren the readership they both deserve, I’m still not fond of these group weblogs. Rather than expose important issues to a new audience, the decidedly liberal nature of this weblog will either attract those who are liberal in the first place and likely to agree; or attract those from the opposite viewpoint, looking for cannon fodder. I think those who are more neutral are going to be pushed away by the tone and focus of the weblog.

I can see wanting to move political commentary from your weblog if you want to focus on writing or poetry or linguistics or some other specialized topic. However, if you’re hoping to influence people who haven’t made up their minds on specific issues, wouldn’t a more effective approach be to focus on writing or poetry or linguistics or technology, with an occasional sneaky aside into politics?

Speaking of politics, yesterday was the first of the Democratic candidates’ debates, and though I know that people wanted fireworks, I thought all of the candidates were very responsible in focusing on Bush rather than each other. No candidate is perfect, but there were some good points raised. As for my political leanings, it’s no surprise I’m voting Democrat, but which specific candidate is between me and the button. I make no apologies that my focus is on removing Bush from office, rather than promoting any one of the Dems. I can say I don’t support Lieberman because of his hawkish outlook, and I don’t care for Gephardt because I think he sold us out when he stood behind Bush in the invasion of Iraq. Other than that, I’ll support whoever has the best chance to defeat Bush.

In line with recent discussion about politics and religion, the Pew Forum just released survey results relating religion and politics and voting in this country. Extremely interesting reading.

For instance, from what I can see from these results I would be more likely to be voted president if I were Muslim than atheist. In this country, we’re willing to concede a shared heritage from Abraham – reluctantly – but the non-religious, and one has to assume the polytheist and animist and other outsiders, need not apply.

In fact, unless there’s a drastic change in culture and attitudes about religion, it will be a cold day in hell before an atheist is voted President in the United States.

Categories
Weblogging

Spammers: Getting to know you

Probably my last entry in this recent series of Weblog/Email Spammers: Evil thereof. I only write this entry because some of you have implemented my little comment hack, and have most likely found it’s not working with our friend, vig-rx.

My latest hit from this particular fiend was based on a Google search for the following:

blog August 2003 Name: URL: Comments:

In other words, if the page follows a consistent Movable Type template, which shows the month and year of posting, as well as containing the traditional form and comment field labels of Name, URL, and Comments, and had the word “blog” somewhere in the page (such as the title), you were visited. What the spam automation is doing – guesswork, only, so take with a grain of salt – is grabbing the page, finding the form, finding all of the form fields (including my own little hack), and recreating a form POST with the same fields.
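The matching half of that guesswork can be sketched in a few lines. This is only an illustration of why a default-looking page turns up for that search phrase, not the spammer's actual tool; the function name and the exact patterns are assumptions.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Form labels taken from the search phrase quoted above.
FORM_LABELS = ("Name:", "URL:", "Comments:")


def looks_like_default_template(page_text):
    """Return True when a page carries every marker the search phrase relied on."""
    # The word "blog" anywhere in the page, such as the title.
    has_blog = re.search(r"\bblog\b", page_text, re.IGNORECASE) is not None
    # A month name followed by a four-digit year, as in "August 2003".
    has_month_year = re.search(rf"\b(?:{MONTHS})\s+\d{{4}}\b", page_text) is not None
    # All three untouched comment form labels.
    has_labels = all(label in page_text for label in FORM_LABELS)
    return has_blog and has_month_year and has_labels
```

A page that keeps every default marker matches; rename a single label and it no longer does.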

Targeting specific weblogging phrases makes sense because we all start with a basic set of templates for our tools and then modify them. Unfortunately, we focus on the appearance and not the content. So, for instance, we MT users leave the comment form in the same page as the individual posting, and we leave the labels the same – Name, URL, and Comment. To make things easier, we use the word ‘blog’ somewhere, such as our title, blogroll, or so on.

I don’t use ‘blog’ in my basic template at Burningbird, but do mention the word when I’m talking about blogging – which just goes to show that perhaps I need to talk about blogging less. My Practical RDF weblog isn’t getting hit by the comment spammer because I don’t maintain the comment form in the individual entry pages; the labels of Name and URL are missing (not to mention the form for scraping). Most of my other weblogs aren’t getting hit because I rarely mention the word ‘blog’ in them – other than in Weblogging for Poets – and even if I did, I’m not using the traditional date annotation with these essays. No August. No July.

Not only does this person have a decent understanding of how to use technology – using different IP addresses, timed delays between accessing the page and posting the comment, page scraping (grabbing the form fields), and most likely changing the requesting Agent so that it looks like the request is coming from a browser (IE, of course) – they have a fairly good understanding of people, and our habits. Clay Shirky, this is the type of person you should invite to your software summits.

This comment spammer is a good social software engineer, lying in wait observing us and our patterns and then crafting software that fits how we do business. Rather than get angry at this person, we should marvel at their ability to write software that is so adaptive to how we use software. Rather than tear our hair out and gnash our teeth, and block every IP from ChinaNet, China’s primary Internet provider, we should be smiling wryly at how our own habits have been used against us.

After all, the solution to this spammer, this time, is to change one label in the template – for instance, changing the label of URL to Link. All of our clever technical hacks fail but a simple human hack succeeds. Of course, as we adapt so will the hackers. There is no ultimate solution to this problem, other than eliminating comments.

When I was heavily involved with P2P technologies (Peer to Peer, such as the music sharing software), we knew that the key to making our software work would be to fit the technology to people’s behavior, not make people change to fit the technology. We need to look no further for our teachers of this type of software development and distribution than the virus writers and spam generators.

Take our recent email spam buddy that’s caused us all so much heartache. You would think that we would have learned not to open email attachments by now, but we’re still getting hit because people are still opening the emails, launching the virus contained in them, and generating yet more emails. Why do we open them? Simple – the spammers use our behaviors against us.

They pull people from contact lists and use these as senders so the names are familiar. They vary the subject line. They take advantage of open hooks within the software that’s installed by default and the operating systems on most PCs. Most of all, though, they use subject lines that trigger trusting responses within us – the use of “Thanks!” and “Wicked Screensaver”, “Details”, and so on. I wouldn’t be surprised if the spammers were collecting data from the machines of people that opened the attachments, seeing just which subject was responsible for more results.

And we make these things so easy for the Bad Guys. We use Outlook for our email on Windows because that’s what’s installed by default. We trust the identity of the senders without examining the headers. We trust our software to protect us, though the same software blocks friends as well as foe. Most of all, we fall into patterns that can be automated – such as all of us Movable Type people using a comment form that has the same labels of Name, URL, and Comment.

Recently, there’s been discussion that email is ‘broke’, though I have no idea what people mean by email in this context. Do you mean the protocols? The email applications? Or do you mean people using the software, because there’s a world of difference between looking at email from a technology perspective, and looking at email and how we use it. Yet, rather than focus on our behavior when using software, we focus on the technology and we talk about using RSS as a replacement for email, same as we talk about using htaccess and MT to block IPs of spammers from our sites. Or using my own comment hack, so easily set aside.

And all the while, the virus writers and spammers are watching us, seeing how we react, observing what we’re doing, listening to our debates – and are already hard at work writing the next generation of virus and spam generators.

Categories
Weblogging

Passive resistance

Recovered from the Wayback Machine.

Sometimes I think we technical folk are too clever for our own good. The more gimcracks we put into our tools, the more gimcracky things crawl through. We tweak just to tweak, and add far more moving parts to applications than are needed, or even desired.

In addition to making things more complicated than they need to be, we also forget that there are non-technical folks out there who don’t like to have things “done” to them. However, they’re forced into a role of passivity because we bring our shovels in and proceed to bury them with words until they retreat back into their proper role. We are the doers, they are the donees.

I started the For Poets sites specifically to bridge the gap between the technologist and non-technologist, though I lost my mad energy burst earlier and haven’t finished all the planned essays. This week without fail, unless I fail in which case not this week, Semiotics of I (with fresh inspiration from Spirited Away), and The Ten Command(ment)s of Unix. A new one, too, called Walking in Simon’s Footsteps: or What’s a nice XML boy like you doing in an RDF joint like this?.

(If the weather clears though, and the waters recede, all bets are off. I need my walks. And I have a trip to take to Texas.)

The thing is, a passive role for non-techies isn’t always the fault of the Alpha Geeks; non-techies need to make a choice about how passive they’re going to be. When a techie says do this or that, the non-techie should ask why, and keep asking why until they understand it. No one is incapable of understanding the basics of online technology, if they’re interested enough, and persistent enough. Besides, aren’t all you non-techs getting tired of being donees?

If you’re a weblogger, know the technology surrounding you, and control it, don’t let it control you.

Case in point was our little vig-rx friend. It’s easy to find weblogs to spam when they’re so accessible using simple services, at Google and half a dozen other places. Google and public RSS aggregators provide links to specific URLs, and even comments, which just makes the spammer’s job so much easier.

I talked about a quick and dirty fix for vig-rx. It uses a hidden field embedded in the comments form in my pages, existence of which is then verified when a posted comment is received by the Movable Type code. This will prevent anyone from using global comment posting based on a standard posting format for MT comments. This would, for instance, prevent the comment spam that occurred with my Faux photoblog this morning.
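The check on the receiving side can be sketched roughly as follows. This is Python rather than the Movable Type code itself, and the field name and value are hypothetical, since the post doesn't give the real ones.

```python
# Hypothetical hidden field name and value; the real hack's names are not
# given in the post.
TRAP_FIELD = "bb_trap"
TRAP_VALUE = "not-a-robot-2003"


def passes_comment_trap(form):
    """Accept a posted comment only if the hidden field arrived with the expected value.

    `form` is a dict of posted field names to values.
    """
    return form.get(TRAP_FIELD) == TRAP_VALUE
```

A post built from the stock MT comment format lacks the extra field and is rejected. A scraper that reads the form first will copy the field along with everything else.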

For MT users, and other weblog tools that use an individual entry identifier for a page name, one thing that could slow comment spamming is changing the file names of the individual entries to using keywords or entry titles – in other words, removing that tasty little entry identifier from the page name. Without including the entry identifier in the name of the page, it can’t be discovered in the page URL and used to post spam to your comments.
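As a minimal sketch of the idea, here is one way to derive a page name from an entry title instead of its identifier. The function name, the underscore separator, and the five-word limit are all assumptions; your weblog tool will have its own way of setting archive file names.

```python
import re
import unicodedata


def filename_from_title(title, max_words=5):
    """Build an archive file name from the entry title, with no entry identifier in it."""
    # Drop accents and any characters that do not survive as plain ASCII.
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Keep runs of letters and digits, lowercased.
    words = re.findall(r"[a-z0-9]+", ascii_title.lower())
    slug = "_".join(words[:max_words]) or "untitled"
    return slug + ".html"
```

Two entries with the same leading words would collide, so a real setup needs some tie-breaker that still isn't the entry identifier.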

However, this only goes so far, because any spammer with the intelligence above an amoeba can grab the HTML for your entry, find the comment form in it and dig out the entry identifier within the form. Come to think of it, any spammer can also dig out my little hidden field hack and build a comment post containing my comment form fields (and all default values). Piece of cake, I can do it, most web developers can do it. And there’s nothing illegal about this. Nothing at all. After all, we open the door, we invite people in. That’s the problem with all this stuff – it’s not brain science to do the technology. All you need is an open door to the data, and we practically beg people to take our data. Please take our data, the hits feel so good.

Yesteryear when the hordes were at the gate we pulled up the drawbridge and manned the battlements with boiling oil ready to pour. Now, the drawbridge is down and we’re using the oil to fry donuts to go with the coffee we’re giving to the barbarians we invite in.

Back to my friend vig-rx: To work around the comment spamming hacks, some folks force a time period between a specific IP address first accessing a page, and a comment being posted. The thought behind this one is that automated tools would post a comment within seconds or microseconds of accessing the page, or not access the page at all; however people have to have time to read the contents.
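The delay check those folks use might look something like this. It's a sketch, not anyone's actual hack: the class name and the 30-second threshold are assumptions, and a real version would need to expire old records.

```python
from datetime import datetime, timedelta

MIN_DELAY = timedelta(seconds=30)  # assumed threshold, not taken from any real hack


class DelayGate:
    """Track when each IP first viewed an entry and gate comments on elapsed time."""

    def __init__(self, min_delay=MIN_DELAY):
        self.min_delay = min_delay
        self.first_seen = {}  # (ip, entry_id) -> datetime of first page view

    def record_page_view(self, ip, entry_id, when):
        # Only the first view counts; later views do not reset the clock.
        self.first_seen.setdefault((ip, entry_id), when)

    def allow_comment(self, ip, entry_id, when):
        viewed = self.first_seen.get((ip, entry_id))
        if viewed is None:
            return False  # comment posted without ever loading the page
        return when - viewed >= self.min_delay
```

Anything that waits longer than the threshold sails through, whether or not a person is doing the waiting.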

Well, think again about this being a good idea. I timed the page access and the time the comment post was made last night with our friend vig-rx (really, I like this guy – he’s potentially clever). The first time there was a 2 minute delay, the second close to three minutes. Of course, this could mean ‘vig-rx’ is a person using a persona, reading the content and then posting their little hypertext link bit bucket as a sort of thank you.

Yeah, and pigs fly.

Don’t want to pick on weblog comments, only. RSS (and most likely Pie/Echo/Atom) is another open door. We’ve found that when we provide full content, our weblog entries are being posted elsewhere online, rather than being used as links to our pages. Then we find that links are being made directly to our photos – a process called hotlinking, which I discussed in a previous essay.

To prevent full content republication, we provide excerpts, which means that people who want to read the content offline, can’t; and to prevent hotlinking, we build in checks in our htaccess files to make sure images are accessed only from our own domains – also preventing photo access to our friends who are hosted at sites that don’t allow photo uploads.
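The hotlinking check boils down to looking at where the request says it came from. The real thing lives in htaccess rules; this Python sketch only restates the logic, and the host names and the choice to allow empty referrers are assumptions.

```python
from urllib.parse import urlparse

# Placeholder host names standing in for "our own domains".
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def image_request_allowed(referer, allow_empty=True):
    """Serve an image only when the referring page is on one of our own hosts."""
    if not referer:
        # Many browsers and proxies send no Referer header at all.
        return allow_empty
    host = urlparse(referer).hostname
    return host is not None and host in ALLOWED_HOSTS
```

The friend whose host doesn't allow photo uploads shows up here as just another foreign referrer, and gets refused along with the hotlinkers.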

Now there’s a new one, and, just like RSS, it’s coming from the ‘good guys’. By request, Brent Simmons is implementing HTML differences in NetNewsWire, a popular Mac-based RSS aggregator. With this, every edit you make to your writing will be persisted and color coded. In fact, it works just like Mark Pilgrim’s Winer Watch, which was the inspiration for this idea. I imagine that other aggregators will also add this feature. I can see their busy little fingers at the keyboards now.
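To give a feel for what marked edits look like, here is a small word-level diff using Python's standard difflib module. This is not how NetNewsWire or the Winer Watch does it; the bracket markers stand in for the color coding, and the function name is made up.

```python
import difflib


def mark_edits(old, new):
    """Return the new text with deletions shown as [-...-] and additions as [+...+]."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words)
    out = []
    for op, a1, a2, b1, b2 in matcher.get_opcodes():
        if op == "equal":
            out.append(" ".join(old_words[a1:a2]))
        if op in ("delete", "replace"):
            out.append("[-" + " ".join(old_words[a1:a2]) + "-]")
        if op in ("insert", "replace"):
            out.append("[+" + " ".join(new_words[b1:b2]) + "+]")
    return " ".join(out)
```

All it takes is two copies of the same entry fetched at different times, which any aggregator already has.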

The only way you can control this is to not provide content or excerpts, a solution I just implemented in my RSS files. My feeds are still perfectly valid, as neither content nor excerpts are required. Sorry for those of you who miss the excerpts in your aggregators. However, I really don’t like the concept of ‘marked edits’.
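A feed stripped down this way might be generated as below. The sketch assumes an RSS 2.0-style feed, which the post doesn't specify, and the function name and arguments are hypothetical. In RSS 2.0 the channel needs a title, link, and description, while an item only needs a title or a description.

```python
import xml.etree.ElementTree as ET


def build_title_only_feed(channel_title, channel_link, channel_description, entries):
    """Build an RSS 2.0 feed whose items carry a title and link but no content or excerpt.

    `entries` is a list of (title, link) pairs.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_description
    for title, link in entries:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
        # Deliberately no <description> on the item: nothing to republish or diff.
    return ET.tostring(rss, encoding="unicode")
```

With only titles and links in the feed, an aggregator has nothing to compare between fetches except the headline.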

Since the techs are taking away my control, I guess I’ll have to remove the data.

Now there’s discussion about using RSS for email. What we need is to find hobbies for all the techs out there so they stop tweaking with the technology, making simple things break, and using things the way they weren’t originally designed. Perhaps petit point, or maybe badminton. Meg wrote about this when she was mucking around with OPML this weekend:

Maybe if the format you’re using requires you to change it to represent your data, you’re not using the right format in the first place.

Which makes me realize that I think some of the problems we’ve had in the weblog community around formats like RSS and OPML might stem from the fact that we use them in manners for which they weren’t designed. But that seems like a topic for another day’s rant.

Meg got it in one – if you have to change the format to capture your data, perhaps you’re using the wrong format. Database people and business application developers have known this forever – it’s called business domain scope. Now, what will it take for the Alpha Geeks in this neighborhood to get it? A shot of female hormones? Or our private email encoded in RSS, pulled into an aggregator, marked for edits, attached to class penis enlargement spam, signed with the name vig-rx, posted in a comment of weblogs found from scarfing your FOAF file?

Come on, non-techs and techs both. Say, “Enough already”. Let’s spend a little time closing the barn doors before we buy more horses, shall we?

You know, I’m writing a lot about metablogging lately. Hmm. Enough already.

(BTW, I edited this posting six times after the original writing – can you imagine how colorful it would be in a Burningbird Watch?)

(Update/No/Update: Make that eight edits! I’m getting more colorful by the moment. A veritable rainbow. They’ll have to invent colors just for watching me edit. In addition to pink for deletion, green for addition, there will be a Burningbird orange for “hacked to pieces”.)

Categories
Weblogging

Using Google against us

Recovered from the Wayback Machine.

The vig-rx blog virus, otherwise known as comment spammer, is using Google against us. After stealing another IP address, as expected.

Weblogs being targeted are being found through a Google search. Example is here. Aren’t open web services a wonderful thing? Go ahead – all open comment MT weblogs on this list have this comment, if they haven’t deleted it yet. The key word in the search is Blog – any weblog title or entry with Blog, and Bob’s your uncle.

Let’s kill the Googlebot. Anybody got some rope?

Evil intentions aside, this was a great example of P2P (distributed) technologies. You know who will build the semantic web? Spammers and virus geeks, and kiddie hackers.

BTW, my comment trap should stop this one…for now.

Categories
Weblogging

DDT for comments

Recovered from the Wayback Machine.

From the trackback entries I’ve received from an old comment spamming entry, I gather the spammers have been out and about recently. I received a recent comment spam myself – a shotgun message that seems to provide links to everything your kid wants to know about, but you don’t want them to ask.

It goes by vig-rx. Ring any bells?

Even though I discussed a method for preventing these, I received the comment because I don’t currently have the comment trap (described in the post referenced by the trackbacks) enabled. Why? The reasons are simple: I’m currently adding new weblogs and there’s too much overhead for too little payback with the technique.

The comment trap requires changes to all comment forms in all templates in all weblogs. I have recently added several new weblogs, and am adding three new ones in the next week or so; that’s a lot of template changes. As all of the weblogs use the same comments application, mt-comments.cgi, either the template change is added to all weblogs and weblog pages, or it’s not used for any of them.

I could add the change, and that leads to my second reason for not using the comment trapper at this time – effort and payback. If I implement the comment trapper, it’s used with every comment to my weblogs, from either friend or foe. Though the code seems insignificant, it adds to the overall process burden on my weblog’s server; start adding up tiny little burdens and over time, you have some significant performance hits every time a person tries to post a comment.

It would be worth the performance hits if I received a lot of comment spams, but I don’t, and other than the bad nuisance ones that post a thousand comments at once, the comment spams I get aren’t much more than a minor annoyance. I see them, I delete them, end of story.

What I find more annoying is the Google searchers who search on some esoteric search phrase and post comments on old posts that are irritating and irrelevant to the post. These do not fit the criteria of ‘comment spams’, but they also don’t add a lot of value, either.

I have a couple of options for older posts. The first option is the one I’m currently using, and that is to allow the comment but filter it from my ‘recent trackback/comment’ list. I also did this with trackbacks after getting several trackbacks on old posts from Radio-based weblogs when trackbacks were enabled. However, this also filtered out the recent trackbacks because of the comment spam problem – odd how this works out – and I decided to keep the comment filtering, but eliminate the trackback filtering. For now.

Another option is one that I’m very seriously considering and that is turning off comments for older posts. Weblog writing is both ephemeral and enduring, contradictory as this may seem. Our writing rolls off the page to barely accessed archives, with faint hiccups of activity that linger a week or two from latecomers; but because of search engines and other weblog writers with long memories, the writing never completely disappears.

Have you ever been to a party and been in an animated discussion with a group of people, and someone joins the group with a comment about a conversation you were involved in 6 months ago? Unlikely in real life, but this type of activity can occur in weblogs. It’s particularly noticeable with weblogs like mine and so many others that implement some form of recent trackback/comment feature.

While I can see the value of the trackback on older posts – look how three pings have re-awakened an old conversation in response to comment spammers – I question the value of comments on old writing and old conversations. The players have moved on, the songs changed. Additionally, turning off comments for older posts provides fewer entry points into our systems for comment spammers. This is an option I’ll continue to think on.
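Closing comments on older posts comes down to a single date comparison. A minimal sketch, where the 30-day window is an assumption and not a number anyone has settled on:

```python
from datetime import date, timedelta

COMMENT_WINDOW = timedelta(days=30)  # assumed cutoff; pick whatever suits the weblog


def comments_open(posted_on, today, window=COMMENT_WINDOW):
    """Return True while an entry is still young enough to accept comments."""
    return today - posted_on <= window
```

The comment form, and the entry point it offers, simply isn't rendered once this returns False.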

Two options I won’t explore, though, are IP banning and comment registration. I find comment registration to be irritating, and have been put off more than once having to register to leave a comment. I’d rather just turn comments off.

IP banning is more troublesome, and I hope that people who’ve implemented this consider carefully the consequences. As some of you may have discovered, the recent vig-rx comment spam originated from a domain that’s part of the Asia Pacific Network. APNIC is the Asia Pacific counterpart of ARIN, which manages the IP addresses for America; it is one of the four major registries that allocate IP addresses for the world. Further lookup at APNIC shows that the IP originates with ChinaNet. In case you’re curious, ChinaNet is the major Internet backbone for China.

If you add the IP address to your .htaccess file to block it, congratulations – you’re effectively denying your weblog to people in China, because chances are, the next time someone uses that IP, it’s some student or other person out exploring or looking for information. If you add them to MT to block comments for the IP, they can still view your weblog and most likely wouldn’t leave a comment anyway; however, then you’ve added a tiny bit more CGI processing for every comment that is left.
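The trouble is easy to see once the block check is written out. This sketch uses Python's standard ipaddress module; the addresses come from a range reserved for documentation, not from any real spammer, and the function name is made up.

```python
import ipaddress


def is_blocked(client_ip, blocklist):
    """Return True if the client address falls inside any blocklist entry.

    Entries may be single addresses ("203.0.113.7") or networks ("203.0.113.0/24").
    """
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(entry) for entry in blocklist)
```

The check sees only the address. It has no way to tell yesterday's spammer from the student who is handed the same address today, and a single network entry such as a /24 shuts out 256 addresses at once.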

The problem with IP banning is that it only works with consistent IP addresses, and the only entities with consistent, unmasked IP addresses are companies who don’t use proxies and people affluent enough to have a static internet connection. It’s too easy to spoof IP addresses – originating a comment spam from one IP address, making it seem like it comes from another – and too easy to use a random connection to change the IP address next time you’re in the neighborhood with porn to sell.

An additional constraint on the effectiveness of IP banning is that people and organizations also use open proxies to access the internet so that their IP addresses aren’t exposed. The use of proxies was covered not that long ago when it was discovered that China was blocking access to Blogspot weblogs from people using IP addresses that originated in China. In fact, IP addresses from that same ChinaNet that originated the current flurry of comment spam activity.

As regards our friend, vig-rx, if lists of IP addresses are passed around weblogs, as was discussed over in comments at Liz’s weblog, and added to .htaccess files everywhere, then the Chinese government doesn’t have to censor weblogs – we’re doing it for them.