Well I've decided to protect my precious blog against self-serving spammers seeking to raise their rank in the almighty Google's eyes amongst their likewise scummy competition which peddle smut and other such wares. Perhaps you have no idea what I'm talking about, perhaps you do, but in either case I'm going to explain it anyway.
There are a variety of bots crawling along this here internet along with regular people like you and me. There are good bots, not unlike Glinda the Good Witch, such as the search engine bots which get our sites indexed in Google and the like. There are also bad bots: these are the things that pick up email addresses from websites to entice you with offers of strumpets ready to "entertain" you, pills/gadgets/contraptions to enbiggen parts which uncomfortable people seek to enbiggen, pills/gadgets/contraptions to ensmallen places which larger people seek to ensmallen, and all manner of shady deals to get money out of your pockets and into theirs. The bots in question today fall under the category of "Bad Bot" - I would smack them with a newspaper on the nose and say "NO! Bad Bot!" if I could, but the combat employed today will just have to do.
So what does this bot do? Well, it surfs the internet looking for places such as this blog where it can add content. The creator(s) of the bot made it smart enough to find blogs and fill out the forms to make their own posts in the comments area. Of course the poor, misunderstood creatures created by these cruel Frankensteins are not programmed with a brain capable of responding intelligently to the blog entry; they are only capable of saying things such as "Find pills to help your self-worth grow since you are the kind of guy who measures his self-worth by the size of his reproductive organs!"
But you want to know the interesting part? They aren't doing it for us to click on. In fact, they would be perfectly happy if no human saw their advertisement - especially if the human happened to run the blog and was aware that there is a "Delete Comment" feature. So what they typically do is post on blog entries a few weeks or months old, in hopes that the almighty GoogleBot will find their spam but meddling humans won't. The pay-off is that GoogleBot says "Hrm... This guy's site is linked on 5,000 blogs, it must be a good site for Harlots, let's rank it higher next time some pervert does such a search."
This has been an intermittent problem on my own blog. Six times now these nefarious bots have chosen me as their prey - and six times now I've fought back in the only way available via standard MovableType operations: delete the comment and ban the IP. Unfortunately IP bans can be circumvented quite easily and are thus only effective against people who don't know any better... and that's assuming they don't have dial-up, which will change their IP for them each time they dial in anyway.
So today, after deleting the worst post yet (rather than 1 link to their site via their username, there were 17 links in the comment plus the link in their username), I went looking to see if somebody had already developed a plug-in for MovableType or if I would have to develop my own workaround. Thankfully I found a solution; elsewise I would have been tempted to move over to a PHP-based script like WordPress, since if I'm going to modify code I would prefer doing it in the web language I'm most comfortable with. Sorry Brian, I know you woulda liked me to switch, better luck next time.
The solution I found was a plug-in called MT-Blacklist, which implements a comment blacklist for me. The list is updated periodically with new offending sites and also supports regex matching to offer some more generic protection. I could also use such a thing to filter out foul language or innocent words such as "bird", "milkshake" or "pudding" - the power is in my hands. I don't plan on censoring anything outside of the blacklist that is provided for me unless a spammer manages to sneak by with a "clever" substitution of an i with a 1 or something of the sort. Yes, in case you were wondering, this seemingly bulletproof solution has a flaw, one which e-mail spammers have been exploiting for a while now in an attempt to slip by the powerful Bayesian Filtering system employed by spam filters like PopFile. Thankfully PopFile has served me well despite their attempts; it learns quickly. Unfortunately for the spammer, they DO need to have their URL go to their site, so even if they slip by with a registration of a creative domain name I can still just blacklist that site for future spam attempts. Not perfect, but good enough.
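For the curious, here's a rough sketch of the idea in Python. To be clear, this is not MT-Blacklist's actual code (that's a MovableType plug-in and I haven't reproduced its internals or its list here); the patterns, the example domains and the names is_blacklisted, normalize and LOOKALIKES are all made up for illustration. It just shows a regex blacklist check, plus one possible way to catch the "swap an i for a 1" trick by undoing a few common character substitutions before matching.

```python
import re

# Hypothetical blacklist entries, made up for this example.
# Each entry is a regular expression, so it can be a literal
# domain or something more generic.
BLACKLIST_PATTERNS = [
    r"cheap-pills\.example\.com",        # one specific (fictional) spam domain
    r"[\w-]*casino[\w-]*\.example",      # any (fictional) host containing "casino"
]

COMPILED = [re.compile(p, re.IGNORECASE) for p in BLACKLIST_PATTERNS]

# A few common look-alike swaps spammers use to dodge word filters.
# This table is an assumption, not an exhaustive list.
LOOKALIKES = str.maketrans({"1": "i", "0": "o", "3": "e", "@": "a", "$": "s"})


def normalize(text):
    """Lowercase the text and undo the look-alike substitutions."""
    return text.lower().translate(LOOKALIKES)


def is_blacklisted(comment):
    """Return True if the raw or normalized comment matches any pattern."""
    candidates = (comment, normalize(comment))
    return any(p.search(c) for p in COMPILED for c in candidates)


if __name__ == "__main__":
    for sample in (
        "Visit http://cheap-pills.example.com today",
        "Visit http://cheap-p1lls.example.com today",
        "I really like pudding",
    ):
        print(sample, "->", "rejected" if is_blacklisted(sample) else "accepted")
```

Checking both the raw and the normalized text is a deliberate choice: the normalized copy catches the sneaky substitutions, while the raw copy still matches a blacklisted domain that legitimately contains a digit. The trade-off is that normalizing can occasionally turn harmless text into a match, so a real filter would probably want to hold such comments for review rather than reject them outright.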
So this is just a forewarning: if you see that your comment got rejected, it was probably flagged as potentially coming from a spambot.
Posted by Michael at June 30, 2004 12:12 PM
You never cease to amaze with your compu-squirrel powers.
Posted by: Rx King at July 1, 2004 05:37 AM
So if I make a post about penis enlargement, it won't go through? :D
Posted by: Brian at July 1, 2004 10:01 AM
It works, I tried to leave a post with what I'm assuming was a blacklisted word.
MT-Blacklist is a decent fix for an MT 2.6x install. However, I prefer WP's fix. I can have it not reject, but suspend, any posts that contain either certain words or an inordinate number of links. Lately, comment spam has come with a crazy number of links, and of the seven attempts I've had since I moved to WP, every single one has been thwarted.
Also, it should be noted that if you upgraded to MT 3.0 (which, technically, since you're a 1-author 1-blog guy, you could do for free, you just have to register) you can do some more advanced forms of moderation. I still prefer WP now, but MT was and is a great tool. It's just that I'd rather support OSS tools, even if it means buffing my PHP skills a bit.
Posted by: Brian at July 1, 2004 10:04 AM