Captcha SPAM Protection for Wordpress

My Wordpress Captcha

Spam is always a problem, no matter who you are. I should quickly say that I have nothing against people promoting their sites, or companies making people aware of special offers or products. No, the kind of spam I refuse to tolerate is the porn advertising and the Viagra pills. There are some wonderful patches for Wordpress out there that make it ever harder for the machines to start posting. Today I added a new technique that may not have been done before, which others might find useful.

I began by rewriting some of wp-login.php (which handles new user registrations, logins and forgotten passwords!) and wrote my own captcha in PHP using the GD library. I based my code on a helpful example and improved it by applying separate transformations to each individual character (random rotation, sizing, colour etc.) and then rendering each one independently, with one or two other neat fiddles to make it more effective still.
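To make the per-character idea concrete, here is a minimal sketch of how the independent transformations might be chosen. It is written in Python rather than PHP, purely for illustration, and it stops short of drawing anything: the rendering would be handed to GD or whatever image library you use. The names GlyphPlan and plan_captcha, the value ranges and the skipped look-alike characters are all my assumptions, not taken from capt.php.

```python
import random
import string
from dataclasses import dataclass


@dataclass
class GlyphPlan:
    """How one captcha character should be drawn (all ranges are illustrative)."""
    char: str
    angle_deg: int   # rotation applied to this character only
    size_px: int     # font size for this character only
    colour: tuple    # (r, g, b), kept dark so it shows on a light background
    x_offset: int    # horizontal position, with a little jitter


def plan_captcha(length=5, rng=None):
    """Pick a random code and an independent transformation for every character."""
    rng = rng or random.Random()
    # Assumption: skip look-alike characters such as 0/O and 1/I.
    alphabet = [c for c in string.ascii_uppercase + string.digits if c not in "0O1I"]
    plans = []
    x = 10
    for _ in range(length):
        size = rng.randint(22, 34)
        plans.append(GlyphPlan(
            char=rng.choice(alphabet),
            angle_deg=rng.randint(-30, 30),
            size_px=size,
            colour=tuple(rng.randint(0, 120) for _ in range(3)),
            x_offset=x + rng.randint(-3, 3),
        ))
        x += size  # advance by the glyph size so characters do not pile up
    return plans


if __name__ == "__main__":
    for glyph in plan_captcha():
        print(glyph)
```

Because every character gets its own angle, size and colour, a bot cannot undo one global transformation and read the whole code in a single pass.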

Intelligent SPAM Protection for Wordpress, Google Searching for Spam eMail Addresses

Then a friend suggested something that made all these captcha techniques look very 20th century. You see, up until now I’ve been searching Google myself for the email address of each new user, to see if stopforumspam.com or any similar site comes up. (This is a good way to quickly tell whether the new user is a machine or a person.)

At this friend’s suggestion, I made Wordpress search Google for the new user’s email address using the curl_exec code I have shared previously. If it finds ‘stopforumspam.com’ or anything similar on the first page of Google results (which is where it will be, if anywhere!) it tells the user:

You can download my modified version of wp-login.php here, although it also includes the captcha, which I have called capt.php (you will need to alter this to whatever your captcha script is called).
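For anyone who would rather see the shape of the check than read through wp-login.php, here is a rough sketch of the decision step only, again in Python for illustration. It assumes the first page of search results has already been fetched into a string (the fetching itself, done with curl_exec in my version, is not shown). The function name looks_reported and the REPORT_SITES tuple are hypothetical, and only stopforumspam.com comes from this post. It also assumes the result links appear as plain, unencoded URLs in the page.

```python
import re

# Assumption: hostnames that count as a "spam report" hit. Only
# stopforumspam.com is named in the post; extend the tuple as needed.
REPORT_SITES = ("stopforumspam.com",)


def looks_reported(results_html, report_sites=REPORT_SITES):
    """Return True if any URL in the results page points at a report site."""
    # Pull the host out of every absolute URL found anywhere in the page.
    hosts = re.findall(r"https?://([^/\s\"'<>]+)", results_html, flags=re.IGNORECASE)
    for host in hosts:
        host = host.lower().split(":")[0]  # drop any port number
        for site in report_sites:
            # Match the site itself or any subdomain (www.stopforumspam.com).
            if host == site or host.endswith("." + site):
                return True
    return False


if __name__ == "__main__":
    page = '<a href="http://www.stopforumspam.com/search?q=bot@example.com">hit</a>'
    print(looks_reported(page))
```

Matching on the host rather than on the bare text avoids a false alarm when some unrelated page merely mentions the site by name, at the cost of missing links that the search engine has URL-encoded.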

Confusing Spambots in phpBB 3 using Web Pages of Infinite Size

This same week, I also discovered a method of delivering pain to spambots. It began with my GTA San Andreas forum at www.rogerdavies.org.uk, which kept getting battered with spam (largely because I can’t be bothered to read the manual for phpBB 3 again to figure out how to tighten security!).

It suddenly hit me! Both humans and machines share this world. The human will be looking for the bright and colourful ‘post comment’ button, whereas the machine will search for the text <form action="comment.php" method="post">, but the objective remains the same.

All I needed to do was lead the machine to believe it would get what it wanted, and then play ‘keep away’ with the grand prize: the ‘submit’ button! So I updated my ‘post comment’ button to redirect the spambot to a page on the roger-davies.net server I run in my bedroom, which sends out a randomly generated web page of infinite size. I called it ‘kill.php’, just to see how they would react. There is a little text asking the human user to ‘click here to continue’, upon which they are taken to their entry, ready to start typing. Spambots cannot read as we can, and so will not follow the correct link.
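The idea of a page that never ends is easy to sketch with a generator. What follows is not kill.php itself, just a Python illustration of the same trick: send the ‘click here to continue’ link first, then keep producing random filler for as long as the client keeps reading. The name endless_page, the chunk size and the example URL passed in at the bottom are my assumptions.

```python
import html
import itertools
import random
import string


def endless_page(continue_url, rng=None, words_per_chunk=50):
    """Yield chunks of an HTML page that never finishes.

    The first chunk carries the link a human is asked to click; everything
    after it is random filler. The closing body tag is never sent.
    """
    rng = rng or random.Random()
    safe_url = html.escape(continue_url, quote=True)
    yield ("<html><body>\n"
           f'<p><a href="{safe_url}">Click here to continue</a></p>\n')
    while True:
        words = ("".join(rng.choices(string.ascii_lowercase, k=rng.randint(3, 9)))
                 for _ in range(words_per_chunk))
        yield "<p>" + " ".join(words) + "</p>\n"


if __name__ == "__main__":
    # A real server would keep writing chunks until the client hangs up;
    # here we only take the first three to show the shape of the output.
    for chunk in itertools.islice(endless_page("/posting.php?mode=post&f=1"), 3):
        print(chunk, end="")
```

A generator suits this because nothing is built in memory: each chunk is made only when the bot asks for more, so the cost to the server stays small however long the bot hangs on.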

194.8.74.67 - - [02/Apr/2009:15:31:55 +0100] "GET /kill.php?f=1 HTTP/1.0" 200 1040384
194.8.74.67 - - [02/Apr/2009:15:32:07 +0100] "GET /posting.php?mode=post&f=1 HTTP/1.0" 404 209
194.8.74.67 - - [02/Apr/2009:15:32:07 +0100] "GET /viewforum.php?f=1&start=0 HTTP/1.0" 404 211
194.8.74.67 - - [02/Apr/2009:15:32:07 +0100] "GET /viewforum.php?f=4&sid=d674a18521f87c176ae26f274f9cddff HTTP/1.0" 404 211
194.8.74.67 - - [02/Apr/2009:15:32:07 +0100] "GET /index.php HTTP/1.0" 404 207
194.8.74.67 - - [02/Apr/2009:15:32:07 +0100] "GET /index.php HTTP/1.0" 404 207
194.8.74.67 - - [02/Apr/2009:15:32:08 +0100] "GET /memberlist.php?mode=viewprofile&u=151 HTTP/1.0" 404 212

And another:

195.2.240.119 - - [02/Apr/2009:16:00:56 +0100] "GET /kill.php?f=1 HTTP/1.0" 200 696449
195.2.240.119 - - [02/Apr/2009:16:01:26 +0100] "GET /posting.php?mode=post&f=1 HTTP/1.0" 404 209
195.2.240.119 - - [02/Apr/2009:16:01:26 +0100] "GET /viewforum.php?f=1&start=0 HTTP/1.0" 404 211
195.2.240.119 - - [02/Apr/2009:16:01:27 +0100] "GET /viewforum.php?f=3&sid=80094acf85bc68bb1ab34cef4ff37b7a HTTP/1.0" 404 211
195.2.240.119 - - [02/Apr/2009:16:01:27 +0100] "GET /index.php HTTP/1.0" 404 207
195.2.240.119 - - [02/Apr/2009:16:01:27 +0100] "GET /index.php HTTP/1.0" 404 207
195.2.240.119 - - [02/Apr/2009:17:57:38 +0100] "GET /kill.php?f=1 HTTP/1.0" 200 778369

Checking my web server’s logs, I saw it was a great success! There was puzzlement from the machines, most of whom happily waited until over a megabyte of garbage had been sent out before grappling desperately for some of the other scripts, such as viewforum.php and posting.php (all integral parts of phpBB), which of course aren’t on the server they were redirected to. What I can’t tell is whether they are simply backing up a level and traversing down a different branch of links, or whether they are programmed to guess at common forum software scripts when they suspect they are inside a particular package. The Apache access logs above are for the first two that fell in. Notice that each one starts thrashing around for posting.php and viewforum.php after it has no joy with kill.php.
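If you want to measure how much garbage each bot swallowed instead of eyeballing the log, a few lines of script will do it. This is a minimal sketch in Python, assuming the log is in Apache's common log format as in the entries above. The helper name trap_bytes_per_client is made up for the example, and SAMPLE is just four of the lines from this post.

```python
import re
from collections import defaultdict

# Apache common log format: host ident user [time] "request" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def trap_bytes_per_client(lines, trap_path="/kill.php"):
    """Sum the bytes each client downloaded from the trap script."""
    totals = defaultdict(int)
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip anything that is not a well-formed entry
        path = match.group("path").split("?", 1)[0]  # ignore the query string
        if path == trap_path and match.group("size") != "-":
            totals[match.group("host")] += int(match.group("size"))
    return dict(totals)


SAMPLE = [
    '194.8.74.67 - - [02/Apr/2009:15:31:55 +0100] "GET /kill.php?f=1 HTTP/1.0" 200 1040384',
    '194.8.74.67 - - [02/Apr/2009:15:32:07 +0100] "GET /posting.php?mode=post&f=1 HTTP/1.0" 404 209',
    '195.2.240.119 - - [02/Apr/2009:16:00:56 +0100] "GET /kill.php?f=1 HTTP/1.0" 200 696449',
    '195.2.240.119 - - [02/Apr/2009:17:57:38 +0100] "GET /kill.php?f=1 HTTP/1.0" 200 778369',
]

if __name__ == "__main__":
    print(trap_bytes_per_client(SAMPLE))
```

On the four sample lines this credits 194.8.74.67 with 1040384 bytes and 195.2.240.119 with 1474818 bytes (its two visits added together).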

Needless to say, I haven’t had a single bit of spam on the forum since. I have since updated this technique with a few forms that look just like the official phpBB 3 ones, but which actually just lead the bots around in circles and back onto kill.php to suck down more of my information-garbage. So far, the average spambot will go around about 3 to 5 loops before leaving in disgust. I am hopeful that I can refine this further: styling the page with the forum’s style sheets so that the garbage text is hidden, and employing URL rewriting to create a seemingly infinite number of web pages for the spambot to be kicked around between when, in actuality, it won’t have moved from the only script it landed on. I hope to someday see a spambot that never actually leaves, but needs to be restarted.
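The URL rewriting part is still only a plan, but the page side of it could look something like the sketch below (Python again, for illustration only). One function answers for any path it is given and returns links to further made-up paths, which a rewrite rule would send straight back to the same script. The name maze_page and the /forum/topic-….html URL shape are hypothetical.

```python
import hashlib


def maze_page(path, links_per_page=4):
    """Build a page for any requested path, linking to more made-up paths.

    Every link is derived from a hash of the current path, so the same URL
    always shows the same page, yet the supply of new URLs never runs out.
    """
    links = []
    for i in range(links_per_page):
        digest = hashlib.sha256(f"{path}#{i}".encode("utf-8")).hexdigest()
        # Hypothetical URL shape; a rewrite rule would map all of these
        # back onto the single trap script.
        links.append(f"/forum/topic-{digest[:12]}.html")
    items = "\n".join(f'<li><a href="{href}">{href}</a></li>' for href in links)
    return f"<html><body><ul>\n{items}\n</ul></body></html>", links


if __name__ == "__main__":
    page, links = maze_page("/forum/start.html")
    print(page)
```

Deriving the links from a hash rather than from a random number generator means a bot that revisits a URL sees the same page again, which should look more like a real forum than a page that changes on every request.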
