‘Spam’ Archive

14 Ianuarii 2005

Dirty tricks

Mua-hahahahaha. I love this trick of Parker’s and have implemented it here. If you’ve cut-and-pasted from me, here’s how you can too.

Change

BadReferrer

to

BadReferrer=yes

Get rid of the line

deny from env=BadReferrer

Then at the end of your file add these lines:

RewriteCond %{ENV:BadReferrer} ^yes$ [NC]
RewriteCond %{HTTP_REFERER} ^(.*)$ [NC]
RewriteRule ^(.*)$ %1 [R=301,L]

And chew up some of the spammer’s bandwidth the way they want to chew ours. Beautiful work, for which I am grateful.
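For anyone who'd rather see the pieces assembled, here is a rough sketch of how the edited file might hang together. The three-word list is a shortened stand-in for your real one, and I'm assuming the referrer list sits above the rewrite rules as in the original recipe.

```apache
# Shortened placeholder list; keep your own full list here.
# The "=yes" gives the variable a value the rewrite condition can test.
SetEnvIfNoCase Referer ".*(mortgage|casino|poker).*" BadReferrer=yes

RewriteEngine On

# Only act on requests that were flagged above.
RewriteCond %{ENV:BadReferrer} ^yes$ [NC]
# Capture the whole Referer header; the last condition's capture becomes %1.
RewriteCond %{HTTP_REFERER} ^(.*)$ [NC]
# Send a permanent redirect back to the URL the spammer supplied.
RewriteRule ^(.*)$ %1 [R=301,L]
```

Note that the "deny from env=BadReferrer" line is gone: flagged requests now get a 301 instead of a 403.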

Ham and jam and spamalot

The user-agent trick from yesterday works like a charm, but if you’re not using it, Parker Morse kindly dropped me the latest referrers:

krantas|azian|mor-lite|formula42|paramountseedfarms|
reservedining|

13 Ianuarii 2005

Yet more spam

A commenter on Ann Elisabeth’s site pointed out that our ex-mediavisor spammer has a consistent User-Agent line. If you set up User-Agent blocking as I explained in this post, try adding this:

(compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.1.4322)|

I have. The danger here is that we’ll block somebody legit… but no browser software I know has that as part of its User-Agent string.
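One wrinkle worth knowing: parentheses and dots mean something special inside a regular expression, so pasted in as-is the string is treated as a group with wildcard dots rather than literal text. It still matches, just more loosely. A stricter sketch, assuming the BadUA variable from the user-agent post and that the spammer's string really does contain these exact characters:

```apache
# Backslashes make the parentheses and dots literal.
# The quotes keep the spaces from splitting the pattern.
SetEnvIfNoCase User-Agent "\(compatible; MSIE 6\.0; Windows NT 5\.2; \.NET CLR 1\.1\.4322\)" BadUA

# Assumes your "order deny,allow" line is already in place.
deny from env=BadUA
```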

And another bad referrer string:

hdic|

Eat hot 403, spamming scum!

A few more cropping up from our friends who used to be mediavisor:

ansar-|stories-on|hometeaminspection|catchathief|
sportingcolors|ingenysms|pagetwo|

Sharp eyes will note that pagetwo is a repeat from last post; I added it after posting last time, so I’m putting it in again in case someone missed it.

Spammer hot off the spam run

Well, there was a lull in the referrer spam this morning, but it’s back with a vengeance and a bunch of new spammy domains. My list:

rifp|parkviewsoccer|lvcpa|twinky|psychexams|
marshallyachts|krantas|devilofnights|rethy|tecrep|
tclighting|atlanta2000|suttonjames|nehrucollege|pagetwo

A couple of these words aren’t the whole domain, because it didn’t seem necessary and I’m a lazy typist.

Ann Elisabeth is on the case as usual; I note for her benefit that the IP 199.73.1.1 did a pretty thorough canvass of my site a little bit ago. (I suspect it’s a zombie.)

More to ban

I added free-sms to my referrer-spam list. The domain is actually send-free-sms.us.

On Ann Elisabeth’s advice, I’ve also added a total IP block for Atrivo, who appear to be Total Spam Harbor. To do the same:

deny from 69.50.160.0/19
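In case the /19 looks like line noise, here is the same line annotated. The arithmetic in the comments is mine, not Ann Elisabeth's, so check it against your own logs before trusting it.

```apache
# /19 fixes the first 19 bits of the address and leaves 13 bits free,
# which is 8192 addresses: 69.50.160.0 through 69.50.191.255.
deny from 69.50.160.0/19

# The same range written as network/netmask, if you prefer that form:
# deny from 69.50.160.0/255.255.224.0
```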

Another weird but apparently nefarious trick spammers use is asking for filenames that end in underscores. I have no idea why; they Just Do. Here’s how to 403 such requests:

RewriteRule ^.*_$ - [F,L]
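For the curious, here is that rule taken apart piece by piece, as a sketch for reading rather than anything new to install:

```apache
# ^.*    any path at all (including an empty one)
# _$     ...that ends in a literal underscore
# -      no substitution; leave the URL alone
# [F]    answer with 403 Forbidden
# [L]    stop processing further rewrite rules
RewriteRule ^.*_$ - [F,L]

# A shorter pattern matches the same requests, since an unanchored
# "_$" only cares about the final character:
# RewriteRule _$ - [F]
```

If you happen to serve real files whose names end in an underscore, this will 403 those as well.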

12 Ianuarii 2005

Raft of new blockees

Parker Morse of Flying Papers kindly sent me a list of new words to block. His list includes the top-level domain, but the technique I use makes it unnecessary (for the most part), so I’m not bothering with it. If you’re watching really closely, you’ll find that a few of these are redundant with my list (the onlinegamingassociation jerks are a subset of the mediavisor etc. scum), but that’s all right; they could pop up again somewhere else.

locators|popex|teenassearch|massearch|teensearch|2pursuit|
hq_inform|adultfriendfinder|insuranceinfo|lee-hom|itipa|find-it|buy-it|
9sekund|kylos|roxtet|spy-software|ionic-bonds|iconsurf|viagra|cialis|
tramadol|debt|consolidation|onlinegamingassociation

I noticed auctions-discounts.com in my logs today—but I didn’t have to add it to the list, as it was already being 403ed by the presence in my list of “discount”. I like it when things work!
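Why that works, roughly: each word in the list is a fragment that may appear anywhere in the referring URL. A one-word sketch (the example.org and example.net referrers in the comments are made up for illustration):

```apache
# The leading and trailing .* let "discount" match anywhere in the Referer.
SetEnvIfNoCase Referer ".*(discount).*" BadReferrer

# Flagged:  http://auctions-discounts.com/
# Flagged:  http://shoesdiscount.example.net/        (hypothetical)
# Flagged:  http://example.org/student-discount-tips (hypothetical, and legit!)
```

The last line is the price of lazy typing: short fragments can catch innocent referrers too.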

Brandimensions

Add brandimensions (note one “d”) to the referrer-spammer list.

You may also want to kill the user-agent BDFetch, which appears to be their corporate bot. You can do this with a trick similar to the referrer-spammer trick, and it's worth setting up, because there are quite a few bots out there that’ll suck you dry for no good reason. Here’s how to make it work.

Underneath your existing list of referrer spammers, add one for bad user agents:

SetEnvIfNoCase User-Agent ".*(bdfetch|npbot).*" BadUA

(I took out all but two of my own list, as they’re personal grudges rather than known spambots or surveillance-bots.)

Then underneath your existing referrer-spammer ban, add a bad-user-agent ban:

deny from env=BadUA
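Put together, the relevant stretch of the file might look something like this. The referrer list is cut down to a placeholder, and the ordering is my assumption based on the instructions above.

```apache
# Existing referrer list (shortened placeholder)
SetEnvIfNoCase Referer ".*(mortgage|casino).*" BadReferrer
# New: flag unwanted bots by their User-Agent header
SetEnvIfNoCase User-Agent ".*(bdfetch|npbot).*" BadUA

# Note: no space after the comma.
order deny,allow
deny from env=BadReferrer
deny from env=BadUA
```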

Et voilà tout. You can ban named hosts, too, if your webhost does reverse-DNS on incoming IP addresses. (If you see domain names in your web stats, then your host probably does.) The pattern should be clear by now, but just in case:

SetEnvIfNoCase Remote_Host ".*(esthost|\.gb\.com).*" BadHost
deny from env=BadHost

Note the backslashes in front of the dots up there. If you’re going to include a dot (and sometimes you’ll want to), you NEED that backslash. (Never mind why. You just do. If you really have to know, look up something on regular expressions.)

This won’t work with IP numbers. That’s not as bad as it sounds, because IP banning won’t help against this particular attack. If you’ve been reading over at Ann Elisabeth’s, you know that these scum are using zombied broadband PCs with IPs all over the map. Back in the early anti-email-spam days, we called this “whack-a-mole,” and it’s as pointless now as it was then.

However, if you do find an IP or IP block you want to get rid of—remember my idiot from Stanford? he’s still hitting me once a minute—it just takes one line apiece (stealing an IP from Ann Elisabeth):

deny from 69.50.170.122

Using the front chunks of an IP address works too:

deny from 69.50.170.

Again, I don’t recommend banning by IP number unless you’re pretty sure what you’re doing.

11 Ianuarii 2005

Latest bad referrer

Add “defunctportal” (as in “defunctportal.com”).

By the way, I left out a crucial bit about using the WordPress templates page to edit your .htaccess file. You have to chmod said file to be world-writable.

If you don’t know what that means? Stick to using your FTP client. Or, as Ann Elisabeth pointed out, those whose hosts are running Cpanel can use Cpanel’s File Manager gizmo to do the job.

Killing referrer spam

I’ve been watching Technorati on the subject of referrer spam the last few days. The blogger Ann Elisabeth has been doing excellent work ferreting out where all this is coming from, and I do recommend you go see—but that won’t so much help you stop it.

If you don’t know what referrer spam is? Ignore this post entirely or pass it on to a knowledgeable friend. But what the hey, for the rest of you, here’s what I do. Not a silver bullet—takes work—but definitely a bandwidth-reducer.

You will need:

  • Your webhost to be running the Apache webserver (not IIS).
  • FTP access to your server, an FTP client, and the skill to use it.
  • A text editor. Notepad actually will do this time. Microsoft Word won’t.
  • Some patience.

If you have access to your server logs, “recent visitor” logs, or a log-analyzer (like Analog or AWStats), that will help a lot. I will also be discussing some WordPress-specific tricks; I’ll mark them as such.

What we’re going to be doing is messing with .htaccess files. These files tell Apache (among other things) who is allowed to see a particular part of your website and how to rewrite and redirect URLs when that’s necessary. If you use WordPress and you have pretty permalinks, you’ve already messed with .htaccess, because that’s what’s making the pretty permalinks work.

BE AWARE: YOU CAN BORK YOUR WEBSITE WITH THIS. I’ve done it. (In fact, I did it two minutes ago. Go me.) How will you know your .htaccess file is borking your site? Well, usually, when you browse to your weblog’s URL you’ll get a “500 Internal Server Error” page of some sort instead of your beloved weblog.

Always, always, always keep a last-known-good version of your .htaccess file! If you’re using FTP to place your .htaccess file and you bork your site, you just upload the last-known-good file, and you’re golden.

If, on the other hand, you’re using WordPress’s Templates menu to mess with your .htaccess file and you bork your site, you probably can’t use WordPress to fix it! So what you do (well, what I do) is fire up the FTP program, grab the malfunctioning file, fix it, and re-upload it. WordPress will then behave normally. DANGER WILL ROBINSON! Don’t use WordPress’s Templates menu for this unless you’re reasonably confident you know what you’re doing, or at least can fix whatever you mess up!

Right. That said (and to it I add: don’t sue me if you bork your site; you take my advice at your own risk), onwards.

Your first decision is where to put your .htaccess file. Typically, it should go as high up in your web-folder hierarchy as possible, because it should then protect all the subfolders underneath. However, if you’re a WordPresser and your WordPress install is in a subfolder, go ahead and use the existing .htaccess file, or if you don’t have one, put one in along with your index.php file.

If you use a subdomain for your blog (as I do; the difference between http://cavlec.yarinareth.net/ and http://www.yarinareth.net/caveatlector/), you have to put your .htaccess file in a directory belonging to the subdomain. (At least on my webhost you do.) This is annoying, because if you maintain more than one WordPress blog on more than one subdomain, you have to edit .htaccess separately for every single subdomain. I haven’t found a workaround for this. Yet. If anyone has one, please let me know!

Okay, now that you know where your towel (er, .htaccess file) is, what do you put in it? At the top, you need the following two lines:

RewriteEngine On
RewriteBase /

If they’re already there, great; leave them be. These lines tell Apache that you’ll be making some rules about URLs on your site.

I now recommend that you pick up one of the tricks from Mr. Costello:

RewriteCond %{HTTP_HOST} !^yarinareth\.net$ [NC]
RewriteCond %{HTTP_REFERER} ^(.*)$ [NC]
RewriteRule ^(.*)$ %1 [R=301,L]

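Replace my website domain name in the first line above with yours. (No. Really. Do it. If you don’t, YOU WILL BORK YOUR SITE.) This won’t kill all referrer spam by a long shot, but it’ll kill the really stupid ones. Here's what it does: if the stupid referrer spammer sends a request naming a host that isn’t even your site (as a few of ’em actually do!), Apache silently tells them to go to the page they’re trying to referrer-spam you with! Cute, no? One caution: the first line matches the exact host name, so if people reach your site under more than one name (with and without www, say), the pattern has to cover every one of them, or real visitors using the other name get bounced too.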
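If your site answers to both the bare domain and the www version, a variation along these lines might serve. It is a sketch, not Mr. Costello's original: example.com is a placeholder for your domain, and I've changed the referrer pattern to require at least one character so that requests with no referrer at all are left alone.

```apache
RewriteEngine On
RewriteBase /

# Placeholder domain; the optional (www\.) covers both spellings.
# The backslashes keep the dots literal.
RewriteCond %{HTTP_HOST} !^(www\.)?example\.com$ [NC]
# Require a non-empty Referer and capture it as %1.
RewriteCond %{HTTP_REFERER} ^(.+)$
# Redirect the request to the captured referrer.
RewriteRule ^(.*)$ %1 [R=301,L]
```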

Next we’re going to deal with referrer spammers smart enough not to screw up in this fashion. And to make our lives easier when new referrer spammers come along, we’re going to use a bit of indirection. First, we’ll make a list of words or word-fragments that show up in fake referrers, and tell Apache that they’re bad. Then, we’ll tell Apache not to give anything to anybody who shows up with a referrer containing one of the words or word-fragments we’ve defined as bad. Make sense? Good.

Here’s my current list, which you are welcome to copy and paste — just get rid of all hard returns so that the entire thing is one line. Apologies for the bad language below; not precisely my fault!

SetEnvIfNoCase Referer
".*(credit|canadianlabels|8gold|texas-hold|hold-em|holdem|
fidelityfunding|condo|sportsparent|mortgage|spoodles|money|
cash|hotel|houseofseven|stmaryonline|newtruths|popwow|oiline|
flafeber|thatwhichis|tmsathai|pisoc|crepesuzette|mediavisor|
commerce|easymoney|911|\.vi|gb\.com|4free|macsurfer|teen|
pussy|discount|blogincome|lillystar|aizzo|webdevsquare|laser-eye|
escal8|xopy|vixen1|linkerdome|youradulthosting|fick|inkjet-toner|
fuck|ime.nu|perfume-cologne|italiancharmsbracelets|shoesdiscount|
psnarones|hasfun|casino|gambling|poker|porn|sex|paris|gabriola|nude|
xxx|hilton|pics|video|adminshop|devaddict|iaea|empathica|
insuranceinfo|atelebanon|handy-sms|peng|just-deals|pisx|rimpim).*"
BadReferrer

(Wow. I’ve got quite a few of ’em, haven’t I? Well, I’ve been doing this a while.)

Next, add the following lines:

order deny,allow
deny from env=BadReferrer

Be careful to stifle your automatic tendency to put a space after a comma in the first line above. THAT WILL BORK YOUR SITE. Seriously. Apache is unforgiving.
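If your host runs a newer Apache (2.4 or later), the order/deny lines are legacy syntax and may or may not be enabled. A rough equivalent in the newer style, assuming mod_authz_core is available and your host lets you use these directives in .htaccess:

```apache
# Everyone is allowed in, except requests flagged as BadReferrer.
<RequireAll>
    Require all granted
    Require not env BadReferrer
</RequireAll>
```

Use one style or the other, not both, and keep that last-known-good copy handy while you find out which one your host accepts.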

And that’s that. Any request whose referring URL contains any of the words separated by pipe (|) characters above is going to get smacked down.

What happens when the new batch of referrer spammers hits? Let’s say somebody’s desperately trying to insert example.com into your server logs. All you do is add example| to the beginning of your string o’ bad words (right after the opening parenthesis), and you’re golden.
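On a cut-down list, the before and after would look roughly like this (the three-word list is a placeholder for your real one):

```apache
# Before:
# SetEnvIfNoCase Referer ".*(credit|mortgage|casino).*" BadReferrer

# After: "example|" goes right after the opening parenthesis.
SetEnvIfNoCase Referer ".*(example|credit|mortgage|casino).*" BadReferrer
```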

What, you say, I don’t have to enter the whole URL? No, no you don’t. And in fact you probably don’t want to, because entering “mortgage” just once blocks every single stupid mortgage-shark referrer-spammer coming down the pike.

How do you tell it’s working? Check your server logs (however you do that) for HTTP code 403, which means “forbidden” and is what Apache should be serving these yobbos. To use a broader brush, you should also see your daily bandwidth drop significantly if these guys have been hitting you.

I’ve created a new category “Spam” which I intend to use to pass on additions to my personal list of bad words, as well as any other good tips I see and decide to employ. I don’t want to become a blacklist clearinghouse, so if anybody else wants to take on this job, please please be my guest.

I’ve got a few other techniques to pass on, but they’re strictly for the serious anti-blog-spammer, so I’ll close this post for now, hoping it does some good.