
2010 Link-Referral log SPAM

By Dale Reagan | May 1, 2010

Link/Referrer SPAM: This seems to have started in early 2010 (or at least that is when I first noticed it). When reviewing server logs I was seeing strange referrer links to my site: the referrer domains had no connection to the site content and made little or no sense.

After a bit of research it seems that Web SPAMMERs started using this technique several years ago, and that it grew when blogging software (e.g. WordPress, Drupal, etc.) became popular AND when bloggers (and other webmasters) decided that it was a good idea to publish their web server logs (which contained the web addresses of the SPAMMERs’ customers). At this point I am guessing that the SPAMMERs are simply hitting every web site possible in the hope that the numbers game will yield results that pay the bills.

Why are companies/people doing this?

The motivation is simple:

My solution to the problem is to block both the domains being pushed and the IP space being used to push the SPAM. Over a relatively short period of time the major players are easy to spot (they use the same or related IP space). No, I won’t mention any domains using these services, and I also will not mention the IP space being used. A mod_security solution:

  1. create a blacklist of IP addresses
  2. create a blacklist of domains using these services
  3. create rules to block access for the blacklisted IP addresses and domains

Example Mod_security rules

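######## Block 'bad' referrers
# copy the Referer request header into a transaction variable, wrapped in slashes
SecAction "phase:1,pass,nolog,setvar:tx.referer=/%{REQUEST_HEADERS.Referer}/"
# check that variable against the patterns in the blacklist file
SecRule TX:REFERER "@pmFromFile blacklist.refer.txt" "phase:1,t:lowercase,log,deny,msg:'Blacklist-Refer'"
########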

The SecAction line above creates a variable from the request data that the web server holds for the current connection. The SecRule line checks the request’s REFERER value against the patterns listed in the file ‘blacklist.refer.txt’. If a match is found then access is denied. The ‘blacklist.refer.txt’ file contains one pattern per line and can hold as many entries as needed, and yes, a large file may impact server performance. The good news: once you install and verify the rule you only need to add entries to the blacklist and restart the web server to put your new patterns into use. Care and testing are advised since you could easily end up blocking non-spammers… 🙂
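Since a careless pattern can block legitimate visitors, it may help to dry-run a blacklist against some known referrers before installing it. The following is only a rough sketch in Python, not part of mod_security: it approximates a phrase match as a case-insensitive substring test, and the helper names (load_patterns, would_block) and the sample domains are made up for illustration.

```python
def load_patterns(lines):
    """Return lowercase patterns, skipping blank lines and '#' comment lines."""
    patterns = []
    for line in lines:
        entry = line.strip().lower()
        if entry and not entry.startswith("#"):
            patterns.append(entry)
    return patterns


def would_block(referer, patterns):
    """Approximate a phrase match: True if any pattern occurs in the referer."""
    value = referer.lower()
    return any(pattern in value for pattern in patterns)


# Hypothetical blacklist contents (one pattern per line, as in the post).
patterns = load_patterns(["# test entries", "spam-example.test", "", "bad-referrer.example"])

# Hypothetical referrers: one that should be blocked, one that should not.
samples = ["http://www.SPAM-example.test/page", "https://www.example.org/"]
for sample in samples:
    verdict = "BLOCK" if would_block(sample, patterns) else "allow"
    print(verdict, sample)
```

Running known-good referrers (search engines, your own partner sites) through a check like this is a cheap way to catch a pattern that is too broad.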

########  Block 'bad' IP addresses or IP ranges
SecAction "phase:1,pass,nolog,setvar:tx.REMOTE_ADDR='/%{REMOTE_ADDR}/'"
# check IP var
SecRule TX:REMOTE_ADDR "@pmFromFile blacklist.ip.txt" "phase:1,deny,msg:'Blacklist_IP'"
########

As with the previous mod_security rule pair, the above SecAction line first sets a variable to the IP address of the visiting connection, and then the SecRule line compares that IP address against the known bad guys listed in the file ‘blacklist.ip.txt’. If there is a match then the connection is denied.
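The slashes that the SecAction wraps around the address matter because a phrase match looks for substrings. The Python sketch below is an illustration only (it is not how mod_security is implemented, and the helper names and documentation-range addresses are assumptions): it shows how a bare entry can match the wrong address, how a slash-delimited entry avoids that, and how a leading prefix can stand in for a range.

```python
def wrap(ip):
    """Mimic the SecAction: surround the client address with slashes."""
    return "/" + ip + "/"


def substring_match(value, entries):
    """True if any blacklist entry occurs anywhere inside value."""
    return any(entry in value for entry in entries)


bare_entries = ["192.0.2.1"]          # entry without delimiters
wrapped_entries = ["/192.0.2.1/"]     # entry delimited on both sides
range_entries = ["/198.51.100."]      # prefix entry covering a whole range

# A bare entry also matches 192.0.2.10, which is a different host.
print(substring_match("192.0.2.10", bare_entries))            # True
# With delimiters on both sides only the exact address matches.
print(substring_match(wrap("192.0.2.10"), wrapped_entries))   # False
print(substring_match(wrap("192.0.2.1"), wrapped_entries))    # True
# A leading-slash prefix matches every address that starts with it.
print(substring_match(wrap("198.51.100.77"), range_entries))  # True
```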

Once you identify the repeat offenders (i.e. the same or similar REFERER values, or the same or related IP addresses) you can create/install a cron process to review your server logs for new/additional SPAMMING attempts; you can then update your blacklists (if using mod_security, remember that you must restart your web server for changes to take effect).

Like many IT issues, dealing with this problem requires some time and, in this case, some skill with pattern matching (e.g. using Regular Expressions). Of course you could just ignore it (as long as you are not sharing your web logs or allowing un-audited site linking…)
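As a rough idea of what such a cron job might run, here is a minimal Python sketch. It assumes the Apache "combined" log format and a site hostname of example.org; the function names, the threshold, and the sample log lines are all invented for illustration. It tallies external referrer hosts and the client IPs that sent them, then lists the ones that repeat.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed Apache "combined" format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"'
)


def tally(lines, own_host):
    """Count external referrer hosts and the client IPs that sent them."""
    ips, hosts = Counter(), Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        host = (urlsplit(match.group("referer")).hostname or "").lower()
        # Ignore empty referrers ("-") and links from the site itself.
        if not host or host == own_host or host.endswith("." + own_host):
            continue
        hosts[host] += 1
        ips[match.group("ip")] += 1
    return ips, hosts


def repeat_offenders(counter, threshold):
    """Return the keys seen at least `threshold` times, sorted."""
    return sorted(key for key, count in counter.items() if count >= threshold)


SAMPLE = [
    '203.0.113.5 - - [01/May/2010:10:00:00 -0400] "GET / HTTP/1.1" 200 512 "http://spam-example.test/buy" "Mozilla/5.0"',
    '203.0.113.5 - - [01/May/2010:10:00:02 -0400] "GET /about HTTP/1.1" 200 734 "http://spam-example.test/buy" "Mozilla/5.0"',
    '198.51.100.9 - - [01/May/2010:10:01:00 -0400] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '198.51.100.9 - - [01/May/2010:10:01:05 -0400] "GET /about HTTP/1.1" 200 734 "http://www.example.org/" "Mozilla/5.0"',
]

ips, hosts = tally(SAMPLE, "example.org")
print("referrer hosts:", repeat_offenders(hosts, 2))
print("client IPs:", repeat_offenders(ips, 2))
```

In practice the lines would come from the real access log rather than a list, and the output is best treated as a list of candidates to review by hand before anything is added to a blacklist.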

There are other approaches but they require manual editing of web-related access files – automation is probably a better solution.


Note that the mod_security rules on this page are simple examples that may not be appropriate for your site(s) – your mileage will vary… and yes, I may be available if you need a consultant to assist you in optimizing your Apache web server.  🙂


