Optimizing Nginx Referrer Spam Protection: Efficient Blocking with Regex Patterns


Dealing with referrer spam in Nginx often leads to configuration files cluttered with repetitive blocks like this:

if ($http_referer ~* spamdomain1\.com) {
    return 444;
}
if ($http_referer ~* spamdomain2\.com) {
    return 444;
}
if ($http_referer ~* spamdomain3\.com) {
    return 444;
}

A more compact approach is to combine all blocked referrers into a single case-insensitive regular expression, with each literal dot escaped:

if ($http_referer ~* (spamdomain1\.com|spamdomain2\.com|spamdomain3\.com|semalt\.com|buttons-for-website\.com)) {
    return 444;
}
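The combined pattern above is easy to get wrong by hand once the list grows. The following is a minimal sketch, not part of the original configuration, that builds the alternation from a plain list of domains and checks sample referrers against it. The function names build_referer_regex and referer_matches are hypothetical, and grep -Ei is only an approximation of Nginx's PCRE-based ~* operator, which is close enough for simple domain patterns like these.

```shell
#!/bin/bash
# Hypothetical helper: read plain domains (one per line) on stdin and print
# a single alternation such as (semalt\.com|buttons-for-website\.com).
build_referer_regex() {
    sed -e '/^[[:space:]]*$/d' -e 's/\./\\./g' \
        | paste -sd'|' - \
        | sed -e 's/^/(/' -e 's/$/)/'
}

# Hypothetical helper: succeed if the referrer matches the pattern.
# grep -Ei stands in for nginx's case-insensitive "~*" operator.
referer_matches() {
    local pattern="$1" referer="$2"
    printf '%s\n' "$referer" | grep -Eiq -- "$pattern"
}
```

For example, printf 'semalt.com\nbuttons-for-website.com\n' | build_referer_regex prints a pattern that can be pasted into the if condition, and referer_matches lets you confirm that a known spam URL is caught before reloading Nginx.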

For better maintainability, you can store the patterns in a separate file:

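Create /etc/nginx/spam_referrers. Because the file is included inside a map block, each line must be a complete map entry: a ~* prefix for a case-insensitive regex, the pattern, the value, and a trailing semicolon:

~*spamdomain1\.com 1;
~*spamdomain2\.com 1;
~*spamdomain3\.com 1;
~*semalt\.com 1;
~*buttons-for-website\.com 1;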

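Then, in the http block of your nginx.conf (the map directive is only valid at that level), define the variable and test it in the server block: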

map $http_referer $is_spam_referrer {
    include /etc/nginx/spam_referrers;
    default 0;
}

server {
    if ($is_spam_referrer) {
        return 444;
    }
    ...
}

When dealing with large block lists (100+ domains):

  • Declare the map at the http level; Nginx does not accept the map directive inside server or location blocks
  • Consider building Nginx with PCRE JIT support and enabling it for faster regex matching
  • Group similar domains (subdomains of the same spam network) into one pattern
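Whether PCRE JIT is available depends on how the Nginx binary was built. As a rough check, assuming the build used the --with-pcre-jit configure flag, you can search the output of nginx -V for it. The helper name has_pcre_jit below is hypothetical, and it reads the text on stdin so it can also be run against saved output.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the text on stdin (the output of
# `nginx -V`) mentions the --with-pcre-jit configure flag.
has_pcre_jit() {
    grep -q -- '--with-pcre-jit'
}

# Usage on a real host (nginx -V writes to stderr, hence 2>&1):
#   nginx -V 2>&1 | has_pcre_jit && echo "PCRE JIT was compiled in"
```

If the flag is present, JIT still has to be switched on with the pcre_jit on; directive in the main (top-level) context of nginx.conf, since it is off by default.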

Example script to update block list from community sources:

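#!/bin/bash
# Assumes the downloaded list contains one plain domain per line.
wget -O /tmp/spam_referrers.txt https://example.com/spam-lists/referrers || exit 1
# Escape dots and wrap each domain as a regex map entry, e.g. "~*semalt\.com 1;"
sed -e 's/\./\\./g' -e 's/^/~*/' -e 's/$/ 1;/' /tmp/spam_referrers.txt > /etc/nginx/spam_referrers
nginx -t && nginx -s reload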

Managing referrer spam through individual if statements in Nginx quickly becomes unwieldy. Each new spam domain requires another block:

if ($http_referer ~* spamdomain1\.com) {
    return 444;
}
if ($http_referer ~* spamdomain2\.com) {
    return 444;
}
# ...and so on for hundreds of domains

A cleaner solution is Nginx's map directive, which sets one variable from another by looking the source value up in a list of patterns. Entries that start with ~* are case-insensitive regular expressions, so literal dots must be escaped:

map $http_referer $bad_referer {
    default 0;
    ~*spamdomain1\.com 1;
    ~*spamdomain2\.com 1;
    ~*referralspam\.net 1;
    ~*fake-traffic\.xyz 1;
    # Additional domains...
}

Then in your server block:

server {
    if ($bad_referer) {
        return 444;
    }
    # Other server configuration...
}
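To confirm that the block works, you can send a request with a spoofed Referer header. The sketch below assumes curl is installed and that the site is reachable over plain HTTP/1.1. Because return 444 closes the connection without sending a response, curl is expected to exit with code 52 (empty reply from server); over HTTPS or HTTP/2 the exit code may differ. The function name check_blocked is hypothetical.

```shell
#!/bin/bash
# Hypothetical smoke test: request a URL with a given Referer and succeed
# only if the server closed the connection without replying (curl exit 52).
check_blocked() {
    local url="$1" referer="$2" status=0
    curl --silent --output /dev/null --max-time 5 \
         --referer "$referer" "$url" || status=$?
    [ "$status" -eq 52 ]
}

# Usage (host name is a placeholder):
#   check_blocked http://localhost/ http://semalt.com/ && echo "blocked"
```

Running the same check with a harmless referrer should fail, which helps catch patterns that are too broad.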

For better maintainability, store domains in an external file:

# /etc/nginx/conf.d/bad_referers.map
~*semalt\.com 1;
~*buttons-for-website\.com 1;
~*social-buttons\.com 1;
~*ilovevitaly\.com 1;

Then reference it in your nginx.conf:

map $http_referer $bad_referer {
    include /etc/nginx/conf.d/bad_referers.map;
    default 0;
}

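Nginx parses the map and compiles its regular expressions once, when the configuration is loaded, and it evaluates the variable only when a request actually uses it. This keeps the server block short and replaces a long chain of if statements with a single lookup, although regex entries are still tested in order until one matches. For very large lists (1000+ domains), consider:

  • Using regex patterns to group similar domains
  • Splitting into multiple map files
  • Validating the generated file with nginx -t before every reload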

Combine with a cron job to update your block list automatically:

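# Example update script (assumes one plain domain per line in the list)
wget -O /tmp/spam_domains.txt https://example.com/spamlist.txt || exit 1
# Escape dots and wrap each domain as a case-insensitive regex map entry
sed -e 's/\./\\./g' -e 's/^/~*/' -e 's/$/ 1;/' /tmp/spam_domains.txt > /etc/nginx/conf.d/bad_referers.map
nginx -t && nginx -s reload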
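When the update runs unattended from cron, a failed or empty download should not wipe out the existing list. The following is a sketch of a more defensive variant, assuming the source list holds one plain domain per line with optional # comments. The function names domains_to_map and update_map and the script path in the crontab comment are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: convert plain domains on stdin into map entries,
# skipping blank lines and # comments and escaping literal dots.
domains_to_map() {
    grep -Ev '^[[:space:]]*(#|$)' \
        | sed -e 's/\./\\./g' -e 's/^/~*/' -e 's/$/ 1;/'
}

# Hypothetical helper: build the map file from $1 and install it at $2.
# Refuses to install an empty result and keeps the old file as $2.bak.
update_map() {
    local src="$1" dest="$2" tmp
    tmp=$(mktemp) || return 1
    domains_to_map < "$src" > "$tmp"
    if [ ! -s "$tmp" ]; then
        rm -f "$tmp"
        return 1
    fi
    [ -f "$dest" ] && cp -p "$dest" "$dest.bak"
    chmod 644 "$tmp"
    mv "$tmp" "$dest"
}

# Example crontab entry (daily at 03:00; the script path is a placeholder):
#   0 3 * * * /usr/local/bin/update-bad-referers.sh
```

A wrapper script would download the list, call update_map, and then run nginx -t && nginx -s reload, restoring the .bak copy if the configuration test fails.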