How to Create Separate Nginx Access Logs for Specific Requests Using Conditional Logging



Nginx provides robust logging through the error_log and access_log directives, but many administrators need more granular control. The standard setup writes every request to a single access log file, which makes tasks like these awkward:

  • Filtering specific request patterns (e.g., bot traffic)
  • Monitoring API endpoints separately
  • Tracking 4xx/5xx errors in isolation

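The most efficient solution combines Nginx's map directive with the if= parameter of access_log. A request is logged only when the if= value is neither "0" nor an empty string. Here's a complete implementation that splits successful responses from errors:

http {
    # 1 for 2xx/3xx responses, 0 otherwise
    map $status $loggable {
        ~^[23]  1;
        default 0;
    }

    # 1 for 4xx/5xx responses, 0 otherwise
    map $status $is_error {
        ~^[45]  1;
        default 0;
    }

    server {
        # Successful responses and redirects
        access_log /var/log/nginx/access.log combined if=$loggable;
        # Client and server errors
        access_log /var/log/nginx/special.log combined if=$is_error;
    }
}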

1. Separating Bot Traffic

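map $http_user_agent $is_bot {
    default 0;
    "~*(bot|crawl|slurp|spider)" 1;
}

# if= has no negation operator, so derive the inverse flag with a second map
map $is_bot $is_human {
    default 1;
    1       0;
}

server {
    access_log /var/log/nginx/bots.log combined if=$is_bot;
    access_log /var/log/nginx/human.log combined if=$is_human;
}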

2. Isolating API Requests

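map $request_uri $is_api {
    default 0;
    "~^/api/" 1;
}

# if=!$is_api would evaluate to the string "!0" or "!1", which always logs,
# so map the inverse explicitly
map $is_api $not_api {
    default 1;
    1       0;
}

server {
    access_log /var/log/nginx/api.log combined if=$is_api;
    access_log /var/log/nginx/main.log combined if=$not_api;
}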
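
When the split follows a URI prefix, a location-level access_log can do the same job without any map. The following is a minimal sketch rather than a drop-in replacement: it relies on the standard inheritance rule that an access_log declared inside a location replaces the ones inherited from the server level, and on requests being logged in the location where processing ends. Note that location matching uses the normalized URI rather than $request_uri, so edge cases such as internal redirects may be logged differently than with the map approach.

```nginx
server {
    # Requests that do not finish in the /api/ location are logged here
    access_log /var/log/nginx/main.log combined;

    location /api/ {
        # Declaring access_log at this level replaces the inherited
        # server-level log, so API requests are written only to api.log
        access_log /var/log/nginx/api.log combined;
    }
}
```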

For more complex scenarios, you can use variables in log paths:

map $time_iso8601 $logdate {
    default 'nodate';
    '~^(?<ymd>\d{4}-\d{2}-\d{2})' $ymd;
}

server {
    access_log /var/log/nginx/access-$logdate.log combined;
}
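
Log paths that contain variables come with constraints described in the access_log documentation: buffered writes cannot be used, the file is opened and closed on every write unless a descriptor cache is configured, and the worker process user needs permission to create files in the target directory. A minimal sketch of a setup that accounts for this, assuming a document root of /var/www/html and cache values chosen purely for illustration:

```nginx
map $time_iso8601 $logdate {
    default 'nodate';
    '~^(?<ymd>\d{4}-\d{2}-\d{2})' $ymd;
}

server {
    # Nginx checks that the request's root directory exists before creating
    # a variable-named log, so declare root at the same level as access_log
    # (this path is an assumption)
    root /var/www/html;

    # Keep descriptors of frequently used variable-named logs open instead of
    # reopening the file on every write (values are illustrative)
    open_log_file_cache max=100 inactive=30s valid=1m min_uses=2;

    access_log /var/log/nginx/access-$logdate.log combined;
}
```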

While conditional logging adds minimal overhead, be mindful of:

  • Disk I/O with multiple log files
  • Log rotation configuration
  • Open file descriptors limit
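
The points above can be addressed partly in the configuration itself. The sketch below is one possible setup, not a tuned recommendation: the buffer size, flush interval, descriptor limit, and the errors-only.log file name are all illustrative assumptions. It uses the buffer= and flush= parameters of access_log to batch writes and worker_rlimit_nofile to raise the per-worker open file limit.

```nginx
# Main context: raise the open file limit for worker processes
# (the value is illustrative)
worker_rlimit_nofile 8192;

events {}

http {
    map $status $is_error {
        ~^[45]  1;
        default 0;
    }

    server {
        listen 80;

        # Buffer log lines in memory and write them when the buffer fills
        # or after 5 seconds, whichever comes first
        access_log /var/log/nginx/access.log combined buffer=32k flush=5s;

        # Buffering and conditional logging can be combined
        access_log /var/log/nginx/errors-only.log combined buffer=32k flush=5s if=$is_error;
    }
}
```

Buffered logs appear on disk with a delay of up to the flush interval, which matters when tailing a log during debugging.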

Nginx's default logging mechanism provides two primary log files:

error_log /var/log/nginx/error.log;
access_log /var/log/nginx/access.log combined;

While these cover basic needs, many real-world scenarios require more granular control over what gets logged and where.

Nginx supports conditional logging by combining the map directive with the if= parameter of access_log (available since version 1.7.0). A request is skipped when the condition evaluates to "0" or an empty string. Here's how to implement it:

http {
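    # Match on the User-Agent header; matching on $request_uri would also
    # flag ordinary requests such as /robots.txt
    map $http_user_agent $is_crawler {
        default 0;
        ~*bot 1;
        ~*crawl 1;
        ~*spider 1;
    }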

    server {
        access_log /var/log/nginx/access.log combined;
        access_log /var/log/nginx/crawlers.log combined if=$is_crawler;
    }
}

For more complex filtering, you can combine multiple conditions:

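map "$http_user_agent:$request_uri" $loggable {
    default 1;
    # Known search bots requesting anything under /private; the .* skips
    # the rest of the User-Agent string before the ":" separator
    "~*(googlebot|bingbot|yahoo).*:/private" 0;
    ~*badcrawler 0;
}

# if= cannot compare values, so derive the inverse flag with a second map
map $loggable $suspicious {
    default 0;
    0       1;
}

server {
    access_log /var/log/nginx/clean.log combined if=$loggable;
    access_log /var/log/nginx/suspicious.log combined if=$suspicious;
}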

Here's a complete configuration example that handles blocked crawlers:

http {
    log_format crawler_fmt '$remote_addr - $http_user_agent - "$request"';

    map $http_user_agent $is_bad_crawler {
        default 0;
        ~*(AhrefsBot|SemrushBot|MJ12bot) 1;
    }

    server {
        listen 80;
        server_name example.com;

        # Main access log
        access_log /var/log/nginx/access.log combined;

        # Special log for bad crawlers
        access_log /var/log/nginx/bad_crawlers.log crawler_fmt if=$is_bad_crawler;

        # Short-circuit bad crawlers with an empty 200 response
        # (use 403 or 444 instead to reject them outright)
        if ($is_bad_crawler) {
            return 200;
        }
    }
}

When implementing conditional logging:

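  • Regex entries in a map block are tested in the order they appear, so list the most frequently matched patterns first; exact-string entries are found by hash lookup and are unaffected by order
  • Map variables are evaluated only when they are used, so declaring them costs nothing per request; memory use grows with the number of entries, so keep large maps lean
  • Route high-volume conditions to their own log files, and consider buffering them, to limit I/O contention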
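
As an illustration of the ordering point: regex entries in a map are checked top to bottom until one matches, while exact strings are resolved first through a hash lookup. The sketch below assumes, purely for the sake of example, that AhrefsBot is the most frequent crawler in your traffic; the exact-match User-Agent string is hypothetical.

```nginx
map $http_user_agent $is_bad_crawler {
    default 0;

    # Exact string: matched by hash lookup before any regex is tried
    # (this User-Agent value is a hypothetical example)
    "BadCrawler/1.0" 1;

    # Regexes: evaluated in this order, so the pattern assumed to be
    # most common comes first
    ~*AhrefsBot  1;
    ~*SemrushBot 1;
    ~*MJ12bot    1;
}
```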

After implementation, you can easily analyze the special log files:

# Count blocked crawler requests
grep -c 'AhrefsBot' /var/log/nginx/bad_crawlers.log

# Get top blocked IPs
awk '{print $1}' /var/log/nginx/bad_crawlers.log | sort | uniq -c | sort -nr