While Nginx provides robust logging capabilities through the error_log and access_log directives, many administrators need more granular control. The standard setup logs all requests to a single access log file, which becomes problematic when:
- Filtering specific request patterns (e.g., bot traffic)
- Monitoring API endpoints separately
- Tracking 4xx/5xx errors in isolation
The most efficient solution combines Nginx's map and access_log directives. Here's a complete implementation:
http {
    map $status $loggable {
        ~^[23]  1;
        default 0;
    }

    # The if= parameter has no negation operator, so derive the
    # inverse condition with a second map
    map $status $is_error {
        ~^[45]  1;
        default 0;
    }

    server {
        access_log /var/log/nginx/access.log combined if=$loggable;
        access_log /var/log/nginx/errors.log combined if=$is_error;
    }
}
1. Separating Bot Traffic
map $http_user_agent $is_bot {
    default                      0;
    "~*(bot|crawl|slurp|spider)" 1;
}

# Invert $is_bot, since if= cannot negate a variable directly
map $is_bot $is_human {
    0 1;
    1 0;
}

server {
    access_log /var/log/nginx/bots.log  combined if=$is_bot;
    access_log /var/log/nginx/human.log combined if=$is_human;
}
2. Isolating API Requests
map $request_uri $is_api {
    default   0;
    "~^/api/" 1;
}

# if=!$is_api is not valid syntax; invert via a second map instead
map $is_api $is_not_api {
    0 1;
    1 0;
}

server {
    access_log /var/log/nginx/api.log  combined if=$is_api;
    access_log /var/log/nginx/main.log combined if=$is_not_api;
}
For more complex scenarios, you can use variables in log paths:
map $time_iso8601 $logdate {
    default                       'nodate';
    '~^(?<ymd>\d{4}-\d{2}-\d{2})' $ymd;
}

server {
    access_log /var/log/nginx/access-$logdate.log combined;
}
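Note that a variable in the log path prevents Nginx from keeping a single descriptor open for the file, so each request can pay an open()/close(). The usual mitigation is open_log_file_cache; the parameter values below are illustrative, not tuned:

```nginx
http {
    # Cache descriptors for frequently written log files: keep up to 1000
    # open, close ones unused for 20s, recheck existence every 1m, and only
    # cache a file after it has been used at least twice.
    open_log_file_cache max=1000 inactive=20s valid=1m min_uses=2;
}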
While conditional logging adds minimal overhead, be mindful of:
- Disk I/O with multiple log files
- Log rotation configuration
- Open file descriptor limits
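If you add log files beyond the defaults, make sure your rotation setup covers them. A typical logrotate entry looks like the sketch below; the glob, retention, and pid path are assumptions and vary by distribution:

```
/var/log/nginx/*.log {
    daily
    rotate 14
    compress
    delaycompress
    missingok
    notifempty
    sharedscripts
    postrotate
        # USR1 tells the nginx master process to reopen its log files
        [ -f /var/run/nginx.pid ] && kill -USR1 "$(cat /var/run/nginx.pid)"
    endscript
}
```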
Nginx's default logging mechanism provides two primary log files:

    error_log  /var/log/nginx/error.log;
    access_log /var/log/nginx/access.log combined;

While these cover basic needs, many real-world scenarios require more granular control over what gets logged and where.
Nginx actually supports conditional logging through its map directive combined with the if= parameter of the access_log directive. Here's how to implement it:
http {
    # Match crawler signatures in the User-Agent header
    # (not $request_uri, which holds the requested path)
    map $http_user_agent $is_crawler {
        default  0;
        ~*bot    1;
        ~*crawl  1;
        ~*spider 1;
    }

    server {
        access_log /var/log/nginx/access.log   combined;
        access_log /var/log/nginx/crawlers.log combined if=$is_crawler;
    }
}
For more complex filtering, you can combine multiple conditions:
map "$http_user_agent:$request_uri" $loggable {
    default                              1;
    ~*(googlebot|bingbot|yahoo):/private 0;
    ~*badcrawler                         0;
}

# Comparisons like if=$loggable=0 are not supported; invert via a map
map $loggable $suspicious {
    0 1;
    1 0;
}

server {
    access_log /var/log/nginx/clean.log      combined if=$loggable;
    access_log /var/log/nginx/suspicious.log combined if=$suspicious;
}
Here's a complete configuration example that handles blocked crawlers:
http {
    log_format crawler_fmt '$remote_addr - $http_user_agent - "$request"';

    map $http_user_agent $is_bad_crawler {
        default                          0;
        ~*(AhrefsBot|SemrushBot|MJ12bot) 1;
    }

    server {
        listen 80;
        server_name example.com;

        # Main access log
        access_log /var/log/nginx/access.log combined;

        # Special log for bad crawlers
        access_log /var/log/nginx/bad_crawlers.log crawler_fmt if=$is_bad_crawler;

        # Return an empty 200 for bad crawlers
        if ($is_bad_crawler) {
            return 200;
        }
    }
}
When implementing conditional logging:
- Place common conditions first in map blocks for better performance
- Consider memory usage when creating complex map variables
- Use separate log files for high-volume conditions to prevent I/O bottlenecks
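For the high-volume case, buffered writes help alongside separate files. The sketch below assumes the $is_api map from the API example earlier; the buffer and flush values are illustrative, not tuned:

```nginx
server {
    # Batch log writes: flush when the 32k buffer fills or every 5 seconds,
    # whichever comes first
    access_log /var/log/nginx/api.log combined buffer=32k flush=5s if=$is_api;
}
```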
After implementation, you can easily analyze the special log files:
# Count blocked crawler requests
grep -c 'AhrefsBot' /var/log/nginx/bad_crawlers.log
# Get top blocked IPs
awk '{print $1}' /var/log/nginx/bad_crawlers.log | sort | uniq -c | sort -nr
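To see how the crawler_fmt fields split out, here's a quick sketch run against a fabricated sample (the sample path and entries are made up for illustration):

```shell
# Build a tiny sample in the crawler_fmt layout:
#   $remote_addr - $http_user_agent - "$request"
printf '%s\n' \
  '203.0.113.9 - AhrefsBot - "GET /page HTTP/1.1"' \
  '203.0.113.9 - AhrefsBot - "GET /other HTTP/1.1"' \
  '198.51.100.4 - MJ12bot - "GET / HTTP/1.1"' > /tmp/crawler_fmt.sample

# Hits per user agent: field 2 when splitting records on " - "
awk -F' - ' '{print $2}' /tmp/crawler_fmt.sample | sort | uniq -c | sort -nr
```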