When working with AWStats to analyze Nginx access logs, many developers face the challenge of processing multiple compressed log files (such as access.log.1.gz through access.log.40.gz). The default AWStats configuration typically points to a single uncompressed log file, leaving these valuable historical logs unanalyzed.
AWStats cannot read gzip archives natively, so gzipped logs need a decompression step before (or while) being fed to AWStats. Here's the complete workflow:
1. Locate all gzipped log files
2. Uncompress them (either temporarily or permanently)
3. Process them sequentially with Awstats
4. Optionally: recompress or delete temporary files
The most efficient method is to use `zcat` to stream uncompressed logs directly to AWStats without creating temporary files:
```bash
# Feed archives oldest-first (highest rotation number = oldest); AWStats
# silently drops records older than the ones it has already processed.
for gzfile in $(ls -1v /var/log/nginx/access.log.*.gz | tac); do
    zcat "$gzfile" | /usr/lib/cgi-bin/awstats.pl -config=yourconfig -update -LogFile=-
done
```
Key points:
- `-LogFile=-` tells AWStats to read the log from stdin
- Archives must be fed oldest-first: a plain shell glob sorts alphabetically (`access.log.10.gz` before `access.log.2.gz`), and with logrotate's numbering the highest number is the oldest file, hence the `ls -1v | tac` above
- No temporary storage needed
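To see why the version sort matters, compare plain alphabetical sorting with `sort -rV` on a hypothetical set of rotation names:

```bash
# Alphabetical sort interleaves multi-digit rotations; version sort does not
printf '%s\n' access.log.1.gz access.log.2.gz access.log.10.gz | sort
# -> access.log.1.gz, access.log.10.gz, access.log.2.gz
printf '%s\n' access.log.1.gz access.log.2.gz access.log.10.gz | sort -rV
# -> access.log.10.gz, access.log.2.gz, access.log.1.gz  (oldest first)
```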
For environments where streaming isn't possible, create a temporary combined log (the temp file needs as much free disk space as all the logs take uncompressed):
```bash
TEMPFILE=$(mktemp)
# Concatenate oldest-first so the combined file is in chronological order
for gzfile in $(ls -1v /var/log/nginx/access.log.*.gz | tac); do
    zcat "$gzfile" >> "$TEMPFILE"
done
/usr/lib/cgi-bin/awstats.pl -config=yourconfig -update -LogFile="$TEMPFILE"
rm "$TEMPFILE"
```
Create a script to handle daily log rotations and processing:
```bash
#!/bin/bash
CONFIG="yourconfig"
LOG_DIR="/var/log/nginx"
AWSTATS="/usr/lib/cgi-bin/awstats.pl"

# Process all compressed logs first, oldest rotation (highest number) to newest
find "$LOG_DIR" -name "access.log.*.gz" -print0 | sort -zrV | xargs -0 zcat | \
    $AWSTATS -config=$CONFIG -update -LogFile=-

# Process the current uncompressed log last, keeping records in chronological order
$AWSTATS -config=$CONFIG -update -LogFile="$LOG_DIR/access.log"
```
When dealing with large numbers of gzipped logs:
- Process files sequentially to avoid memory issues
- Consider using `pigz` (parallel gzip) for faster decompression, as sketched below
- Schedule processing during low-traffic periods
- Monitor disk I/O during processing
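If `pigz` is installed, it can stand in for `zcat` in any of the loops above; a minimal sketch of the streaming loop with `pigz`:

```bash
# Same streaming loop as before, but decompressing with pigz (-d) to stdout (-c)
for gzfile in $(ls -1v /var/log/nginx/access.log.*.gz | tac); do
    pigz -dc "$gzfile" | /usr/lib/cgi-bin/awstats.pl -config=yourconfig -update -LogFile=-
done
```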
To analyze logs between specific dates (e.g., for monthly reports), walk the range one day at a time:

```bash
START_DATE="2023-01-01"
END_DATE="2023-01-31"

# Assumes logrotate's dateext naming (access.log-YYYY-MM-DD.gz) and GNU date
day="$START_DATE"
while [[ ! "$day" > "$END_DATE" ]]; do
    gzfile="/var/log/nginx/access.log-${day}.gz"
    if [ -f "$gzfile" ]; then
        zcat "$gzfile" | $AWSTATS -config=$CONFIG -update -LogFile=-
    fi
    day=$(date -d "$day + 1 day" +%F)
done
```
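Once the range has been ingested, the monthly report can be generated from the AWStats database as static HTML (the output path here is arbitrary):

```bash
# Build the January 2023 report from the already-updated AWStats database
$AWSTATS -config=$CONFIG -output -staticlinks \
    -month=01 -year=2023 > /var/www/html/awstats.2023-01.html
```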
An alternative approach is to let AWStats run the decompression itself. Unlike the active `access.log`, which AWStats reads directly, gzipped historical logs (`access.log.1.gz` through `access.log.40.gz`) need the decompression step wired into the configuration.
This approach involves:
- Creating a shell script to process multiple gzipped files
- Modifying the AWStats configuration to handle compressed files
- Setting up log rotation compatibility (sketched after the cron example below)
First, configure your AWStats config file (`/etc/awstats/awstats.yourdomain.conf`); a `LogFile` value ending in `|` tells AWStats to execute that command and read its output:
LogFile="/usr/bin/zcat /var/log/nginx/access.log.*.gz |"
LogFormat=1
SiteDomain="yourdomain.com"
HostAliases="www.yourdomain.com localhost 127.0.0.1"
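To sanity-check the pipe before involving AWStats, the same command can be run by hand; it should print ordinary combined-format log lines:

```bash
# The command AWStats will execute internally via the LogFile pipe
/usr/bin/zcat /var/log/nginx/access.log.*.gz | head -n 5
```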
Create a processing script (`process_logs.sh`):
```bash
#!/bin/bash
# Process all compressed logs, oldest rotation first (ls -rv = reverse version sort)
for i in $(ls -rv /var/log/nginx/access.log.*.gz); do
    echo "Processing $i"
    /usr/lib/cgi-bin/awstats.pl -config=yourdomain -update -LogFile="zcat $i |"
done

# Process the current log last
/usr/lib/cgi-bin/awstats.pl -config=yourdomain -update -LogFile="/var/log/nginx/access.log"
```
Set up a cron job for regular processing:

```bash
0 3 * * * /path/to/process_logs.sh > /var/log/awstats_processing.log 2>&1
```
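For the log rotation compatibility mentioned above, one sketch (assuming logrotate manages the Nginx logs and keeps 40 rotations) is to update AWStats just before rotation, so no lines are lost between a rotate and the next cron run:

```
/var/log/nginx/access.log {
    daily
    rotate 40
    compress
    delaycompress
    sharedscripts
    prerotate
        /usr/lib/cgi-bin/awstats.pl -config=yourdomain -update \
            -LogFile="/var/log/nginx/access.log" > /dev/null
    endscript
    postrotate
        [ -f /var/run/nginx.pid ] && kill -USR1 "$(cat /var/run/nginx.pid)"
    endscript
}
```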
For large log collections:
- Set `DNSLookup=0` in the AWStats config to skip reverse DNS resolution, usually the biggest speed win (or `DNSLookup=2` with a `DNSStaticCacheFile` to resolve only from a static cache)
- Consider `LoadPlugin="hashfiles"` to speed up reading and writing the history files
- Use `LoadPlugin="decodeutfkeys"` for international domains
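A minimal config excerpt combining these settings (values illustrative; the directives appear in the stock `awstats.model.conf`):

```
DNSLookup=0
LoadPlugin="hashfiles"
LoadPlugin="decodeutfkeys"
```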
If you encounter errors, test the log format against a single line (without `-LogFile=-` the piped data would be ignored):

```bash
zcat /var/log/nginx/access.log.1.gz | head -n 1 | \
    /usr/lib/cgi-bin/awstats.pl -config=yourdomain -update -LogFile=- -debug=1
```
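If lines are read but not counted, AWStats can report what it dropped; `-showsteps`, `-showdropped`, and `-showcorrupted` are standard update flags:

```bash
# Explain why individual records are rejected during an update
zcat /var/log/nginx/access.log.1.gz | \
    /usr/lib/cgi-bin/awstats.pl -config=yourdomain -update -LogFile=- \
    -showsteps -showdropped -showcorrupted
```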