Dealing with massive log files that are actively being written to presents unique challenges in production environments. Unlike with static files, we need approaches that:
- Don't interfere with the logging process
- Maintain file integrity
- Minimize system resource usage
- Preserve recent log entries (typically more valuable than older ones)
For production systems, logrotate is the most robust solution:
# Sample logrotate configuration for /var/log/large-app.log
/var/log/large-app.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
    size 1G
}
Key advantages:
- copytruncate: Copies the log and then truncates the original in place, so the writing process keeps its open file handle and logging continues uninterrupted (a few entries written between the copy and the truncate can be lost)
- size-based rotation: Triggers when log reaches specified size
- compression: Reduces storage requirements for archived logs
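Before relying on a new rule, it is worth checking how logrotate will interpret it. The commands below are a minimal sketch, assuming the configuration above was saved as /etc/logrotate.d/large-app (the path is illustrative); the debug flag performs a dry run without touching the file.
# Dry run: show what logrotate would do, without rotating anything
logrotate -d /etc/logrotate.d/large-app
# Force an immediate rotation once the rule looks correct
logrotate -f /etc/logrotate.d/large-app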
When you need to quickly reduce log file size without service interruption:
# Method 1: Use tail to keep the last N lines
# (note: mv replaces the inode, so a process holding the file open
#  keeps writing to the old copy until it reopens the log)
tail -n 1000000 large.log > temp.log && mv temp.log large.log

# Method 2: Efficient in-place truncation (keeps the same inode)
: > large.log   # complete truncation
For more controlled truncation:
# Delete lines 1 through 999999, keeping line 1000000 onward
sed -i '1,999999d' large.log
For extremely large files on filesystems supporting sparse files:
# Collapse (remove) the first 1 GiB of the file in place
# (requires Linux 3.15+ and an ext4 or XFS filesystem; offset and
#  length must be multiples of the filesystem block size)
fallocate -c -o 0 -l 1G large.log

# Alternative for older systems: shrink the file to 1 GiB
# (note: this keeps the oldest 1 GiB and discards the newest entries)
truncate -s 1G large.log
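Whether the collapse operation is available depends on the kernel and the filesystem, so it is worth checking both first. A quick sketch, assuming the log lives under /var/log:
# Kernel must be 3.15 or newer for collapse-range
uname -r
# Filesystem should be ext4 or XFS for collapse-range support
stat -f -c %T /var/log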
When dealing with multi-GB files:
- Avoid reading entire files into memory
- Filesystem choice matters (ext4 handles large files better than FAT)
- Consider log rotation frequency based on write patterns
- Monitor inode usage when creating many log files (a quick check for this, and for oversized logs, is sketched below)
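The last two points are easy to check from the shell. A minimal sketch, assuming the logs live under /var/log:
# Inode usage per filesystem (watch the IUse% column)
df -i /var/log
# Largest logs, found from metadata without reading file contents into memory
du -h /var/log/*.log 2>/dev/null | sort -h | tail -n 5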
Combine these techniques with monitoring:
#!/bin/bash
# Simple size watchdog: archive and trim the log when it exceeds 1 GB
LOG_FILE="/var/log/app.log"
MAX_SIZE=$((1024*1024*1024))   # 1 GB

while true; do
    size=$(stat -c%s "$LOG_FILE")
    if [ "$size" -gt "$MAX_SIZE" ]; then
        # Archive the full log before trimming, then compress the archive
        cp "$LOG_FILE" "${LOG_FILE}-$(date +%Y%m%d).old"
        tail -n 500000 "$LOG_FILE" > "${LOG_FILE}.tmp"
        mv "${LOG_FILE}.tmp" "$LOG_FILE"
        gzip "${LOG_FILE}-$(date +%Y%m%d).old"
    fi
    sleep 3600   # check hourly
done
When dealing with production Linux servers, one common issue is managing log files that grow to several GB in size while still being actively written to by running processes. Simply deleting or truncating these files can cause application failures or data loss.
Here are several tested methods for safely reducing large log files without disrupting services:
# Method 1: Using tail to preserve recent logs
tail -n 1000000 large_log_file.log > reduced_log.log
mv reduced_log.log large_log_file.log

# Method 2: Efficient in-place truncation
: > large_log_file.log          # complete truncation
# or, equivalently:
truncate -s 0 large_log_file.log
For long-term management, implement proper log rotation:
# Sample /etc/logrotate.d/your_app configuration
/var/log/your_app/*.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    copytruncate
    size 100M
}
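copytruncate avoids having to touch the application, but it can drop a few lines written between the copy and the truncate. If your service can reopen its log files on reload, a postrotate hook is a common alternative; the following is only a sketch, and your_app.service is a placeholder for the real unit name.
# Alternative: let the application reopen its logs instead of using copytruncate
/var/log/your_app/*.log {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
    size 100M
    sharedscripts
    postrotate
        # Placeholder unit name; only works if the app reopens logs on reload
        systemctl reload your_app.service > /dev/null 2>&1 || true
    endscript
}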
When processes maintain open file handles to logs, use these approaches:
# Find processes with open handles to the log file
lsof | grep large_log_file.log
# Gracefully restart affected services after rotation
systemctl restart your_service
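A full restart is not always necessary. Many daemons (nginx and rsyslog, for example) reopen their log files on SIGHUP, so a signal may be enough; this assumes your service behaves that way, which is worth confirming in its documentation first.
# Option A: signal the unit's main process via systemd
systemctl kill -s HUP --kill-who=main your_service
# Option B: find the main PID and signal it directly
pid=$(systemctl show -p MainPID --value your_service)
kill -HUP "$pid"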
For more selective reduction while preserving important entries:
# Keep only error-level logs
grep -E 'ERROR|CRITICAL' large_log_file.log > filtered_log.log
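Filtering by severity is one option; filtering by time is another. The awk sketch below is a hypothetical example that keeps only entries from the last 24 hours, and it assumes each line begins with an ISO-8601 date (YYYY-MM-DD), which you should verify against your actual log format.
# Keep only lines whose leading date falls within the last day
# (assumes lines start with YYYY-MM-DD; GNU date is required for -d)
awk -v cutoff="$(date -d '1 day ago' +%F)" '$1 >= cutoff' large_log_file.log > recent_log.log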