When running long-term logging operations in shell scripts, we often face the risk of uncontrolled file growth. A simple command >> output.log
redirection can potentially fill your disk if left unattended. Here's how to implement robust size limiting without complex solutions.
The simplest POSIX-compliant method uses head -c
to limit bytes written:
your_command | head -c 1G > output.log
However, this terminates the pipe after reaching the limit. For continuous logging with rotation, we need better approaches.
For production systems, the standard solution is logrotate:
# /etc/logrotate.d/mylog
/var/tmp/output.log {
size 1G
rotate 5
compress
missingok
notifempty
create 644 root root
}
Run manually with logrotate -f /etc/logrotate.d/mylog
or let cron handle it.
When you need self-contained scripts without external dependencies:
#!/bin/bash
MAX_SIZE=$((1024*1024*1024)) # 1GB
LOG_FILE="/var/tmp/output.log"
# Create initial file if doesn't exist
touch "$LOG_FILE"
while true; do
your_command | tee -a "$LOG_FILE"
CURRENT_SIZE=$(stat -c%s "$LOG_FILE")
if [ "$CURRENT_SIZE" -gt "$MAX_SIZE" ]; then
mv "$LOG_FILE" "${LOG_FILE}.1"
touch "$LOG_FILE"
fi
done
The pv
utility provides excellent flow control:
your_command | pv -L 1m -s 1G > output.log
This limits both rate (-L) and total size (-s). Install with apt-get install pv
or yum install pv
.
For modern systems using journald:
# /etc/systemd/journald.conf
[Journal]
SystemMaxUse=1G
RuntimeMaxUse=1G
- Always test with small size limits first
- Consider file permissions when rotating
- For critical systems, implement monitoring beyond just size limits
- Remember that some commands buffer output differently
Every sysadmin and developer has faced this scenario: a simple debugging script left running accidentally creates massive log files that fill up the filesystem. Unlike application-specific solutions (like tcpdump's -C/-W flags), we need a universal approach that works across all commands.
Here are battle-tested methods that work on any Linux system with standard tools:
1. Using 'head' with Process Substitution
your_command | head -c 1G > output.log
Pros: Dead simple. Cons: Kills the pipe after reaching limit (no rotation).
2. The 'logrotate' Daemon Approach
# /etc/logrotate.d/yourscript
/var/tmp/output.log {
size 1G
create
rotate 1
}
More robust for long-running processes, but requires daemon setup.
3. Combined dd + tee Method
your_command | dd of=output.log bs=1M count=1024 conv=fsync
Gives precise 1GB control (1024 blocks × 1MB). Alternative version with progress:
your_command | pv -L 1m | dd of=output.log bs=1M count=1024
For production systems, consider this reusable bash function:
limit_size() {
local max_bytes=$1
local chunk_size=${2:-65536}
local bytes_written=0
while IFS= read -r -d '' line; do
(( bytes_written += ${#line} + 1 ))
if (( bytes_written > max_bytes )); then
echo "[WARN] Reached size limit of $max_bytes bytes" >&2
exit 0
fi
printf '%s\0' "$line"
done
}
# Usage:
your_command | limit_size $((1024**3)) | cat > output.log
Remember these gotchas:
- Binary vs text: 'head -c' counts bytes, 'head -n' counts lines
- Buffering: Use 'stdbuf -oL' when dealing with line-oriented tools
- Exit codes: Some methods terminate the pipeline (check with ${PIPESTATUS[0]})
Benchmarks on a 4-core VM processing 10GB of data:
- Basic head: 2.1s (fastest but least flexible)
- dd method: 2.4s
- Wrapper script: 8.7s (most flexible but heaviest)