How to Monitor and Identify Rapidly Growing Files in Linux Systems for Storage Management


When your Linux server's disk space mysteriously disappears, these commands become your forensic tools:

# Find files modified in the last 24 hours, sorted by size (largest first)
find / -type f -mtime -1 -exec ls -lh {} + 2>/dev/null | awk '{ print $9 ": " $5 }' | sort -k2 -hr

# Continuous monitoring with watch
watch -n 60 "find /var/log -type f -exec du -sh {} + | sort -hr"

Create a monitoring script to track file growth patterns:

#!/bin/bash
LOG_FILE="/var/log/space_monitor.log"

echo "[$(date)] Starting filesystem scan..." >> "$LOG_FILE"

# -print0 / read -d '' keeps filenames with spaces or newlines intact
find / -type f -size +100M -print0 2>/dev/null |
while IFS= read -r -d '' file; do
    current_size=$(du -sh "$file" | awk '{print $1}')
    echo "$file - $current_size" >> "$LOG_FILE"
done
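
Save the script, make it executable, and give it a first run (the install path here is just an example):

# Hypothetical install path; adjust to taste
chmod +x /usr/local/bin/space_monitor.sh
sudo /usr/local/bin/space_monitor.sh
tail /var/log/space_monitor.log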

For identifying actively written files (even those deleted but still open):

# Show open files whose on-disk link count is zero, i.e. deleted but still held open
lsof +L1 | grep -i deleted

# List files currently held open under /var (run repeatedly to spot growth)
lsof -Fn +D /var 2>/dev/null | grep '^n' | cut -c2- | sort -u

Common culprits and their solutions:

# Review which configs set missingok/notifempty (comments stripped)
grep -v "^#" /etc/logrotate.d/* | grep -E 'missingok|notifempty'

# Example fix for Apache logs
/var/log/apache2/*.log {
    daily
    missingok
    rotate 52
    compress
    delaycompress
    notifempty
    create 640 root adm
    sharedscripts
    postrotate
        /etc/init.d/apache2 reload > /dev/null
    endscript
}
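
Before relying on a new rule, dry-run it; logrotate's -d flag parses the config and reports what it would do without rotating anything:

# Debug/dry-run mode: nothing is actually rotated
logrotate -d /etc/logrotate.d/apache2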

For interactive analysis of storage usage trends:

# Install and capture two point-in-time scans
apt install ncdu
ncdu -o scan1.dat /  # First scan, exported to a file
sleep 3600
ncdu -o scan2.dat /  # Second scan
# ncdu has no built-in diff; browse each export and compare hotspots by hand
ncdu -f scan1.dat
ncdu -f scan2.dat

Create a service to track growth patterns:

# /etc/systemd/system/spacemon.service
[Unit]
Description=Storage growth monitor

[Service]
Type=oneshot
ExecStart=/usr/local/bin/spacemon.sh

# /etc/systemd/system/spacemon.timer
[Unit]
Description=Hourly storage check

[Timer]
OnCalendar=hourly
Persistent=true

[Install]
WantedBy=timers.target
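
After saving both units (and the spacemon.sh script they call), reload systemd and start the timer:

systemctl daemon-reload
systemctl enable --now spacemon.timer
systemctl list-timers spacemon.timer   # confirm the next scheduled run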

Every Linux sysadmin has faced this scenario: your disk space mysteriously vanishes, and df -h shows shrinking available space. The usual suspects are log files, temporary files, or poorly configured applications writing unchecked data streams. Here's how to hunt them down effectively.

For immediate visibility, run this watch/du/sort pipeline in a spare terminal:

watch -n 5 "du -ah /var/log | sort -rh | head -20"

This refreshes every 5 seconds, showing the 20 largest files and directories under /var/log (a common home for runaway logs). Adjust the path as needed.

To identify files growing between two time points:

# First snapshot
find / -type f -exec du -ah {} + 2>/dev/null > /tmp/snapshot1.txt

# After some time (e.g., 1 hour)
find / -type f -exec du -ah {} + 2>/dev/null > /tmp/snapshot2.txt

# Compare (lines starting with ">" are new or changed in the second snapshot)
diff /tmp/snapshot1.txt /tmp/snapshot2.txt | grep '^>'
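
diff is order-sensitive and only flags changed lines; to see per-file before/after sizes, a small awk join of the two snapshots works (a sketch, assuming the du output format above and no whitespace in paths):

# Print "path: old_size -> new_size" for every file that changed
awk 'NR==FNR { old[$2] = $1; next }
     ($2 in old) && old[$2] != $1 { print $2 ": " old[$2] " -> " $1 }' \
    /tmp/snapshot1.txt /tmp/snapshot2.txt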

For real-time file system events (requires inotify-tools):

# Watch /var/log recursively and report the size of each modified file
inotifywait -m -r --format '%w %f %e' -e modify,create /var/log |
while read -r path file event; do
  if [[ $event == *"MODIFY"* ]]; then
    size=$(du -h "$path$file" | cut -f1)
    echo "$(date) - $path$file modified - $size"
  fi
done
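
Recursive watches on a large tree can exhaust the kernel's inotify budget; if inotifywait complains about the upper limit on watches, check and raise the sysctl:

sysctl fs.inotify.max_user_watches
sudo sysctl fs.inotify.max_user_watches=524288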

Identify deleted files that processes are still holding open (the same lsof +L1 trick from earlier):

lsof +L1 | grep -i deleted

This finds "deleted" files still held open by processes - a common source of hidden space usage.
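
If lsof shows a process clinging to a huge deleted file, you can often reclaim the space without restarting the process by truncating through /proc; the PID and file descriptor below are hypothetical values read from the lsof output:

# Suppose lsof reports PID 1234 holding the deleted log on FD 4
: > /proc/1234/fd/4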

Verify your log rotation is properly configured:

ls -l /etc/logrotate.d/
grep -r "rotate" /etc/logrotate.d/
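
logrotate records when each log was last rotated in its state file (path varies by distro; on Debian/Ubuntu it is /var/lib/logrotate/status), which quickly shows whether rotation is actually running:

cat /var/lib/logrotate/status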

Ensure critical services have proper rotation settings like:

/var/log/nginx/*.log {
    daily
    rotate 14
    compress
    missingok
    notifempty
    create 0640 www-data adm
    sharedscripts
    postrotate
        /usr/sbin/nginx -s reload
    endscript
}

Create a monitoring script (save as /usr/local/bin/disk_growth_monitor.sh):

#!/bin/bash

THRESHOLD_MB=100
CHECK_DIRS=("/var/log" "/tmp" "/home")

for dir in "${CHECK_DIRS[@]}"; do
    find "$dir" -type f -mtime -1 -size +${THRESHOLD_MB}M -exec ls -lh {} + |
    awk '{print "ALERT: " $NF " - " $5 " - modified: " $6 " " $7 " " $8}'
done

Add to crontab:

# Only send mail when the script actually produced output
*/30 * * * * out=$(/usr/local/bin/disk_growth_monitor.sh); [ -n "$out" ] && echo "$out" | mail -s "Disk Growth Alert" admin@example.com

Don't forget container-specific issues:

docker ps -a --filter status=exited
docker volume ls -qf dangling=true
docker system df -v
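
If those show exited containers or dangling volumes, pruning reclaims the space; these commands are destructive, so review the listings above first:

docker container prune    # remove all stopped containers
docker volume prune       # remove dangling volumes
docker image prune        # remove dangling images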

For systems using journald:

journalctl --disk-usage
sudo journalctl --vacuum-size=200M
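
A time-based vacuum works too, if you would rather keep a retention window than a size cap:

sudo journalctl --vacuum-time=2weeks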

Configure limits in /etc/systemd/journald.conf:

[Journal]
SystemMaxUse=500M
RuntimeMaxUse=100M
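
Restart the journal daemon so the new limits take effect:

sudo systemctl restart systemd-journald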