Troubleshooting Rapid Root Partition Growth on CentOS: No Large Files Found





The root partition (/dev/sda3) on one of our CentOS servers was filling up by about 1% per day, yet conventional checks turned up no obviously large files. Here is how I investigated it and what fixed it.



First, let's examine the standard diagnostic commands I ran:

# Basic disk usage overview
df -h

# Traditional large file search (-x stays on the root filesystem)
du -xah / 2>/dev/null | sort -rh | head -20

# Check for deleted files still using space
lsof | grep deleted

None of these revealed the culprit. The standard tools showed normal file distribution without any single large file accounting for the growth.
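
One quick way to tell whether the missing space is visible to du at all is to compare the df and du totals directly. The sketch below assumes a Linux system with standard df and du; usage_gap_kb is a hypothetical helper name, not an existing tool. A large positive gap suggests space held by deleted-but-open files or by data hidden underneath a mount point.

```shell
# usage_gap_kb (hypothetical helper): difference in KiB between what df reports
# as used on a filesystem and what du can reach by walking the directory tree.
usage_gap_kb() {
    gap_target="${1:-/}"
    # -P forces one line per filesystem, so column 3 is always "Used"
    gap_df_used=$(df -Pk "$gap_target" | awk 'NR==2 {print $3}')
    # -x keeps du on a single filesystem; unreadable paths are skipped silently
    gap_du_used=$(du -skx "$gap_target" 2>/dev/null | cut -f1)
    echo $(( gap_df_used - gap_du_used ))
}

# Example (run as root so du can read everything):
#   usage_gap_kb /
```

A gap of a few megabytes is normal filesystem overhead; a gap of gigabytes is worth chasing with lsof.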



When standard tools fail, we need to dig deeper:

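# Check for small files accumulating rapidly: count files under 1 MiB modified
# in the last 7 days, per directory (note: -size -1M would match only empty files)
find / -xdev -type f -size -1024k -mtime -7 -printf '%h\n' 2>/dev/null | sort | uniq -c | sort -rn | head -20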

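# Monitor filesystem changes in real-time
# (the anchored regex excludes only the top-level pseudo-filesystems; a recursive
# watch on / may require raising fs.inotify.max_user_watches)
inotifywait -m -r / --exclude '^/(proc|sys|dev|run)(/|$)' -e create -e delete -e modify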

# Check for filesystem reserved space (tune2fs only works on ext2/ext3/ext4)
tune2fs -l /dev/sda3 | grep Reserved
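
Because the growth was slow (about 1% per day), comparing two du snapshots taken a day apart can point at the directory that is growing even when no single file stands out. A minimal sketch, assuming GNU du; take_snapshot and growth_report are hypothetical helper names, and the snapshot paths in the example are placeholders.

```shell
# Record per-directory usage in KiB, one filesystem only, two levels deep.
take_snapshot() {
    du -xk --max-depth=2 "$1" 2>/dev/null
}

# Print directories whose usage grew between two snapshots, largest growth first.
# Output format: <growth in KiB><TAB><directory>
growth_report() {
    awk -F'\t' '
        NR == FNR { old[$2] = $1; next }    # first file: remember old sizes
        {
            delta = $1 - old[$2]            # directories missing from the old snapshot count as 0
            if (delta > 0) printf "%d\t%s\n", delta, $2
        }
    ' "$1" "$2" | sort -rn
}

# Example:
#   take_snapshot / > /root/du-day1.tsv      # today
#   take_snapshot / > /root/du-day2.tsv      # 24 hours later
#   growth_report /root/du-day1.tsv /root/du-day2.tsv | head
```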



After extended monitoring, I discovered several subtle issues:

1. Journald logs were growing rapidly but not being rotated properly:
journalctl --disk-usage
Journal occupies 3.2G on disk.

Solution: Configure journald limits in /etc/systemd/journald.conf, then apply them with systemctl restart systemd-journald:
[Journal]
SystemMaxUse=500M
RuntimeMaxUse=200M

2. Docker containers were creating numerous small temporary files:
docker system df
docker ps --size

Solution: Implement a cleanup policy (note that -a removes all unused images, not only dangling ones):
docker system prune -a --filter "until=24h"
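
To keep an eye on the journal after applying the limits from item 1, the size printed by journalctl --disk-usage can be converted to a number and compared against a budget. A sketch, assuming the size appears as a number followed by a one-letter binary unit (as in the "3.2G" output above); journal_mb is a hypothetical helper name.

```shell
# journal_mb (hypothetical helper): read journalctl --disk-usage output on stdin
# and print the reported size in whole MiB.
journal_mb() {
    grep -oE '[0-9]+(\.[0-9]+)?[BKMGT]' | head -1 | awk '
        {
            unit  = substr($0, length($0))          # trailing unit letter
            value = substr($0, 1, length($0) - 1)   # numeric part
            factor["B"] = 1 / 1048576; factor["K"] = 1 / 1024
            factor["M"] = 1; factor["G"] = 1024; factor["T"] = 1048576
            printf "%d\n", value * factor[unit]     # truncated to whole MiB
        }'
}

# Example: warn when the journal exceeds a 500 MiB budget
#   size=$(journalctl --disk-usage | journal_mb)
#   [ "$size" -gt 500 ] && echo "journal is ${size} MiB"
```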



To prevent recurrence, I implemented this monitoring script:

#!/bin/bash
THRESHOLD=80
PARTITION="/dev/sda3"

# Ask df about this device only; grepping the full table can match several rows
CURRENT=$(df --output=pcent "$PARTITION" | tail -1 | tr -d '% ')

if [ "$CURRENT" -gt "$THRESHOLD" ]; then
    echo "WARNING: $PARTITION is at ${CURRENT}% capacity" | mail -s "Disk Alert" admin@example.com
    # Automatic cleanup actions
    journalctl --vacuum-size=200M
    docker system prune -f --filter "until=24h"
fi



Some additional measures I implemented:

1. Created a separate /var partition to isolate log growth
2. Implemented logrotate configurations for all services
3. Set up daily disk usage reports via cron:
0 3 * * * (date; df -h) >> /var/log/disk_usage.log
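
Raw df -h dumps are hard to trend. If a cron job instead records a timestamp and a usage percentage on each line, a linear projection of the time until the disk fills is straightforward. A sketch under that assumption; log_usage and days_until_full are hypothetical helpers and /var/log/disk_pct.log is a placeholder path.

```shell
# Append "<epoch seconds> <used percent>" for a mount point to a log file.
log_usage() {
    printf '%s %s\n' "$(date +%s)" \
        "$(df -P "$1" | awk 'NR==2 {sub(/%/, "", $5); print $5}')" >> "$2"
}

# Project days until 100% from the first and last samples in the log.
# Prints "n/a" when there is no growth or fewer than two distinct samples.
days_until_full() {
    awk '
        NR == 1 { t0 = $1; p0 = $2 }
        { t1 = $1; p1 = $2 }
        END {
            days = (t1 - t0) / 86400
            if (days <= 0 || p1 <= p0) { print "n/a"; exit }
            rate = (p1 - p0) / days              # percentage points per day
            printf "%.1f\n", (100 - p1) / rate
        }' "$1"
}

# Example:
#   log_usage / /var/log/disk_pct.log        # run once a day from cron
#   days_until_full /var/log/disk_pct.log
```

With growth of 1% per day and the disk at 60%, this projection would report roughly 40 days of headroom.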


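When dealing with mysterious disk space consumption, du does not always reveal the complete picture: it only counts files it can reach through the directory tree, so it misses space held by files that were deleted while a process still has them open. Here's a more comprehensive approach I've developed through experience: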

# Check for deleted files still held by processes
lsof | grep '(deleted)'

# Alternative du command with depth limitation
du -h --max-depth=1 / | sort -h

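# Check for hidden large files and directories (100M and up)
# The pattern is anchored to the size column so file names such as "backup_500M" do not match
du -xah / 2>/dev/null | grep -E '^([0-9.]+G|[0-9]{3,}M)' | sort -hr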
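
The raw lsof listing can be long. To see which program is pinning the most deleted data, lsof's machine-readable output can be summed per command. A sketch, assuming lsof's +L1 selector (open files with a link count below one) and its -F field output with the c (command name) and s (file size) identifiers; sum_deleted_by_command is a hypothetical helper name.

```shell
# sum_deleted_by_command (hypothetical helper): read `lsof -F cs` output on stdin
# and print "<total bytes><TAB><command>", largest total first.
sum_deleted_by_command() {
    awk '
        /^c/ { cmd = substr($0, 2) }              # c<command name>
        /^s/ { bytes[cmd] += substr($0, 2) }      # s<file size in bytes>
        END  { for (c in bytes) printf "%.0f\t%s\n", bytes[c], c }
    ' | sort -rn
}

# Example (run as root to see every process):
#   lsof -nP +L1 -F cs 2>/dev/null | sum_deleted_by_command | head
```

Restarting the offending service releases the space, since the kernel frees a deleted file only when its last open descriptor is closed.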

Several less-obvious culprits could be eating your space:

# Check for large log files
journalctl --disk-usage
find /var/log -type f -size +100M -exec ls -lh {} \;

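# Audit package cache size, then clear it (dnf on CentOS 8+, yum on CentOS 7)
du -sh /var/cache/dnf /var/cache/yum 2>/dev/null
dnf clean all
yum clean all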

# Verify Docker/container storage
docker system df
podman system df

Setting up proactive monitoring can prevent future surprises:

#!/bin/bash
# Daily disk check script
THRESHOLD=80
CURRENT=$(df / --output=pcent | tail -1 | tr -d '% ')

if [ "$CURRENT" -ge "$THRESHOLD" ]; then
    # Generate a detailed report and send it as the body of the alert
    {
        echo "Warning: Root filesystem at ${CURRENT}% capacity"
        df -h
        du -xh --max-depth=3 / 2>/dev/null | sort -hr | head -20
        lsof 2>/dev/null | grep '(deleted)'
    } > /tmp/disk_report.txt
    mail -s "Disk Alert" admin@example.com < /tmp/disk_report.txt
fi

When conventional methods fail, these techniques can help:

# Check for filesystem reserved blocks
tune2fs -l /dev/sda3 | grep "Reserved block count"

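# Inspect for filesystem errors that might report incorrect sizes
# (-n makes no changes; results on a mounted filesystem are unreliable,
# so run this from rescue media or on an unmounted partition when possible)
fsck -n /dev/sda3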

# Monitor real-time disk writes
iotop -oP
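
iotop shows which process is writing, but not where. Once a suspect window of time is known, listing the files modified in the last few minutes can narrow it down. A sketch, assuming GNU find (for -printf); recently_written is a hypothetical helper name, and the defaults of / and 10 minutes are arbitrary.

```shell
# recently_written (hypothetical helper): list files under a directory that were
# modified in the last N minutes, as "<size in bytes><TAB><path>", largest first.
recently_written() {
    find "${1:-/}" -xdev -type f -mmin "-${2:-10}" -printf '%s\t%p\n' 2>/dev/null |
        sort -rn | head -20
}

# Example:
#   recently_written /var 5
```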

Implement these strategies to maintain healthy disk space:

# Set up log rotation (validate afterwards with a dry run: logrotate -d /etc/logrotate.d/custom)
cat << 'EOF' > /etc/logrotate.d/custom
/var/log/app/*.log {
    daily
    missingok
    rotate 7
    compress
    delaycompress
    notifempty
    create 644 root root
}
EOF

# Create separate partitions for volatile directories
# Example fstab entry:
/dev/sda5  /var/log  ext4  defaults,noatime  1 2