While df -h gives you a high-level overview of disk usage, we need deeper tools to pinpoint space hogs. Here's a more surgical approach:
du -ah / | sort -rh | head -n 20
This pipeline:
- Scans the entire filesystem (/)
- Outputs human-readable sizes (-h) for all files, not just directories (-a)
- Sorts results in reverse human-numeric order (-rh)
- Shows only the top 20 offenders
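If you only want the root filesystem, a minimal variant using du's -x (one-file-system) flag keeps the scan from crossing into /proc, /sys, or network mounts:
# Stay on the root filesystem only; skip other mounts and pseudo-filesystems
du -xah / 2>/dev/null | sort -rh | head -n 20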
To target specific file extensions that often grow large:
find / -type f \( -name "*.log" -o -name "*.sql" -o -name "*.dump" \) -size +1G -exec ls -lh {} + 2>/dev/null
The -size +1G filter shows only files exceeding 1GB. Redirecting stderr (2>/dev/null) suppresses permission errors.
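If you'd rather sort by exact byte counts instead of ls's human-readable sizes, a small variant (assuming GNU find, which provides -printf) looks like this:
# Print exact byte sizes for numeric sorting (GNU find)
find / -type f \( -name "*.log" -o -name "*.sql" -o -name "*.dump" \) -size +1G \
  -printf '%s\t%p\n' 2>/dev/null | sort -rn | head -n 20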
For a web server, focus on common storage areas:
for dir in /var/www /var/log /tmp /var/lib/mysql; do
echo "=== $dir ==="
du -sh $dir/* 2>/dev/null | sort -h
done
For a visual, terminal-based breakdown, install and use ncdu:
sudo apt install ncdu # Debian/Ubuntu
sudo yum install ncdu # RHEL/CentOS
ncdu -x /
Key features:
- Interactive navigation with arrow keys
- Sort options (size, name, mtime)
- Delete files directly (d key)
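On a busy server it can help to separate the slow scan from the browsing; ncdu can export a scan to a file (-o) and reload it later (-f). A minimal sketch (the /tmp path is just an example):
# Scan once and save the results, then browse them later without rescanning
ncdu -x -o /tmp/ncdu-root.json /
ncdu -f /tmp/ncdu-root.json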
When dealing with multiple mounts, analyze each separately:
awk '$1 ~ "^/dev" {print $2}' /proc/mounts | while read -r mount; do
  echo "Largest files in $mount:"
  find "$mount" -xdev -type f -size +500M -exec ls -lh {} + 2>/dev/null
done
Create a cron job for regular reports:
#!/bin/bash
REPORT="/var/log/disk_usage_$(date +%F).log"
{
echo "==== Top 20 Files ===="
du -ah / 2>/dev/null | sort -rh | head -n 20
echo -e "\n==== Largest Log Files ===="
find /var/log -type f -size +100M -exec ls -lh {} + 2>/dev/null
} > "$REPORT"
Schedule with crontab -e:
0 3 * * * /path/to/script.sh
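Make sure the script is executable, and consider capturing its output from cron so failures don't go unnoticed (the log path below is only an example):
# One-time setup
chmod +x /path/to/script.sh

# Crontab entry with stdout and stderr appended to a log
0 3 * * * /path/to/script.sh >> /var/log/disk_report_cron.log 2>&1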
Large files that have been deleted but are still held open by a process can be found with lsof; the pipeline below lists size in bytes, the PID holding the file, and the path:
lsof -nP | grep '(deleted)' | awk '{print $7, $2, $9}' | sort -rn | head -n 20
To free the space, either restart the holding process or truncate the file:
# Find PID holding file
lsof -nP | grep '/path/to/file (deleted)'
# Truncate safely
: > "/proc/PID/fd/FD_NUM"
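If lsof isn't installed, a rough equivalent (assuming GNU find and run as root) walks /proc directly; the path on the left of the arrow gives you the PID and FD number to truncate:
# Each open file descriptor is a symlink under /proc/PID/fd; deleted targets are marked "(deleted)"
find /proc/[0-9]*/fd -type l -printf '%p -> %l\n' 2>/dev/null | grep '(deleted)'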
To recap: when your Linux server's storage is nearly full, these commands become your best friends:
# Show overall disk usage
df -h
# Scan directories for largest consumers
du -sh /* 2>/dev/null | sort -rh | head -n 20
# Find files >1GB recursively
find / -type f -size +1G -exec ls -lh {} + 2>/dev/null | sort -k5 -rh
For interactive analysis, ncdu is the Swiss Army knife:
sudo apt install ncdu
ncdu -x /
Navigation keys:
- ↑/↓ to move
- Enter to enter dir
- d to delete
- n to sort by name
- s to sort by size
Common storage black holes in web environments:
# Check PHP session files
ls -lh /var/lib/php/sessions/
# Inspect MySQL binary logs
du -sh /var/lib/mysql/mysql-bin.*
# Audit WordPress uploads
find /var/www/ -type f \( -name "*.jpg" -o -name "*.mp4" \) -print0 | xargs -0 du -h | sort -rh | head
Create a cron job script:
#!/bin/bash
REPORT="/var/log/disk_usage_$(date +%Y%m%d).log"
echo "Top 50 files exceeding 100MB:" > "$REPORT"
find / -type f -size +100M -exec ls -lh {} + 2>/dev/null | sort -k5 -rh | head -50 >> "$REPORT"
echo -e "\nDirectory breakdown:" >> "$REPORT"
du -sh /* 2>/dev/null | sort -rh >> "$REPORT"
For log rotation and cleanup:
# Show largest log files
ls -lhS /var/log/*.log | head
# Set up logrotate configuration
cat > /etc/logrotate.d/custom << 'EOF'
/var/log/app/*.log {
    daily
    missingok
    rotate 14
    compress
    delaycompress
    notifempty
    create 0640 www-data adm
}
EOF
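Before relying on the new rule, dry-run it; logrotate's -d flag prints what would happen without rotating anything, and -f forces an immediate rotation once you're satisfied:
# Dry run: show what logrotate would do, without touching any files
logrotate -d /etc/logrotate.d/custom
# Force an immediate rotation to verify the rule end to end
logrotate -f /etc/logrotate.d/custom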
For LVM or complex storage setups:
# Show physical volume usage
pvdisplay -m
# Analyze thin provisioned volumes
lvs -o+metadata_percent
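For thin-provisioned volumes, data usage matters as much as metadata; a quick check using standard lvs/vgs report fields might look like this:
# Data and metadata usage for thin pools, plus free space left in each volume group
lvs -o +data_percent,metadata_percent
vgs -o +vg_free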