Linux Disk Space Mystery: Resolving Mismatch Between df and du on Root Partition


When managing Linux servers, accurate disk space monitoring is crucial. Recently while maintaining a CentOS server, I encountered a confusing situation where different tools reported significantly different disk usage for the root partition:

# df -h /
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda2      16G   14G  952M  94% /

# du -shc /*
6.9M    /bin
15M     /boot
123M    /etc
...
4.2G    total

Even ncdu reported only 5.6GB. This gap of nearly 10GB between what df reported (14GB used) and what du could account for (4.2GB) needed investigation.

Through troubleshooting, I identified several potential causes for such discrepancies:

  • Deleted but still open files: processes holding descriptors to files that have been unlinked
  • Filesystem journal and metadata: overhead that du never sees
  • Permission-restricted files: directories a non-root du cannot descend into
  • Mount points and bind mounts: nested filesystems hiding or double-counting data
  • LVM snapshots: space consumed at the volume layer, invisible to du

First, I checked for open deleted files using lsof:

sudo lsof +L1 | grep deleted

Surprisingly, this revealed several large log files held open by httpd and mysql processes despite being deleted. The space wouldn't be freed until these processes restarted.
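To get a feel for how much space those entries were pinning, a one-liner along these lines can total them up. This is a rough sketch: it assumes the default lsof column layout (SIZE/OFF in the seventh field), and it over-counts when several processes hold the same deleted file open.

sudo lsof +L1 -nP 2>/dev/null | awk '/deleted/ && $5 == "REG" {sum += $7}
    END {printf "%.1f MB pinned by deleted-but-open files\n", sum/1048576}'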

Next, I examined sparse files and filesystem features:

# Check for sparse files (a sparseness ratio below 1.0 means not all blocks are allocated)
find / -xdev -type f -printf "%S\t%p\n" 2>/dev/null | awk '$1 < 1.0'

# View filesystem details
tune2fs -l /dev/xvda2 | grep "Features"

For a more thorough analysis, I used these commands:

# Check for filesystems mounted within /
mount | grep -E " on /[^ ]"

# Audit all disk usage as root
sudo du -x --exclude=/proc --exclude=/sys --exclude=/dev / | sort -n -r | head -20

# Alternative with ncdu
sudo ncdu -x /

This revealed /var/lib/docker/containers was consuming significant space - a common issue when containers aren't properly cleaned up.
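If Docker is in play, it is worth quantifying its share before pruning anything. Something like the following gives a quick breakdown; docker system df is part of the standard Docker CLI, and the per-container du assumes the default /var/lib/docker data root.

# Summary of image, container, and volume usage
sudo docker system df

# Largest per-container directories (json-file logs often dominate)
sudo du -sh /var/lib/docker/containers/* 2>/dev/null | sort -hr | head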

Based on my findings, here are actionable solutions:

  1. Clean up docker artifacts:
    docker system prune -a --volumes
  2. Handle deleted but open files:
    # Either restart holding processes
    sudo systemctl restart httpd mysql
    
    # Or empty the files in place without a restart (see the sketch after this list)
    sudo truncate -s 0 /proc/[pid]/fd/[fd_number]
  3. Check for large rotated logs:
    journalctl --vacuum-size=100M
    logrotate -f /etc/logrotate.conf
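For step 2, the /proc path can be located without guesswork. The snippet below is an illustrative sketch that parses default lsof output and prints a ready-to-truncate path, the owning process, and the size for each deleted-but-open regular file.

sudo lsof +L1 -nP | awk '$5 == "REG" && $4 ~ /^[0-9]/ && /deleted/ {
    fd = $4; sub(/[a-zA-Z]+$/, "", fd)   # FD looks like "4w" or "13u"; keep only the number
    printf "/proc/%s/fd/%s  (%s, %.1f MB)\n", $2, fd, $1, $7/1048576
}'

Only truncate files you recognize as log or temporary data; emptying something a process still relies on, such as an open database file, can corrupt it.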

To avoid future disk space mysteries:

# Set up monitoring
df -h | awk '$5+0 > 90 {print "WARNING: "$6" at "$5}'

# Regular cleanup cron job
0 3 * * * find /var/log -type f -name "*.log" -mtime +30 -delete

Understanding these Linux storage nuances helps maintain healthy systems and prevents unexpected outages from disk space issues.


To recap the scenario: df -h reported 14GB used on the root partition (/), while du -sh / showed only 4.2GB. This kind of discrepancy puzzles many Linux administrators, so let's dig into why it happens and how to track down the "missing" space.

The fundamental difference lies in how these commands measure disk usage:

  • df reports filesystem-level statistics: the block counts the kernel keeps for the whole filesystem
  • du walks the directory tree and sums file-level disk usage for whatever it can actually see
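The deleted-but-open case illustrates the distinction nicely. The following is a harmless demonstration (it assumes /tmp sits on a regular on-disk filesystem rather than tmpfs): create a file, hold it open while deleting it, and watch df keep counting blocks that du can no longer see.

dd if=/dev/zero of=/tmp/demo.bin bs=1M count=512   # allocate 512MB
exec 3</tmp/demo.bin                               # keep a descriptor open
rm /tmp/demo.bin
df -h /tmp; du -sh /tmp                            # df still counts it, du does not
exec 3<&-                                          # close the descriptor
df -h /tmp                                         # the space is released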

Common reasons for discrepancies include:

1. Open files that have been deleted (still consuming space)
2. Mount points masking underlying files
3. Disk space reserved for root (typically 5%)
4. Filesystem journal/metadata overhead
5. LVM snapshots or thin provisioning
6. Files excluded by du (hidden, mounted, or special files)

Here's a systematic approach to identify the hidden space:

1. Check for deleted but still open files:

lsof | grep deleted

2. Verify reserved blocks (typically 5% for ext filesystems):

tune2fs -l /dev/xvda2 | grep "Reserved block count"

3. Examine mount points that might mask files:

mount | grep -v "^/"
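If a filesystem was mounted over a directory that already contained data, du on / will never see those underlying files, yet df still counts them. A non-recursive bind mount exposes them; /mnt/rootfs below is just an arbitrary scratch mount point.

sudo mkdir -p /mnt/rootfs
sudo mount --bind / /mnt/rootfs        # shows the root filesystem without anything mounted on top
sudo du -sh /mnt/rootfs/* | sort -hr | head
sudo umount /mnt/rootfs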

4. Look for unusually large files:

find / -xdev -type f -size +100M -exec ls -lh {} + 2>/dev/null
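For any suspiciously large file, comparing its apparent size with its allocated blocks shows whether it is sparse. The format string below is GNU stat syntax, and the path is only an example.

# Path is illustrative; substitute the file you are inspecting
stat -c 'apparent: %s bytes, allocated: %b blocks of %B bytes' /var/lib/mysql/ibdata1

When the allocated blocks multiplied by the block size come out far smaller than the apparent size, the file is sparse, and du will report much less than ls suggests.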

Here's my go-to command sequence for space investigations:

# Check filesystem usage
df -hT

# Find largest directories
du -xh --max-depth=1 / 2>/dev/null | sort -hr | head -20

# Alternative with ncdu
ncdu -x /

# Check for open deleted files
lsof +L1 | grep -i deleted

# Verify inode usage
df -i
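If df -i shows inode usage creeping toward 100% while block usage looks fine, the culprit is usually a directory holding a huge number of small files. A quick way to spot it (a sketch; grouping two directory levels deep is an arbitrary choice):

sudo find / -xdev -type f 2>/dev/null | cut -d/ -f1-3 | sort | uniq -c | sort -nr | head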

In one production incident, we discovered that an Apache log rotation script failed, leaving a 10GB log file deleted but still held open by the httpd process. The solution was:

# Identify the process holding the file
lsof | grep deleted | grep httpd

# Gracefully restart Apache to release the handle
systemctl restart httpd
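After the restart it is worth confirming that the space actually came back, and dry-running the rotation configuration that failed in the first place. logrotate -d is its debug mode: it reports what would be rotated without touching any files.

# Confirm the handle was released and the space recovered
sudo lsof +L1 | grep httpd
df -h /

# Dry-run the rotation config to find out why it failed
sudo logrotate -d /etc/logrotate.conf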

For particularly stubborn cases, consider:

# Check for filesystem errors (read-only; on a mounted filesystem the results are only indicative)
fsck -n /dev/xvda2

# Verify LVM thin provisioning (if used)
lvs -o+data_percent,metadata_percent

# Examine filesystem journal size
dumpe2fs -h /dev/xvda2 | grep Journal

Implement proactive monitoring with this Nagios check:

#!/bin/bash
THRESHOLD=90
USAGE=$(df -h / | awk 'NR==2 {print $5}' | tr -d '%')
if [ "$USAGE" -gt "$THRESHOLD" ]; then
    echo "CRITICAL: Root partition at ${USAGE}%"
    exit 2
else
    echo "OK: Root partition at ${USAGE}%"
    exit 0
fi

Remember that understanding your specific filesystem implementation (ext4, xfs, etc.) and any volume management (LVM) is crucial for accurate diagnosis.
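The inspection tools differ between the two families: tune2fs and dumpe2fs cover ext2/3/4, while xfs_info speaks for XFS. A couple of quick checks, using the device name from the example above; substitute your own, and run xfs_info only if the filesystem really is XFS.

# Identify the filesystem type first
df -T /

# ext2/3/4: superblock details, including reserved blocks and the journal
sudo tune2fs -l /dev/xvda2

# XFS: geometry and internal log size for a mounted filesystem
sudo xfs_info /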