XFS Filesystem Space Inflation: Diagnosing and Resolving Sudden Sparse File Proliferation in RHEL/CentOS 6.2+


After a decade of stable XFS operation across Linux servers, I encountered puzzling behavior after upgrading to RHEL/CentOS 6.2+. Filesystem capacity fluctuated wildly in our monitoring graphs, with du reporting significantly higher usage than ls -l or du --apparent-size:

# Diagnostic output showing discrepancy:
$ du -sh largefile.db
47G    largefile.db
$ du -sh --apparent-size largefile.db
32G    largefile.db
$ ls -lh largefile.db
-rw-r--r-- 1 appuser appgroup 32G Aug 15 11:23 largefile.db
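
The same numbers can be pulled programmatically; stat(1) exposes both the apparent size and the allocated block count, so a quick scripted sanity check (file name as above) looks like this:

# st_size vs st_blocks * block size (%B is usually 512)
f=largefile.db
apparent=$(stat -c %s "$f")
allocated=$(( $(stat -c %b "$f") * $(stat -c %B "$f") ))
echo "apparent=${apparent}B allocated=${allocated}B"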

Using filefrag, I discovered problematic fragmentation patterns:

$ filefrag -v high_usage_file.log
Filesystem type is: 58465342
File size of high_usage_file.log is 1099511627776 (268435456 blocks)
high_usage_file.log: 11782 extents found
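
For a per-extent view on XFS, xfs_bmap (from xfsprogs) shows the actual layout, which makes preallocated regions easier to spot than filefrag's one-line summary:

# Verbose extent map: offsets, block ranges, and flags per extent
xfs_bmap -v high_usage_file.log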

Key findings from ncdu scans:

  • Actual disk usage: 436.8GiB
  • Apparent size: 365.2GiB
  • 71.6GiB of excess physical allocation, the opposite of a sparse-file deficit, pointing to speculative preallocation rather than holes
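
The ncdu figures are easy to cross-check from the shell; a minimal sketch over the same tree (mount point assumed to be /data):

phys=$(du -sk /data | cut -f1)
appar=$(du -sk --apparent-size /data | cut -f1)
echo "physical=$((phys/1024))M apparent=$((appar/1024))M delta=$(( (phys-appar)/1024 ))M"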

The behavior emerged after upgrading from kernel 2.6.32-131.17.1.el6 to 2.6.32-220.23.1.el6. Research uncovered these relevant changes:

xfs: Optimize delayed allocation for sparse files
commit 5f6bed76c1032a1e20f3d650e8001f7a7bea03a4
Author: Dave Chinner <dchinner@redhat.com>
Date:   Tue Mar 13 14:00:47 2012 +1100

This change altered how XFS speculatively preallocates space around growing and sparse files, particularly affecting:

  • Database files with pre-allocated space
  • Log rotation scenarios
  • Virtual machine disk images
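
On an affected kernel the effect is straightforward to reproduce: append to a file in small chunks and the allocated size runs ahead of the apparent size (a sketch; path and sizes are arbitrary):

# Each append extends the file, and XFS preallocates past EOF
for i in $(seq 1 20); do
    dd if=/dev/zero bs=1M count=10 >> /data/prealloc_demo 2>/dev/null
done
du -k /data/prealloc_demo                  # physical, includes preallocation
du -k --apparent-size /data/prealloc_demo  # logical size only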

Immediate remediation:

# Rewrite files whose on-disk size exceeds their apparent size to release
# preallocated blocks (needs free space for the temporary copy):
find /data -type f -printf "%S\t%p\n" | awk -F'\t' '$1 > 1.0 {print $2}' |
while IFS= read -r f; do cp -p --sparse=always "$f" "$f.tmp" && mv "$f.tmp" "$f"; done

Persistent configuration:

# /etc/sysctl.d/xfs_tuning.conf
# Stretch the XFS background sync/flush intervals (values in centiseconds);
# this tunes how often XFS flushes, not preallocation directly
fs.xfs.xfssyncd_centisecs = 6000
fs.xfs.age_buffer_centisecs = 1500
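
Files under /etc/sysctl.d are read at boot; to load and verify the values immediately:

sysctl -p /etc/sysctl.d/xfs_tuning.conf
sysctl fs.xfs.xfssyncd_centisecs fs.xfs.age_buffer_centisecs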

Automated monitoring script:

#!/bin/bash
# Run a defragmentation pass once the filesystem crosses 90% used
THRESHOLD=0.9
FS=/data

USED_RATIO=$(df -P $FS | awk 'NR==2 {print $3/$2}')
if (( $(echo "$USED_RATIO > $THRESHOLD" | bc -l) )); then
    logger -t xfsmon "Usage threshold exceeded on $FS, starting xfs_fsr"
    /usr/sbin/xfs_fsr -v $FS
fi
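
Dropping the script into cron keeps the check running unattended (the install path here is an assumption):

# /etc/cron.d/xfsmon (hypothetical path for the script above)
*/30 * * * * root /usr/local/sbin/xfsmon.sh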

For production systems experiencing this issue:

  1. Schedule regular defragmentation during low-usage periods:
    # Weekly cron job (Saturdays at 03:00)
    0 3 * * 6 /usr/sbin/xfs_fsr -v /data >> /var/log/xfs_fsr.log 2>&1
  2. Implement Zabbix monitoring for preallocation overhead. The df-based used/total ratio says nothing about preallocation, so measure physical against apparent size instead (a full du walk can be slow on large trees; raise the agent Timeout accordingly). A matching trigger follows below:
    UserParameter=xfs.sparse.ratio,echo "scale=2; $(du -ks /data | cut -f1) / $(du -ks --apparent-size /data | cut -f1)" | bc
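
A matching Zabbix 2.x trigger can then fire once the ratio passes a chosen bound (host name and threshold are placeholders):

# Alert when physical usage exceeds apparent size by more than 15%
{xfs-server:xfs.sparse.ratio.last(0)}>1.15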

To recap, the pattern across systems after the RHEL/CentOS 6.2 upgrades was consistent:

  • 20-30% discrepancies between du and du --apparent-size
  • Fragmented files with thousands of extents (verified via filefrag -v filename)
  • Temporary relief from xfs_fsr followed by rapid re-inflation

To quantify the issue, I developed this investigation script:

#!/bin/bash
# Compare physical vs apparent storage
xfs_path="/data"

echo "Scanning ${xfs_path}..."
printf "%-40s %12s %12s\n" "FILE" "PHYSICAL" "LOGICAL"
# bash -c rather than sh -c: the ${f:0:37} substring expansion is a bashism
find ${xfs_path} -type f -size +100M -exec bash -c '
  for f; do
    phys=$(du -k "$f" | awk "{print \$1}")
    log=$(du -k --apparent-size "$f" | awk "{print \$1}")
    ratio=$(( (phys-log)*100/log ))
    [ $ratio -gt 10 ] && \
    printf "%-40s %12sK %12sK (%d%%)\n" \
      "${f:0:37}..." "$phys" "$log" "$ratio"
  done
' bash {} + | sort -rnk2 | head -20

The transition from kernel 2.6.32-131 to 2.6.32-220 introduced several XFS modifications:

  • New speculative preallocation behavior (xfs_iomap_prealloc_size)
  • Modified extent allocation heuristics (commit 39a26b14)
  • Changed delayed allocation flushing thresholds
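
Whether a given host carries the new behavior can be checked quickly from the running system:

uname -r                       # affected: 2.6.32-220.* and later
grep ' /data ' /proc/mounts    # look for an explicit allocsize= override
sysctl -a 2>/dev/null | grep '^fs\.xfs'   # current XFS tunables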

Temporary mitigation:

# Manual defragmentation
xfs_fsr -v /dev/sdX

# Cap speculative preallocation at 64k (per-mount)
mount -o remount,allocsize=64k /data

# List files whose physical allocation exceeds their apparent size
find /data -type f -printf "%S\t%p\n" | awk -F'\t' '$1 > 1.0' | sort -n
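
To make the allocsize cap survive reboots, carry it into /etc/fstab (device and mount point are placeholders):

# /etc/fstab
/dev/sdX  /data  xfs  noatime,allocsize=64k  0 0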

Longer-term mitigation: tune the VM writeback thresholds (these do not change XFS preallocation itself, but flush dirty data sooner):

# Add to /etc/sysctl.conf
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
vm.dirty_expire_centisecs = 3000

# Then reload
sysctl -p
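
The effect of the writeback change is visible directly in /proc/meminfo; the Dirty figure should stay lower after the new thresholds take hold:

grep -E '^(Dirty|Writeback):' /proc/meminfo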

Implement this Nagios check to track the issue:

#!/bin/bash
# Nagios check: percentage by which physical usage exceeds apparent size
WARNING=15
CRITICAL=30

phys=$(du -ks /data | cut -f1)
log=$(du -ks --apparent-size /data | cut -f1)
delta=$(( (phys-log)*100/log ))

if [ $delta -ge $CRITICAL ]; then
  echo "CRITICAL: XFS storage overhead ${delta}% | overhead=${delta}%;${WARNING};${CRITICAL}"
  exit 2
elif [ $delta -ge $WARNING ]; then
  echo "WARNING: XFS storage overhead ${delta}% | overhead=${delta}%;${WARNING};${CRITICAL}"
  exit 1
else
  echo "OK: XFS storage overhead ${delta}% | overhead=${delta}%;${WARNING};${CRITICAL}"
  exit 0
fi
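
Hooking the script into NRPE takes a single command definition (plugin path and name are assumptions):

# /etc/nagios/nrpe.cfg
command[check_xfs_overhead]=/usr/lib64/nagios/plugins/check_xfs_overhead.sh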

For severely affected systems:

# Create an optimized XFS (crc=1 requires xfsprogs 3.2+, newer than stock RHEL 6)
mkfs.xfs -f -d agcount=16 -l size=128m -m crc=1 /dev/sdX

# Mount with explicit options (noatime already implies nodiratime)
mount -o noatime,allocsize=64k,logbsize=256k /dev/sdX /data
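
When a rebuild is the only option, xfsdump/xfsrestore preserve XFS attributes and handle holes sensibly; a minimal migration sketch (device and mount points are placeholders):

# Level-0 dump of the old filesystem piped into the new one
mount /dev/sdX /mnt/newdata
xfsdump -l 0 -L migrate -M media0 - /data | xfsrestore - /mnt/newdata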