After my painful experience with ext3 filesystem corruption, I switched to XFS specifically for its journaling capabilities and better handling of large files. However, XFS needs a different monitoring approach than ext3 with its fsck: the legacy xfs_check (like its successor, xfs_repair -n) demands an unmounted filesystem, a non-starter for production servers.
XFS provides several powerful utilities for live monitoring:
# Capture a metadata-only image for offline inspection
# (most reliable when taken from an unmounted, read-only, or frozen filesystem)
xfs_metadump /dev/sda1 - | xfs_mdrestore - /tmp/metadump.img
# Analyze allocation groups
xfs_db -r /dev/sda1
xfs_db> agf 0
xfs_db> p
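The kernel exposes XFS counters in /proc/fs/xfs/stat, which provide crucial metrics (stock xfsprogs does not ship an xfs_stats command, so read the file directly):
# Show the read/write call counters (rw) and byte counters (xpc)
grep -E '^(rw|xpc) ' /proc/fs/xfs/stat
# Continuous monitoring (refresh every 2 seconds)
watch -n 2 cat /proc/fs/xfs/stat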
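Raw counters are cumulative since boot, so what I usually want is the change between two readings. Here is a minimal sketch of that idea, assuming each line of /proc/fs/xfs/stat is a group name followed by integer counters. The helper names parse_xfs_stat and counter_delta are my own, not part of any XFS tool, and what each field means is defined by the kernel's XFS statistics documentation.

```python
import time
from pathlib import Path


def parse_xfs_stat(text):
    """Parse /proc/fs/xfs/stat text into {group_name: [int, ...]}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        try:
            stats[fields[0]] = [int(value) for value in fields[1:]]
        except ValueError:
            continue  # skip any line whose values are not integers
    return stats


def counter_delta(before, after, group):
    """Per-field increase of one counter group between two snapshots."""
    old = before.get(group, [])
    new = after.get(group, [])
    return [b - a for a, b in zip(old, new)]


if __name__ == "__main__":
    stat_file = Path("/proc/fs/xfs/stat")
    if stat_file.exists():  # only present when the XFS module is loaded
        first = parse_xfs_stat(stat_file.read_text())
        time.sleep(2)
        second = parse_xfs_stat(stat_file.read_text())
        print("rw delta over 2s:", counter_delta(first, second, "rw"))
```

Keeping the parser separate from the file read makes it easy to feed it saved snapshots when comparing a bad day against a normal one.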
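For proactive error detection (this needs a kernel from 4.15 onward built with online scrub support, plus xfsprogs 4.15 or newer; check whether your distro enables it):
# Schedule monthly scrubbing via cron
0 3 1 * * /usr/sbin/xfs_scrub /mountpoint >> /var/log/xfs_scrub.log 2>&1
# Run a check-only scrub by hand (-n reports problems without changing anything)
xfs_scrub -n -v /mountpoint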
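A cron job that only appends to a log is easy to ignore, so it helps to turn the scrub's exit status into something readable. The sketch below assumes the bitmask exit codes described in the xfs_scrub(8) man page (1, 2, 4, 8); verify them against your installed xfsprogs before relying on this. The function names are hypothetical helpers of my own.

```python
import subprocess

# Exit-status bits as I understand them from xfs_scrub(8); this mapping is
# an assumption to double-check against your xfsprogs version.
SCRUB_STATUS_BITS = {
    1: "filesystem errors left uncorrected",
    2: "filesystem optimizations possible",
    4: "operational error",
    8: "usage or syntax error",
}


def describe_scrub_status(returncode):
    """Translate an xfs_scrub exit status into human-readable messages."""
    if returncode < 0:
        # subprocess reports death by signal as a negative return code
        return [f"terminated by signal {-returncode}"]
    if returncode == 0:
        return ["no errors found"]
    return [msg for bit, msg in SCRUB_STATUS_BITS.items() if returncode & bit]


def dry_run_scrub(mountpoint):
    """Run a check-only scrub (-n) and describe the outcome."""
    result = subprocess.run(
        ["xfs_scrub", "-n", mountpoint],
        capture_output=True,
        text=True,
    )
    return describe_scrub_status(result.returncode)
```

Calling dry_run_scrub('/mountpoint') from the monitoring job gives a short list of messages that can be mailed or logged instead of a bare number.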
Combine XFS monitoring with disk health checks:
#!/bin/bash
# Check both disk and XFS health
smartctl -H /dev/sda
xfs_info /mountpoint
xfs_spaceman -c "freesp -s" /mountpoint
Here's a Python script I use for comprehensive XFS monitoring:
import subprocess
import time
from datetime import datetime

def check_xfs_health(mountpoint):
    """Append free-space and quota reports for mountpoint to the log."""
    try:
        # Check free space
        df = subprocess.run(['df', '-h', mountpoint],
                            capture_output=True, text=True)
        # Check per-user inode usage (only meaningful when quota
        # accounting is enabled on the filesystem)
        inodes = subprocess.run(['xfs_quota', '-x', '-c', 'report -i -h', mountpoint],
                                capture_output=True, text=True)
        # Log results
        with open('/var/log/xfs_monitor.log', 'a') as f:
            f.write(f"{datetime.now()}\n{df.stdout}\n{inodes.stdout}\n")
    except Exception as e:
        print(f"Monitoring error: {e}")

if __name__ == "__main__":
    while True:
        check_xfs_health('/data')
        time.sleep(3600)  # Run hourly
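Logging df output is only useful if somebody reads it, so a threshold check is a natural companion. This is a rough sketch using os.statvfs from the standard library; the 90% limits and the function names are arbitrary choices of mine. Note that XFS allocates inodes dynamically, so the inode totals it reports are an estimate rather than a hard ceiling.

```python
import os


def usage_percent(total, free):
    """Percentage of a resource in use, given total and free counts."""
    if total == 0:
        return 0.0
    return 100.0 * (total - free) / total


def check_thresholds(mountpoint, block_limit=90.0, inode_limit=90.0):
    """Return warning strings for block or inode usage at or above the limits."""
    st = os.statvfs(mountpoint)
    warnings = []
    block_pct = usage_percent(st.f_blocks, st.f_bfree)
    inode_pct = usage_percent(st.f_files, st.f_ffree)
    if block_pct >= block_limit:
        warnings.append(f"{mountpoint}: block usage at {block_pct:.1f}%")
    if inode_pct >= inode_limit:
        warnings.append(f"{mountpoint}: inode usage at {inode_pct:.1f}%")
    return warnings
```

In the hourly loop, check_thresholds('/data') could be called right after the log write, with any returned warnings printed or mailed.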
Keep these in your toolkit:
# Check filesystem structure (dry run; the filesystem must be unmounted)
xfs_repair -n /dev/sda1
# Defragment files (when needed)
xfs_fsr /mountpoint
# Thaw a filesystem previously frozen with xfs_freeze -f
xfs_freeze -u /mountpoint
After my painful experience with ext3 filesystem corruption, I switched to XFS precisely for its robustness and journaling capabilities. However, I quickly discovered that XFS demands different maintenance practices than traditional Linux filesystems. Unlike ext3/4, where fsck can run automatically at boot, XFS only replays its log at mount time (fsck.xfs does nothing), so it requires more proactive monitoring.
The most useful command for inspecting XFS metadata without unmounting is xfs_db, provided it is opened read-only with -r. Results from a mounted, actively changing filesystem can be inconsistent, so treat them as hints rather than verdicts:
# Check filesystem metadata consistency
sudo xfs_db -r -c "check -v" /dev/sdX
# Verify free space accounting
sudo xfs_db -r -c "freesp -s" /dev/sdX
# Check for corruption indicators
sudo xfs_db -r -c "blockget -n" /dev/sdX
For production systems, I recommend setting up regular checks via cron. This script captures critical metrics:
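#!/bin/bash
DEVICE="/dev/sdX"
MOUNTPOINT="/mount/point"  # xfs_spaceman works on a mounted path, not the device
LOG="/var/log/xfs_health.log"
{
    date
    xfs_db -r -c "freesp -s" "$DEVICE"
    xfs_quota -x -c "report -h" "$MOUNTPOINT"
    xfs_spaceman -c "freesp -s" "$MOUNTPOINT"
    df -i "$MOUNTPOINT"
} >> "$LOG" 2>&1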
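Linux kernels from 4.15 onward can support online XFS scrubbing when built with CONFIG_XFS_ONLINE_SCRUB:
# Run a scrub with verbose output (call this from cron to schedule monthly scrubs)
sudo xfs_scrub -v /mount/point
# Check only, without repairing or optimizing anything
sudo xfs_scrub -n /mount/point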
Watch for these warning signs:
- Rapid growth of metadata blocks (check with xfs_db -c "metadump")
- Unexpected free space discrepancies
- Growing number of stale inodes (check with xfs_repair -n, which needs the filesystem unmounted)
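For the free space item, one simple way to spot a discrepancy is to compare consecutive free-space samples and flag any drop that is larger than normal churn. The sketch below is only an illustration: the 10% default and the find_sudden_drops name are my own assumptions, and the samples could come from whatever your monitoring already records.

```python
def find_sudden_drops(samples, max_drop_fraction=0.10):
    """Flag consecutive samples where free space fell by more than the limit.

    samples: chronological list of (label, free_bytes) tuples.
    Returns a list of (previous_label, label, drop_fraction) tuples.
    """
    drops = []
    for (prev_label, prev_free), (label, free) in zip(samples, samples[1:]):
        if prev_free <= 0:
            continue  # cannot compute a fraction from an empty baseline
        drop = (prev_free - free) / prev_free
        if drop > max_drop_fraction:
            drops.append((prev_label, label, round(drop, 3)))
    return drops


if __name__ == "__main__":
    history = [("mon", 1000), ("tue", 950), ("wed", 700)]
    print(find_sudden_drops(history))
```

A flagged drop is not proof of corruption; it is a prompt to look at what was written in that interval before reaching for repair tools.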
If you suspect corruption despite monitoring, first try:
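# Unmount cleanly if possible (a forced or lazy unmount with -f/-l can leave the device in use)
sudo umount /mount/point
# Dry run first: report problems without modifying anything
sudo xfs_repair -n /dev/sdX
# Run repair (WARNING: requires unmounted FS)
sudo xfs_repair -v /dev/sdX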
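Because running xfs_repair against a mounted filesystem is the mistake to avoid here, I like to guard the dry run with a mount check. This is a minimal sketch that reads /proc/mounts; is_mounted and repair_dry_run are hypothetical helper names. The comparison is by resolved path only, so a device mounted under a different alias (a UUID or a device-mapper name, for example) may need extra handling.

```python
import os
import subprocess


def is_mounted(device, mounts_text):
    """Return True if device appears as a mount source in /proc/mounts text."""
    target = os.path.realpath(device)
    for line in mounts_text.splitlines():
        fields = line.split()
        if fields and os.path.realpath(fields[0]) == target:
            return True
    return False


def repair_dry_run(device):
    """Run xfs_repair -n on device, refusing if it is still mounted."""
    with open("/proc/mounts") as mounts:
        if is_mounted(device, mounts.read()):
            raise RuntimeError(f"{device} is still mounted; unmount it first")
    # With -n, xfs_repair only reports problems; a nonzero exit status
    # indicates that corruption was detected.
    return subprocess.run(
        ["xfs_repair", "-n", device],
        capture_output=True,
        text=True,
    )
```

Only after repair_dry_run reports problems, and with a backup or metadata dump in hand, would I move on to the real xfs_repair run.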