Preventing Root FS Overflow: Robust Solutions for Failed NFS Mount Scenarios in Ubuntu



Root filesystem overflow caused by a failed NFS mount is one of those insidious problems that can silently bring down production systems. I've seen this scenario play out multiple times: a backup script happily writes to what was supposed to be an NFS mount, but ends up filling the root partition once the mount disappears.

Here's what typically happens in chronological order:

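1. NFS mount /mnt/backup is lost or never established (network issue at boot, server reboot, manual unmount)
2. Backup script continues writing to /mnt/backup, which is now just an ordinary directory on the root filesystem
3. All data gets written to the root filesystem instead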
4. Critical services fail when / hits 100% utilization
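
Step 3 happens because an unmounted mount point is nothing more than a directory on its parent filesystem. As a rough illustration, the sketch below compares the device number of a directory with that of its parent. The function name on_parent_fs is made up for this example, `stat -c %d` assumes GNU coreutils (the default on Ubuntu), and bind mounts within one filesystem share a device number, so `mountpoint` remains the more thorough check.

```shell
#!/bin/bash
# Succeeds (exit 0) when DIR sits on the same device as its parent directory,
# i.e. no other filesystem is mounted on it. Returns 2 if stat fails.
on_parent_fs() {
    local dir=$1 dev parent_dev
    dev=$(stat -c %d "$dir") || return 2
    parent_dev=$(stat -c %d "$dir/..") || return 2
    [ "$dev" = "$parent_dev" ]
}

# Example call (not executed here):
# on_parent_fs /mnt/backup && echo "writes to /mnt/backup would land on the parent filesystem"
```

If the function succeeds for /mnt/backup, anything written there consumes space on the filesystem that holds /mnt, which on most installations is /.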

Here are the most effective solutions I've implemented across various environments:

Mount Point Verification Script

Create a pre-backup verification script (best placed in /usr/local/bin/check_mount):

#!/bin/bash
MOUNT_POINT="/mnt/backup"
FILESYSTEM_TYPE="nfs"

if ! mountpoint -q "${MOUNT_POINT}"; then
    logger -t backup "ERROR: ${MOUNT_POINT} not mounted"
    exit 1
fi

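# Anchor the match to the mount point and type fields of /proc/mounts,
# so /mnt/backup2 or an option string containing "nfs" cannot match by accident
if ! grep -qE "^[^ ]+ ${MOUNT_POINT} ${FILESYSTEM_TYPE}" /proc/mounts; then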
    logger -t backup "ERROR: ${MOUNT_POINT} has wrong filesystem type"
    exit 1
fi
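
A check like this only helps if the backup cannot start without it. One way to tie the two together is a small guard function that runs a command only when the directory is an active mount point. This is a minimal sketch; run_if_mounted is a hypothetical helper name, not part of any package.

```shell
#!/bin/bash
# Run a command only when DIR is an active mount point.
# Usage: run_if_mounted DIR COMMAND [ARGS...]
run_if_mounted() {
    local dir=$1
    shift
    if ! mountpoint -q "$dir"; then
        echo "refusing to run: $dir is not mounted" >&2
        return 1
    fi
    "$@"
}

# Example call (not executed here):
# run_if_mounted /mnt/backup rsync -a /srv/data/ /mnt/backup/data/
```

Because the check and the command run back to back, the window in which the mount can vanish unnoticed is much smaller than with a separate periodic check, although it is not zero.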

Automount with Timeout Protection

Configure /etc/auto.master with safety parameters:

/mnt/backup /etc/auto.backup --timeout=60 --ghost

And in /etc/auto.backup:

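backup -fstype=nfs,soft,intr,timeo=300,retrans=3 nfsserver:/export/backup

Note that this is an indirect map: the key backup is created beneath the /mnt/backup directory named in auto.master, so the share appears at /mnt/backup/backup. Point backup jobs at that path, or use a direct map (/- in auto.master, /mnt/backup as the key) to keep the original location. Also keep in mind that timeo is given in tenths of a second, so 300 means 30 seconds, and that kernels since 2.6.25 ignore intr, so it can be omitted.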

Immutable Mount Points

Make the mount point directory immutable:

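# Run once while nothing is mounted: block all writes to the bare directory
chattr +i /mnt/backup
# Mounting over an immutable directory still works, so the flag can stay set.
# Removing it first would leave the directory writable if the mount then fails.
mount /mnt/backup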
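
The protection is only as good as the flag, so it can be worth verifying that it is really set. The following sketch parses `lsattr -d` output; has_immutable_flag is a hypothetical helper name, and the check is meant to run while the share is unmounted, because otherwise lsattr inspects the root of the NFS export rather than the local directory.

```shell
#!/bin/bash
# Succeeds when DIR carries the immutable (i) attribute.
has_immutable_flag() {
    local dir=$1 flags
    # "lsattr -d" prints "<flags> <path>" for the directory itself;
    # if lsattr fails, flags stays empty and the check reports "not set".
    flags=$(lsattr -d "$dir" 2>/dev/null | awk '{print $1}')
    [[ $flags == *i* ]]
}

# Example call (not executed here):
# has_immutable_flag /mnt/backup || echo "WARNING: /mnt/backup is not immutable" >&2
```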

Alternative Mount Structure

Create a more robust directory structure:

# Both the source tree and the bind-mount target must exist before mounting
mkdir -p /mnt/.protected/backup /mnt/protected
mount --bind /mnt/.protected /mnt/protected

For modern Ubuntu systems using systemd, create a mount unit:

# /etc/systemd/system/mnt-backup.mount
[Unit]
Description=Backup NFS Mount
Requires=network-online.target
After=network-online.target

[Mount]
What=nfsserver:/export/backup
Where=/mnt/backup
Type=nfs
Options=soft,intr,timeo=300,retrans=3

[Install]
WantedBy=multi-user.target
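
One detail that is easy to miss: systemd requires the unit file name to match the Where= path, which is why /mnt/backup lives in mnt-backup.mount. The authoritative way to derive the name is `systemd-escape -p --suffix=mount /mnt/backup`. The sketch below is a simplified stand-in that only handles plain paths, and mount_unit_name is an illustrative name.

```shell
#!/bin/bash
# Derive the systemd mount unit name for a simple absolute path.
# Simplification: only paths made of letters, digits, "_" and "/" are
# accepted; anything else needs the full escaping done by systemd-escape.
mount_unit_name() {
    local path=$1
    if [[ ! $path =~ ^(/[A-Za-z0-9_]+)+/?$ ]]; then
        echo "unsupported path: $path" >&2
        return 1
    fi
    path=${path#/}   # drop the leading slash
    path=${path%/}   # drop a trailing slash, if any
    echo "${path//\//-}.mount"
}

# Example call (not executed here):
# systemctl enable --now "$(mount_unit_name /mnt/backup)"
```

A path such as /mnt/my-backup is rejected on purpose, because a literal hyphen has to be escaped in unit names and this sketch does not attempt that.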

Implement Prometheus monitoring for mount points:

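#!/bin/bash
# node_exporter textfile collector script (the shebang must be the first line);
# redirect the output into a .prom file in the collector directory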
echo "# HELP node_mount_point_healthy Mount point health"
echo "# TYPE node_mount_point_healthy gauge"
if mountpoint -q /mnt/backup; then
    echo 'node_mount_point_healthy{mountpoint="/mnt/backup"} 1'
else
    echo 'node_mount_point_healthy{mountpoint="/mnt/backup"} 0'
fi
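
The script above prints to standard output, so something still has to place the result in the textfile collector directory. A possible way to do that without node_exporter ever reading a half-written file is to write to a temporary file and rename it. This is a sketch under the assumption that the collector only picks up files ending in .prom; write_mount_metric and the example output path are hypothetical.

```shell
#!/bin/bash
# Write the mount-health gauge for MOUNT_DIR into OUT_FILE atomically.
write_mount_metric() {
    local mount_dir=$1 out_file=$2 value=0 tmp_file
    mountpoint -q "$mount_dir" && value=1
    # Create the temp file next to the target so the final mv is a rename
    # on the same filesystem; its name does not end in .prom.
    tmp_file=$(mktemp "${out_file}.XXXXXX") || return 1
    {
        echo "# HELP node_mount_point_healthy Mount point health"
        echo "# TYPE node_mount_point_healthy gauge"
        echo "node_mount_point_healthy{mountpoint=\"${mount_dir}\"} ${value}"
    } > "$tmp_file"
    chmod 644 "$tmp_file"
    mv "$tmp_file" "$out_file"
}

# Example call (not executed here), e.g. from cron once a minute:
# write_mount_metric /mnt/backup /var/lib/node_exporter/textfile_collector/mount_health.prom
```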

The most robust solution combines several approaches:

  1. Use systemd mount units for predictable behavior
  2. Implement mount verification in backup scripts
  3. Set up monitoring for early detection
  4. Consider using overlayfs for an extra layer of protection

Remember that no single solution is perfect - defense in depth is key when protecting your root filesystem from accidental fills.


Every sysadmin managing NFS-mounted backup directories has faced this scenario: Your backup script dutifully writes to /mnt/backup, but when the NFS mount fails, it silently dumps everything into the root filesystem instead. Before you know it, your services start failing with "no space left on device" errors.

The typical approach of simply checking df output in cron jobs has several flaws:

# Basic but problematic check
if ! df -h | grep -q '/mnt/backup'; then
    echo "NFS mount failed!" | mail -s "Alert" admin@example.com
fi

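This detects the failure, but often too late: backups may already have filled the root filesystem between two checks. The grep also matches any path that merely contains /mnt/backup, and df itself can hang when an NFS server stops responding.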

Here's a comprehensive approach combining several techniques:

1. Filesystem-Level Protection

Create a protective layer at the mount point:

# Make mount point unwritable when unmounted (run while nothing is mounted there)
sudo chmod 000 /mnt/backup   # Stops non-root writers only; root bypasses permissions
sudo chattr +i /mnt/backup   # Immutable flag blocks root as well

2. Smart Backup Script Wrapper

Implement mount verification in your backup scripts:

#!/bin/bash

BACKUP_DIR="/mnt/backup"
MOUNT_TYPE="nfs4"

# Verify mount is active and correct type
if ! mountpoint -q "$BACKUP_DIR" || \
   ! grep -qE "^[^ ]+ $BACKUP_DIR $MOUNT_TYPE " /proc/mounts; then
    logger -t backup "ERROR: $BACKUP_DIR not properly mounted"
    exit 1
fi

# Verify filesystem has sufficient space
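MIN_SPACE=10000000  # About 10 GB, counted in 1 KiB blocks (the unit df -k reports)
# -P keeps each filesystem on one line, so long NFS device names cannot shift the columns
AVAIL=$(df -Pk "$BACKUP_DIR" | awk 'NR==2 {print $4}')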
if [ "$AVAIL" -lt "$MIN_SPACE" ]; then
    logger -t backup "ERROR: Insufficient space in $BACKUP_DIR"
    exit 1
fi

# Proceed with actual backup
/usr/local/bin/real_backup_script.sh

3. Systemd-Based Mount Monitoring

For Ubuntu systems, create a systemd service to monitor the mount:

# /etc/systemd/system/nfs-backup-monitor.service
[Unit]
Description=NFS Backup Mount Monitor
After=network.target

[Service]
ExecStart=/usr/local/bin/mount_watchdog.sh
Restart=always

[Install]
WantedBy=multi-user.target
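
The unit refers to /usr/local/bin/mount_watchdog.sh without showing it. What follows is one possible shape for such a script, not the original: it polls the mount point and logs every state change. The function names, the log tag, and the 30 second interval are all assumptions made for this sketch.

```shell
#!/bin/bash
# Hypothetical sketch of /usr/local/bin/mount_watchdog.sh

log_msg() {
    # Fall back to stderr when logger is unavailable or fails.
    logger -t nfs-backup-monitor "$1" 2>/dev/null || echo "$1" >&2
}

# Poll DIR every INTERVAL seconds and log each state change.
# MAX_CHECKS=0 (the default) runs forever; a positive value bounds the loop.
# The exit status reflects the last observed state.
watch_mount() {
    local dir=$1 interval=$2 max_checks=${3:-0}
    local state last_state="" checks=0
    while :; do
        if mountpoint -q "$dir"; then state=mounted; else state=missing; fi
        if [ "$state" != "$last_state" ]; then
            log_msg "$dir is $state"
            last_state=$state
        fi
        checks=$((checks + 1))
        if [ "$max_checks" -gt 0 ] && [ "$checks" -ge "$max_checks" ]; then
            break
        fi
        sleep "$interval"
    done
    [ "$state" = mounted ]
}

# As the service's ExecStart target, the script would end with:
# watch_mount /mnt/backup 30
```

Logging alone does not stop a backup that is already running, so a watchdog like this complements the in-script mount check rather than replacing it.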

4. Alternative Backup Target Structure

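Add a sentinel to the export so clients can tell the real share from the bare mount point:

# On the backup server: create a sentinel directory that clients can test for
mkdir -p /exports/backups/.mounted_flag

# Local mount entry in /etc/fstab (each entry must be a single line):
backupserver:/exports/backups /mnt/backup nfs rw,hard,intr,noexec,nosuid,nodev 0 0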
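
The .mounted_flag directory exists only on the server, so it is visible under /mnt/backup only while the export is mounted. A client-side test could look like the sketch below; export_is_visible is a hypothetical name, and the approach assumes nobody creates a directory of the same name in the local, unmounted /mnt/backup.

```shell
#!/bin/bash
# Succeeds only when the sentinel directory from the NFS export is visible
# under DIR. The sentinel name defaults to .mounted_flag.
export_is_visible() {
    local dir=$1 flag=${2:-.mounted_flag}
    [ -d "$dir/$flag" ]
}

# Example call (not executed here):
# export_is_visible /mnt/backup || { echo "backup share not mounted" >&2; exit 1; }
```

With a hard mount, this test can block for as long as the server is unreachable, so it detects a missing mount reliably but not a hung one.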

For critical systems, consider an OverlayFS solution:

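# Create protected overlay
mkdir -p /var/lib/backup/{lower,upper,work,merged}
# The -o value must be one unbroken word: an indented continuation line would split it
mount -t overlay overlay -o lowerdir=/mnt/backup:/var/lib/backup/lower,upperdir=/var/lib/backup/upper,workdir=/var/lib/backup/work /var/lib/backup/merged

# Configure backups to use /var/lib/backup/merged
# OverlayFS sends every write to upperdir, whether or not NFS is up, so nothing
# reaches /mnt/backup through the overlay. Keep /var/lib/backup on its own
# filesystem and copy upperdir to the NFS share in a separate step.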

Implement Prometheus monitoring for comprehensive protection:

#!/bin/bash
# Node exporter textfile collector script (the shebang must be the first line)

MOUNT_STATUS=$(mountpoint -q /mnt/backup && echo 1 || echo 0)
echo "nfs_mount_status{path=\"/mnt/backup\"} $MOUNT_STATUS" \
    > /var/lib/node_exporter/textfile_collector/nfs_mount.prom

Additional hardening measures:

  • Set noexec,nosuid,nodev on all NFS mounts
  • Implement daily integrity checks for backup files
  • Configure separate partition for /var with quota support
  • Monitor inode usage in addition to disk space