How to Force Unmount Stale NFS/AUFS Mounts When Facing “Stale File Handle” Errors


Suppose you have an AUFS mount layered on OpenAFS (not plain NFS), and the mount point stubbornly refuses to unmount:

$ mount | grep /mnt/1
aufs on /mnt/1 type aufs (rw,relatime,si=daab1cec23213eea)

$ sudo umount -f /mnt/1
umount2: Stale NFS file handle
umount: /mnt/1: Stale NFS file handle
umount2: Stale NFS file handle

The "stale file handle" error (ESTALE) typically occurs when the underlying storage becomes inaccessible while the kernel still holds references to it. With AUFS on OpenAFS, this can happen when:

  • The OpenAFS volume was improperly disconnected
  • Network issues disrupted the connection
  • Filesystem corruption occurred
  • Processes still hold open file descriptors on the mount, which also keeps it from being released
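
Before trying to unmount, it can help to confirm that the mount really is stale rather than just slow. The following is a minimal sketch, not a definitive diagnostic: check_mount is a hypothetical helper name, the 5-second default is an arbitrary choice, and it assumes GNU coreutils (timeout, stat).

```shell
# check_mount PATH [SECONDS]
# Prints "ok", "hung", or "stale-or-missing" and returns stat's exit status.
# check_mount and the 5-second default are illustrative assumptions.
check_mount() {
    local path="$1" limit="${2:-5}" rc
    # timeout(1) exits with status 124 when the time limit is reached.
    # Caveat: a stat blocked in uninterruptible sleep may ignore the
    # signal, in which case this call can still block.
    timeout "$limit" stat -- "$path" >/dev/null 2>&1
    rc=$?
    case "$rc" in
        0)   echo ok ;;
        124) echo hung ;;
        *)   echo stale-or-missing ;;
    esac
    return "$rc"
}
```

Calling check_mount /mnt/1 on a healthy mount prints ok. Any other answer suggests going straight to the approaches below.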

1. Lazy Unmount

When umount -f fails, try lazy unmounting:

sudo umount -l /mnt/1

This detaches the filesystem from the directory tree immediately and defers cleanup of the remaining references until it is no longer busy.
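
Because a lazy unmount returns right away, it is worth checking that the mount point has actually left the kernel's mount table. Here is a small sketch that reads /proc/self/mounts instead of touching the mount itself. The function name is_mounted and its optional second argument are assumptions made for illustration.

```shell
# is_mounted MOUNTPOINT [MOUNTS_FILE]
# Succeeds if MOUNTPOINT is listed as a mount point (field 2) in the table.
# Note: the kernel writes spaces in mount points as \040 in this file.
is_mounted() {
    local target="$1" table="${2:-/proc/self/mounts}"
    awk -v t="$target" '$2 == t { found = 1 } END { exit !found }' "$table"
}

# Possible use after "sudo umount -l /mnt/1":
#   if is_mounted /mnt/1; then echo "still mounted"; else echo "detached"; fi
```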

2. Killing Processes Holding References

First identify processes with open handles:

sudo lsof +D /mnt/1
sudo fuser -vm /mnt/1

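Then force kill them if necessary. Note that lsof needs +D to match files below the mount point, and that processes stuck in uninterruptible sleep (state D) will not die even with SIGKILL:

sudo kill -9 $(sudo lsof -t +D /mnt/1)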
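
On a stale mount, lsof +D may block because it walks the directory tree. As an alternative, here is a rough sketch that only reads the symlinks under /proc, which should not need to access the stale filesystem. The name pids_under is hypothetical. It does not detect memory-mapped files, and it needs root to see other users' processes.

```shell
# pids_under DIR
# Print PIDs whose cwd, root, or an open file descriptor points at DIR
# or anything below it, judged only from the /proc symlink targets.
pids_under() {
    local dir="${1%/}" pid link target
    for pid in /proc/[0-9]*; do
        for link in "$pid"/cwd "$pid"/root "$pid"/fd/*; do
            # Unreadable or vanished entries are skipped silently.
            target=$(readlink "$link" 2>/dev/null) || continue
            case "$target" in
                "$dir"|"$dir"/*) echo "${pid#/proc/}"; break ;;
            esac
        done
    done
}
```

To run it with root privileges, save the function in a file and source it from a root shell, since sudo cannot call a shell function directly.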

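3. Inspecting the Mount via /proc

For extreme cases where the mount point is truly stuck, look at the kernel's own view of it:

# Find the mount ID (first field) and the mount options
grep /mnt/1 /proc/self/mountinfo

The files under /proc are a read-only view of the kernel's mount table. There is no /proc or sysctl entry that deletes a mount, so the entry disappears only once an unmount (lazy or otherwise) succeeds or the machine reboots.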
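
Mount IDs are not listed in /proc/mounts. They are the first field of /proc/self/mountinfo. Here is a minimal sketch for pulling them out, assuming the field layout documented in proc(5). The function name mount_info and the optional file argument are illustrative only.

```shell
# mount_info MOUNTPOINT [MOUNTINFO_FILE]
# Print "mount_id parent_id fstype" for every mount at MOUNTPOINT.
mount_info() {
    local target="$1" table="${2:-/proc/self/mountinfo}"
    awk -v t="$target" '
        $5 == t {
            # Fields 7.. are optional tags terminated by a lone "-";
            # the filesystem type is the field right after that marker.
            for (i = 7; i <= NF; i++)
                if ($i == "-") { print $1, $2, $(i + 1); break }
        }' "$table"
}
```

If several filesystems are stacked on the same directory, each one is printed on its own line.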

4. Alternative: Remount as Read-Only First

Sometimes changing the mount state helps:

sudo mount -o remount,ro /mnt/1
sudo umount /mnt/1

To avoid stale mounts in the first place:

  • Always unmount AUFS/NFS mounts cleanly before network disconnections
  • Consider using autofs for automatic mount management
  • Implement proper error handling in scripts that work with these mounts
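
For scripts, the unmount attempts described above can be chained from least to most drastic. This is only a sketch under assumptions: try_unmount is a hypothetical name, and the SUDO variable is an invented override (set it to an empty string when already running as root). It also does not protect against an umount call that blocks.

```shell
# try_unmount MOUNTPOINT
# Prints which step succeeded: plain, force, remount-ro, lazy, or failed.
# SUDO is a hypothetical override; it defaults to "sudo" when unset.
try_unmount() {
    local mp="$1" sudo_cmd="${SUDO-sudo}"
    $sudo_cmd umount "$mp" 2>/dev/null && { echo plain; return 0; }
    $sudo_cmd umount -f "$mp" 2>/dev/null && { echo force; return 0; }
    # Remounting read-only sometimes lets a normal unmount go through.
    if $sudo_cmd mount -o remount,ro "$mp" 2>/dev/null &&
       $sudo_cmd umount "$mp" 2>/dev/null; then
        echo remount-ro; return 0
    fi
    $sudo_cmd umount -l "$mp" 2>/dev/null && { echo lazy; return 0; }
    echo failed
    return 1
}
```

The ordering keeps the lazy unmount as the last resort, since it hides the mount without guaranteeing that the underlying references are gone.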

We've all been there - you're working with layered filesystems and suddenly find yourself unable to unmount a directory due to those dreaded "Stale NFS file handle" errors. Here's what I recently encountered:

aufs on /mnt/1 type aufs (rw,relatime,si=daab1cec23213eea)

And every attempt to unmount fails with:

$ sudo umount -f /mnt/1
umount2: Stale NFS file handle
umount: /mnt/1: Stale NFS file handle
umount2: Stale NFS file handle
umount2: Stale NFS file handle

The -f flag is intended for unreachable NFS servers, but with AUFS sitting on top of OpenAFS it is not enough, and we need a more surgical approach. The issue stems from the underlying network filesystem becoming unreachable while processes still hold references to it.

Here are three approaches I've successfully used in production environments:

1. The Lazy Unmount Approach

sudo umount -l /mnt/1

This detaches the filesystem immediately and cleans up the remaining references once they are no longer busy. In my experience it works in roughly 60% of cases with AUFS.

2. The Nuclear Option: Manual Cleanup

When -l fails, you'll need to:

# Find processes using the mount
sudo lsof +D /mnt/1

# Kill offending processes
sudo kill -9 [pid1] [pid2] [pid3]

# Attempt unmount again
sudo umount /mnt/1
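
Going straight to kill -9 gives processes no chance to flush or clean up. A gentler variant sends SIGTERM first and escalates only for the survivors. This sketch assumes you already have the PIDs (for example from lsof). The name stop_pids is hypothetical, and it must run as root to signal other users' processes.

```shell
# stop_pids GRACE PID...
# Send SIGTERM, wait up to GRACE seconds for the processes to exit,
# then send SIGKILL to whatever is left.
stop_pids() {
    local grace="$1" waited=0 alive pid
    shift
    kill -TERM "$@" 2>/dev/null || true
    while [ "$waited" -lt "$grace" ]; do
        alive=0
        for pid in "$@"; do
            # kill -0 only tests whether the process still exists.
            if kill -0 "$pid" 2>/dev/null; then alive=1; fi
        done
        [ "$alive" -eq 0 ] && return 0
        sleep 1
        waited=$((waited + 1))
    done
    kill -KILL "$@" 2>/dev/null || true
}
```

A process blocked in uninterruptible sleep on the stale mount will survive even SIGKILL. In that case only a lazy unmount or a reboot remains.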

3. The Filesystem Table Hack

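For persistent cases on older systems where /etc/mtab is a regular file, you can edit it (but be careful!). This only removes the stale entry from the output of mount and df; it does not unmount anything in the kernel. On most current distributions /etc/mtab is a symlink to /proc/self/mounts and cannot be edited: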

# First make a backup
sudo cp /etc/mtab /etc/mtab.bak

# Then remove the problematic entry
sudo nano /etc/mtab
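
Whether editing is possible at all depends on what /etc/mtab is on your system. Here is a small check, offered as a sketch. The name mtab_kind and the labels it prints are made up for illustration.

```shell
# mtab_kind [PATH]
# Report whether the mount table file is a symlink into /proc
# (kernel-maintained, not editable) or a regular file.
mtab_kind() {
    local f="${1:-/etc/mtab}" target
    if [ -L "$f" ]; then
        target=$(readlink "$f")
        case "$target" in
            /proc/*|../proc/*) echo kernel-view ;;
            *)                 echo symlink ;;
        esac
    elif [ -f "$f" ]; then
        echo regular-file
    else
        echo missing
    fi
}
```

Only the regular-file case leaves anything to edit. With kernel-view, the listing changes only when the kernel itself drops the mount.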

Some pro tips to avoid this situation:

  • Change AUFS branches in place with remount options (for example, mount -o remount,mod:<branch>=ro /mnt/1) rather than unmounting and remounting
  • Implement proper error handling in scripts accessing these mounts
  • Consider using autofs for better mount management

While rebooting would solve the immediate problem, these solutions let you keep your uptime while dealing with stubborn AUFS mounts. The key is understanding that AUFS+OpenAFS combinations require special handling compared to standard filesystems.