How to Fix “Stale NFS File Handle” Error After Server/Client Reboot in Linux


Many sysadmins encounter this frustrating scenario: Your NFS shares work perfectly until the next reboot cycle, then clients suddenly report "Stale NFS file handle" errors. Let me walk through the complete diagnostic and resolution process I've used in production environments.

The "stale file handle" error occurs when the client's cached file references no longer match the server's actual state. This typically happens when:

  • The server's filesystem gets recreated or reformatted
  • NFS exports get reorganized without proper remounting
  • Inode numbers change after server maintenance
  • The server's nfsd service restarts with different configurations
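
One quick way to confirm the mismatch is to compare the inode number the client has cached with what the server reports for the same path. The file name below is just a placeholder for any file on the export:

# On the client: stat may fail outright with "Stale file handle"
stat -c '%i %n' /data/somefile

# On the server: check the same path locally
stat -c '%i %n' /data/somefile

# Differing inode numbers, or an error on the client only, confirm that the
# client is holding handles for objects the server no longer recognizes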

First, confirm your exports are properly configured and active:

# Verify exports file
cat /etc/exports
/data 192.168.1.0/24(rw,no_subtree_check,async,no_root_squash)

# Check active exports
exportfs -v
/data      192.168.1.0/24(rw,wdelay,no_root_squash,no_subtree_check)

When the error occurs, these steps typically resolve it:

# Force unmount the stale mount point
umount -f -l /data

# Drop the client's cached dentries and inodes (and page cache) so the stale
# handles are discarded; echo 1 would drop only the page cache
echo 3 > /proc/sys/vm/drop_caches

# Remount the share
mount -t nfs server:/data /data
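
Before moving on, it's worth a quick sanity check that the share is mounted and readable again:

# Confirm the mount is active and the export is readable
mount | grep /data
ls /data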

To prevent recurrence after reboots:

# On server: Edit /etc/exports with these options
# (fsid pins the filesystem identifier used in file handles so they survive
#  device renumbering across reboots; fsid=0 also marks the NFSv4 root export)
/data 192.168.1.0/24(rw,sync,no_subtree_check,fsid=0)

# On client: Use more resilient mount options in /etc/fstab
# ("bg" retries a failed mount in the background at boot; "intr" is still
#  accepted but has been a no-op since kernel 2.6.25)
server:/data /data nfs rw,bg,hard,intr,rsize=8192,wsize=8192,timeo=14 0 0
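
You can exercise the new fstab entry immediately instead of waiting for a reboot:

# Unmount the share and let fstab remount it; option typos surface right away
umount /data
mount -a
mount | grep /data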

For persistent cases, use these diagnostic tools:

# Check NFS server status
rpcinfo -p

# Show NFS client call statistics
nfsstat -c

# Watch per-mount NFS I/O statistics in near real time (nfsiostat ships with nfs-utils)
nfsiostat 1

If the error persists after trying all of the above, work through these steps; a command sketch follows the list:

  1. Restart NFS services on both ends
  2. Consider using autofs for dynamic mounting
  3. Map any suspect inode numbers back to a path with 'find /data -inum <N>'
  4. Verify network connectivity isn't dropping packets
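
A minimal sketch of steps 1 and 4, assuming systemd (the NFS unit is nfs-server on RHEL/Fedora and nfs-kernel-server on Debian/Ubuntu):

# Step 1: restart NFS on the server, then remount on each client
systemctl restart nfs-server            # on the server
umount -f -l /data && mount /data       # on the client (uses the fstab entry above)

# Step 4: check for packet loss between client and server
ping -c 50 server_ip | tail -2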

The rest of this guide walks through the same remediation end to end in more detail: verifying the setup on both machines, remounting cleanly, hardening the export and mount options, enabling debugging, and switching to automount for dynamic environments.

First verify the basic NFS setup on both machines:

# On server
showmount -e localhost
# Should display:
# /data 192.168.1.0/24

# On client
showmount -e nfs_server_ip
# Should match server's output
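
If the client's output differs from the server's, or showmount simply hangs, first make sure the client can reach the server's RPC services at all:

# From the client: list the NFS-related RPC services the server advertises
rpcinfo -p nfs_server_ip | grep -E 'nfs|mountd'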

Here's the complete remediation process:

# On all clients:
umount -f -l /data
# -f forces unmount
# -l lazy unmount for busy mounts

# Then remount:
mount -t nfs server_ip:/data /data

For automated remounting after reboots, add to /etc/fstab:

server_ip:/data  /data  nfs  rw,hard,intr  0  0

Modify your /etc/exports on the server for better resilience:

/data 192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash,fsid=0)
# Then reload exports:
exportfs -ra

When the issue persists, enable NFS debugging:

# On client: enable NFS client debug logging in the kernel
rpcdebug -m nfs -s all
# Reproduce the error, then check logs:
dmesg | tail -20

# On server:
rpcdebug -m nfsd -s all
# Monitor operations:
nfsstat -o all
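
Debug logging is verbose, so switch it back off once you have captured what you need:

# Clear the debug flags again on both sides
rpcdebug -m nfs -c all     # on the client
rpcdebug -m nfsd -c all    # on the server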

Consider these production-grade optimizations:

# Server's /etc/exports:
/data 192.168.1.0/24(rw,sync,no_wdelay,insecure_locks,no_root_squash)

# Client's mount options:
mount -t nfs -o hard,intr,timeo=30,retrans=3,rsize=32768,wsize=32768 server_ip:/data /data
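
After mounting, check which options were actually negotiated; the server caps rsize and wsize, so what you request is not always what you get:

# Show each NFS mount with its effective options
nfsstat -m
# Or read them straight from the kernel
grep nfs /proc/mounts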

For dynamic environments, configure automount:

# Install autofs (apt on Debian/Ubuntu; use dnf/yum on RHEL-family systems)
apt install autofs

# Configure /etc/auto.master (remove any static /data entry from /etc/fstab
# first, since autofs will manage this mount point):
/data /etc/auto.nfs --timeout=60

# Create /etc/auto.nfs:
* -fstype=nfs,hard,intr server_ip:/data/&

# Restart service:
systemctl restart autofs
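
To confirm on-demand mounting works, access a subdirectory of the export; "share1" below is just a placeholder for a directory that actually exists on the server:

# Accessing the path makes autofs mount it on demand
ls /data/share1
mount | grep autofs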