Debugging Linux ls Command Hanging in /var/www Directory: NFS Mounts, Permission Issues and Kernel Blocking Analysis



When ls hangs in one specific directory such as /var/www while find and tab completion still work normally, start with a few basic checks before looking at the likely causes:

# Basic diagnostics to run immediately:
df -h /var/www          # Check disk space
df -i /var/www          # Check inode usage
mount | grep /var/www   # Check mount options
ls -ld /var/www         # Check directory permissions

The dmesg output shows that the ls process is blocked in kernel space. The call trace ends in __mutex_lock_slowpath, which means the task is sleeping while it waits for a mutex that another task holds:

# Key indicators from dmesg:
INFO: task ls:23579 blocked for more than 120 seconds.
Call Trace:
 [<ffffffff8119d279>] ? __find_get_block+0xa9/0x200
 [<ffffffff814c97ae>] __mutex_lock_slowpath+0x13e/0x180
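
Because the trace reports a task stuck inside the kernel, it can help to see which processes are currently in uninterruptible sleep (state D) and which kernel function they are waiting in. The following is a minimal sketch, assuming the procps version of ps; filter_blocked is a hypothetical helper name, not part of any existing tool:

```shell
# filter_blocked: keep the header line plus every process whose STAT
# column starts with "D" (uninterruptible sleep, typically waiting on
# I/O or a kernel lock).
# Expects `ps -eo pid,stat,wchan:32,comm` style input on stdin.
filter_blocked() {
  awk 'NR == 1 || $2 ~ /^D/'
}

# Typical use (WCHAN names the kernel function the task sleeps in,
# when the kernel exposes it):
#   ps -eo pid,stat,wchan:32,comm | filter_blocked
```

If the hung ls appears here together with another process in state D, that second process is a candidate for the holder of the contended lock.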

The strace output shows the directory being opened and read successfully, with the hang coming afterwards. Potential causes include:

  • NFS or network-mounted filesystem timeouts
  • Filesystem corruption in metadata areas
  • Kernel bugs in specific versions handling certain inode types
  • Broken symlinks or circular references
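
To narrow down the first cause without touching the suspect directory itself, the mount table can be read from /proc/mounts. This is a rough sketch; mount_for_path is a hypothetical helper, and it assumes mount points without spaces (the kernel escapes those as \040 in /proc/mounts):

```shell
# mount_for_path: print "fstype mountpoint options" for the mount that
# contains the path given as $1, reading a /proc/mounts-style table on stdin.
# The longest matching mount point wins; on ties the later entry wins,
# which matches how over-mounts shadow earlier mounts.
mount_for_path() {
  awk -v path="$1" '
    {
      mp = $2
      # A mount point matches if it is "/", equals the path,
      # or is a parent directory of the path.
      if (mp == "/" || path == mp || index(path, mp "/") == 1) {
        if (length(mp) >= best_len) {
          best_len = length(mp)
          best = $3 " " mp " " $4
        }
      }
    }
    END { if (best != "") print best }'
}

# Typical use:
#   mount_for_path /var/www < /proc/mounts
```

A first field of nfs, nfs4 or cifs suggests looking at the network mount before anything else; ext3, ext4 or xfs on the root volume points more towards the local filesystem.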

To pinpoint the exact issue, try these investigative commands:

# Check for NFS issues:
nfsstat -m

# Test whether the directory accepts writes (a read-only remount often follows filesystem errors):
sudo touch /var/www/testfile
sudo rm /var/www/testfile

# Alternative directory listing methods:
ls -U /var/www       # Unsorted, directory order (may bypass sorting hangs)
ls -f /var/www       # Unsorted, implies -a, and disables color and -l
ls --hide="*" /var/www  # Read the directory but print no entries

While investigating, these approaches can help maintain productivity:

# Use alternative commands:
find /var/www -maxdepth 1 -printf '%f\n'

# Create an alias for safe listing:
alias safels='ls -U --color=never'

# Let the shell expand the glob instead of calling ls:
echo /var/www/*

After identifying the root cause, consider these fixes:

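# For NFS issues (soft mounts can cause silent data corruption on timeouts;
# the intr option is deprecated and ignored after kernel 2.6.25; some NFS
# options only take effect after a full unmount and mount):
sudo mount -o remount,soft /var/www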

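# For filesystem corruption (run fsck only on an unmounted filesystem; for the root volume, use rescue mode or a check at boot):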
sudo fsck /dev/mapper/vg_dev-lv_root

# For SELinux labeling problems (restores the default security contexts):
sudo restorecon -Rv /var/www

# Silence the hung-task warning (temporary; this hides the message but does not unblock the task):
echo 0 | sudo tee /proc/sys/kernel/hung_task_timeout_secs

To avoid recurrence:

  • Implement monitoring for hung processes
  • Regularly check filesystem integrity
  • Keep kernel and filesystem utilities updated
  • Consider using ls --color=never in scripts
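
As one possible starting point for the monitoring item, the hung-task messages that the kernel already writes can be turned into a compact report. This is only a sketch: parse_hung_tasks is a hypothetical helper, and it assumes the message wording shown in the dmesg excerpt in this article:

```shell
# parse_hung_tasks: read kernel log lines on stdin and print
# "command pid seconds" for every hung-task report found.
# The greedy \(.*\) keeps command names that contain colons intact
# (for example kworker/0:1).
parse_hung_tasks() {
  sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds.*/\1 \2 \3/p'
}

# Typical use, for example from a periodic job:
#   dmesg | parse_hung_tasks
```

Any output from such a job could be forwarded to whatever alerting is already in place.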

To recap the symptoms: listing the contents of /var/www with ls hangs indefinitely, while find and tab completion on the same directory work normally. Key observations:

# These work:
$ find /var/www
$ cd /var/www/<TAB> # shows completion

# These hang:
$ ls /var/www
$ ls -l /var/www

The dmesg output reveals the ls process is blocked in kernel space:

INFO: task ls:23579 blocked for more than 120 seconds.
Call Trace:
 [<ffffffff8119d279>] ? __find_get_block+0xa9/0x200
 [<ffffffff814c97ae>] __mutex_lock_slowpath+0x13e/0x180

This points to lock contention inside the kernel's filesystem code. Areas to check include:

  • Broken symbolic links
  • NFS mounts (if applicable)
  • Filesystem corruption
  • Mount point issues

Try these troubleshooting steps:

# Check for broken symlinks
$ find /var/www -type l -exec test ! -e {} \; -print

# Check mount points
$ mount | grep /var/www

# Alternative directory listing
$ ls -U /var/www  # Unsorted (might bypass sorting issues)

The strace output shows successful directory operations:

getdents(3, /* 16 entries */, 32768)    = 488
getdents(3, /* 0 entries */, 32768)     = 0

Yet the command hangs afterwards, suggesting the issue occurs during:

  • Permission checking
  • File stat operations
  • Color formatting output
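
If a stat call on one particular entry is the culprit, probing each entry separately can show which one it is. The sketch below assumes GNU find, stat and timeout; probe_entries is a hypothetical helper name. One caveat: a task blocked in an uninterruptible kernel wait, like the mutex wait in the trace above, ignores even SIGKILL, so timeout may itself end up waiting. It is best run in a terminal you can afford to lose.

```shell
# probe_entries DIR [SECONDS]
# Enumerate DIR with find (which works in the scenario described here)
# and stat each entry under a time limit, printing one status line per entry.
probe_entries() {
  dir=$1
  limit=${2:-5}
  find "$dir" -mindepth 1 -maxdepth 1 -exec sh -c '
    limit=$1
    shift
    for entry do
      # -k 1: send SIGKILL one second after the initial SIGTERM if needed.
      if timeout -k 1 "$limit" stat -- "$entry" >/dev/null 2>&1; then
        printf "ok\t%s\n" "$entry"
      else
        printf "SLOW-OR-FAILED\t%s\n" "$entry"
      fi
    done' sh "$limit" {} +
}

# Typical use:
#   probe_entries /var/www 5
```

An entry reported as SLOW-OR-FAILED, or one at which the loop stops making progress, is the one to inspect more closely, for example as a stale mount point or a damaged inode.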

Immediate fixes to try:

# Skip color formatting
$ ls --color=never /var/www

# Use alternative tools
$ tree -L 1 /var/www
$ vdir /var/www

# Disable NFS attribute caching (if applicable; this carries a significant performance penalty)
$ sudo mount -o remount,noac /var/www

For a long-term solution:

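# Check filesystem (only while it is unmounted, e.g. from rescue mode)
$ sudo fsck /dev/mapper/vg_dev-lv_root

# Re-run the listing as root. This only reads the directory and does not rebuild
# any index; on ext3/ext4, e2fsck -fD on the unmounted device rebuilds directory indexes
$ sudo bash -c "cd /var/www && ls -la > /dev/null"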

# Check for SELinux context
$ ls -Z /var/www
$ sudo restorecon -Rv /var/www

For deeper analysis:

# Dump all task stacks to the kernel log (read with dmesg); use 'w' instead of 't' to dump only blocked tasks
$ echo 1 | sudo tee /proc/sys/kernel/sysrq
$ echo t | sudo tee /proc/sysrq-trigger

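# Search the kernel log for filesystem messages (path is distro-specific: /var/log/kern.log on Debian/Ubuntu, /var/log/messages on RHEL/CentOS)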
$ grep -i "filesystem" /var/log/kern.log