Resolving 'Device or Resource Busy' Errors When Mounting/fscking Partitions on HP Proliant Servers


After an improper shutdown of our CentOS 5.9 server running on an HP Proliant, we encountered a particularly stubborn issue with the /home partition. The system insisted the device was busy despite all evidence to the contrary:

# mount -t ext3 /dev/cciss/c0d0p1 /home
mount: /dev/cciss/c0d0p1 already mounted or /home busy

# fsck /dev/cciss/c0d0p1
fsck.ext3: Device or resource busy while trying to open /dev/cciss/c0d0p1
Filesystem mounted or opened exclusively by another program?

Standard troubleshooting tools showed no obvious culprits:

# lsof /dev/cciss/c0d0p1
(no output)

# fuser /dev/cciss/c0d0p1
(no output)

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/cciss/c0d1p3    190G   24G  156G  14% /
(no /home partition shown)

Booting from a CentOS LiveCD revealed the partition was actually healthy:

  • Mount/unmount operations worked normally
  • fsck completed without errors
  • Journal inode could be cleared and rebuilt

The critical breakthrough came when examining device mapper information:

# dmsetup table
mpath0: 0 3516173232 multipath 1 queue_if_no_path 0 1 1 round-robin 0 1 1 104:0 1000 
mpath0p1: 0 3516162552 linear 253:0 63

# multipath -ll
mpath0 (3600508b1001cb6e6453d25c4052abca5) dm-0 HP,LOGICAL VOLUME
[size=1.6T][features=1 queue_if_no_path][hwhandler=0][rw]

Flushing the multipath mappings resolved the issue:

# multipath -F
# multipath -ll
(no output - mappings cleared)

# mount -t ext3 /dev/cciss/c0d0p1 /home
(successful)

The HP Proliant's RAID controller combined with CentOS's multipath daemon created a situation where:

  1. The physical device (/dev/cciss/c0d0p1) was being managed by device mapper
  2. After improper shutdown, the multipath mappings weren't properly released
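  3. The device mapper map held the underlying device open exclusively inside the kernel, so mount and fsck failed with "busy" while lsof and fuser, which only report userspace processes, showed nothing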
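
A kernel-side holder like this can usually be seen through sysfs. The following is a minimal sketch, not part of the original troubleshooting session: it assumes the kernel exposes holders directories under /sys/block, and the helper name list_holders and its optional second argument (an alternate sysfs root) are made up for illustration. For a partition it also lists the holders of the parent disk, because a map built on the whole disk is enough to make its partitions busy.

```shell
# list_holders DEVICE [SYSFS_ROOT]   (hypothetical helper, for illustration only)
# Prints the kernel-level holders (for example dm-N devices) of a block
# device and, for a partition, of its parent disk as well.
list_holders() {
    dev=$1
    sysfs=${2:-/sys}
    # sysfs replaces "/" in device names with "!": cciss/c0d0p1 -> cciss!c0d0p1
    name=$(printf '%s' "${dev#/dev/}" | tr '/' '!')
    for node in "$sysfs"/block/"$name" "$sysfs"/block/*/"$name"; do
        if [ ! -d "$node" ]; then
            continue
        fi
        # Holders of the device itself
        if [ -d "$node/holders" ]; then
            ls "$node/holders"
        fi
        # For a partition, the parent directory is the whole disk
        parent=$(dirname "$node")
        if [ -d "$parent/holders" ]; then
            ls "$parent/holders"
        fi
    done
    return 0
}
```

Calling list_holders /dev/cciss/c0d0p1 on an affected system should name a dm device if a multipath map sits on the disk; empty output suggests the busy state has some other cause.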

To prevent recurrence, we added these settings to the defaults section of /etc/multipath.conf:

# vi /etc/multipath.conf
defaults {
    fast_io_fail_tmo 10
    dev_loss_tmo 30
    no_path_retry fail
}

When dealing with storage devices on HP Proliant servers running CentOS, you might encounter a particularly stubborn situation where the system insists a partition is busy despite all evidence to the contrary. Let's examine how to properly diagnose and resolve this multipath-related issue.

After an improper shutdown, you'll typically see:

# mount -t ext3 /dev/cciss/c0d0p1 /home
mount: /dev/cciss/c0d0p1 already mounted or /home busy

# fsck /dev/cciss/c0d0p1
fsck.ext3: Device or resource busy while trying to open /dev/cciss/c0d0p1

Yet standard diagnostics show no obvious mounts or processes:

# df
# mount
# lsof /dev/cciss/c0d0p1
# fuser /dev/cciss/c0d0p1  

The real issue often lies with the device mapper maintaining a hold on the storage:

# dmsetup table
mpath0: 0 3516173232 multipath 1 queue_if_no_path 0 1 1 round-robin 0 1 1 104:0 1000
mpath0p1: 0 3516162552 linear 253:0 63

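Even though the filesystem isn't mounted, the multipath map (mpath0) still holds the underlying device open inside the kernel, so any exclusive open of /dev/cciss/c0d0p1 by mount or fsck is refused. Because the holder is not a userspace process, lsof and fuser have nothing to report.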
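
To read the table output more easily, the map names can be paired with the major:minor numbers of the devices they sit on. This is a small sketch under the assumption that the table lines look like the ones shown above; the function name dm_backing_devices is hypothetical.

```shell
# dm_backing_devices   (hypothetical helper, for illustration only)
# Reads `dmsetup table` output on stdin and prints "<map> <major:minor>"
# for every underlying device referenced in a table line.
dm_backing_devices() {
    awk '{
        name = $1
        sub(/:$/, "", name)            # strip the trailing colon from the map name
        for (i = 4; i <= NF; i++)      # fields after start, length and target type
            if ($i ~ /^[0-9]+:[0-9]+$/)
                print name, $i
    }'
}
```

It would be used as dmsetup table | dm_backing_devices. In the output above, 104:0 should be the whole disk /dev/cciss/c0d0 (compare the numbers shown by ls -l /dev/cciss/c0d0), and 253:0 is dm-0, that is mpath0 itself, so mpath0p1 is stacked on top of mpath0.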

Here's the complete procedure to regain control of your storage:

# First verify multipath status
multipath -ll

# Flush all multipath mappings
multipath -F

# Verify all mappings are gone
multipath -ll

# Now attempt the mount
mount -t ext3 /dev/cciss/c0d0p1 /home

# Confirm successful mount
grep ' /home ' /proc/mounts
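
If the procedure is scripted, the final confirmation step can be made exact, so that a mount point such as /homework is not mistaken for /home. The following is a minimal sketch; the function name is_mounted and its optional second argument (an alternate mounts file) are assumptions made for illustration.

```shell
# is_mounted MOUNTPOINT [MOUNTS_FILE]   (hypothetical helper, for illustration only)
# Succeeds (exit status 0) only if MOUNTPOINT appears as the second field
# of a line in /proc/mounts, i.e. something is mounted exactly there.
is_mounted() {
    mountpoint=$1
    mounts=${2:-/proc/mounts}
    awk -v mp="$mountpoint" '$2 == mp { found = 1 } END { exit !found }' "$mounts"
}
```

A script could then run is_mounted /home || echo "/home is still not mounted" after the mount attempt.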

To avoid similar situations:

  • Implement proper shutdown procedures
  • Consider adding fast_io_fail_tmo 10 and dev_loss_tmo 30 to the defaults section of your multipath.conf
  • Regularly check multipath status with multipath -ll (multipath -v3 adds verbose debug output when you need more detail)

If the standard solution doesn't work, try these advanced techniques:

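# Forcefully remove ALL device mapper entries that are not in use
# (this also affects unused LVM volumes, so use with care)
dmsetup remove_all

# Or target specific devices, partition maps first:
# mpath0 cannot be removed while mpath0p1 still sits on top of it
dmsetup remove mpath0p1
dmsetup remove mpath0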

Remember that these more aggressive approaches should only be used when standard methods fail.
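
Before removing maps by hand, it can help to see which ones are idle. This sketch assumes the usual "Name:" and "Open count:" lines printed by dmsetup info; the function name idle_dm_maps is hypothetical.

```shell
# idle_dm_maps   (hypothetical helper, for illustration only)
# Reads `dmsetup info` output on stdin and prints the names of maps whose
# open count is 0, meaning nothing currently holds them open.
idle_dm_maps() {
    awk -F': *' '
        $1 == "Name"       { name = $2 }
        $1 == "Open count" { if ($2 + 0 == 0) print name }
    '
}
```

It would be used as dmsetup info | idle_dm_maps. A map that another map is stacked on reports a nonzero open count, so it only becomes removable once the map above it is gone.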