# Typical error seen in /var/log/messages
May 5 16:54:35 fs-2 kernel: ata1: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0xe frozen
May 5 16:54:35 fs-2 kernel: ata1: SError: { PHYRdyChg CommWake }
May 5 16:54:40 fs-2 kernel: ata1: link is slow to respond, please be patient (ready=0)
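The SErr value in the first log line is a bitmask, and the names on the second line are its decoded form. A minimal sketch of that decoding follows, assuming the SATA SError diagnostic layout where bit 16 is PHYRdyChg and bit 18 is CommWake (both consistent with the log above). The function name `decode_serr` is hypothetical, and the names for bits 17 and 26 are assumptions based on how libata usually labels them.

```shell
# Decode a subset of the SATA SError diagnostic bits into flag names.
# Only bits 16, 17, 18 and 26 are covered; other bits are ignored.
decode_serr() {
    local value=$(( $1 ))    # accepts hex input such as 0x50000
    local out="" entry bit name
    for entry in 16:PHYRdyChg 17:PHYInt 18:CommWake 26:DevExch; do
        bit=${entry%%:*}     # text before the colon: bit position
        name=${entry#*:}     # text after the colon: flag name
        if [ $(( value & (1 << bit) )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    echo "$out"
}
```

Calling `decode_serr 0x50000` prints `PHYRdyChg CommWake`, which matches the kernel's own decoding: the PHY changed its ready state and a wake signal was seen, as expected when a drive is pulled or inserted.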
From your lspci output, the problematic system uses:
00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
00:05.1 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
00:05.2 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a3)
The MCP55 chipset has limited hot-plug support. Compare this to your working system (www-1) which has:
09:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1064ET PCI-Express Fusion-MPT SAS (rev 04)
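Which rescan or reload method applies depends on the kernel driver bound to each controller, and lspci alone does not always show that. As a rough sketch, the driver can be read from the `driver` symlink under the PCI device in sysfs. The function name `pci_driver` and its optional second argument (an alternate sysfs root, useful only for testing) are hypothetical.

```shell
# Print the name of the kernel driver bound to a PCI device, or "none".
# $1: PCI address such as 0000:00:05.0
# $2: sysfs mount point (defaults to /sys)
pci_driver() {
    local dev="${2:-/sys}/bus/pci/devices/$1"
    if [ -L "$dev/driver" ]; then
        # The link target ends in the driver name, e.g. .../drivers/sata_nv
        basename "$(readlink "$dev/driver")"
    else
        echo "none"
    fi
}
```

For example, `pci_driver 0000:00:05.0` would be run once for each of the three MCP55 functions listed above; the name it prints is the module to look at in the later driver-reload steps.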
For AHCI-enabled controllers, try:
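# Rescan all channels, targets, and LUNs on one host. The scan file expects
# three space-separated fields; a bare "1" fails with "Invalid argument".
echo "- - -" > /sys/class/scsi_host/hostX/scan
# Alternative: rescan only channel 0, target 0, all LUNs
echo "0 0 -" > /sys/class/scsi_host/hostX/scan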
For legacy IDE/SATA controllers, force a rescan with:
# For the whole PCI bus (this file exists only on newer kernels; check before writing)
echo 1 > /sys/bus/pci/rescan
# For specific device (replace 00:05.0 with your device)
echo 1 > /sys/bus/pci/devices/0000:00:05.0/rescan
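When basic scans fail, try these deeper diagnostics. The ahci_* attributes appear only when the ahci driver is bound to the host, and not every kernel exposes all of the files below, so check that each one exists before using it: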
# Check link status
cat /sys/class/ata_link/link1/ata_device/dev_state
# Reset the entire port
echo 1 > /sys/class/scsi_host/hostX/ahci_port_hardreset
# Check supported features
cat /sys/class/scsi_host/hostX/ahci_host_caps
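Because sysfs attributes differ between kernels and drivers, a write to a missing file fails with a shell redirection error that is easy to misread. One possible guard is sketched below; `sysfs_write` is a hypothetical helper name, not an existing tool.

```shell
# Write a value to a sysfs attribute only if the attribute exists and is writable.
# $1: value to write, $2: full path of the attribute
sysfs_write() {
    local value="$1" attr="$2"
    if [ ! -w "$attr" ]; then
        echo "skip: $attr is missing or not writable" >&2
        return 1
    fi
    printf '%s\n' "$value" > "$attr"
}
```

A call such as `sysfs_write "- - -" /sys/class/scsi_host/host0/scan` then either performs the write or reports clearly that the attribute is not there.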
If the drive remains undetected, try unloading and reloading the driver module:
# For AHCI
modprobe -r ahci && modprobe ahci
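# For Intel controllers in legacy IDE mode
modprobe -r ata_piix && modprobe ata_piix
# For the nVidia MCP55 in IDE mode (driven by sata_nv, not ata_piix)
modprobe -r sata_nv && modprobe sata_nv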
Warning: This will temporarily disconnect ALL drives using that controller.
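Before unloading a driver it helps to know exactly which disks will disappear. The sketch below lists the block devices whose sysfs device path runs through a given PCI address. It assumes each `/sys/block/<disk>/device` symlink resolves to a path containing that address, and that `readlink -f` (GNU coreutils) is available. The function name `disks_on_pci_device` and its optional sysfs-root argument are hypothetical.

```shell
# List block devices that sit behind one PCI controller.
# $1: PCI address such as 0000:00:05.0
# $2: sysfs mount point (defaults to /sys)
disks_on_pci_device() {
    local addr="$1" sysfs="${2:-/sys}" link target
    for link in "$sysfs"/block/*/device; do
        [ -e "$link" ] || continue          # no match, or virtual device without a backing device
        target=$(readlink -f "$link")       # resolve to the full device path
        case "$target" in
            */"$addr"/*) basename "$(dirname "$link")" ;;
        esac
    done
}
```

If the output includes the disk that holds the root filesystem or an active RAID member, reloading the module is not a safe option on the running system.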
Consider these hardware/configuration improvements:
- Upgrade to AHCI-capable controllers (check BIOS settings)
- Use enterprise-grade HBAs with proper hot-plug support
- Enable PCIe hotplug in BIOS (if available)
- Consider SAS drives instead of SATA for critical storage
When replacing a failed SATA drive (/dev/sda) on a running RHEL 5.3 system, the kernel fails to recognize the new device despite multiple reset attempts. The logs show:
ata1: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0xe frozen
ata1: SError: { PHYRdyChg CommWake }
ata1: reset failed, giving up
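Common approaches like rescan-scsi-bus.sh and manual host scans failed. Note that the write error below comes from the argument itself: the scan file expects three space-separated fields ("- - -"), so "---" is rejected regardless of the controller: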
echo "---" > /sys/class/scsi_host/host0/scan
-bash: echo: write error: Invalid argument
/root/rescan-scsi-bus.sh -l
0 new device(s) found.
The issue stems from the NVIDIA MCP55 SATA controller's limited hot-swap support. For systems where it works (like our www-1 server), the key difference is an additional LSI SAS controller:
09:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1064ET
Try these steps in order when standard methods fail:
# 1. Remove the device path
echo 1 > /sys/block/sda/device/delete
# 2. Rescan every SCSI host (this covers the NVIDIA MCP55 ports)
for i in /sys/class/scsi_host/host*/scan; do
    echo "- - -" > "$i"
done
# 3. Manual AHCI reinitialization (these attributes may not exist on your kernel; skip if the files are absent)
echo 0 > /sys/class/scsi_host/host0/ahci_port_stop
sleep 2
echo 1 > /sys/class/scsi_host/host0/ahci_port_start
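After each of the steps above, the kernel may need a few seconds to bring the link up (the "link is slow to respond" message suggests as much), so checking immediately can give a false negative. A small polling loop is one way to wait. `wait_for_block_device` is a hypothetical helper, and the 30-second default is an arbitrary assumption.

```shell
# Wait until a block device shows up in sysfs, or give up after a timeout.
# $1: device name such as sda
# $2: timeout in seconds (defaults to 30)
# $3: sysfs mount point (defaults to /sys)
wait_for_block_device() {
    local name="$1" timeout="${2:-30}" sysfs="${3:-/sys}" waited=0
    while [ "$waited" -lt "$timeout" ]; do
        [ -e "$sysfs/block/$name" ] && return 0
        sleep 1
        waited=$((waited + 1))
    done
    # Final check; the exit status tells the caller whether the device appeared
    [ -e "$sysfs/block/$name" ]
}
```

For instance, `wait_for_block_device sda 30 || echo "sda still missing"` can follow each step, so that the next, more disruptive step runs only when the previous one did not help.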
As a last resort before rebooting:
# Unload and reload the controller driver (ahci here; use sata_nv if that is the driver bound to the MCP55)
modprobe -r ahci
sleep 5
modprobe ahci
For reliable hot-swap:
- Use enterprise-grade HBAs (LSI/Broadcom)
- Verify drive firmware supports hot-swap (Seagate ES.2 series does)
- Consider upgrading to RHEL 6+ with better AHCI support