Dell's PERC H710 RAID controller presents a particular monitoring challenge on CentOS 6 systems. The controller exposes only virtual disks to the operating system and hides the physical drives behind them, so a plain smartctl call against the block device returns nothing useful:
# smartctl -a /dev/sda
Device does not support SMART
Error Counter logging not supported
Device does not support Self Test logging
After extensive testing, I found that Dell's PERC CLI utility (perccli, Dell's build of LSI's storcli, the tool that superseded MegaCLI) provides the necessary access. However, the standard LSI tools don't support the H710 out of the box.
First, download the appropriate version for CentOS 6:
wget https://dl.dell.com/FOLDERXXXXXX/perccli-1.17.10-1.noarch.rpm
rpm -ivh perccli-1.17.10-1.noarch.rpm
Use the following command to view physical disk health:
# /opt/MegaRAID/perccli/perccli64 /c0 show all
Sample output showing healthy disks:
----------------------------------------------------------------------
EID:Slt DID State DG     Size Intf Med SED PI SeSz Model            Sp
----------------------------------------------------------------------
252:0     3 Onln   0 1.818 TB SATA HDD N   N   512 WDC WD20EFRX-68A U
252:1     4 Onln   0 1.818 TB SATA HDD N   N   512 WDC WD20EFRX-68A U
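The table above is whitespace-separated, so the slot and state columns can be pulled out with a short awk filter. The following is a minimal sketch, not part of the original setup: the function name list_pd_states is hypothetical, and it assumes every drive row starts with a numeric EID:Slt pair, as in the sample output.

```shell
#!/bin/bash
# list_pd_states: read perccli drive-table text on stdin and print
# "<EID:Slt> <State>" for each drive row.
# Assumption: drive rows start with a numeric enclosure:slot pair (e.g. 252:0)
# and the State value is the third whitespace-separated column.
list_pd_states() {
    awk '$1 ~ /^[0-9]+:[0-9]+$/ { print $1, $3 }'
}
```

Typical use would be to pipe the output of /opt/MegaRAID/perccli/perccli64 /c0 /eall /sall show into list_pd_states. Header, separator, and legend lines are skipped because their first column is not a numeric pair.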
Here's a bash script to check disk status and send email alerts:
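#!/bin/bash
LOG_FILE="/var/log/disk_monitor.log"
EMAIL="admin@example.com"
PERCCLI="/opt/MegaRAID/perccli/perccli64"
# Check disk status. Only drive rows (lines starting with an EID:Slt pair such
# as 252:0) are inspected, so the state names in the legend that perccli prints
# below the table cannot trigger a false alert.
DISK_STATUS=$("$PERCCLI" /c0 /eall /sall show | grep -E '^ *[0-9]+:[0-9]+ ' | grep -E 'Offln|Degraded|Failed')
if [ -n "$DISK_STATUS" ]; then
    echo "$(date) - Disk alert: $DISK_STATUS" >> "$LOG_FILE"
    echo "$DISK_STATUS" | mail -s "RAID Disk Alert" "$EMAIL"
fi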
Add this to crontab for hourly checks:
0 * * * * /path/to/disk_monitor.sh
For newer systems, consider migrating to storcli:
wget https://downloads.dell.com/FOLDERYYYYYY/storcli-1.21.06-1.noarch.rpm
rpm -ivh storcli-1.21.06-1.noarch.rpm
The equivalent check command becomes:
# /opt/MegaRAID/storcli/storcli64 /c0 /eall /sall show
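Since both tools accept the same /c0 /eall /sall show syntax, a monitoring script can use whichever binary is installed. The sketch below illustrates one way to do that; the helper name first_executable is hypothetical, and the two paths are the install locations used in this article.

```shell
#!/bin/bash
# first_executable: print the first argument that is an executable regular
# file and return 0; return 1 when none of the candidates qualifies.
first_executable() {
    local candidate
    for candidate in "$@"; do
        if [ -f "$candidate" ] && [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

A script could then set RAID_CLI=$(first_executable /opt/MegaRAID/storcli/storcli64 /opt/MegaRAID/perccli/perccli64) and exit early if the call fails, which avoids hard-coding one tool per host.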
- Always verify the download URLs from Dell's official support site
- Test email functionality before deploying in production
- Consider logging to syslog for better integration
- Monitor controller battery health separately
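For the syslog suggestion, the standard logger command is usually enough. This is a rough sketch under a few assumptions: the function names, the raid_monitor tag, and the daemon.warning priority are illustrative choices, not something the original script defines.

```shell
#!/bin/bash
# to_syslog_line: join multi-line text from stdin into a single line,
# separating the non-empty rows with "; " so one alert is one syslog entry.
to_syslog_line() {
    awk 'NF { printf "%s%s", sep, $0; sep = "; " } END { if (sep != "") print "" }'
}

# log_raid_alert: write one message to syslog via logger.
# The tag and priority below are assumptions; adjust them to local conventions.
log_raid_alert() {
    logger -t raid_monitor -p daemon.warning "RAID alert: $1"
}
```

Inside the alert branch of the script, log_raid_alert "$(echo "$DISK_STATUS" | to_syslog_line)" would record the same rows that go out by email.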
When working with Dell PowerEdge servers running CentOS 6 with PERC H710 RAID controllers, traditional monitoring tools hit significant limitations. The combination of:
- LSI MegaRAID SAS 2208 chipset (Thunderbolt architecture)
- CentOS 6's aging driver stack
- Dell's limited Linux support for legacy systems
creates a perfect storm for administrators needing physical disk health monitoring.
Standard approaches fail here:
# smartctl cannot access physical disks behind RAID
smartctl -a /dev/sda
# Output: "Device does not support SMART"
# megacli tools from LSI aren't compatible
rpm -ivh MegaCli-8.07.14-1.noarch.rpm
# Fails with driver conflicts
While not officially supported for CentOS 6, Dell OpenManage Server Administrator (OMSA) v7.4 can be forced to work:
# Add Dell repository
wget -q -O - http://linux.dell.com/repo/hardware/latest/bootstrap.cgi | bash
# Install specific components
yum install srvadmin-all libsmbios libcmpiCppImpl0 openwsman-client
# Start services
/opt/dell/srvadmin/sbin/srvadmin-services.sh start
Create a bash script to parse OMSA output:
#!/bin/bash
# monitor_raid.sh
ALERT_EMAIL="admin@example.com"
EXIT_CODE=0
# Check physical disk states; send one alert per distinct non-Online state
PDISK_REPORT=$(omreport storage pdisk controller=0)
DISK_STATUS=$(echo "$PDISK_REPORT" | awk '$1 == "State" {print $3}' | sort -u)
for status in $DISK_STATUS; do
    if [[ "$status" != "Online" ]]; then
        FAILED_DISKS=$(echo "$PDISK_REPORT" | grep -B 5 -A 3 "State.*$status")
        echo "$FAILED_DISKS" | mail -s "RAID Disk Alert on $(hostname)" "$ALERT_EMAIL"
        # Remember the failure instead of exiting, so the battery check still runs
        EXIT_CODE=1
    fi
done
# Check battery status. OMSA normally reports a healthy battery as "Ready";
# confirm the exact wording against your own omreport output.
BATTERY_HEALTH=$(omreport storage battery | awk '$1 == "State" {print $3}')
if [[ "$BATTERY_HEALTH" != "Ready" ]]; then
    echo "RAID battery problem: $BATTERY_HEALTH" | mail -s "RAID Battery Alert" "$ALERT_EMAIL"
    EXIT_CODE=1
fi
exit $EXIT_CODE
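When an alert fires, it helps to know which disk is affected rather than only the state. The following sketch pairs each disk ID with its state. It assumes omreport prints "Key : Value" lines in which a disk's ID line precedes its State line; the function name pdisk_states is hypothetical.

```shell
#!/bin/bash
# pdisk_states: read `omreport storage pdisk controller=N` text on stdin and
# print "<disk id> <state>" for every disk.
# Assumption: lines look like "ID    : 0:1:0" and "State : Online", and the
# ID line of a disk comes before its State line.
pdisk_states() {
    awk -F': ' '
        { key = $1; gsub(/^[ \t]+|[ \t]+$/, "", key) }
        key == "ID"    { id = $2 }
        key == "State" { print id, $2 }
    '
}
```

Piping omreport storage pdisk controller=0 through pdisk_states and then through grep -v ' Online$' would leave only the disks that need attention.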
Add to crontab for regular checks:
# Add to /etc/crontab
0 */4 * * * root /usr/local/bin/monitor_raid.sh
For systems where OMSA won't install, try this direct approach:
# Extract physical disk info
PDLIST=$(/opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL | grep -E "Firmware state|Slot Number")
# Parse output: remember the current slot, then report any state that is
# neither Online nor Hotspare
while read -r line; do
    if [[ $line == *"Slot"* ]]; then
        # Keep only the number after "Slot Number: "
        SLOT=${line#*: }
    elif [[ $line == *"Firmware"* ]]; then
        if [[ $line != *"Online"* && $line != *"Hotspare"* ]]; then
            echo "Disk in slot $SLOT failed: $line"
        fi
    fi
done <<< "$PDLIST"
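For cron use, the same parsing can be wrapped in a function that also sets an exit status, so the caller can decide whether to send mail. This is a sketch rather than a drop-in replacement: the function name report_bad_slots is hypothetical, and it assumes the "Slot Number: N" and "Firmware state: ..." line format that the loop above relies on.

```shell
#!/bin/bash
# report_bad_slots: read MegaCli -PDList text on stdin, print one line per
# disk whose firmware state does not start with Online or Hotspare, and
# return 1 if any such disk was found (0 otherwise).
report_bad_slots() {
    awk -F': *' '
        /^Slot Number/    { slot = $2 }
        /^Firmware state/ && $2 !~ /^(Online|Hotspare)/ {
            print "slot " slot ": " $2
            bad = 1
        }
        END { exit bad + 0 }
    '
}
```

A wrapper could run MegaCli64 -PDList -aALL, pipe it into report_bad_slots, and mail the captured output only when the function returns a non-zero status.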
- OMSA 7.4 is the last version with CentOS 6 compatibility
- MegaCLI 8.07.14 may work but requires manual driver patching
- Always test email alerts before production deployment