When working with Linux servers using software RAID, one of the most frustrating scenarios is dealing with identical-looking failed drives. Unlike hardware RAID controllers with indicator lights, software RAID setups on commodity hardware provide no visual cues when a drive fails.
The professional approach begins before failure occurs. Create a physical-to-logical mapping of your drives:
# Get drive serial numbers and device mappings
for drive in /dev/sd[a-f]; do
echo -n "$drive: "
sudo smartctl -i $drive | grep -i serial
done
# Sample output:
/dev/sda: Serial Number: WD-WCC4N5PH6K45
/dev/sdb: Serial Number: WD-WCC4N5PH6K46
/dev/sdc: Serial Number: WD-WCC4N5PH6K47
[...]
Document these serial numbers physically near your server or in your documentation system.
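A low-effort way to keep that mapping current is to capture the same loop's output to a dated file (the path below is just an example):
# Save the serial-to-device mapping alongside other host documentation
for drive in /dev/sd[a-f]; do
    echo -n "$drive: "
    sudo smartctl -i $drive | grep -i serial
done | sudo tee /root/drive-map-$(date +%Y%m%d).txt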
When a failure occurs, use these Linux commands to identify the problematic drive:
Method 1: Check RAID Status
# Check software RAID status
cat /proc/mdstat
sudo mdadm --detail /dev/md0
This will show which device is marked as failed or removed from the array.
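If you just want the failed member's device path rather than the full table, a small awk filter over the detail output works; mdadm lists failed members with the state "faulty":
# Print the device path of any member mdadm reports as faulty
sudo mdadm --detail /dev/md0 | awk '/faulty/ {print $NF}'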
Method 2: Cross-Reference with SMART Data
# Check all drives for SMART errors
for drive in /dev/sd[a-f]; do
echo "Checking $drive:"
sudo smartctl -H $drive | grep -i "test result"
done
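A passing health check doesn't always catch a dying disk, so it helps to also scan the attributes that most often precede failure (Reallocated_Sector_Ct and Current_Pending_Sector):
# Non-zero values here usually point to the drive worth replacing
for drive in /dev/sd[a-f]; do
    echo "=== $drive ==="
    sudo smartctl -A $drive | grep -E "Reallocated_Sector_Ct|Current_Pending_Sector"
done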
Method 3: Locate by Device ID
Match the failed device from RAID to physical slot:
# Find physical port mapping
ls -l /dev/disk/by-path/
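The by-path names encode the controller and port, so once you know which device failed you can filter for it directly (using /dev/sdd from the example below):
# Show which PCI controller and port the failed drive hangs off
ls -l /dev/disk/by-path/ | grep 'sdd$'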
When you need to physically locate the drive:
# Force sustained reads so the drive's activity LED blinks (if the slot has one)
sudo dd if=/dev/sdX of=/dev/null bs=1M count=4096
If the failed drive no longer responds to reads, run the same command against each healthy drive instead and look for the one that stays dark.
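If the drives sit in a backplane or enclosure with SES-managed slot LEDs, the ledmon package is a cleaner option than raw I/O; this assumes ledctl is installed and the enclosure actually supports locate LEDs:
# Light the locate LED on the suspect drive, then turn it off after the swap
sudo ledctl locate=/dev/sdd
sudo ledctl locate_off=/dev/sdd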
For systems without LED support:
- Note the failed device (e.g., /dev/sdd)
- Check physical connections to match SATA port numbers
- Use the serial number mapping you created earlier
To avoid the guesswork next time:
- Label drives with their device IDs when installing
- Document drive-to-port mapping in your wiki
- Implement monitoring that includes drive serial numbers (see the sketch after this list)
- Consider using drive trays with individual LEDs
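Here is a minimal sketch of that kind of monitoring, assuming mdadm-managed arrays and smartctl; the script name and log path are placeholders, and it's meant to run from cron:
#!/bin/bash
# raid-serial-check.sh (hypothetical) - capture serial-number context for a degraded array
LOG=/var/log/raid-check.log

# Only log when the kernel reports a failed member
if grep -q '(F)' /proc/mdstat; then
    {
        echo "$(date): RAID degradation detected"
        cat /proc/mdstat
        # Record serials so the bad drive can be matched to a physical disk later
        for drive in /dev/sd[a-f]; do
            echo -n "$drive: "
            smartctl -i "$drive" | grep -i serial
        done
    } >> "$LOG"
fi
mdadm's own --monitor mode (with MAILADDR set in mdadm.conf) handles the alerting itself; the point of the script is simply to capture serial numbers alongside the failure event.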
When working with commodity servers using software RAID, you'll often encounter identical-looking drives. Unlike enterprise storage with dedicated LED indicators, consumer-grade hardware requires smarter identification methods. Let me share the techniques I've developed over years of Linux sysadmin work.
First, gather intelligence while the system is running. The lsblk command gives you the block device hierarchy along with each drive's model and serial number:
lsblk -o NAME,MODEL,SERIAL,SIZE,ROTA,MOUNTPOINT
Sample output:
sda     ST4000DM004  ZDH1A2K3  3.7T  1
└─sda1                         3.7T     /mnt/data
sdb     ST4000DM004  ZDH1A2K8  3.7T  1
└─sdb1                         3.7T
  └─md127                      3.7T     /mnt/array
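lsblk can also print the SCSI host:channel:target:lun address directly (the HCTL column), which saves a trip into /sys when you only need the port:
# -d limits output to whole disks; HCTL shows the host:channel:target:lun address
lsblk -d -o NAME,HCTL,MODEL,SERIAL,SIZE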
Check your RAID status to identify the failed device:
cat /proc/mdstat
md127 : active raid5 sdb1[0] sdc1[2] sdd1[3](F) sde1[4]
11721038848 blocks super 1.2 level 5, 512k chunk
The (F) flag marks the failed drive. Now match this to physical devices.
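If you just want the failed member's name without reading the whole table, a quick grep over /proc/mdstat pulls it out:
# Print just the member(s) flagged as failed
grep -oE '[a-z0-9]+\[[0-9]+\][(]F[)]' /proc/mdstat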
Modern Linux provides multiple ways to map logical devices to physical ports:
ls -l /sys/block/sd*/device
This shows the SATA port connections:
lrwxrwxrwx 1 root root 0 Aug 1 09:00 /sys/block/sda/device -> ../../../0:0:0:0
lrwxrwxrwx 1 root root 0 Aug 1 09:00 /sys/block/sdb/device -> ../../../0:0:1:0
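On libata/SATA systems, the fully resolved sysfs path also contains the ataN port name, which usually matches the numbering silkscreened on the motherboard; this assumes onboard SATA rather than a SAS HBA:
# Print the ATA port each disk is attached to
for d in /sys/block/sd*; do
    echo "$(basename $d): $(readlink -f $d | grep -o 'ata[0-9]*' | head -1)"
done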
For drives still responding but showing errors:
for drive in /dev/sd[a-f]; do
echo "=== $drive ==="
smartctl -i $drive | grep -E "Model|Serial"
smartctl -H $drive | grep "test result"
done
When you finally open the case:
- Use SATA port numbers (often printed on the motherboard)
- Create a physical diagram during initial setup
- Temporarily label drives with an erasable marker
- Use drive serial numbers (printed on the drive label)
Here's a script I use for quick identification:
#!/bin/bash
echo "Physical Drive Identification Report"
echo "Generated: $(date)"
echo "------------------------------------"
for device in /sys/block/sd*; do
    devname=$(basename "$device")
    # Model string is exposed directly in sysfs (trim the padding spaces)
    model=$(sed 's/[[:space:]]*$//' "$device/device/model")
    # sysfs has no serial file for SATA disks, so ask lsblk for it
    serial=$(lsblk -dno SERIAL "/dev/$devname")
    # The resolved device path ends in the SCSI address host:channel:target:lun;
    # on onboard AHCI the host number usually tracks the SATA port
    scsi_addr=$(basename "$(readlink -f "$device/device")")
    echo "[SCSI $scsi_addr] $devname: $model (SN: $serial)"
done
For future-proofing:
- Document drive positions during initial setup
- Consider drive caddies with LED indicators
- Implement monitoring that logs physical locations (a cron sketch follows)
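Assuming you save the identification script above as /usr/local/sbin/drive-report.sh (name and path are placeholders), a daily cron entry builds that history for you:
# /etc/cron.d/drive-report - log which serial sat on which port, once a day
0 6 * * * root /usr/local/sbin/drive-report.sh >> /var/log/drive-report.log 2>&1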