How to Configure a Hot Spare Disk in Linux Software RAID 1 (mdadm) for Automatic Failover

In a software RAID 1 configuration with mdadm, a hot spare is an inactive disk that automatically replaces a failed member of the array. The key characteristics:

Remains idle until a failure occurs
Requires no manual intervention for failover
Must be equal to or larger than existing array members
Automatically synchronizes with working disks when activated

Before proceeding, check the current state of the array and confirm that the new disk is large enough:

# Verify current RAID status
cat /proc/mdstat
mdadm --detail /dev/md0

# Check disk sizes (the new disk should be ≥ the existing members)
lsblk -o NAME,SIZE,ROTA
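
The size requirement can be checked in a script before the disk is added. The following is a minimal sketch, assuming sizes are given in bytes (for example as reported by `lsblk -bdno SIZE <device>`); `spare_fits` is a hypothetical helper name, not an mdadm command.

```shell
#!/bin/sh
# spare_fits CANDIDATE_BYTES MEMBER_BYTES...
# Succeeds (exit 0) when the candidate is at least as large as the
# smallest listed member. Hypothetical helper for illustration.
spare_fits() {
    candidate=$1
    shift
    smallest=$1
    for size in "$@"; do
        # Track the smallest member size seen so far.
        [ "$size" -lt "$smallest" ] && smallest=$size
    done
    [ "$candidate" -ge "$smallest" ]
}

# Example on a real system (device names assumed from the text):
#   spare_fits "$(lsblk -bdno SIZE /dev/sdd)" \
#              "$(lsblk -bdno SIZE /dev/sda)" \
#              "$(lsblk -bdno SIZE /dev/sdb)" \
#              "$(lsblk -bdno SIZE /dev/sdc)" && echo "large enough"
```

Comparing against the smallest member is a deliberate choice here, since that is the size a replacement has to match.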

Assuming your existing array is /dev/md0 with three disks and you're adding /dev/sdd:

# 1. Prepare the new disk (if needed)
parted /dev/sdd mklabel gpt
parted /dev/sdd mkpart primary 1MiB 100%

# 2. Clear any stale RAID superblock (only needed if the partition was previously part of an array)
mdadm --zero-superblock /dev/sdd1

# 3. Add as spare to existing array
mdadm --add /dev/md0 /dev/sdd1

# 4. Verify spare status
mdadm --detail /dev/md0 | grep -A5 'Spare Devices'

After adding the spare, monitor its status:

watch cat /proc/mdstat

# Check detailed array status
mdadm --detail /dev/md0 | grep -E 'State|Spare'
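
The spare count can also be extracted from the `mdadm --detail` output for use in a script or a cron check. This is a minimal sketch, assuming the `Spare Devices : N` summary line that `mdadm --detail` prints; `spare_count` is a hypothetical helper and the sample text is illustrative, not captured from a real system.

```shell
#!/bin/sh
# spare_count: read `mdadm --detail` output on stdin and print the
# number shown on the "Spare Devices" line. Hypothetical helper.
spare_count() {
    awk -F: '/Spare Devices/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Illustrative excerpt; real output contains many more fields.
sample='   Active Devices : 3
  Working Devices : 4
   Failed Devices : 0
    Spare Devices : 1'

printf '%s\n' "$sample" | spare_count
```

On a live system the same function would be fed with `mdadm --detail /dev/md0 | spare_count`.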

To simulate a disk failure (for testing only):

# Mark disk as faulty
mdadm --manage /dev/md0 --fail /dev/sda1

# Verify automatic replacement
watch -n 1 'mdadm --detail /dev/md0 | grep -A10 "Number"'
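
During such a test it can help to list which members the kernel currently flags as faulty. The sketch below assumes the `(F)` suffix that /proc/mdstat appends to failed members; `failed_members` is a hypothetical helper and the sample is illustrative.

```shell
#!/bin/sh
# failed_members ARRAY: read /proc/mdstat-style text on stdin and print
# the members of ARRAY (e.g. md0) that carry the (F) faulty flag.
failed_members() {
    awk -v md="$1" '$1 == md && $2 == ":" {
        for (i = 3; i <= NF; i++)
            if ($i ~ /\(F\)$/) {
                sub(/\[.*/, "", $i)   # strip "[0](F)" to leave the name
                print $i
            }
    }'
}

# Illustrative sample of an array rebuilding onto its spare.
sample='Personalities : [raid1]
md0 : active raid1 sdd1[3] sdc1[2] sdb1[1] sda1[0](F)
      1046528 blocks super 1.2 [3/2] [_UU]

unused devices: <none>'

printf '%s\n' "$sample" | failed_members md0
```

On a real system: `failed_members md0 < /proc/mdstat`.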

Add to /etc/mdadm.conf to persist after reboot:

ARRAY /dev/md0 metadata=1.2 spares=1 name=myserver:0 UUID=xxxxxxx

Enable email alerts by setting MAILADDR in mdadm.conf
Consider adding multiple spares for critical systems

Performance considerations:

An idle spare has no performance impact on the active array
While the array rebuilds onto the spare, expect reduced I/O performance
Monitor with: iostat -x 1

If the spare doesn't activate:

Verify spare is properly added to array
Check kernel logs: dmesg | grep md
Ensure the mdadm monitor daemon is running (the mdmonitor service on CentOS 7) so that failure events are reported
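
A quick first check when a spare has not taken over is whether the kernel reports the array as degraded at all. This is a minimal sketch, assuming the `[total/active]` counter that /proc/mdstat prints on each array's status line (for example `[3/2]`); `degraded_arrays` is a hypothetical helper name and the sample is illustrative.

```shell
#!/bin/sh
# degraded_arrays: read /proc/mdstat-style text on stdin and print the
# name of every array whose active count is below its total count.
degraded_arrays() {
    awk '
        $2 == ":" && $1 ~ /^md/ { name = $1 }
        match($0, /\[[0-9]+\/[0-9]+\]/) {
            # Turn "[3/2]" into n[1]=3 (total) and n[2]=2 (active).
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            if (n[2] + 0 < n[1] + 0) print name
        }'
}

# Illustrative sample: md1 is healthy, md0 has lost a member.
sample='Personalities : [raid1]
md1 : active raid1 sdf1[1] sde1[0]
      524224 blocks super 1.2 [2/2] [UU]

md0 : active raid1 sdd1[3](S) sdc1[2] sdb1[1] sda1[0](F)
      1046528 blocks super 1.2 [3/2] [_UU]

unused devices: <none>'

printf '%s\n' "$sample" | degraded_arrays
```

On a real system: `degraded_arrays < /proc/mdstat`. An empty result means no array is currently degraded.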

When maintaining a software RAID 1 array with three active disks on CentOS 7, adding a hot spare provides automatic failover protection. The hot spare remains inactive until a disk fails, at which point the array automatically rebuilds onto it using data from the remaining healthy disks.

Before proceeding, ensure:

The new disk is properly connected and recognized by the system
The disk is at least as large as the smallest disk in the array
You have root privileges on the CentOS 7 system
Backup of important data exists (recommended)

First, identify the new disk:

lsblk
fdisk -l

Assume the new disk is /dev/sdd and the existing RAID is /dev/md0. Add the disk to the array; because all three member slots are already active, mdadm keeps it as a hot spare:

mdadm --manage /dev/md0 --add /dev/sdd

Check the RAID status to confirm the hot spare is properly configured:

mdadm --detail /dev/md0

You should see output similar to:

Number   Major   Minor   RaidDevice State
   0       8        0        0      active sync   /dev/sda
   1       8       16        1      active sync   /dev/sdb
   2       8       32        2      active sync   /dev/sdc
   3       8       48        -      spare          /dev/sdd

To simulate a disk failure (for testing purposes only):

mdadm --manage /dev/md0 --fail /dev/sda

Monitor the rebuild process:

watch -n 1 cat /proc/mdstat
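
For unattended monitoring, the rebuild progress can be read out of /proc/mdstat instead of watching it. This is a minimal sketch, assuming the `recovery = N%` progress line that the kernel prints while a rebuild is running; `recovery_percent` is a hypothetical helper and the sample is illustrative.

```shell
#!/bin/sh
# recovery_percent: read /proc/mdstat-style text on stdin and print the
# percentage from the first "recovery =" progress line, if there is one.
recovery_percent() {
    awk '/recovery =/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /%$/) {
                sub(/%$/, "", $i)
                sub(/^=/, "", $i)   # tolerate a value fused to the "="
                print $i
                exit
            }
    }'
}

# Illustrative sample of a rebuild in progress.
sample='md0 : active raid1 sdd[3] sdc[2] sdb[1] sda[0](F)
      1046528 blocks super 1.2 [3/2] [_UU]
      [=>...................]  recovery =  8.5% (89216/1046528) finish=0.1min speed=89216K/sec'

printf '%s\n' "$sample" | recovery_percent
```

On a real system: `recovery_percent < /proc/mdstat`. No output means no recovery line was found.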

Configure email alerts for RAID events by editing /etc/mdadm.conf:

MAILADDR admin@example.com
PROGRAM /usr/local/bin/raid-alert

Then rebuild the initramfs so that the boot image includes the updated /etc/mdadm.conf:

dracut -f
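
The PROGRAM entry names a script that the mdadm monitor runs for each event, passing the event name, the md device and, for some events, the affected component device. The following is a minimal sketch of what such a handler could contain; the function name `format_event`, the severity mapping and the message format are assumptions for illustration.

```shell
#!/bin/sh
# Sketch of a handler such as /usr/local/bin/raid-alert.
# mdadm --monitor passes: $1 = event name, $2 = md device,
# $3 = component device (only present for some events).
format_event() {
    event=$1
    array=$2
    member=${3:-}
    # The severity mapping below is an assumption, adjust as needed.
    case $event in
        Fail|FailSpare|DegradedArray) level=crit ;;
        SpareActive|RebuildStarted|RebuildFinished) level=notice ;;
        *) level=info ;;
    esac
    printf '%s mdadm %s on %s%s\n' "$level" "$event" "$array" "${member:+ ($member)}"
}

# In the installed script the message could be sent to syslog, e.g.:
#   format_event "$@" | logger -t raid-alert
format_event Fail /dev/md0 /dev/sda
```

The handler can be exercised without failing a disk by running `mdadm --monitor --scan --oneshot --test`, which generates a TestMessage event for each array.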

Remember that after a failover:

The failed disk should be replaced with a new spare
Rebuild operations are resource-intensive
Monitor disk health with SMART tools
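
Replacing the failed disk usually means removing it from the array and then adding the new disk, which becomes the next spare. The sketch below only prints the commands by default; `run`, `replace_failed`, the `DRY_RUN` variable and the replacement device /dev/sde are assumptions for illustration.

```shell
#!/bin/sh
# run: print the command when DRY_RUN is 1 (the default here),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# replace_failed ARRAY FAILED_DEVICE NEW_DEVICE
replace_failed() {
    run mdadm --manage "$1" --remove "$2"   # drop the faulty member
    run mdadm --manage "$1" --add "$3"      # new disk joins as the next spare
}

replace_failed /dev/md0 /dev/sda /dev/sde
```

Setting `DRY_RUN=0` would execute the commands as root; reviewing the printed plan first is the safer default.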

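Regularly verify your RAID status with:

cat /proc/mdstat

To record the array definition in /etc/mdadm.conf, run the following once (repeating it appends duplicate ARRAY lines):

mdadm --detail --scan >> /etc/mdadm.conf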
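
Appending the scan output more than once leaves duplicate ARRAY lines in the configuration file. The following is a minimal sketch of a guard that appends an ARRAY line only when its UUID is not yet present; `append_array_once` is a hypothetical helper name.

```shell
#!/bin/sh
# append_array_once CONF_FILE ARRAY_LINE
# Appends ARRAY_LINE to CONF_FILE unless a line with the same UUID is
# already there. Returns 1 if ARRAY_LINE has no UUID= token.
append_array_once() {
    conf=$1
    line=$2
    # Pull the UUID=... token out of the ARRAY line.
    uuid=$(printf '%s\n' "$line" | tr ' ' '\n' | grep '^UUID=') || return 1
    if [ -f "$conf" ] && grep -qF -- "$uuid" "$conf"; then
        return 0    # already recorded, nothing to do
    fi
    printf '%s\n' "$line" >> "$conf"
}

# Example on a real system (run as root):
#   append_array_once /etc/mdadm.conf "$(mdadm --detail --brief /dev/md0)"
```

Matching on the UUID rather than the device name is a deliberate choice, since the UUID identifies the array even if its /dev/mdN name changes.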
