When you receive a "DegradedArray" notification from mdadm regarding /dev/md1, it means one of your RAID1 mirrors has failed. In your case, the kernel logs show:
md1 : active raid1 sdb3[2](F) sda3[1]
1860516800 blocks [2/1] [_U]
The [_U] status indicates that only /dev/sda3 is active, while (F) marks /dev/sdb3 as failed.
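If you want to check for this state from a script rather than by eye, something like the following might do. It is a minimal sketch, assuming the member map always appears in the `[UU]` / `[_U]` form shown above; the function name `degraded_arrays` is made up for this example and is not part of mdadm.

```shell
# Hypothetical helper: list md arrays whose member map contains a missing slot.
# Reads /proc/mdstat-style text from the file given as $1 (default: /proc/mdstat).
degraded_arrays() {
    awk '
        # Header line such as "md1 : active raid1 sdb3[2](F) sda3[1]"
        /^md[0-9]+ :/ { name = $1 }
        # Status line carrying the member map, e.g. "[2/1] [_U]"
        name != "" && /\[[U_]+\]/ {
            if (match($0, /\[[U_]+\]/)) {
                map = substr($0, RSTART, RLENGTH)
                # An underscore marks a slot with no active member
                if (index(map, "_") > 0) print name, map
            }
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling `degraded_arrays` with no argument reads the live /proc/mdstat and prints one line per degraded array (array name followed by its member map), or nothing if every array is healthy.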
Before attempting repairs, verify the physical drive status:
sudo smartctl -a /dev/sdb | grep -i error
sudo dmesg | grep -i sdb
Your kernel logs reveal UNC (Uncorrectable) errors, which typically indicate physical media issues:
Feb 23 14:55:19 triton1017 kernel: [24036613.378608] ata1.00: error: { UNC }
You followed the correct procedure to re-add the drive:
sudo mdadm --remove /dev/md1 /dev/sdb3
sudo mdadm --add /dev/md1 /dev/sdb3
However, the array remains degraded because the drive keeps failing during resync, so md leaves it attached only as a spare, shown by (S):
md1 : active raid1 sdb3[2](S) sda3[1]
1860516800 blocks [2/1] [_U]
For production systems, I recommend:
- Immediate backup of critical data from /dev/md1
- Drive replacement procedure:
# Mark the drive as failed if not already
sudo mdadm --fail /dev/md1 /dev/sdb3
# Remove from array
sudo mdadm --remove /dev/md1 /dev/sdb3
# After physical replacement (and partitioning the new disk to match the old layout), add the new drive
sudo mdadm --add /dev/md1 /dev/sdb3
Track rebuild progress with:
watch -n 10 cat /proc/mdstat
Or get detailed status:
sudo mdadm --detail /dev/md1 | grep -iE 'state :|status'
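For unattended use, you could poll /proc/mdstat until the rebuild line disappears. This is only a sketch: the helper names `rebuild_progress` and `wait_for_rebuild` and the `POLL_SECONDS` variable are invented here, and it assumes the kernel reports progress on a line containing `recovery =` or `resync =` followed by a percentage.

```shell
# Hypothetical helper: print the rebuild percentage for one array, or nothing if idle.
# $1 = array name (e.g. md1), $2 = mdstat file (default: /proc/mdstat)
rebuild_progress() {
    awk -v arr="$1" '
        # Track which array the following lines belong to
        /^md[0-9]+ :/ { current = $1 }
        # Progress line for the requested array
        current == arr && /(recovery|resync) *=/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) { print $i; exit }
        }
    ' "${2:-/proc/mdstat}"
}

# Hypothetical helper: block until the array no longer shows a rebuild line.
# POLL_SECONDS (assumed name) sets the polling interval, 60 seconds by default.
wait_for_rebuild() {
    while pct=$(rebuild_progress "$1" "${2:-/proc/mdstat}") && [ -n "$pct" ]; do
        echo "$1 rebuild at $pct"
        sleep "${POLL_SECONDS:-60}"
    done
    echo "$1: no rebuild in progress"
}
```

Running `wait_for_rebuild md1` returns as soon as no rebuild line is shown for md1. That does not by itself mean the rebuild succeeded, so check that the member map reads [UU] afterwards.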
Add these to your monitoring:
# Check array status
sudo mdadm --monitor --scan --daemonize
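# SMART monitoring: list the devices smartmontools can see
# (the smartd daemon itself is configured in smartd.conf and started as a service)
sudo smartctl --scan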
Configure email alerts in /etc/mdadm/mdadm.conf:
MAILADDR your@email.com
When your Linux system emails you about a "DegradedArray event on md device /dev/md1", it indicates one of your RAID1 mirrors has failed. The key indicators are:
md1 : active raid1 sdb3[2](F) sda3[1]
1860516800 blocks [2/1] [_U]
The [2/1] shows that only 1 of 2 drives is active, [_U] shows that the first slot has no working member, and (F) marks /dev/sdb3 as failed.
Kernel logs reveal media errors on the drive:
Feb 23 14:55:21 triton1017 kernel: [24036616.262531] ata1.00: failed command: READ FPDMA QUEUED
Feb 23 14:55:21 triton1017 kernel: [24036616.262540] res 41/40:80:38:5a:b4/00:00:75:00:00/00 Emask 0x409 (media error) <F>
This UNC (Uncorrectable Error) suggests physical sector failure. Check SMART status:
sudo smartctl -a /dev/sdb | less
# Check for:
# - Reallocated_Sector_Ct
# - Current_Pending_Sector
# - Offline_Uncorrectable
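To pull just those three attributes out of the report, a small filter along these lines could help. It is a sketch that assumes the usual ATA attribute table printed by `smartctl -A`, where the attribute name is the second column and the raw value is the tenth; the function name `smart_bad_sectors` is made up for illustration.

```shell
# Hypothetical helper: read `smartctl -A` output on stdin and print
# "attribute raw_value" for each watched attribute whose raw value is nonzero.
smart_bad_sectors() {
    awk '
        $2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ &&
        $10 + 0 > 0 {
            print $2, $10
        }
    '
}
```

Usage would be `sudo smartctl -A /dev/sdb | smart_bad_sectors`. Any output at all means the drive has reallocated or unreadable sectors, which fits the media errors in the kernel log. Drives that report attributes under different names or in a different layout will simply produce no output, so an empty result is not proof of health.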
Before replacing hardware, attempt re-adding the drive:
# Remove failed device
sudo mdadm --remove /dev/md1 /dev/sdb3
# Re-add after checking connections
sudo mdadm --add /dev/md1 /dev/sdb3
# Monitor rebuild progress
watch -n 5 cat /proc/mdstat
If rebuild fails repeatedly, you'll see:
md1 : active raid1 sdb3[2](S) sda3[1]
1860516800 blocks [2/1] [_U]
For persistent media errors:
# 1. Mark drive as failed if not auto-detected
sudo mdadm --fail /dev/md1 /dev/sdb3
# 2. Remove from array
sudo mdadm --remove /dev/md1 /dev/sdb3
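# 3. Schedule physical replacement with the data center
# 4. After replacement, copy the partition table from the healthy disk to the new one
#    (this overwrites the partition table on /dev/sdb, so double-check the device names):
sudo sfdisk -d /dev/sda | sudo sfdisk /dev/sdb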
sudo mdadm --add /dev/md1 /dev/sdb3
Add this to /etc/mdadm/mdadm.conf:
MAILADDR admin@yourdomain.com
ARRAY /dev/md1 metadata=0.90 UUID=ec02d5ce:8554d4ad:7792c71e:7dc17aa4
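If the array will no longer assemble normally (for example, the remaining drive starts failing too), try starting it read-only from the healthier member so you can copy data off:
sudo mdadm --assemble --run --readonly /dev/md1 /dev/sda3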