How to Force Rebuild a Degraded mdadm RAID5 Array with DriveError Flag on Synology NAS

When working with Synology NAS devices, you might encounter a particularly stubborn situation where a RAID array refuses to rebuild despite having a functional spare drive. The root cause lies in Synology's customized md driver implementation that adds a special 'DriveError' flag (E state) to the rdev->flags structure.
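
A quick way to confirm whether any array member is currently carrying this flag is to grep /proc/mdstat for the (E) marker (a minimal check, based on the Synology-style output shown below):

# List md arrays with a member tagged by the DriveError (E) flag
grep '(E)' /proc/mdstat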

From the /proc/mdstat output, we can see the critical indicators:

md2 : active raid5 sdb5[1] sda5[5](S) sde5[4](E) sdd5[3] sdc5[2]
      11702126592 blocks super 1.2 level 5, 64k chunk, algorithm 2 [5/4] [_UUUE]

Key observations:
- The array shows 4 active devices out of 5 ([5/4])
- /dev/sda5 is marked as spare (S)
- /dev/sde5 has the problematic (E) state
- The array won't automatically rebuild despite having a spare available
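
To see how mdadm itself reports these roles before changing anything, the detail and examine views are worth a look (device names follow the example above, so adjust them for your system):

mdadm --detail /dev/md2       # array-level state, spare/failed counts, per-device roles
mdadm --examine /dev/sde5     # superblock contents of the E-flagged member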

Before resorting to more drastic measures, you will probably reach for the standard recovery commands; with Synology's modified mdadm they typically fail:
- mdadm --manage /dev/md2 --add /dev/sda5
- mdadm --manage /dev/md2 --replace /dev/sde5 --with /dev/sda5
- Various combinations of --remove and --re-add commands

The only reliable method I've found involves stopping and recreating the array:

# First stop the array
mdadm --stop /dev/md2

# Recreate with original parameters
mdadm --verbose \
   --create /dev/md2 --chunk=64 --level=5 \
   --raid-devices=5 missing /dev/sdb5 /dev/sdc5 /dev/sdd5 /dev/sde5

# Add the spare drive
mdadm --manage /dev/md2 --add /dev/sda5
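
If your mdadm build defaults to a different superblock version, the same create can also pin the metadata format explicitly; the original array used 1.2 superblocks, visible as "super 1.2" in the mdstat output above (an optional safeguard, not a Synology-specific requirement):

mdadm --verbose \
   --create /dev/md2 --chunk=64 --level=5 --metadata=1.2 \
   --raid-devices=5 missing /dev/sdb5 /dev/sdc5 /dev/sdd5 /dev/sde5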

Three details are critical when recreating the array:
1. Drive Order Matters: Maintain the original drive ordering as seen in /proc/mdstat
2. Missing Parameter: The "missing" keyword stands in for the failed drive's position
3. Monitoring Progress: Watch the rebuild via cat /proc/mdstat or mdadm --detail /dev/md2 (see the example below)
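
A simple way to keep an eye on the rebuild without retyping commands (if watch is not available, just re-run the cat):

watch -n 5 cat /proc/mdstat                            # refresh the resync status every 5 seconds
mdadm --detail /dev/md2 | grep -iE 'state|rebuild'     # or pull just the state and rebuild lines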

After successful rebuild:
- Consider replacing the questionable (E)-flagged drive
- Run a full filesystem check, working through any LVM or other layers sitting on top of the array
- Verify data integrity with checksum comparisons if possible (a sketch follows this list)
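
A minimal sketch of such a comparison, assuming you still have a pre-incident copy to compare against (all paths here are examples only):

# Checksum a critical directory on the rebuilt volume and on the backup copy, then diff the lists
(cd /volume1/important && find . -type f -exec md5sum {} + | sort) > /tmp/after.md5
(cd /mnt/backup/important && find . -type f -exec md5sum {} + | sort) > /tmp/backup.md5
diff /tmp/backup.md5 /tmp/after.md5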

By recreating the array, we effectively reset the DriveError flag state while maintaining the existing data structure. The key is that we're not actually destroying data - we're just reinitializing the RAID metadata while keeping the existing stripe information intact.
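
One caveat: this only holds if the recreated array ends up with exactly the same geometry as the original, including the data offset, which newer mdadm builds may choose differently by default. Comparing the superblock of a surviving member before and after the --create is a cheap sanity check (not a Synology-specific step):

# Capture the geometry before stopping the array ...
mdadm --examine /dev/sdb5 | grep -iE 'raid level|chunk|layout|data offset' > /tmp/md2-geometry.txt
# ... then diff it against the freshly written superblock after the recreate
mdadm --examine /dev/sdb5 | grep -iE 'raid level|chunk|layout|data offset' | diff /tmp/md2-geometry.txt -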

If Synology's customized tooling keeps getting in the way, consider booting the drives in a standard Linux environment, where a stock mdadm gives you more control over the parameters. However, in my experience, the above method has worked reliably across multiple Synology models.


Synology's customized md driver introduces a unique challenge with its 'DriveError' flag (shown as (E) in mdstat). This proprietary modification blocks the standard recovery procedures once a member drive enters the E state:

md2 : active raid5 sdb5[1] sde5[4](E) sdd5[3] sdc5[2]
      11702126592 blocks [5/4] [_UUUE]

Always verify the array status first:

cat /proc/mdstat            # overall array state as the kernel reports it
mdadm --detail /dev/mdX     # per-device roles, failed/spare counts, rebuild progress
smartctl -a /dev/sdX5       # SMART health of the suspect member
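
If you want a quicker read on why a member was flagged, the reallocated and pending sector counts are usually the telling SMART attributes (attribute names vary slightly between drive vendors):

smartctl -A /dev/sde5 | grep -iE 'reallocated|pending|uncorrect'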

The most effective solution I've found involves controlled array reconstruction:

# Stop the array safely
mdadm --stop /dev/md2

# Recreate with original parameters (note the 'missing' placeholder)
mdadm --verbose --create /dev/md2 \
   --chunk=64 --level=5 --raid-devices=5 \
   missing /dev/sdb5 /dev/sdc5 /dev/sdd5 /dev/sde5

# Add the replacement drive
mdadm --manage /dev/md2 --add /dev/sda5

For less severe cases, you can try cycling the E-state drive back into the array and forcing a parity repair pass (Synology's modified driver does not always accept this, but it is worth attempting before a full recreate):

mdadm --manage /dev/md2 --fail /dev/sde5       # mark the E-flagged member as failed
mdadm --manage /dev/md2 --remove /dev/sde5     # take it out of the array
mdadm --manage /dev/md2 --re-add /dev/sde5     # put it back so md resyncs it
echo repair > /sys/block/md2/md/sync_action    # then kick off a parity repair pass
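
You can confirm that the repair actually started, and check its progress, through the same md sysfs directory (standard md attributes, not Synology-specific):

cat /sys/block/md2/md/sync_action       # should read "repair" while the pass is running
cat /sys/block/md2/md/sync_completed    # sectors completed out of the total
cat /sys/block/md2/md/mismatch_cnt      # mismatch counter, updated as the pass runs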

A few precautions apply before attempting any of this:
  • Always back up critical data before attempting recovery
  • Record your original RAID parameters (chunk size, layout); one way to capture them is shown after this list
  • Monitor rebuild progress with: watch cat /proc/mdstat
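
A minimal way to record those parameters before changing anything (the /tmp paths are examples; copy the files off the NAS):

mdadm --detail /dev/md2 > /tmp/md2-detail.txt
mdadm --examine /dev/sdb5 /dev/sdc5 /dev/sdd5 /dev/sde5 > /tmp/md2-members.txt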

After successful rebuild:

# Check array consistency
mdadm --detail /dev/md2 | grep -i state

# Verify filesystem integrity (unmount first; if LVM or other layers sit on top
# of md2, run the check against the logical volume instead)
e2fsck -f /dev/md2