Best Methods to Stress Test New WD RED HDDs for ZFS RAID-Z2 Storage Server Deployment

Before anything else, perform a visual inspection of all drives. Then check SMART attributes using smartctl:


# Install smartmontools if needed
sudo apt install smartmontools

# Check basic SMART info for /dev/sdX
sudo smartctl -i /dev/sdX

# Run short self-test
sudo smartctl -t short /dev/sdX

# Check test results (the short test needs about 2 minutes to finish first)
sudo smartctl -l selftest /dev/sdX

A destructive read-write test is the most thorough way to detect early failures:


# WARNING: This will erase all data!
sudo badblocks -b 4096 -wsv /dev/sdX

# Non-destructive read-only alternative
sudo badblocks -b 4096 -sv /dev/sdX

Schedule extended SMART tests overnight for all drives:


for drive in /dev/sd{b..k}; do
  sudo smartctl -t long $drive
done

# Monitor progress (run next day)
for drive in /dev/sd{b..k}; do
  echo "$drive:"
  # Remaining-percentage text is reported by -c, not by the self-test log
  sudo smartctl -c "$drive" | grep -A1 "Self-test execution status"
done
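
Once the long tests have finished, the pass/fail check can be scripted as well. The following is a minimal sketch, assuming the ATA self-test log layout printed by smartctl -l selftest, in which the newest entry is numbered "# 1". The function name selftest_passed is a hypothetical helper, not part of smartmontools:

```shell
# Hypothetical helper: reads `smartctl -l selftest` output on stdin and
# succeeds only if the newest log entry (numbered "# 1") completed cleanly.
selftest_passed() {
  grep -m1 '^# *1 ' | grep -q 'Completed without error'
}

# Usage (needs root and real drives):
#   for drive in /dev/sd{b..k}; do
#     smartctl -l selftest "$drive" | selftest_passed || echo "$drive: check the self-test log"
#   done
```

Because the helper only reads text from stdin, it can be tried against a saved log before pointing it at real drives.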

After basic validation, create a temporary pool for stress testing:


# Create test pool (adjust devices accordingly)
sudo zpool create -f -o ashift=12 testpool raidz2 /dev/sd{b..k}

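# Generate random test data and write it into the pool.
# Assumptions: the pool uses its default mountpoint /testpool; the file name
# random.bin and the 100 GiB size (count=102400) are placeholders to adjust.
openssl enc -aes-256-ctr -pass pass:"$(dd if=/dev/urandom bs=128 count=1 2>/dev/null | base64)" \
  -nosalt < /dev/zero | sudo dd of=/testpool/random.bin bs=1M count=102400 iflag=fullblock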
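
A write-then-read-back check on the temporary pool can complement the scrub. This is a rough sketch under a few assumptions: the pool is mounted at its default path, sha256sum is available, and the helper name write_and_verify is hypothetical. The re-read may be served from cache, so a scrub is still the more reliable on-disk check:

```shell
# Hypothetical helper: write size_mb MiB of random data into dir, hashing the
# stream as it is written, then re-read the file and compare the two hashes.
# Usage: write_and_verify <dir> <size_mb>
write_and_verify() {
  wv_file="$1/stress_test_$$.bin"
  wv_bytes=$(( $2 * 1024 * 1024 ))
  wv_expected=$(head -c "$wv_bytes" /dev/urandom | tee "$wv_file" | sha256sum | cut -d' ' -f1)
  sync
  wv_actual=$(sha256sum "$wv_file" | cut -d' ' -f1)
  rm -f "$wv_file"
  [ -n "$wv_expected" ] && [ "$wv_expected" = "$wv_actual" ]
}

# Usage on the test pool (as root), followed by cleanup:
#   write_and_verify /testpool 10240 && zpool scrub testpool
#   zpool status -v testpool
#   zpool destroy testpool
```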


Here's a comprehensive test script for multiple drives:

#!/bin/bash

DEVICES=(/dev/sd{b..k})

for device in "${DEVICES[@]}"; do
  echo "=== Testing $device ==="
  
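  # SMART short test
  smartctl -t short "$device"
  sleep 2m
  # The self-test log has no "test result" line; the newest entry is "# 1"
  smartctl -l selftest "$device" | grep -m1 "^# *1 "

  # Badblocks non-destructive (read-only); options go before the device
  badblocks -b 4096 -sv -o "${device##*/}_badblocks.txt" "$device"

  # SMART extended test
  smartctl -t long "$device"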
  echo "Started long test on $device"
done

echo "All tests initiated. Monitor progress with:"
echo "smartctl -l selftest /dev/sdX"

Keep monitoring for at least 72 hours after initial tests:

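# watch runs its command through sh, which may not expand {b..k};
# -x bash -c runs the loop in bash instead
sudo watch -n 3600 -x bash -c 'for d in /dev/sd{b..k}; do
  echo "$d:"
  smartctl -A "$d" | grep -E "Temperature|Reallocated|Pending|Uncorrectable"
done'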

Key warning signs to watch for:

Reallocated sectors > 0
Pending sectors > 0
Uncorrectable sectors > 0
Temperature consistently > 50°C
Any SMART test failures
ZFS checksum errors during scrub
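
The SMART-related warning signs above can be turned into an automated check. The sketch below assumes the ATA attribute table printed by smartctl -A, with the attribute name in column 2 and the raw value in column 10. Offline_Uncorrectable and Temperature_Celsius are common attribute names but vary by vendor, and check_smart_attrs is a hypothetical helper name:

```shell
# Hypothetical helper: reads `smartctl -A` output on stdin, prints a WARN line
# for each warning sign, and returns non-zero if any were found.
# Usage: smartctl -A /dev/sdX | check_smart_attrs [max_temp_celsius]
check_smart_attrs() {
  awk -v max_temp="${1:-50}" '
    # Sector counters: any non-zero raw value is a warning sign
    $2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ && $10 + 0 > 0 {
      printf "WARN %s raw=%s\n", $2, $10; bad = 1
    }
    # Temperature: warn above the threshold (default 50)
    $2 == "Temperature_Celsius" && $10 + 0 > max_temp + 0 {
      printf "WARN %s raw=%s\n", $2, $10; bad = 1
    }
    END { exit bad + 0 }
  '
}
```

ZFS checksum errors are not visible to SMART, so they still need to be read from zpool status after a scrub.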


When setting up a storage server with multiple new HDDs (especially in a ZFS RAID-Z2 configuration like your 10x2TB WD Red setup), proper pre-deployment testing is crucial. Hard drive failure rates follow the "bathtub curve": failures are most likely either early in a drive's life (infant mortality) or after years of use. Here's my professional testing protocol:

# Basic SMART quick test
smartctl -t short /dev/sdX

# Extended SMART test (takes hours but thorough)
smartctl -t long /dev/sdX

# Check reallocated sectors count
smartctl -A /dev/sdX | grep Reallocated_Sector_Ct

# Check pending sectors
smartctl -A /dev/sdX | grep Current_Pending_Sector

I recommend running a full read/write cycle using badblocks (destructive test - only for new drives):

badblocks -wsv -b 4096 -t random -o badblocks.log /dev/sdX

This performs:

A destructive write-then-verify test (-w). With -t random, badblocks writes a single random bit pattern and reads it back; without -t, it makes four passes using the fixed patterns 0xaa, 0x55, 0xff, and 0x00
Progress display (-s), verbose output (-v), and a 4096-byte block size (-b 4096) to match 4K sectors
Logging of any bad block numbers to badblocks.log (-o)

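Once individual drives pass testing, create your ZFS pool with proper ashift:

zpool create -o ashift=12 tank raidz2 sda sdb sdc sdd sde sdf sdg sdh sdi sdj

Then write some data to the pool and run a scrub. A scrub only reads allocated blocks, so on an empty pool it finishes almost immediately and verifies very little:

zpool scrub tank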
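
After the scrub completes, the per-device error counters are worth checking. This is a small sketch, assuming the usual NAME / STATE / READ / WRITE / CKSUM layout of the config table in zpool status output; pool_error_counts is a hypothetical helper name:

```shell
# Hypothetical helper: reads `zpool status` output on stdin, prints every
# pool, vdev, or device row whose READ/WRITE/CKSUM counters are not all zero,
# and returns non-zero if any such row exists.
pool_error_counts() {
  awk '
    NF >= 5 && $2 ~ /^(ONLINE|DEGRADED|FAULTED|UNAVAIL|OFFLINE|REMOVED)$/ &&
    ($3 != "0" || $4 != "0" || $5 != "0") {
      printf "%s read=%s write=%s cksum=%s\n", $1, $3, $4, $5; bad = 1
    }
    END { exit bad + 0 }
  '
}

# Usage (after the scrub has finished):
#   zpool status tank | pool_error_counts || echo "errors found on tank"
```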
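
Here's a bash script I use to automate testing across multiple drives:

#!/bin/bash
# Run as root. Drives are tested one at a time.
for drive in /dev/sd{a..j}; do
  echo "Testing $drive..."
  smartctl -t short "$drive"
  sleep 2m  # Wait for short test completion
  # smartctl -H prints "test result: PASSED" for a healthy ATA drive
  smartctl -H "$drive" | grep -q "test result: PASSED" || echo "SMART health check failed for $drive"
  # Read-only mode does not accept -t random, so no pattern is given
  badblocks -sv -b 4096 -o "${drive##*/}_badblocks.log" "$drive"
done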
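
The per-drive log files written by badblocks -o contain one bad block number per line, so an empty file means a clean drive. A minimal sketch for summarizing them follows, assuming the *_badblocks.log naming used above; summarize_badblocks_logs is a hypothetical helper name:

```shell
# Hypothetical helper: counts bad blocks in every *_badblocks.log file in the
# given directory and returns non-zero if any drive reported at least one.
summarize_badblocks_logs() {
  sb_status=0
  for sb_log in "$1"/*_badblocks.log; do
    [ -e "$sb_log" ] || continue   # no logs matched the glob
    # grep -c exits non-zero when the count is 0, hence the "|| true"
    sb_count=$(grep -c '[0-9]' "$sb_log" || true)
    echo "${sb_log##*/}: $sb_count bad blocks"
    [ "$sb_count" -eq 0 ] || sb_status=1
  done
  return "$sb_status"
}

# Usage: summarize_badblocks_logs . || echo "at least one drive has bad blocks"
```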

Run this command to watch SMART attributes during testing (-x bash -c is needed because watch otherwise uses sh, which may not expand {a..j}):

watch -n 60 -x bash -c 'for d in /dev/sd{a..j}; do echo "$d"; smartctl -A "$d" | grep -E "Reallocated|Pending|Uncorrectable"; done'

Red flags to watch for:

Any reallocated sectors (should be 0 on new drives)
Pending sectors that don't clear after multiple tests
Rising UDMA CRC errors (could indicate cable issues)
High seek error rates or spin retries

ServerDevWorker
