ZFS was designed with the explicit assumption of direct disk access. When you layer ZFS atop hardware RAID, you're creating a dangerous abstraction sandwich:
# Bad practice example - nested redundancy
ZFS mirror → Hardware RAID1 → Physical disks
This creates two problems: 1) ZFS can no longer see or react to individual disk failures, and 2) the controller's write-back cache can silently lose or corrupt in-flight data on power failure if it has no working battery (or flash) backup.
When evaluating hardware RAID controllers for ZFS use:
- LSI MegaRAID: Requires IT mode flash to behave as HBA
- HP Smart Array: Some models allow "HBA mode" in BIOS
- Dell PERC: Generally problematic; requires careful firmware configuration
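Beyond the marketing names, it helps to confirm how a given controller is actually configured. A minimal sketch, assuming an LSI MegaRAID managed with the storcli utility (HP and Dell ship ssacli and perccli for the same purpose):
# Dump controller 0's properties and look for the JBOD / personality settings
storcli /c0 show all | grep -i -E 'jbod|personality'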
Example of checking disk pass-through status; if the model column shows the RAID controller (as below) rather than the physical drive, the disks are hidden behind virtual drives:
# On Linux with MegaRAID
lsscsi -g
[2:0:0:0] disk LSI MR9361-8i 4.68 /dev/sda /dev/sg0
[2:0:1:0] disk LSI MR9361-8i 4.68 /dev/sdb /dev/sg1
Testing shows significant latency differences:
| Configuration | 4k Random Read IOPS | 99th Percentile Latency |
|---|---|---|
| ZFS mirror on HBA | 85,000 | 1.2 ms |
| ZFS on HW RAID1 | 63,000 | 2.8 ms |
| HW RAID1 alone | 72,000 | 1.9 ms |
For pre-built servers where hardware RAID is unavoidable:
# Expose each physical disk as its own single-disk RAID0 virtual drive so ZFS
# sees the disks individually (add ZFS-level redundancy, e.g. a mirror, on top)
zpool create tank \
  /dev/disk/by-id/wwn-0x5000cca123456789 \
  /dev/disk/by-id/wwn-0x5000cca987654321
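After creating the pool, confirm that ZFS really sees one vdev per physical disk rather than a single large logical volume:
# Each WWN should appear as its own top-level vdev; if the controller merged
# the disks, only one device will be listed
zpool status tank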
Key mitigation steps:
- Disable controller cache or ensure BBU is functional
- Set disks to "non-RAID" mode when possible
- Monitor controller logs for predictive failures
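On LSI MegaRAID hardware these steps map directly onto the vendor CLI. A minimal sketch using MegaCli64 (flag spelling varies slightly between releases; HP and Dell controllers use their own utilities):
# Check battery/BBU health before trusting write-back caching
MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL
# Force all logical drives to write-through if the BBU is absent or failed
MegaCli64 -LDSetProp WT -LALL -aALL
# Pull the controller event log to watch for predictive-failure entries
MegaCli64 -AdpEventLog -GetEvents -f controller-events.log -aALL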
Hardware RAID controllers typically implement one of three error handling modes:
- Aggressive retry: Masks errors from ZFS (worst case)
- Fast fail: Better but still obscures true disk state
- Pass-through: Ideal but rarely available
To detect error masking, compare what the controller's logical device reports with what the physical disk behind it reports:
# Compare these outputs:
smartctl -a /dev/sda                 # logical volume presented by the controller
smartctl -a /dev/sda -d megaraid,0   # physical disk at MegaRAID device ID 0
If the first command returns little or no SMART data while the second shows the real drive, the controller is masking disk state from ZFS.
To restate the core issue: ZFS is designed to manage disks directly, handling redundancy and error correction at the filesystem level. Layering it on top of hardware RAID creates two layers of abstraction that can conflict:
# Example of ZFS mirror creation (preferred approach; in production, use
# /dev/disk/by-id paths for stable, unambiguous device naming)
zpool create tank mirror /dev/sda /dev/sdb
Hardware RAID controllers often obscure the physical disks from the operating system, presenting them as single logical units. This prevents ZFS from:
- Performing direct disk health monitoring
- Self-healing corrupted blocks from a redundant copy
- Detecting each disk's true physical sector size and choosing the correct ashift (see the check below)
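The last point is worth checking explicitly, because RAID controllers sometimes advertise a logical volume's sector size rather than the drives' native one. A quick sketch for verifying what the OS sees and pinning ashift yourself (pool name and devices are illustrative):
# Show the logical and physical sector sizes the OS was told about
lsblk -o NAME,MODEL,PHY-SEC,LOG-SEC
# Create the pool with an explicit 4 KiB sector assumption instead of trusting the report
zpool create -o ashift=12 tank mirror /dev/sda /dev/sdb
# Confirm the ashift the pool actually uses
zdb -C tank | grep ashift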
There are limited scenarios where using hardware RAID under ZFS could be justified:
1. Boot Drives: Some systems require hardware RAID for boot drives while using ZFS for data storage.
2. Legacy Systems: When working with pre-configured servers where RAID can't be disabled.
3. Specific Controller Features: Some high-end controllers offer benefits like battery-backed cache.
# If forced to use hardware RAID, at least disable the volatile drive write cache
# (the controller's own cache must be disabled from its management utility)
hdparm -W 0 /dev/sdX
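Note that hdparm may not reach the physical drives at all when they sit behind a RAID controller; where it does work, you can query the current cache state before and after changing it:
# With no value argument, -W reports the current write-cache setting
hdparm -W /dev/sdX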
Even when configured in "JBOD" or "passthrough" mode, hardware RAID controllers can still cause problems:
| Issue | Impact on ZFS |
|---|---|
| Write cache | Can cause data corruption during power loss |
| Disk remapping | Hides bad sectors from ZFS |
| Firmware bugs | May corrupt data silently |
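The disk-remapping row is the easiest to compensate for: watch the drives' own reallocation counters so that sectors the controller quietly remaps still show up somewhere. A sketch assuming SMART data is reachable (add -d megaraid,N as shown earlier if it is not):
# Non-zero reallocated or pending sector counts mean the drive is remapping
smartctl -A /dev/sda | grep -i -E 'reallocated|pending'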
For optimal ZFS performance on enterprise hardware:
# Preferred SAS controller configuration (LSI SAS2008 / 9211-8i example)
sas2flash -listall            # record the controller's SAS address before erasing
sas2flash -o -e 6             # erase the existing flash (this also wipes the SAS address)
sas2flash -f 2118it.bin       # flash the IT-mode firmware
# Afterwards, restore the recorded SAS address: sas2flash -o -sasadd <address>
Key considerations when selecting hardware:
- Choose controllers that support true JBOD/IT mode
- Verify disk SMART data is fully accessible
- Ensure controller firmware is up-to-date
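A simple test for the second point on any candidate controller: ask a disk to identify itself. If the real drive model and serial number come back rather than the controller's, SMART is being passed through:
# Should print the physical drive's model, serial number, and firmware revision
smartctl -i /dev/sda
# Overall SMART health verdict for the same device
smartctl -H /dev/sda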
Benchmarking the same disks with and without the RAID controller in the data path bears this out; a representative test:
# Sample benchmark command
fio --name=randwrite --ioengine=libaio --iodepth=32 --rw=randwrite \
--bs=4k --direct=1 --size=1G --numjobs=4 --runtime=60 --group_reporting
Results typically show:
- 15-20% higher random write performance with native ZFS
- Better latency consistency without RAID controller overhead
- More accurate error reporting and recovery
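The last point is directly observable: with ZFS owning the disks, a scrub exercises every allocated block, and any repaired or unrepairable errors appear in the per-device READ/WRITE/CKSUM counters:
# Trigger a full-pool verification pass
zpool scrub tank
# Watch progress and per-device error counters
zpool status -v tank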