When working with Linux software RAID (mdadm), the stripe_cache_size parameter plays a crucial role in write performance for RAID5/6 arrays. This per-array sysfs tunable controls the size of the stripe cache, a memory buffer used to optimize write operations by reducing the read-modify-write overhead inherent in parity-based RAID levels.
The stripe cache acts as a write-back cache for partial stripe writes. With a sufficiently large cache:
- Small writes are accumulated in memory until a full stripe can be written
- The number of expensive read-modify-write cycles is reduced
- Sequential write performance improves significantly
# View current stripe_cache_size value
cat /sys/block/md0/md/stripe_cache_size
As observed in the example, increasing stripe_cache_size from the default (typically 256 or 512) to 16384 doubled the sync rate from 71 MB/s to 143 MB/s. However, this comes with increased RAM usage: the cache is sized in pages per device, so it consumes roughly stripe_cache_size × 4 KiB × the number of member disks (see the estimate after the list below).
The optimal value depends on:
- Available system memory
- Workload characteristics (random vs sequential writes)
- Number of disks in the array
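As a rough sanity check on memory use (a sketch assuming 4 KiB pages and that md0 is your array), the footprint can be estimated directly from sysfs:
# Estimated cache memory: entries x page size x number of member disks
CACHE=$(cat /sys/block/md0/md/stripe_cache_size)
DISKS=$(cat /sys/block/md0/md/raid_disks)
echo "stripe cache ~ $(( CACHE * 4096 * DISKS / 1024 / 1024 )) MiB"
For a 6-disk array at 16384, that works out to roughly 384 MiB.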
To set the value temporarily (until reboot):
echo 16384 > /sys/block/md0/md/stripe_cache_size
For permanent configuration, add to /etc/rc.local or create a udev rule:
# Example udev rule
ACTION=="add|change", KERNEL=="md0", ATTR{md/stripe_cache_size}="16384"
After modification, verify the change took effect:
cat /sys/block/md0/md/stripe_cache_size
Monitor performance impact through:
cat /proc/mdstat
iostat -x 1
dstat -td --disk-util
You can also keep an eye on stripe_cache_active, a read-only counter of stripes currently in use. If it regularly sits close to stripe_cache_size under load, the cache is probably too small for the workload.
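For example, to watch cache occupancy once per second during a resync or heavy write (assuming md0):
# Compare the active count against stripe_cache_size
watch -n1 cat /sys/block/md0/md/stripe_cache_active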
Remember that extremely large values can:
- Cause memory pressure
- Lead to longer recovery times after crashes
- Increase latency for some workloads
The stripe_cache_size setting is a tunable parameter in Linux's software RAID (md) subsystem that controls the size of the stripe cache, a memory buffer used to optimize write operations in RAID5/6 arrays. This cache holds stripes and their parity computations in RAM while they are being assembled and written out, reducing disk I/O and improving performance.
When performing a partial-stripe (read-modify-write) update on a RAID5/6 array, the system needs to:
- Read existing data and parity
- Compute new parity
- Write new data and parity
The stripe cache stores these intermediate computations in RAM, reducing the number of disk I/O operations required. A larger cache can hold more stripes in memory, potentially improving performance for sequential writes.
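To see how much data makes up a full stripe on your array (and therefore how much a write must cover to skip the read-modify-write path), check the chunk size and member count; the commands below assume /dev/md0:
# Chunk size and number of member disks
mdadm --detail /dev/md0 | grep -E 'Chunk Size|Raid Devices'
# Full stripe of data = chunk size x (raid devices - 1) for RAID5, or (raid devices - 2) for RAID6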
As you've observed, increasing stripe_cache_size from the default (usually 256 or 512) to 16384 can significantly improve sync speeds. The improvement comes from:
- Reduced disk seeks (more operations can be batched)
- Better sequential write patterns
- Less time waiting for disk I/O
To view current value:
cat /sys/block/md0/md/stripe_cache_size
To set a new value (requires root):
echo 16384 > /sys/block/md0/md/stripe_cache_size
To make it persistent across reboots, add to /etc/rc.local:
#!/bin/sh
echo 16384 > /sys/block/md0/md/stripe_cache_size
exit 0
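On distributions that no longer ship /etc/rc.local, a small systemd oneshot unit is a common alternative; the unit name below is just an example:
# /etc/systemd/system/md0-stripe-cache.service (example name)
[Unit]
Description=Set stripe_cache_size for /dev/md0
Requires=dev-md0.device
After=dev-md0.device

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'echo 16384 > /sys/block/md0/md/stripe_cache_size'

[Install]
WantedBy=multi-user.target

Enable it with systemctl daemon-reload followed by systemctl enable --now md0-stripe-cache.service.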
The optimal value depends on:
- Available RAM (the cache consumes roughly stripe_cache_size × 4 KiB per member disk)
- Workload characteristics (sequential vs random writes)
- Number of disks in the array
A good starting point is 4096 for arrays with 4-6 disks, scaling up to 32768 for larger arrays with sufficient RAM.
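To find a reasonable value empirically, you can step through candidates and time a sequential write on a filesystem that lives on the array. This is only a rough sketch: /mnt/raid is a placeholder mount point, and the test file should be large enough to defeat the page cache:
# Try several cache sizes and time a 4 GiB sequential write (adjust the path)
for size in 1024 4096 8192 16384 32768; do
    echo "$size" > /sys/block/md0/md/stripe_cache_size
    echo "stripe_cache_size=$size"
    dd if=/dev/zero of=/mnt/raid/testfile bs=1M count=4096 conv=fdatasync 2>&1 | tail -n1
    rm -f /mnt/raid/testfile
done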
Check performance before/after changes:
cat /proc/mdstat
iostat -x 1
Monitor memory usage:
free -m
cat /proc/meminfo
While not extensively documented, some references exist in:
- Linux kernel source (drivers/md/raid5.c)
- mdadm man pages
- Kernel documentation (Documentation/md.txt, now Documentation/admin-guide/md.rst)