Advanced Tuning for Hardware RAID Controllers (cciss/scsi) on Linux: Sysctl and Sysfs Optimizations for High-Performance Storage


When working with enterprise-grade hardware RAID controllers like HP Smart Array, PERC, or LSI MegaRAID, Linux presents several tunable parameters that significantly impact performance. Unlike software RAID, these controllers handle parity calculations and cache management in hardware, but the OS still plays a crucial role in I/O path optimization.

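# For HP Smart Array controllers (cciss driver)
# The sysfs name replaces "/" with "!", which needs a single backslash in the shell
echo "noop" > /sys/block/cciss\!c0d0/queue/scheduler
# --setra (512-byte sectors) and read_ahead_kb (KiB) set the same value; the last write wins
blockdev --setra 65536 /dev/cciss/c0d0
echo 512 > /sys/block/cciss\!c0d0/queue/nr_requests
echo 2048 > /sys/block/cciss\!c0d0/queue/read_ahead_kb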

# For modern SCSI/SAS controllers
echo "deadline" > /sys/block/sdX/queue/scheduler
blockdev --setra 131072 /dev/sdX
echo 1024 > /sys/block/sdX/queue/nr_requests
echo 4096 > /sys/block/sdX/queue/read_ahead_kb
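
Each block above sets read-ahead twice, once through blockdev and once through sysfs. The two commands write the same value in different units, so it helps to convert between them and to derive the sysfs directory name from the device node. The following is a minimal sketch; the helper names are illustrative and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Map a device node to its directory name under /sys/block.
# The kernel replaces "/" in the device name with "!".
sysfs_name() {
    printf '%s\n' "${1#/dev/}" | tr '/' '!'
}

# Convert a blockdev --setra value (512-byte sectors) to read_ahead_kb units (KiB).
sectors_to_kb() {
    echo $(( $1 * 512 / 1024 ))
}

sysfs_name /dev/cciss/c0d0    # prints cciss!c0d0
sectors_to_kb 65536           # prints 32768
```

Seen this way, `blockdev --setra 65536` asks for 32 MiB of read-ahead, and the later write of 2048 to read_ahead_kb reduces it to 2 MiB.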

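The choice between noop, deadline, and other schedulers depends on your workload. On kernels 5.0 and later, which use the multi-queue block layer exclusively, the corresponding names are none, mq-deadline, and bfq:

  • noop: Best for battery-backed controllers with sophisticated cache algorithms, since the controller reorders requests itself
  • deadline: Preferred for mixed read/write workloads on direct-attached storage, where bounded request latency matters
  • cfq: Shares I/O time fairly between processes; worth considering only when many unrelated workloads share the same volume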
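
Because the schedulers on offer vary from one kernel to the next, a tuning script can read the queue/scheduler file and pick the first preferred name that is actually listed. This is a sketch under that assumption; `pick_scheduler` is a hypothetical helper.

```shell
#!/bin/sh
# Pick the first preferred scheduler that the kernel actually offers.
# $1: contents of queue/scheduler, e.g. "noop deadline [cfq]"
# $2...: preferred scheduler names, in order of preference
pick_scheduler() {
    # Strip the brackets that mark the active scheduler.
    available=$(printf '%s\n' "$1" | sed 's/[][]//g')
    shift
    for want in "$@"; do
        for have in $available; do
            if [ "$want" = "$have" ]; then
                echo "$want"
                return 0
            fi
        done
    done
    return 1
}

pick_scheduler "noop deadline [cfq]" none noop            # prints noop
pick_scheduler "[mq-deadline] kyber bfq none" none noop   # prints none
```

On a real system the first argument would come from `cat /sys/block/sdX/queue/scheduler`, and the result would be written back to the same file.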

For controllers with battery-backed cache (BBWC) or flash-backed cache (FBWC), these settings are critical:

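# Enable the drive-level write cache (only safe with a healthy BBU/FBWC).
# Many hardware RAID logical volumes ignore hdparm; in that case set the
# cache policy with the controller's own management utility.
hdparm -W 1 /dev/sdX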

# Limit how much dirty data the page cache holds before writeback starts
# (standard kernel tunables, not specific to RHEL/CentOS)
echo "10" > /proc/sys/vm/dirty_ratio
echo "5" > /proc/sys/vm/dirty_background_ratio
echo "500" > /proc/sys/vm/dirty_expire_centisecs
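
To see why low ratios matter on large-memory servers, it helps to turn the percentages into absolute sizes. The sketch below is an approximation: the kernel applies the ratio to reclaimable-plus-free memory rather than to a fixed total, and `dirty_limit_mib` is a hypothetical helper.

```shell
#!/bin/sh
# Approximate the dirty-page ceiling in MiB for a given ratio.
# $1: memory in kB (for example MemAvailable from /proc/meminfo)
# $2: ratio in percent (vm.dirty_ratio or vm.dirty_background_ratio)
dirty_limit_mib() {
    echo $(( $1 * $2 / 100 / 1024 ))
}

dirty_limit_mib 67108864 10   # 64 GiB at 10%: prints 6553
dirty_limit_mib 67108864 5    # 64 GiB at 5%:  prints 3276
```

Even at 10%, a 64 GiB host can queue several gigabytes of dirty pages, far more than a typical controller cache can absorb in one burst.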

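These driver parameters use module.parameter syntax, so they belong on the kernel command line (or as options lines in /etc/modprobe.d/), not in /etc/sysctl.conf. Parameter names vary by driver version, so confirm each one with modinfo before relying on it: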

# Increase SCSI command timeout (seconds)
scsi_mod.scsi_timeout=60

# Max number of SG_IO commands
sg.dev_hndl_max=1024

# CCISS driver options (HP SmartArray)
cciss.cciss_max_sectors=128
cciss.cciss_allow_hpsa=1
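
Before adding a driver parameter to the boot configuration, it is worth checking whether the loaded module exposes it at all. A minimal sketch, assuming the parameter is exported under /sys/module (parameters declared without sysfs permissions will not appear there); `module_param` is a hypothetical helper.

```shell
#!/bin/sh
# Report the current value of a module parameter, or fail if it is not exposed.
# $1: module name, $2: parameter name
# $3: sysfs root (defaults to /sys; overridable so the function can be tested)
module_param() {
    path="${3:-/sys}/module/$1/parameters/$2"
    if [ -r "$path" ]; then
        printf '%s.%s=%s\n' "$1" "$2" "$(cat "$path")"
    else
        printf '%s.%s: not available\n' "$1" "$2" >&2
        return 1
    fi
}
```

For example, `module_param cciss cciss_allow_hpsa` prints the current value when the cciss driver is loaded and exports that parameter, and returns a non-zero status otherwise.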

Proper partition and filesystem alignment matters for any striped layout, including RAID 10:

# Create aligned partition (2048-sector alignment)
parted -a optimal /dev/sdX mklabel gpt
parted -a optimal /dev/sdX mkpart primary 2048s 100%

# Format with stripe geometry: su = controller strip size, sw = number of data-bearing disks
mkfs.xfs -f -d su=64k,sw=4 -l su=64k /dev/sdX1

# Mount options for XFS
mount -o noatime,nodiratime,logbsize=256k,logbufs=8 /dev/sdX1 /data
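
The su and sw values follow from the array layout, and the 2048-sector partition start can be checked against the resulting full stripe. The sketch below assumes the usual data-disk counts per RAID level; both helper names are hypothetical.

```shell
#!/bin/sh
# Print mkfs.xfs stripe options for a hardware RAID volume.
# $1: RAID level (0, 5, 6, 10), $2: number of disks, $3: strip size in KiB
xfs_geometry() {
    case "$1" in
        0)  data=$2 ;;
        5)  data=$(( $2 - 1 )) ;;
        6)  data=$(( $2 - 2 )) ;;
        10) data=$(( $2 / 2 )) ;;
        *)  echo "unsupported RAID level: $1" >&2; return 1 ;;
    esac
    echo "su=${3}k,sw=${data}"
}

# Succeed when a partition start falls on a full-stripe boundary.
# $1: start in 512-byte sectors, $2: strip size in KiB, $3: data disks
is_stripe_aligned() {
    [ $(( $1 * 512 % ($2 * 1024 * $3) )) -eq 0 ]
}

xfs_geometry 10 8 64                            # prints su=64k,sw=4
is_stripe_aligned 2048 64 4 && echo "aligned"   # 1 MiB start, 256 KiB stripe
```

An eight-disk RAID 10 with a 64 KiB strip gives exactly the `su=64k,sw=4` used above, and a 1 MiB partition start is a whole multiple of its 256 KiB full stripe.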

Essential tools for verifying your changes:

# Check current queue parameters
cat /sys/block/sdX/queue/scheduler
blockdev --getra /dev/sdX

# Real-time monitoring
iostat -xmt 2
sar -d 1 3

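# Comprehensive benchmark
# WARNING: randrw against the raw device overwrites its contents; use an unused volume or a test file
fio --filename=/dev/sdX --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=64 --runtime=60 --numjobs=4 --time_based --group_reporting --name=iotest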
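
Since the fio job above issues writes straight to the block device, a simple guard against running it on a volume that is in use can prevent an accident. This is a sketch only: it checks a mounts table by prefix match and does not detect LVM, md, or swap usage. `device_in_use` is a hypothetical helper.

```shell
#!/bin/sh
# Succeed if the device, or any partition whose name starts with it,
# appears in the first column of a mounts table.
# $1: device (e.g. /dev/sdb), $2: mounts file (defaults to /proc/mounts)
device_in_use() {
    awk -v dev="$1" '
        index($1, dev) == 1 { found = 1 }
        END { exit !found }
    ' "${2:-/proc/mounts}"
}

# Example guard (commented out so that nothing destructive runs here):
# device_in_use /dev/sdX && { echo "refusing: /dev/sdX is mounted" >&2; exit 1; }
```

The prefix match is deliberately conservative: it may refuse a device whose name merely shares a prefix with a mounted one, which is the safer failure mode for a destructive benchmark.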

When dealing with enterprise-grade hardware RAID controllers like HP Smart Array, LSI MegaRAID, or Dell PERC, the default Linux storage stack configuration often leaves performance on the table. Modern controllers with battery/flash-backed cache and multiple SAS drives (especially in RAID 10 configurations) can benefit significantly from targeted tuning.

These sysfs parameters form the foundation of RAID controller optimization:

# Basic tuning for HP Smart Array (cciss)
echo "noop" > /sys/block/cciss\!c0d0/queue/scheduler
blockdev --setra 65536 /dev/cciss/c0d0
echo 512 > /sys/block/cciss\!c0d0/queue/nr_requests
echo 2048 > /sys/block/cciss\!c0d0/queue/read_ahead_kb

# For newer SCSI-based controllers (like PERC H730)
echo "noop" > /sys/block/sda/queue/scheduler
blockdev --setra 65536 /dev/sda
echo 512 > /sys/block/sda/queue/nr_requests
echo 4096 > /sys/block/sda/queue/read_ahead_kb
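
Values written through sysfs are lost at reboot. One common way to persist them is a udev rule that reapplies the attributes whenever the device appears. The sketch below only prints such a rule; the helper name and the rules file name in the usage note are assumptions.

```shell
#!/bin/sh
# Print a udev rule that sets the scheduler and read-ahead for one block device.
# $1: kernel device name (e.g. sda), $2: scheduler, $3: read_ahead_kb
udev_rule() {
    printf 'ACTION=="add|change", SUBSYSTEM=="block", KERNEL=="%s", ATTR{queue/scheduler}="%s", ATTR{queue/read_ahead_kb}="%s"\n' "$1" "$2" "$3"
}

udev_rule sda noop 4096
```

Redirecting the output to a file such as /etc/udev/rules.d/60-io-tuning.rules (the name is arbitrary) and running `udevadm control --reload` followed by `udevadm trigger` applies it without a reboot.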

The interaction between device queue depth and Linux block layer settings is critical:

# Check current queue depth
cat /sys/block/sda/device/queue_depth

# Optimal settings for modern controllers (adjust based on your workload)
echo 128 > /sys/block/sda/queue/nr_requests
echo 64 > /sys/block/sda/device/queue_depth
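
As a rule of thumb (an assumption, not a hard kernel requirement), nr_requests should be at least as large as queue_depth; otherwise the block layer cannot hold enough requests to keep the device queue full. A small sanity check, with a hypothetical helper name:

```shell
#!/bin/sh
# Compare the block-layer queue size with the device queue depth.
# $1: nr_requests, $2: queue_depth
check_queue_pair() {
    if [ "$1" -lt "$2" ]; then
        echo "warn: nr_requests ($1) is below queue_depth ($2)"
        return 1
    fi
    echo "ok: nr_requests=$1 queue_depth=$2"
}

check_queue_pair 128 64   # the values used above
```

On a live system the arguments would come from `cat /sys/block/sda/queue/nr_requests` and `cat /sys/block/sda/device/queue_depth`.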

Digging deeper into controller-specific settings can yield additional gains:

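# Driver-specific knobs differ between drivers and kernel versions. The two paths
# below are not part of the generic SCSI interface, so write only if they exist.
# For CCISS controllers
[ -w /proc/driver/cciss/cciss0_allow_highspeed ] && echo 1 > /proc/driver/cciss/cciss0_allow_highspeed

# For SCSI controllers (check your driver first)
[ -w /sys/class/scsi_host/host0/host_param ] && echo "max_performance=1" > /sys/class/scsi_host/host0/host_param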

# Disk timeout adjustment (critical for RAID rebuild scenarios)
echo 180 > /sys/block/sda/device/timeout

These sysctl parameters apply system-wide and govern how the page cache and memory reclaim feed the storage stack:

# Add to /etc/sysctl.conf
vm.dirty_ratio = 10
vm.dirty_background_ratio = 5
vm.swappiness = 1
vm.zone_reclaim_mode = 0
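
Each sysctl key corresponds to a file under /proc/sys, which is how the dotted names here relate to the /proc paths written directly earlier. A minimal sketch of the mapping, with a hypothetical helper name:

```shell
#!/bin/sh
# Translate a dotted sysctl key into its /proc/sys path.
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

sysctl_path vm.dirty_background_ratio   # prints /proc/sys/vm/dirty_background_ratio
```

After editing /etc/sysctl.conf, `sysctl -p` loads the file immediately, and `cat "$(sysctl_path vm.swappiness)"` confirms the live value.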

Different applications require different approaches:

# For OLTP databases (random I/O focused)
echo "deadline" > /sys/block/sda/queue/scheduler
echo 256 > /sys/block/sda/queue/nr_requests
echo 32 > /sys/block/sda/device/queue_depth

# For sequential workloads (backups, media)
echo "noop" > /sys/block/sda/queue/scheduler
echo 1024 > /sys/block/sda/queue/read_ahead_kb
echo 128 > /sys/block/sda/queue/nr_requests
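
The two profiles above can be wrapped in one function that prints the sysfs writes for review instead of executing them. This is a dry-run sketch that mirrors the values listed; `profile_commands` is a hypothetical helper.

```shell
#!/bin/sh
# Print (do not run) the sysfs writes for a workload profile.
# $1: profile (oltp or sequential), $2: block device name (e.g. sda)
profile_commands() {
    q="/sys/block/$2/queue"
    case "$1" in
        oltp)
            echo "echo deadline > $q/scheduler"
            echo "echo 256 > $q/nr_requests"
            echo "echo 32 > /sys/block/$2/device/queue_depth"
            ;;
        sequential)
            echo "echo noop > $q/scheduler"
            echo "echo 1024 > $q/read_ahead_kb"
            echo "echo 128 > $q/nr_requests"
            ;;
        *)
            echo "unknown profile: $1" >&2
            return 1
            ;;
    esac
}

profile_commands oltp sda
```

Once the output looks right, piping it to `sh` as root applies the profile.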

Essential tools to verify your changes:

# Basic monitoring
iostat -xmt 1
sar -d 1

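# Advanced analysis (stop blktrace with Ctrl-C, then merge the per-CPU traces for btt)
blktrace -d /dev/sda -o trace
blkparse -i trace -d trace.bin > /dev/null
btt -i trace.bin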

# Check actual I/O scheduler in use
cat /sys/block/sda/queue/scheduler
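
The scheduler file lists every available scheduler and marks the active one with square brackets. For scripted checks, the active name can be extracted as follows; this is a small sketch, and `active_scheduler` is a hypothetical helper.

```shell
#!/bin/sh
# Extract the bracketed (active) scheduler from the contents of queue/scheduler.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

active_scheduler "noop deadline [cfq]"   # prints cfq
```

Typical use is `active_scheduler "$(cat /sys/block/sda/queue/scheduler)"`, compared against the value the tuning script intended to set.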

Don't overlook these critical hardware settings:

  • Cache ratio: 75% write / 25% read for mixed workloads
  • Strip size: 256KB for general purpose, 1MB+ for sequential workloads
  • Always enable write-back cache when BBU/flash backup exists
  • Disable controller-side read ahead when using Linux-side read ahead