Optimal RAID 5 Configuration for High-Performance KVM Virtualization on Dell R920 with 24×1.2TB Drives



When configuring RAID 5 with 24x1.2TB 10K SAS drives on Dell PowerEdge R920 hardware, we need to balance several factors:

# Sample lsblk output for reference:
NAME    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda       8:0    0   1.2T  0 disk 
├─sda1    8:1    0   500M  0 part /boot
└─sda2    8:2    0   1.2T  0 part 
  ├─vg-root 253:0    0    50G  0 lvm  /
  └─vg-data 253:1    0   1.1T  0 lvm  /data

The stripe size (or stripe width) refers to the amount of data written to each disk before moving to the next disk in the array. For your 23-disk configuration (22 data + 1 parity):

  • 1MB stripe size means 1MB total per stripe across all disks
  • Each disk receives 1MB/22 ≈ 46.5KB per stripe
  • Small files will occupy partial stripes (no space is 'wasted' as multiple files share stripes)

For the H730P controller with your workload:

# Recommended MegaCLI configuration:
MegaCli -CfgLdAdd -r5 [252:0,252:1,252:2,...] WB Direct NoCachedBadBBU -strpsz1024 -a0
MegaCli -LDSetProp -EnDskCache -Lall -a0
MegaCli -LDSetProp -RA -Lall -a0
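
If it helps, you can read the policies back after creation to confirm they stuck (the -LDGetProp options below exist in the MegaCLI builds I have used, but verify against your version):

# Read back the cache and disk-cache policies on all logical drives:
MegaCli -LDGetProp -Cache -Lall -a0
MegaCli -LDGetProp -DskCache -Lall -a0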

For optimal ext4 performance on RAID 5:

# Calculate stride and stripe-width:
# chunk_size (KB) = stripe_size / number_of_data_disks
# stride = chunk_size / block_size (typically 4KB)

mkfs.ext4 -b 4096 -E stride=16,stripe-width=352 /dev/mapper/vg-data
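
To confirm the stride and stripe-width actually made it into the superblock, tune2fs can read them back:

# 'RAID stride' and 'RAID stripe width' should match the mkfs values:
tune2fs -l /dev/mapper/vg-data | grep -i raid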

After configuration, verify performance with:

# Basic benchmark:
hdparm -tT /dev/mapper/vg-data

# Advanced testing (point --filename at a file on the RAID-backed mount, e.g. /data):
fio --name=randwrite --ioengine=libaio --iodepth=32 \
--rw=randwrite --bs=4k --direct=1 --size=10G --numjobs=4 \
--runtime=60 --group_reporting --filename=/data/fio-randwrite.bin

For VM storage backend:

<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source dev='/dev/mapper/vg-vm01'/>
  <target dev='vda' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>
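
The <source dev='/dev/mapper/vg-vm01'/> line assumes one raw logical volume per guest. Something like the following would create it; the volume group name 'vg' and the 100G size are only placeholders:

# Example only: carve out a dedicated LV to back the vm01 guest
lvcreate -L 100G -n vm01 vg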

Consider adding discard='unmap' for TRIM support on SSD-backed storage.
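
As a sketch, the attribute goes on the driver line of the disk definition; note that discard passthrough generally requires a virtio-scsi disk or a reasonably recent QEMU/libvirt, so verify support on your hypervisor first:

<driver name='qemu' type='raw' cache='none' io='native' discard='unmap'/>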


When configuring RAID 5 with 24 drives (23 drives' worth of data capacity plus one drive's worth of distributed parity), stripe size becomes critical for performance. On the PERC, the stripe size is the chunk of data written to each individual disk before moving on to the next one, not the total written across the whole array. Your calculation of 1MB/22 ≈ 46.5KB per disk is therefore incorrect: with a 1MB stripe size, each disk receives a full 1MB per stripe element.

For your Dell R920 with H730P controller, consider these settings:

# Example MegaCLI configuration (adjust for your controller)
MegaCli -CfgLdAdd -r5 [252:0,252:1,252:2,...] WB Direct -strpsz1024 -a0
# Where '252' is the enclosure ID and the numbers after the colon are slot numbers
# WB = Write Back, Direct = Direct I/O cache policy, -strpsz1024 = 1MB stripe size
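
Once the virtual disk exists, it is worth reading back what the controller actually applied:

# Confirm the strip size and cache policy on the new virtual disk:
MegaCli -LDInfo -Lall -a0 | grep -iE 'strip|cache'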

For VM workloads with mixed file sizes:

  • Small files (4KB-64KB): 64KB stripe size
  • Medium files (64KB-1MB): 256KB stripe size
  • Large files (>1MB): 1MB stripe size

Your 10K RPM SAS drives can handle 1MB stripes effectively for large VM disk images.

For optimal performance with ext4 on CentOS 6:

# Calculate stride and stripe-width:
# stride = stripe_size / block_size (typically 4KB)
# stripe-width = stride * (number_of_data_disks)

# For 1MB stripe size and 23 data drives:
mkfs.ext4 -b 4096 -E stride=256,stripe-width=5888 /dev/sdX
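
If you layer LVM on top of the virtual disk before running mkfs (as your lsblk output suggests), aligning the physical volume to the strip size is also worth doing. A sketch, assuming the RAID 5 virtual disk shows up as /dev/sdb:

# Align the PV data area to the 1MB strip (device name is an example):
pvcreate --dataalignment 1m /dev/sdb
vgcreate vg /dev/sdb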

Verify your configuration with fio:

# Sequential read/write test
fio --filename=/mnt/raidtest/testfile --direct=1 \
--rw=rw --bs=1M --size=10G --numjobs=4 \
--runtime=60 --group_reporting --name=seq-test

# Random IO test for VM workload simulation
fio --filename=/mnt/raidtest/testfile --direct=1 \
--rw=randrw --bs=4k --size=10G --numjobs=16 \
--runtime=60 --group_reporting --name=rand-test

With the 2GB NV cache on the H730P (example MegaCLI commands follow this list):

  • Enable Write Back with BBU
  • Set disk cache policy to "Enabled" (default is usually disabled)
  • If your controller management tools expose a cache allocation split, consider weighting it toward writes (roughly 75% write / 25% read)
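
The first two settings map to MegaCLI roughly as follows; option spelling can differ between releases, so double-check against your build:

# Write Back that drops to Write Through if the BBU fails, plus physical disk cache enabled:
MegaCli -LDSetProp WB -Lall -a0
MegaCli -LDSetProp NoCachedBadBBU -Lall -a0
MegaCli -LDSetProp -EnDskCache -Lall -a0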

Implement regular performance monitoring:

# Check RAID status
MegaCli -LDInfo -LAll -aAll
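
# Write-back caching depends on a healthy battery, so watch the BBU as well
# (MegaCLI option naming may vary slightly by version):
MegaCli -AdpBbuCmd -GetBbuStatus -aAll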

# Check individual disk health behind the PERC (N is the drive's megaraid device ID)
smartctl -a -d megaraid,N /dev/sdX
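
# Loop over every drive behind the controller; /dev/sda is only the access path,
# and device IDs are usually 0-23 on a 24-slot backplane (confirm with 'MegaCli -PDList -aAll'):
for i in $(seq 0 23); do smartctl -H -d megaraid,$i /dev/sda; done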

# Performance monitoring with iostat
iostat -xm 1