Advanced iSCSI Performance Tuning: OS-Specific Configuration Guide for VMware, Hyper-V, Linux and AIX


When implementing iSCSI storage, we're essentially transporting SCSI commands over TCP/IP networks. This creates unique challenges compared to Fibre Channel:

  • Ethernet's default behavior of dropping frames during congestion
  • TCP retransmission latency impacting storage responsiveness
  • Jumbo frame misconfigurations causing fragmentation (see the MTU check below)
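
A quick way to catch the jumbo frame problem from a Linux initiator is a do-not-fragment ping sized just under 9000 bytes; if any hop in the path is still at MTU 1500, the ping fails immediately instead of silently fragmenting. The target address below is a placeholder for your iSCSI portal:

# 8972 = 9000 bytes minus the 20-byte IP and 8-byte ICMP headers
ping -M do -s 8972 -c 4 192.168.50.10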

VMware ESXi 4/5 Configuration

esxcli system module parameters set -m iscsi_vmk -p iscsi_max_r2t=8
esxcli system module parameters set -m iscsi_vmk -p iscsi_imm_mode=1
esxcfg-advcfg -s 32 /Net/TcpipHeapSize
esxcfg-advcfg -s 128 /Net/TcpipHeapMax
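
The heap values can be read back with the -g flag to confirm the change was accepted; note that TCP/IP heap changes generally require a host reboot before they take effect:

esxcfg-advcfg -g /Net/TcpipHeapSize
esxcfg-advcfg -g /Net/TcpipHeapMax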

Windows Hyper-V 2008/R2 Optimization

netsh int tcp set global rss=enabled
netsh int tcp set global chimney=enabled
Set-NetTCPSetting -SettingName Datacenter -InitialCongestionWindowMss 10 -CongestionProvider CTCP
Set-NetOffloadGlobalSetting -ReceiveSideScaling Enabled

Switch configuration plays a critical role in iSCSI performance:

  • Enable flow control (802.3x pause frames) on all iSCSI ports, and verify negotiation from the host side (see the ethtool check below)
  • Configure QoS with a dedicated DSCP marking for iSCSI traffic (CS4, matching the class-map later in this guide)
  • Configure iSCSI-facing ports as spanning-tree edge/PortFast ports so they skip STP convergence delays (as in the Nexus example later)
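
Flow control only helps if both the switch port and the initiator NIC negotiate it, so it is worth confirming from the host side as well; ethtool reports the negotiated pause settings (ethX is a placeholder, as in the troubleshooting commands later):

# Confirm 802.3x pause-frame negotiation on the initiator NIC
ethtool -a ethX
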
# Example Linux multipath.conf for Dell EqualLogic
devices {
    device {
        vendor "EQLOGIC"
        product ".*"
        path_grouping_policy group_by_prio
        path_checker tur
        features "1 queue_if_no_path"
        prio "const"
        failback immediate
    }
}

For Linux bare metal implementations:

echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout
# tcp_tw_recycle breaks clients behind NAT and was removed in kernel 4.12;
# only enable it on older kernels with a dedicated, non-NAT iSCSI network
echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle
echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse
echo 4096 87380 16777216 > /proc/sys/net/ipv4/tcp_rmem
echo 4096 65536 16777216 > /proc/sys/net/ipv4/tcp_wmem

Essential commands for troubleshooting:

# Windows
Get-SmbClientNetworkInterface | fl
Test-NetConnection -ComputerName target -Port 3260

# Linux
iscsiadm -m session -P 3
ethtool -S ethX | grep -i drop

Unlike Fibre Channel, iSCSI encapsulates SCSI commands within TCP packets, which gives it distinct performance characteristics. Unless flow control is explicitly enabled, Ethernet simply drops frames during congestion, and each dropped frame must then be recovered by a TCP retransmission before the SCSI command can complete. This manifests in storage-specific latency spikes:

# Example TCP retransmission impact (simplified Python simulation)
import random
import time

class FrameLossException(Exception):
    """Raised when a frame is dropped due to network congestion."""

def process_scsi_command():
    # Stand-in for servicing the SCSI command at the target
    return "OK"

def log_latency_spike(latency):
    print(f"iSCSI request completed in {latency * 1000:.1f} ms")

def iscsi_request():
    start_time = time.time()
    try:
        # Simulate a frame dropped due to congestion (30% chance here)
        if random.random() < 0.3:
            raise FrameLossException()
        return process_scsi_command()
    except FrameLossException:
        time.sleep(0.5)          # TCP retransmission timeout
        return iscsi_request()   # Retry after the timeout
    finally:
        latency = time.time() - start_time
        log_latency_spike(latency)

if __name__ == "__main__":
    iscsi_request()

For VMware environments, these CLI commands optimize iSCSI:

esxcli system module parameters set -m iscsi_vmk -p iscsivmk_LunDiscoveryPollTime=120
esxcli system module parameters set -m iscsi_vmk -p iscsivmk_MaxConnectionsPerSession=4
esxcli system module parameters set -m iscsi_vmk -p iscsivmk_SendSegmentLength=262144
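
The parameter names above vary between ESXi builds, so verify them against the module itself and confirm the new values were applied:

esxcli system module parameters list -m iscsi_vmk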

Key network adjustments:

  • Enable jumbo frames (MTU 9000) end-to-end and verify the path (see the vmkping check below)
  • Configure NIC teaming with explicit failover order (not load balancing) on the iSCSI VMkernel ports
  • Set the host advanced setting Net.TcpipHeapSize=32 (the same value applied with esxcfg-advcfg earlier)
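
The jumbo frame item is the one most worth validating before moving workloads, since a single mis-set switch port causes dropped frames and retransmissions. From the ESXi shell, vmkping with the don't-fragment flag and an 8972-byte payload fails fast if any hop is still at MTU 1500 (the address is a placeholder for your target portal):

vmkping -d -s 8972 192.168.50.10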

PowerShell commands for optimal iSCSI:

Set-NetTCPSetting -SettingName InternetCustom `
    -InitialRtoMs 3000 `
    -MinRtoMs 300 `
    -MaxSynRetransmissions 4 `
    -AutoTuningLevelLocal Restricted

Set-NetOffloadGlobalSetting -NetworkDirectAcrossIPSubnets Allowed
Set-NetAdapterAdvancedProperty -Name "iSCSI NIC" -DisplayName "Interrupt Moderation" -DisplayValue "Extreme"
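
These settings can be read back before reconnecting any iSCSI sessions, which catches typos in the template or adapter names early:

Get-NetTCPSetting -SettingName InternetCustom
Get-NetOffloadGlobalSetting
Get-NetAdapterAdvancedProperty -Name "iSCSI NIC" -DisplayName "Interrupt Moderation"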

Add these to /etc/sysctl.conf for bare metal Linux:

net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_mtu_probing = 1
net.ipv4.tcp_window_scaling = 1
net.core.rmem_max = 4194304
net.core.wmem_max = 4194304
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 87380 4194304
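
The settings load at boot; to apply them immediately and confirm one of the values, run:

sysctl -p
sysctl net.ipv4.tcp_rmem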

For Cisco Nexus switches:

interface Ethernet1/1
  mtu 9216
  flowcontrol receive on
  flowcontrol send on
  spanning-tree port type edge
  no cdp enable
  storm-control broadcast level 10

Critical QoS settings:

class-map match-any ISCSI
  match dscp cs4
policy-map TYPE_ISCSI
  class ISCSI
    bandwidth percent 50
    queue-limit packets 512

Example NetApp ONTAP tuning:

options iscsi.login_timeout 20
options iscsi.max_recv_data_segment_length 262144
options iscsi.max_xmit_data_segment_length 262144
options iscsi.time2wait 0
options iscsi.time2retain 0

For AIX VIO clients:

chdev -l iscsi0 -a max_targets=16 -a max_cmds=1024 -P
# sb_max must be at least as large as the socket buffer sizes that follow
no -p -o sb_max=8388608
no -p -o tcp_recvspace=4194304 -o tcp_sendspace=4194304
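
Because chdev with -P only stages the adapter change until the next reboot, read back the staged adapter attributes and the active network tunables to confirm the configuration (lsattr and no -a are standard AIX commands; the grep pattern is just a convenience):

lsattr -El iscsi0
no -a | grep -E "tcp_(recv|send)space"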

Sample fio job file for validation (note: it issues raw I/O against /dev/sdc and will destroy any data on that device):

[global]
ioengine=libaio
direct=1
runtime=300
time_based

[4k-random-read]
filename=/dev/sdc
rw=randread
bs=4k
iodepth=32
numjobs=4

[1m-sequential-write]
# wait for the 4k random-read job to finish before starting this one
stonewall
filename=/dev/sdc
rw=write
bs=1m
iodepth=8
numjobs=2
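
Saved as, for example, iscsi-validate.fio (the filename is arbitrary), the whole file runs with a single command; compare the reported IOPS and latency percentiles before and after applying the tuning above:

fio iscsi-validate.fio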