When dealing with high-frequency messaging applications, even sub-millisecond latency differences matter. Your measured 0.23ms RTT between hosts suggests room for optimization. Let's break down the investigation methodology.
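Before touching anything, set up one measurement you can re-run after every change. Below is a minimal sketch of a UDP round-trip probe (the peer address, port and sample count are placeholders, and it assumes a simple UDP echo service is already running on the other host):

/* rtt_probe.c: minimal UDP round-trip probe (sketch, not a production tool).
 * PEER_IP / PEER_PORT are placeholders; a UDP echo service must answer there. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

#define PEER_IP   "192.0.2.10"   /* placeholder peer address */
#define PEER_PORT 9000           /* placeholder echo port */
#define SAMPLES   1000

static double now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1e6 + ts.tv_nsec / 1e3;
}

int main(void)
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in peer = { .sin_family = AF_INET,
                                .sin_port   = htons(PEER_PORT) };
    inet_pton(AF_INET, PEER_IP, &peer.sin_addr);

    char buf[64] = "rtt-probe";
    double min = 1e12, total = 0;

    for (int i = 0; i < SAMPLES; i++) {
        double t0 = now_us();
        sendto(sock, buf, sizeof(buf), 0, (struct sockaddr *)&peer, sizeof(peer));
        recv(sock, buf, sizeof(buf), 0);        /* blocks until the echo comes back */
        double rtt = now_us() - t0;
        total += rtt;
        if (rtt < min)
            min = rtt;
    }

    printf("min %.1f us, avg %.1f us over %d samples\n", min, total / SAMPLES, SAMPLES);
    close(sock);
    return 0;
}

Build with gcc -O2 -o rtt_probe rtt_probe.c and record the numbers after each tuning step; at this scale the minimum tends to be the more telling figure than the average.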
# First run the NIC self-tests to rule out hardware faults ('online' runs
# without disrupting traffic; 'offline' is more thorough but drops the link)
ethtool -t eth0 online
ethtool -t eth1 online

# Then measure base latency without switch interference: take the ports down,
# cable the two hosts back-to-back (direct connect), bring them up and re-test
ip link set eth0 down
ip link set eth1 down
First, establish baseline NIC performance; the ethtool self-test results will expose hardware limitations. Next, check the ring buffers (ethtool -g / -G below work for most drivers, Intel included):
# Check NIC ring buffer settings
ethtool -g eth0
# Sample output (current hardware settings):
#   RX: 4096
#   TX: 4096

# Consider reducing for low latency; smaller rings mean less queueing in the
# NIC, at the cost of a higher drop risk under traffic bursts
ethtool -G eth0 rx 256 tx 256
A modern switch should add well under 0.1ms; cut-through ASICs typically sit in the low single-digit microseconds. Verify on the switch itself:
# On the switch CLI (Cisco example):
show platform hardware fed switch active fwd-asic resource-utilization
show platform hardware fed switch active fwd-asic resource tcam utilization
Adjust these sysctl parameters in /etc/sysctl.conf (apply them with sysctl -p):
net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_low_latency = 1
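One note on those buffer limits: raising rmem_max/wmem_max only lifts the ceiling. TCP autotunes inside tcp_rmem/tcp_wmem, but a UDP socket keeps the defaults unless it explicitly asks for more, roughly like this (the 4 MB figure is purely illustrative, not a recommendation):

/* Request larger per-socket buffers, up to the net.core.*mem_max ceiling. */
#include <stdio.h>
#include <sys/socket.h>

static void size_socket_buffers(int sock)
{
    int bytes = 4 * 1024 * 1024;    /* illustrative value */
    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        perror("SO_RCVBUF");
    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) < 0)
        perror("SO_SNDBUF");

    /* Read back what the kernel actually granted: it doubles the request
     * for bookkeeping and clamps it to rmem_max / wmem_max. */
    socklen_t len = sizeof(bytes);
    getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bytes, &len);
    printf("effective receive buffer: %d bytes\n", bytes);
}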
For UDP-based applications (common in HFT), consider this sender configuration:
// C code snippet for UDP socket tuning (error handling omitted)
int sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);

// Busy-poll the receive queue for up to 50 us instead of sleeping on
// interrupts; SO_BUSY_POLL takes a value in microseconds, not a boolean
int busy_poll_us = 50;
setsockopt(sock, SOL_SOCKET, SO_BUSY_POLL, &busy_poll_us, sizeof(busy_poll_us));

// SO_TIMESTAMPING expects a bitmask of SOF_TIMESTAMPING_* flags rather than
// a plain 1; the full flag setup is shown further down
Bad cables usually show up as CRC errors, drops and retransmissions rather than as a clean latency increase. Check with:
# Check for physical-layer errors
ethtool -S eth0 | grep -E 'err|drop'
# Compare the counters before and after replacing the cable
# Pin the NIC's IRQ handlers to specific cores; stop irqbalance first so it
# doesn't rewrite the affinity. The mask is hex: 3 = CPUs 0 and 1.
for irq in $(grep eth0 /proc/interrupts | awk '{print $1}' | sed 's/://'); do
    echo 3 > /proc/irq/$irq/smp_affinity
done
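IRQ affinity is only half of the story; pin the messaging thread as well so the scheduler cannot migrate it mid-burst. A minimal sketch (the CPU number is a placeholder; whether the thread should share the IRQ core or sit on an adjacent one is workload-dependent, so measure both):

/* Pin the calling thread to one CPU (the CPU number is a placeholder). */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

static int pin_current_thread(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

int main(void)
{
    int rc = pin_current_thread(2);   /* e.g. CPU 2, next to the IRQs on CPUs 0-1 */
    if (rc != 0)
        fprintf(stderr, "pin failed: %s\n", strerror(rc));

    /* ... latency-critical send/receive loop goes here ... */
    return 0;
}

The same effect can be had from outside the process with taskset -c 2 ./app.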
For ultimate performance, consider DPDK or XDP:
# XDP example: attach an XDP object to the NIC (detach again with "ip link set dev eth0 xdp off")
ip link set dev eth0 xdp obj xdp_drop.o sec xdp
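The xdp_drop.o object above isn't shown, so purely as an illustration, this is roughly what a minimal XDP program looks like. This one drops every frame, which is the canonical hello-world rather than anything you would attach to a production NIC; a real low-latency path would parse the packet and return XDP_PASS or redirect it to an AF_XDP socket:

/* xdp_prog.c: minimal illustrative XDP program (not the actual xdp_drop.o
 * referenced above). Build with: clang -O2 -g -target bpf -c xdp_prog.c -o xdp_drop.o */
#include <linux/bpf.h>

#ifndef SEC
#define SEC(name) __attribute__((section(name), used))
#endif

SEC("xdp")                        /* matches "sec xdp" in the ip link command */
int xdp_prog(struct xdp_md *ctx)
{
    (void)ctx;
    return XDP_DROP;              /* drop every frame at the driver level */
}

char _license[] SEC("license") = "GPL";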
When chasing sub-millisecond targets (your 0.23ms current RTT against a 0.1ms goal), the measurements themselves need surgical precision. Traditional tools like ping and Wireshark can confirm the round-trip time, but they rarely show where it is being spent.
For nanosecond-level analysis:
# Install precision timing tools
sudo apt install linuxptp ethtool tuned-utils
Run these on both hosts:
# Check NIC interrupt coalescing
ethtool -c eth0
# See how the NIC's queues and IRQs are spread across CPUs
grep eth0 /proc/interrupts
# Check which hardware offloads the driver exposes (hw-tc-offload shown here)
sudo ethtool -k eth0 | grep hw-tc-offload
Key switch parameters affecting sub-1ms latency:
# Juniper example: check the port's forwarding mode
show interface xe-0/0/0 | match "cut-through|store-and-forward"
# Should report cut-through for the lowest latency
Critical /etc/sysctl.conf parameters:
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.ipv4.tcp_no_metrics_save=1
net.ipv4.tcp_low_latency=1
For UDP-based high-frequency messaging, request hardware timestamps on the socket:
// C example using SO_TIMESTAMPING (flags come from <linux/net_tstamp.h>);
// the NIC and driver must support hardware timestamping (check with: ethtool -T eth0)
int flags = SOF_TIMESTAMPING_TX_HARDWARE |
            SOF_TIMESTAMPING_RX_HARDWARE |
            SOF_TIMESTAMPING_RAW_HARDWARE;
setsockopt(sock_fd, SOL_SOCKET, SO_TIMESTAMPING, &flags, sizeof(flags));
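Setting those flags only asks the kernel to report timestamps. In practice two more pieces are needed: the interface has to be told to generate hardware stamps (SIOCSHWTSTAMP, which usually needs CAP_NET_ADMIN), and the stamps have to be read out of the ancillary data on each receive. A sketch, assuming the sock_fd and eth0 from above, with error handling trimmed:

/* Enable hardware timestamping on the interface, then read RX stamps back
 * from recvmsg() control messages. Sketch only; check every return value
 * in real code. */
#include <time.h>               /* struct timespec, needed by linux/errqueue.h */
#include <linux/errqueue.h>     /* struct scm_timestamping */
#include <linux/net_tstamp.h>   /* struct hwtstamp_config, HWTSTAMP_* */
#include <linux/sockios.h>      /* SIOCSHWTSTAMP */
#include <net/if.h>             /* struct ifreq, IFNAMSIZ */
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

#ifndef SCM_TIMESTAMPING
#define SCM_TIMESTAMPING SO_TIMESTAMPING
#endif

/* 1) Tell the driver to actually generate hardware timestamps. */
static int enable_hw_timestamps(int sock_fd, const char *ifname)
{
    struct hwtstamp_config cfg = {
        .tx_type   = HWTSTAMP_TX_ON,
        .rx_filter = HWTSTAMP_FILTER_ALL,
    };
    struct ifreq ifr;

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
    ifr.ifr_data = (char *)&cfg;
    return ioctl(sock_fd, SIOCSHWTSTAMP, &ifr);   /* usually needs CAP_NET_ADMIN */
}

/* 2) Pull the RX hardware timestamp out of the control messages. */
static void recv_with_hw_timestamp(int sock_fd)
{
    char payload[2048];
    char control[512];
    struct iovec iov = { .iov_base = payload, .iov_len = sizeof(payload) };
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = control, .msg_controllen = sizeof(control),
    };

    if (recvmsg(sock_fd, &msg, 0) < 0)
        return;

    for (struct cmsghdr *cm = CMSG_FIRSTHDR(&msg); cm != NULL;
         cm = CMSG_NXTHDR(&msg, cm)) {
        if (cm->cmsg_level == SOL_SOCKET && cm->cmsg_type == SCM_TIMESTAMPING) {
            struct scm_timestamping *ts =
                (struct scm_timestamping *)CMSG_DATA(cm);
            /* ts->ts[0] is the software stamp, ts->ts[2] the raw NIC stamp */
            printf("hw rx stamp: %lld.%09ld\n",
                   (long long)ts->ts[2].tv_sec, ts->ts[2].tv_nsec);
        }
    }
}

TX stamps are delivered back on the socket's error queue (read with MSG_ERRQUEUE). If the two NIC clocks are synchronized, for example with the linuxptp tools installed earlier, comparing the receiver's RX stamp with the sender's TX stamp isolates the wire-plus-switch share of the 0.23ms from host-side processing.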
Use the NIC's cable diagnostics (Time Domain Reflectometry, TDR) if supported:
# Requires a reasonably recent kernel/ethtool and a driver/PHY that implements cable tests
ethtool --cable-test eth0
For reference, on one financial trading system we reduced latency from 0.25ms to 0.09ms by:
- Enabling NIC kernel bypass (DPDK)
- Configuring switch port buffers to 64 bytes
- Using kernel-bypass libraries like libfabric