Optimizing Ubuntu TCP Stack for Ultra-Low Latency Network Applications


When dealing with high-frequency measurement systems where microsecond-level latency matters, every component in the data path must be optimized. The standard Ubuntu TCP/IP stack configuration isn't designed for ultra-low latency scenarios out of the box.

First, set the kernel parameters in /etc/sysctl.conf and load them with sudo sysctl -p:

# Disable TCP slow start after idle
net.ipv4.tcp_slow_start_after_idle = 0

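# The initial congestion window is a per-route setting, not a sysctl
# (there is no net.ipv4.tcp_init_cwnd). It already defaults to 10 segments
# on current kernels; to change it, substitute your gateway and run e.g.:
#   sudo ip route change default via GATEWAY dev eth0 initcwnd 10

# TCP low latency mode (a no-op since kernel 4.14; harmless but ineffective)
net.ipv4.tcp_low_latency = 1

# The minimum retransmission timeout is also per-route (default 200 ms);
# Ubuntu 22.04 kernels have no net.ipv4.tcp_rto_min sysctl. To lower it, e.g.:
#   sudo ip route change default via GATEWAY dev eth0 rto_min 10ms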

# Disable TCP timestamps (saves 12 bytes per segment, but also turns off
# PAWS and timestamp-based RTT measurement)
net.ipv4.tcp_timestamps = 0

# Increase socket buffers (adjust based on your expected throughput)
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
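
Because sysctl silently skips or rejects keys the running kernel does not know, it can help to read the values back after loading them. The following is a minimal sketch, assuming the usual mapping from a dotted sysctl name to a file under /proc/sys; the helper names sysctl_to_path and read_sysctl are hypothetical and not part of any library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Convert a dotted sysctl name into its /proc/sys path.
 * Returns 0 on success, -1 if the result does not fit in out. */
int sysctl_to_path(const char *name, char *out, size_t out_len) {
    assert(name != NULL && out != NULL);
    int n = snprintf(out, out_len, "/proc/sys/%s", name);
    if (n < 0 || (size_t)n >= out_len)
        return -1;
    /* Skip the "/proc/sys/" prefix, then turn each '.' into '/'. */
    for (char *p = out + strlen("/proc/sys/"); *p != '\0'; p++)
        if (*p == '.')
            *p = '/';
    return 0;
}

/* Read the current value of a sysctl into value (trailing newline removed).
 * Returns 0 on success, -1 if the key is missing or unreadable. */
int read_sysctl(const char *name, char *value, size_t value_len) {
    char path[256];
    if (sysctl_to_path(name, path, sizeof(path)) != 0)
        return -1;
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;  /* e.g. the key does not exist on this kernel */
    char *ok = fgets(value, (int)value_len, f);
    fclose(f);
    if (ok == NULL)
        return -1;
    value[strcspn(value, "\n")] = '\0';
    return 0;
}
```

A return value of -1 from read_sysctl for a key listed in /etc/sysctl.conf suggests the key is misspelled or not exposed by this kernel. Multi-value keys such as net.ipv4.tcp_rmem come back as one tab-separated string.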

For your Intel 82546EB NIC, apply these optimizations:

# Set interrupt coalescing (adjust eth0 to your interface). The driver may
# reject parameters it does not support; check with ethtool -c eth0
sudo ethtool -C eth0 rx-usecs 10 tx-usecs 10

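# PCIe ASPM does not apply here: the 82546EB is a PCI/PCI-X controller.
# On a PCIe NIC, disable ASPM (for example by booting with pcie_aspm=off)
# rather than enabling it, because link power states add wake-up latency.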

# Disable Wake-on-LAN. Leave autonegotiation on: 1000BASE-T requires it
sudo ethtool -s eth0 wol d

For your 8-core Xeon system, dedicate cores for network processing:

# Stop irqbalance so it does not overwrite the manual IRQ affinity below
sudo systemctl disable --now irqbalance

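# Set IRQ affinity for NIC interrupts (mask 3 = CPUs 0 and 1).
# "sudo echo 3 > file" fails because the redirection runs unprivileged, so use tee
for irq in $(grep eth0 /proc/interrupts | awk -F: '{print $1}'); do
    echo 3 | sudo tee /proc/irq/$irq/smp_affinity > /dev/null
done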

# Use taskset to pin your application to specific cores
taskset -c 2,3 ./your_measurement_program

Here's an optimized TCP server implementation snippet:

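#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

// Returns 0 on success, -1 if a required option could not be set
int set_socket_options(int sockfd) {
    int yes = 1;
    int nodelay = 1;
    int busy_poll_usecs = 50;  // example value, tune by measurement
    int bufsize = 65536;

    // Disable Nagle's algorithm
    if (setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, &nodelay, sizeof(nodelay)) < 0) {
        perror("setsockopt(TCP_NODELAY)");
        return -1;
    }

    // Linux has no SO_LOW_LATENCY option. SO_BUSY_POLL (kernel 3.11+) is the
    // nearest equivalent: a blocking receive busy-polls for up to this many
    // microseconds. Raising it needs CAP_NET_ADMIN, so failure is not fatal.
    if (setsockopt(sockfd, SOL_SOCKET, SO_BUSY_POLL, &busy_poll_usecs, sizeof(busy_poll_usecs)) < 0)
        perror("setsockopt(SO_BUSY_POLL)");

    // Reuse address (call before bind())
    if (setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes)) < 0) {
        perror("setsockopt(SO_REUSEADDR)");
        return -1;
    }

    // Set buffer sizes. Explicit sizes turn off kernel autotuning for this
    // socket, and the kernel doubles the requested value for bookkeeping.
    if (setsockopt(sockfd, SOL_SOCKET, SO_RCVBUF, &bufsize, sizeof(bufsize)) < 0 ||
        setsockopt(sockfd, SOL_SOCKET, SO_SNDBUF, &bufsize, sizeof(bufsize)) < 0) {
        perror("setsockopt(SO_RCVBUF/SO_SNDBUF)");
        return -1;
    }

    return 0;
}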

For the most demanding scenarios, consider using a real-time kernel:

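# Ubuntu has no linux-rt package; on 22.04 the real-time kernel is
# delivered through Ubuntu Pro
sudo pro enable realtime-kernel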

Then configure CPU isolation by adding these kernel parameters to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by sudo update-grub and a reboot:

isolcpus=2,3 nohz_full=2,3 rcu_nocbs=2,3

Use these tools to verify your optimizations:

# Check TCP retransmissions
ss -ti

# Monitor interrupt distribution
watch -n 1 cat /proc/interrupts

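# Count scheduler events while the program runs (this shows scheduling
# noise, not latency itself). perf ships in linux-tools-$(uname -r)
sudo apt install linux-tools-common linux-tools-$(uname -r)
sudo taskset -c 2 perf stat -e 'sched:*' -a ./your_program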
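
To check the interrupt distribution programmatically rather than by eye, the per-CPU counters on a /proc/interrupts line can be parsed directly. The sketch below assumes the common line layout (label, colon, one counter per CPU, then the controller name); parse_irq_counts and busiest_cpu are hypothetical names, and unusual layouts may need extra handling.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Parse the per-CPU counters from one /proc/interrupts line such as
 *   " 42:   1200   3   0   0   PCI-MSI 524288-edge   eth0"
 * Stores up to max_cpus counters in counts and returns how many were
 * found, or -1 if the line has no "label:" prefix. */
int parse_irq_counts(const char *line, unsigned long *counts, int max_cpus) {
    assert(line != NULL && counts != NULL);
    const char *p = strchr(line, ':');
    if (p == NULL)
        return -1;
    p++;  /* step past the colon */
    int n = 0;
    while (n < max_cpus) {
        while (isspace((unsigned char)*p))
            p++;
        if (!isdigit((unsigned char)*p))
            break;  /* reached the controller name or end of line */
        char *end;
        unsigned long v = strtoul(p, &end, 10);
        if (*end != '\0' && !isspace((unsigned char)*end))
            break;  /* a token such as "524288-edge" is not a counter */
        counts[n++] = v;
        p = end;
    }
    return n;
}

/* Return the index of the CPU that handled the most interrupts
 * (the first one on ties), or -1 if n is 0. */
int busiest_cpu(const unsigned long *counts, int n) {
    int best = -1;
    for (int i = 0; i < n; i++)
        if (best < 0 || counts[i] > counts[best])
            best = i;
    return best;
}
```

With an affinity mask of 3, the busiest CPU for the NIC's IRQ line should be 0 or 1; if the counters keep growing on another core, the affinity setting did not take effect.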

When dealing with microsecond-sensitive TCP communications on Ubuntu (particularly 22.04 LTS), several kernel and network stack parameters need careful tuning. Our test environment features:

# lscpu output snippet
Architecture:        x86_64
CPU(s):              8
Model name:          Intel(R) Xeon(R) CPU E5345 @ 2.33GHz
NUMA node(s):        2
L1d cache:           32K
L2 cache:            4096K

First, let's examine the critical sysctl parameters in /etc/sysctl.conf:

# Low latency TCP settings
net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
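net.ipv4.tcp_no_metrics_save = 1
# No-op since kernel 4.14; only has an effect on older kernels
net.ipv4.tcp_low_latency = 1
net.ipv4.tcp_timestamps = 0
# Disabling SACK slows recovery when several segments are lost; keep it
# enabled on paths that see packet loss
net.ipv4.tcp_sack = 0
net.ipv4.tcp_adv_win_scale = 1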

For the Intel 82546EB NIC, pin its interrupts to specific cores:

# Check IRQ affinity
grep eth /proc/interrupts
# Set CPU affinity (example for IRQ 42; mask 3 = CPUs 0 and 1)
echo 3 | sudo tee /proc/irq/42/smp_affinity

Your measurement program should include these socket options:

// Sample TCP_NODELAY setting
int yes = 1;
setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, (void *)&yes, sizeof(int));

// Recommended buffer sizes
int rcvbuf = 1024 * 1024;
int sndbuf = 1024 * 1024;
setsockopt(sockfd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf));
setsockopt(sockfd, SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf));

For the NIC's PCIe configuration:

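# Check current value (the latency timer exists only on conventional
# PCI/PCI-X devices such as this NIC)
lspci -vvv -s 02:00.0 | grep Latency
# Set to minimum (0x00); the value is hexadecimal. A low timer makes the NIC
# yield the bus sooner, so compare against higher values by measurement
sudo setpci -v -s 02:00.0 LATENCY_TIMER=00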

For extreme cases, consider installing the PREEMPT_RT kernel:

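# There is no linux-rt-5.15 package; on Ubuntu 22.04 the real-time
# kernel is delivered through Ubuntu Pro
sudo pro enable realtime-kernel
# Then isolate CPUs by adding these to the kernel command line
isolcpus=2,3,6,7 nohz_full=2,3,6,7 rcu_nocbs=2,3,6,7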

Remember to verify changes with tools like ping -f, netperf, or custom microbenchmarks before deploying in production.
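
As one possible shape for such a custom microbenchmark, the sketch below times one-byte round trips over a loopback TCP connection in a single thread and reports percentiles. It only exercises the local TCP stack, not the NIC or the wire, so treat it as a baseline for before/after comparisons of the sysctl and socket settings. The function names and the nearest-rank percentile method are choices made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

static long long now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

static int cmp_ll(const void *a, const void *b) {
    long long x = *(const long long *)a, y = *(const long long *)b;
    return (x > y) - (x < y);
}

/* Nearest-rank percentile (p in 1..100) of an ascending-sorted array. */
long long percentile_ns(const long long *sorted, int n, int p) {
    assert(sorted != NULL && n > 0 && p >= 1 && p <= 100);
    int rank = (p * n + 99) / 100;  /* ceil(p * n / 100) */
    return sorted[rank - 1];
}

/* Time n one-byte round trips over a loopback TCP connection, all in one
 * thread. On success, samples holds the round-trip times in nanoseconds,
 * sorted ascending. Returns 0 on success, -1 on any socket error. */
int measure_loopback_rtt(long long *samples, int n) {
    int one = 1, rc = -1;
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    int cfd = socket(AF_INET, SOCK_STREAM, 0);
    int sfd = -1;
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  /* let the kernel pick a free port */
    if (lfd < 0 || cfd < 0) goto out;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto out;
    if (listen(lfd, 1) < 0) goto out;
    if (getsockname(lfd, (struct sockaddr *)&addr, &alen) < 0) goto out;
    /* A blocking connect() returns once the kernel completes the
     * handshake, which does not require accept() to have run yet. */
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto out;
    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0) goto out;
    setsockopt(cfd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
    setsockopt(sfd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
    for (int i = 0; i < n; i++) {
        char b = 'x';
        long long t0 = now_ns();
        if (write(cfd, &b, 1) != 1 || read(sfd, &b, 1) != 1) goto out;  /* request */
        if (write(sfd, &b, 1) != 1 || read(cfd, &b, 1) != 1) goto out;  /* reply */
        samples[i] = now_ns() - t0;
    }
    qsort(samples, (size_t)n, sizeof(samples[0]), cmp_ll);
    rc = 0;
out:
    if (sfd >= 0) close(sfd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return rc;
}
```

A typical use is to collect a few thousand samples, then compare percentile_ns(samples, n, 50) and percentile_ns(samples, n, 99) before and after each tuning step, running the program under the same taskset pinning as the real application. Tail percentiles are usually more telling than the mean for latency-sensitive work.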