Optimizing Linux for High-Volume TCP Connections: Handling 10K Requests/Second on CentOS


When building a stats server that needs to handle 10,000 TCP connections per second, you're dealing with several layers of potential bottlenecks. Even with modern 8-core servers, default OS configurations often impose artificial limits that need careful tuning.

  • File Descriptor Limits: The default 1024 limit won't cut it
  • TCP TIME_WAIT State: Can exhaust available ports
  • Kernel Parameters: net.core.somaxconn, net.ipv4.tcp_max_syn_backlog, and related queue limits
  • NIC Queue Settings: IRQ balancing and ring buffers
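The first bullet deserves its own fix: the per-process descriptor limit (usually 1024) is separate from fs.file-max. A stdlib-only Python sketch to inspect and raise it at runtime; permanent changes still go through `ulimit -n` or `/etc/security/limits.conf`:

```python
import resource

# Soft limit: what the process gets right now. Hard limit: the ceiling
# an unprivileged process may raise its soft limit to.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file limit: soft={soft} hard={hard}")

# Raise the soft limit to the hard ceiling (no root required).
if hard != resource.RLIM_INFINITY:
    resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
```

Anything beyond the hard limit requires root and a limits.conf entry for the service user.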

Here are the critical sysctl settings for CentOS:

# /etc/sysctl.conf
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 15
net.core.somaxconn = 32768
net.ipv4.tcp_max_syn_backlog = 65536
net.core.netdev_max_backlog = 20000
fs.file-max = 100000

Here's a basic Python implementation using asyncio:

import asyncio

# Single global counter; asyncio runs all handlers on one event loop
# thread, so a plain int increment is safe here.
counter = 0

async def handle_client(reader, writer):
    global counter
    counter += 1
    writer.close()              # count the connection, then drop it
    await writer.wait_closed()

async def main():
    server = await asyncio.start_server(
        handle_client, '0.0.0.0', 8888,
        backlog=10000           # silently capped at net.core.somaxconn
    )
    async with server:
        await server.serve_forever()

asyncio.run(main())
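To sanity-check the handler before any kernel tuning, here is a self-contained smoke test in the same style: it starts an identical server on an ephemeral port (port 0, a choice made for this sketch) and fires a burst of connect/close cycles at it from the same process:

```python
import asyncio

counter = 0

async def handle_client(reader, writer):
    # Same handler as above: count the connection, then drop it.
    global counter
    counter += 1
    writer.close()
    await writer.wait_closed()

async def burst(port, n):
    # Sequentially open and close n connections against localhost.
    for _ in range(n):
        _, writer = await asyncio.open_connection('127.0.0.1', port)
        writer.close()
        await writer.wait_closed()

async def main(n=500):
    # Port 0 asks the kernel for any free ephemeral port.
    server = await asyncio.start_server(handle_client, '127.0.0.1', 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        await burst(port, n)
        await asyncio.sleep(0.3)   # let pending handler tasks finish
    print(f"accepted {counter} connections")

asyncio.run(main())
```

A sequential loopback burst like this measures correctness, not throughput; real load testing needs many concurrent clients on separate hosts.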

Essential tools to verify your configuration:

# Current connection statistics
ss -s

# Monitor TCP states
cat /proc/net/sockstat

# File descriptor usage (lsof duplicates per-thread entries;
# /proc/sys/fs/file-nr gives the authoritative system-wide count)
cat /proc/sys/fs/file-nr

# Network interrupts
cat /proc/interrupts | grep eth0

For production-grade implementations, consider:

  • Using kernel bypass techniques like DPDK for extreme cases
  • Implementing connection pooling if clients support it
  • Exploring UDP instead of TCP if you can tolerate packet loss
  • Distributing load across multiple ports if hitting single-port limits
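The multi-port point has a simpler cousin on a single port: with SO_REUSEPORT, several processes (or threads) can each bind their own listening socket to the same port and let the kernel spread incoming connections across them. A minimal sketch using Python's socket module (port 0 here is just to grab a free port for the demonstration):

```python
import socket

def reuseport_listener(port):
    # Each worker process can create its own listener on the same port;
    # the kernel then load-balances incoming connections between them.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(('127.0.0.1', port))
    s.listen(8192)
    return s

a = reuseport_listener(0)            # kernel picks an ephemeral port
port = a.getsockname()[1]
b = reuseport_listener(port)         # second listener, same port: succeeds
print("both listeners bound to port", port)
a.close()
b.close()
```

Without SO_REUSEPORT on both sockets, the second bind() would fail with EADDRINUSE.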

While your 8-core box should handle this load, pay attention to:

  • NIC queue configuration (ethtool -L)
  • IRQ balancing (irqbalance service)
  • NUMA awareness if using multi-socket systems
  • PCIe bandwidth for high-speed NICs

Digging deeper: at 10,000 TCP connections per second you will hit limits in the Linux networking stack well before any hardware limit. The 8-core CentOS box has ample CPU for a simple counter service; it is the default configuration that isn't built for this connection rate.

# Check current kernel settings
sysctl net.ipv4.tcp_max_syn_backlog
sysctl net.core.somaxconn
sysctl net.core.netdev_max_backlog

These three parameters form the first bottleneck. The default tcp_max_syn_backlog (typically 128) limits SYN packets in the queue, somaxconn (usually 128) caps the connection accept queue, and netdev_max_backlog (often 1000) restricts packets waiting in the NIC driver queue.
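A quick way to check all three queues programmatically is to read them straight from `/proc/sys`, which mirrors the sysctl namespace with dots replaced by slashes (stdlib-only sketch; returns None on non-Linux systems):

```python
from pathlib import Path

def read_sysctl(name):
    """Read a kernel parameter, e.g. read_sysctl('net.core.somaxconn')."""
    path = Path('/proc/sys') / name.replace('.', '/')
    return path.read_text().strip() if path.exists() else None

for key in ('net.ipv4.tcp_max_syn_backlog',
            'net.core.somaxconn',
            'net.core.netdev_max_backlog'):
    print(key, '=', read_sysctl(key))
```

This is handy in a deploy-time assertion: refuse to start the service if the queues are still at their defaults.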

# /etc/sysctl.conf optimizations
net.ipv4.tcp_max_syn_backlog = 8192
net.core.somaxconn = 8192
net.core.netdev_max_backlog = 5000
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65535

This bottleneck bites whichever host originates the connections — a load generator, or this box calling upstream services. At 10k connections per second, the default ephemeral port range (32768-60999, about 28k ports) is exhausted in roughly 3 seconds, because each closed connection leaves its port stuck in TIME_WAIT for 60 seconds. Expand it to 1024-65535:

sysctl -w net.ipv4.ip_local_port_range="1024 65535"
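The arithmetic behind the 3-second figure, worked out explicitly:

```python
RATE = 10_000          # target connections per second
TIME_WAIT = 60         # seconds a port is unavailable after close

default_ports = 60999 - 32768 + 1    # 28,232 ports in the default range
expanded_ports = 65535 - 1024 + 1    # 64,512 ports after expansion

print(f"default range lasts  {default_ports / RATE:.1f} s")
print(f"expanded range lasts {expanded_ports / RATE:.1f} s")

# Even expanded, only ports/TIME_WAIT connections per second are
# sustainable without port reuse -- which is why tcp_tw_reuse = 1
# matters more than the range itself.
print(f"sustainable without reuse: {expanded_ports / TIME_WAIT:.0f}/s")
```

The expanded range only buys a few extra seconds; it is tcp_tw_reuse (and, where possible, keeping connections open) that makes the rate sustainable.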

For maximum performance, consider these architecture choices:

// Sketch of a multi-threaded accept() loop, fleshed out into
// compilable C (error handling omitted for brevity):
#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/socket.h>
#include <unistd.h>

static int listen_sock;
static atomic_long counter;           // thread-safe request counter

static void *worker(void *arg) {
    while (1) {
        int client = accept(listen_sock, NULL, NULL);
        if (client < 0) continue;
        atomic_fetch_add(&counter, 1);
        close(client);
    }
}

int main(void) {
    int option = 1;
    listen_sock = socket(AF_INET, SOCK_STREAM, 0);
    // SO_REUSEPORT lets additional processes bind the same port later
    setsockopt(listen_sock, SOL_SOCKET, SO_REUSEPORT, &option, sizeof(option));
    struct sockaddr_in addr = {0};    // INADDR_ANY is all-zeroes
    addr.sin_family = AF_INET;
    addr.sin_port = htons(8888);
    bind(listen_sock, (struct sockaddr *)&addr, sizeof(addr));
    listen(listen_sock, 8192);        // backlog is silently capped at somaxconn
    int num_cores = (int)sysconf(_SC_NPROCESSORS_ONLN);
    pthread_t threads[num_cores];     // one accept loop per core
    for (int i = 0; i < num_cores; i++)
        pthread_create(&threads[i], NULL, worker, NULL);
    pthread_join(threads[0], NULL);   // workers never return; block here
}

Even with 10G NICs, you might hit interrupt processing limits. Enable RSS (Receive Side Scaling) and spread interrupts across cores:

# Check available RSS queues
ethtool -l eth0

# Enable multi-queue
ethtool -L eth0 combined 8

# Balance IRQs across cores
service irqbalance start
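To verify that interrupts are actually spreading, tally the per-CPU counts for the NIC's IRQ lines out of /proc/interrupts (a read-only sketch; 'eth0' is the interface name used throughout this post — substitute your own):

```python
def nic_irq_counts(pattern='eth0', path='/proc/interrupts'):
    """Return {irq: [per-CPU interrupt counts]} for lines matching pattern."""
    rows = {}
    try:
        with open(path) as f:
            ncpus = len(f.readline().split())   # header row: CPU0 CPU1 ...
            for line in f:
                if pattern in line:
                    parts = line.split()
                    rows[parts[0].rstrip(':')] = [
                        int(x) for x in parts[1:1 + ncpus]]
    except FileNotFoundError:
        pass                                    # not running on Linux
    return rows

for irq, counts in nic_irq_counts().items():
    print(irq, counts)   # one hot CPU means IRQs are not balanced
```

If one CPU column dwarfs the rest, RSS or irqbalance isn't doing its job and that core will saturate first.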

Essential monitoring commands:

# Connection tracking
ss -s

# SYN backlog overflow
netstat -s | grep -i listen

# NIC statistics
ethtool -S eth0 | grep -i drop

If `netstat -s` shows increasing "times the listen queue of a socket overflowed" or "SYNs to LISTEN sockets dropped" (the ListenOverflows and ListenDrops counters), increase your backlog queues further.
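Those counters can also be scraped directly from /proc/net/netstat, which stores each group as a header line of field names followed by a line of values (stdlib-only sketch; returns None if the field or file is missing):

```python
def tcp_ext_counter(field):
    """Look up a TcpExt counter such as 'ListenOverflows' or 'ListenDrops'."""
    try:
        with open('/proc/net/netstat') as f:
            lines = f.readlines()
    except FileNotFoundError:
        return None                       # not running on Linux
    # Lines come in pairs: "TcpExt: name1 name2 ..." / "TcpExt: v1 v2 ..."
    for header, values in zip(lines[::2], lines[1::2]):
        if header.startswith('TcpExt:'):
            fields = dict(zip(header.split()[1:],
                              map(int, values.split()[1:])))
            return fields.get(field)
    return None

print('ListenOverflows =', tcp_ext_counter('ListenOverflows'))
print('ListenDrops     =', tcp_ext_counter('ListenDrops'))
```

Polling these from your monitoring system gives you an alert the moment accept queues start overflowing, instead of discovering it from client timeouts.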