Linux Server Socket Limit Investigation: Why Stuck at 32,720 Despite Available Resources?


When monitoring socket usage via ss -s or netstat, you'll often see a Linux system hit a hard ceiling around 32,720 connections despite having plenty of memory and CPU headroom. This is no coincidence: the ceiling comes from default kernel limits, not from hardware resources.

The critical parameters controlling socket limits include:

# Current limits inspection
cat /proc/sys/fs/file-max
cat /proc/sys/fs/nr_open
ulimit -n

# Kernel memory allocation
cat /proc/sys/net/ipv4/tcp_mem
cat /proc/sys/net/core/rmem_max
cat /proc/sys/net/core/wmem_max
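
The same numbers can be collected programmatically; here is a minimal Python sketch (Linux only) that reads the proc entries above plus the process's own descriptor limit:

import resource

# Proc entries listed above, plus this process's RLIMIT_NOFILE
PROC_FILES = [
    "/proc/sys/fs/file-max",
    "/proc/sys/fs/nr_open",
    "/proc/sys/net/ipv4/tcp_mem",
    "/proc/sys/net/core/rmem_max",
    "/proc/sys/net/core/wmem_max",
]

def report_limits():
    for path in PROC_FILES:
        with open(path) as f:
            print(f"{path}: {f.read().strip()}")
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"ulimit -n (soft/hard): {soft}/{hard}")

report_limits()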

Even with ulimit -n set high (as in the 798,621 example below), other limits can still cap the socket count (see the sketch after this list):

  • fs.nr_open is the hard ceiling on any single process's descriptor limit (1,048,576 by default); ulimit -n cannot be raised above it
  • fs.file-max caps open file handles system-wide, across all processes
  • Each TCP connection consumes one file descriptor plus kernel socket-buffer memory governed by tcp_mem, rmem_max and wmem_max
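
As a sketch of the first point, a process can raise its own soft descriptor limit up to its hard limit with the standard resource module; pushing the hard limit past fs.nr_open is not possible:

import resource

# Raise the soft RLIMIT_NOFILE up to the current hard limit; raising the
# hard limit itself needs privileges and can never exceed fs.nr_open
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
print(f"descriptor limit raised from {soft} to {hard}")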

For high-performance servers needing >32K connections:

# 1. Implement connection pooling: pre-create a fixed set of sockets and
#    reuse them instead of opening one connection per request
import socket

POOL_SIZE = 1000

def create_connection_pool():
    # Unconnected sockets here; a real pool would connect lazily and hand
    # sockets out (and take them back) through a queue
    return [socket.socket() for _ in range(POOL_SIZE)]

# 2. Use multiple processes, each responsible for its own slice of ports
import multiprocessing

def worker_process(port_range):
    # Placeholder: each process binds/connects only within port_range
    pass

if __name__ == '__main__':
    ports = [(30000, 32720), (32721, 35440)]
    with multiprocessing.Pool(len(ports)) as pool:
        pool.map(worker_process, ports)
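
For inbound traffic, an alternative to hand-partitioning ports across processes is SO_REUSEPORT (Linux 3.9+): every worker opens its own listening socket on the same port and the kernel spreads accepted connections across them. A minimal sketch:

import socket

def make_listener(port=8080):
    # One of these per worker process; the kernel load-balances accepts
    # across all sockets bound to the same port with SO_REUSEPORT
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("0.0.0.0", port))
    s.listen(1024)
    return s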

For systems requiring truly massive connection counts, the relevant ceilings can be raised at runtime; no kernel recompile is needed:

# Raise the per-process and system-wide file descriptor ceilings,
# and give TCP more buffer memory (tcp_mem values are in pages)
echo 1000000 > /proc/sys/fs/nr_open
echo 2000000 > /proc/sys/fs/file-max
echo "10240 65535 2097152" > /proc/sys/net/ipv4/tcp_mem

When hitting Linux limitations, consider:

  • Implementing UDP instead of TCP where possible
  • Using epoll with edge triggering for massive I/O multiplexing
  • Deploying reverse proxies to distribute connections

Example epoll implementation:

import select
import socket

serversocket = socket.socket()
serversocket.bind(('0.0.0.0', 8080))
serversocket.listen(50000)
serversocket.setblocking(False)   # required for a non-blocking event loop

epoll = select.epoll()
epoll.register(serversocket.fileno(), select.EPOLLIN)  # level-triggered; OR in select.EPOLLET for edge-triggered

connections = {}  # keep references so client sockets are not garbage-collected

while True:
    events = epoll.poll(1)
    for fileno, event in events:
        if fileno == serversocket.fileno():
            clientsocket, address = serversocket.accept()
            clientsocket.setblocking(False)
            connections[clientsocket.fileno()] = clientsocket
            epoll.register(clientsocket.fileno(), select.EPOLLIN)
        elif event & select.EPOLLIN:
            data = connections[fileno].recv(4096)
            if not data:  # peer closed; unregister and drop the socket
                epoll.unregister(fileno)
                connections.pop(fileno).close()

Here is how this played out on our own systems. When monitoring our high-traffic API gateway, we noticed the server consistently plateaued at exactly 32,720 concurrent connections despite having:

  • 4GB free memory
  • 80% idle CPU capacity
  • ulimit set to 798,621 files

The behavior persisted even after adjusting standard kernel parameters:

# Conntrack limits had already been raised well past the plateau
sysctl -w net.netfilter.nf_conntrack_max=999999
sysctl -w net.ipv4.netfilter.ip_conntrack_max=999999   # older/duplicate name on some kernels
sysctl -w net.nf_conntrack_max=999999                  # older/duplicate name on some kernels
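
A quick way to rule conntrack out as the bottleneck is to compare the live entry count against the configured maximum (these proc paths exist only while the conntrack modules are loaded); a small sketch:

def conntrack_usage():
    # Live entry count vs. configured ceiling
    with open("/proc/sys/net/netfilter/nf_conntrack_count") as f:
        count = int(f.read())
    with open("/proc/sys/net/netfilter/nf_conntrack_max") as f:
        maximum = int(f.read())
    return count, maximum

count, maximum = conntrack_usage()
print(f"conntrack entries: {count}/{maximum}")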

After digging through the kernel source (particularly include/net/sock.h), we identified that the ~32K ceiling stems from Linux's default ephemeral port range:

# Check current port range
cat /proc/sys/net/ipv4/ip_local_port_range
# Typical output: 32768 60999

This leaves only 28,232 usable ephemeral ports (32768 through 60999, inclusive) for outbound connections from each local IP address. Combined with sockets lingering in TIME_WAIT, we hit the ceiling rapidly.
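
To see how much of that budget TIME_WAIT was consuming, here is a quick diagnostic sketch that tallies TCP states from the output of ss -tan (iproute2); the counts in the trailing comment are illustrative:

import subprocess
from collections import Counter

def socket_state_counts():
    # First column of `ss -tan` is the TCP state (ESTAB, TIME-WAIT, ...)
    out = subprocess.run(["ss", "-tan"], capture_output=True, text=True).stdout
    return Counter(line.split()[0] for line in out.splitlines()[1:] if line.strip())

print(socket_state_counts())
# e.g. Counter({'TIME-WAIT': 27000, 'ESTAB': 5400, 'LISTEN': 12})  (illustrative)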

Solution 1: Expand Port Range

# Runtime change (takes effect immediately; not persistent across reboots)
echo "1024 65535" > /proc/sys/net/ipv4/ip_local_port_range

# Persist the change
echo "net.ipv4.ip_local_port_range = 1024 65535" >> /etc/sysctl.conf

# If services listen on ports inside the new range, reserve those ports:
# echo "8080,9090" > /proc/sys/net/ipv4/ip_local_reserved_ports

# Verify with:
sysctl net.ipv4.ip_local_port_range

Solution 2: TCP Reuse and Recycling

# Allow new outbound connections to reuse sockets stuck in TIME_WAIT
echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse

# tcp_tw_recycle breaks clients behind NAT and was removed in kernel 4.12;
# avoid it on modern systems
# echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle

# Shorten FIN-WAIT-2 (note: TIME_WAIT itself is fixed at 60s in Linux)
echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout

For applications requiring >65K connections:

  1. Implement connection pooling
  2. Consider multiple IP addresses (see the sketch below)
  3. Review application architecture
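
Point 2 helps because each local source address gets its own ephemeral port space, multiplying the outbound connection budget. A minimal sketch, with placeholder addresses standing in for IPs actually assigned to the host:

import socket
from itertools import cycle

# Placeholder addresses; replace with IPs configured on this machine
SOURCE_IPS = cycle(["10.0.0.10", "10.0.0.11", "10.0.0.12"])

def connect_from_next_ip(dest_host, dest_port):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((next(SOURCE_IPS), 0))   # port 0: let the kernel pick an ephemeral port
    s.connect((dest_host, dest_port))
    return s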

Example connection pooling in Python, using the third-party socketpool library (the host and port are placeholders):

from socketpool import ConnectionPool, TcpConnector

# Pool of reusable TCP connections to one backend service
pool = ConnectionPool(
    factory=TcpConnector,
    max_size=100000,
    options={'host': 'backend.example.com', 'port': 6000}
)

Remember to monitor with:

watch -n 1 "cat /proc/net/sockstat"