When netstat -ant or ss -ant on a Linux system shows thousands of connections in the TIME_WAIT state (especially to port 111/sunrpc), something is opening and closing short-lived TCP connections faster than the kernel retires them. This is a common pain point for developers working with high-throughput socket applications.
# Typical diagnostic commands:
netstat -ant | awk '/^tcp/ {print $6}' | sort | uniq -c
ss -ant | grep 'TIME-WAIT' | wc -l
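Each TIME_WAIT socket is held for 60 seconds after closure on Linux (the interval is fixed in the kernel; other TCP stacks use 2*MSL, which can be as long as 240 seconds). In high-traffic scenarios, this can:
- Exhaust available ephemeral ports (32768-60999 by default on current kernels, 32768-61000 on older ones)
- Slow down connect() while the kernel searches for a free local port
- Trigger "Cannot assign requested address" errors on connect(), or "Address already in use" errors on bind()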
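The port-exhaustion risk can be put into numbers. The sketch below is a back-of-the-envelope estimate, not a kernel model: it assumes the client always closes first, every connection goes to the same destination address and port, TIME_WAIT lasts 60 seconds, and tcp_tw_reuse is off. The function name max_sustained_rate is made up for illustration.

```python
def max_sustained_rate(port_low: int, port_high: int, time_wait_s: float = 60.0) -> float:
    """Rough upper bound on new connections per second from one source IP
    to a single destination (ip, port) before ephemeral ports run out.

    Each closed connection pins its local port for time_wait_s seconds,
    so at most (number of ports) connections can be opened per interval.
    """
    ports = port_high - port_low + 1  # the range is inclusive on both ends
    return ports / time_wait_s
```

With the 32768-60999 range this works out to about 470 connections per second to one destination; widening the range to 1024-65535 raises it to about 1075. Call it as max_sustained_rate(32768, 60999).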
The specific pattern showing connections between localhost ports and port 111 (sunrpc) suggests either:
- An overactive NFS client implementation
- A misconfigured service continually querying portmapper
- A connection pool not properly recycling sockets
Add these to /etc/sysctl.conf (then run sysctl -p):
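# Shorten the FIN_WAIT_2 timeout (this does not change the 60-second TIME_WAIT period)
net.ipv4.tcp_fin_timeout = 30
# Let outgoing connections reuse TIME_WAIT sockets (requires TCP timestamps)
net.ipv4.tcp_tw_reuse = 1
# Do not set net.ipv4.tcp_tw_recycle: it breaks clients behind NAT and was removed in Linux 4.12
# Increase ephemeral port range
net.ipv4.ip_local_port_range = 1024 65535
# Upper limit on simultaneous TIME_WAIT sockets (beyond it, sockets are destroyed immediately)
net.ipv4.tcp_max_tw_buckets = 2000000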
For developers writing socket-based applications:
# Python example with proper socket handling
import socket

def create_connection():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_REUSEADDR affects bind(), not connect(); it does not prevent TIME_WAIT
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    try:
        s.connect(('localhost', 111))
        # ... handle connection ...
    finally:
        s.close()  # Always release the socket, even if an error occurred

# Better alternative: sockets are context managers in Python 3
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect(('localhost', 111))
    # ... handle connection ...; the socket is closed when the block exits
For deeper investigation:
# Monitor connection states in real-time
watch -n 1 "ss -s | grep -i wait"
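# Check which processes hold sockets on port 111 (TIME_WAIT sockets have no owner and are not listed)
lsof -i :111
# Kernel connection tracking (ports are shown numerically; needs the nf_conntrack module loaded)
grep 'dport=111 ' /proc/net/nf_conntrack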
# Network stack statistics
cat /proc/net/netstat | grep -i tcp
When you're seeing thousands of TCP connections stuck in TIME_WAIT state pointing to localhost:sunrpc (port 111), you're witnessing normal TCP protocol behavior, but at an abnormal scale. Each TIME_WAIT socket represents a properly closed connection that the kernel keeps for 60 seconds (Linux's fixed stand-in for 2*MSL) so that any delayed packets are not mistaken for a new connection.
# View the FIN_WAIT_2 timeout in seconds (often mistaken for the TIME_WAIT duration, which is not tunable via sysctl)
cat /proc/sys/net/ipv4/tcp_fin_timeout
The key observations from your netstat output reveal:
- All connections are local (127.0.0.1)
- Targeting port 111 (sunrpc)
- Ephemeral ports in 60XXX range
- No associated process (PID -), which is expected: a TIME_WAIT socket no longer belongs to any process
This suggests an RPC service (like portmapper) is being hammered by local processes - potentially cron jobs, monitoring tools, or misconfigured services making rapid successive calls.
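The pattern described above can be confirmed by tallying TIME_WAIT sockets per remote endpoint. The sketch below parses lines in the /proc/net/tcp format, where the state column value 06 means TIME_WAIT. It covers IPv4 only and assumes a little-endian host (as on x86) when decoding addresses; the function names are made up for illustration.

```python
import socket
import struct
from collections import Counter

TIME_WAIT = "06"  # TCP state code for TIME_WAIT in /proc/net/tcp


def decode_endpoint(hex_endpoint):
    """Turn '0100007F:006F' into ('127.0.0.1', 111)."""
    addr_hex, port_hex = hex_endpoint.split(":")
    # The address is printed as a host-order 32-bit word; "<I" assumes little-endian.
    addr = socket.inet_ntoa(struct.pack("<I", int(addr_hex, 16)))
    return addr, int(port_hex, 16)


def tally_time_wait(lines):
    """Count TIME_WAIT sockets per remote (address, port)."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # fields: slot, local endpoint, remote endpoint, state, ...
        if len(fields) < 4 or fields[3] != TIME_WAIT:
            continue  # also skips the header line
        counts[decode_endpoint(fields[2])] += 1
    return counts
```

To run it against the live table, pass an open file: with open("/proc/net/tcp") as fh: print(tally_time_wait(fh).most_common(5)). If ('127.0.0.1', 111) dominates the result, the load really is local traffic to the portmapper.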
First, identify the source of these RPC calls:
# List the registered RPC programs, then capture live traffic to port 111
sudo rpcinfo -p
sudo tcpdump -i lo -nn 'port 111' -c 100
# Check which processes use RPC (run as root)
lsof -i :111
netstat -tulp | grep rpc
For temporary relief, adjust these sysctl values:
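# Shorten the FIN_WAIT_2 timeout (default 60); the TIME_WAIT period itself stays at 60 seconds
echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout
# Let outgoing connections reuse TIME_WAIT sockets (needs net.ipv4.tcp_timestamps = 1)
echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse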
# Increase available port range
echo '1024 65000' > /proc/sys/net/ipv4/ip_local_port_range
Make changes permanent by adding to /etc/sysctl.conf:
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65000
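After editing /etc/sysctl.conf it is worth confirming that the running kernel actually picked up the values. A minimal sketch, assuming the usual mapping of a sysctl name such as net.ipv4.tcp_tw_reuse to the file /proc/sys/net/ipv4/tcp_tw_reuse; read_sysctl and check_settings are hypothetical helper names, and the root parameter exists only so the code can be pointed at a test directory.

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Return the value of a sysctl as a list of whitespace-separated tokens."""
    path = Path(root, *name.split("."))  # net.ipv4.x -> <root>/net/ipv4/x
    return path.read_text().split()


def check_settings(expected, root="/proc/sys"):
    """Compare expected values with live ones.

    expected maps sysctl names to strings such as "1" or "1024 65000".
    Returns {name: (expected, actual)} for every setting that differs.
    """
    mismatches = {}
    for name, want in expected.items():
        got = read_sysctl(name, root)
        if got != want.split():
            mismatches[name] = (want, " ".join(got))
    return mismatches
```

An empty result from check_settings({"net.ipv4.tcp_tw_reuse": "1", "net.ipv4.ip_local_port_range": "1024 65000"}) means the live values match the file. Splitting on whitespace matters because the kernel separates the two port-range numbers with a tab.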
For sustainable fixes:
- RPC Client Optimization: Configure clients to reuse connections (NFS mount options, rpcbind settings)
- Connection Pooling: Implement keepalive for RPC clients
- Service Isolation: Containerize services making excessive RPC calls
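The effect of connection reuse can be illustrated with a small loopback experiment. This is a sketch, not an RPC client: a toy echo server stands in for the RPC service, and all function names are hypothetical. Sending N requests over N connections makes the server accept N connections (and leaves N sockets in TIME_WAIT on whichever side closes first), while reusing one connection needs a single accept.

```python
import socket
import threading


def start_echo_server():
    """Start a loopback echo server on an OS-chosen port; returns (port, stats)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    stats = {"accepted": 0}

    def handle(conn):
        with conn:
            while True:
                data = conn.recv(4096)
                if not data:  # peer closed the connection
                    break
                conn.sendall(data)

    def serve():
        while True:
            conn, _ = listener.accept()
            stats["accepted"] += 1
            threading.Thread(target=handle, args=(conn,), daemon=True).start()

    threading.Thread(target=serve, daemon=True).start()
    return listener.getsockname()[1], stats


def recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed early")
        buf += chunk
    return buf


def one_connection_per_request(port, messages):
    """Anti-pattern: a fresh TCP connection for every message."""
    replies = []
    for msg in messages:
        with socket.create_connection(("127.0.0.1", port)) as s:
            s.sendall(msg)
            replies.append(recv_exact(s, len(msg)))
    return replies


def reuse_one_connection(port, messages):
    """Send every message over a single long-lived connection."""
    replies = []
    with socket.create_connection(("127.0.0.1", port)) as s:
        for msg in messages:
            s.sendall(msg)
            replies.append(recv_exact(s, len(msg)))
    return replies
```

Comparing stats["accepted"] after each variant shows the difference directly; on a real system the same comparison can be made by counting TIME_WAIT sockets before and after.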
Example NFS client optimization:
mount -o proto=tcp,vers=3,timeo=600,retrans=2,hard,intr \
nfsserver:/share /mnt/share
Set up alerts for TIME_WAIT buildup:
#!/bin/bash
# Nagios check example
WARN=1000
CRIT=5000
# netstat prints the state after the addresses, so match the port first
count=$(netstat -ant | grep -c ':111 .*TIME_WAIT')
if [ "$count" -gt "$CRIT" ]; then
    echo "CRITICAL: $count RPC TIME_WAIT connections"
    exit 2
elif [ "$count" -gt "$WARN" ]; then
    echo "WARNING: $count RPC TIME_WAIT connections"
    exit 1
else
    echo "OK: $count RPC TIME_WAIT connections"
    exit 0
fi