Debugging Netstat Performance Issues: Why Does netstat Command Hang on CentOS?


Many Linux sysadmins have run into this puzzling scenario: the basic netstat command occasionally stalls for five seconds or more, while other runs complete instantly. Here's what I've gathered from troubleshooting this on CentOS 6.4 systems:


# Typical hanging scenario (name resolution enabled):
$ time netstat -tulp
real    0m5.23s

# Versus normal operation (numeric output, no lookups):
$ time netstat -tulnp
real    0m0.02s
user    0m0.01s
sys     0m0.01s

The primary culprits for netstat hangs typically involve:

  • DNS Reverse Lookups: netstat attempts to resolve IP addresses to hostnames
  • Service Name Resolution: Mapping port numbers to service names (/etc/services)
  • Kernel Socket Table Size: Large connection tables causing processing delays
  • Hung NFS Mounts: the -p option walks every /proc/<pid> entry, which can block on processes stuck in NFS I/O

To identify the specific bottleneck, try these alternatives:


# Disable DNS lookups (numeric output only)
netstat -n

# Skip service name resolution
netstat --numeric-ports

# Use ss from iproute2 (modern alternative)
ss -tulnp

# Check for stuck NFS mounts
mount | grep nfs
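
If the numeric variants are fast but the plain command is not, one way to confirm a blocking lookup is to trace netstat's network-related system calls (a sketch: -T prints time spent in each call, and DNS traffic targets port 53):

# Slow calls show a large <seconds> suffix; htons(53) marks DNS traffic
strace -T -e trace=network netstat -tulp 2>&1 | grep 'htons(53)'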

For production servers, consider these adjustments:


# Add to /etc/sysctl.conf (expire stale sockets sooner, shrinking the table netstat has to walk)
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1200

# Ensure /etc/hosts is consulted before DNS in /etc/nsswitch.conf
hosts: files dns

# Alternative: Use static hosts entries
echo "1.2.3.4 myserver" >> /etc/hosts

Benchmarking different network status tools:


# Traditional netstat (with name resolution)
$ time netstat -tulp
real 0m5.23s

# iproute2 ss utility
$ time ss -tulnp
real 0m0.04s

# /proc/net quick read
$ time cat /proc/net/tcp
real 0m0.01s
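
The /proc read is the fastest but the least readable: addresses and ports are stored as little-endian hex, e.g. 0100007F:0016 is 127.0.0.1:22. A sketch that decodes just the local port column (assumes gawk, the default awk on CentOS):

# Field 2 of /proc/net/tcp is local_address:port in hex
awk 'NR > 1 { split($2, a, ":"); print strtonum("0x" a[2]) }' /proc/net/tcp | sort -nu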

For modern systems, these tools often provide better performance:

  • ss from iproute2 (replaces netstat)
  • lsof -i for process-bound connections
  • ip -s link for interface statistics
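
Rough netstat equivalents with each tool (eth0 is a placeholder interface name):

ss -tn state established      # established TCP connections, numeric
lsof -nP -iTCP -sTCP:LISTEN   # listening TCP sockets and owning processes
ip -s link show eth0          # packet/error counters for one interface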

On servers with 10,000+ connections, additional tuning may be needed:


# Cap the number of TIME_WAIT sockets the kernel keeps
echo 32768 > /proc/sys/net/ipv4/tcp_max_tw_buckets

# Monitor connection tracking
conntrack -L | wc -l

# Consider connection rate limiting
iptables -A INPUT -p tcp --syn -m connlimit --connlimit-above 100 -j REJECT
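
Note that echoing into /proc does not survive a reboot; to persist the setting, the same knob goes in /etc/sysctl.conf (value as above, adjust to your workload):

# /etc/sysctl.conf
net.ipv4.tcp_max_tw_buckets = 32768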

On my CentOS 6.4 server, I've noticed inconsistent behavior with the netstat command. Approximately 20% of executions exhibit significant delays (5+ seconds), while the rest complete instantly. The pattern persists across reboots and occurs regardless of system load.

After extensive troubleshooting, here's what I found:

# Reproduce with timing:
$ time netstat -tulp
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)

real    0m5.423s
user    0m0.004s
sys     0m0.008s

The primary culprit appears to be DNS resolution: unless given -n, netstat reverse-resolves every address, which can block when:

  • DNS servers are slow to respond
  • Network interfaces have IPs without PTR records
  • Local DNS cache is invalid or expired

Solution 1: Disable reverse DNS lookups

netstat -n  # Fast - numeric output only

Solution 2: Use modern alternatives

ss -tulnp  # Socket statistics via netlink (part of iproute2)

Solution 3: Check DNS configuration

# Verify DNS resolution time:
$ time dig -x 127.0.0.1

# Check nscd cache status:
service nscd status
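
If dig reports multi-second times, bounding resolver retries in /etc/resolv.conf keeps the worst case short (a sketch; the timeout/attempts values are illustrative and the nameserver line should match your environment):

# /etc/resolv.conf - fail fast on unresponsive DNS servers
options timeout:1 attempts:2
nameserver 192.168.1.1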

For production servers, consider these adjustments in /etc/sysconfig/network (fewer address families and automatic routes mean fewer addresses to resolve):

NETWORKING_IPV6=no
NOZEROCONF=yes

And modify /etc/hosts to include all local IPs:

127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6

This Bash script helps identify the slowest-resolving remote IPs:

#!/bin/bash
# List unique remote IPs, time a reverse lookup for each, slowest first.
# TIMEFORMAT='%R' makes bash's time keyword print plain elapsed seconds.
netstat -tun | awk 'NR > 2 {print $5}' | cut -d: -f1 | sort -u | \
while read -r ip; do
    t=$( { TIMEFORMAT='%R'; time host "$ip" >/dev/null 2>&1; } 2>&1 )
    echo "$t  $ip"
done | sort -rn
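
The plain-seconds timing format is what lets the final sort -rn order results numerically, slowest first; any IP that floats to the top is a good candidate for a static /etc/hosts entry or a proper PTR record.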