Debugging Orphaned TCP Ports: When netstat Shows Open Ports Without Associated Processes

During routine server maintenance, I encountered a peculiar networking anomaly: port 5666 showed as LISTEN in netstat output, yet no process was attached to it. This occurred on an EC2 instance running NRPE monitoring:

sudo netstat -tulnp | grep 5666
tcp        0      0 0.0.0.0:5666            0.0.0.0:*               LISTEN      -

The standard diagnostic tools found no owner for the socket either:

sudo lsof -i :5666        # No output
ps aux | grep '[n]rpe'    # No process running ([n] keeps grep from matching itself)

Upon checking kernel messages, I found critical clues:

dmesg | tail -20
[132031.611745] BUG: unable to handle kernel paging request
[132031.612338] IP: [] tcp_v4_do_rcv+0x0/0x820
[132031.613024] Pid: 0, comm: swapper Not tainted 2.6.16-xenU #1

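The instance was running an old 2.6.16-xenU kernel, and the oops above occurred inside the TCP receive path (tcp_v4_do_rcv). After hitting this bug the kernel never finished cleaning up the affected network resources, so the port stayed in LISTEN state with no owning process. That is why netstat prints "-" in the PID column even when run as root.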

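A similar symptom can be reproduced deliberately with a kernel module that opens a listening socket from kernel space:

/* Experimental module to simulate the symptom: a listening socket owned by
 * the kernel, with no user-space process attached.
 * Uses the older in-kernel socket API; since Linux 4.2, sock_create_kern()
 * takes a struct net * (for example &init_net) as its first argument. */
#include <linux/module.h>
#include <linux/init.h>
#include <linux/in.h>
#include <linux/net.h>
#include <net/tcp.h>

static struct socket *sock;

static int __init port_hijack_init(void) {
    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port = htons(5666),
        .sin_addr.s_addr = htonl(INADDR_ANY),
    };
    int err = sock_create_kern(PF_INET, SOCK_STREAM, IPPROTO_TCP, &sock);
    if (err)
        return err;
    err = kernel_bind(sock, (struct sockaddr *)&addr, sizeof(addr));
    if (!err)
        err = kernel_listen(sock, 1);
    if (err) {
        sock_release(sock);  /* do not leak the socket on failure */
        return err;
    }
    printk(KERN_INFO "Port 5666 bound without proper process context\n");
    return 0;
}

static void __exit port_hijack_exit(void) {
    sock_release(sock);  /* unloading the module frees the port */
}

module_init(port_hijack_init);
module_exit(port_hijack_exit);
MODULE_LICENSE("GPL");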

For EC2 instances, the solution involved:

# Upgrade kernel on Ubuntu/Debian
sudo apt-get update
sudo apt-get install linux-image-$(uname -r|sed 's,[^-]*-[^-]*-,,')

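The following sysctls are sometimes suggested for recovering the port without a reboot, but they do not reset the TCP stack and will not release a socket stuck in LISTEN:

# Resets new connections when the accept queue overflows; does not touch listening sockets
sudo sysctl -w net.ipv4.tcp_abort_on_overflow=1
# Only changed TIME_WAIT handling; removed in Linux 4.12 and unsafe behind NAT
sudo sysctl -w net.ipv4.tcp_tw_recycle=1

For a listening socket orphaned by a kernel bug, a reboot is the reliable way to free the port.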

Modern monitoring solutions should include kernel version checks:

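#!/bin/bash
# Nagios-style check: CRITICAL (exit 2) if the running kernel is older than MIN_KERNEL.
# Requires GNU sort for -V (version sort).
MIN_KERNEL="2.6.32"
CURRENT_KERNEL=$(uname -r | cut -d'-' -f1)

# The older of the two versions sorts first; if that is not MIN_KERNEL, the kernel is too old
if [ "$(printf '%s\n' "$MIN_KERNEL" "$CURRENT_KERNEL" | sort -V | head -n1)" != "$MIN_KERNEL" ]; then
    echo "CRITICAL: Kernel $CURRENT_KERNEL is older than $MIN_KERNEL and may be affected by orphaned port bugs"
    exit 2
fi
echo "OK: Kernel $CURRENT_KERNEL"
exit 0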

In more detail, the investigation went as follows. TCP port 5666 appeared open in netstat output, yet no process was bound to it:

# Initial discovery
netstat -ln --program
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address   Foreign Address State   PID/Program name
tcp   0      0      0.0.0.0:5666    0.0.0.0:*       LISTEN  -

The expected NRPE daemon (part of Opsview monitoring) wasn't running, and attempts to start it failed immediately. Standard process inspection tools came up empty:

lsof -i :5666
# No output
ps aux | grep nrpe
# No relevant processes

When basic diagnostics didn't reveal the culprit, I expanded the investigation scope:

Socket Statistics:

ss -tulnp | grep 5666
# Showed empty process field

Kernel Socket State Inspection:

# State 0A is LISTEN
grep ' 0A ' /proc/net/tcp
# Port 5666 is 1622 in hex; match it as the port part of an address:port field
grep -i ':1622 ' /proc/net/tcp
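
The two greps above can be combined so that the port is converted to hex and matched only in the local-address column of sockets in state 0A. This is a minimal sketch, assuming bash and a POSIX awk; the function names port_to_hex and listening_inodes are hypothetical helpers, not part of any existing tool:

```shell
#!/bin/bash
# Convert a decimal port to the 4-digit uppercase hex form used in /proc/net/tcp.
port_to_hex() {
    printf '%04X\n' "$1"
}

# Read /proc/net/tcp-formatted lines on stdin and print the inode of every
# socket listening (state 0A) on the given port.
# Columns: $2 = local address:port, $4 = state, $10 = inode.
listening_inodes() {
    local hex
    hex=$(port_to_hex "$1")
    awk -v port="$hex" '
        $4 == "0A" {
            n = split($2, local_addr, ":")
            if (toupper(local_addr[n]) == port) print $10
        }'
}

# Usage: listening_inodes 5666 < /proc/net/tcp
```

The inode printed here identifies the socket itself, so it can be used to look for an owning process independently of netstat or lsof.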

Kernel Message Buffer:

dmesg | grep -i "5666\|nrpe\|kernel"
# Revealed critical kernel panics
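
Matching on the word "kernel" tends to return a lot of unrelated lines. A narrower filter might look like the sketch below, assuming bash and grep -E; kernel_faults is a hypothetical helper and its pattern list is illustrative rather than exhaustive:

```shell
#!/bin/bash
# Read kernel log lines on stdin and print only those that look like a
# kernel fault. The patterns are case-sensitive so that "BUG:" does not
# also match ordinary "debug:" messages.
kernel_faults() {
    grep -E 'BUG:|Oops|Kernel panic|general protection fault|unable to handle kernel'
}

# Usage: dmesg | kernel_faults
```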

The EC2 instance was running an outdated and unstable Linux kernel (2.6.16). When the kernel hit the bug:

1. The NRPE process terminated abnormally
2. The kernel TCP stack kept the listening socket
3. The socket cleanup routines never ran

This created an "orphaned port" scenario where the network stack believed the port was in use, but no user-space process owned it.
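
Whether a socket really has no user-space owner can be checked by searching for its inode among the open file descriptors under /proc. This is a minimal sketch, assuming bash and a Linux /proc layout; socket_owner_pids is a hypothetical helper, and its optional second argument (an alternate proc root) exists only to make it easy to test. It should be run as root, since other users' descriptors are otherwise unreadable and the result would be misleadingly empty:

```shell
#!/bin/bash
# Print the PID of every process holding an fd that points at socket:[INODE].
# Usage: socket_owner_pids INODE [PROC_ROOT]
socket_owner_pids() {
    local inode=$1 proc_root=${2:-/proc} fd pid
    for fd in "$proc_root"/[0-9]*/fd/*; do
        # Each socket fd is a symlink whose target reads "socket:[inode]".
        if [ "$(readlink "$fd" 2>/dev/null)" = "socket:[$inode]" ]; then
            pid=${fd#"$proc_root"/}   # strip the proc root prefix
            echo "${pid%%/*}"         # keep only the PID component
        fi
    done | sort -un
}

# Usage: socket_owner_pids 12345
```

An empty result for the inode of a LISTEN socket is consistent with the orphaned-port scenario described above, although it does not by itself prove a kernel bug.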

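Immediate Resolution:

# tcp_tw_reuse only lets outgoing connections reuse TIME_WAIT sockets;
# it does not release a LISTEN socket, so the reboot is what frees the port
echo "1" | sudo tee /proc/sys/net/ipv4/tcp_tw_reuse
sudo reboot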

Permanent Fix:

# For AWS EC2 instances specifically
# (the package name varies by release; list candidates with: apt-cache search linux-image)
sudo apt-get install linux-image-$(uname -r)-ec2
sudo reboot

Preventive Monitoring (sample Nagios check):

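#!/bin/bash
# Run as root: otherwise lsof cannot see sockets owned by other users
# and a healthy port would be reported as orphaned.
PORT=5666
# Trailing space in the pattern avoids matching longer ports such as 56660
NETSTAT=$(netstat -ln --program | grep ":${PORT} ")
PROCESS=$(lsof -nP -iTCP:"${PORT}" -sTCP:LISTEN)

if [ -n "$NETSTAT" ] && [ -z "$PROCESS" ]; then
  echo "CRITICAL: Orphaned port ${PORT} detected"
  exit 2
fi
echo "OK: No orphaned socket on port ${PORT}"
exit 0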

For persistent cases, these techniques can help:

# Trace the network system calls of a test connection to the port
strace -f -e trace=network nc -zv 127.0.0.1 5666

# Kernel module inspection (if suspecting firewall modules)
lsmod | grep nf_
modinfo nf_conntrack

Remember that different Linux distributions may require variant commands. For RHEL-based systems, consider:

sudo semodule -DB  # Rebuild the SELinux policy with dontaudit rules disabled, so hidden denials are logged
sudo restorecon -Rv /usr/local/nagios/  # Restore default SELinux file contexts
