Debugging Mystery Sockets: Why lsof Shows Connections That netstat Doesn’t


When troubleshooting network-related issues in Java applications, we often rely on tools like lsof and netstat. However, you might encounter sockets that appear in lsof output but remain invisible to netstat. This typically happens with UNIX domain sockets or internal IPC connections that don't use traditional network protocols, which common invocations such as netstat -tunp do not list, and with sockets that were created but never bound or connected.

The sockets appearing in your output with "can't identify protocol" are likely one of these types:


1. UNIX domain sockets (AF_UNIX)
2. Internal IPC sockets
3. Socket pairs created with socketpair(2)
4. Abstract namespace sockets
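To make these categories concrete, here is a minimal Python sketch that creates a socket pair and an unbound TCP socket, reads the inode behind each file descriptor, and checks whether that inode is listed in a procfs socket table. The helper names socket_inode and inode_in_table are illustrative, and the table paths assume a Linux /proc layout.

```python
import os
import socket
import stat


def socket_inode(sock):
    """Return the kernel inode number behind a socket's file descriptor."""
    st = os.fstat(sock.fileno())
    if not stat.S_ISSOCK(st.st_mode):
        raise ValueError("file descriptor is not a socket")
    return st.st_ino


def inode_in_table(inode, table_path):
    """True if `inode` appears as a whole field in a /proc/net style table."""
    try:
        with open(table_path) as table:
            return any(str(inode) in line.split() for line in table)
    except OSError:
        return False  # table missing or unreadable (for example, not on Linux)


if __name__ == "__main__":
    left, right = socket.socketpair()  # AF_UNIX pair with no filesystem path
    unbound = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # never bound or connected
    try:
        for label, sock, table in [
            ("socketpair end", left, "/proc/net/unix"),
            ("unbound TCP socket", unbound, "/proc/net/tcp"),
        ]:
            inode = socket_inode(sock)
            listed = inode_in_table(inode, table)
            print(f"{label}: socket:[{inode}] listed in {table}: {listed}")
    finally:
        for sock in (left, right, unbound):
            sock.close()
```

The inode printed here is the same number lsof shows for the descriptor. An unbound TCP socket is typically absent from /proc/net/tcp, which is one way a descriptor ends up with no identifiable protocol.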

To properly identify these mystery sockets, try these approaches:


# Check for UNIX domain sockets
sudo ss -xap | grep java

# Alternative using /proc (pgrep -o picks the oldest matching process)
pid=$(pgrep -o java)
for fd in /proc/"$pid"/fd/*; do
    inode=$(readlink "$fd" | sed -n 's/^socket:\[\([0-9]*\)\]$/\1/p')
    [ -n "$inode" ] && grep -w "$inode" /proc/net/unix
done

# Socket statistics
cat /proc/net/sockstat
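The sockstat counters are easier to compare once parsed. The sketch below is a hedged illustration: parse_sockstat and tcp_alloc_minus_inuse are hypothetical helper names, and reading a gap between the TCP "alloc" and "inuse" counters as a hint of allocated but unhashed sockets is a rough heuristic, not a precise count. The numbers cover the whole network namespace, not a single process.

```python
def parse_sockstat(text):
    """Parse /proc/net/sockstat content into {section: {counter: value}}."""
    stats = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "SECTION:" prefix
        fields = rest.split()
        # Counters come as alternating "key value" pairs
        stats[name.strip()] = {
            key: int(value)
            for key, value in zip(fields[0::2], fields[1::2])
            if value.isdigit()
        }
    return stats


def tcp_alloc_minus_inuse(stats):
    """Allocated TCP sockets that are not counted as in use (rough hint only)."""
    tcp = stats.get("TCP", {})
    return tcp.get("alloc", 0) - tcp.get("inuse", 0)


if __name__ == "__main__":
    try:
        with open("/proc/net/sockstat") as f:
            stats = parse_sockstat(f.read())
    except OSError:
        stats = {}  # not on Linux, or procfs not mounted
    for section, counters in stats.items():
        print(section, counters)
    print("TCP alloc - inuse:", tcp_alloc_minus_inuse(stats))
```

If this difference keeps growing while the application runs, that is at least consistent with sockets being created and never connected or closed.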

Since this is a Java application, consider these JVM tools:


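# Summarize native memory by category (needs -XX:NativeMemoryTracking=summary
# at JVM startup); note that this reports memory, not file descriptors
jcmd <pid> VM.native_memory summary

// Alternatively, list the JVM's own file descriptors from inside the process
// (Linux only; ProcessHandle requires Java 9+)
long pid = ProcessHandle.current().pid();
new ProcessBuilder("ls", "-l", "/proc/" + pid + "/fd")
    .inheritIO()   // forward the output of ls to the JVM's stdout
    .start()
    .waitFor();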

To manage socket resources better in your Java container:


// Implement proper resource cleanup
try (Socket socket = new Socket()) {
    // socket operations
} // auto-closed

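// Monitor FD usage (getOpenFileDescriptorCount() is defined on
// com.sun.management.UnixOperatingSystemMXBean, available on Unix-like JVMs)
UnixOperatingSystemMXBean osBean = ManagementFactory.getPlatformMXBean(
    UnixOperatingSystemMXBean.class);
System.out.println("Open FD count: " + osBean.getOpenFileDescriptorCount());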
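A raw descriptor count does not say what is leaking. As a complement, the following Python sketch breaks the count down by descriptor type using the symlink targets under /proc/<pid>/fd. The function names classify_target and count_fd_types are illustrative, and the approach assumes Linux procfs.

```python
import os
from collections import Counter


def classify_target(target):
    """Map a /proc/<pid>/fd symlink target to a coarse descriptor type."""
    if target.startswith("socket:["):
        return "socket"
    if target.startswith("pipe:["):
        return "pipe"
    if target.startswith("anon_inode:"):
        return "anon_inode"  # epoll, eventfd, timerfd, ...
    return "file"


def count_fd_types(pid="self"):
    """Count the open descriptors of a process by type."""
    counts = Counter()
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while scanning
        counts[classify_target(target)] += 1
    return counts


if __name__ == "__main__":
    if os.path.isdir("/proc/self/fd"):
        for kind, number in count_fd_types().most_common():
            print(f"{kind}: {number}")
```

Pass the JVM's PID instead of the default "self" to inspect the Java process. A steadily rising "socket" count points at a socket leak instead of a file or pipe leak.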

For long-term monitoring, consider implementing these solutions:


# Prometheus scrape configuration (an entry under scrape_configs)
- job_name: 'java_fd_monitor'
  metrics_path: '/metrics'
  static_configs:
    - targets: ['localhost:1234']
  params:
    pattern: ['java.*socket.*']

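// Custom JMX monitoring. The JDK does not emit a notification of this type;
// it assumes your own MBean broadcasts "file.descriptor.threshold.exceeded".
import javax.management.Notification;
import javax.management.NotificationListener;

public class SocketMonitor implements NotificationListener {
    @Override
    public void handleNotification(Notification notification, Object handback) {
        if ("file.descriptor.threshold.exceeded".equals(notification.getType())) {
            // Trigger diagnostics
        }
    }
}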
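The threshold check that such monitoring relies on can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 80% default threshold and the names fd_usage and exceeds_threshold are made up for the example, and the descriptor count is read from /proc/self/fd, so it measures the running script, not a separate JVM.

```python
import os
import resource


def fd_usage(open_count, soft_limit):
    """Fraction of the descriptor limit currently in use."""
    if soft_limit <= 0:
        raise ValueError("soft_limit must be positive")
    return open_count / soft_limit


def exceeds_threshold(open_count, soft_limit, threshold=0.8):
    """True when usage has reached the alert threshold (80% by default)."""
    return fd_usage(open_count, soft_limit) >= threshold


if __name__ == "__main__":
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft_limit != resource.RLIM_INFINITY and os.path.isdir("/proc/self/fd"):
        open_count = len(os.listdir("/proc/self/fd"))
        print(f"{open_count}/{soft_limit} descriptors in use")
        if exceeds_threshold(open_count, soft_limit):
            print("threshold exceeded: trigger diagnostics here")
```

Alerting on a fraction of the limit, not an absolute count, keeps the check meaningful when the container's ulimit changes.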

When diagnosing file descriptor leaks in Java applications, it's particularly frustrating to encounter sockets that appear in lsof output but remain invisible to netstat or ss. These typically appear with the "can't identify protocol" message and mysterious socket inode numbers.

These hidden sockets usually fall into several categories:

  • Unix domain sockets in abstract namespace (no filesystem path)
  • Sockets created but not yet bound (in CLOSED or UNCONNECTED state)
  • Socket pairs created via socketpair() system call
  • Ephemeral sockets during complex connection sequences
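When checking these categories by hand, it helps to know that abstract and unnamed UNIX sockets do have entries in /proc/net/unix: abstract names are shown with a leading @, and unnamed sockets (including socket pairs) have an empty path column. The sketch below parses one table line and labels it accordingly. parse_unix_entry is a hypothetical helper, and the column layout (inode in the seventh column, optional path in the eighth) is assumed from the usual Linux format.

```python
from collections import Counter


def parse_unix_entry(line):
    """Parse one /proc/net/unix line; return None for the header or bad lines."""
    fields = line.split(None, 7)
    if len(fields) < 7 or not fields[0].endswith(":"):
        return None
    path = fields[7].strip() if len(fields) > 7 else ""
    if not path:
        kind = "unnamed"      # socketpair() ends and unbound sockets
    elif path.startswith("@"):
        kind = "abstract"     # abstract namespace, no filesystem entry
    else:
        kind = "pathname"
    return {"inode": int(fields[6]), "path": path, "kind": kind}


if __name__ == "__main__":
    kinds = Counter()
    try:
        with open("/proc/net/unix") as table:
            for line in table:
                entry = parse_unix_entry(line)
                if entry:
                    kinds[entry["kind"]] += 1
    except OSError:
        pass  # not on Linux, or procfs not mounted
    print(dict(kinds))
```

Matching the inode field against the socket:[N] number from lsof tells you which of these kinds a given descriptor is.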

Here are more powerful ways to investigate these phantom sockets:

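# Check for the socket inode in every network namespace (run as root;
# replace SOCKET_INODE with the number lsof prints, e.g. 12345 for socket:[12345])
for ns in /proc/[0-9]*/ns/net; do
  nsenter --net="$ns" grep -sw "SOCKET_INODE" \
    /proc/net/tcp /proc/net/tcp6 /proc/net/udp /proc/net/udp6 /proc/net/unix
done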

# Trace socket creation with SystemTap. $return is the new file descriptor;
# the inode is not available in this probe, so map the FD to its inode
# afterwards with: readlink /proc/PID/fd/FD
stap -e 'probe begin {
  printf("%-6s %-16s %s\n", "PID", "COMM", "FD")
}
probe syscall.socket.return {
  printf("%-6d %-16s %d\n", pid(), execname(), $return)
}'
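The namespace loop above enters the namespace of every process, even though most processes share one. A small Python sketch can group PIDs by network namespace first, so that each namespace only needs to be inspected once. The names parse_ns_link and group_pids_by_netns are illustrative. The code assumes the Linux convention that /proc/<pid>/ns/net is a symlink of the form net:[inode], and reading it for other users' processes usually requires root.

```python
import os
import re


def parse_ns_link(target):
    """Extract the namespace inode from a link target such as 'net:[4026531992]'."""
    match = re.fullmatch(r"net:\[(\d+)\]", target)
    return int(match.group(1)) if match else None


def group_pids_by_netns(proc_root="/proc"):
    """Return {namespace inode: [pids]} for all readable processes."""
    groups = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "net"
        try:
            target = os.readlink(os.path.join(proc_root, entry, "ns", "net"))
        except OSError:
            continue  # process exited or permission denied
        ns_inode = parse_ns_link(target)
        if ns_inode is not None:
            groups.setdefault(ns_inode, []).append(int(entry))
    return groups


if __name__ == "__main__":
    if os.path.isdir("/proc/self/ns"):
        for ns_inode, pids in sorted(group_pids_by_netns().items()):
            # One representative PID per namespace is enough for nsenter
            print(f"net:[{ns_inode}] representative pid {min(pids)} ({len(pids)} processes)")
```

Each representative PID can then be passed to nsenter as --net=/proc/PID/ns/net.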

For Java applications, we can combine native tools with JVM diagnostics:

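# Show threads that are currently executing java.net code
# (a thread dump cannot tell you which thread created a socket)
jstack PID | grep -A10 "java.net"

# Native Memory Tracking summary (needs -XX:NativeMemoryTracking=summary at startup);
# it reports memory by category, not individual sockets
jcmd PID VM.native_memory summary

# Socket statistics for the process's network namespace (Linux only)
cat /proc/PID/net/sockstat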

Here's a Python script that reads a process's socket inodes from /proc/<pid>/fd and looks each one up in the procfs socket tables:

#!/usr/bin/env python3
import os, re, sys

def find_socket_leaks(pid):
    # Collect socket inodes from the fd symlinks, which look like "socket:[12345]"
    sockets = set()
    fd_dir = f"/proc/{pid}/fd"
    for fd in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            continue  # fd was closed while we were scanning
        match = re.fullmatch(r'socket:\[(\d+)\]', target)
        if match:
            sockets.add(match.group(1))

    # Look each inode up in the tables of the process's network namespace
    unmatched = set(sockets)
    for proto in ['tcp', 'tcp6', 'udp', 'udp6', 'unix']:
        try:
            with open(f"/proc/{pid}/net/{proto}") as f:
                for line in f:
                    # Compare whole fields so inode 123 does not match 1234
                    for inode in sockets.intersection(line.split()):
                        print(f"Found {inode} in /proc/{pid}/net/{proto}: {line.strip()}")
                        unmatched.discard(inode)
        except FileNotFoundError:
            continue

    if unmatched:
        print(f"Socket inodes not found in any checked table: {sorted(unmatched)}")
        print("Possible candidates:")
        print("- Unconnected or unbound sockets")
        print("- Sockets of families not checked here (e.g. netlink, packet)")

if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit(f"usage: {sys.argv[0]} PID")
    find_socket_leaks(sys.argv[1])

To prevent such leaks in Java applications:

  • Implement connection pooling properly
  • Add shutdown hooks for network cleanup
  • Monitor with JMX: the java.nio:type=BufferPool MXBeans and the OpenFileDescriptorCount attribute of the java.lang:type=OperatingSystem MXBean
  • Consider using -XX:NativeMemoryTracking=detail

For critical production systems where you need immediate resolution:

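# Emergency FD cleanup (Linux only). The FD numbers are examples; substitute the
# leaked descriptors you identified. The (int) cast lets gdb call close() without
# debug symbols.
gdb -p PID -batch -ex 'call (int)close(1010)' -ex 'call (int)close(1011)' \
    -ex 'call (int)close(1012)' -ex 'call (int)close(1014)' -ex 'detach'

Warning: Use this only as a last resort, after thorough testing in non-production environments. The JVM is unaware that these descriptors were closed, so the Java objects that still reference them will fail on later use, and a reused FD number can be closed by mistake.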