Linux TCP Keep-Alive Not Working on Outbound Connections: Diagnosis and Solutions



When working with TCP connections on Linux, I noticed an interesting discrepancy in keep-alive behavior between incoming and outgoing connections. While incoming connections properly show keepalive timers in netstat --timers output, outgoing connections show as "off":

tcp 0 0 localhost.localdomain:44307 172.16.0.15:2717 ESTABLISHED off (0.00/0/0)

This contrasts with incoming connections that display proper timer values:

tcp 0 0 172.16.0.3:8585 localhost.localdomain:21527 ESTABLISHED keepalive (29.26/0/0)

After digging through kernel documentation and various networking forums, I found that Linux does support keep-alive for outgoing connections, but there are some caveats:

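  • The keep-alive timer is only armed on sockets where the application has set SO_KEEPALIVE (it is off by default), and probes are sent only after the connection has carried no data for the full idle period
  • The system-wide TCP keep-alive settings only supply default timings for sockets that have SO_KEEPALIVE enabled; they do not turn keep-alive on, and per-socket TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT values take precedence over them
  • Many client applications never enable keep-alive on the sockets they connect() with, while servers often enable it on the listening socket, which accepted connections then inherit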
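One way to check how a per-socket timing option relates to the system-wide default is to read TCP_KEEPIDLE on a fresh socket, set it, and read it back. The following is a minimal sketch under the assumption of a Linux host; the helper names get_keepidle and set_and_read_keepidle are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: returns the effective TCP_KEEPIDLE (in seconds)
 * for fd, or -1 if getsockopt() fails. On an untouched socket Linux
 * reports the system-wide default here. */
int get_keepidle(int fd) {
    int idle = 0;
    socklen_t len = sizeof(idle);
    if (getsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, &len) < 0)
        return -1;
    return idle;
}

/* Hypothetical helper: sets a per-socket idle time and returns the value
 * the kernel reports afterwards, or -1 on error. */
int set_and_read_keepidle(int fd, int seconds) {
    assert(seconds > 0); /* the kernel rejects zero or negative idle times */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &seconds, sizeof(seconds)) < 0)
        return -1;
    return get_keepidle(fd);
}
```

Calling get_keepidle() on a new socket and then set_and_read_keepidle(fd, 60) should show the reported value change from the system default to 60. Note that this only changes the timing; the timer is still not armed until SO_KEEPALIVE is set.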

To properly check if keep-alive is enabled on a socket, we can use the following C code snippet:

int optval;
socklen_t optlen = sizeof(optval);

if (getsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &optval, &optlen) < 0) {
    perror("getsockopt() failed");
} else {
    printf("SO_KEEPALIVE is %s\n", optval ? "ON" : "OFF");
}

Here's how to properly enable keep-alive with custom timing for outgoing connections:

#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

int enable_keepalive(int sockfd) {
    int yes = 1;
    
    // Enable keep-alive
    if (setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &yes, sizeof(int)) < 0) {
        return -1;
    }
    
    // Set specific timing parameters (values in seconds)
    int idle = 60;    // Time before sending first keep-alive
    int interval = 10; // Interval between keep-alive packets
    int count = 5;     // Number of unacknowledged probes before declaring dead
    
    if (setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(int)) < 0) {
        return -1;
    }
    if (setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(int)) < 0) {
        return -1;
    }
    if (setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(int)) < 0) {
        return -1;
    }
    
    return 0;
}
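To get a feel for what these three values mean together, the arithmetic can be written down directly: if every probe goes unanswered, the peer is declared dead roughly idle + interval × count seconds after the last traffic. The sketch below is only an illustration of that estimate; keepalive_detection_seconds is a hypothetical helper, not part of any socket API.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: approximate number of seconds between the last
 * traffic on a connection and the kernel giving up on the peer, assuming
 * no probe is ever acknowledged. */
int keepalive_detection_seconds(int idle, int interval, int count) {
    assert(idle >= 0 && interval >= 0 && count >= 0);
    return idle + interval * count;
}

/* Prints the estimate for one set of parameters. */
void print_detection_time(int idle, int interval, int count) {
    printf("idle=%ds interval=%ds count=%d -> about %ds to detect a dead peer\n",
           idle, interval, count,
           keepalive_detection_seconds(idle, interval, count));
}
```

With the values used above (60, 10, 5) the estimate is about 110 seconds; with the usual Linux defaults (7200, 75, 9) it is about 7875 seconds, which is why the defaults are often too slow for detecting dead peers.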

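For applications where you can't modify the source code, you can adjust the system-wide defaults. Note that these only change the timing for sockets on which the application has already enabled SO_KEEPALIVE; they do not switch keep-alive on for sockets that never asked for it: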

# Check current settings
cat /proc/sys/net/ipv4/tcp_keepalive_time
cat /proc/sys/net/ipv4/tcp_keepalive_intvl
cat /proc/sys/net/ipv4/tcp_keepalive_probes

# Set new values (temporary, lost on reboot; requires root)
echo 60 > /proc/sys/net/ipv4/tcp_keepalive_time
echo 10 > /proc/sys/net/ipv4/tcp_keepalive_intvl
echo 5 > /proc/sys/net/ipv4/tcp_keepalive_probes

# For permanent changes, add to /etc/sysctl.conf and apply with sysctl -p:
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 5
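The same defaults can also be read from inside a program, which is handy for logging the effective configuration at startup. This is a minimal sketch assuming the standard /proc/sys/net/ipv4 layout shown above; read_ipv4_sysctl and print_keepalive_defaults are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads one integer from /proc/sys/net/ipv4/<name>.
 * Returns the value, or -1 if the file cannot be opened or parsed. */
long read_ipv4_sysctl(const char *name) {
    char path[128];
    long value = -1;

    assert(name != NULL);
    int n = snprintf(path, sizeof(path), "/proc/sys/net/ipv4/%s", name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1; /* name too long for the buffer */

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Prints the three system-wide keep-alive defaults (-1 means unreadable). */
void print_keepalive_defaults(void) {
    printf("time=%ld intvl=%ld probes=%ld\n",
           read_ipv4_sysctl("tcp_keepalive_time"),
           read_ipv4_sysctl("tcp_keepalive_intvl"),
           read_ipv4_sysctl("tcp_keepalive_probes"));
}
```

Reading the values does not require root; only writing them does.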

If keep-alive still isn't working as expected:

  • Verify with ss -o instead of netstat (more modern tool)
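  • Check whether a firewall or NAT device between the hosts drops idle connections or filters the probes; the remote TCP stack acknowledges keep-alive probes on its own, without any support from the remote application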
  • Use tcpdump or Wireshark to confirm if packets are being sent
  • Test with different idle times to ensure the timer is properly triggered

To recap the symptom: incoming connections show a keepalive timer in netstat --timers output, while outgoing connections show "off":


// Sample netstat output showing the issue
tcp 0 0 localhost.localdomain:44307 172.16.0.15:2717 ESTABLISHED off (0.00/0/0)

To properly diagnose this, we need to understand how to examine socket options programmatically. Tools such as netstat --timers and ss -o show whether a keep-alive timer is running, but not the option values configured on the socket, so we can use getsockopt in our code:


#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>

void check_keepalive(int sockfd) {
    int optval;
    socklen_t optlen = sizeof(optval);
    
    if (getsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &optval, &optlen) == -1) {
        perror("getsockopt");
        return;
    }
    
    printf("SO_KEEPALIVE is %s\n", optval ? "ON" : "OFF");
    
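    if (optval) {
        int idle = 0, interval = 0, count = 0;

        // getsockopt() overwrites optlen, so reset it before every call
        optlen = sizeof(idle);
        if (getsockopt(sockfd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, &optlen) == -1)
            perror("getsockopt TCP_KEEPIDLE");
        optlen = sizeof(interval);
        if (getsockopt(sockfd, IPPROTO_TCP, TCP_KEEPINTVL, &interval, &optlen) == -1)
            perror("getsockopt TCP_KEEPINTVL");
        optlen = sizeof(count);
        if (getsockopt(sockfd, IPPROTO_TCP, TCP_KEEPCNT, &count, &optlen) == -1)
            perror("getsockopt TCP_KEEPCNT");

        printf("Keepalive settings - idle: %d, interval: %d, count: %d\n",
               idle, interval, count);
    }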
}

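After extensive testing, I found that Linux itself does not treat inbound and outbound connections differently; what differs is which sockets had SO_KEEPALIVE set by the application. The key points:

  • Keep-alive must be explicitly enabled on outbound sockets with setsockopt(), either before or after connect(); sockets returned by accept() inherit the setting from the listening socket
  • The default system-wide settings in /proc/sys/net/ipv4/tcp_keepalive_* only provide default timings; they do not enable keep-alive and do not override per-socket options
  • The idle timer restarts whenever data is exchanged, so probes appear only after the connection has been quiet for the full idle time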
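A small loopback experiment can show where an inbound/outbound difference like the one in the netstat output may come from. The sketch below assumes a Linux host with a working loopback interface: it sets SO_KEEPALIVE on a listening socket only, connects a plain client to it, and reports the keep-alive state of both ends. The function names keepalive_enabled and loopback_keepalive_demo are hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if SO_KEEPALIVE is on for fd, 0 if off, -1 on error. */
int keepalive_enabled(int fd) {
    int on = 0;
    socklen_t len = sizeof(on);
    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, &len) < 0)
        return -1;
    return on != 0;
}

/* Hypothetical experiment: the listener has SO_KEEPALIVE set, the client
 * does not. Stores the state of the accepted and the client socket and
 * returns 0, or returns -1 if any setup step fails. */
int loopback_keepalive_demo(int *accepted_on, int *client_on) {
    int yes = 1, rc = -1;
    int lfd = -1, cfd = -1, afd = -1;
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);

    assert(accepted_on != NULL && client_on != NULL);
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0; /* let the kernel pick a free port */

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 || cfd < 0) goto out;
    if (setsockopt(lfd, SOL_SOCKET, SO_KEEPALIVE, &yes, sizeof(yes)) < 0) goto out;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto out;
    if (listen(lfd, 1) < 0) goto out;
    if (getsockname(lfd, (struct sockaddr *)&addr, &alen) < 0) goto out;

    /* The kernel completes the handshake into the listen backlog, so a
     * blocking connect() returns before accept() is called. */
    if (connect(cfd, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto out;
    afd = accept(lfd, NULL, NULL);
    if (afd < 0) goto out;

    *accepted_on = keepalive_enabled(afd);
    *client_on = keepalive_enabled(cfd);
    rc = 0;
out:
    if (afd >= 0) close(afd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return rc;
}
```

On Linux the accepted socket should report keep-alive as enabled and the client socket as disabled, which matches a server that sets the option on its listener and a client that never sets it.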

Here's the correct way to implement keep-alive for outbound connections:


int enable_keepalive(int sockfd) {
    int optval = 1;
    
    // Enable keepalive
    if (setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &optval, sizeof(optval)) == -1) {
        perror("setsockopt SO_KEEPALIVE");
        return -1;
    }
    
    // Set specific parameters (optional)
    optval = 60; // TCP_KEEPIDLE (time in seconds)
    if (setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPIDLE, &optval, sizeof(optval)) == -1) {
        perror("setsockopt TCP_KEEPIDLE");
    }
    
    optval = 10; // TCP_KEEPINTVL (interval between probes)
    if (setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPINTVL, &optval, sizeof(optval)) == -1) {
        perror("setsockopt TCP_KEEPINTVL");
    }
    
    optval = 6; // TCP_KEEPCNT (number of unacknowledged probes)
    if (setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPCNT, &optval, sizeof(optval)) == -1) {
        perror("setsockopt TCP_KEEPCNT");
    }
    
    return 0;
}

To verify your keep-alive settings are active, use these commands:


# Check timer status
netstat --timers -tnp

# Alternative using ss
ss -to state established '( sport = :your_port )'

# View system-wide defaults
cat /proc/sys/net/ipv4/tcp_keepalive_time
cat /proc/sys/net/ipv4/tcp_keepalive_intvl
cat /proc/sys/net/ipv4/tcp_keepalive_probes