Debugging Autossh: Why SSH Processes Persist After Network Disconnection in Tunnel Setups

What you're seeing is expected behavior when autossh runs with -M 0: it only watches the SSH child process, not the actual network connectivity through the tunnel. The debug log shows this:

Sep  5 12:26:44 serverA autossh[20935]: check on child 23084
Sep  5 12:26:44 serverA autossh[20935]: set alarm for 30 secs

The logs clearly show autossh is just checking if the child process (PID 23084) exists, not whether the tunnel is functional.

Autossh performs exactly two checks:

Process existence (is the SSH client child process still running?)
Port monitoring (only if a nonzero -M port is specified)

It doesn't validate tunnel functionality or remote endpoint accessibility. This explains why your SSH process remains running during network outages.
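To illustrate why a process-existence check says nothing about the network, here is a minimal sketch. It is not autossh's actual implementation; the helper name child_alive is hypothetical, and a sleep process stands in for the ssh child.

```shell
#!/bin/bash
# Returns 0 while a process with the given PID exists,
# regardless of whether that process can reach the network.
child_alive() {
  kill -0 "$1" 2>/dev/null
}

sleep 30 &                 # stand-in for the ssh child (assumption)
pid=$!

if child_alive "$pid"; then
  echo "child $pid exists"
fi

kill "$pid"                # simulate ssh finally exiting
wait "$pid" 2>/dev/null    # reap it so the PID really disappears

if ! child_alive "$pid"; then
  echo "child $pid is gone"
fi
```

A check like this keeps succeeding for as long as ssh itself stays alive, which is why the keepalive options matter: they are what make ssh exit.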

To make autossh more responsive to network failures, add these SSH parameters to your command:

AUTOSSH_POLL=30 AUTOSSH_LOGLEVEL=7 autossh -M 0 -f -S none -N \
  -o "ServerAliveInterval=15" \
  -o "ServerAliveCountMax=3" \
  -o "TCPKeepAlive=yes" \
  -L localhost:34567:localhost:6543 user1@server1

The critical parameters that trigger faster failure detection:

ServerAliveInterval 15: Send keepalive every 15 seconds
ServerAliveCountMax 3: Terminate after 3 missed responses (45 seconds total)
TCPKeepAlive yes: Enable TCP-level keepalives
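The approximate detection window is simply the interval multiplied by the allowed number of missed replies. A small sketch of that arithmetic (the function name detection_window is hypothetical):

```shell
#!/bin/bash
# Approximate seconds before ssh gives up on an unresponsive server:
# ServerAliveInterval * ServerAliveCountMax.
detection_window() {
  interval=$1
  count_max=$2
  echo $(( interval * count_max ))
}

detection_window 15 3   # prints 45
detection_window 10 2   # prints 20
```

Remember that autossh only notices the exit on its next poll, so AUTOSSH_POLL adds to the time before a restart.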

For mission-critical tunnels, consider adding a monitoring script:

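#!/bin/bash
# Caveat: nc -z only confirms that the local listener on 34567 accepts
# connections; it does not prove the remote end of the tunnel is reachable.
while true; do
  if ! nc -z localhost 34567; then
    # Stop the stale autossh instance, then start a fresh one.
    pkill -f "autossh.*34567:localhost:6543"
    # -f already backgrounds autossh, so no trailing & is needed.
    AUTOSSH_POLL=30 autossh -M 0 -f -S none -N \
      -o "ServerAliveInterval=15" -o "ServerAliveCountMax=3" \
      -L localhost:34567:localhost:6543 user1@server1
  fi
  sleep 30
done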

The original design philosophy assumes:

Network layers should handle physical connectivity
SSH's built-in keepalives suffice for most cases
Adding complex endpoint verification would introduce new failure modes

For most production scenarios, properly configured SSH keepalives provide the right balance between reliability and simplicity.

When working with autossh in persistent tunnel scenarios, many developers encounter a perplexing behavior: the SSH process remains alive even after physical network disconnection. Let's examine why this happens and how to properly configure the system.

Your setup uses these key parameters:

AUTOSSH_POLL=30 AUTOSSH_LOGLEVEL=7 autossh -M 0 -f -S none -N -L localhost:34567:localhost:6543 user1@server1

The critical elements here are:

-M 0: Disables the built-in monitoring port
-S none: Disables SSH connection sharing (no control socket is used)
AUTOSSH_POLL=30: Sets a 30-second check interval

Autossh primarily monitors the SSH process existence, not the network connectivity. When you physically disconnect the cable:

Autossh continues seeing the SSH process running (PID exists)
Without proper keepalive settings, SSH won't terminate on network loss
The monitoring loop only checks process status, not tunnel viability

The solution lies in SSH's own keepalive configuration. The command above sets no keepalive options, so add these to the client's ssh_config (or pass them with -o):

# Client-side keepalive settings (ssh_config on the machine running autossh)
ServerAliveInterval 15
ServerAliveCountMax 3
TCPKeepAlive yes

More aggressive settings for unreliable networks:

# For unstable connections: client side (ssh_config)
ServerAliveInterval 10
ServerAliveCountMax 2
# Server side (sshd_config on server1)
ClientAliveInterval 30
ClientAliveCountMax 5

Here's a more robust autossh command structure:

AUTOSSH_POLL=10 AUTOSSH_LOGLEVEL=7 \
autossh -M 20000 -N \
-o "ServerAliveInterval 10" \
-o "ServerAliveCountMax 3" \
-o "ExitOnForwardFailure=yes" \
-L localhost:34567:localhost:6543 user1@server1

Key improvements:

Uses a monitoring port (-M 20000, which also claims port 20001) for active connection verification
Shorter keepalive intervals for faster failure detection
ExitOnForwardFailure ensures cleanup if forwarding fails
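When -M is given a single base port, autossh uses that port and the one immediately above it, so both must be free. A tiny sketch that spells out the pair for a chosen base (the helper name monitor_ports is hypothetical, not part of autossh):

```shell
#!/bin/bash
# Print the two local ports claimed by "autossh -M <base>" when no
# separate echo port is given: the base port and base + 1.
monitor_ports() {
  base=$1
  echo "$base $(( base + 1 ))"
}

monitor_ports 20000   # prints: 20000 20001
```

If several tunnels run on one host, spacing their base ports at least two apart avoids collisions.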

For Linux systems, consider this systemd service unit for better control:

[Unit]
Description=AutoSSH tunnel service
Wants=network-online.target
After=network-online.target

[Service]
Environment="AUTOSSH_GATETIME=0"
Environment="AUTOSSH_POLL=10"
ExecStart=/usr/bin/autossh -M 20000 -N \
  -o "ServerAliveInterval 10" \
  -o "ServerAliveCountMax 3" \
  -L localhost:34567:localhost:6543 user1@server1
Restart=always
RestartSec=60

[Install]
WantedBy=multi-user.target

Verify your setup works correctly with:

# Check SSH process status
ps aux | grep '[s]sh'   # bracket trick keeps grep itself out of the output

# Test tunnel connectivity
timeout 5 telnet localhost 34567 || echo "Tunnel failed"

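# Force network failure test (requires root; blocks ALL outbound SSH meanwhile)
iptables -I OUTPUT -p tcp --dport 22 -j DROP
# Wait longer than ServerAliveInterval x ServerAliveCountMax (30-45 s here)
sleep 60
iptables -D OUTPUT -p tcp --dport 22 -j DROP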
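During the failure test it is useful to measure whether the ssh child actually exits within the expected keepalive window. A minimal sketch, assuming you take the child PID from the autossh log ("check on child ..."); the helper name wait_for_exit is hypothetical, and the demo below uses a short sleep as a stand-in for ssh.

```shell
#!/bin/bash
# Poll once per second until PID $1 disappears or $2 seconds elapse.
# Returns 0 if the process exited in time, 1 otherwise.
wait_for_exit() {
  pid=$1
  limit=$2
  waited=0
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$limit" ]; then
      return 1
    fi
    sleep 1
    waited=$(( waited + 1 ))
  done
  return 0
}

sleep 2 &                  # stand-in for the ssh child (assumption)
if wait_for_exit "$!" 10; then
  echo "exited in time"
else
  echo "still running after 10 seconds"
fi
```

In a real test, pass the ssh child's PID and a limit slightly above ServerAliveInterval times ServerAliveCountMax; a timeout suggests the keepalive options are not being applied.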
