Many sysadmins have encountered this frustrating scenario: SSH key authentication succeeds, the server logs show a successful login, but then Write failed: Broken pipe appears and the connection terminates. Let's dissect this issue from several technical angles.
The tcpdump output reveals critical insights about the connection flow:
19:00:41.211348 IP [server].ssh > [client]: Flags [S.], seq 4135716624, ack 3430788633
19:01:34.714519 IP [client] > [server].ssh: Flags [P.], seq 2702:3162, ack 2790 (retransmission)
In this excerpt, roughly 53 seconds pass between the server's SYN-ACK and the client's retransmitted data segment. A stall of that length suggests one of the following:
- Network path MTU issues
- Stateful firewall interference
- TCP window sizing problems
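To quantify a stall like this, the interval between two tcpdump timestamps can be computed directly. This is a minimal sketch, assuming both timestamps fall on the same day and use tcpdump's default HH:MM:SS.microseconds format; `ts_gap` is a hypothetical helper name, not part of tcpdump.

```shell
# ts_gap START END: seconds elapsed between two tcpdump-style timestamps.
# Hypothetical helper; assumes both timestamps are from the same day.
ts_gap() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    split(a, s, ":"); split(b, e, ":")
    t0 = s[1] * 3600 + s[2] * 60 + s[3]
    t1 = e[1] * 3600 + e[2] * 60 + e[3]
    printf "%.3f\n", t1 - t0
  }'
}

# Timestamps taken from the capture excerpt above
ts_gap 19:00:41.211348 19:01:34.714519
```

Running the same calculation over consecutive lines of a longer capture makes it easier to see where the connection actually goes quiet.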
First, verify these critical SSH server settings in /etc/ssh/sshd_config:
# Example of crucial parameters
TCPKeepAlive yes
ClientAliveInterval 30
ClientAliveCountMax 5
LoginGraceTime 2m
AllowTcpForwarding yes
The auth.log shows an interesting warning:
error: Could not load host key: /etc/ssh/ssh_host_ed25519_key
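Generate the missing host key (with an empty passphrase, since sshd must be able to read it unattended) and restart the service:
sudo ssh-keygen -t ed25519 -N "" -f /etc/ssh/ssh_host_ed25519_key
sudo systemctl restart sshd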
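Before generating anything, it can help to see which host key files are actually absent. The following is a sketch only: `missing_host_keys` is a hypothetical helper, and the list of key types (rsa, ecdsa, ed25519) is an assumption based on what current OpenSSH releases create by default.

```shell
# missing_host_keys [DIR]: print the path of each standard host key absent from DIR.
# Hypothetical helper; DIR defaults to /etc/ssh. The key type list is an assumption.
missing_host_keys() {
  dir="${1:-/etc/ssh}"
  for type in rsa ecdsa ed25519; do
    key="$dir/ssh_host_${type}_key"
    # Report only the files that do not exist
    [ -f "$key" ] || printf '%s\n' "$key"
  done
}
```

If the function prints any paths, `sudo ssh-keygen -A` creates every missing default host key in one step, using an empty passphrase.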
Different networks may require MTU adjustments. Try this diagnostic sequence:
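# Find the largest payload that passes unfragmented (reduce -s by 10 until ping succeeds)
# Linux syntax; on macOS use: ping -D -s 1472 -c 3 your.server.ip
ping -M do -s 1472 -c 3 your.server.ip
# Set temporarily for testing:
sudo ifconfig en0 mtu 1400
# For a permanent change (macOS):
sudo networksetup -setMTU en0 1400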
sudo networksetup -setMTU en0 1400
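The 1472 in the ping command is not arbitrary: it is a 1500-byte MTU minus the 20-byte IPv4 header and the 8-byte ICMP echo header. The sketch below captures that arithmetic and wraps the "reduce by 10" loop in a function. Both function names are hypothetical, the probe uses Linux ping flags (`-M do`, `-W`), and the 576-byte floor is an assumption chosen for illustration.

```shell
# icmp_payload_for_mtu MTU: ping payload size that exactly fills an IPv4 packet of MTU bytes.
# 28 = 20-byte IPv4 header + 8-byte ICMP echo header.
icmp_payload_for_mtu() {
  echo $(( $1 - 28 ))
}

# probe_mtu HOST [START_MTU]: step the MTU down by 10 until a don't-fragment ping passes.
# Hypothetical helper using Linux ping flags; prints the first MTU that works.
probe_mtu() {
  host="$1"
  mtu="${2:-1500}"
  while [ "$mtu" -ge 576 ]; do
    if ping -M do -c 1 -W 2 -s "$(icmp_payload_for_mtu "$mtu")" "$host" >/dev/null 2>&1; then
      echo "$mtu"
      return 0
    fi
    mtu=$(( mtu - 10 ))
  done
  return 1
}
```

For example, `probe_mtu your.server.ip` would print the first MTU in the 1500, 1490, 1480, ... sequence whose ping gets through, which is a reasonable starting value for the interface setting.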
Create a detailed connection profile using:
ssh -vvv -o ConnectTimeout=30 -o ConnectionAttempts=3 \
-o ServerAliveInterval=15 -o ServerAliveCountMax=3 \
user@server.example.com
Key timeout parameters to experiment with:
- IPQoS (throughput/interactive)
- ConnectTimeout (30s in the command above; when unset, ssh falls back to the system TCP timeout)
- ServerAliveInterval (recommended 15-30)
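When tuning these values, it helps to know how long the client will wait before giving up on a silent server: approximately ServerAliveInterval multiplied by ServerAliveCountMax. A small sketch of that arithmetic follows; `alive_timeout` is a hypothetical helper name.

```shell
# alive_timeout INTERVAL COUNT: approximate seconds before ssh disconnects
# from a server that has stopped answering keepalive probes.
alive_timeout() {
  echo $(( $1 * $2 ))
}

# Values from the ssh -vvv command above: 15-second interval, 3 probes
alive_timeout 15 3   # prints 45
```

A shorter product surfaces a dead path sooner, while a longer one tolerates brief outages on a flaky link.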
When standard SSH fails, try these fallbacks:
# Try different cipher suites
ssh -c aes128-ctr user@host
# Use alternative authentication:
ssh -o PreferredAuthentications=keyboard-interactive user@host
# Test through jump host:
ssh -J gateway.example.com target.example.com
Check for silent packet drops using these Ubuntu commands:
# Monitor conntrack entries
sudo conntrack -E -p tcp --dport 22
# Check firewall logs
sudo journalctl -k --grep="DROP" --since "1 hour ago"
# Inspect TCP window scaling
ss -itmp '( dport = :ssh )'
Recently, while traveling between countries, I encountered a peculiar SSH issue where connections would fail after successful public key authentication. The debug logs showed:
debug1: Authentication succeeded (publickey).
debug2: channel 0: open confirm rwindow 0 rmax 32768
Write failed: Broken pipe
This was particularly frustrating because:
- Authentication succeeds (visible in auth.log)
- Works fine from other networks
- Local datacenter SSH works between servers
- Console login remains functional
Packet captures revealed the TCP handshake completes normally, with the break occurring during the encrypted session establishment phase. The tcpdump output shows normal SYN/SYN-ACK/ACK exchange, followed by encrypted payloads before the stall.
Key observations from the network traces:
19:00:41.760341 IP [redacted_ip].ssh > 192.168.1.2.50409: Flags [P.], seq 1490:1674, ack 22, win 114
19:01:34.714519 IP 192.168.1.2.50409 > [redacted_ip].ssh: Flags [P.], seq 2702:3162, ack 2790, win 4096
The auth.log contained one notable warning:
error: Could not load host key: /etc/ssh/ssh_host_ed25519_key
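This means sshd cannot offer an Ed25519 host key and falls back to its other host key types, such as RSA or ECDSA. To regenerate all host keys on Debian/Ubuntu (clients will then see a changed-host-key warning and need to update known_hosts):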
sudo rm /etc/ssh/ssh_host_*
sudo dpkg-reconfigure openssh-server
Several client-side factors can contribute to this behavior:
- TCP Keepalives: Add these to ~/.ssh/config:
Host *
    ServerAliveInterval 60
    ServerAliveCountMax 5
    TCPKeepAlive yes
- Cipher Selection: Force modern ciphers:
ssh -oCiphers=chacha20-poly1305@openssh.com,aes256-gcm@openssh.com user@host
- MTU Issues: Try lowering MTU:
sudo ifconfig en0 mtu 1400
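After editing ~/.ssh/config, it is worth confirming which values the client will actually use, since a more specific Host block or a command-line option can override them. One way is to filter the output of `ssh -G`, which prints the resolved configuration without connecting. This sketch assumes an OpenSSH client recent enough to support `-G`; `keepalive_opts` is a hypothetical helper name.

```shell
# keepalive_opts: reduce `ssh -G` output (read from stdin) to the options discussed above.
# ssh -G prints option names in lowercase, so the match is anchored on those names.
keepalive_opts() {
  grep -Ei '^(serveraliveinterval|serveralivecountmax|tcpkeepalive|ciphers) '
}

# Usage (user@host is a placeholder):
#   ssh -G user@host | keepalive_opts
```

Reading from stdin keeps the filter independent of any particular host, so the same function works for comparing several Host entries.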
Network security devices often interfere with SSH:
Device Type | Common Issues | Test Command |
---|---|---|
Stateful Firewall | Aggressive connection timeouts | ssh -o ConnectTimeout=30 |
IDS/IPS | SSH version filtering | ssh -o "Protocol 2" |
NAT Gateway | TCP RST injection | sudo tcpdump 'tcp[tcpflags] & (tcp-rst) != 0' |
For persistent cases, we need deeper inspection:
# Client-side debug
strace -f -e trace=network -s 10000 -o ssh.strace ssh -vvv user@host
# Server-side monitoring
sudo journalctl -u ssh --follow --output=cat
Particularly watch for:
- TCP retransmissions in tcpdump
- SELinux/AppArmor denials in system logs
- Resource exhaustion (sshd memory usage)
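For the first item, a saved text capture can be scanned for likely retransmissions. The sketch below uses a simple heuristic, not tcpdump's own analysis: it counts data segments whose source, destination, and sequence range have already appeared earlier in the capture. `count_retransmits` is a hypothetical helper, and it assumes tcpdump's default one-line-per-packet text output.

```shell
# count_retransmits: read tcpdump text output on stdin and count data segments
# that repeat an earlier (source, destination, seq range) combination.
# Heuristic only; segments without a seq range (SYNs, bare ACKs) are ignored.
count_retransmits() {
  awk '
    {
      for (i = 1; i < NF; i++) {
        if ($i == "seq" && $(i + 1) ~ /^[0-9]+:[0-9]+,?$/) {
          key = $3 " " $5 " " $(i + 1)   # source, destination, seq range
          if (seen[key]++) dups++
        }
      }
    }
    END { print dups + 0 }
  '
}

# Usage (capture.txt is a placeholder for saved tcpdump output):
#   count_retransmits < capture.txt
```

A non-zero count in the direction of client to server, combined with no matching ACKs from the server, would be consistent with packets being dropped somewhere along the path.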
When standard SSH fails, consider these workarounds:
# Use HTTP CONNECT proxy
ssh -o ProxyCommand="nc -X connect -x proxy:3128 %h %p" user@host
# Try mosh for unstable connections
mosh --ssh="ssh -p 22" user@host
# Web-based fallback
sudo apt install shellinabox