Troubleshooting SSH Connection Stuck at “expecting SSH2_MSG_KEX_DH_GEX_REPLY” on EC2 Instance



You've set up your EC2 instance correctly - SSH port 22 open, public key in ~/.ssh/authorized_keys, permissions properly configured. Yet when you try to connect using verbose debugging:

ssh -vvv ec2-user@your-ec2-instance

The connection hangs indefinitely at:

debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY

This message means the SSH client has sent its Diffie-Hellman group exchange message (SSH2_MSG_KEX_DH_GEX_INIT) and is waiting for a reply from the server that never arrives. The handshake is stalled in the key exchange phase, before authentication starts.
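
To confirm where the handshake stops, it can help to reduce the verbose output to its key exchange lines. The following is a minimal sketch: the function name last_kex_stage is made up for illustration, and the pattern assumes the debug1: line format that OpenSSH clients print.

```shell
#!/bin/sh
# Hypothetical helper: read `ssh -vvv` output on stdin and print the last
# key-exchange-related debug1 line, i.e. the point where the handshake stopped.
last_kex_stage() {
  grep -E '^debug1: (expecting |got |kex: |SSH2_MSG_)' | tail -n 1
}

# Example, commented out because it needs a reachable host
# (assumes the coreutils `timeout` command is available):
# timeout 20 ssh -vvv ec2-user@your-ec2-instance true 2>&1 | last_kex_stage
```

If the printed line is the "expecting" message from above, the stall is in the group exchange reply; a different last line points at another stage of the handshake.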

1. Network-Level Blocking

Despite port 22 being open, intermediate networking components might interfere:

  • Check Security Groups and NACLs for both inbound AND outbound rules (NACLs are stateless, so return traffic on ephemeral ports must be allowed explicitly)
  • Verify the instance has a proper public IP/Elastic IP
  • Test with a simplified security group temporarily

2. SSH Configuration Issues

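The server-side SSH configuration might need adjustment. Try adding these lines to /etc/ssh/sshd_config (diffie-hellman-group14-sha1 is a legacy SHA-1 based algorithm, so treat it as a diagnostic fallback rather than a permanent setting):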

KexAlgorithms diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha1
Ciphers aes128-ctr,aes192-ctr,aes256-ctr
MACs hmac-sha2-256,hmac-sha2-512

Then restart the SSH daemon (the unit is named ssh rather than sshd on Ubuntu and Debian):

sudo systemctl restart sshd
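
Because a bad sshd_config can lock you out completely, it is worth validating the change before running the restart above. The sketch below assumes you run it on the instance; unsupported_algos is a hypothetical helper name, and it relies on `ssh -Q kex` listing the algorithms of the installed OpenSSH build, which normally matches what sshd from the same package supports.

```shell
#!/bin/sh
# Hypothetical helper: given a comma-separated algorithm list as $1 and the
# supported algorithm names on stdin (one per line), print every requested
# name that is NOT supported.
unsupported_algos() {
  wanted=$1
  supported=$(cat)
  old_ifs=$IFS
  IFS=,
  for algo in $wanted; do
    # -x matches the whole line, -F treats the name as a fixed string
    printf '%s\n' "$supported" | grep -qxF -- "$algo" || printf '%s\n' "$algo"
  done
  IFS=$old_ifs
}

# Example, commented out because it needs OpenSSH and root access:
# ssh -Q kex | unsupported_algos "diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha1"
# sudo sshd -t && sudo systemctl restart sshd   # restart only if the config parses
```

An empty result means every listed algorithm is known to the local build. `sshd -t` only checks that the file parses, so keep an existing session open until a new connection succeeds.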

3. Resource Constraints

Small instance types might struggle with DH key generation:

  • Check CPU usage during connection attempts
  • Consider upgrading from t2.micro to t2.small if consistently failing
  • Monitor /var/log/secure (Amazon Linux, RHEL, CentOS) or /var/log/auth.log (Ubuntu, Debian) for error messages

When basic fixes don't work, go deeper:

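# Watch SSH traffic on the instance for missing or retransmitted packets
# (the interface may be named ens5 rather than eth0 on newer instance types)
sudo tcpdump -i eth0 port 22 -n

# Verify sshd is listening on port 22 (use ss -tlnp if netstat is not installed)
sudo netstat -tlnp | grep ':22 '

# Test with an alternate key exchange algorithm
ssh -oKexAlgorithms=diffie-hellman-group14-sha1 user@host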

Amazon's environment adds some unique factors:

  • Metadata service interference (rarely related to SSH; if you suspect it, try making IMDSv2 optional temporarily)
  • ENI attachment issues (stop/start instance to force new network setup)
  • EBS volume latency affecting key generation

When your SSH client gets stuck at debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY, the key exchange phase of the SSH handshake has stalled. This happens after the TCP connection is established but before authentication begins, so keys and file permissions are not the cause.


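# First verify basic connectivity.
# ping only succeeds if the security group allows ICMP;
# a healthy sshd answers telnet with a banner beginning SSH-2.0-
ping your-ec2-instance.com
telnet your-ec2-instance.com 22

If the port answers with a banner, check your SSH client version: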


ssh -V
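
Comparing the client version with the server's is easier once both are reduced to a bare version number. This is a small sketch under stated assumptions: openssh_version is a hypothetical helper, and it only recognizes strings of the form OpenSSH_<major>.<minor>, optionally followed by a p<number> suffix.

```shell
#!/bin/sh
# Hypothetical helper: extract the OpenSSH version number from text on stdin,
# such as `ssh -V` output or the server version line in `ssh -v` output.
openssh_version() {
  sed -nE 's/.*OpenSSH_([0-9]+\.[0-9]+(p[0-9]+)?).*/\1/p' | head -n 1
}

# Examples, commented out because they need ssh and a reachable host.
# ssh -V writes to stderr, hence the redirect:
# ssh -V 2>&1 | openssh_version
# ssh -v user@host true 2>&1 | grep 'remote software version' | openssh_version
```

A large gap between the two versions makes a key exchange algorithm mismatch more plausible.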

Modify /etc/ssh/sshd_config on your EC2 instance:


# Ensure these settings are present:
KexAlgorithms curve25519-sha256,ecdh-sha2-nistp256,ecdh-sha2-nistp384,diffie-hellman-group-exchange-sha256
Ciphers chacha20-poly1305@openssh.com,aes256-gcm@openssh.com,aes128-gcm@openssh.com
MACs hmac-sha2-512-etm@openssh.com,hmac-sha2-256-etm@openssh.com

For diagnostic purposes, try forcing different key exchange methods:


ssh -oKexAlgorithms=diffie-hellman-group-exchange-sha256 user@host
ssh -oKexAlgorithms=ecdh-sha2-nistp256 user@host
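
Because a stalled handshake hangs rather than failing, trying several algorithms by hand is slow. The commands above can be wrapped in a loop with a time limit, as in this sketch. try_kex is a hypothetical helper, the 15 second limit is an arbitrary choice, and the coreutils `timeout` command is assumed to be installed. A non-zero ssh status other than the timeout can also mean that authentication failed after a successful key exchange.

```shell
#!/bin/sh
# Hypothetical helper: attempt a connection with a single key exchange
# algorithm and report whether it connected, timed out, or failed otherwise.
try_kex() {
  host=$1
  algo=$2
  status=0
  # BatchMode=yes prevents password prompts from blocking the test
  timeout 15 ssh -o BatchMode=yes -o KexAlgorithms="$algo" "$host" true \
    >/dev/null 2>&1 || status=$?
  case $status in
    0)   echo "$algo: connected" ;;
    124) echo "$algo: timed out (handshake likely stalled)" ;;
    *)   echo "$algo: ssh exited with status $status" ;;
  esac
}

# Example, commented out because it needs a reachable host:
# for algo in diffie-hellman-group-exchange-sha256 ecdh-sha2-nistp256; do
#   try_kex user@host "$algo"
# done
```

If the group exchange algorithm times out while an ECDH algorithm connects, the problem is specific to the DH group exchange messages rather than to SSH connectivity in general.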

Check your Security Group rules for the instance:


# Sample AWS CLI command to verify SG rules:
aws ec2 describe-security-groups --group-ids sg-xxxxxxxx --query 'SecurityGroups[].IpPermissions[]'

Use tcpdump to capture the network traffic:


sudo tcpdump -i eth0 -w ssh.pcap port 22

Analyze with Wireshark to see exactly where the handshake fails.

As a temporary workaround, you can use AWS Session Manager, provided the SSM Agent is running on the instance and its IAM role permits Session Manager access:


aws ssm start-session --target i-xxxxxxxxxxxxx

Update both client and server SSH packages:


# On Ubuntu/Debian (upgrade only the SSH packages, not the whole system):
sudo apt update && sudo apt install --only-upgrade openssh-client openssh-server

# On RHEL/CentOS:
sudo yum update openssh