Troubleshooting SSH Hostname Resolution Failures When DNS Lookup Works for Ping But Not SSH



During our infrastructure migration from Solaris to Linux, we encountered a peculiar networking issue where:

  • ping server.idmz.example.com resolves successfully
  • dig server.idmz.example.com returns correct records
  • But ssh server.idmz.example.com fails with "Could not resolve hostname"

First, let's verify the DNS resolution chain:

# Check DNS resolution order
$ grep '^hosts' /etc/nsswitch.conf
hosts:      files dns

# Verify DNS servers
$ cat /etc/resolv.conf
search example.org
nameserver 192.168.1.1
nameserver 192.168.1.2
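
dig sends its queries straight to the nameservers, while ssh resolves names through the C library's getaddrinfo() and therefore through nsswitch.conf and /etc/hosts. A minimal sketch for checking that path directly, assuming getent is available; both function names are illustrative and not part of any tool:

```shell
# Print the lookup sources on the "hosts:" line of an nsswitch.conf-style file.
# The file argument is optional and defaults to /etc/nsswitch.conf.
nss_hosts_sources() {
    awk '$1 == "hosts:" { $1 = ""; sub(/^ +/, ""); print; exit }' "${1:-/etc/nsswitch.conf}"
}

# Resolve a name through NSS and getaddrinfo(), the path ssh takes.
# A name that dig resolves but this does not points at the local
# resolver configuration rather than at the DNS servers.
resolve_like_ssh() {
    getent ahosts "$1"
}

# Usage:
#   nss_hosts_sources
#   resolve_like_ssh server.idmz.example.com
```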

One common culprit in such cases is IPv6 resolution. By default ssh asks the resolver for both A and AAAA records, so a slow or failing IPv6 lookup can break the connection even when IPv4 works:

# Try forcing IPv4
$ ssh -4 server.idmz.example.com

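# Or disable IPv6 in the SSH config. Appending works only if ~/.ssh/config has
# no Host blocks; otherwise add the line above the first Host entry by hand.
$ echo "AddressFamily inet" >> ~/.ssh/config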

GSSAPI (Kerberos) authentication can trigger additional forward and reverse DNS lookups, which may stall or fail:

# Check current SSH configuration
$ ssh -G server.idmz.example.com

# Disable GSSAPI if needed
$ ssh -o GSSAPIAuthentication=no server.idmz.example.com
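
ssh -G prints the fully evaluated client configuration without connecting, so it is a convenient way to confirm that an override really applies to a given host. A small sketch of a filter for that output; effective_option is a hypothetical helper name:

```shell
# Print the value of one option from `ssh -G` output read on stdin.
# ssh -G emits one "keyword value" pair per line with lowercase keywords,
# so the requested name is lowercased before matching.
effective_option() {
    awk -v key="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" \
        '$1 == key { print $2; exit }'
}

# Usage:
#   ssh -G server.idmz.example.com | effective_option AddressFamily
#   ssh -G server.idmz.example.com | effective_option GSSAPIAuthentication
```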

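ssh's timeout options cover the TCP connection, not the name lookup; lookup timeouts come from the resolver (options timeout and attempts in /etc/resolv.conf). Raising the connection timeout is still worth trying to rule out a slow network path:

# Increase the connection timeout and the number of connection attempts
$ ssh -o ConnectTimeout=30 -o ConnectionAttempts=5 server.idmz.example.com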
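
To see whether the delay sits in the name lookup or in the connection itself, it can help to time the lookup on its own. A rough sketch, assuming getent is available; elapsed_seconds is an illustrative helper and only has one-second resolution:

```shell
# Print how many whole seconds a command takes to run.
# The command's own output is discarded.
elapsed_seconds() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Usage:
#   elapsed_seconds getent ahosts server.idmz.example.com
```

A result close to a multiple of the resolver timeout suggests that at least one query is timing out before an answer arrives.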

After extensive testing, we found the most reliable solution was to modify the SSH client configuration:

# /etc/ssh/ssh_config or ~/.ssh/config
Host *.idmz.example.com
    AddressFamily inet
    GSSAPIAuthentication no
    CheckHostIP no
    ConnectTimeout 20
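    # "StrictHostKeyChecking no" disables host key verification and does not
    # affect name resolution; accept-new is safer if prompts are the concern.
    StrictHostKeyChecking accept-new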

To confirm everything works as expected:

$ ssh -v server.idmz.example.com
[...]
debug1: Connecting to server.idmz.example.com [192.168.1.3] port 22.
debug1: Connection established.

When migrating from Solaris to Linux jump hosts, we hit a case where standard networking tools (ping, dig) resolved hostnames in idmz.example.com while the SSH client failed with "Could not resolve hostname". It manifested as follows:

$ dig +short server.idmz.example.com
192.168.1.3

$ ssh -v server.idmz.example.com
OpenSSH_8.9p1, OpenSSL 3.0.7 1 Nov 2022
ssh: Could not resolve hostname server.idmz.example.com: [...]

Key observations from packet captures (tcpdump -n -i any port 53):

  • Each ssh invocation sends fresh A and AAAA queries; the glibc stub resolver does not cache, so record TTLs have no effect unless a local caching daemon is running
  • The resolver falls back to non-authoritative NS records when authoritative lookups fail
  • DMZ subdomains where SSH works (such as jdmz.example.com) return an SOA record, while idmz.example.com does not

# Compare working vs broken domains:
$ dig +nocmd +noall +answer SOA jdmz.example.com
jdmz.example.com. 3600 IN SOA ns1.jdmz.example.com. admin.example.com. 2023121401 ...

$ dig +nocmd +noall +answer SOA idmz.example.com
# (no output: the answer section contains no SOA record)
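
The same comparison can be run over several zones at once. A sketch, assuming dig is installed; the zone names passed in and both function names are placeholders:

```shell
# Turn the answer from `dig +short SOA <zone>` into a one-line verdict.
# $1 is the zone name, $2 is the (possibly empty) dig output.
soa_status() {
    if [ -n "$2" ]; then
        echo "$1: SOA present"
    else
        echo "$1: no SOA returned"
    fi
}

# Query every zone given as an argument and report the result.
compare_soa() {
    for z in "$@"; do
        soa_status "$z" "$(dig +short SOA "$z")"
    done
}

# Usage:
#   compare_soa jdmz.example.com idmz.example.com
```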

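The OpenSSH client does not ship its own resolver; it calls the C library's getaddrinfo(), whereas dig queries the nameservers directly. That accounts for most of the difference between the tools:

  1. With the default AddressFamily any, both A and AAAA records are requested, so a AAAA query that times out or returns SERVFAIL can delay or fail the lookup even though the A record is fine
  2. Lookup timeouts come from the resolver (5s per attempt by default in glibc, tunable with options timeout and attempts), not from ssh
  3. Resolver options such as edns0 in /etc/resolv.conf are applied by glibc, not by ssh, so ssh follows whatever the system resolver configuration says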

Workaround configuration for /etc/ssh/ssh_config:

Host *.idmz.example.com
    AddressFamily inet
    ConnectTimeout 15
    CheckHostIP no
    GSSAPIAuthentication no

In environments with strict DNSSEC validation (common in DMZs), missing or mismatched DS records can break the chain of trust and cause resolution failures. Diagnostic steps:

$ delv server.idmz.example.com
;; resolution failed: broken trust chain

# Confirm that validation is the cause by repeating the query with checking
# disabled (diagnostic only; it does not change what ssh sees):
$ dig +cd server.idmz.example.com
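
Another way to see whether a validating resolver accepted an answer is the ad (authenticated data) flag in dig's response header. A sketch that checks dig output for it; has_ad_flag is a hypothetical helper, and the flag only appears when the configured resolver performs validation:

```shell
# Succeed if dig output on stdin carries the "ad" flag in its header line,
# which looks like ";; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, ...".
has_ad_flag() {
    grep -Eq '^;; flags:[^;]* ad[ ;]'
}

# Usage:
#   dig +dnssec server.idmz.example.com | has_ad_flag && echo "validated"
```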

For enterprise environments, implement either:

# Option 1: Local host override
echo "192.168.1.3 server.idmz.example.com" | sudo tee -a /etc/hosts

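# Option 2: Resolver settings with DMZ search domains
# glibc reads only /etc/resolv.conf. If NetworkManager or systemd-resolved
# manages that file, set the same values in that tool instead.
cat << EOF | sudo tee /etc/resolv.conf
search idmz.example.com example.com
nameserver 192.168.1.1
options timeout:2 attempts:1
EOF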
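
Re-running the tee -a command from Option 1 appends a duplicate line each time. A minimal sketch of an idempotent variant; add_hosts_entry is an illustrative name, and writing to the real /etc/hosts requires root:

```shell
# Append "IP FQDN" to a hosts file unless the name is already listed on a
# non-comment line. The third argument defaults to /etc/hosts.
add_hosts_entry() {
    ip=$1 fqdn=$2 file=${3:-/etc/hosts}
    if awk -v name="$fqdn" '
        !/^[ \t]*#/ { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }' "$file"; then
        echo "$fqdn already present in $file"
    else
        printf '%s %s\n' "$ip" "$fqdn" >> "$file"
    fi
}

# Usage (as root):
#   add_hosts_entry 192.168.1.3 server.idmz.example.com
```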

For Ansible-managed environments:

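- name: Configure DNS overrides
  ansible.builtin.blockinfile:
    path: /etc/hosts
    # A dedicated marker keeps this block separate from other managed blocks
    marker: "# {mark} ANSIBLE MANAGED DMZ HOSTS"
    block: |
      {% for server in dmz_servers %}
      {{ server.ip }} {{ server.fqdn }}
      {% endfor %}
  become: true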