Troubleshooting WGET/CURL Host Resolution Failure on Debian: DNS Configuration Deep Dive



Here's what makes this case particularly puzzling: both machines share identical network configurations but exhibit different behaviors when resolving hosts through wget/curl. The key observations:

# Machine 1 (Working):
$ cat /etc/resolv.conf
nameserver 8.8.8.8
nameserver 8.8.4.4

$ wget google.com
--2009-10-20 16:43:55--  http://google.com/
Resolving google.com... 74.125.53.100, 74.125.45.100, 74.125.67.100

# Machine 2 (Failing):
$ cat /etc/resolv.conf
nameserver 8.8.8.8
nameserver 8.8.4.4

$ wget google.com
--2009-10-20 16:38:36--  http://google.com/
Resolving google.com... failed: Name or service not known.

Since basic DNS configuration appears identical, let's dig deeper into potential culprits:

# Compare the results of the DNS lookup tools:
$ nslookup google.com
$ dig google.com
$ host google.com

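The discrepancy comes from how each tool resolves names. nslookup, dig, and host send queries directly to the servers listed in /etc/resolv.conf, whereas wget and curl typically call glibc's getaddrinfo(), which follows the lookup order in /etc/nsswitch.conf. Check the name service switch configuration: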

# Examine /etc/nsswitch.conf
$ grep '^hosts' /etc/nsswitch.conf
# Typical Debian output:
hosts:          files dns
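
The same check can be scripted. The sketch below, assuming a standard awk and grep, prints the lookup sources on the hosts line and reports whether dns is among them. The function names hosts_sources and uses_dns are illustrative and not part of any standard tool.

```shell
# Illustrative helpers (hypothetical names): inspect the hosts line
# of an nsswitch.conf-style file.

# Print each lookup source on the "hosts:" line, one per line.
hosts_sources() {
    awk '$1 == "hosts:" { for (i = 2; i <= NF; i++) print $i; exit }' \
        "${1:-/etc/nsswitch.conf}"
}

# Succeed only if "dns" is listed as a source for host lookups.
uses_dns() {
    hosts_sources "$1" | grep -qx 'dns'
}
```

Calling uses_dns with no argument checks /etc/nsswitch.conf; passing a path lets you test a copy taken from another machine.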

Even with identical iptables rules, check for subtle differences in firewall behavior:

# Verify DNS port access
$ sudo tcpdump -i eth0 -n port 53
# On working machine, you should see DNS queries/responses

Use these commands to pinpoint resolution failures:

# Check DNS resolution via libc
$ getent hosts google.com

# Verify DNS server reachability
$ ping -c 3 8.8.8.8
# nc -z tests TCP port 53 only; most DNS queries travel over UDP
$ nc -zv 8.8.8.8 53

# Test with direct IP to bypass DNS
$ wget http://142.250.190.78

When standard DNS fails, consider these approaches:

# Force IPv4 resolution
$ wget -4 google.com

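# Bypass DNS: connect to a known web server IP and send the Host header
$ wget --header="Host: google.com" http://142.250.190.78/

# Switch resolv.conf to another resolver temporarily (this overwrites the file, so back it up first)
$ sudo cp /etc/resolv.conf /etc/resolv.conf.bak
$ echo "nameserver 1.1.1.1" | sudo tee /etc/resolv.conf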
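
Since replacing resolv.conf discards whatever it held before, it helps to make the swap reversible. A minimal sketch, assuming the file is writable by the caller (run as root for /etc/resolv.conf); swap_resolver and restore_resolver are hypothetical helper names:

```shell
# Hypothetical helpers: replace a resolver file reversibly.

# Save the current file as FILE.bak, then point it at one nameserver.
swap_resolver() {
    local file="$1" nameserver="$2"
    cp -p "$file" "$file.bak" || return 1
    printf 'nameserver %s\n' "$nameserver" > "$file"
}

# Put the saved contents back and remove the backup.
# Writing through the existing file keeps a symlinked resolv.conf intact.
restore_resolver() {
    local file="$1"
    [ -f "$file.bak" ] || return 1
    cat "$file.bak" > "$file" && rm -f "$file.bak"
}
```

For example, swap_resolver /etc/resolv.conf 1.1.1.1 before a test run and restore_resolver /etc/resolv.conf afterwards.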

After identifying the root cause, implement these permanent fixes:

# Install dnsutils for better diagnostics
$ sudo apt-get install dnsutils

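# Configure static DNS in the interfaces file
# (dns-nameservers is only honoured when the resolvconf package is installed)
$ sudo nano /etc/network/interfaces
# Add under the relevant iface stanza:
dns-nameservers 8.8.8.8 8.8.4.4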

I recently encountered a baffling situation where two Debian 5.0 machines on the same subnet with identical network configurations exhibited different behavior with wget/curl. Both systems shared:

# /etc/resolv.conf
nameserver 8.8.8.8
nameserver 8.8.4.4

# route -n
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.1.1     0.0.0.0         UG    0      0        0 eth0
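
Comparing files by eye is easy to get wrong, and resolv.conf is not the only file that affects name lookups. The sketch below prints a checksum for each file that commonly influences resolution, so the output from two machines can be diffed. The function name resolver_fingerprint and the choice of files are assumptions for illustration.

```shell
# Hypothetical helper: print a checksum per resolver-related file.
# An optional first argument is a root directory prefix, which is
# useful when checking a copied tree instead of the live system.
resolver_fingerprint() {
    local root="${1:-}" f
    for f in etc/resolv.conf etc/nsswitch.conf etc/hosts; do
        if [ -r "$root/$f" ]; then
            # cksum on stdin prints "CRC SIZE" with no file name
            printf '%s /%s\n' "$(cksum < "$root/$f")" "$f"
        else
            printf 'missing /%s\n' "$f"
        fi
    done
}
```

Run it on each machine and diff the two outputs; a differing checksum points at the file to inspect.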

First, let's verify the basic connectivity on the problematic machine (machine2):

# ping -c 3 google.com
PING google.com (142.250.189.46) 56(84) bytes of data.
64 bytes from fra16s48-in-f14.1e100.net (142.250.189.46): icmp_seq=1 ttl=117 time=12.3 ms
...
# host google.com
google.com has address 142.250.189.46
google.com has IPv6 address 2a00:1450:4013:c07::66

Yet wget fails spectacularly:

# wget google.com
--2009-10-20 16:38:36--  http://google.com/
Resolving google.com... failed: Name or service not known.
wget: unable to resolve host address `google.com'
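
The contrast between host and wget is itself a clue: host queries DNS servers directly, while wget resolves through libc. A rough sketch of that comparison follows, assuming getent and host are installed; diagnose and compare_resolvers are hypothetical names, and the messages are suggestions rather than definitive conclusions.

```shell
# Hypothetical helpers: compare libc (NSS) resolution with direct DNS.

# Map the two exit statuses to a hint about where to look next.
diagnose() {
    local nss_status="$1" dns_status="$2"
    if [ "$nss_status" -eq 0 ] && [ "$dns_status" -eq 0 ]; then
        echo "both paths resolve"
    elif [ "$dns_status" -eq 0 ]; then
        echo "DNS answers but libc lookup fails: inspect /etc/nsswitch.conf"
    elif [ "$nss_status" -eq 0 ]; then
        echo "libc resolves without DNS: name probably comes from /etc/hosts"
    else
        echo "neither path resolves: check resolv.conf, routing, and firewall"
    fi
}

# Resolve NAME both ways and print the hint.
compare_resolvers() {
    local name="$1" nss_status=0 dns_status=0
    getent ahosts "$name" > /dev/null 2>&1 || nss_status=$?
    host "$name" > /dev/null 2>&1 || dns_status=$?
    diagnose "$nss_status" "$dns_status"
}
```

On a machine showing the symptoms described here, compare_resolvers google.com would be expected to report the second case.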

Let's examine the name resolution process more thoroughly. These commands help identify where the breakdown occurs:

# strace -e trace=network wget google.com
socket(AF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 4
connect(4, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("8.8.8.8")}, 16) = 0
sendto(4, "\304\1\0\0\1\0\0\0\0\0\0\6google\3com\0\0\1\0\1", 29, MSG_NOSIGNAL, NULL, 0) = 29

Compare network libraries between machines:

# ldd $(which wget)
linux-vdso.so.1 =>  (0x00007ffd18d8a000)
libssl.so.1.0.0 => /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0 (0x00007f8a3a6a5000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8a3a2db000)
/lib64/ld-linux-x86-64.so.2 (0x00007f8a3a8e9000)

After hours of debugging, the issue was in the name service switch configuration:

# diff nsswitch.conf.machine1 nsswitch.conf.machine2
< hosts:          files dns
> hosts:          files

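The working machine consulted /etc/hosts and then DNS, while the problematic one consulted only /etc/hosts and never queried DNS through libc. That also explains why host still worked: it talks to the DNS servers directly and ignores nsswitch.conf. Fix it by editing only the hosts line (redirecting echo into the file would wipe the passwd, group, and other entries):

# sed -i 's/^hosts:.*/hosts:          files dns/' /etc/nsswitch.conf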
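
Because /etc/nsswitch.conf also holds the passwd, group, and other databases, an edit that touches only the hosts line and keeps a backup is the safer route. A minimal sketch, assuming a standard grep and sed; ensure_dns_in_hosts is a hypothetical helper that appends dns to the existing hosts line when it is missing and leaves any other sources in place:

```shell
# Hypothetical helper: make sure the hosts line lists dns.
ensure_dns_in_hosts() {
    local file="${1:-/etc/nsswitch.conf}"
    # Nothing to do if dns already appears as a separate word
    if grep -Eq '^hosts:(.*[[:space:]])?dns([[:space:]]|$)' "$file"; then
        return 0
    fi
    # Refuse to guess when the file has no hosts line at all
    grep -q '^hosts:' "$file" || return 1
    # Keep a backup, then rewrite the file with dns appended
    cp -p "$file" "$file.bak" || return 1
    sed '/^hosts:/ s/[[:space:]]*$/ dns/' "$file.bak" > "$file"
}
```

Running it a second time changes nothing, so it can sit in a provisioning script without accumulating duplicate entries.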

For future troubleshooting, these commands are invaluable:

# Resolve through NSS, honouring /etc/nsswitch.conf
getent hosts google.com

# Resolve through getaddrinfo(), the same path wget and curl use
getent ahosts google.com

# Query a specific DNS server directly, bypassing NSS
dig @8.8.8.8 google.com

Create a verification script to check system health:

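#!/bin/bash
# Check libc name resolution first, then HTTP reachability.
check_connectivity() {
    local host="${1:-google.com}"
    # getent ahosts follows /etc/nsswitch.conf, like wget and curl
    if ! getent ahosts "$host" > /dev/null; then
        echo "Name resolution failed for $host" >&2
        echo "Checking /etc/nsswitch.conf..." >&2
        grep '^hosts' /etc/nsswitch.conf >&2
        return 1
    fi
    # Resolution works; confirm the host answers over HTTP as well
    if ! wget -q --spider "$host"; then
        echo "Resolved $host, but the HTTP request failed" >&2
        return 1
    fi
    return 0
}
check_connectivity "$@" || exit 1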