When working with multiple DNS nameservers in Ubuntu (particularly older versions like 10.04), you might encounter a frustrating behavior where the system fails to query subsequent nameservers when the primary one doesn't respond with a positive answer. This manifests when:
# Primary NS responds with NXDOMAIN for private zone
$ host host.private.example.org 10.0.0.20
Host host.private.example.org not found: 3(NXDOMAIN)
# Secondary NS has the record but never gets queried
$ host host.private.example.org 10.0.0.30
host.private.example.org has address 10.0.0.60
The issue stems from how the GNU C Library (glibc) resolver handles NXDOMAIN responses. When the first nameserver returns NXDOMAIN (non-existent domain), glibc treats this as a definitive answer and stops querying other nameservers. This differs from SERVFAIL scenarios where it would properly fail over.
Key points about this behavior:
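- Affects every application that resolves names through glibc (ping, Thunderbird, etc.); tools such as host and dig carry their own resolver code, which is why they can still query a specific server directly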
- Network Manager merely generates the resolv.conf file
- More noticeable with split DNS configurations
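The failover rule described above can be modeled in a few lines. This is a simplified sketch, not glibc's actual code: `resolve_in_order`, `fake_ask`, and the `ZONES` table are hypothetical names invented for illustration, and only timeouts and SERVFAIL are modeled as reasons to move to the next server.

```python
from typing import Callable, Optional, Sequence, Tuple

# Response codes relevant to the failover decision.
NOERROR, SERVFAIL, NXDOMAIN = "NOERROR", "SERVFAIL", "NXDOMAIN"

Reply = Tuple[str, Optional[str]]  # (rcode, address or None)


def resolve_in_order(name: str,
                     servers: Sequence[str],
                     ask: Callable[[str, str], Reply]) -> Reply:
    """Try servers in listed order, stopping at the first definitive reply.

    `ask(server, name)` is a hypothetical stand-in for sending one query;
    it returns (rcode, address) or raises TimeoutError.
    """
    last: Reply = (SERVFAIL, None)
    for server in servers:
        try:
            rcode, address = ask(server, name)
        except TimeoutError:
            continue  # no reply: move on to the next server
        if rcode == SERVFAIL:
            last = (rcode, None)
            continue  # server error: move on to the next server
        # NOERROR and NXDOMAIN are both treated as final answers.
        return rcode, address
    return last


# Toy data mirroring the example: only 10.0.0.30 knows the private zone.
ZONES = {
    "10.0.0.20": {"host.public.example.org": "10.0.0.50"},
    "10.0.0.30": {"host.private.example.org": "10.0.0.60"},
}


def fake_ask(server: str, name: str) -> Reply:
    """Answer from the toy table; unknown names yield NXDOMAIN."""
    address = ZONES[server].get(name)
    return (NOERROR, address) if address else (NXDOMAIN, None)


if __name__ == "__main__":
    servers = ["10.0.0.20", "10.0.0.30"]
    print(resolve_in_order("host.private.example.org", servers, fake_ask))
    print(resolve_in_order("host.private.example.org", servers[::-1], fake_ask))
```

With the servers in the original order the private name comes back as NXDOMAIN; with the order reversed it resolves, but the public name would then fail instead, which is why reordering alone does not fix a split-DNS setup.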
Option 1: Use a Local DNS Caching Resolver
Configure a local resolver such as dnsmasq or unbound that sends each zone's queries to the server that actually hosts it, then list only 127.0.0.1 as the nameserver in /etc/resolv.conf:
# Install dnsmasq
sudo apt-get install dnsmasq
# Configure /etc/dnsmasq.conf
server=/public.example.org/10.0.0.20
server=/private.example.org/10.0.0.30
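To make the effect of the two server= lines concrete, here is a rough model of per-domain routing: the query name is matched against the configured domains on label boundaries, and the most specific match decides the upstream. This is only an illustration of the idea, not dnsmasq's implementation; `pick_upstream` and `ROUTES` are hypothetical names.

```python
from typing import Dict, Optional


def pick_upstream(name: str, routes: Dict[str, str],
                  default: Optional[str] = None) -> Optional[str]:
    """Return the upstream for the longest domain suffix matching `name`."""
    labels = name.rstrip(".").lower().split(".")
    # Try the full name first, then drop leading labels one at a time,
    # so the most specific configured domain wins.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in routes:
            return routes[suffix]
    return default


# Mirrors the two server= lines from the configuration above.
ROUTES = {
    "public.example.org": "10.0.0.20",
    "private.example.org": "10.0.0.30",
}

if __name__ == "__main__":
    for query in ("host.public.example.org", "host.private.example.org"):
        print(query, "->", pick_upstream(query, ROUTES))
```

Because each name is sent to exactly one server, an NXDOMAIN from the "wrong" server can no longer mask a record that exists on the other one.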
Option 2: Modify Resolver Timeout Settings
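Adjust timeout and retry values in /etc/resolv.conf (requires disabling NetworkManager overwrites). This speeds up failover when a server is unreachable, but it does not change how NXDOMAIN is handled: rotate only spreads queries across both servers, so lookups for private names become intermittent rather than reliable: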
options timeout:1 attempts:3 rotate
nameserver 10.0.0.20
nameserver 10.0.0.30
Option 3: Conditional Forwarding with BIND
For advanced setups, configure a local BIND instance:
zone "public.example.org" {
type forward;
forwarders { 10.0.0.20; };
};
zone "private.example.org" {
type forward;
forwarders { 10.0.0.30; };
};
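For newer Ubuntu versions (17.10+), systemd-resolved can route queries by domain. DNS servers and routing domains are assigned per network interface, not per domain, and releases whose systemd is older than 239 ship the systemd-resolve command instead of resolvectl. Servers on the same interface are treated as interchangeable, so this helps mainly when the two nameservers are reached through different links (eth0 below is a placeholder interface name):
# Configure with resolvectl
resolvectl dns eth0 10.0.0.20 10.0.0.30
resolvectl domain eth0 '~public.example.org' '~private.example.org'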
Remember that these solutions may require adjusting your firewall rules to allow DNS traffic between your local resolver and the upstream servers.
In Ubuntu 10.04 with NetworkManager, I've encountered a peculiar DNS resolution behavior where the system refuses to query the secondary nameserver when the primary fails to resolve a domain. Here's the exact symptom:
# Current resolv.conf configuration
search example.org
nameserver 10.0.0.20 # public nameserver (public.example.org)
nameserver 10.0.0.30 # private nameserver (private.example.org)
The resolution works asymmetrically:
# Works when querying public domain (primary NS)
$ ping host.public.example.org
PING host.public.example.org (10.0.0.50) 56(84) bytes of data.
# Fails when querying private domain (secondary NS)
$ ping host.private.example.org
ping: unknown host host.private.example.org
# But dig confirms the record exists
$ dig @10.0.0.30 host.private.example.org
;; ANSWER SECTION:
host.private.example.org. 3600 IN A 10.0.0.60
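The gap between ping and dig in the output above comes from which resolver each tool uses: ping goes through the C library, while dig talks to the named server itself. A small sketch for checking what the C library path returns, using only Python's standard library (`libc_lookup` is a hypothetical helper name):

```python
import socket
from typing import List


def libc_lookup(name: str) -> List[str]:
    """Resolve `name` to IPv4 addresses via the system resolver (getaddrinfo).

    Returns an empty list when the lookup fails, which is the case that
    ping reports as "unknown host".
    """
    try:
        infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    for host in ("host.public.example.org", "host.private.example.org"):
        print(host, libc_lookup(host) or "lookup failed")
```

If this reports a failure for a name that dig can resolve against 10.0.0.30, the record exists and the problem lies in how the system resolver walks the nameserver list.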
NetworkManager (v0.8) only writes resolv.conf; the behavior comes from how the resolver reads that file:
- Nameservers are tried strictly in the order listed
- A later server is consulted only when an earlier one times out or fails (default 5s timeout per attempt)
- An NXDOMAIN reply from the first server is treated as final, so the second server is never asked
Option 1: Modify NetworkManager Configuration
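# Edit /etc/NetworkManager/NetworkManager.conf
# (dns= and rc-manager= are understood only by later NetworkManager releases,
# not by the 0.8 series shipped with Ubuntu 10.04)
[main]
dns=default
rc-manager=resolvconf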
Option 2: Use resolvconf with Custom Settings
# Install resolvconf if not present
sudo apt-get install resolvconf
# Configure custom options
echo "options timeout:1 attempts:2 rotate" | sudo tee /etc/resolvconf/resolv.conf.d/head
sudo service resolvconf restart
Option 3: Manual resolv.conf Management
# Write the desired /etc/resolv.conf first
nameserver 10.0.0.20
nameserver 10.0.0.30
options timeout:1 attempts:2 rotate
# Then make it immutable to prevent NetworkManager overwrites (undo with chattr -i)
sudo chattr +i /etc/resolv.conf
Use this Python script to test fallback behavior:
import dns.resolver  # third-party package: dnspython 2.x

# Ignore the system resolv.conf and use only the servers under test
resolver = dns.resolver.Resolver(configure=False)
resolver.nameservers = ['10.0.0.20', '10.0.0.30']
resolver.lifetime = 2  # total time budget for the lookup, in seconds

try:
    answers = resolver.resolve('host.private.example.org', 'A')
    for rdata in answers:
        print(rdata.address)
except dns.resolver.NXDOMAIN:
    print("NXDOMAIN received")
except dns.resolver.NoAnswer:
    print("No answer received")
except dns.resolver.Timeout:
    print("All nameservers timed out")
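To see which server says what, it can also help to ask each nameserver separately and compare response codes. The following is a minimal sketch using only the standard library: it hand-builds an A query in the RFC 1035 wire format and reads the RCODE from the reply header. The helper names (`build_query`, `rcode_of`, `ask`) are made up for this example, and it deliberately skips checks a real client would do, such as matching the reply ID or retrying over TCP.

```python
import socket
import struct

RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Name as length-prefixed labels, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def rcode_of(response: bytes) -> str:
    """Extract the response code from the low 4 bits of the flags field."""
    (flags,) = struct.unpack("!H", response[2:4])
    code = flags & 0x000F
    return RCODES.get(code, f"RCODE {code}")


def ask(server: str, name: str, timeout: float = 2.0) -> str:
    """Send one UDP query to `server` and report its response code."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            response, _ = sock.recvfrom(512)
        except socket.timeout:
            return "TIMEOUT"
    return rcode_of(response)


if __name__ == "__main__":
    for server in ("10.0.0.20", "10.0.0.30"):
        print(server, ask(server, "host.private.example.org"))
```

An NXDOMAIN from 10.0.0.20 next to a NOERROR from 10.0.0.30 would confirm the split-DNS situation described here.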
For production environments, consider:
- Setting up a local caching resolver (dnsmasq/unbound)
- Serving both zones from one BIND instance (for example with forward zones or views) so that every listed nameserver returns the same answers
- Upgrading to newer Ubuntu versions with improved NetworkManager