When troubleshooting CNAME resolution in our internal network (domain1), I observed an interesting behavior pattern. The resolver initially sends an A record request:
# Sample DNS query capture (tcpdump)
16:15:45.837525 IP (tos 0x0, ttl 64, id 36911, offset 0, flags [none], proto UDP (17), length 62)
myhost.domain1.40684 > dnsserver.domain1.domain: 15355+ A? cfengine.domain1. (34)
However, it receives a SERVFAIL response, even though the queried name exists as a CNAME record. This raises a fundamental question: who is responsible for following the alias, the server or the resolver?
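The query summary line in the capture packs several fields together: the number after the colon is the query ID, a trailing `+` means the RD bit is set, `A?` is the query type, and the number in parentheses is the DNS payload length. A minimal sketch for pulling those fields out is shown here; the regex and the function name `parse_tcpdump_query` are illustrative assumptions and only cover this simple query shape, not tcpdump's full output grammar.

```python
import re

# Matches the query summary tcpdump prints, for example:
#   "... > dnsserver.domain1.domain: 15355+ A? cfengine.domain1. (34)"
QUERY_RE = re.compile(
    r":\s(?P<id>\d+)(?P<flags>[+%*-]*)\s"  # query ID and optional flag characters
    r"(?P<qtype>[A-Z0-9]+)\?\s"            # query type followed by "?"
    r"(?P<qname>\S+)\s"                    # queried name
    r"\((?P<length>\d+)\)"                 # DNS payload length in bytes
)

def parse_tcpdump_query(line):
    """Return the query fields as a dict, or None if the line does not match."""
    m = QUERY_RE.search(line)
    if m is None:
        return None
    return {
        "id": int(m.group("id")),
        "rd": "+" in m.group("flags"),  # "+" marks Recursion Desired
        "qtype": m.group("qtype"),
        "qname": m.group("qname"),
        "length": int(m.group("length")),
    }
```

Applied to the captured line, this reports `rd` as true, so the failing query was a recursive one.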
In mixed-mode DNS servers (authoritative + recursive), the behavior depends on:
- Whether the RD (Recursion Desired) flag is set
- If the query is for an authoritative zone
- Server implementation (BIND, PowerDNS, etc.)
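A properly functioning server should:
1. Check whether the query name falls in a zone it is authoritative for
2. For authoritative queries with RD=0:
   - Return the CNAME in the answer section if one exists, plus the target's records when the target is in the same zone
   - Return NXDOMAIN if the name does not exist, or NOERROR with an empty answer (NODATA) if the name exists but has no record of the requested type
3. For recursive queries (RD=1), when recursion is allowed for the client:
   - Follow the CNAME chain automatically, including out-of-zone targets
   - Return the CNAME(s) plus the final A record, or an error such as SERVFAIL if a target cannot be resolved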
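The steps above can be sketched as a toy model. Everything here is hypothetical: the zone contents, the `answer` function, and the `recurse` callback that stands in for the recursive half of a mixed-mode server. Real implementations such as BIND or PowerDNS are far more involved.

```python
# Toy zone for domain1; all record values here are illustrative.
ORIGIN = "domain1."
ZONE = {
    ("cfengine.domain1.", "CNAME"): "helm02.domain2.",
    ("www.domain1.", "CNAME"): "web01.domain1.",
    ("web01.domain1.", "A"): "192.0.2.10",
}

def in_zone(name):
    return name == ORIGIN or name.endswith("." + ORIGIN)

def answer(qname, qtype, rd, recurse=None):
    """Return (rcode, answer_records) for a query against the toy zone.

    `recurse(name, qtype)` stands in for the recursive side of the server:
    it returns a list of records, or None if resolution fails.
    """
    if not in_zone(qname):
        return ("REFUSED", [])  # this sketch models the authoritative zone only
    records, name = [], qname
    for _ in range(8):  # bound the chain length so loops cannot spin forever
        if (name, qtype) in ZONE:
            records.append((name, qtype, ZONE[(name, qtype)]))
            return ("NOERROR", records)
        if qtype != "CNAME" and (name, "CNAME") in ZONE:
            target = ZONE[(name, "CNAME")]
            records.append((name, "CNAME", target))
            if in_zone(target):
                name = target  # restart the lookup at the alias target
                continue
            if not rd or recurse is None:
                return ("NOERROR", records)  # CNAME only; the client must chase it
            resolved = recurse(target, qtype)
            if resolved is None:
                return ("SERVFAIL", [])  # out-of-zone target could not be resolved
            return ("NOERROR", records + resolved)
        exists = any(owner == name for owner, _ in ZONE)
        return ("NOERROR" if exists else "NXDOMAIN", records)
    return ("SERVFAIL", [])
```

In this model, an A query with RD=1 ends in SERVFAIL whenever the out-of-zone target cannot be resolved, while a CNAME query for the same name succeeds. That is one plausible explanation for a pattern where the A lookup fails but the CNAME lookup works.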
To diagnose this properly, use these dig commands:
# Basic A record lookup (may fail)
dig @dnsserver cfengine.domain1 A +norecurse
# Explicit CNAME query
dig @dnsserver cfengine.domain1 CNAME +norecurse
# Full recursive resolution
dig @dnsserver cfengine.domain1 A +recurse
The key difference is the +norecurse/+recurse flag, which controls whether the RD bit is set in the query.
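To make the RD bit concrete, here is a minimal sketch that builds a bare DNS query by hand with and without it. The function name `build_query` is an assumption for illustration, and the packet carries no EDNS options, so it resembles a plain stub-resolver query rather than what dig sends by default.

```python
import struct

RD_FLAG = 0x0100  # Recursion Desired bit in the DNS header flags word

def build_query(qname, qtype=1, query_id=0, recurse=True):
    """Build a minimal DNS query packet (no EDNS) as bytes."""
    flags = RD_FLAG if recurse else 0
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    labels = qname.rstrip(".").split(".")
    encoded = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    encoded += b"\x00"
    # QTYPE (1 = A, 5 = CNAME) followed by QCLASS (1 = IN)
    return header + encoded + struct.pack("!HH", qtype, 1)
```

For `cfengine.domain1.` the result is 34 bytes (12 for the header, 18 for the encoded name, 4 for type and class), which matches the `(34)` shown in the capture. The two variants differ only in the flags word at bytes 2 and 3.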
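When dealing with combined authoritative/recursive servers:
# BIND named.conf example (combined authoritative + recursive server)
options {
    recursion yes;
    # allow-recursion belongs in options (or a view), not inside a zone block.
    # "any" creates an open resolver; consider restricting it to your own clients.
    allow-recursion { any; };
};
zone "domain1" {
    type master;
    file "db.domain1";
};
# No special option is needed for CNAME handling: named returns the CNAME for an
# A query and, when recursion is permitted, resolves the target itself.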
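For client-side workarounds:
# Resolver configuration (CentOS/RHEL), /etc/resolv.conf
options timeout:1 attempts:2
search domain1
# nameserver takes an IP address, not a hostname;
# 192.0.2.53 is a placeholder for the address of dnsserver.domain1
nameserver 192.0.2.53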
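Because `nameserver` lines in resolv.conf expect an IP address rather than a hostname, a quick check can catch entries the resolver will not be able to use. This is a minimal sketch; the function name `check_nameservers` is hypothetical, and it only looks at `nameserver` lines.

```python
import ipaddress

def check_nameservers(resolv_conf_text):
    """Return (valid, invalid) lists of nameserver values found in the text."""
    valid, invalid = [], []
    for raw in resolv_conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        fields = line.split()
        if fields[0] != "nameserver" or len(fields) < 2:
            continue
        try:
            ipaddress.ip_address(fields[1])  # accepts IPv4 and IPv6 literals
            valid.append(fields[1])
        except ValueError:
            invalid.append(fields[1])  # for example a hostname
    return valid, invalid
```

Feeding it the contents of /etc/resolv.conf and inspecting the second list shows any entries that need to be replaced with addresses.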
A properly configured DNS server should:
- Detect CNAME records during A record queries
- Automatically follow the chain for recursive queries
- Return the CNAME and, where it can obtain them, the target's records in the answer section (per RFC 1034, Section 3.6.2)
The SERVFAIL response points to one of:
- Server misconfiguration for authoritative zone handling
- A CNAME target in a domain the server cannot resolve (unreachable upstream servers or a missing forwarder)
- DNSSEC validation failures (if enabled)
When troubleshooting CNAME resolution, we need to understand the exact workflow:
Resolver ----- Query for A record ----→ Authoritative Server
   ↑                                            |
   |                                            ↓
   +------ CNAME response ←----------- Check zone data
From your packet capture, we see the critical failure pattern:
# Failed A record query
myhost.domain1.40684 > dnsserver.domain1.domain: 15355+ A? cfengine.domain1. (34)
dnsserver.domain1.domain > myhost.domain1.40684: 15355 ServFail 0/0/0 (34)
But manual CNAME query succeeds:
dig CNAME cfengine.domain1
;; ANSWER SECTION:
cfengine.domain1. 3600 IN CNAME helm02.domain2.
The server's dual role creates interesting edge cases. When authoritative for domain1:
- For A record queries: Should return CNAME if exists (RFC 1034 Section 3.6.2)
- For CNAME queries: Should return just the CNAME record
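Per RFC 1034 (Section 4.3.2), the lookup at a matched name works roughly like this:
def respond(qtype, rrsets):
    # rrsets: record type -> data at the queried name (illustrative structure)
    if qtype != "CNAME" and "CNAME" in rrsets:
        # include the CNAME, then restart the lookup at its target
        return [("CNAME", rrsets["CNAME"])], rrsets["CNAME"]
    # exact type match (including an explicit CNAME query): no chasing
    return [(qtype, rrsets[qtype])] if qtype in rrsets else [], None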
For your mixed-environment scenario:
# Force CNAME resolution then A lookup
dig CNAME cfengine.domain1 +short | xargs dig A
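The same chase-it-yourself idea can be sketched in Python with a hop limit and loop detection, which the shell pipeline lacks. The `lookup` callback is an assumption: it stands for whatever performs the actual query (a wrapper around dig, a DNS library, or test data) and returns a list of record values.

```python
def follow_cname_chain(name, lookup, max_hops=8):
    """Follow CNAMEs starting at `name` and return (chain, addresses).

    `lookup(name, rtype)` is caller-supplied and returns a list of record
    values, which may be empty.
    """
    chain = [name]
    seen = {name}
    for _ in range(max_hops):
        addresses = lookup(name, "A")
        if addresses:
            return chain, addresses
        targets = lookup(name, "CNAME")
        if not targets:
            return chain, []  # dead end: no address and no alias
        name = targets[0]
        if name in seen:
            raise ValueError("CNAME loop detected at " + name)
        seen.add(name)
        chain.append(name)
    raise ValueError("CNAME chain longer than %d hops" % max_hops)
```

Note that the final A lookup for an out-of-zone target only succeeds if `lookup` asks a server that can actually resolve that domain.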
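Note that resolv.conf options do not make the stub resolver chase CNAMEs itself. They only tune retries and failover, which helps when another listed nameserver can resolve the target: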
# /etc/resolv.conf options
options rotate
options attempts:2
options no-tld-query
Verify your DNS server can properly handle authoritative CNAMEs:
# BIND check
named-checkzone domain1 /var/named/domain1.zone
# PowerDNS debug
pdns_server --daemon=no --loglevel=10
Here's what successful resolution should look like:
;; QUESTION SECTION:
;cfengine.domain1. IN A
;; ANSWER SECTION:
cfengine.domain1. 3600 IN CNAME helm02.domain2.
helm02.domain2. 300 IN A 192.168.1.10
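To check a response like this automatically, the answer section can be parsed and the chain walked from the queried name to an address. This is a minimal sketch assuming dig's usual "owner TTL class type rdata" layout; both function names are hypothetical.

```python
def parse_answer_section(text):
    """Parse dig-style answer lines into (owner, ttl, rtype, rdata) tuples."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines, section headers, and the question
        fields = line.split()
        if len(fields) < 5 or fields[2] != "IN":
            continue
        owner, ttl, _cls, rtype = fields[:4]
        records.append((owner, int(ttl), rtype, " ".join(fields[4:])))
    return records

def chain_resolves(records, qname):
    """Return the A addresses reachable from qname via the records, else []."""
    by_owner = {}
    for owner, _ttl, rtype, rdata in records:
        by_owner.setdefault(owner, []).append((rtype, rdata))
    name = qname
    for _ in range(len(records) + 1):  # cannot need more hops than records
        entries = by_owner.get(name, [])
        addresses = [rdata for rtype, rdata in entries if rtype == "A"]
        if addresses:
            return addresses
        cnames = [rdata for rtype, rdata in entries if rtype == "CNAME"]
        if not cnames:
            return []
        name = cnames[0]
    return []
```

An empty result for a response that contains only the CNAME line indicates the server returned the alias without resolving its target.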
Systematic troubleshooting steps:
- Verify zone file syntax
- Check server logs for SERVFAIL reasons
- Test with +norecurse to isolate authoritative behavior
- Validate DNSSEC status if enabled