Troubleshooting BIND9 DNSSEC Validation: Why Local Queries Fail but Remote Clients Succeed


When configuring BIND9 as a caching DNS server with DNSSEC validation, we often encounter a peculiar scenario where:

  • Local queries on the BIND server correctly fail DNSSEC validation (returning SERVFAIL)
  • Remote clients receive successful responses (NOERROR) for the same queries

Based on the configuration provided, we're working with:

BIND 9.9.5-9+deb8u6-Debian
Debian GNU/Linux 8.5 (jessie)
Basic caching DNS server setup
DNSSEC validation enabled

The test domain sigfail.verteiltesysteme.net is specifically designed to fail DNSSEC validation. Here's what we observe:

# Local query on BIND server:
$ dig @192.168.10.36 sigfail.verteiltesysteme.net
;; status: SERVFAIL

# Remote client query:
$ dig @192.168.10.36 sigfail.verteiltesysteme.net
;; status: NOERROR
;; ANSWER: 134.91.78.139

BIND logs show the validation is actually working:

Jul 9 00:33:05 thor named[2940]: validating @0x7fd2d0391140: sigfail.verteiltesysteme.net A: no valid signature found
Jul 9 00:33:05 thor named[2940]: error (no valid RRSIG) resolving 'sigfail.verteiltesysteme.net/A/IN': 134.91.78.141#53

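Yet remote clients still receive the invalid data. A validating resolver should return SERVFAIL for bogus data to every client unless the query sets the CD (checking disabled) bit, so this points to a caching or response modification issue, or to the remote query not being answered by this validator at all.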
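To compare the two cases side by side, it can help to reduce each dig response to its status and header flags. The helper below is only a sketch and not part of the original setup: the function name dig_summary is made up for illustration, and it assumes the header layout that dig normally prints, read from standard input.

```shell
# dig_summary: hypothetical helper that reads dig output on stdin and prints
# "<status> <flags>", for example "SERVFAIL qr rd ra".
dig_summary() {
    awk '
        /status:/ {
            # Header line: ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 1234"
            for (i = 1; i <= NF; i++)
                if ($i == "status:") { status = $(i + 1); sub(/,$/, "", status) }
        }
        /^;; flags:/ {
            # Flags line: ";; flags: qr rd ra; QUERY: 1, ANSWER: 0, ..."
            line = $0
            sub(/^;; flags: */, "", line)
            sub(/;.*/, "", line)
            flags = line
        }
        END { print status, flags }
    '
}
```

Running dig @192.168.10.36 sigfail.verteiltesysteme.net | dig_summary on the server and on a remote client gives two short lines that are easy to compare. A differing flag set (for instance cd present on one side only) would be the first thing to look at.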

The current named.conf.options contains:

options {
    directory "/var/cache/bind";
    dnssec-enable yes;
    dnssec-validation auto;
    auth-nxdomain no;
    listen-on { 127.0.0.1; 192.168.10.36; };
    recursion yes;
    allow-recursion { 127.0.0.0/8; 192.168.10.0/24; };
    max-ncache-ttl 0;
};

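The options below are sometimes suggested for enforcing stricter behavior. None of them controls whether validation happens (dnssec-validation auto already enables it), and not all of them are valid on BIND 9.9.5:

options {
    # Existing options...
    dnssec-lookaside auto;    # Accepted by 9.9.5, but ISC has since retired the DLV registry
    max-cache-ttl 10;         # Testing only: caps positive caching at 10 seconds
    minimal-responses no;     # Already the default on 9.9
    max-clients-per-query 10; # Unrelated to DNSSEC validation
    # serve-stale is not a named.conf option and 9.9.5 has no stale-answer feature;
    # in 9.12 and later, stale answers are controlled by stale-answer-enable.
};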

Use these commands to verify DNSSEC behavior:

# Check validation status
$ dig +dnssec +multi sigfail.verteiltesysteme.net

# Verify BIND version and capabilities
$ named -V

# Dump DNSSEC trust anchors (written to named.secroots in BIND's working directory)
$ rndc secroots

# Raise the debug level to 3 (rndc notrace resets it to 0)
$ rndc trace 3

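The issue most likely stems from one of two factors:

  1. Caching behavior: the NOERROR answer may come from a cache other than BIND's, for example a stub resolver cache on the client. BIND 9.9.5 itself has no stale-answer feature (that arrived in 9.12).
  2. Query flags: a client that sets the CD (checking disabled) bit, as dig +cd does, receives the data without validation. Whether the client advertises DNSSEC awareness through the EDNS DO bit only affects whether RRSIG records are included in the response; it does not change whether BIND validates.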

Here's a complete configuration for testing. The 10-second TTL caps are meant for troubleshooting only and should be raised or removed in production:

options {
    directory "/var/cache/bind";

    dnssec-enable yes;
    dnssec-validation auto;
    dnssec-lookaside auto; # Optional; can be dropped now that DLV is retired

    auth-nxdomain no;
    listen-on { 127.0.0.1; 192.168.10.36; };

    recursion yes;
    allow-recursion { 127.0.0.0/8; 192.168.10.0/24; };

    max-cache-ttl 10;  # Testing only
    max-ncache-ttl 10; # Testing only
    minimal-responses no;
    # Note: named.conf has no serve-stale option, so none is set here

    # Security enhancements
    allow-query-cache { 127.0.0.0/8; 192.168.10.0/24; };
    allow-transfer { none; };
    version "not disclosed";
};

After implementing these changes:

# Check the configuration syntax, then restart BIND
$ named-checkconf && systemctl restart bind9

# Verify from remote client
$ dig +dnssec @192.168.10.36 sigfail.verteiltesysteme.net
;; Should now return SERVFAIL

# Check logs for validation failures
$ tail -f /var/log/syslog | grep validating
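When the log is busy, counting the validator's failure messages for one name can be easier than watching tail output. The following is a rough sketch under stated assumptions: the function name validation_failures is hypothetical, and it matches the message wording shown in the log excerpt earlier in this article, which may differ between BIND versions.

```shell
# validation_failures: hypothetical helper. Reads syslog-style lines on stdin
# and prints how many validator "no valid signature found" entries mention
# the name given as the first argument.
validation_failures() {
    awk -v name="$1" '
        index($0, "validating") && index($0, name) &&
        index($0, "no valid signature found") { n++ }
        END { print n + 0 }   # n + 0 prints 0 when nothing matched
    '
}
```

For example, grep named /var/log/syslog | validation_failures sigfail.verteiltesysteme.net prints the number of matching entries. If the count rises when the local query is sent but not when the remote client queries, that would suggest the remote query never reaches this validator.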

This configuration is intended to give all clients consistent DNSSEC validation behavior. Once testing is complete, raise or remove the TTL caps so that valid responses are cached normally.


The problem manifests when a BIND9 DNS server (v9.9.5) correctly identifies DNSSEC validation failures locally but serves unvalidated responses to network clients. This creates a security gap where clients receive potentially compromised DNS data despite the server's validation capabilities.

Server-side logs show the correct behavior:

Jul  9 00:33:05 thor named[2940]: validating @0x7fd2d0391140: sigfail.verteiltesysteme.net A: no valid signature found
Jul  9 00:33:05 thor named[2940]: error (no valid RRSIG) resolving 'sigfail.verteiltesysteme.net/A/IN': 134.91.78.141#53

Yet client queries receive NOERROR responses with the invalid records:

;; ANSWER SECTION:
sigfail.verteiltesysteme.net. 60 IN     A       134.91.78.139
sigfail.verteiltesysteme.net. 60 IN     RRSIG   A 5 3 60 20200610081125 20150611081125 30665 verteiltesysteme.net.

The named.conf.options shown earlier appears correct, which leaves several areas to investigate:

  • Query forwarding behavior differences between local and remote queries
  • EDNS(0) handling variations
  • Response caching mechanisms
  • Client DNSSEC awareness differences

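First, request the DNSSEC records for the test name:

dig +dnssec +multi sigfail.verteiltesysteme.net

Check whether validation is enabled in the running server. The subcommand needs an argument; on BIND 9.9 the keyword is check, while newer releases use status:

rndc validation check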

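Enable detailed logging. The directory /var/log/named must exist and be writable by the user named runs as:

logging {
    channel query-log {
        file "/var/log/named/query.log" versions 5 size 50M;
        severity debug 3;
        print-time yes;
    };
    category queries { query-log; };
    category security { query-log; };
    category dnssec { query-log; }; // Validator messages
};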

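Keep validation tied to the built-in root trust anchor in named.conf. On BIND 9.9, dnssec-validation yes validates only if a trust anchor is configured manually (trusted-keys or managed-keys), so switching from auto to yes without one would turn validation off in practice:

options {
    // Existing options...
    dnssec-validation auto; // Uses the built-in root trust anchor
    // validate-except { "intranet"; }; // Would exempt internal zones, but was added
    //                                  // in the 9.13 development series, not in 9.9.5
};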

Apply the change and clear the cache. On a running server:

rndc reload
rndc flush

Alternatively, a restart does both at once:

systemctl restart bind9

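Server-side validation check. With +cd (checking disabled) the server returns the data without validating it, so NOERROR here combined with SERVFAIL for the same query without +cd confirms that validation is what rejects the answer:

dig @127.0.0.1 +cd sigfail.verteiltesysteme.net
dig @127.0.0.1 sigfail.verteiltesysteme.net

Client verification (should return SERVFAIL, matching the server; +adflag only sets the AD bit in the query):

dig @192.168.10.36 +adflag sigfail.verteiltesysteme.net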
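Comparing the status of a normal query with the status of the same query sent with dig's +cd flag is a common way to tell a DNSSEC rejection apart from other resolution failures. The sketch below encodes that comparison; the function name classify_validation and its output labels are invented for illustration, and the interpretation is a rule of thumb rather than a definitive diagnosis.

```shell
# classify_validation: hypothetical helper.
#   $1 = status of the normal query (no +cd)
#   $2 = status of the same query repeated with +cd
classify_validation() {
    normal=$1
    with_cd=$2
    if [ "$normal" = "SERVFAIL" ] && [ "$with_cd" = "NOERROR" ]; then
        # Data exists upstream but the validator refuses to hand it out
        echo "bogus"
    elif [ "$normal" = "SERVFAIL" ]; then
        # Still failing with checking disabled: probably not a DNSSEC problem
        echo "not-dnssec"
    elif [ "$normal" = "NOERROR" ]; then
        # Accepted: validated, unsigned, or validation was not applied
        echo "accepted"
    else
        echo "other"
    fi
}
```

With the behavior described in this article, the server's own queries would classify as bogus (SERVFAIL normally, NOERROR with +cd), while the remote client's result of NOERROR on a normal query classifies as accepted, which is the discrepancy that needs explaining.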

For strict enforcement across all clients:

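options {
    // Existing options...
    // Require that answers at or below this name are signed and validate
    dnssec-must-be-secure "verteiltesysteme.net" yes;
    // minimal-responses and deny-answer-aliases do not affect validation.
    // deny-answer-aliases takes a list of domain names rather than "any"
    // and filters CNAME/DNAME targets, so it is left out here.
};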

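Consider adding view-based configurations for different client groups if needed. Once any view is defined, every zone statement must be placed inside a view, including the default zones that Debian includes from named.conf:

view "internal" {
    match-clients { 127.0.0.0/8; 192.168.10.0/24; };
    recursion yes;
    dnssec-validation auto;
    // zone statements and includes go here
};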