Troubleshooting DNS Propagation Issues: Why Third-Party DNS Servers Aren’t Caching Your Records



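If updates show up immediately on network-tools.com but nothing stays cached while your DNS servers are down, you are observing two separate DNS behaviors: lookup tools of that kind query your authoritative nameserver directly, while recursive resolvers keep an answer only for as long as its TTL allows. A direct, non-recursive query looks like this: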

# Example dig command showing live query to authoritative nameserver
$ dig +norecurse funny.example.com @your.nameserver.ip
;; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
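
The aa flag in that header is what marks the answer as authoritative rather than served from a cache. As a rough sketch, assuming the header line format shown above (the helper name parse_dig_flags is hypothetical), the flags can be pulled out of saved dig output like this:

```python
import re

def parse_dig_flags(header_line):
    """Return the set of flags from a dig ';; flags: ...;' header line."""
    match = re.search(r"flags:\s*([a-z ]*);", header_line)
    if match is None:
        return set()
    return set(match.group(1).split())

line = ";; flags: qr aa; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0"
flags = parse_dig_flags(line)
print("authoritative" if "aa" in flags else "not authoritative (likely cached)")
```

A response from a public resolver would normally carry rd and ra instead of aa, which is a quick way to tell the two kinds of answer apart.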

The core issue stems from how recursive resolvers interact with authoritative nameservers. Your current configuration has several critical elements:

// Example named.conf options affecting propagation
options {
    allow-query { any; };
    allow-transfer { none; }; // Security best practice
    recursion no; // As authoritative server
    dnssec-validation auto;
};

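Your 15-minute TTL (900 seconds) is the main reason cached answers disappear so quickly during an outage. A resolver may keep a record only until its TTL expires; after that it has to ask your nameservers again, and if they are unreachable the lookup fails. Public resolvers may also apply their own minimum or maximum TTL limits, so cache lifetimes are not guaranteed to match your zone file exactly.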
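
To make the timing concrete, here is a toy model of a resolver cache. It is only a sketch: the TtlCache class is hypothetical, and it ignores resolver-specific behavior such as TTL clamping or serving stale answers.

```python
class TtlCache:
    """Toy resolver cache: an entry is usable only until its TTL runs out."""

    def __init__(self):
        self._entries = {}  # name -> (address, expiry timestamp in seconds)

    def store(self, name, address, ttl, now):
        self._entries[name] = (address, now + ttl)

    def lookup(self, name, now, authoritative_up):
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]   # still fresh: answered from cache
        if not authoritative_up:
            return None       # expired and nobody to ask: lookup fails
        return "refetch"      # expired: the resolver must query again


cache = TtlCache()
cache.store("funny.example.com", "203.0.113.45", ttl=900, now=0)

# The nameservers go down right after the record was cached.
print(cache.lookup("funny.example.com", now=600, authoritative_up=False))
print(cache.lookup("funny.example.com", now=901, authoritative_up=False))
```

The first lookup is still answered from cache; the second one, just past the 900-second mark, has nothing to fall back on.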

; Example zone file TTL settings
$TTL 86400      ; 24-hour default
example.com. IN SOA ns1.example.com. admin.example.com. (
    2023081501 ; serial
    3600       ; refresh (1 hour)
    900        ; retry (15 min)
    604800     ; expire (1 week)
    3600       ; minimum (1 hour, used as the negative-caching TTL)
)

For better propagation while maintaining flexibility:

; Suggested TTLs by record type (the TTL is the field after the owner name)
; A Records
www     14400 IN A     192.0.2.1       ; 4 hours

; MX Records (owner @ is the zone apex)
@       86400 IN MX 10 mail1           ; 24 hours
@       86400 IN MX 20 mail2           ; 24 hours

; Critical services that may need to move quickly
api     3600  IN A     192.0.2.2       ; 1 hour

Use these methods to test actual propagation across different resolvers:

# Google's public DNS
dig @8.8.8.8 funny.example.com +short

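# Cloudflare's DNS
dig @1.1.1.1 funny.example.com +short

# Follow the delegation from the root servers (this bypasses resolver caches)
dig funny.example.com +trace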

# Multiple resolver check
for resolver in 8.8.8.8 1.1.1.1 9.9.9.9; do
    echo "Checking $resolver:"
    dig @$resolver funny.example.com +short
done

If you've implemented DNSSEC, additional validation is required:

# Verify DNSSEC chain
dig +dnssec example.com. SOA
delv example.com

# Check RRSIG expiration
dig example.com. SOA +dnssec +multi
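
The RRSIG records in that output carry their expiration as a UTC timestamp in YYYYMMDDHHMMSS form. As a small sketch (the function name and the sample values are made up for illustration, not taken from a real zone), the remaining validity can be computed like this:

```python
from datetime import datetime, timezone

def rrsig_time_left(expiration, now):
    """Return the timedelta until an RRSIG expiration stamp (YYYYMMDDHHMMSS, UTC)."""
    expires = datetime.strptime(expiration, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return expires - now

# Hypothetical values for illustration only.
now = datetime(2023, 8, 15, 12, 0, 0, tzinfo=timezone.utc)
left = rrsig_time_left("20230914120000", now)
print(f"signature expires in {left.days} days")
```

A negative result means the signature has already expired, in which case validating resolvers will reject the answer regardless of TTL.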

Implement ongoing monitoring with tools like:

# Nagios plugin example
# ARG1 = hostname to look up, ARG2 = DNS server to query, ARG3 = expected address
define command {
    command_name check_dns_propagation
    command_line /usr/lib/nagios/plugins/check_dns -H $ARG1$ -s $ARG2$ -a $ARG3$ -t 5
}

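# Prometheus blackbox exporter configuration (the dns_propagation module in blackbox.yml and the exporter address 127.0.0.1:9115 are assumptions)
- job_name: 'dns_propagation'
  metrics_path: /probe
  params:
    module: [dns_propagation]
  static_configs:
    - targets:        # DNS servers to probe, not the name being looked up
      - 8.8.8.8
      - 1.1.1.1
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: 127.0.0.1:9115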

When managing our authoritative BIND/named servers for example.com, we noticed an unusual pattern: DNS changes appear instantly on tools like network-tools.com, but fail to propagate externally. For instance:

; Sample BIND zone file update
$TTL 900
@ IN SOA ns1.example.com. admin.example.com. (
  2023081501 ; serial
  3600       ; refresh
  900        ; retry
  604800     ; expire
  900        ; minimum TTL
)

funny IN A 203.0.113.45

Our 15-minute TTL (900 seconds) appears too short to keep records cached through an outage. During server downtime tests:

  • External resolvers immediately fail when our nameservers are unreachable
  • No cached records persist beyond our TTL window
  • Recursive resolvers treat our zone as non-cacheable

For production environments, implement graduated TTLs:

; Recommended baseline configuration
$TTL 86400      ; 24-hour default
@ IN SOA ... (
  2023081501    ; serial
  86400         ; refresh (24h)
  7200          ; retry (2h)
  3600000       ; expire (about 6 weeks)
  172800        ; minimum (2 days, negative-caching TTL)
)

; Critical records can use shorter TTLs
www 3600 IN A 203.0.113.46  ; 1h
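
Raw second counts like these are easy to misread, so it can help to convert them before committing a zone. A minimal sketch (the humanize helper is hypothetical; the timer values are the ones from the baseline above):

```python
def humanize(seconds):
    """Format a duration in seconds as days, hours and minutes."""
    days, rest = divmod(seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes = rest // 60
    return f"{days}d {hours}h {minutes}m"

soa_timers = {"refresh": 86400, "retry": 7200, "expire": 3600000, "minimum": 172800}
for name, value in soa_timers.items():
    print(f"{name:8} {value:>8} = {humanize(value)}")
```

For the expire value this gives 41d 16h 0m, which is slightly under six full weeks.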

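The observed behavior occurs because:

  1. network-tools.com sends fresh, non-recursive queries straight to our authoritative servers, so it never shows cached data
  2. Most ISP resolvers honor TTL values strictly and drop a record as soon as it expires
  3. Our low TTL forces resolvers to re-query every 15 minutes, which fails while our servers are down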
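
The same reasoning gives a simple upper bound for how long a change can take to become visible: a resolver that cached the old answer just before the change may keep serving it until the old TTL runs out. A sketch of that arithmetic, assuming resolvers honor the TTL (the function name and the change time are made up for illustration):

```python
from datetime import datetime, timedelta, timezone

def latest_stale_time(change_time, old_ttl_seconds):
    """Latest moment a TTL-honoring resolver could still serve the old record."""
    return change_time + timedelta(seconds=old_ttl_seconds)

changed = datetime(2023, 8, 15, 12, 0, tzinfo=timezone.utc)  # hypothetical change time
for ttl in (900, 3600, 86400):
    print(ttl, latest_stale_time(changed, ttl).isoformat())
```

Note that it is the TTL the record had before the change that matters, so lowering a TTL only helps once the previous, longer TTL has expired everywhere.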

Use these diagnostic commands to test propagation:

# Check authoritative response
dig @ns1.example.com funny.example.com +norec

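# Check cached response (Google's DNS); the second column is the remaining TTL
dig @8.8.8.8 funny.example.com +noall +answer

# Verify TTL handling: run twice and watch the TTL count down
dig funny.example.com +noall +answer +ttlid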

Essential named.conf options for proper propagation:

options {
    allow-query { any; };
    allow-transfer { secondary-IPs; };  // placeholder: an ACL or the addresses of your secondaries
    notify yes;
    recursion no;
    auth-nxdomain no;    # RFC 1035 compliance
};

zone "example.com" {
    type master;
    file "/etc/bind/db.example.com";
    allow-update { none; };
    ixfr-from-differences yes;
};
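
With notify yes, secondaries are told about a change, but they transfer the zone only if the SOA serial has increased. The zone files above use a YYYYMMDDnn serial; the following is a sketch of how such a serial could be bumped (the next_serial helper is hypothetical, and it does not guard against more than 99 changes in one day):

```python
from datetime import date

def next_serial(current, today):
    """Return the next YYYYMMDDnn serial, always greater than the current one."""
    first_of_day = int(today.strftime("%Y%m%d")) * 100 + 1  # e.g. 2023081501
    # Same day: count up from the current serial; new day: start again at nn=01.
    return max(current + 1, first_of_day)

print(next_serial(2023081501, date(2023, 8, 15)))
print(next_serial(2023081501, date(2023, 8, 16)))
```

The first call yields 2023081502 and the second 2023081601.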

Implement this Python script to track DNS changes across multiple resolvers:

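# Requires dnspython 2.x (pip install dnspython)
import time

import dns.exception
import dns.resolver

resolvers = ['8.8.8.8', '1.1.1.1', '9.9.9.9']
target = 'funny.example.com'

def check_propagation():
    """Query each public resolver directly and collect its A records."""
    results = {}
    for server in resolvers:
        # Ignore the system resolv.conf and ask only this one server
        res = dns.resolver.Resolver(configure=False)
        res.nameservers = [server]
        res.lifetime = 5  # total seconds allowed per lookup
        try:
            answer = res.resolve(target, 'A')
            results[server] = [rdata.address for rdata in answer]
        except dns.exception.DNSException as e:
            results[server] = str(e)
    return results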

while True:
    print(check_propagation())
    time.sleep(300)  # Check every 5 minutes
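
The dictionary that check_propagation() returns holds a list of addresses for each resolver that answered and an error string for each one that failed. As a sketch of how that output might be interpreted (the summarize helper and the sample data, including the error text, are hypothetical):

```python
def summarize(results, expected):
    """Split resolvers into those returning the expected addresses and the rest."""
    ok, pending = [], []
    for server, value in results.items():
        # A list means the lookup succeeded; a string is an error message.
        if isinstance(value, list) and sorted(value) == sorted(expected):
            ok.append(server)
        else:
            pending.append(server)
    return ok, pending

# Hypothetical sample in the same shape as check_propagation() output.
sample = {
    "8.8.8.8": ["203.0.113.45"],
    "1.1.1.1": ["203.0.113.46"],
    "9.9.9.9": "The DNS operation timed out.",
}
ok, pending = summarize(sample, ["203.0.113.45"])
print("propagated:", ok)
print("pending or failing:", pending)
```

Keeping the comparison separate from the lookups makes it possible to check the logic without touching the network.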