Debugging Cloudflare DNS SERVFAIL for auth.otc.t-systems.com: Cross-Environment Resolution Analysis


4 views

When running DNS lookups for auth.otc.t-systems.com against Cloudflare's resolver (1.1.1.1), we're seeing inconsistent results across different environments:

# Successful query example
nslookup auth.otc.t-systems.com 8.8.8.8
Server:     8.8.8.8
Address:    8.8.8.8#53

Non-authoritative answer:
auth.otc.t-systems.com  canonical name = auth.otc.t-systems.com.cdn.cloudflare.net.
# Failing query example
nslookup auth.otc.t-systems.com 1.1.1.1
Server:     1.1.1.1
Address:    1.1.1.1#53

** server can't find auth.otc.t-systems.com: SERVFAIL

Let's use DNS tracing to identify where the resolution chain breaks:

dig +trace auth.otc.t-systems.com @1.1.1.1

Key observations from multiple test environments:

  • Consistent failure when querying Cloudflare DNS from corporate networks
  • Success when using Google DNS (8.8.8.8) from the same networks
  • Success when querying Cloudflare DNS from external VPS instances

Using tcpdump to capture the DNS traffic reveals important details:

sudo tcpdump -i any -n port 53 -w dns_capture.pcap

After analyzing the capture in Wireshark, we notice that:

  • Successful requests complete the full DNS resolution chain
  • Failing requests show truncated responses from intermediate resolvers

Several technical factors could explain this behavior:

  1. DNS Caching Differences - Corporate networks might have caching resolvers with stale records
  2. Firewall Interference - Some networks may modify or block certain DNS responses
  3. EDNS0 Compatibility - Cloudflare heavily uses EDNS0 which some networks filter

Try these diagnostic commands to gather more evidence:

# Check DNSSEC validation status
delv @1.1.1.1 auth.otc.t-systems.com

# Test with/without EDNS0
dig +noedns auth.otc.t-systems.com @1.1.1.1
dig +edns=0 auth.otc.t-systems.com @1.1.1.1

# Check DNS over HTTPS (DoH) behavior
curl "https://cloudflare-dns.com/dns-query?name=auth.otc.t-systems.com&type=A" -H "accept: application/dns-json"

While investigating the root cause, consider these temporary solutions:

# Local hosts file override (Linux/Mac)
echo "104.16.85.20 auth.otc.t-systems.com" | sudo tee -a /etc/hosts

# Alternative DNS configuration (Windows PowerShell)
Set-DnsClientServerAddress -InterfaceIndex 12 -ServerAddresses ("8.8.8.8","1.1.1.1")

Since this domain uses Cloudflare services, special attention should be paid to:

  • Orange-cloud vs gray-cloud status in DNS settings
  • Any DNS-only vs proxied configurations
  • DNSSEC implementation status

I recently encountered a puzzling DNS resolution issue where queries for auth.otc.t-systems.com through Cloudflare's 1.1.1.1 DNS server would sometimes fail with a SERVFAIL error. What made this particularly frustrating was the inconsistent behavior across different machines and networks.

$ nslookup auth.otc.t-systems.com 1.1.1.1
Server:     1.1.1.1
Address:    1.1.1.1#53

** server can't find auth.otc.t-systems.com: SERVFAIL

Here's what I observed during initial testing:

  • Failure on local machine (both work and home networks) with Cloudflare DNS
  • Success with Google DNS (8.8.8.8) on same networks
  • Mixed results across various online DNS lookup tools
  • Working resolution when querying from remote servers

To dig deeper, I used several tools to gather more information:

# Using dig for more detailed output
$ dig auth.otc.t-systems.com @1.1.1.1 +trace +all

# Checking DNSSEC validation
$ delv auth.otc.t-systems.com @1.1.1.1

# Testing with different record types
$ dig A auth.otc.t-systems.com @1.1.1.1
$ dig AAAA auth.otc.t-systems.com @1.1.1.1

The inconsistent behavior suggests several potential issues:

  1. DNSSEC Validation Problems: Cloudflare (1.1.1.1) validates DNSSEC by default, while Google (8.8.8.8) might be more lenient.
  2. DNS Filtering/Blocking: Some networks might be interfering with DNS traffic.
  3. Authoritative Server Issues: The domain's nameservers might have inconsistent responses.
  4. EDNS Client Subnet: Cloudflare might be sending different responses based on client subnet information.

Based on my investigation, here are approaches that worked:

# Bypass potential filtering by using DNS-over-HTTPS
$ curl 'https://cloudflare-dns.com/dns-query?name=auth.otc.t-systems.com&type=A' \
  -H 'accept: application/dns-json'

# Check specific nameservers directly
$ dig auth.otc.t-systems.com @ns1.t-systems.com

For programmatic DNS resolution in applications, consider implementing retry logic with fallback DNS servers:

import dns.resolver

def resolve_dns(domain):
    resolvers = ['1.1.1.1', '8.8.8.8', '9.9.9.9']
    for resolver in resolvers:
        try:
            answers = dns.resolver.resolve(domain, 'A', nameserver=resolver)
            return [str(r) for r in answers]
        except dns.resolver.NXDOMAIN:
            raise
        except dns.resolver.NoAnswer:
            continue
        except dns.resolver.Timeout:
            continue
    raise Exception("All DNS resolvers failed")

When facing similar DNS issues:

  • Always test with multiple DNS providers
  • Check DNSSEC validation status
  • Consider using DNS-over-HTTPS/TLS to bypass network interference
  • Contact the domain administrator if you suspect authoritative server issues
  • For critical applications, implement DNS resolution fallback mechanisms