The conventional DNS round robin approach provides basic load balancing by rotating IP addresses in DNS responses. While this works reasonably well for homogeneous server clusters, it becomes problematic when dealing with servers of varying capacities (100Mbps, 1Gbps, 10Gbps).
# Traditional round robin DNS zone example
orion.2x.to. IN A 80.237.201.41
orion.2x.to. IN A 87.230.54.12
orion.2x.to. IN A 87.230.100.10
orion.2x.to. IN A 87.230.51.65
By varying TTL values we can influence how long resolvers cache each answer and thereby approximate a weighted distribution. One caveat up front: RFC 2181 requires all records in an RRset to share one TTL, so mixing TTLs on a single name does not survive resolution; the authoritative server must hand out one record per response (or use distinct hostnames, as in the example further down):
# Weighted DNS configuration example (one A record per response)
orion.2x.to. 240 IN A 10.0.0.1 ; 10Gbps server, cached longest
orion.2x.to. 120 IN A 10.0.0.2 ; 1Gbps server
orion.2x.to. 60 IN A 10.0.0.3 ; 100Mbps server
Note that the skew achievable this way follows the TTL ratio (4:2:1 here), far coarser than the 100:10:1 capacity ratio.
Several factors affect the actual distribution:
- Resolver cache behaviors vary across ISPs
- Minimum TTL enforcement (many enforce 120s minimum)
- Client-side DNS caching implementations
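Under the idealized assumption that the authoritative server rotates answers and each resolver re-queries the moment its cached record expires, a resolver points at a given server for a fraction of time proportional to that server's TTL. A quick simulation (all parameters hypothetical) sketches the effect:

```python
import random

def simulate(ttls, resolvers=1000, duration=86400):
    """Fraction of resolver-time spent pointing at each server when
    answers rotate round-robin and each is cached for its own TTL."""
    held_time = [0] * len(ttls)
    for _ in range(resolvers):
        t = 0
        i = random.randrange(len(ttls))  # first cached answer is arbitrary
        while t < duration:
            held = min(ttls[i], duration - t)
            held_time[i] += held         # resolver holds this answer for its TTL
            t += held
            i = (i + 1) % len(ttls)      # next answer in the rotation
    total = sum(held_time)
    return [h / total for h in held_time]

# TTLs 240/120/60 converge to roughly 4:2:1 (~57%/29%/14%), nowhere
# near a 100:10:1 capacity ratio: TTL skew is a coarse tool.
shares = simulate([240, 120, 60])
```

This also makes clear why minimum-TTL enforcement matters: if a resolver floors everything at 120s, the 120s and 60s entries collapse to the same weight.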
For more precise control, note that stock BIND's rrset-order supports only fixed, cyclic, and random ordering (there is no weighted mode), so BIND views can at best steer client regions toward different record sets:
# BIND9 view configuration example (geographic steering)
view "europe" {
    match-clients { 192.0.2.0/24; 203.0.113.0/24; };
    zone "orion.2x.to" {
        type master;
        file "db.orion.europe"; # list only the servers this region should use
    };
};
For true weighted responses, use a server with native support, such as PowerDNS (LUA records) or gdnsd (weighted plugin).
To verify the distribution you're actually getting, sample repeated lookups:
# Sample monitoring script (Python, dnspython)
import dns.resolver

def check_distribution(domain, samples=1000):
    # Queries through a caching resolver return the same answer until
    # the TTL expires; point dnspython at the authoritative server
    # directly to observe the raw rotation.
    counts = {}
    for _ in range(samples):
        answer = dns.resolver.resolve(domain, 'A')
        ip = str(answer[0])
        counts[ip] = counts.get(ip, 0) + 1
    return counts
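The raw counts are easier to interpret as deviations from the intended weights; a small post-processing helper (names hypothetical) might look like:

```python
def distribution_error(counts, weights):
    """Observed traffic share minus intended share, per IP.
    Values near zero mean the weighting is working as planned."""
    total_queries = sum(counts.values())
    total_weight = sum(weights.values())
    return {ip: counts.get(ip, 0) / total_queries - w / total_weight
            for ip, w in weights.items()}

# Example: the 10Gbps server got 870 of 1000 sampled queries
errors = distribution_error(
    {'10.0.0.1': 870, '10.0.0.2': 95, '10.0.0.3': 35},
    {'10.0.0.1': 100, '10.0.0.2': 10, '10.0.0.3': 1},
)
```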
For mission-critical deployments:
- Anycast routing (requires BGP capability)
- Commercial DNS-based load balancers (AWS Route 53, NS1)
- L4/L7 hardware load balancers
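Of those, Route 53's weighted routing maps most directly onto capacity ratios: each record variant carries its own Weight and SetIdentifier, and responses are chosen in proportion to the weights. A sketch of the change batch via boto3 (zone ID, hostname, and IPs are placeholders):

```python
# Each weighted variant shares a name but has a unique SetIdentifier;
# Route 53 answers queries in proportion to Weight (0-255).
changes = [
    {'Action': 'UPSERT',
     'ResourceRecordSet': {
         'Name': 'orion.example.com.',
         'Type': 'A',
         'SetIdentifier': label,   # distinguishes the weighted variants
         'Weight': weight,
         'TTL': 60,
         'ResourceRecords': [{'Value': ip}]}}
    for label, ip, weight in [('10g',  '10.0.0.1', 100),
                              ('1g',   '10.0.0.2', 10),
                              ('100m', '10.0.0.3', 1)]
]

def apply_weighted_records(zone_id, changes):
    # Requires AWS credentials; shown but not invoked here.
    import boto3
    boto3.client('route53').change_resource_record_sets(
        HostedZoneId=zone_id, ChangeBatch={'Changes': changes})
```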
The fundamental issue with traditional DNS round robin is that it treats all servers equally, while we need to distribute traffic proportionally to server capacity (100Mbps, 1Gbps, and 10Gbps in our case).
The core idea is leveraging DNS TTL (Time-To-Live) values to influence client caching behavior and thereby achieve weighted distribution:
; Example DNS zone configuration (TTL goes on the record line)
server1 2400 IN A 192.0.2.1 ; 10Gbps server, 40-minute TTL
server2  240 IN A 192.0.2.2 ; 1Gbps server, 4-minute TTL
server3  120 IN A 192.0.2.3 ; 100Mbps server, 2-minute TTL
Several factors affect the effectiveness of this approach:
- Client DNS resolver behavior varies across ISPs and devices
- Minimum practical TTL is typically 60-120 seconds due to resolver enforcement
- The ratio between TTL values determines the weighting effect
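If traffic share scales linearly with TTL (an idealization), the 2400/240/120 TTLs above work out to:

```python
ttls = {'server1': 2400, 'server2': 240, 'server3': 120}
total = sum(ttls.values())   # combined TTL across the three servers
shares = {name: ttl / total for name, ttl in ttls.items()}
# server1 ~ 0.87, server2 ~ 0.087, server3 ~ 0.043: the 10:1 steps
# between adjacent servers are preserved, though the absolute split
# still undershoots a strict 100:10:1 capacity ratio.
```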
For more precise control, consider these DNS server implementations:
# PowerDNS weighted answer via a LUA record (PowerDNS 4.2+)
proxy.example.com. 60 IN LUA A "pickwrandom({ {100, '192.0.2.1'}, {10, '192.0.2.2'}, {1, '192.0.2.3'} })"
Here pickwrandom returns one address per query, chosen at random in proportion to the weights, so no TTL tricks are needed.
Since you're primarily balancing bandwidth rather than requests, these additional techniques can help:
- Anycast routing with BGP (requires network infrastructure)
- TCP Anycast using ECMP (Equal-Cost Multi-Path) routing
- Geographic DNS with EDNS Client Subnet awareness
BIND has no native weighted responses, but plain rotation can approximate weights if faster servers contribute more addresses to the rotation:
# Bind9 named.conf configuration (rrset-order lives in options, not the zone)
options {
    rrset-order { class IN type A name "proxy.example.com" order cyclic; };
};
# Zone file: a multi-homed 10Gbps server occupies several rotation
# slots, so cyclic ordering sends it proportionally more clients
proxy.example.com. IN A 192.0.2.1  ; 10Gbps server, address 1
proxy.example.com. IN A 192.0.2.11 ; 10Gbps server, address 2
proxy.example.com. IN A 192.0.2.2  ; 1Gbps server
Essential metrics to track for optimization:
- DNS query distribution patterns
- Actual bandwidth utilization per server
- Cache hit/miss ratios at resolvers
- Geographic mismatch between client locations and the servers they are sent to
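For the bandwidth metric, per-server transmit rates can be sampled from the kernel counters on each Linux proxy. A sketch parsing /proc/net/dev (the interface name is an assumption):

```python
import time

def parse_tx_bytes(stats_text, interface):
    """Extract the transmitted-bytes counter for one interface from
    /proc/net/dev-formatted text (Linux)."""
    for line in stats_text.splitlines():
        if line.strip().startswith(interface + ':'):
            counters = line.split(':', 1)[1].split()
            return int(counters[8])  # 9th counter = TX bytes
    raise ValueError(f'interface {interface!r} not found')

def tx_rate_mbps(interface='eth0', interval=5.0):
    """Average transmit rate over `interval` seconds, in Mbit/s."""
    def read():
        with open('/proc/net/dev') as f:
            return parse_tx_bytes(f.read(), interface)
    before = read()
    time.sleep(interval)
    return (read() - before) * 8 / interval / 1e6
```

Feeding these per-server rates back into the weights (or TTLs) closes the loop between intended and actual distribution.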