Round-robin DNS returns multiple IP addresses for a domain and typically rotates their order from one response to the next. With two A records configured for example.com, every response carries both addresses, and the one listed first alternates between queries:
example.com. IN A 192.0.2.1
example.com. IN A 203.0.113.2
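To make the rotation concrete, here is a toy model of a server cycling a record set. It is only an illustration: `RoundRobinZone` is a made-up class, and real DNS servers may order records cyclically, randomly, or not at all depending on configuration.

```python
from collections import deque


class RoundRobinZone:
    """Toy model of a DNS server rotating an A record set (hypothetical)."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def query(self):
        # Return the full record set in its current order, then rotate
        # so the next query sees a different address first.
        answer = list(self._records)
        self._records.rotate(-1)
        return answer


zone = RoundRobinZone(["192.0.2.1", "203.0.113.2"])
first = zone.query()
second = zone.query()
```

Note that both answers contain both addresses; only the order differs, which is why a client that always picks the first entry ends up alternating between servers.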
The fundamental limitation emerges when one IP becomes unavailable. Standard DNS behavior doesn't include health checks - the failed IP remains in rotation. Client behavior varies:
- Modern browsers implement "Happy Eyeballs" (RFC 8305), racing staggered connection attempts across the returned addresses
- Many applications simply try the first returned IP and fail on timeout
- DNS cache TTLs delay failover to alternative IPs
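For applications that only try the first address, a sequential fallback is often enough to avoid the failure. The following is a minimal sketch, assuming the caller already has the list of resolved addresses; `connect_first_available` is a hypothetical helper name, not part of any library.

```python
import socket


def connect_first_available(addresses, timeout=1.0):
    """Try each (ip, port) pair in order and return the first connected socket."""
    last_error = None
    for ip, port in addresses:
        try:
            # create_connection raises OSError on refusal or timeout
            return socket.create_connection((ip, port), timeout=timeout)
        except OSError as exc:
            last_error = exc
    raise ConnectionError(f"all {len(addresses)} addresses failed") from last_error
```

A caller would pass something like `[("192.0.2.1", 80), ("203.0.113.2", 80)]`. The worst-case delay is the timeout multiplied by the number of dead addresses, so a short per-attempt timeout matters here.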
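Here is an illustrative trace of the failure mode, shown with curl for readability. curl itself usually falls back to the remaining addresses within a single invocation, so read this as what a client that tries only the first IP would see: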
# First attempt (gets unresponsive IP)
$ curl -v http://example.com
* Trying 192.0.2.1:80...
* connect to 192.0.2.1 port 80 failed: Connection timed out
# Second attempt after DNS cache expires
$ curl -v http://example.com
* Trying 203.0.113.2:80...
* Connected to example.com (203.0.113.2) port 80
For true high availability, consider these enhanced approaches:
# DNS-based solution with health checks (AWS Route53 example)
resource "aws_route53_health_check" "backend" {
  ip_address        = "192.0.2.1"
  port              = 80
  type              = "HTTP"
  resource_path     = "/health"
  failure_threshold = 2
  request_interval  = 30
}
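The idea behind a failure threshold can be sketched in a few lines. This is not how Route 53 is implemented; `HealthTracker` is a hypothetical class that only shows how consecutive failed probes could remove an address from the answer set and a successful probe could restore it.

```python
class HealthTracker:
    """Track consecutive probe failures per address (illustrative only)."""

    def __init__(self, addresses, failure_threshold=2):
        self.failure_threshold = failure_threshold
        self._failures = {ip: 0 for ip in addresses}

    def record(self, ip, ok):
        # A success resets the counter; a failure increments it.
        self._failures[ip] = 0 if ok else self._failures[ip] + 1

    def healthy(self):
        # Only addresses below the threshold would be served to clients.
        return [ip for ip, count in self._failures.items()
                if count < self.failure_threshold]
```

With a threshold of 2 and a 30 second probe interval, as in the configuration above, a dead endpoint would stay in rotation for roughly a minute before being withdrawn, plus whatever time client caches hold the old answer.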
# Application-layer retry pattern (Python example)
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
session = requests.Session()
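retries = Retry(
    total=3,
    connect=3,
    read=3,
    status=3,
    backoff_factor=0.5,
    # POST is not idempotent; include it only if repeated requests are safe
    allowed_methods=frozenset(['GET', 'POST'])
)
adapter = HTTPAdapter(max_retries=retries)
session.mount('http://', adapter)
session.mount('https://', adapter)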
Despite limitations, round-robin works well for:
- Load distribution across healthy endpoints
- Blue-green deployments with controlled cutovers
- Geographic distribution when combined with EDNS Client Subnet
Round-robin DNS distributes requests across multiple IP addresses in a rotating fashion, but it's crucial to understand that DNS itself provides no health checking or automatic failover mechanism. A typical response carrying two A records looks like this:
; Example DNS response with two A records (TTL 300 seconds)
example.com. 300 IN A 192.0.2.1
example.com. 300 IN A 203.0.113.2
Many modern operating systems and HTTP clients implement "Happy Eyeballs"-style algorithms that stagger connection attempts instead of waiting for a full timeout on each address:
- The client gets both IP addresses from DNS
- Attempts to connect to the first IP
- If the connection is not established within a short delay (typically 300ms-1s; RFC 8305 recommends 250ms), starts an attempt to the next IP while the first is still pending
- This happens at the TCP layer, before any application protocol handshake
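Python's standard library exposes this kind of staggered fallback directly, which makes for a compact illustration. The sketch below assumes Python 3.8 or later, where `asyncio.open_connection` accepts `happy_eyeballs_delay`; `fetch_peer` is a hypothetical helper name used only for this example.

```python
import asyncio


async def fetch_peer(host, port, delay=0.25):
    """Connect with staggered attempts and report which address answered."""
    # With happy_eyeballs_delay set, asyncio starts an attempt to the next
    # resolved address if the current one has not connected within `delay`.
    reader, writer = await asyncio.open_connection(
        host, port, happy_eyeballs_delay=delay
    )
    peer = writer.get_extra_info("peername")
    writer.close()
    await writer.wait_closed()
    return peer
```

Running `asyncio.run(fetch_peer("example.com", 80))` would return the address and port of whichever endpoint accepted the connection first, so a dead IP costs roughly one delay interval rather than a full connect timeout.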
Different technologies handle this differently:
Web Browsers
Modern browsers (Chrome, Firefox) implement sophisticated connection strategies:
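// Simplified pseudocode of racing connection attempts across resolved IPs.
// Real browsers stagger the attempts rather than starting them all at once.
async function connect(hostname) {
  const ips = await dns.resolve(hostname);
  const connections = ips.map(ip => tryConnect(ip));
  return Promise.any(connections);
}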
Programming Languages
Many HTTP libraries try each resolved address in turn before reporting failure:
# Python requests example
import requests

try:
    response = requests.get('http://example.com', timeout=5)
except requests.exceptions.ConnectionError as exc:
    # Raised only after every resolved IP has been tried; the timeout
    # applies per attempt. ConnectTimeout is a subclass of ConnectionError.
    print(f"all addresses failed: {exc}")
Round-robin DNS alone isn't a complete HA solution because:
- DNS caching means clients may continue trying dead IPs until TTL expires
- No awareness of server health or load
- Uneven distribution if some clients cache DNS longer than others
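The caching problem is easy to reproduce in miniature. Below is a small sketch of a client-side cache, assuming a caller-supplied `resolve` function and an injectable clock; `TtlCache` is a hypothetical class, not a real resolver API.

```python
import time


class TtlCache:
    """Cache one DNS answer until its TTL expires (illustrative only)."""

    def __init__(self, resolve, ttl, clock=time.monotonic):
        self._resolve = resolve
        self._ttl = ttl
        self._clock = clock
        self._entry = None  # (expires_at, addresses)

    def lookup(self):
        now = self._clock()
        # Re-resolve only when there is no entry or the entry has expired.
        if self._entry is None or now >= self._entry[0]:
            self._entry = (now + self._ttl, self._resolve())
        return self._entry[1]
```

With a TTL of 300 seconds, a client that cached an answer just before a server died keeps receiving the stale answer for up to five minutes, regardless of what the authoritative server returns in the meantime.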
For production systems, consider combining with:
Health-Checking DNS
Services like Amazon Route 53 or NS1 provide DNS with health checks:
# Route 53 health check configuration
resource "aws_route53_health_check" "example" {
  ip_address        = "192.0.2.1"
  port              = 80
  type              = "HTTP"
  resource_path     = "/health"
  failure_threshold = 3
}
Client-Side Retry Logic
Implement explicit retries in your application code:
// Node.js with retry logic
// With axios-retry v4+ under CommonJS, use require('axios-retry').default
const axiosRetry = require('axios-retry');
const axios = require('axios');

axiosRetry(axios, {
  retries: 3,
  // Retry on network errors and on 5xx responses
  retryCondition: (error) => {
    return axiosRetry.isNetworkError(error) ||
      (error.response && error.response.status >= 500);
  }
});
When using round-robin DNS, implement:
- DNS resolution monitoring to ensure all IPs are returned
- Endpoint availability checks for each IP
- TTL expiration tracking to detect caching issues
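The first two checks can be combined into a small audit routine. This is a rough sketch under stated assumptions: `resolve_a_records` and `audit` are hypothetical helper names, the expected address list is supplied by the operator, and the reachability probe is passed in so it can be a TCP connect, an HTTP health request, or anything else.

```python
import socket


def resolve_a_records(hostname):
    """Return the sorted, de-duplicated IPv4 addresses for a hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def audit(expected, resolved, is_reachable):
    """Report addresses missing from DNS and resolved addresses that fail a probe."""
    missing = sorted(set(expected) - set(resolved))
    unreachable = sorted(ip for ip in resolved if not is_reachable(ip))
    return {"missing": missing, "unreachable": unreachable}
```

A monitoring job might call `audit(expected_ips, resolve_a_records("example.com"), probe)` on a schedule and alert when either list is non-empty. Keep in mind that the local resolver may itself serve a cached answer, so querying the authoritative servers directly gives a more reliable picture.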