When migrating services between servers, the DNS TTL (Time-To-Live) becomes critical. The 24-hour TTL you initially set means recursive resolvers worldwide may have cached your records for up to that long. By reducing it to 300 seconds (5 minutes), you're instructing resolvers to refresh more frequently, but existing caches still honor the original TTL until their entries expire.
For clean DNS handoff:
Original state: TTL=86400 (24h)
Update 1: TTL=300 (5min) at 00:00 UTC
Wait period: 24 hours (full original TTL)
Update 2: Change IP records at 24:00 UTC
Propagation: Resolvers that honor the TTL pick up the change within about 5 minutes
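The wait period in the timeline above is plain arithmetic: the IP change is safe once one full original TTL has passed since the TTL was lowered. A minimal Python sketch of that calculation follows; the function name `earliest_cutover` and the sample date are illustrative assumptions, not part of any DNS tool.

```python
from datetime import datetime, timedelta, timezone


def earliest_cutover(ttl_lowered_at: datetime, old_ttl_seconds: int) -> datetime:
    """Return the earliest time at which every cache that honors TTLs
    has dropped the record it fetched under the old TTL."""
    return ttl_lowered_at + timedelta(seconds=old_ttl_seconds)


# Hypothetical date; the TTL is lowered at 00:00 UTC as in the timeline.
lowered = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
cutover = earliest_cutover(lowered, old_ttl_seconds=86400)
print(cutover.isoformat())  # 2024-01-02T00:00:00+00:00
```

Resolvers that ignore TTLs are not covered by this calculation, so treat the result as a lower bound.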
If immediate migration is necessary:
- Implement a blue-green deployment with load balancing
- Use DNS failover services like Amazon Route 53 health checks
- Maintain session stickiness during transition with cookies
# Example AWS CLI command for immediate failover
aws route53 change-resource-record-sets \
  --hosted-zone-id Z1PA6795UKMFR9 \
  --change-batch file://new_records.json
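The command above reads its changes from `new_records.json`. As a rough sketch, that file could be generated with Python as shown here. The helper name `build_change_batch`, the record name, and the address `192.0.2.2` are placeholders I've assumed for illustration; substitute your own values.

```python
import json


def build_change_batch(name: str, new_ip: str, ttl: int = 300) -> dict:
    """Build a Route 53 change batch that points an A record at new_ip."""
    return {
        "Comment": "Cut over to the new server",
        "Changes": [
            {
                # UPSERT creates the record, or overwrites it if it exists.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": name,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": new_ip}],
                },
            }
        ],
    }


# Placeholder name and documentation-range IP; replace with real values.
batch = build_change_batch("example.com.", "192.0.2.2")
print(json.dumps(batch, indent=2))
```

Redirecting the printed output to `new_records.json` gives a file the CLI call can consume.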
Check global DNS propagation using:
dig +short example.com @8.8.8.8 # Google DNS
dig +short example.com @1.1.1.1 # Cloudflare
nslookup example.com 208.67.222.222 # OpenDNS
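If you'd rather not eyeball the output of each command, the same check can be scripted. The sketch below assumes `dig` is installed and on your PATH; the function names `dig_short` and `propagation_status` and the address `192.0.2.2` are hypothetical. The lookup is passed in as a parameter so it can be swapped out for testing.

```python
import subprocess
from typing import Callable, Dict, List


def dig_short(name: str, resolver: str) -> List[str]:
    """Run `dig +short name @resolver` and return the answer lines."""
    result = subprocess.run(
        ["dig", "+short", name, f"@{resolver}"],
        capture_output=True, text=True, timeout=10, check=False,
    )
    return result.stdout.split()


def propagation_status(
    name: str,
    expected_ip: str,
    resolvers: List[str],
    lookup: Callable[[str, str], List[str]] = dig_short,
) -> Dict[str, bool]:
    """Map each resolver to True if it already returns expected_ip."""
    return {r: expected_ip in lookup(name, r) for r in resolvers}
```

Calling `propagation_status("example.com", "192.0.2.2", ["8.8.8.8", "1.1.1.1", "208.67.222.222"])` returns one boolean per resolver, so you can loop until all of them are true.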
Pre-migration checklist:
- Reduce the TTL at least 24h (one full original TTL) before migration
- Configure monitoring for both old and new endpoints
- Prepare a rollback procedure (reverting the IP changes)
- Update SPF/DKIM records if handling email
When migrating services between servers, DNS Time-To-Live (TTL) becomes critical. The scenario you described is classic - reducing TTL from 24 hours to 5 minutes (300 seconds) before a migration window. Here's what happens at each stage:
# Example dig command showing TTL value
$ dig example.com +noall +answer
example.com. 86400 IN A 192.0.2.1
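The second column of that answer line is the TTL in seconds. If you want to track it in a script, a small parser along these lines should do; `Answer` and `parse_answer_line` are names I've made up, and the sketch assumes plain numeric TTLs (that is, `dig` run without `+ttlunits`). Note that a recursive resolver reports the remaining cache time, so the number counts down between queries.

```python
from typing import NamedTuple


class Answer(NamedTuple):
    name: str
    ttl: int
    rclass: str
    rtype: str
    rdata: str


def parse_answer_line(line: str) -> Answer:
    """Split one line of `dig +noall +answer` output into its fields."""
    name, ttl, rclass, rtype, rdata = line.split(None, 4)
    return Answer(name, int(ttl), rclass, rtype, rdata)


answer = parse_answer_line("example.com. 86400 IN A 192.0.2.1")
print(answer.ttl)         # 86400
print(answer.ttl / 3600)  # 24.0
```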
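The lower TTL takes effect at your authoritative servers as soon as you publish it, so technically you don't need to wait the full 24 hours before migrating. However, any resolver that cached the record just before the change can keep serving it for up to 24 hours, so you should consider: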
- Clients and DNS resolvers that cached the old record with 24h TTL
- Intermediate DNS servers that may ignore your new TTL
- ISPs that implement longer caching than specified
Here's how I handle such migrations:
# Migration timeline example
1. T-48h: Reduce TTL to 5 minutes
2. T-0h:
   - Take application offline
   - Perform data sync
   - Update DNS records
3. T+5m: New server should receive traffic
4. T+24h: Full propagation expected
Use these methods to verify propagation:
# Using dig to check authoritative servers
$ dig @ns1.yourprovider.com example.com +short
# Checking local resolver cache
$ dig example.com +nocmd +noall +answer +ttlunits
# Global DNS check (using Google's DNS)
$ dig @8.8.8.8 example.com +short
For critical migrations, consider these additional measures:
- Implement a health check endpoint on both old and new servers
- Use weighted DNS records to shift traffic gradually during the transition
- Prepare rollback procedures in case of issues
Here's how I handled a recent migration for a Python web app:
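# Health check endpoint example
from datetime import datetime, timezone

from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/health')
def health_check():
    # Report which server answered so old and new can be told apart
    return jsonify({
        'status': 'healthy',
        'server': 'new-prod-01',
        'timestamp': datetime.now(timezone.utc).isoformat()
    }), 200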
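To watch both servers during the cutover, something like the following could poll that endpoint on each of them. This is only a sketch using the standard library; `fetch_json`, `check_servers`, and any hostnames you pass in are assumptions of mine, and the fetch function is injectable so the logic can be exercised without a network.

```python
import json
from typing import Callable, Dict
from urllib.request import urlopen


def fetch_json(url: str) -> dict:
    """GET url and decode the JSON body (raises on network or HTTP errors)."""
    with urlopen(url, timeout=5) as response:
        return json.load(response)


def check_servers(
    urls: Dict[str, str],
    fetch: Callable[[str], dict] = fetch_json,
) -> Dict[str, bool]:
    """Map each label to True if its /health endpoint reports 'healthy'."""
    results = {}
    for label, url in urls.items():
        try:
            results[label] = fetch(url).get("status") == "healthy"
        except (OSError, ValueError):
            # Connection failures and malformed JSON both count as unhealthy.
            results[label] = False
    return results
```

Pass it a dict such as `{"old": "http://<old-host>/health", "new": "http://<new-host>/health"}` with your own hosts filled in, and run it on a schedule until the old server stops receiving traffic.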