How to Force CloudFront to Always Serve Fresh HTML from S3: Cache Control Headers and Workarounds


When hosting static websites on AWS S3 with CloudFront, many developers encounter unexpected caching behavior despite setting aggressive cache-control headers. Here's what's happening under the hood:

// Typical S3 object metadata settings
{
  "CacheControl": "no-cache, no-store, max-age=0, must-revalidate",
  "Expires": "Thu, 01 Jan 1970 00:00:00 GMT"
}

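CloudFront decides how long to keep an object from both the origin headers and the TTL settings of the matching cache behavior (or cache policy), so the origin headers alone do not always win. The pieces involved are:

  • Edge location caches (the cache key can vary by the viewer request headers, cookies, and query strings you include in it)
  • Regional edge caches (shared between edge locations)
  • The Minimum, Default, and Maximum TTL settings, which bound whatever the origin asks for

If Minimum TTL is greater than 0, CloudFront keeps the object for at least that long even when the origin sends no-cache, no-store, or private. Objects cached before you changed the S3 metadata also stay cached until they expire or are invalidated.

// Example HTTP response showing the problem
HTTP/1.1 200 OK
X-Cache: Hit from cloudfront  // Served from CloudFront's cache, not fetched from S3
Cache-Control: max-age=3600   // Header stored with the cached copy, not your new metadata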
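
The interaction between origin headers and the TTL settings can be sketched as a small function. This is a simplified model of the rules in CloudFront's documentation, not CloudFront's actual code. The name effectiveTtl and the policy fields minTtl, defaultTtl, and maxTtl are made up for illustration, and the Expires header is ignored.

```javascript
// Simplified model: how long (in seconds) CloudFront would keep an object,
// given the cache behavior's TTL settings and the origin's Cache-Control value.
// Hypothetical helper for illustration; Expires is not handled.
function effectiveTtl(policy, cacheControl) {
  const { minTtl, defaultTtl, maxTtl } = policy;
  const directives = (cacheControl || '')
    .toLowerCase()
    .split(',')
    .map((d) => d.trim())
    .filter(Boolean);

  const has = (name) => directives.includes(name);
  const valueOf = (name) => {
    const hit = directives.find((d) => d.startsWith(`${name}=`));
    if (hit === undefined) return undefined;
    const seconds = Number(hit.slice(name.length + 1));
    return Number.isFinite(seconds) ? seconds : undefined;
  };

  // no-cache, no-store, and private are only fully respected when minTtl is 0;
  // otherwise the object is kept for the minimum TTL.
  if (has('no-cache') || has('no-store') || has('private')) {
    return minTtl;
  }

  // s-maxage applies to shared caches such as CloudFront and wins over max-age.
  const sMaxAge = valueOf('s-maxage');
  const requested = sMaxAge !== undefined ? sMaxAge : valueOf('max-age');

  // No lifetime from the origin: fall back to the default TTL (or the minimum, if larger).
  if (requested === undefined) {
    return Math.max(minTtl, defaultTtl);
  }

  // Otherwise the origin's value is clamped between the minimum and maximum TTL.
  return Math.min(Math.max(requested, minTtl), maxTtl);
}
```

With a Minimum TTL of 0 the no-cache header from the S3 metadata above yields a lifetime of 0 seconds in this model, while a Minimum TTL of 3600 keeps the same object for an hour.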

Option 1: Versioned URLs

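Append a query string or change the path whenever a file changes. Query strings only help if the cache behavior includes them in the cache key; otherwise CloudFront ignores them and keeps serving the same cached object: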

<link href="/styles.css?v=123" rel="stylesheet">
<script src="/app.js?build=20230101"></script>

Option 2: Invalidation Requests

Programmatic cache clearing via AWS CLI:

aws cloudfront create-invalidation \
  --distribution-id EDFDVBD6EXAMPLE \
  --paths "/index.html" "/about/*"

Option 3: Lambda@Edge

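For dynamic cache control, rewrite the header in a function attached to the origin-response event, so that both CloudFront and browsers see the new value: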

exports.handler = (event, context, callback) => {
  const response = event.Records[0].cf.response;
  response.headers['cache-control'] = [{
    key: 'Cache-Control',
    value: 'no-cache, no-store, must-revalidate'
  }];
  callback(null, response);
};

Another workaround is to serve the HTML directly from S3 and put only the static assets behind CloudFront:

// DNS configuration example (the bucket must be named after the hostname)
www.example.com    → www.example.com.s3-website-us-east-1.amazonaws.com
static.example.com → d123.cloudfront.net

This architecture ensures HTML always comes fresh from S3 while static assets still benefit from CloudFront's CDN capabilities. The trade-off is that S3 website endpoints serve HTTP only, so the HTML loses HTTPS and edge delivery.

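For precise control, set Cache-Control in the S3 object metadata:

Cache-Control: public, max-age=0, s-maxage=0, must-revalidate

CloudFront honors max-age, s-maxage, and Expires from the origin, and uses s-maxage for its own cache when both lifetimes are present. Surrogate-Control is a convention of other CDNs that CloudFront's documentation does not list as supported, and X-Cache-Control-Override is not a standard header, so neither should be expected to change CloudFront's behavior.

Remember to set CloudFront's Minimum TTL to 0 in the cache behavior settings; with a higher value, CloudFront caches for at least that long regardless of these headers.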


Here's why your no-cache directives might not work as expected:

// Typical problematic response headers
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
X-Cache: Hit from cloudfront  // Still getting cached version!

CloudFront implements multiple caching mechanisms:

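  • Edge Location Cache: Default TTL is 24 hours, applied when the origin sends no Cache-Control or Expires header
  • Regional Edge Caches: Intermediate layer between edge locations and the origin
  • Origin Fetch: Only occurs when neither cache layer holds a fresh copy of the object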

Solution 1: Versioned File Names

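The most reliable approach is to put a version in the file name, so that every change produces a new URL that no cache has seen yet. A page that visitors reach through a fixed URL still needs a stable name, so pair this with one of the other solutions for that page: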

// Before
index.html

// After
index.v2.html
index.<md5hash>.html
index.<timestamp>.html
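
A build step can derive the versioned name from the file's content, so the name changes only when the content does. The sketch below uses Node's built-in crypto and path modules. The function name hashedFileName and the 8-character hash length are arbitrary choices for illustration.

```javascript
const crypto = require('node:crypto');
const path = require('node:path');

// Returns a file name with a short content hash inserted before the extension,
// e.g. "index.html" becomes "index.<hash>.html".
// hashedFileName is a hypothetical helper, not part of any AWS tooling.
function hashedFileName(fileName, content) {
  const { dir, name, ext } = path.posix.parse(fileName);
  const hash = crypto
    .createHash('md5')      // md5 is fine here: it is a fingerprint, not a security measure
    .update(content)
    .digest('hex')
    .slice(0, 8);           // 8 hex characters keep the name short (assumed length)
  return path.posix.join(dir, `${name}.${hash}${ext}`);
}
```

Identical content always maps to the same name, so unchanged files keep their cached copies across deployments.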

Solution 2: Programmatic Cache Invalidation

Use the AWS CLI to force cache clearing (a wildcard path such as "/*" is billed as a single invalidation path):

aws cloudfront create-invalidation \
    --distribution-id E1A2B3C4D5E6F7 \
    --paths "/*"

Or via AWS SDK (JavaScript example):

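const { CloudFrontClient, CreateInvalidationCommand } = require('@aws-sdk/client-cloudfront');

// CloudFront is a global service; the SDK client still needs a region.
const client = new CloudFrontClient({ region: 'us-east-1' });

async function invalidateAll() {
  const command = new CreateInvalidationCommand({
    DistributionId: 'E1A2B3C4D5E6F7',
    InvalidationBatch: {
      // Must be unique per request; a timestamp is enough here.
      CallerReference: Date.now().toString(),
      Paths: {
        Quantity: 1,     // Must equal Items.length
        Items: ['/*']
      }
    }
  });

  // CommonJS has no top-level await, so send the command inside an async function.
  return client.send(command);
}

invalidateAll().catch(console.error);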
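
When the list of paths comes from a deployment script, Quantity has to stay in step with Items. A small helper can build the input object for CreateInvalidationCommand from any list of paths. This is only a sketch: buildInvalidationInput is a hypothetical name, and the leading-slash normalization and de-duplication are conveniences added here, not SDK features.

```javascript
// Builds the input object for CreateInvalidationCommand from a list of paths.
// Hypothetical helper: it normalizes and de-duplicates paths and keeps
// Quantity equal to the number of Items.
function buildInvalidationInput(distributionId, paths, callerReference = Date.now().toString()) {
  const items = [...new Set(
    paths.map((p) => (p.startsWith('/') ? p : `/${p}`)) // invalidation paths start with "/"
  )];

  if (items.length === 0) {
    throw new Error('At least one path is required');
  }

  return {
    DistributionId: distributionId,
    InvalidationBatch: {
      CallerReference: callerReference, // must be unique per invalidation request
      Paths: {
        Quantity: items.length,
        Items: items
      }
    }
  };
}
```

The result can be passed straight to the command constructor, for example new CreateInvalidationCommand(buildInvalidationInput('E1A2B3C4D5E6F7', ['/index.html', '/about/*'])).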

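For HTML files that change frequently, use these S3 metadata headers:

Cache-Control: no-cache, no-store, must-revalidate, max-age=0
Expires: Thu, 01 Jan 1970 00:00:00 GMT

Expires takes an HTTP date, so use one in the past rather than 0. Pragma: no-cache is an HTTP/1.0 leftover and is not among the system metadata fields S3 lets you set, so it is omitted here.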

For supporting assets (CSS/JS):

Cache-Control: public, max-age=31536000, immutable
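
An upload script can pick between these two header values by file extension. The sketch below assumes that only .css and .js files are long-lived and that everything else should get the conservative HTML policy. The function name cacheControlFor and that default are illustrative choices.

```javascript
const path = require('node:path');

// Header values taken from the two policies described above.
const HTML_POLICY = 'no-cache, no-store, must-revalidate, max-age=0';
const ASSET_POLICY = 'public, max-age=31536000, immutable';

// Assumption: only these extensions are safe to cache for a year.
// "immutable" is only appropriate when the file names are versioned.
const LONG_LIVED_EXTENSIONS = new Set(['.css', '.js']);

// Returns the Cache-Control value to store as S3 metadata for an object key.
// Unknown file types fall back to the no-cache policy to stay on the safe side.
function cacheControlFor(objectKey) {
  const ext = path.posix.extname(objectKey).toLowerCase();
  return LONG_LIVED_EXTENSIONS.has(ext) ? ASSET_POLICY : HTML_POLICY;
}
```

The returned string is what would go into the CacheControl field when the object is uploaded.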

Consider separating dynamic and static content:

www.example.com    → S3 bucket (HTML with short TTL)
static.example.com → CloudFront (assets with long TTL)

Check these response headers to verify cache status:

X-Cache: Miss from cloudfront  // Fresh fetch
X-Cache: Hit from cloudfront   // Cached version
Age: 3600                      // Seconds in cache
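
The same check can be automated in a smoke test after a deployment. The helper below is a minimal sketch that interprets a plain object of response headers. The name describeCacheStatus and the shape of its return value are assumptions made for this example.

```javascript
// Interprets CloudFront's diagnostic headers from a plain header object.
// Hypothetical helper: header names are matched case-insensitively.
function describeCacheStatus(headers) {
  const normalized = {};
  for (const [name, value] of Object.entries(headers)) {
    normalized[name.toLowerCase()] = String(value);
  }

  // "Hit from cloudfront" means the edge answered from its cache.
  const xCache = (normalized['x-cache'] || '').trim();
  const servedFromCache = /^hit from cloudfront/i.test(xCache);

  // Age is the number of seconds the object has spent in the cache, when present.
  const age = Number(normalized.age);
  const ageSeconds = 'age' in normalized && Number.isFinite(age) ? age : null;

  return { servedFromCache, ageSeconds };
}
```

For an HTML page that must always be fresh, servedFromCache should be false on every request.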

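For granular control, attach a Lambda@Edge function to the origin-response event and set the header per file type. Cache-Control has to be set on the response; adding it to the request does not change how CloudFront caches the object:

exports.handler = (event, context, callback) => {
  // Origin-response events carry both the original request and the response from S3.
  const { request, response } = event.Records[0].cf;

  if (request.uri.endsWith('.html')) {
    response.headers['cache-control'] = [{
      key: 'Cache-Control',
      value: 'no-cache, no-store, must-revalidate, max-age=0'
    }];
  }

  callback(null, response);
};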
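
A handler like this can be exercised locally before it is deployed by feeding it a hand-built event. The sketch below assumes the function is attached to the origin-response event and sets the header on the response. The event contains only the fields the handler reads, and fakeOriginResponseEvent and invoke are hypothetical helpers.

```javascript
// Handler under test: sets Cache-Control on responses for .html requests.
const handler = (event, context, callback) => {
  const { request, response } = event.Records[0].cf;
  if (request.uri.endsWith('.html')) {
    response.headers['cache-control'] = [{
      key: 'Cache-Control',
      value: 'no-cache, no-store, must-revalidate, max-age=0'
    }];
  }
  callback(null, response);
};

// Minimal stand-in for an origin-response event; real events carry more fields.
function fakeOriginResponseEvent(uri) {
  return {
    Records: [{
      cf: {
        request: { uri, method: 'GET', headers: {} },
        response: { status: '200', statusDescription: 'OK', headers: {} }
      }
    }]
  };
}

// Calls the handler and returns the response it passes to the callback.
// The handler invokes the callback synchronously, so the result is available on return.
function invoke(uri) {
  let result;
  handler(fakeOriginResponseEvent(uri), {}, (err, res) => {
    if (err) throw err;
    result = res;
  });
  return result;
}
```

Calling invoke('/index.html') returns a response that carries the no-cache header, while invoke('/app.js') returns the response with its headers untouched.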