Decoding Nginx ETag Generation: Algorithm, Structure, and Custom Implementation


The ETag format "554b73dc-6f0d" you're observing consists of two hexadecimal components:

Last-Modified timestamp (hex): 554b73dc
File size (hex): 6f0d
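
To see what those two fields mean in practice, the ETag can be decoded back into a timestamp and a byte count. The following is a minimal Python sketch; the function name decode_nginx_etag is made up for illustration and is not part of Nginx.

```python
from datetime import datetime, timezone


def decode_nginx_etag(etag):
    """Split an Nginx-style ETag into (last-modified time in UTC, size in bytes)."""
    # Drop an optional weak prefix, then the surrounding double quotes.
    if etag.startswith("W/"):
        etag = etag[2:]
    mtime_hex, size_hex = etag.strip('"').split("-")
    mtime = datetime.fromtimestamp(int(mtime_hex, 16), tz=timezone.utc)
    return mtime, int(size_hex, 16)


mtime, size = decode_nginx_etag('"554b73dc-6f0d"')
print(mtime.isoformat(), size)  # 2015-05-07T14:17:00+00:00 28429
```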

Nginx calculates ETags using this standard approach:

etag = hex(last_modified_time) + "-" + hex(content_length)

For example, given:

Last-Modified: Wed, 05 May 2021 08:30:20 GMT (1620203420 in Unix time)
Content-Length: 28429 bytes

The ETag becomes:

"6092579c-6f0d"
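
The formula above can be reproduced outside Nginx, which is handy for checking a value by hand. This is a small Python sketch, not Nginx code; the function name nginx_etag is an assumption, and the inputs are the timestamp and size behind the "554b73dc-6f0d" ETag quoted at the top.

```python
def nginx_etag(last_modified, content_length):
    """Format a quoted ETag: lowercase-hex mtime, a dash, lowercase-hex length."""
    return '"{:x}-{:x}"'.format(last_modified, content_length)


# 1431008220 is Thu, 07 May 2015 14:17:00 GMT; 28429 is the size in bytes.
print(nginx_etag(1431008220, 28429))  # "554b73dc-6f0d"
```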

The actual implementation can be found in ngx_http_core_module.c:

etag->value.len = ngx_sprintf(etag->value.data, "\"%xT-%xO\"",
                              r->headers_out.last_modified_time,
                              r->headers_out.content_length_n)
                  - etag->value.data;

Understanding this format matters when:

  • Implementing custom caching logic
  • Running distributed systems with multiple Nginx instances
  • Testing cache validation workflows

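You can override the default behavior with Lua. The sketch below assumes a lua_shared_dict named cache declared in the http block and the third-party LuaFileSystem (lfs) module, since ngx_lua itself has no file-stat API:

location / {
    # Assumes "lua_shared_dict cache 1m;" in the http block
    header_filter_by_lua_block {
        local lfs = require "lfs"
        local file = "/path/to/file"
        -- Shared dicts store scalars only, so cache the finished ETag string
        local etag = ngx.shared.cache:get(file)
        if not etag then
            local attr = lfs.attributes(file)
            if attr then
                etag = string.format("\"%x-%x\"", attr.modification, attr.size)
                -- Expire after 60 s so a changed file gets a fresh ETag
                ngx.shared.cache:set(file, etag, 60)
            end
        end
        if etag then
            ngx.header["ETag"] = etag
        end
    }
}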

The default implementation is lightweight but has limitations:

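  • ETags change whenever a file's mtime changes, even if its bytes do not (for example after a redeploy or touch)
  • Poorly suited to content that is regenerated frequently but stays byte-identical
  • Servers in a cluster return different ETags for the same file when its Last-Modified timestamp differs between nodes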
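
The cluster issue is easy to reproduce locally. The Python sketch below is illustrative only: the two file names stand in for hypothetical cluster nodes, and nginx_style_etag is a made-up helper that mirrors the mtime-size formula rather than calling Nginx.

```python
import os
import tempfile


def nginx_style_etag(path):
    """Build an mtime-size ETag for a file, mirroring the formula above."""
    st = os.stat(path)
    return '"{:x}-{:x}"'.format(int(st.st_mtime), st.st_size)


with tempfile.TemporaryDirectory() as tmp:
    # Two hypothetical nodes holding byte-identical copies of one file.
    node_a = os.path.join(tmp, "node_a.css")
    node_b = os.path.join(tmp, "node_b.css")
    for path in (node_a, node_b):
        with open(path, "wb") as f:
            f.write(b"body { margin: 0 }\n")

    # Simulate deployments that finished 90 seconds apart.
    os.utime(node_a, (1431008220, 1431008220))
    os.utime(node_b, (1431008310, 1431008310))

    etag_a = nginx_style_etag(node_a)
    etag_b = nginx_style_etag(node_b)

# Same bytes, different mtimes, so the two ETags disagree.
print(etag_a, etag_b, etag_a == etag_b)
```

One common mitigation is to preserve modification times when copying files to each node (for example with rsync -t), so every server derives the same value.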

For more robust solutions:

# Using file digest
openssl sha1 filename | awk '{print "\""$2"\""}'

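# Reusing an upstream's Content-MD5 header via add_header (ngx_http_headers_module).
# Stock Nginx has no $content_md5 variable; $upstream_http_content_md5 is
# only set for proxied responses that carry the header.
add_header ETag '"$upstream_http_content_md5"';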
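
For comparison, the file-digest idea can be sketched in Python. This is an assumption-laden example rather than anything Nginx does itself: digest_etag is a hypothetical helper, and the demo file is a throwaway created only to exercise it.

```python
import hashlib
import os
import tempfile


def digest_etag(path, chunk_size=65536):
    """Return a quoted SHA-1 hex digest of the file's bytes for use as an ETag."""
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha1.update(chunk)
    return '"{}"'.format(sha1.hexdigest())


# Demo on a throwaway file; the contents are made up.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"hello nginx\n")
etag = digest_etag(demo_path)
os.remove(demo_path)
print(etag)
```

Because the value depends only on the bytes, identical copies on different servers get the same ETag regardless of mtime, at the cost of reading the whole file.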

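Remember that strong ETags (the default) and weak ETags (prefixed with W/) are treated differently by browsers and caches: a strong ETag asserts byte-for-byte identity and can be used for range requests, while a weak ETag only asserts semantic equivalence. Since version 1.7.3, Nginx itself downgrades a strong ETag to a weak one when a filter such as gzip modifies the response.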
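
The difference shows up in how two entity tags are compared. The Python sketch below follows the strong and weak comparison rules from the HTTP specification (RFC 9110, section 8.8.3.2); the function names are made up for illustration.

```python
def _split(etag):
    """Return (is_weak, opaque_tag) for an entity tag such as W/"abc" or "abc"."""
    if etag.startswith("W/"):
        return True, etag[2:]
    return False, etag


def strong_match(a, b):
    """Strong comparison: both tags must be strong and character-identical."""
    weak_a, tag_a = _split(a)
    weak_b, tag_b = _split(b)
    return not weak_a and not weak_b and tag_a == tag_b


def weak_match(a, b):
    """Weak comparison: opaque tags must match; the W/ prefix is ignored."""
    return _split(a)[1] == _split(b)[1]


print(strong_match('"554b73dc-6f0d"', 'W/"554b73dc-6f0d"'))  # False
print(weak_match('"554b73dc-6f0d"', 'W/"554b73dc-6f0d"'))    # True
```

If-None-Match uses the weak comparison, whereas If-Match and If-Range require the strong one, which is why a weak ETag cannot validate a range request.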


The ETag format you're seeing (e.g., "554b73dc-6f0d") in Nginx consists of two hexadecimal components:

Last-Modified-Time-Hex - Content-Length-Hex

Nginx generates ETags by combining:

1. The last modified timestamp of the file (in hexadecimal)
2. The content length of the file (in hexadecimal)

Here's a simplified representation of the algorithm:

etag = hex(last_modified_time) + "-" + hex(content_length)

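Looking at Nginx's ngx_http_core_module.c, we find the actual implementation (abridged here):


ngx_int_t
ngx_http_set_etag(ngx_http_request_t *r)
{
    ngx_table_elt_t  *etag;

    /* ... header allocation and error handling omitted ... */

    /* Write the quoted "<mtime hex>-<length hex>" value; %xT and %xO are
       ngx_sprintf's hex formats for time_t and off_t */
    etag->value.len = ngx_sprintf(etag->value.data, "\"%xT-%xO\"",
                                  r->headers_out.last_modified_time,
                                  r->headers_out.content_length_n)
                      - etag->value.data;

    /* ... */

    return NGX_OK;
}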

Let's simulate how Nginx would generate an ETag for a file with:

Last-Modified: 1431008220 (Unix timestamp)
Content-Length: 28429 bytes

The calculation would be:

1. Convert timestamp to hex: 1431008220 → 554b73dc
2. Convert length to hex: 28429 → 6f0d
3. Combine: "554b73dc-6f0d"

Stock Nginx cannot change the ETag format, but the etag directive turns generation on or off:


location /static/ {
    etag on;      # default: emit the mtime-size ETag
    # etag off;   # use this instead to suppress the header
}

The current algorithm has advantages:

  • Fast computation (no file hashing required)
  • Works well for static files
  • Minimal CPU overhead

But be aware that for dynamically generated content, you might want to implement custom ETags using ngx_http_lua_module:


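location /dynamic/ {
    content_by_lua_block {
        -- Placeholder body; a real handler would build its response here
        local response_content = "hello"
        -- Entity tags must be quoted strings; set headers before output
        ngx.header["ETag"] = '"' .. ngx.md5(response_content) .. '"'
        ngx.print(response_content)
    }
}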
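
A content-derived ETag only pays off if the handler also answers conditional requests. The Python sketch below shows that validation step for a dynamic response; respond is a hypothetical function, not an Nginx or ngx_lua API, and splitting the header on commas is a simplification that assumes the tags themselves contain no commas.

```python
import hashlib


def respond(body, if_none_match=None):
    """Return (status, headers, payload) for a GET, honoring If-None-Match."""
    etag = '"{}"'.format(hashlib.md5(body).hexdigest())
    headers = {"ETag": etag}
    if if_none_match is not None:
        # The header may carry "*" or a comma-separated list of entity tags.
        candidates = [t.strip() for t in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a W/ prefix is ignored.
        opaque = [t[2:] if t.startswith("W/") else t for t in candidates]
        if "*" in candidates or etag in opaque:
            return 304, headers, b""
    return 200, headers, body


status, headers, payload = respond(b"hello")
print(status, headers["ETag"])                    # 200 "5d41402abc4b2a76b9719d911017c592"
print(respond(b"hello", headers["ETag"])[0])      # 304
```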

Unlike Apache (whose FileETag directive can combine inode, modification time, and size) or cloud object stores (which often use MD5 hashes), Nginx's approach is simpler and works well for its primary use case of serving static files.