How to Use Nginx to Rewrite Outgoing Response URLs for Static Content Offloading



When dealing with legacy systems or poorly documented applications, modifying source code to point static assets to a CDN or cookieless domain can be challenging. In this specific case:

  • Original application runs on Apache with dynamic content mixed with static assets
  • All image paths are relative (e.g., /images/logo.png)
  • These references need to point to http://static.thedomain.com/images/logo.png instead, without changing the application

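The ngx_http_sub_module provides exactly what we need: it modifies response bodies before they are sent to clients, so asset URLs can be rewritten at the proxy without touching the application. The module is not built by default when compiling from source (it requires --with-http_sub_module), although many distribution packages include it. Check your build first:

# Check that the sub_filter module is compiled in (prints the flag if present)
nginx -V 2>&1 | grep -o with-http_sub_module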

Here's a full proxy configuration that handles the URL rewriting:

server {
    listen 80;
    server_name www.thedomain.com;

    location / {
        proxy_pass http://apache_backend;
        proxy_set_header Host $host;
        
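        # Ask the backend for uncompressed responses;
        # sub_filter cannot rewrite gzip-compressed bodies
        proxy_set_header Accept-Encoding "";

        # Replace every occurrence, not just the first.
        # text/html is filtered by default, so it need not be listed.
        sub_filter_once off;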
        
        # Rewrite image paths
        sub_filter 'src="/images/' 'src="http://static.thedomain.com/images/';
        
        # Handle CSS background images
        sub_filter 'url(/images/' 'url(http://static.thedomain.com/images/';
        
        # Handle JavaScript paths if needed
        sub_filter '"/js/' '"http://static.thedomain.com/js/';
    }
}

upstream apache_backend {
    server 192.168.1.100:8080;
}
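
The rewritten URLs only pay off if static.thedomain.com actually serves the files. A minimal sketch of such a server block, assuming the assets are readable at the hypothetical path /var/www/thedomain/static (with images/ and js/ beneath it) and that a 30-day browser cache lifetime is acceptable:

```nginx
server {
    listen 80;
    server_name static.thedomain.com;

    # Hypothetical document root; adjust to where the assets really live
    root /var/www/thedomain/static;

    location / {
        # Serve files directly and never fall through to the application
        try_files $uri =404;
        expires 30d;
        access_log off;
    }
}
```

This host never proxies to Apache, so it sets no application cookies itself. Browsers will still send existing cookies to it if the application scopes them to .thedomain.com rather than www.thedomain.com.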

For production environments, consider these enhancements:

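# Cache backend responses to reduce load on Apache (http context).
# The cache stores the upstream response; sub_filter still runs on
# every response sent to a client.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=rewrite_cache:10m inactive=60m;

server {
    # ... existing config ...
    
    proxy_cache rewrite_cache;
    proxy_cache_valid 200 302 10m;
    proxy_cache_valid 404 1m;
    
    # Also rewrite JavaScript and CSS (text/html is always included)
    sub_filter_types application/javascript text/css;

    # gunzip only decompresses for clients that do not accept gzip, so it
    # is not enough on its own: also send proxy_set_header Accept-Encoding ""
    # from the proxying location so the backend returns uncompressed bodies.
    gunzip on;
}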

Verify your configuration with:

curl -sI http://www.thedomain.com | grep -i "content-type"
curl -s http://www.thedomain.com | grep "static.thedomain.com"

While this solution works, be aware of:

  • CPU overhead increases with response size
  • Throughput may decrease for HTML-heavy pages (15-20% is a rough estimate; benchmark your own workload)
  • Consider adding more worker processes if needed
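
The worker-process suggestion can be sketched as follows, assuming the top-level nginx.conf is editable. The worker_connections value is an illustrative assumption, not a recommendation:

```nginx
# Main (top-level) context of nginx.conf
worker_processes auto;          # one worker process per available CPU core

events {
    worker_connections 1024;    # assumed value; tune for your traffic
}
```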

When dealing with legacy web applications, we often encounter situations where modifying the source code isn't immediately feasible. In this case, we have an Apache-hosted site that needs to offload static content to a cookieless domain (static.thedomain.com) for performance optimization, but the application code still references local paths like /images/....

Nginx's sub_filter module provides exactly what we need for this scenario. It allows content modification in outgoing responses while maintaining the original request flow. Here's the core approach:


http {
    upstream apache_backend {
        server 127.0.0.1:8080;
    }

    server {
        listen 80;
        server_name www.thedomain.com;

        location / {
            proxy_pass http://apache_backend;
            proxy_set_header Host $host;
            
            # Enable response processing
            proxy_set_header Accept-Encoding "";
            sub_filter_once off;
            
            # Transform image paths
            sub_filter 'src="/images/' 'src="http://static.thedomain.com/images/';
            sub_filter 'href="/images/' 'href="http://static.thedomain.com/images/';
        }
    }
}

Several technical points require attention when implementing this solution:

  • Compression Handling: The proxy_set_header Accept-Encoding ""; line disables upstream compression since we can't modify compressed responses
  • Multiple Replacements: sub_filter_once off ensures all occurrences are replaced, not just the first one
  • Content Types: By default, Nginx only processes text/html responses. For other types, use sub_filter_types
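
One related detail: the replacement string may contain Nginx variables. This helps when the site is served over both HTTP and HTTPS, where a hard-coded http:// prefix would trigger mixed-content warnings. A sketch combining the points above, assuming static.thedomain.com also answers on HTTPS and that the Nginx version is recent enough (1.9.4 or later) to allow several sub_filter directives in one location:

```nginx
location / {
    proxy_pass http://apache_backend;
    proxy_set_header Host $host;
    proxy_set_header Accept-Encoding "";

    sub_filter_once off;
    # Filter stylesheets in addition to the default text/html
    sub_filter_types text/css;

    # $scheme expands to http or https, matching the client's request
    sub_filter 'src="/images/' 'src="$scheme://static.thedomain.com/images/';
    sub_filter 'url(/images/'  'url($scheme://static.thedomain.com/images/';
}
```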

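Note that sub_filter matches fixed strings only; it does not support regular expressions. For pattern-based transformations, the third-party ngx_http_substitutions_filter_module provides subs_filter, where the r flag enables regex matching and g replaces all occurrences. Assuming that module is compiled in:


# Rewrite root-relative image paths only; already-absolute URLs are left alone
subs_filter 'src="(/images/[^"]*)"' 'src="http://static.thedomain.com$1"' gr;
# CSS url(...) with an optional opening quote, captured in $1
subs_filter 'url\((["\']?)(/images/)' 'url($1http://static.thedomain.com$2' gr;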

While response filtering does add some overhead, it's typically minimal compared to:

  • The network latency saved by serving static content directly
  • The reduced cookie overhead from using a cookieless domain
  • The development time saved versus immediate code changes

For cases where response filtering isn't sufficient, consider:

  1. Using Nginx's proxy_redirect for Location header modifications
  2. Implementing a Lua script with OpenResty for complex transformations
  3. Setting up a CDN that can perform origin URL rewriting
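
For the first option, note that sub_filter only touches response bodies, while redirects travel in the Location header. A sketch, assuming the backend issues redirects that expose its internal address (http://127.0.0.1:8080/, as in the upstream definition above):

```nginx
location / {
    proxy_pass http://apache_backend;
    proxy_set_header Host $host;

    # Turn "Location: http://127.0.0.1:8080/path" into "Location: /path"
    proxy_redirect http://127.0.0.1:8080/ /;
}
```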