How to Remove URL Substrings in NGINX Proxy_Pass: A Complete Rewrite Rule Guide


When working with NGINX reverse proxy configurations, developers often face URL manipulation challenges. Consider this common scenario:


Original URL: http://host:port/string_1/string_X/command?param=value
Desired Destination: http://internal_host:port/string_X/command?param=value

The naive approach using $request_uri fails because it preserves the original path:


location /string_1/ {
    proxy_pass http://internal_host:port/$request_uri;
    # Sends: http://internal_host:port//string_1/string_X/command?param=value
    # (the prefix is kept, and $request_uri's own leading slash doubles the "/")
}

One working approach combines a regex location with the $1 capture group:


location ~ ^/string_1/(.*)$ {
    proxy_pass http://internal_host:port/$1$is_args$args;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}

Regex Pattern (~ ^/string_1/(.*)$): Matches any path starting with /string_1/ and captures the remainder

$1: The first capture group containing everything after /string_1/

$is_args$args: Preserves query parameters (?param=value)
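
One caveat is worth sketching here. Because this proxy_pass contains variables, NGINX looks up the host name at request time, so a domain name must either match an upstream group or be resolvable through a resolver directive. A minimal sketch, assuming a hypothetical upstream group named internal_backend and a placeholder backend address of 10.0.0.5:8080 (neither comes from the original setup):


# In the http block:
upstream internal_backend {
    server 10.0.0.5:8080;   # placeholder address, adjust to your backend
}

server {
    listen 80;

    location ~ ^/string_1/(.*)$ {
        # The host matches the upstream group above, so no resolver is needed
        proxy_pass http://internal_backend/$1$is_args$args;
    }
}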

For multiple substring removals or complex rewrites:


location ~ ^/(string_1|prefix_2)/(.*)$ {
    proxy_pass http://backend/$2$is_args$args;
    # Removes either string_1 or prefix_2
}

While regex locations (~) are powerful, they have slightly higher overhead than prefix matches. For high-traffic sites:


location /string_1/ {
    rewrite ^/string_1/(.*)$ /$1 break;
    proxy_pass http://backend;
}

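This alternative selects the location by prefix and applies the regex only to requests that already matched it. The break flag stops further rewrite processing, and because proxy_pass has no URI part, the rewritten URI is sent to the backend together with the original query string.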
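
For the single-prefix case, a regex may not be needed at all. When proxy_pass carries a URI part (here a lone trailing slash), NGINX replaces the portion of the request URI that matched the prefix location with that URI and appends the query string itself. A sketch, reusing the backend name from the example above:


location /string_1/ {
    # /string_1/string_X/command?param=value
    #   is proxied as /string_X/command?param=value
    proxy_pass http://backend/;
}

Note that the trailing slash on both the location and the proxy_pass target matters: without the slash after backend, the request URI is passed through unchanged.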

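Enable rewrite logging when testing rewrite rules. rewrite_log writes its output to the error log at the notice level, so the debug level shown here is more than it strictly needs; full debug output also requires an NGINX binary built with --with-debug: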


error_log /var/log/nginx/rewrite.log debug;
rewrite_log on;

When configuring NGINX as a reverse proxy, we often need to modify the URL structure before passing requests to backend servers. A common scenario is removing a specific substring from the original request URI while preserving the remaining path and query parameters.

Consider this specific case:


Original URL: http://host:port/string_1/string_X/command?xxxxx
Should become: http://internal_host:port/string_X/command?xxxxx

Here are three effective methods to achieve this URL transformation:

Method 1: Using Regex Capture Groups


location ~ ^/string_1/(.*)$ {
    proxy_pass http://internal_host:port/$1$is_args$args;
}

Method 2: Rewrite Before Proxy


location /string_1/ {
    rewrite ^/string_1/(.*)$ /$1 break;
    proxy_pass http://internal_host:port;
}

Method 3: Advanced Regex Matching


location ~* ^/string_1(?<remaining_path>/.*)$ {
    proxy_pass http://internal_host:port$remaining_path$is_args$args;
}

When implementing these solutions:

  • Always test with various URL patterns
  • Consider edge cases like trailing slashes
  • Preserve query parameters ($args or $query_string)
  • Choose between case-sensitive (~) or insensitive (~*) matching
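
The trailing-slash edge case can be sketched as follows. The regex ^/string_1/(.*)$ does not match a request for /string_1 without the trailing slash, so that request falls through to some other location. An exact-match location can redirect it to the slashed form. The 301 status here is an assumption; a temporary redirect may suit some setups better:


location = /string_1 {
    # Redirect the bare prefix to its slashed form, keeping any query string
    return 301 /string_1/$is_args$args;
}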

Here's a complete NGINX configuration snippet:


server {
    listen 80;
    server_name proxy.example.com;
    
    location ~ ^/api/v1/(.*)$ {
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_pass http://backend-server:8080/$1$is_args$args;
        proxy_redirect off;
    }
}

Regex-based matching has some overhead compared to prefix matching. For high-traffic systems:

  • Prefer simpler patterns when possible
  • Consider using maps for complex transformations
  • Benchmark different approaches
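
The map suggestion above could look roughly like this. It is a sketch under assumptions, not a tuned configuration: the variable name $backend_uri and the capture name rest are made up for illustration, and backend is assumed to be a defined upstream group. The map keeps the stripping rules in one place, and the location itself stays a plain prefix match:


# In the http block:
map $uri $backend_uri {
    default                                  $uri;
    # Strip either prefix and keep the rest of the path
    ~^/(?:string_1|prefix_2)(?<rest>/.*)$    $rest;
}

server {
    listen 80;

    location / {
        # $backend_uri comes from the normalized $uri, so test encoded paths
        proxy_pass http://backend$backend_uri$is_args$args;
    }
}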

When debugging URL rewriting:


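# Add to the http block (log_format is not allowed inside a server block):
log_format rewrite_log '$remote_addr - $remote_user [$time_local] '
                      '"$request" $status $body_bytes_sent '
                      '"$http_referer" "$http_user_agent" '
                      'Original: $request_uri '
                      'Rewritten: $uri';

# Then reference the format from the server block (the log path is an example):
access_log /var/log/nginx/rewrite_access.log rewrite_log;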