How to Prevent Nginx from URL Decoding Proxy Pass Requests



When working with encoded URLs in nginx proxy configurations, you might encounter this behavior:

Original request: http://localhost:8080/foo/%5B-%5D
Expected backend request: GET /foo/%5B-%5D HTTP/1.1
Actual backend request: GET /foo/[-] HTTP/1.1

This automatic URL decoding can break applications expecting encoded characters, particularly with special characters like square brackets that have special meaning in URIs.

Nginx performs URI normalization by default, which includes:

  • Percent-decoding of encoded characters
  • Path component normalization
  • Duplicate slash removal

While generally useful, this behavior becomes problematic when your backend application expects raw, encoded URIs.
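The percent-decoding step is easy to reproduce outside nginx. A short sketch using only the Python standard library shows what decoding does to the example path:

```python
from urllib.parse import unquote

# nginx decodes "/foo/%5B-%5D" to this form before routing the request
raw_path = "/foo/%5B-%5D"
decoded = unquote(raw_path)
print(decoded)  # -> /foo/[-]
```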

To maintain the original encoded URI when proxying, use one of these approaches:

Method 1: Using proxy_pass with Variables

location /foo {
    proxy_pass http://localhost:8080$request_uri;
}

This preserves the original request URI, including the query string. Note the absence of a slash before $request_uri: the variable already begins with one, so adding another would send a doubled "//" to the backend.

Method 2: For Specific Path Prefixes

location /foo {
    proxy_pass http://localhost:8080;
}

When proxy_pass is given without a URI part (nothing after the host and port), nginx forwards the request URI in the same form the client sent it, encoding intact. Adding a rewrite, or a URI part to proxy_pass, switches back to the normalized, decoded URI, so keep the block free of both.

Method 3: Regex Locations

location ~* ^/foo/ {
    proxy_pass http://localhost:8080$request_uri;
}

Note that a regex location matches against the decoded $uri, which never contains the query string, so captures such as $1 lose the original encoding and a group like (?:\?(.*))? can never match anything. Pass $request_uri instead; it still holds the raw path and query string.

Verify the solution works by checking what reaches your backend:

# Simple test server (some netcat builds need: nc -l -p 8080)
nc -l 8080

# Test request
curl "http://localhost/foo/%5B-%5D"

The backend should now receive the encoded version: GET /foo/%5B-%5D HTTP/1.1
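If nc is not available, a minimal Python listener serves the same purpose. This sketch binds an ephemeral port and prints the raw, undecoded request line it receives; the client at the bottom stands in for the request nginx would send upstream (behind a real nginx you would bind port 8080 instead):

```python
import socket
import threading

srv = socket.socket()
srv.bind(("127.0.0.1", 0))   # ephemeral port; use ("", 8080) behind nginx
srv.listen(1)
port = srv.getsockname()[1]

result = []

def capture_request_line():
    # accept one connection and record the raw request line, undecoded
    conn, _ = srv.accept()
    result.append(conn.recv(4096).split(b"\r\n")[0].decode())
    conn.close()

t = threading.Thread(target=capture_request_line)
t.start()

# stand-in for the proxied request nginx would send upstream
cli = socket.socket()
cli.connect(("127.0.0.1", port))
cli.sendall(b"GET /foo/%5B-%5D HTTP/1.1\r\nHost: localhost\r\n\r\n")
t.join()
cli.close()
srv.close()
print(result[0])  # -> GET /foo/%5B-%5D HTTP/1.1
```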

  • Be cautious with $request_uri as it includes the query string
  • For complex routing, consider using map directives
  • Test edge cases with multiple encoded sequences

When working with APIs or legacy systems that require encoded URLs to remain unchanged, Nginx's default URL decoding behavior in proxy_pass can cause issues. Consider this scenario:

Original request to Nginx:
GET /foo/%5B-%5D HTTP/1.1

What backend receives:
GET /foo/[-] HTTP/1.1

Many servers (especially Java-based ones) will reject requests with unencoded special characters like square brackets, resulting in HTTP 400 errors.

Nginx automatically decodes percent-encoded characters before processing the URI. This is generally helpful for human-readable URLs but problematic when:

  • Backend systems expect encoded characters
  • Special characters have semantic meaning in the URL
  • Security systems validate exact URL patterns

There are several approaches to maintain the original encoded URI:

1. Using $request_uri Variable

The most reliable method is to use $request_uri, which contains the original, unparsed URI (raw path plus query string):

location /foo {
    proxy_pass http://backend$request_uri;
}

This preserves all encoded characters exactly as received.
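The difference between the two variables can be illustrated in Python. This sketch mimics how nginx derives the decoded $uri from the raw request target; the variable names mirror nginx's, and the parsing is a simplification of nginx's actual normalization:

```python
from urllib.parse import unquote

request_uri = "/foo/%5B-%5D?page=2"   # raw, exactly as sent by the client
path, _, args = request_uri.partition("?")
uri = unquote(path)                   # nginx's normalized, decoded $uri

print(uri)          # -> /foo/[-]
print(request_uri)  # -> /foo/%5B-%5D?page=2
```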

2. Proxying Without a URI Part

For prefix routing, proxy_pass without a URI part also keeps the original form:

location /foo {
    proxy_pass http://backend;
}

When proxy_pass carries no URI, nginx forwards the request URI exactly as the client sent it. Avoid combining this with rewrite: rewrites operate on the decoded $uri, and nginx's re-escaping does not necessarily restore the original encoding. (proxy_pass_request_headers is on by default and has no effect on URI encoding.)

3. Using Custom Variables

When you need partial processing, build the custom variable from $request_uri, not $uri: $uri holds the already-decoded, normalized path, so any variable derived from it reintroduces the problem. $request_uri also already includes the query string, so no "?$args" concatenation is needed:

location /foo {
    # $request_uri carries the raw path and query string;
    # $uri would be the decoded form and must not be used here
    set $encoded_uri $request_uri;
    proxy_pass http://backend$encoded_uri;
}
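Why the raw string matters when deriving custom variables: the same regex capture loses the escapes when applied to the decoded path but keeps them when applied to the raw request line. A small Python sketch of the two cases:

```python
import re
from urllib.parse import unquote

raw = "/foo/%5B-%5D"     # what the client sent (raw, no query string)
decoded = unquote(raw)   # what nginx exposes as the decoded $uri

from_decoded = re.match(r"^/foo(/.*)$", decoded).group(1)
from_raw = re.match(r"^/foo(/.*)$", raw).group(1)

print(from_decoded)  # -> /[-]       escapes are gone
print(from_raw)      # -> /%5B-%5D   escapes preserved
```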

Verify the solution works by checking what reaches your backend. For testing:

curl -v "http://nginx-server/foo/%5B-%5D"

Your backend should receive the exact same encoded path. For additional verification, use tcpdump or ngrep:

sudo ngrep -d any -W byline port 8080

Be aware that certain characters still require special handling:

  • Spaces (%20) may need additional validation
  • Double-encoded URLs might cause issues
  • Query parameters require special attention
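The double-encoding pitfall is easy to reproduce. In this sketch, a client that pre-encodes its percent signs ends up one decoding step away from the intended path, so a second decode anywhere downstream changes the request:

```python
from urllib.parse import unquote

double = "/foo/%255B-%255D"   # "%25" is an encoded "%"
once = unquote(double)        # the one decode nginx performs
twice = unquote(once)         # an unwanted second decode downstream

print(once)   # -> /foo/%5B-%5D
print(twice)  # -> /foo/[-]
```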

For query parameters, note that $uri is the decoded path and that "$uri?$args" appends a "?" even when the request has no arguments. $request_uri already carries the raw query string:

location /foo {
    proxy_pass http://backend$request_uri;
}

While these solutions work, they have different performance characteristics:

  Method         Memory   CPU
  $request_uri   Low      Low
  rewrite        Medium   Medium
  custom vars    High     High

For most use cases, $request_uri provides the best balance of functionality and performance.