When working with encoded URLs in nginx proxy configurations, you might encounter this behavior:
Original request to nginx: http://localhost/foo/%5B-%5D
Expected backend request: GET /foo/%5B-%5D HTTP/1.1
Actual backend request: GET /foo/[-] HTTP/1.1
This automatic URL decoding can break applications that expect encoded characters, particularly characters such as square brackets, which have special meaning in URIs.
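As a minimal illustration (assuming a backend listening on localhost:8080), the decoding is triggered whenever proxy_pass carries a URI component, such as a trailing slash:

server {
    listen 80;

    location /foo/ {
        # Because proxy_pass specifies a URI part, nginx substitutes the
        # decoded, normalized URI: /foo/%5B-%5D reaches the backend as /foo/[-]
        proxy_pass http://localhost:8080/foo/;
    }
}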
Nginx performs URI normalization by default, which includes:
- Percent-decoding of encoded characters (%5B becomes [)
- Path component normalization (resolving ./ and ../ segments)
- Duplicate slash removal (controlled by the merge_slashes directive, on by default)
While generally useful, this behavior becomes problematic when your backend application expects raw, encoded URIs. Note that nginx only substitutes the decoded URI when the URI is changed during processing, for example by a proxy_pass that carries a URI component or by a rewrite; a bare proxy_pass http://localhost:8080; forwards the original request URI untouched.
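If duplicate-slash merging is the specific normalization you need to avoid, it can be disabled on its own with the standard merge_slashes directive (this is independent of percent-decoding):

server {
    listen 80;
    merge_slashes off;   # keep literal // sequences in the request URI
}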
To maintain the original encoded URI when proxying, use one of these approaches:
Method 1: Using proxy_pass with Variables
location /foo {
    # $request_uri already begins with "/", so no extra slash is needed
    proxy_pass http://localhost:8080$request_uri;
}
This preserves the original request URI exactly as received, including the query string. Note that the .../$request_uri form, with an extra slash before the variable, would produce a double slash in the forwarded path.
Method 2: For Specific Path Prefixes
A plain rewrite does not help here: rewrite operates on the already-decoded $uri, and nginx re-escapes only a small set of characters when proxying, so %5B would still reach the backend as [. Capture from $request_uri instead, which is never decoded:
location /foo/ {
    # The if-regex runs against the raw request line, so $1 keeps its encoding
    if ($request_uri ~ "^/foo/(.*)$") {
        proxy_pass http://localhost:8080/foo/$1;
    }
    proxy_pass http://localhost:8080;
}
(proxy_pass_request_headers is on by default and has no effect on URI encoding, so it is omitted here.)
Method 3: Advanced Regex Matching
Location regexes match against the decoded URI and never see the query string, so patterns containing \? cannot match, and captures lose their encoding. Use the regex only for routing and forward $request_uri itself:
location ~* ^/foo/ {
    # $request_uri carries the raw path and query string
    proxy_pass http://localhost:8080$request_uri;
}
Verify the solution works by checking what reaches your backend:
# Simple test server (BSD netcat; traditional netcat needs -p: nc -l -p 8080)
nc -l 8080
# Test request
curl "http://localhost/foo/%5B-%5D"
The backend should now receive the encoded version: GET /foo/%5B-%5D HTTP/1.1
- Be cautious with $request_uri as it includes the query string
- For complex routing, consider using map directives (see the sketch after this list)
- Test edge cases with multiple encoded sequences
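A hedged sketch of the map approach (the variable name $raw_target and the /bar remap are illustrative only): map evaluates its regex against $request_uri, which is never decoded, so captures keep their percent-encoding, and variables used in proxy_pass are forwarded verbatim without re-encoding:

# http-level context
map $request_uri $raw_target {
    default               $request_uri;   # forward anything else unchanged
    "~^/foo/(?<rest>.*)$" /bar/$rest;     # remap the prefix, keeping %XX sequences
}

server {
    listen 80;
    location / {
        proxy_pass http://localhost:8080$raw_target;
    }
}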
When working with APIs or legacy systems that require encoded URLs to remain unchanged, Nginx's default URL decoding behavior in proxy_pass
can cause issues. Consider this scenario:
Original request to Nginx:
GET /foo/%5B-%5D HTTP/1.1
What backend receives:
GET /foo/[-] HTTP/1.1
Many servers (especially Java-based ones) will reject requests with unencoded special characters like square brackets, resulting in HTTP 400 errors.
Nginx automatically decodes percent-encoded characters before processing the URI. This is generally helpful for human-readable URLs but problematic when:
- Backend systems expect encoded characters
- Special characters have semantic meaning in the URL
- Security systems validate exact URL patterns
There are several approaches to maintain the original encoded URI:
1. Using the $request_uri Variable
The most reliable method is $request_uri, which contains the original, unparsed request URI (path plus query string, exactly as sent by the client):
location /foo {
proxy_pass http://backend$request_uri;
}
This preserves all encoded characters exactly as received.
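The examples in this section assume backend is defined as an upstream group; a minimal definition (the name and address are placeholders for your environment) looks like:

upstream backend {
    server localhost:8080;
}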
2. Rewriting with the Encoded URI
For more complex routing, keep in mind that rewrite and location captures work on the decoded URI and cannot restore the original encoding. Capture from $request_uri instead:
location /foo {
    # The raw request line keeps its percent-encoding, so $1 stays encoded
    if ($request_uri ~ "^/foo/(.*)$") {
        proxy_pass http://backend/foo/$1;
    }
    proxy_pass http://backend;
}
(proxy_pass_request_headers defaults to on, so it does not need to be set explicitly.)
3. Using Custom Variables
When you need partial processing, build the variable from $request_uri rather than $uri, since $uri holds the decoded, normalized path:
location /foo {
    # $request_uri already includes the query string, still percent-encoded,
    # so no separate $args handling is needed
    set $encoded_uri $request_uri;
    proxy_pass http://backend$encoded_uri;
}
Verify the solution works by checking what reaches your backend. For testing:
curl -v "http://nginx-server/foo/%5B-%5D"
Your backend should receive the exact same encoded path. For additional verification, use tcpdump or ngrep:
sudo ngrep -d any -W byline port 8080
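If ngrep is not installed, tcpdump with ASCII payload printing works as well (adjust the port to match your backend):

# -A prints payloads as ASCII; -i any listens on all interfaces
sudo tcpdump -A -i any port 8080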
Be aware that certain characters still require special handling:
- Spaces (%20) may need additional validation
- Double-encoded URLs might cause issues (a quick test follows this list)
- Query parameters require special attention
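To probe the double-encoding case, send a path in which the percent sign itself is encoded (%25 is the escaped %, so %255B is a double-encoded [); with $request_uri forwarding, the backend should receive it byte-for-byte:

# %25 is the encoded "%", making this a double-encoded "["
curl -v "http://nginx-server/foo/%255B"
# The backend should see GET /foo/%255B, not /foo/%5B or /foo/[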
For query parameters, note that $uri does not include the query string; $is_args inserts the ? only when arguments are present:
location /foo {
    # Caution: $uri is the decoded path, so this variant does NOT preserve
    # percent-encoding; prefer $request_uri, which already carries the query string
    proxy_pass http://backend$uri$is_args$args;
}
While these solutions work, they have different performance characteristics:
| Method | Memory | CPU |
|---|---|---|
| $request_uri | Low | Low |
| regex capture (if) | Medium | Medium |
| custom variables | High | High |
For most use cases, $request_uri provides the best balance of functionality and performance.