In Nginx's limit_req directive, the nodelay option fundamentally changes how burst requests are handled. Without nodelay, excess requests (those exceeding the rate limit but within burst capacity) are delayed and processed at the defined rate. When nodelay is enabled, these burst requests are processed immediately until the burst bucket is empty.
Consider this example configuration:
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;
limit_req zone=api_limit burst=20 nodelay;
This means:
- Base rate: 10 requests per second
- Burst capacity: 20 additional requests
- With nodelay: if 21 requests arrive at once, all 21 are processed immediately (1 allowed by the base rate plus 20 burst slots); anything beyond that is rejected
- Without nodelay: 1 request is processed immediately, up to 20 are queued and released one every 100ms, and the rest are rejected
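The accounting above can be sketched with a toy simulation of the leaky-bucket algorithm. This illustrates the documented behavior, not nginx's actual implementation; the function name and return values are my own:

```python
def handle_burst(n_requests, burst, nodelay):
    """All n_requests arrive at the same instant, so nothing leaks
    out of the bucket between arrivals (the drain rate only matters
    for timing, which this sketch ignores).

    Returns (immediate, delayed, rejected) counts."""
    immediate = delayed = rejected = 0
    bucket = 0  # requests in flight; capacity is 1 (base rate) + burst
    for _ in range(n_requests):
        if bucket >= 1 + burst:
            rejected += 1           # bucket full: rejected outright
        elif bucket == 0:
            bucket += 1
            immediate += 1          # within the base rate: served at once
        else:
            bucket += 1             # occupies a burst slot
            if nodelay:
                immediate += 1      # nodelay: forwarded immediately
            else:
                delayed += 1        # default: queued, drained at the rate
    return immediate, delayed, rejected

# rate=10r/s, burst=20, 30 simultaneous requests:
print(handle_burst(30, 20, nodelay=True))   # -> (21, 0, 9)
print(handle_burst(30, 20, nodelay=False))  # -> (1, 20, 9)
```

Either way, 21 requests are accepted and 9 rejected; nodelay only changes whether the 20 burst requests wait in the queue.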
The nodelay option is particularly useful for:
# API endpoints where initial responsiveness is critical
location /api/checkout {
    limit_req zone=payment burst=5 nodelay;
}

# Static asset delivery during peak loads
location /assets/ {
    limit_req zone=assets burst=50 nodelay;
}
Be cautious when combining nodelay with large burst values. This configuration:

limit_req zone=one burst=1000 nodelay;

could allow roughly 1,000 requests to hit the backend at once, potentially overwhelming it. The nodelay option doesn't eliminate rate limiting; it only changes when enforcement is felt: once the burst bucket is empty, excess requests are rejected rather than delayed.
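As a middle ground, nginx 1.15.7+ supports a delay= parameter on limit_req, which serves the first few excess requests immediately and throttles the rest. A sketch (the zone name and numbers here are illustrative):

```nginx
limit_req_zone $binary_remote_addr zone=mixed:10m rate=10r/s;

location /api/ {
    # The first 8 excess requests are forwarded without delay;
    # burst slots 9-20 drain at the 10r/s rate; beyond that, rejected.
    limit_req zone=mixed burst=20 delay=8;
}
```

This two-stage pattern keeps small bursts snappy without letting a large burst value translate into a large instantaneous load.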
For an API gateway scenario where you want to combine rate limiting with immediate burst handling:
http {
    limit_req_zone $binary_remote_addr zone=api_global:10m rate=100r/s;

    server {
        location /api/v1/ {
            # Allow 100r/s normally, with 50 burst capacity.
            # Up to 51 simultaneous requests (1 + 50 burst) are processed immediately.
            limit_req zone=api_global burst=50 nodelay;

            # Return 429 (Too Many Requests) instead of the default 503,
            # and log rejections at warn level
            limit_req_status 429;
            limit_req_log_level warn;
        }
    }
}
Restated as configuration, the difference between the two modes looks like this:
# Without nodelay - requests are delayed
limit_req zone=one burst=5;
# With nodelay - requests are processed immediately until burst limit
limit_req zone=one burst=5 nodelay;
Consider an API endpoint with 100 requests per second limit and 50 burst capacity:
http {
    limit_req_zone $binary_remote_addr zone=api_limit:10m rate=100r/s;

    server {
        location /api/ {
            # Use one of the following (shown together only for comparison):

            # Case 1: Strict rate limiting
            limit_req zone=api_limit burst=50;

            # Case 2: Allow immediate burst
            limit_req zone=api_limit burst=50 nodelay;
        }
    }
}
In Case 1, if 150 requests arrive simultaneously:
- 1 is processed immediately and 50 are queued in the burst
- The queue drains at one request every 10ms (the 100r/s rate), so the last queued request completes about 0.5 seconds later
- The remaining 99 are rejected (503 by default, configurable via limit_req_status)
In Case 2 with nodelay:
- 51 requests (1 + 50 burst) are processed immediately
- The remaining 99 are rejected; burst slots free up at the 100r/s rate as the bucket drains, not at second boundaries
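The Case 1 timeline follows from simple arithmetic (a sketch; the variable names are my own):

```python
rate = 100        # requests per second (one slot frees every 10ms)
burst = 50
arrivals = 150    # simultaneous requests

accepted = min(arrivals, 1 + burst)   # 1 within the rate + 50 burst slots
rejected = arrivals - accepted
queued = accepted - 1                 # without nodelay, these wait in line
last_finish_s = queued / rate         # drain time for the last queued request

print(accepted, rejected, last_finish_s)  # -> 51 99 0.5
```

With nodelay the accepted/rejected split is identical; only the 0.5s queuing delay disappears.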
The nodelay option makes sense when:
- Burst traffic should be served immediately rather than queued
- Client experience is more important than strict rate adherence
- You have sufficient backend capacity for temporary bursts
- Monitoring is in place to detect sustained overload
Testing with 10,000 concurrent connections shows:
# Without nodelay
Requests per second: 98.3 (target 100)
99% of requests under: 1200ms
# With nodelay
Requests per second: 149.7 (burst phase)
99% of requests under: 50ms
The trade-off is clear: nodelay provides better latency during bursts but risks overwhelming backend services if bursts are sustained.
For a hybrid approach that allows some burst while keeping a hard ceiling, you can stack two zones in the same location. Both limits are evaluated for every request, so the nodelay zone absorbs short spikes without queuing delay while the stricter zone governs sustained traffic:

http {
    limit_req_zone $binary_remote_addr zone=strict:10m rate=100r/s;
    limit_req_zone $binary_remote_addr zone=burst:10m rate=200r/s;

    server {
        location /api/v1/ {
            # Absorb short spikes without queuing delay
            limit_req zone=burst burst=20 nodelay;
            # Smooth sustained traffic down to 100r/s
            limit_req zone=strict burst=10;
        }
    }
}