Nginx Rate Limiting and DDoS Protection
Configure Nginx rate limiting with limit_req, limit_conn, and fail2ban to protect your server from brute-force attacks and application-layer DDoS without relying on third-party services.
Rate limiting is your first line of defense against brute-force attacks, API abuse, and application-layer DDoS. This tutorial builds three layers of protection using only Nginx and fail2ban. No third-party DDoS services required, no traffic leaving your server.
You will configure request rate limiting (limit_req), connection throttling (limit_conn), and automated IP banning (fail2ban) on Debian 12 or Ubuntu 24.04.
Prerequisites:
- Nginx installed and running (Install Nginx on Debian 12 and Ubuntu 24.04 from the Official Repository)
- Basic familiarity with Nginx config structure (Nginx Config File Structure Explained)
- Root or sudo access
How does Nginx rate limiting work?
Nginx rate limiting uses the leaky bucket algorithm via the limit_req_zone and limit_req directives. Incoming requests fill a bucket at whatever rate they arrive. The bucket drains at a fixed rate you define. When the bucket overflows, Nginx rejects the excess requests. This smooths traffic bursts while maintaining a steady processing rate.
The implementation spans two directives. limit_req_zone defines the shared memory zone that tracks client state across all worker processes. limit_req applies the limit to specific locations.
# In the http block: define the zone
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
# In a server or location block: apply the limit
limit_req zone=api;
The key $binary_remote_addr stores each client IP in a compact binary format (4 bytes for IPv4, 16 bytes for IPv6). A 10 MB zone fits roughly 160,000 IPv4 addresses or 80,000 IPv6 addresses. For most servers, 10m is plenty.
The rate parameter accepts requests per second (r/s) or requests per minute (r/m). Nginx tracks this internally in milliseconds. A rate of 10r/s means one request allowed every 100ms.
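The drain-and-overflow behavior can be sketched in a few lines of Python. This is a simplified model for illustration only; nginx's real implementation tracks excess internally in integer thousandths of a request:

```python
# A simplified model of the leaky-bucket accounting behind limit_req
# (an illustration of the algorithm, not nginx's actual implementation).

def make_limiter(rate_per_sec, burst=0):
    state = {"excess": 0.0, "last_ms": None}

    def request(now_ms):
        # Drain the bucket at the configured rate since the last request
        if state["last_ms"] is not None:
            elapsed = now_ms - state["last_ms"]
            state["excess"] = max(0.0, state["excess"] - rate_per_sec * elapsed / 1000.0)
        state["last_ms"] = now_ms
        if state["excess"] > burst:  # bucket overflow -> reject
            return "reject"
        state["excess"] += 1         # each accepted request adds one unit
        return "pass"

    return request

# rate=10r/s: one request per 100 ms is sustainable
limiter = make_limiter(rate_per_sec=10)
print(limiter(0))    # pass   (bucket empty)
print(limiter(50))   # reject (only 50 ms since the last request)
print(limiter(150))  # pass   (a full 100 ms interval has drained)
```

Requests arriving faster than one per 100ms are rejected; requests spaced at or beyond the interval always pass.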
How do I configure limit_req_zone and limit_req?
Create a rate limiting config file to keep things modular:
sudo nano /etc/nginx/conf.d/rate-limiting.conf
# Shared memory zones - defined at http level
limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=login:10m rate=1r/s;
limit_req_zone $binary_remote_addr zone=api:10m rate=30r/s;
# Return 429 instead of the default 503
limit_req_status 429;
# Log rate limit events at warn level (delays logged at notice)
limit_req_log_level warn;
Then apply the zones in your server block:
sudo nano /etc/nginx/sites-available/example.com
server {
    listen 80;
    server_name example.com;

    # General rate limit for all requests
    limit_req zone=general burst=20 nodelay;

    location /login {
        # Strict limit on login endpoint
        limit_req zone=login burst=3 nodelay;
        proxy_pass http://127.0.0.1:3000;
    }

    location /api/ {
        # Higher limit for API consumers
        limit_req zone=api burst=50 delay=30;
        proxy_pass http://127.0.0.1:3000;
    }

    location / {
        proxy_pass http://127.0.0.1:3000;
    }
}
Test the config and reload:
sudo nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
sudo systemctl reload nginx
When a limit_req directive is defined in a location block, it overrides any limit_req inherited from the server level. The general zone applies to / but not to /login or /api/ because those locations have their own limit_req directives. If you need both zones to apply, add multiple limit_req lines in the same block.
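For example, to stack two zones on one location, a block like this works (a sketch reusing the zones defined above; every listed limit is checked, so the most restrictive one wins):

```nginx
location /api/ {
    limit_req zone=general burst=20 nodelay;  # 10r/s overall cap
    limit_req zone=api burst=50 delay=30;     # 30r/s API cap
    proxy_pass http://127.0.0.1:3000;
}
```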
What do burst, nodelay, and delay do?
The burst parameter controls how many excess requests Nginx queues instead of rejecting immediately. Without burst, any request exceeding the rate gets a 429. With burst, Nginx holds excess requests in a queue and drains them at the base rate.
| Parameter | Immediate requests | Queued requests | Rejected | Use case |
|---|---|---|---|---|
| burst=0 (default) | 1 per interval | None | Everything over rate | Strict API limits |
| burst=5 | 1 per interval | Up to 5, released at base rate | Over burst+1 | Form submissions |
| burst=5 nodelay | Up to 6 at once | None queued, but burst slots refill at base rate | Over burst+1 until slots refill | Login pages, general traffic |
| burst=20 delay=10 | Up to 11 at once | Requests 12-21 throttled at base rate | Over burst+1 | APIs with occasional spikes |
With burst=5 (no nodelay), if 6 requests arrive simultaneously, request 1 processes immediately. Requests 2-6 queue and release one per interval (every 100ms at 10r/s). The last queued request waits 500ms. This adds latency but never drops legitimate bursts.
With burst=5 nodelay, all 6 requests process immediately. But the 5 burst slots refill at the base rate, one per 100ms at 10r/s, so a full refill takes 500ms. If 6 more requests arrive 200ms later, only 2 slots have refilled, so 2 of them pass and the other 4 are rejected.
With burst=20 delay=10, the first 11 requests (1 base + 10 delay threshold) process without waiting. Requests 12-21 are throttled at the base rate. Anything beyond 21 is rejected. This hybrid works well for APIs that get periodic bursts from legitimate batch clients.
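The burst bookkeeping can be sketched with a small leaky-bucket model (an illustration of the accounting, not nginx's actual source code):

```python
# A leaky-bucket sketch of burst + nodelay accounting (illustrative only).

def burst_limiter(rate_per_sec, burst):
    state = {"excess": 0.0, "last_ms": None}

    def request(now_ms):
        # Drain the bucket for the elapsed time, then test against burst
        if state["last_ms"] is not None:
            elapsed = now_ms - state["last_ms"]
            state["excess"] = max(0.0, state["excess"] - rate_per_sec * elapsed / 1000.0)
        state["last_ms"] = now_ms
        if state["excess"] > burst:
            return 429
        state["excess"] += 1
        return 200

    return request

# burst=5 nodelay at 10r/s: six simultaneous requests pass, the 7th fails
req = burst_limiter(rate_per_sec=10, burst=5)
print([req(0) for _ in range(7)])    # [200, 200, 200, 200, 200, 200, 429]
# 200 ms later only two 100 ms slots have refilled, so two more pass
print([req(200) for _ in range(3)])  # [200, 200, 429]
```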
How do I rate limit different endpoints separately?
Define separate zones with different keys to apply independent limits. The example above already uses three zones. You can also rate limit by URI path using $uri as the key:
# Per-URI rate limiting: limits total requests to each unique URI
limit_req_zone $uri zone=per_uri:10m rate=50r/s;
This is useful when certain endpoints (like a search page or an export function) need global throttling regardless of which client calls them.
For API key-based rate limiting, use map to extract the key from a header:
map $http_x_api_key $api_key_limit {
    default $binary_remote_addr;
    "~^.+$" $http_x_api_key;
}
limit_req_zone $api_key_limit zone=api_keyed:10m rate=100r/s;
If the client sends an X-API-Key header, the rate limit tracks by that key. Otherwise it falls back to IP-based limiting.
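The map's fallback logic, restated in Python for clarity (the function name is illustrative, not part of any API):

```python
# Mirrors the map block: a non-empty X-API-Key header becomes the
# rate-limit key; otherwise the client IP is used.

def rate_limit_key(x_api_key, client_addr):
    return x_api_key if x_api_key else client_addr

print(rate_limit_key("abc123", "203.0.113.50"))  # abc123 (keyed by API key)
print(rate_limit_key("", "203.0.113.50"))        # 203.0.113.50 (falls back to IP)
```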
How do I throttle connections with limit_conn?
While limit_req controls the request rate, limit_conn caps the number of simultaneous connections from a single client. It works well against slowloris attacks and download abuse.
# In http block
limit_conn_zone $binary_remote_addr zone=addr:10m;
limit_conn_status 429;
limit_conn_log_level warn;
# In server or location block
server {
    # Max 20 simultaneous connections per IP
    limit_conn addr 20;

    location /downloads/ {
        # Max 2 simultaneous downloads per IP
        limit_conn addr 2;
        limit_rate 1m;  # Also throttle bandwidth to 1 MB/s per connection
    }
}
Note for HTTP/2 and HTTP/3: each concurrent request counts as a separate connection. A browser loading a page with 30 assets over a single HTTP/2 connection counts as 30 connections for limit_conn purposes. Set the limit higher than you would for HTTP/1.1.
limit_conn and limit_req are complementary. Use both. limit_req stops rapid-fire requests. limit_conn stops connection floods.
How do I return a custom 429 error page?
By default, rate-limited requests get a generic error page. A custom 429 page can include a Retry-After header and a human-readable message.
Create the error page:
sudo mkdir -p /var/www/error
sudo nano /var/www/error/429.html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>429 - Too Many Requests</title>
  <style>
    body { font-family: system-ui, sans-serif; text-align: center; padding: 5rem 1rem; }
    h1 { font-size: 2rem; }
    p { color: #555; }
  </style>
</head>
<body>
  <h1>429 - Too Many Requests</h1>
  <p>You have exceeded the request limit. Wait a moment and try again.</p>
</body>
</html>
sudo chmod 644 /var/www/error/429.html
Add the error page and Retry-After header to your server block:
server {
    # ... rate limiting directives ...

    error_page 429 /429.html;

    location = /429.html {
        root /var/www/error;
        internal;
        add_header Retry-After 5 always;
    }
}
The internal directive prevents direct access to the error page. The always keyword on add_header ensures the header is sent even on error responses. The Retry-After value (in seconds) tells well-behaved clients when to retry.
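On the client side, a well-behaved consumer backs off when it sees the header. A Python sketch of that behavior (the fetch callable is a stand-in so the retry logic can be shown without a live server; substitute a real HTTP call in practice):

```python
import time

def get_with_retry(fetch, url, max_attempts=3, sleep=time.sleep):
    """Retry on 429, honoring the server's Retry-After hint."""
    for _ in range(max_attempts):
        status, headers = fetch(url)
        if status != 429:
            return status
        # Fall back to 1 second if the server sent no hint
        sleep(int(headers.get("Retry-After", 1)))
    return status

# Simulated server: one 429 carrying Retry-After: 5, then success
responses = iter([(429, {"Retry-After": "5"}), (200, {})])
waited = []
print(get_with_retry(lambda url: next(responses), "https://example.com/",
                     sleep=waited.append))  # 200
print(waited)                               # [5]
```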
How do I test rate limits safely with dry_run?
Set limit_req_dry_run on; to simulate rate limiting without rejecting requests. Nginx logs what it would have done, but all requests pass through. The directive is available since Nginx 1.17.1.
server {
    limit_req zone=general burst=20 nodelay;
    limit_req_dry_run on;  # Log only, don't enforce

    # Add the rate limit status to access logs
    # ...
}
Add $limit_req_status to your log format so you can track dry run events in access logs:
# In http block
log_format ratelimit '$remote_addr - $remote_user [$time_local] '
                     '"$request" $status $body_bytes_sent '
                     '"$http_referer" "$http_user_agent" '
                     'rate_limit=$limit_req_status';
# In server block
access_log /var/log/nginx/access.log ratelimit;
The dry_run workflow:
- Add limit_req_dry_run on; to your config
- Reload Nginx
- Generate test traffic (see the testing section below)
- Check the error log for dry run entries:
sudo grep "dry run" /var/log/nginx/error.log
2026/03/19 14:22:31 [warn] 1234#1234: *567 limiting requests, dry run, excess: 1.532 by zone "general", client: 203.0.113.50, server: example.com, request: "GET / HTTP/1.1", host: "example.com"
The log level here is [warn] because of the limit_req_log_level warn directive. Make sure your error_log directive includes warn level or lower, otherwise these messages won't appear. The production config's error_log /var/log/nginx/example.error.log warn; handles this.
- Check access logs for the $limit_req_status variable:
sudo grep "REJECTED_DRY_RUN\|DELAYED_DRY_RUN" /var/log/nginx/access.log
The $limit_req_status variable returns one of: PASSED, DELAYED, REJECTED, DELAYED_DRY_RUN, or REJECTED_DRY_RUN.
- When the rates look right, remove the limit_req_dry_run line and reload.
How do I allowlist trusted IPs from rate limiting?
Use a geo block to exclude monitoring systems, load balancers, or your own office IPs from rate limiting:
# In http block
geo $limit {
    default         1;
    10.0.0.0/8      0;  # Internal network
    192.168.0.0/16  0;  # Internal network
    203.0.113.10    0;  # Monitoring server
}

map $limit $limit_key {
    0 "";
    1 $binary_remote_addr;
}
limit_req_zone $limit_key zone=general:10m rate=10r/s;
When $limit_key is an empty string, Nginx skips rate limiting for that request entirely. IPs matching the geo block get $limit = 0, which maps to an empty key.
If you want allowlisted IPs to have a higher rate instead of no limit:
limit_req_zone $limit_key zone=general:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=trusted:1m rate=100r/s;
server {
    limit_req zone=general burst=20 nodelay;
    limit_req zone=trusted burst=100 nodelay;
}
All IPs match trusted, but only non-allowlisted IPs match general. The most restrictive limit applies, so allowlisted IPs are effectively limited to 100r/s while everyone else gets 10r/s.
How do I ban repeat offenders with fail2ban?
Rate limiting rejects individual requests, but persistent attackers keep coming back. fail2ban monitors the Nginx error log and bans IPs at the firewall level after repeated violations.
Install fail2ban if you haven't already (Install and Configure Fail2Ban on a Linux VPS):
sudo apt update && sudo apt install -y fail2ban
sudo systemctl enable --now fail2ban
sudo systemctl status fail2ban
● fail2ban.service - Fail2Ban Service
Loaded: loaded (/usr/lib/systemd/system/fail2ban.service; enabled; preset: enabled)
Active: active (running) since Wed 2026-03-19 14:30:00 UTC; 2s ago
fail2ban ships with a built-in nginx-limit-req filter. The filter's regex matches lines like:
limiting requests, excess: 1.532 by zone "general", client: 203.0.113.50
Create the jail configuration. Never edit .conf files directly; use .local overrides:
sudo nano /etc/fail2ban/jail.local
[nginx-limit-req]
enabled = true
port = http,https
filter = nginx-limit-req
logpath = /var/log/nginx/error.log
maxretry = 10
findtime = 60
bantime = 600
This bans an IP for 10 minutes after 10 rate limit violations within 60 seconds.
For escalating bans, add a second jail in the same file:
[nginx-limit-req-repeat]
enabled = true
port = http,https
filter = nginx-limit-req
logpath = /var/log/nginx/error.log
maxretry = 30
findtime = 3600
bantime = 86400
The first jail catches short bursts (10 hits in a minute = 10-minute ban). The second catches persistent offenders (30 hits in an hour = 24-hour ban).
Restart fail2ban and check the jail status:
sudo systemctl restart fail2ban
sudo fail2ban-client status nginx-limit-req
On Debian 12 and older systems, the output looks like this:
Status for the jail: nginx-limit-req
|- Filter
| |- Currently failed: 0
| |- Total failed: 0
| `- File list: /var/log/nginx/error.log
`- Actions
|- Currently banned: 0
|- Total banned: 0
`- Banned IP list:
On Ubuntu 24.04, fail2ban defaults to the systemd journal backend (backend = auto resolves to systemd). The output shows Journal matches: instead of File list::
Status for the jail: nginx-limit-req
|- Filter
| |- Currently failed: 0
| |- Total failed: 0
| `- Journal matches: _SYSTEMD_UNIT=nginx.service + _COMM=nginx
`- Actions
|- Currently banned: 0
|- Total banned: 0
`- Banned IP list:
Both backends work. The journal backend reads the same Nginx log messages through systemd. If you prefer file-based monitoring, add backend = pyinotify to the jail section.
To manually unban an IP during testing:
sudo fail2ban-client set nginx-limit-req unbanip 203.0.113.50
Follow fail2ban's own log with:
sudo journalctl -u fail2ban -f
How do I verify rate limiting works?
Test from a machine outside the server. Never test from localhost because 127.0.0.1 might be in your allowlist.
Quick test with a curl loop:
for i in $(seq 1 30); do
  curl -s -o /dev/null -w "%{http_code}\n" https://example.com/
done
With a rate of 10r/s and burst=20 nodelay, roughly the first 21 requests return 200. Once the burst is exhausted, responses switch to 429. Because the loop runs sequentially, the bucket drains a little between requests, so a few extra 200s are normal.
Load test with wrk:
sudo apt install -y wrk
wrk -t2 -c10 -d10s https://example.com/
Running 10s test @ https://example.com/
2 threads and 10 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 5.23ms 2.11ms 28.44ms 75.32%
Req/Sec 1.02k 121.33 1.34k 68.00%
20384 requests in 10.01s, 15.22MB read
Non-2xx or 3xx responses: 18241
Requests/sec: 2036.36
Transfer/sec: 1.52MB
The Non-2xx or 3xx responses count shows how many requests Nginx rate-limited. Here, 18,241 out of 20,384 requests received a 429.
Check the error log during the test:
sudo tail -f /var/log/nginx/error.log
2026/03/19 14:45:12 [warn] 1234#1234: *890 limiting requests, excess: 9.876 by zone "general", client: 203.0.113.50, server: example.com, request: "GET / HTTP/1.1", host: "example.com"
The excess value shows how far over the limit the request was. Higher numbers indicate more aggressive traffic.
Complete production config
The full rate limiting configuration combining all three layers:
# /etc/nginx/conf.d/rate-limiting.conf
# --- Allowlist ---
geo $limit {
    default         1;
    10.0.0.0/8      0;
    192.168.0.0/16  0;
    # Add your monitoring / office IPs here
}

map $limit $limit_key {
    0 "";
    1 $binary_remote_addr;
}
# --- Request rate zones ---
limit_req_zone $limit_key zone=general:10m rate=10r/s;
limit_req_zone $limit_key zone=login:10m rate=1r/s;
limit_req_zone $limit_key zone=api:10m rate=30r/s;
# --- Connection zone ---
limit_conn_zone $binary_remote_addr zone=addr:10m;
# --- Response codes and logging ---
limit_req_status 429;
limit_conn_status 429;
limit_req_log_level warn;
limit_conn_log_level warn;
# --- Access log with rate limit status ---
log_format ratelimit '$remote_addr - $remote_user [$time_local] '
                     '"$request" $status $body_bytes_sent '
                     '"$http_referer" "$http_user_agent" '
                     'rate_limit=$limit_req_status';
# /etc/nginx/sites-available/example.com
server {
    listen 80;
    server_name example.com;

    access_log /var/log/nginx/example.access.log ratelimit;
    error_log /var/log/nginx/example.error.log warn;

    # Global limits
    limit_req zone=general burst=20 nodelay;
    limit_conn addr 30;

    # Custom 429 page
    error_page 429 /429.html;
    location = /429.html {
        root /var/www/error;
        internal;
        add_header Retry-After 5 always;
    }

    location /login {
        limit_req zone=login burst=3 nodelay;
        limit_conn addr 5;
        proxy_pass http://127.0.0.1:3000;
    }

    location /api/ {
        limit_req zone=api burst=50 delay=30;
        limit_conn addr 20;
        proxy_pass http://127.0.0.1:3000;
    }

    location / {
        proxy_pass http://127.0.0.1:3000;
    }
}
# /etc/fail2ban/jail.local
[nginx-limit-req]
enabled = true
port = http,https
filter = nginx-limit-req
logpath = /var/log/nginx/example.error.log
maxretry = 10
findtime = 60
bantime = 600
[nginx-limit-req-repeat]
enabled = true
port = http,https
filter = nginx-limit-req
logpath = /var/log/nginx/example.error.log
maxretry = 30
findtime = 3600
bantime = 86400
After writing all config files:
sudo nginx -t && sudo systemctl reload nginx
sudo systemctl restart fail2ban
Rate limiting is one layer of a hardened server. Pair it with security headers, TLS configuration, and other system-level hardening for a complete setup.
Directive reference
| Directive | Context | Default | Since |
|---|---|---|---|
| limit_req_zone | http | - | 0.7.21 |
| limit_req | http, server, location | - | 0.7.21 |
| limit_req_status | http, server, location | 503 | 1.3.15 |
| limit_req_log_level | http, server, location | error | 0.8.18 |
| limit_req_dry_run | http, server, location | off | 1.17.1 |
| limit_conn_zone | http | - | 1.1.8 |
| limit_conn | http, server, location | - | 0.7.21 |
| limit_conn_status | http, server, location | 503 | 1.3.15 |
| limit_conn_log_level | http, server, location | error | 0.8.18 |
| limit_conn_dry_run | http, server, location | off | 1.17.6 |
Troubleshooting
Rate limiting not working at all: Check that limit_req_zone is in the http block, not inside a server block. Zones must be defined before they are referenced. If your config uses include directives, make sure the zone file is included before the server blocks.
Legitimate users getting 429: Lower the rate, increase burst, or switch from no-burst to burst=N nodelay. Use dry_run mode to measure actual traffic patterns before setting limits. Check if HTTP/2 multiplexing is inflating limit_conn counts.
fail2ban not banning: Confirm the logpath matches where Nginx actually writes error logs. Check limit_req_log_level is set to warn or error (the default). Verify the jail is active with sudo fail2ban-client status nginx-limit-req. If testing from localhost, note that fail2ban ignores the server's own IP by default (ignoreself = true). Test from an external machine instead.
Rate limit messages not appearing in error log: The error_log directive defaults to error level, which filters out warn-level messages from limit_req_log_level warn. Set error_log /var/log/nginx/error.log warn; in your server block to see rate limit events.
Zone memory exhausted: A 10m zone holds about 160,000 IPv4 states. If you see could not allocate node in logs, increase the zone size. Monitor zone usage over time with $limit_req_status in your access logs.
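The zone-size arithmetic is easy to check (state sizes per the nginx docs: about 64 bytes per IPv4 state and 128 bytes per IPv6 state on 64-bit systems):

```python
# Back-of-envelope zone sizing for limit_req_zone ...:10m
zone_bytes = 10 * 1024 * 1024           # zone=general:10m
print(zone_bytes // 64)                  # IPv4 states (~160k)
print(zone_bytes // 128)                 # IPv6 states (~80k)
```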
Behind a load balancer or CDN: If Nginx sees only the load balancer's IP, rate limiting applies to that single IP. Use $http_x_forwarded_for or $realip_remote_addr as the zone key instead of $binary_remote_addr. You must also configure set_real_ip_from with the load balancer's IP range. Only trust X-Forwarded-For from known proxies because clients can spoof it.
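A sketch of that real-IP setup, assuming the realip module is compiled in (it is in the distro packages) and with 10.0.0.0/8 standing in for your load balancer's actual address range:

```nginx
# In the http block: trust X-Forwarded-For only from known proxies
set_real_ip_from  10.0.0.0/8;       # example value: your LB / CDN range
real_ip_header    X-Forwarded-For;
real_ip_recursive on;

# $binary_remote_addr now holds the client IP reported by the proxy,
# so the existing zones keep working unchanged
limit_req_zone $binary_remote_addr zone=general:10m rate=10r/s;
```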
Logs are your primary debugging tool:
# Follow rate limit events in real time
sudo tail -f /var/log/nginx/error.log | grep "limiting"
# Check fail2ban actions
sudo journalctl -u fail2ban -f
Copyright 2026 Virtua.Cloud. All rights reserved. This content is original work by the Virtua.Cloud team. Reproduction, republication, or redistribution without written permission is prohibited.