Rate Limiter

Distributed Systems · Token Bucket · Sliding Window

Algorithm
Rate limit (req/s)
5 req/s
Burst size
10 tokens
Auto-fire rate
8 req/s
0
Allowed
0
Denied
Tokens / State
Allow Rate
Token Bucket Algorithm

A bucket holds up to burst tokens. Tokens refill at a constant rate of rate tokens per second. Each request consumes one token. If no tokens are available, the request is denied.

Properties:
  • Allows bursts up to burst size as long as tokens are available.
  • Long-term average is bounded by rate req/s regardless of burst behaviour.
  • Simple to implement — state is just (tokens, last_refill_time).
  • Used by: AWS API Gateway, Stripe, Nginx (limit_req).
Pseudocode:
refill = (now - last) * rate
tokens = min(burst, tokens + refill)
if tokens >= 1: tokens -= 1; allow else: deny
Sliding Window Log Algorithm

Maintains a log of timestamps for each request. On each new request, remove all entries older than windowSize seconds, then check if the count is below limit.

Properties:
  • Precise — no boundary spikes unlike Fixed Window.
  • Memory proportional to traffic volume (stores all timestamps in window).
  • Sliding Window Counter is a hybrid — divides window into sub-windows for memory efficiency.
  • Used by: Redis (ZADD + ZREMRANGEBYSCORE pattern), Cloudflare.
Pseudocode:
log.removeOlderThan(now - window)
if log.count < limit: log.add(now); allow else: deny
Algorithm Comparison
Algorithm Burst Memory Precision Complexity Used By
Trade-off Analysis
When to pick each:
  • Token Bucket — Best general-purpose choice. Simple state, tolerates bursts, enforces long-term rate. Use when clients legitimately need short bursts (e.g. API SDK retries).
  • Sliding Window Log — When precision matters and memory is acceptable. Prevents boundary exploits. Use at low-to-moderate traffic where per-key memory is bounded.
  • Fixed Window — Simplest implementation. Risk: 2× burst at window boundary. Acceptable for coarse limits (e.g. 1000 req/hour).
  • Leaky Bucket — Enforces a smooth output rate regardless of input burst. Ideal for downstream protection (e.g. API calling a third-party with strict SLA).
Distributed considerations:
In a multi-node environment, state must live in a shared store (Redis). Use Redis Lua scripts for atomic check-and-update to prevent race conditions. Token Bucket in Redis: GET tokens; SET tokens = min(burst, tokens + refill) - 1 wrapped in a MULTI/EXEC transaction.