Rate Limiter — Technologist

Distributed Systems · Token Bucket · Sliding Window

Algorithm

Rate limit (req/s)

5 req/s

Burst size

10 tokens

Auto-fire rate

8 req/s

Allowed

Denied

—

Tokens / State

—

Allow Rate

Token Bucket Algorithm

A bucket holds up to burst tokens. Tokens refill at a constant rate of rate tokens per second. Each request consumes one token. If no tokens are available, the request is denied.

Properties:

Allows bursts up to burst size as long as tokens are available.
Long-term average is bounded by rate req/s regardless of burst behaviour.
Simple to implement — state is just (tokens, last_refill_time).
Used by: AWS API Gateway, Stripe, Nginx (limit_req).

Pseudocode:
refill = (now - last) * rate
tokens = min(burst, tokens + refill)
if tokens >= 1: tokens -= 1; allow else: deny

Sliding Window Log Algorithm

Maintains a log of timestamps for each request. On each new request, remove all entries older than windowSize seconds, then check if the count is below limit.

Properties:

Precise — no boundary spikes unlike Fixed Window.
Memory proportional to traffic volume (stores all timestamps in window).
Sliding Window Counter is a hybrid — divides window into sub-windows for memory efficiency.
Used by: Redis (ZADD + ZREMRANGEBYSCORE pattern), Cloudflare.

Pseudocode:
log.removeOlderThan(now - window)
if log.count < limit: log.add(now); allow else: deny

Algorithm Comparison

Algorithm	Burst	Memory	Precision	Complexity	Used By

Trade-off Analysis

When to pick each:

Token Bucket — Best general-purpose choice. Simple state, tolerates bursts, enforces long-term rate. Use when clients legitimately need short bursts (e.g. API SDK retries).
Sliding Window Log — When precision matters and memory is acceptable. Prevents boundary exploits. Use at low-to-moderate traffic where per-key memory is bounded.
Fixed Window — Simplest implementation. Risk: 2× burst at window boundary. Acceptable for coarse limits (e.g. 1000 req/hour).
Leaky Bucket — Enforces a smooth output rate regardless of input burst. Ideal for downstream protection (e.g. API calling a third-party with strict SLA).

Distributed considerations:
In a multi-node environment, state must live in a shared store (Redis). Use Redis Lua scripts for atomic check-and-update to prevent race conditions. Token Bucket in Redis: GET tokens; SET tokens = min(burst, tokens + refill) - 1 wrapped in a MULTI/EXEC transaction.