Rate Limiting
Protecting servers by restricting the number of requests a client can make in a specific timeframe
Overview
Rate limiting restricts the number of requests a client can make in a given time window. It protects servers from being overwhelmed by too many requests, whether from legitimate traffic spikes or malicious attacks.
Common algorithms include Token Bucket, Leaky Bucket, Fixed Window, and Sliding Window.
Key Concepts
Fixed Window
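Allow N requests per fixed time window (e.g., 100 requests per minute). Simple, but it permits bursts at window boundaries: a client can send N requests at the end of one window and N more at the start of the next, so up to 2N requests can pass in a short span.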
Sliding Window
Smooths rate limiting by combining the request count of the current window with a weighted share of the previous window's count. More accurate than Fixed Window but slightly more complex.
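As a rough illustration of the weighted count, here is a minimal in-memory sketch. The class name `SlidingWindowCounter`, the `allow` method, and the injectable `clock` parameter are assumptions made for this example, and the code targets a single process rather than a distributed setup.

```python
import time


class SlidingWindowCounter:
    """Approximate a sliding window from two fixed-window counters."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self.state = {}  # client_id -> (window_index, current, previous)

    def allow(self, client_id):
        now = self.clock()
        window_index = int(now // self.window_seconds)
        # Fraction of the current window that has already elapsed (0.0 to 1.0).
        elapsed = (now % self.window_seconds) / self.window_seconds

        index, current, previous = self.state.get(client_id, (window_index, 0, 0))
        if window_index == index + 1:
            previous, current = current, 0  # rolled into the next window
        elif window_index != index:
            previous, current = 0, 0  # idle for more than one full window

        # Weight the previous window by the part that still overlaps
        # the sliding window ending now.
        estimated = previous * (1.0 - elapsed) + current
        if estimated >= self.limit:
            self.state[client_id] = (window_index, current, previous)
            return False
        self.state[client_id] = (window_index, current + 1, previous)
        return True
```

With a limit of 10 and a full previous window, a client halfway through the next window has an estimated count of 5, so it can make 5 more requests before being rejected.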
Token Bucket
The bucket fills with tokens at a fixed rate, up to a maximum capacity. Each request consumes a token, and a request that finds the bucket empty is rejected. Allows bursts while maintaining an average rate.
Leaky Bucket
Requests enter a bucket (a bounded queue) and leak out at a fixed rate, which smooths bursts. Requests that arrive while the bucket is full are dropped.
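The queue-based behaviour can be sketched as follows. This is one possible in-memory shape, not a reference implementation: the names `LeakyBucket`, `offer`, and `drain` are hypothetical, and the caller is assumed to call `drain` periodically to process whatever is due.

```python
import time
from collections import deque


class LeakyBucket:
    """Queue requests and release them at a fixed rate."""

    def __init__(self, capacity, leak_rate, clock=time.monotonic):
        self.capacity = capacity  # maximum number of queued requests
        self.leak_interval = 1.0 / leak_rate  # seconds between releases
        self.clock = clock
        self.queue = deque()
        self.next_leak = clock()  # earliest time the next request may leave

    def offer(self, request):
        """Try to enqueue a request; return False if the bucket is full."""
        if len(self.queue) >= self.capacity:
            return False
        if not self.queue:
            # The bucket was idle: unused time must not turn into a burst.
            self.next_leak = max(self.next_leak, self.clock())
        self.queue.append(request)
        return True

    def drain(self):
        """Return the queued requests that are due to leave by now."""
        now = self.clock()
        released = []
        while self.queue and self.next_leak <= now:
            released.append(self.queue.popleft())
            self.next_leak += self.leak_interval
        return released
```

Unlike the token bucket, accepted requests are delayed rather than served immediately, so the downstream service sees a steady flow regardless of how bursty the input is.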
How It Works
Fixed Window Example (100 requests/minute):
Window 1 (0:00-0:59): Count requests
- Request 1-100: Allow
- Request 101: Reject (429 Too Many Requests)
Window 2 (1:00-1:59): Reset counter to 0
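The fixed window walk-through above can be sketched in a few lines. This is a minimal single-process version; the class name `FixedWindowLimiter` and the injectable `clock` are assumptions made for illustration.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per fixed window, per client."""

    def __init__(self, limit=100, window_seconds=60, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self.counters = {}  # client_id -> (window_index, count)

    def allow(self, client_id):
        window_index = int(self.clock() // self.window_seconds)
        stored_index, count = self.counters.get(client_id, (window_index, 0))
        if stored_index != window_index:
            count = 0  # a new window has started: reset the counter
        if count >= self.limit:
            return False  # the caller should answer 429 Too Many Requests
        self.counters[client_id] = (window_index, count + 1)
        return True
```

The boundary weakness is easy to reproduce with this sketch: 100 requests at 0:59 and 100 more at 1:00 are all allowed, which is 200 requests in about one second.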
Token Bucket Example:
- Bucket capacity: 100 tokens
- Refill rate: 10 tokens/second
- Request arrives: consume 1 token
- No tokens? Reject request
- Allows bursts (up to 100) while maintaining 10/sec average
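A minimal sketch of the token bucket steps above, using the same numbers (capacity 100, refill 10 tokens/second). Tokens are refilled lazily on each call instead of by a background timer, which is one common design choice among several; the names `TokenBucket`, `allow`, and `cost` are illustrative.

```python
import time


class TokenBucket:
    """Token bucket with lazy refill, for a single client."""

    def __init__(self, capacity=100, refill_rate=10.0, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.last_refill = clock()

    def allow(self, cost=1):
        now = self.clock()
        elapsed = now - self.last_refill
        # Add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens < cost:
            return False  # not enough tokens: reject the request
        self.tokens -= cost
        return True
```

A fresh bucket accepts a burst of 100 requests, then only about 10 per second after that, no matter how long the client was idle beforehand.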
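Implementation (Redis):
- key = user:123:requests
- count = INCR key
- if (count == 1) EXPIRE key 60 (set the expiry only when the window starts; calling EXPIRE on every request keeps pushing the reset time back)
- if (count > 100) reject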
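The Redis steps can be sketched in Python roughly as follows. The function name `allow_request` is hypothetical, and `client` is assumed to be a redis-py style client exposing `incr(key)` and `expire(key, seconds)`. This sketch sets the expiry only when the counter is first created, so the window ends 60 seconds after its first request instead of being extended by every later one.

```python
def allow_request(client, user_id, limit=100, window_seconds=60):
    """Fixed-window check backed by a shared counter in Redis."""
    key = f"user:{user_id}:requests"
    count = client.incr(key)  # atomically creates the key at 1 or increments it
    if count == 1:
        # First request of this window: start the countdown to the reset.
        client.expire(key, window_seconds)
    return count <= limit
```

Note that INCR and EXPIRE are two separate commands here. If the process dies between them, the key is left without an expiry, so a production version would likely wrap both in a Lua script or a transaction.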
Use Cases
API rate limiting (prevent abuse)
DDoS protection
Cost control (limit expensive operations)
Quality of service (prevent one user from hogging resources)
Compliance (adhere to third-party API limits)
Best Practices
Use Redis or similar for distributed rate limiting
Provide clear error messages with rate limit info
Return 429 status code with Retry-After header
Implement tiered limits (higher for paid users)
Rate limit by IP, user, or API key
Allow burst traffic with token bucket
Monitor rate limit hits to tune thresholds
Whitelist trusted clients
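Several of these practices (scoped keys, tiered limits, a 429 with Retry-After and a clear message) can be combined in a small helper. The sketch below is framework-agnostic and returns a plain tuple; the tier names, the limits, the key format, and the function names are all hypothetical. The `X-RateLimit-*` headers are a widely used convention rather than a standard.

```python
import math

# Hypothetical tiers and limits (requests per minute), for illustration only.
TIER_LIMITS = {"free": 100, "paid": 1000}


def limit_key(scope, identifier):
    """Build the counter key for a scope such as "ip", "user", or "api_key"."""
    return f"ratelimit:{scope}:{identifier}"


def too_many_requests(tier, seconds_until_reset):
    """Return (status, headers, body) for a request that exceeded its limit."""
    limit = TIER_LIMITS[tier]
    # Round up so the client never retries slightly too early.
    retry_after = max(1, math.ceil(seconds_until_reset))
    headers = {
        "Retry-After": str(retry_after),  # standard header, whole seconds
        # Conventional, non-standard headers describing the limit.
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": "0",
    }
    body = {
        "error": "rate_limit_exceeded",
        "message": f"Limit of {limit} requests per minute exceeded. "
                   f"Retry in {retry_after} seconds.",
    }
    return 429, headers, body
```

For example, `too_many_requests("free", 12.3)` yields status 429 with `Retry-After: 13`, telling a well-behaved client exactly how long to back off.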
Interview Tips
What Interviewers Look For
- Explain common algorithms: Fixed Window (simple), Sliding Window (accurate), Token Bucket (allows bursts)
- Discuss Redis for distributed rate limiting
- Mention HTTP status codes: 429 Too Many Requests
- Talk about different rate limit scopes: per IP, per user, per API key
- Explain burst handling with token bucket
- Discuss trade-offs: too strict blocks legitimate users, too loose allows abuse
- Mention libraries: express-rate-limit (Node.js), Flask-Limiter (Python)