Rate Limiting

Protecting servers by restricting the number of requests a client can make in a specific timeframe

Overview

Rate limiting restricts the number of requests a client can make in a given time window. It protects servers from being overwhelmed by too many requests, whether from legitimate traffic spikes or malicious attacks.

Common algorithms include Token Bucket, Leaky Bucket, Fixed Window, and Sliding Window.

Key Concepts

Fixed Window

Allow N requests per fixed time window (e.g., 100 requests per minute). Simple, but a client can send up to 2N requests in a short span straddling a window boundary.

Sliding Window

Smooths rate limiting by combining the current window's count with a weighted count from the previous window. More accurate than a fixed window at the cost of slightly more bookkeeping.
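The weighted-count idea can be sketched as a small in-process class (the `SlidingWindowLimiter` name and single-counter design are illustrative, not from any particular library):

```python
import time

class SlidingWindowLimiter:
    """Sliding-window counter: weights the previous window's count by
    how much of it still overlaps the sliding window."""

    def __init__(self, limit: int, window_secs: float):
        self.limit = limit
        self.window = window_secs
        self.prev_count = 0
        self.curr_count = 0
        self.curr_start = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        elapsed = now - self.curr_start
        if elapsed >= self.window:
            # Roll windows forward; discard stale data if >1 window passed.
            self.prev_count = self.curr_count if elapsed < 2 * self.window else 0
            self.curr_count = 0
            self.curr_start = now - (elapsed % self.window)
            elapsed = now - self.curr_start
        # Fraction of the previous window still inside the sliding window.
        prev_weight = (self.window - elapsed) / self.window
        estimated = self.prev_count * prev_weight + self.curr_count
        if estimated < self.limit:
            self.curr_count += 1
            return True
        return False
```

The weighted estimate assumes requests were evenly spread across the previous window, which is the usual trade-off this algorithm makes for O(1) memory.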

Token Bucket

Bucket fills with tokens at fixed rate. Each request consumes a token. Allows bursts while maintaining average rate.

Leaky Bucket

Requests enter a bucket and leak out at a fixed rate; requests that would overflow the bucket are rejected (or queued). Produces a smooth, constant outflow regardless of bursty input.
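A common implementation is the "leaky bucket as a meter" variant, which tracks a water level instead of keeping an actual queue of requests; the sketch below assumes that variant (class and parameter names are illustrative):

```python
import time

class LeakyBucket:
    """Leaky bucket as a meter: each request raises the water level by 1,
    and the level drains at a fixed leak rate; overflow means reject."""

    def __init__(self, capacity: float, leak_rate: float):
        self.capacity = capacity
        self.leak_rate = leak_rate  # units drained per second
        self.level = 0.0
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Drain proportionally to elapsed time, never below empty.
        self.level = max(0.0, self.level - (now - self.last) * self.leak_rate)
        self.last = now
        if self.level + 1 <= self.capacity:
            self.level += 1
            return True
        return False
```

A queue-based variant would instead enqueue requests and service them at the leak rate, trading rejection for added latency.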

How It Works

Fixed Window Example (100 requests/minute):

Window 1 (0:00-0:59): Count requests

  • Requests 1-100: Allow
  • Request 101: Reject (429 Too Many Requests)

Window 2 (1:00-1:59): Reset counter to 0
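A minimal in-process sketch of this counter (names are illustrative; a production system would keep the counter in shared storage such as Redis):

```python
import time

class FixedWindowLimiter:
    """Fixed window: count requests per window; the counter resets
    when a new window starts."""

    def __init__(self, limit: int, window_secs: float):
        self.limit = limit
        self.window = window_secs
        self.count = 0
        self.window_start = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        if now - self.window_start >= self.window:
            # New window: reset (this hard reset is what permits
            # bursts at window boundaries).
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False  # caller should respond 429 Too Many Requests
```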

Token Bucket Example:

  • Bucket capacity: 100 tokens
  • Refill rate: 10 tokens/second
  • Request arrives: consume 1 token
  • No tokens? Reject request
  • Allows bursts (up to 100) while maintaining 10/sec average
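The numbers above map directly onto a small in-process sketch (illustrative only; real deployments usually store bucket state centrally so all servers share one bucket):

```python
import time

class TokenBucket:
    """Token bucket matching the example above: capacity 100 tokens,
    refilled at 10 tokens/second."""

    def __init__(self, capacity: float = 100, refill_rate: float = 10):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity          # start full, so bursts are allowed
        self.last = time.monotonic()

    def allow(self, cost: float = 1) -> bool:
        now = time.monotonic()
        # Refill based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Starting the bucket full is what allows an initial burst of up to `capacity` requests; the refill rate then enforces the long-run average.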

Implementation (Redis, fixed window): count = INCR user:123:requests; if count == 1: EXPIRE user:123:requests 60; if count > 100: reject. Using INCR's return value (rather than a separate GET) keeps the check atomic, and setting EXPIRE only on the first increment avoids resetting the window on every request.
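The same logic can be sketched in Python. `FakeRedis` below is an in-memory stand-in so the example is self-contained; real code would call redis-py's `incr` and `expire` against an actual server, ideally via a pipeline or Lua script so both commands apply atomically:

```python
class FakeRedis:
    """Minimal in-memory stand-in for the two Redis commands used here."""
    def __init__(self):
        self.store = {}

    def incr(self, key):
        self.store[key] = self.store.get(key, 0) + 1
        return self.store[key]

    def expire(self, key, seconds):
        pass  # a real Redis would delete the key after `seconds`

def allow_request(r, user_id: str, limit: int = 100, window: int = 60) -> bool:
    key = f"user:{user_id}:requests"
    count = r.incr(key)        # atomic increment; first INCR creates the key
    if count == 1:
        r.expire(key, window)  # set the TTL once, when the window opens
    return count <= limit      # use INCR's return value, not a separate GET
```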

Use Cases

API rate limiting (prevent abuse)

DDoS protection

Cost control (limit expensive operations)

Quality of service (prevent one user from hogging resources)

Compliance (adhere to third-party API limits)

Best Practices

Use Redis or similar for distributed rate limiting

Provide clear error messages with rate limit info

Return 429 status code with Retry-After header

Implement tiered limits (higher for paid users)

Rate limit by IP, user, or API key

Allow burst traffic with token bucket

Monitor rate limit hits to tune thresholds

Whitelist trusted clients

Interview Tips

What Interviewers Look For

  • Explain common algorithms: Fixed Window (simple), Sliding Window (accurate), Token Bucket (allows bursts)

  • Discuss Redis for distributed rate limiting

  • Mention HTTP status codes: 429 Too Many Requests

  • Talk about different rate limit scopes: per IP, per user, per API key

  • Explain burst handling with token bucket

  • Discuss trade-offs: too strict blocks legitimate users, too loose allows abuse

  • Mention libraries: express-rate-limit (Node.js), Flask-Limiter (Python)
