Latency

Understanding the round-trip delay caused by physical distance between the user and the server

Overview

Latency is the time delay between sending a request and receiving a response. It's primarily caused by physical distance, network congestion, and processing time.

Understanding and minimizing latency is crucial for providing a good user experience, especially in real-time applications.

Key Concepts

Network Latency

Time for data to travel from source to destination over the network. Bounded by the speed of light in fiber (roughly 1ms of round-trip time per 100km).

Processing Latency

Time taken by servers to process requests: database queries, computation, business logic.

Round-Trip Time (RTT)

Time for a request to go from client to server and back. Includes both network and processing latency.
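One way to build intuition for RTT is to time a request/response cycle yourself. The sketch below spins up a throwaway echo server on loopback and measures one round trip over a fresh TCP connection (so the measurement includes connection setup as well as the echo itself). The server and helper names are illustrative, not from any particular library.

```python
import socket
import threading
import time

def run_echo_server(host="127.0.0.1", port=0):
    """Start a minimal one-shot echo server on a background thread; return its port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, port))
    srv.listen(1)
    actual_port = srv.getsockname()[1]

    def serve():
        conn, _ = srv.accept()
        with conn:
            conn.sendall(conn.recv(1024))  # echo the payload back
        srv.close()

    threading.Thread(target=serve, daemon=True).start()
    return actual_port

def measure_rtt(host, port, payload=b"ping"):
    """Time one request/response round trip, including TCP connection setup."""
    start = time.perf_counter()
    with socket.create_connection((host, port)) as sock:
        sock.sendall(payload)
        sock.recv(1024)
    return time.perf_counter() - start

port = run_echo_server()
rtt = measure_rtt("127.0.0.1", port)
print(f"loopback RTT: {rtt * 1000:.2f} ms")
```

On loopback this typically comes back well under a millisecond; the same measurement against a server on another continent would show hundreds of milliseconds, dominated by propagation delay.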

How It Works

Total Latency Components:

  1. DNS Lookup: 20-120ms
  2. TCP Connection: 1 RTT (varies by distance)
  3. TLS Handshake: 1-2 RTT (2 for TLS 1.2, 1 for TLS 1.3)
  4. HTTP Request/Response: 1+ RTT
  5. Server Processing: varies (10ms-1000ms+)
  6. Database Query: varies (10ms-100ms+)
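The components above can be summed into a rough budget for a first request. This is a back-of-the-envelope sketch; the RTT, processing, and query figures are assumed illustrative values picked from the ranges above, not measurements.

```python
# Rough latency budget for a first HTTPS request (illustrative numbers).
rtt_ms = 70  # assumed cross-country round-trip time

budget = {
    "dns_lookup": 50,             # cached resolvers are often much faster
    "tcp_handshake": 1 * rtt_ms,  # SYN / SYN-ACK / ACK: one round trip
    "tls_handshake": 2 * rtt_ms,  # TLS 1.2; TLS 1.3 needs only one RTT
    "http_request": 1 * rtt_ms,   # request out, first response bytes back
    "server_processing": 100,
    "database_query": 30,
}

total_ms = sum(budget.values())
print(f"estimated first-request latency: {total_ms} ms")  # 460 ms
```

Note how three of the six components are multiples of the RTT: on a high-latency path, cutting round trips (TLS 1.3, connection reuse) matters more than shaving server time.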

Example: US East to US West

  • Physical distance: ~4,000 km
  • Speed of light: ~300,000 km/s in a vacuum (~200,000 km/s in fiber)
  • Minimum theoretical latency: ~13ms one way (~20ms in fiber)
  • Real-world RTT: typically 60-80ms (longer routes plus router and queueing delays)
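The numbers in the example follow directly from distance divided by signal speed. A minimal calculation, using ~300,000 km/s for light in a vacuum and ~200,000 km/s (about 2/3 of that) for light in optical fiber:

```python
def one_way_delay_ms(distance_km, speed_km_per_s):
    """Propagation delay for a signal covering distance_km at the given speed."""
    return distance_km / speed_km_per_s * 1000

distance = 4000  # ~US East to US West, km

vacuum = one_way_delay_ms(distance, 300_000)  # light in a vacuum
fiber = one_way_delay_ms(distance, 200_000)   # light in optical fiber

print(f"vacuum: {vacuum:.1f} ms one way, fiber: {fiber:.1f} ms one way")
# vacuum: 13.3 ms one way, fiber: 20.0 ms one way
```

Real cable routes are longer than the great-circle distance and add per-hop router delay, which is why observed RTTs end up several times the theoretical one-way minimum.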

Use Cases

Gaming: requires <50ms for good experience

Video calls: <150ms acceptable

E-commerce: <100ms page load ideal

Financial trading: microseconds matter

General web browsing: <200ms feels instant

Best Practices

Use CDNs to serve content from locations near users

Minimize number of round trips (combine requests, use HTTP/2)
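The payoff of cutting round trips is easy to see with a little arithmetic. The sketch below (illustrative numbers, not from any benchmark) compares ten sequential requests against one combined request that does the same work server-side:

```python
def total_latency_ms(rtt_ms, num_round_trips, processing_ms_per_trip):
    """Total time when requests are issued one at a time, one per round trip."""
    return num_round_trips * (rtt_ms + processing_ms_per_trip)

rtt = 80  # assumed cross-country round-trip time

sequential = total_latency_ms(rtt, 10, 10)  # 10 separate requests, 10 ms work each
batched = total_latency_ms(rtt, 1, 100)     # one combined request, all 100 ms of work

print(f"sequential: {sequential} ms, batched: {batched} ms")  # 900 ms vs 180 ms
```

The server does the same 100 ms of work either way; the difference is paying the network round trip once instead of ten times. HTTP/2 multiplexing achieves a similar effect by letting many requests share one connection concurrently.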

Implement caching at multiple levels
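A cache turns a network round trip into a local memory lookup for repeated reads. Here is a minimal in-memory cache with per-entry expiry, a sketch rather than production code (real systems would use something like Redis or an LRU bound on size):

```python
import time

class TTLCache:
    """Minimal in-memory cache where entries expire after a fixed TTL."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

cache = TTLCache(ttl_seconds=0.05)
cache.set("user:42", {"name": "Ada"})
hit = cache.get("user:42")   # served from memory, no network trip
time.sleep(0.06)
miss = cache.get("user:42")  # TTL elapsed: caller must refetch
```

The TTL is the knob trading freshness against latency: a longer TTL means more requests avoid the network, at the cost of serving staler data.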

Optimize database queries

Use connection pooling to avoid repeated connection setup

Compress responses to reduce transfer time

Implement lazy loading for non-critical content

Use async operations where appropriate

Interview Tips

What Interviewers Look For

  • Explain the speed of light limitation for geographic latency

  • Discuss CDNs as a primary solution for reducing latency

  • Mention the 3-tier latency impact: network, processing, database

  • Talk about HTTP/2 and HTTP/3 reducing latency through multiplexing

  • Explain how caching at different layers reduces effective latency

  • Know specific numbers: local cache ~1ms, same datacenter ~1-10ms, cross-country ~50-100ms, intercontinental ~100-300ms
