CortexCookie

Instagram System Design

Instagram is a globally distributed, media-heavy social platform optimized for content creation, engagement, and extremely fast feed consumption. The system must balance write-heavy workloads with ultra-low-latency reads at massive scale.

Functional Requirements

The system must support the following Functional Requirements:

  • Users can create posts (images / videos)
  • Users can like posts
  • Users can comment on posts
  • Users can follow / unfollow others
  • Users can view:
    • Home timeline (personalized feed)
    • User timeline (profile posts)

Non-Functional Requirements

Key system constraints:

  • Eventual consistency acceptable for writes
  • Feed generation must be extremely fast
  • Highly available system
  • Durable & persistent storage
  • Support hot vs cold data
  • Handle multiple user classes
  • Operate at global scale

User Behavior Diversity

Different user classes influence architecture:

  • Famous Users → Millions of followers
  • Active Users → Frequent consumers
  • Live Users → Real-time updates required
  • Passive Users → Rarely open the app
  • Inactive Users → No optimization needed

System behavior must adapt per user type.

Scale Estimation

Assumptions:

  • 2 Billion monthly active users
  • 1 Billion daily active users
  • 500 Million Posts / Day

For simplicity, assume:

1 day ≈ 10⁵ seconds (rounded for mental math)

POST TPS (transactions per second):

500 million / 10⁵ posts per sec = (500 × 10⁶) / 10⁵ posts per sec = 5k posts per sec

Engagement TPS:

Assume each user likes 10 posts and comments on 3 posts per day (on average).

(1 billion × 10) / 10⁵ = (10⁹ × 10) / 10⁵ = 100k likes per sec

(1 billion × 3) / 10⁵ = (10⁹ × 3) / 10⁵ = 30k comments per sec

Feed Read QPS

Reads vastly outnumber writes. Consider 20 feed requests per user per day.

10⁹ users × 20 = 2 × 10¹⁰ feed requests/day

(2 × 10¹⁰) / 10⁵ = 200,000 feed reads/sec

Even if we take peak QPS as 3× to 5× of average QPS, 200K × 5 = 1M feed reads/sec, which is huge.
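The back-of-envelope numbers above can be verified with a quick script (all inputs are the assumptions stated in this section):

```python
# Rounded day length used throughout this lesson for mental math.
SECONDS_PER_DAY = 10**5

posts_per_day = 500 * 10**6        # 500 million posts/day
daily_active_users = 10**9         # 1 billion DAU

post_tps = posts_per_day // SECONDS_PER_DAY               # posts written per second
like_tps = daily_active_users * 10 // SECONDS_PER_DAY     # 10 likes/user/day
comment_tps = daily_active_users * 3 // SECONDS_PER_DAY   # 3 comments/user/day
feed_qps = daily_active_users * 20 // SECONDS_PER_DAY     # 20 feed reads/user/day
peak_feed_qps = feed_qps * 5                              # peak = 5x average
```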

Core Entities

Primary data models:

  • User
  • Post
  • Like
  • Comment
  • Media / Asset

Each entity has distinct scaling and storage needs.

APIs

Post Creation

POST /posts
{
  caption,
  mediaUrl,
  ...
}

Media upload should happen via a presigned URL, which the client uses to upload directly to blob storage.

Client → Blob Storage (direct upload)

Avoids backend bottlenecks.
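A minimal sketch of the presigned-URL idea: the server signs a time-limited upload target with a secret the client never sees. This is a conceptual HMAC scheme, not S3's actual signing protocol; the host name and key layout are made up for illustration.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"server-side-secret"  # hypothetical; held only by the backend

def presign_upload_url(bucket: str, key: str, ttl_seconds: int = 3600) -> str:
    """Return a time-limited URL the client can PUT media to directly."""
    expires = int(time.time()) + ttl_seconds
    payload = f"PUT:{bucket}:{key}:{expires}".encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return (f"https://{bucket}.blob.example.com/{key}"
            f"?expires={expires}&signature={signature}")

url = presign_upload_url("ig-media", "media/123/original")
```

Blob storage verifies the same HMAC on the incoming PUT, so the post-creation backend never touches the media bytes.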

Engagement APIs

POST /likes/{postId}
POST /comments/{postId}
POST /follow/{userId}
POST /unfollow/{userId}

Timeline APIs

GET /timelines          → Home feed
GET /timelines/{userId} → User profile feed

High Level Design

Instagram's architecture separates responsibilities into independent services to support massive scale, high availability, and low-latency reads.

Think about different responsibilities:

  • User Creation → Identity-heavy
  • Follow / Unfollow Users → Graph network
  • Post Creation → Write-heavy
  • Timeline / Feed → Read-heavy & latency-critical

So let's create separate Microservices for different Responsibilities. Databases and other storage integrations are handled within each service boundary, allowing services to optimize their own storage and delivery strategies.

API Gateway

Role: Entry point for all client interactions.

Responsibilities

  • Routes requests to internal services
  • Handles authentication / authorization
  • Applies rate limiting & throttling
  • Centralizes logging & monitoring

The gateway prevents clients from directly accessing backend services and simplifies request management.

User Service

Owns: User identity and profile domain.

Responsibilities

  • User creation & updates
  • Profile retrieval
  • Account metadata management
  • User-related validations

A relational DB or a strongly consistent KV store is the right choice for the User database.

This service is frequently accessed by nearly all system components, making it a foundational dependency.

Follow Service

Owns: Social graph relationships.

Responsibilities

  • Follow / unfollow operations
  • Fetch followers & followees
  • Maintain user relationship edges

A wide-column DB (e.g., Cassandra) or a graph-optimized store is the best choice for the Follow database, since connections form a graph-like network.

The follow graph is a core driver of feed generation and requires high scalability and efficient querying.
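The follow graph can be pictured as two adjacency sets per user, one for each query direction. A minimal in-memory stand-in (names are illustrative; a real deployment would use partitioned wide-column rows):

```python
from collections import defaultdict

class FollowStore:
    """In-memory stand-in for the follow graph, indexed in both directions."""
    def __init__(self):
        self.followees = defaultdict(set)  # user_id -> users they follow
        self.followers = defaultdict(set)  # user_id -> users following them

    def follow(self, follower: str, followee: str) -> None:
        self.followees[follower].add(followee)
        self.followers[followee].add(follower)

    def unfollow(self, follower: str, followee: str) -> None:
        self.followees[follower].discard(followee)
        self.followers[followee].discard(follower)

    def get_followers(self, user: str) -> set:
        return self.followers[user]

store = FollowStore()
store.follow("alice", "bob")
store.follow("carol", "bob")
store.unfollow("carol", "bob")
```

Storing both directions doubles the writes but makes "who follows X" (needed for fan-out) and "who does X follow" (needed for feed reads) both single-key lookups.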

Post Creation Service

Owns: Post write workflow.

Responsibilities

  • Validate post payloads
  • Handle post creation requests
  • Store post images/videos in blob storage, from where they can be pulled into a CDN (content delivery network)
  • Persist post metadata (caption, description, creator_id, etc.) in a metadata database

Post creation is a write-heavy operation and must scale independently from feed reads.

A NoSQL distributed DB with high write throughput is the best choice for post metadata because of the continuous high-volume writes.

Timeline / Feed Service

Owns: Feed generation and retrieval.

Responsibilities

  • Generate user home timeline
  • Fetch relevant posts
  • Rank & order feed items
  • Serve low-latency responses

Feed retrieval is the most latency-sensitive and read-heavy path in the system.

Cache-first design (Redis / Memcached) + backing store is the right approach for Timeline / Feed Service.
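The cache-first pattern here is classic cache-aside: try the cache, and only rebuild from the backing store on a miss. A minimal sketch with a plain dict standing in for Redis:

```python
class FeedService:
    """Cache-first feed retrieval: hit the cache, fall back to the backing store."""
    def __init__(self, backing_store):
        self.cache = {}                  # user_id -> feed (stand-in for Redis)
        self.backing_store = backing_store

    def get_feed(self, user_id: str) -> list:
        feed = self.cache.get(user_id)
        if feed is None:                 # cache miss: rebuild and populate
            feed = self.backing_store(user_id)
            self.cache[user_id] = feed
        return feed

rebuild_calls = []
def rebuild(user_id):
    """Hypothetical expensive rebuild path (DB aggregation)."""
    rebuild_calls.append(user_id)
    return ["p3", "p2", "p1"]

svc = FeedService(rebuild)
first = svc.get_feed("alice")    # miss: hits the backing store
second = svc.get_feed("alice")   # hit: served from cache
```

A production version would add a TTL and eviction, but the read path shape is the same.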

Feature Extraction

Deep Dive: Post Creation

Media Processing & Multi-Resolution Support

In media-heavy systems like Instagram, storing a single uploaded image/video is insufficient. Different devices, screen sizes, and network conditions require multiple optimized variants of the same media. This introduces additional components into the post creation workflow. The system never serves the raw original directly to most users.

Optimized pipeline:

Client Upload → Post Ingestion Service → Blob Storage (Original) → Media Processing Pipeline → Blob Storage (Variants) → CDN

Media Processing Service (Critical Component)

Responsibilities

  • Generate multiple resolutions (thumbnail, medium, high)
  • Resize & compress images
  • Transcode videos into adaptive formats
  • Optimize for bandwidth & device constraints

Why needed:

  • Different devices require different sizes
  • Reduces payload & latency
  • Improves user experience
  • Saves CDN bandwidth cost

Example variants:

  • Thumbnail (low resolution)
  • Feed version (compressed)
  • High-resolution viewer version

Async Processing Trigger

When media is uploaded:

Post Ingestion Service → Emit MediaUploaded Event

Media Processing Service consumes events:

  • Fetch original media
  • Generate variants
  • Store derived assets

Avoids blocking post creation latency.
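The event-driven trigger can be sketched with a queue and a background worker; here a plain `queue.Queue` stands in for the MediaUploaded topic, and generating a variant is reduced to recording its storage key (the real work would be resizing/transcoding):

```python
import queue
import threading

VARIANTS = ("thumbnail", "medium", "high")
events = queue.Queue()            # stand-in for the MediaUploaded event topic
derived_assets = []               # stand-in for the variants bucket

def media_worker():
    """Consume MediaUploaded events and produce one derived asset per variant."""
    while True:
        post_id = events.get()
        if post_id is None:       # sentinel: shut down the worker
            break
        for variant in VARIANTS:  # real pipeline: fetch original, resize/transcode
            derived_assets.append(f"/media/{post_id}/{variant}")
        events.task_done()

worker = threading.Thread(target=media_worker)
worker.start()
events.put("123")                 # Post Ingestion Service emits the event
events.put(None)
worker.join()
```

Because the worker runs off the request path, post creation returns as soon as the event is enqueued.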

Blob Storage

Blob storage now holds:

  • Original media
  • Processed variants
  • Device-optimized formats

Consider these Variants:

/media/{postId}/original
/media/{postId}/thumbnail
/media/{postId}/medium
/media/{postId}/high

CDN Integration

The CDN sits in front of blob storage. If media is not found in the CDN, it is pulled from blob storage.

Client → CDN → Blob Storage

Benefits:

  • Edge caching
  • Low-latency global delivery
  • Offloads origin traffic

CDN primarily serves processed variants.

Metadata Implications

Post Metadata Store now keeps:

  • Media identifiers / keys
  • URLs for variants
  • Media type & attributes

Example:

{
  postId,
  media: {
    thumbnail_url,
    medium_url,
    high_url
  }
}

Timeline Generation

Timeline generation is one of the most critical read paths in Instagram.
A simple design often works at small scale but quickly collapses under real-world traffic.

Below are the intuitive naïve strategies for both home and user timelines.

User Timeline

The naïve strategy here is simple:

  1. Query Post Store by creator_id
  2. Sort by timestamp
  3. Paginate results
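The three steps above can be sketched directly; field names like `creator_id` and `ts` are assumptions about the post schema:

```python
def user_timeline(posts, creator_id, page=0, page_size=10):
    """Naive user timeline: filter by creator, newest first, then paginate."""
    mine = [p for p in posts if p["creator_id"] == creator_id]
    mine.sort(key=lambda p: p["ts"], reverse=True)
    start = page * page_size
    return mine[start:start + page_size]

posts = [
    {"id": "p1", "creator_id": "bob", "ts": 1},
    {"id": "p3", "creator_id": "bob", "ts": 3},
    {"id": "p2", "creator_id": "alice", "ts": 2},
]
page = user_timeline(posts, "bob", page=0, page_size=2)
```

With the Post store sharded by creator, this filter-sort-paginate runs inside a single partition.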

Why This Works Better?

  • Single-partition access pattern
  • Predictable query cost
  • No cross-user aggregation
  • Naturally cacheable

User timeline is fundamentally easier than home timeline.

Home Timeline: Naïve Approaches

A straightforward approach:
  1. Fetch users followed by User A from Follow DB
  2. For each followed user:
    • Query Post Store for recent posts
  3. Aggregate posts
  4. Sort by creation time
  5. Return top N results
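The naive read-time aggregation looks like this in miniature; note the loop that issues one post query per followed user:

```python
def naive_home_timeline(follow_db, post_store, user, limit=10):
    """Naive read-time aggregation: one post query per followed user."""
    followees = follow_db[user]                    # 1 follow query
    candidates = []
    for followee in followees:                     # N post queries (fan-out reads)
        candidates.extend(post_store.get(followee, []))
    candidates.sort(key=lambda p: p["ts"], reverse=True)
    return candidates[:limit]

follow_db = {"a": ["b", "c"]}
post_store = {
    "b": [{"id": "p1", "ts": 1}],
    "c": [{"id": "p2", "ts": 2}],
}
feed = naive_home_timeline(follow_db, post_store, "a")
```

At toy scale this is fine; at 1,000 followees the loop body becomes 1,000 network round trips per feed request.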

Though this is a simple approach, it fails at scale because of explosive fan-out reads.

If User A follows 1,000 users:

  • 1 Follow query
  • 1,000 Post queries

Latency and DB pressure explode, and the slowest dependency determines response time. This direct aggregation approach becomes extremely expensive.

Even a few slow queries β†’ feed delays.

How should we shard the post table?

If User A follows users B, C, and D, we need posts from B, C, and D.

  • Shard posts by post id or by user id?
  • We don't know the post ids up front; we need posts for all these users, so let's shard by user id.
  • But then we still have to aggregate data across shards.

Will caching posts solve the problem?

No. Even if posts are cached, the system still performs fan-out reads.

Alternative Naïve Strategy: Full Scan + Filter

Another bad but intuitive idea:

  1. Scan recent posts globally
  2. Filter posts from followed users
  3. Sort & limit
  4. Cache

Why This Is Worse?

This approach is inefficient because it requires scanning a massive global dataset for every feed request, leading to extreme read amplification and wasted computation. Caching does not fix the problem because the system must still perform the expensive global scan before knowing what to cache, and personalized feeds have low cache reuse.

Timeline Generation Deep Dive: Solution

Populate Feed Cache on Write (Fan-Out on Write)

Instead of generating timelines at read time, the system shifts work to the write path.

When a user creates a post:

  • Identify followers from Follow Service
  • For each follower:
    • Insert post_id into their feed cache

Result:

  • Feed reads become simple lookups
  • No expensive aggregation during read requests
  • Predictable low-latency timeline retrieval

This converts read amplification into controlled write amplification, which is far more scalable for feed-heavy systems.
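A minimal sketch of fan-out on write: each follower keeps a bounded per-user feed cache, and post creation pushes the new post id into every one of them. The cap and names are illustrative:

```python
from collections import defaultdict, deque

FEED_LIMIT = 100                                   # illustrative per-user feed cap
feed_cache = defaultdict(lambda: deque(maxlen=FEED_LIMIT))

def on_post_created(post_id, author, followers_of):
    """Fan-out on write: push the new post id into every follower's feed cache."""
    for follower in followers_of(author):
        feed_cache[follower].appendleft(post_id)   # newest first; oldest drops off

def read_feed(user):
    """The feed read is now a simple cache lookup, no aggregation."""
    return list(feed_cache[user])

on_post_created("p1", "bob", lambda u: {"alice", "carol"})
on_post_created("p2", "bob", lambda u: {"alice"})
```

Bounding the deque is what keeps the write amplification "controlled": each follower stores at most `FEED_LIMIT` entries regardless of how much they follow.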


Problem: High-Follower Users & Write Amplification

With fan-out on write, post creation requires updating the feed cache of every follower.

For highly popular users:

  • Millions of followers
  • Millions of feed updates per post
  • Increased latency on write path
  • Risk of system overload

Synchronous updates become impractical.

Solution: Asynchronous Fan-Out

Shift feed updates to an async pipeline.

Post Ingestion Service:

  • Persist post metadata
  • Emit a PostCreated event to the Post topic

Background workers / consumers:

  • Process follower fan-out
  • Update feed caches independently
  • Retry on failures

Benefits:

  • Write latency remains low
  • Workload spreads across workers
  • Prevents traffic spikes from blocking requests
  • Improves system resilience

Async processing decouples user-facing latency from heavy fan-out computation.

Hybrid Approach

  • Famous users → generate followers' feeds on read
  • Active users (active this month) → populate the feed cache on post write
  • Live users → populate the feed cache and push a live update via WebSocket (discussed in the next section)
  • Passive users (not active in over a month) → skip feed-cache generation
  • Inactive users → no feed generated at all
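The per-class routing can be expressed as a small decision function; the follower threshold and class labels are illustrative, not real Instagram values:

```python
def fanout_strategy(author_followers: int, follower_profile: str) -> str:
    """Pick a feed-population strategy per user class (thresholds illustrative)."""
    if author_followers > 1_000_000:
        return "fan-out-on-read"        # famous author: too many followers to push to
    if follower_profile == "inactive":
        return "skip"                   # no feed generated at all
    if follower_profile == "passive":
        return "skip-cache"             # rebuild on demand if they ever return
    if follower_profile == "live":
        return "fan-out-on-write+push"  # cache update plus WebSocket notification
    return "fan-out-on-write"           # active-user default
```

In practice this check runs inside the fan-out workers, so a single post can mix strategies across its follower set.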

How Do We Inform Live Users?

Live users require real-time awareness of new posts without waiting for feed refresh.

Flow:

Post created → Post Ingestion Service emits an event

Post workers process fan-out / feed updates

Timeline cache updated

Notification event pushed to the WebSocket Manager

The WebSocket Manager keeps a mapping of WebSocket Handlers → connections (stored in a cache)

WebSocket Handler delivers the update to active connections
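A minimal sketch of the delivery step: the manager holds the user-to-connection mapping and pushes only to users with a live connection. A list stands in for a real socket here:

```python
class WebSocketManager:
    """Maps user ids to live connections; pushes new-post notifications."""
    def __init__(self):
        self.connections = {}           # user_id -> connection (stand-in for cache)

    def register(self, user_id, connection):
        self.connections[user_id] = connection

    def notify(self, user_ids, message):
        delivered = []
        for uid in user_ids:
            conn = self.connections.get(uid)
            if conn is not None:        # only live users have open connections
                conn.append(message)    # stand-in for conn.send(message)
                delivered.append(uid)
        return delivered

mgr = WebSocketManager()
alice_conn = []
mgr.register("alice", alice_conn)
delivered = mgr.notify(["alice", "bob"], "new_post:p1")
```

Users without a connection (here, "bob") are simply skipped; they will see the post on their next feed read.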

Result:

✔ Connected users instantly see new posts

✔ No polling / refresh required

✔ Low-latency user experience


Hot and Cold Posts:

Hot and cold data separation is essential in Instagram due to extreme read skew.

Most content access concentrates on:

✔ Recent posts
✔ Popular posts
✔ Frequently viewed media

Older content becomes rarely accessed.

New Post → Hot Tier (cache / fast DB) → Gradual Demotion → Cold Tier (cheap storage)

Promotion/demotion driven by:

  • Access frequency
  • Recency
  • Engagement signals
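A tiering decision based on those signals can be sketched as a simple rule; the one-week TTL and access threshold are illustrative assumptions, not tuned values:

```python
import time

HOT_TTL = 7 * 24 * 3600     # illustrative: posts older than a week may go cold
MIN_HOT_ACCESSES = 10       # illustrative engagement threshold

def tier_for(post, now=None):
    """Decide a post's storage tier from recency and access frequency."""
    now = now or time.time()
    age = now - post["created_at"]
    if age < HOT_TTL or post["recent_accesses"] >= MIN_HOT_ACCESSES:
        return "hot"        # cache / fast DB
    return "cold"           # cheap archival storage

NOW = 1_000_000_000
fresh = {"created_at": NOW - 3600, "recent_accesses": 0}
stale_popular = {"created_at": NOW - 30 * 24 * 3600, "recent_accesses": 50}
stale_quiet = {"created_at": NOW - 30 * 24 * 3600, "recent_accesses": 1}
```

A background job would periodically re-evaluate this rule and move posts between tiers.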

An Archival Service fetches old/cold posts from the post database and moves them into an archival database.

