Designing A Social News Feed System (Push vs Pull Architecture)

1) Problem Clarification

EN

We need to design a social feed system similar to Facebook/Instagram/TikTok.

Core operations:

  • publish a post
  • fetch personalized timeline
  • rank feed
  • update feed when following relationships or engagement changes

VI

Design a news feed system similar to Facebook/Instagram/TikTok.

Main functions:

  • publish a post
  • fetch a personalized timeline
  • rank the feed
  • update the feed as follow relationships and engagement change

2) Requirements

EN — Functional

✔ submit post
✔ distribute post to followers
✔ fetch user feed
✔ rank posts
✔ like/comment counter
✔ hide/report capability

VI — Chức năng

✔ publish post
✔ distribute posts to followers
✔ fetch a personalized feed
✔ rank posts
✔ count likes/comments
✔ report/hide

EN — Non-functional

✔ low latency (<100ms fetch)
✔ scalable to 100M DAU
✔ efficient fan-out

VI — Phi chức năng

✔ low latency (<100ms)
✔ scale to 100M DAU
✔ efficient fan-out

3) Key Architectural Problem

Fan-out vs Fan-in trade-off

  • Fan-out (push): precompute each follower's timeline when a post is published
  • Fan-in (pull): compute the timeline when the user opens the app

4) Approach Selection

EN

Hybrid:

✔ Fan-out for regular users
✔ Fan-in for celebrities with millions of followers

VI

Hybrid:

✔ fan-out for regular users (few followers)
✔ fan-in for celebrities (one post reaching 10M followers)
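
The hybrid rule above can be sketched as a per-author decision at publish time. This is a minimal illustration, not a prescribed design: the `Author` type, the `delivery_strategy` function, and the one-million cutoff are assumptions made for the example (the text only says "millions of followers").

```python
from dataclasses import dataclass

# Assumed cutoff for illustration; a real system would tune this value.
CELEBRITY_FOLLOWER_THRESHOLD = 1_000_000


@dataclass
class Author:
    user_id: str
    follower_count: int


def delivery_strategy(author: Author,
                      threshold: int = CELEBRITY_FOLLOWER_THRESHOLD) -> str:
    """Return "push" (fan-out on write) or "pull" (fan-in on read)."""
    return "pull" if author.follower_count >= threshold else "push"
```

A post from an author with 500 followers would be pushed into follower timelines, while a post from an author with 10M followers would be left for readers to pull at request time.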

5) Architecture Overview

User Post → Post Service → Distribution Service → Feed Store
                             ↓ ranking
Feed Request → Feed API → Timeline Read DB / Cache

VI

Post → Post Service → Distribution → Feed Store
Timeline request → Feed API → read feed store / cache

6) Feed Storage Model

EN

Use per-user timeline store:

FEED:<user_id>: [
   (post_id, score, ts)
]

Entries are stored sorted by ranking score and timestamp.

VI

Each user has their own timeline:

FEED:<user_id>: [
   (post_id, score, ts)
]

Sorted by score and time.
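
As a rough in-memory sketch of the per-user timeline described above, the class below keeps `(post_id, score, ts)` entries ordered by score and then timestamp, highest first. The names `FeedEntry` and `TimelineStore` and the `max_len` cap are hypothetical; a production store would more likely be one of the options listed in the storage section.

```python
import bisect
from typing import NamedTuple


class FeedEntry(NamedTuple):
    post_id: str
    score: float
    ts: int


class TimelineStore:
    """Per-user timelines kept in descending (score, ts) order."""

    def __init__(self, max_len: int = 500):
        self.max_len = max_len  # assumed cap so timelines do not grow forever
        self._feeds: dict[str, list[tuple[float, int, str]]] = {}

    def insert(self, user_id: str, entry: FeedEntry) -> None:
        feed = self._feeds.setdefault(user_id, [])
        # Negated keys make ascending insort produce a descending order.
        bisect.insort(feed, (-entry.score, -entry.ts, entry.post_id))
        del feed[self.max_len:]  # drop the lowest-ranked overflow

    def top(self, user_id: str, n: int) -> list[FeedEntry]:
        rows = self._feeds.get(user_id, [])[:n]
        return [FeedEntry(post_id, -neg_score, -neg_ts)
                for neg_score, neg_ts, post_id in rows]
```

Ties on score are broken by the newer timestamp, which is one reasonable reading of "sorted by score and timestamp".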

7) Fan-out Execution

EN

When a post is published:

  • look up the follower list
  • enqueue fan-out jobs
  • a consumer inserts the post into each follower's timeline store

VI

On post:

  • fetch the follower list
  • enqueue tasks
  • a consumer inserts the post into each follower's feed
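
The three steps above might look like the following single-process sketch. The in-memory `followers` graph, the `jobs` deque standing in for a message queue, and the function names are all assumptions for illustration; a real deployment would use a durable queue and separate worker processes.

```python
from collections import defaultdict, deque

# Hypothetical follow graph: author -> list of follower ids.
followers = {"alice": ["bob", "carol"]}

timelines = defaultdict(list)  # follower id -> list of (post_id, ts)
jobs = deque()                 # stands in for a message queue


def publish(author_id: str, post_id: str, ts: int) -> int:
    """Look up followers and enqueue one fan-out job per follower."""
    targets = followers.get(author_id, [])
    for follower_id in targets:
        jobs.append((follower_id, post_id, ts))
    return len(targets)


def run_worker() -> int:
    """Drain the queue, inserting each post into the follower's timeline."""
    processed = 0
    while jobs:
        follower_id, post_id, ts = jobs.popleft()
        timelines[follower_id].append((post_id, ts))
        processed += 1
    return processed
```

Publishing returns immediately after enqueueing, so the author's request does not wait on the per-follower inserts.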

8) Ranking Model

EN

The ranking formula combines:

  • recency
  • affinity (relationship strength)
  • popularity (likes/comments)
  • ML-predicted engagement score

VI

Ranking uses:

  • recency
  • affinity (viewer-to-poster relationship)
  • popularity (likes/comments)
  • ML scoring that predicts engagement
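
One simple way to combine the four signals is a weighted sum, sketched below. The weights, the six-hour half-life, the log damping, and the double weight for comments are illustrative assumptions; the source does not specify a formula, and `predicted_engagement` is assumed to come from a separate model as a value in [0, 1].

```python
import math


def rank_score(age_hours: float, affinity: float, likes: int, comments: int,
               predicted_engagement: float,
               weights: tuple = (0.4, 0.2, 0.1, 0.3),
               half_life_hours: float = 6.0) -> float:
    """Weighted sum of recency, affinity, popularity, and an ML score."""
    # Recency halves every half_life_hours (assumed decay shape).
    recency = 0.5 ** (age_hours / half_life_hours)
    # Log damping keeps viral posts from dominating; comments count double.
    popularity = math.log1p(likes + 2 * comments)
    w_recency, w_affinity, w_popularity, w_ml = weights
    return (w_recency * recency
            + w_affinity * affinity
            + w_popularity * popularity
            + w_ml * predicted_engagement)
```

With everything else equal, a newer post scores higher than an older one, and a post from a close connection scores higher than one from a distant connection.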

9) Read Path (Feed Fetch)

EN

  • Read feed store (sorted set / priority list)
  • If needed, compute missing posts via fan-in pull
  • TTL-based caching for fast scroll paging

VI

  • read the timeline store
  • if posts are missing, fall back to fan-in
  • TTL cache for smooth scrolling
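
A sketch of the read path, assuming the pushed timeline and the pulled (fan-in) posts each arrive as lists of `(score, ts, post_id)` already sorted highest first. The function name, tuple layout, and offset-style `cursor` are assumptions for the example.

```python
import heapq


def fetch_feed(pushed, pulled, limit=20, cursor=0):
    """Merge two descending-sorted lists, drop duplicates, return one page.

    `cursor` is the number of distinct posts already shown to the user.
    """
    # heapq.merge with reverse=True expects inputs sorted largest first.
    merged = heapq.merge(pushed, pulled, reverse=True)
    seen = set()
    skipped = 0
    page = []
    for _score, _ts, post_id in merged:
        if post_id in seen:
            continue  # the same post may exist in both sources
        seen.add(post_id)
        if skipped < cursor:
            skipped += 1
            continue
        page.append(post_id)
        if len(page) == limit:
            break
    return page
```

The merge is lazy, so fetching the first page touches only as many entries as the page needs.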

10) Write Path (Publishing Post)

EN

  • insert post
  • compute distribution fan-out tasks
  • async processing
  • partial failures are logged and retried

VI

  • store the post
  • compute the fan-out
  • process asynchronously
  • log partial failures and retry
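
The "log and retry" step could be handled roughly as below: only the followers whose insert failed are retried, up to a bounded number of attempts. `fan_out_with_retry`, the `insert` callback, and `max_attempts` are hypothetical names, and the sketch runs synchronously for clarity where a real system would do this in async workers.

```python
import logging

log = logging.getLogger("fanout")


def fan_out_with_retry(post_id, follower_ids, insert, max_attempts=3):
    """Insert post_id for each follower, retrying only the failures.

    Returns the follower ids that still failed after max_attempts.
    """
    pending = list(follower_ids)
    for attempt in range(1, max_attempts + 1):
        failed = []
        for follower_id in pending:
            try:
                insert(follower_id, post_id)
            except Exception as exc:
                log.warning("fan-out failed: post=%s follower=%s attempt=%d (%s)",
                            post_id, follower_id, attempt, exc)
                failed.append(follower_id)
        if not failed:
            return []
        pending = failed
    return pending
```

Anything still in the returned list would typically be sent to a dead-letter queue or left for the read path to recover via fan-in.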

11) Handling High Follow Graphs (Celeb Problem)

EN

With fan-out, a single celebrity post generates a massive number of timeline writes.

Solution:
→ fan-in: fetch their posts at feed request time
→ pre-compute trending lists to speed up ranking

VI

Celebrities: one post = millions of inserts.
Solution:

  • fan-in at read time
  • precompute a trending list
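
Fan-in at read time can be sketched as collecting the recent posts of the celebrities a user follows instead of writing to every follower's timeline. The function name, the `(ts, post_id)` tuple layout, and the `per_author` cap are assumptions made for this example.

```python
def pull_celebrity_posts(followed_celebrities, recent_posts, since_ts,
                         per_author=5):
    """Collect recent celebrity posts at request time, newest first.

    recent_posts maps an author id to a list of (ts, post_id) tuples.
    """
    collected = []
    for author_id in followed_celebrities:
        fresh = [p for p in recent_posts.get(author_id, []) if p[0] > since_ts]
        # Cap per author so one prolific account cannot flood the feed.
        collected.extend(sorted(fresh, reverse=True)[:per_author])
    return sorted(collected, reverse=True)
```

The cost moves from one write per follower at publish time to one small read per followed celebrity at fetch time.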

12) Storage & Indexing

EN

Feed store options:

  • Redis sorted set
  • Cassandra time series
  • Elasticsearch ranking index
  • RocksDB for embedding store

VI

Store using:

  • Redis sorted set
  • Cassandra time series
  • Elasticsearch ranking index
  • RocksDB for embeddings

13) Cache Strategy

EN

  • feed caching
  • post metadata caching
  • invalidation on engagement updates

VI

  • cache timelines
  • cache post metadata
  • invalidate when engagement changes
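
A minimal TTL cache with explicit invalidation, to illustrate the strategy above. `TTLCache` and its injectable `clock` parameter are hypothetical; in practice this role is usually played by a shared cache service, not a per-process dictionary.

```python
import time


class TTLCache:
    """Tiny in-process cache: entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested
        self._data = {}

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self.clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries
            return None
        return value

    def set(self, key, value) -> None:
        self._data[key] = (value, self.clock() + self.ttl)

    def invalidate(self, key) -> None:
        """Call when engagement on a cached post or feed changes."""
        self._data.pop(key, None)
```

A short TTL bounds staleness even if an invalidation is missed, while explicit invalidation keeps hot posts fresh between expiries.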

14) Engagement Event Pipeline

EN

Likes/comments update ML score:

  • publish events
  • update ranking
  • update timeline ordering

VI

Likes/comments emit events:

  • update the ML score
  • reorder the timeline
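
The event flow above can be sketched as a handler that accumulates engagement per post and a function that reorders a timeline by it. The weighted counter here is only a stand-in for the ML score mentioned in the text, and the event shape, `WEIGHTS`, and function names are assumptions.

```python
from collections import Counter

# Assumed weights: a comment counts as stronger engagement than a like.
WEIGHTS = {"like": 1.0, "comment": 2.0}

engagement = Counter()  # post_id -> accumulated engagement score


def handle_event(event: dict) -> None:
    """Consume one engagement event, e.g. {"type": "like", "post_id": "p1"}."""
    engagement[event["post_id"]] += WEIGHTS.get(event["type"], 0.0)


def reorder(timeline: list) -> list:
    """Return post ids sorted by engagement, highest first.

    sorted() is stable, so posts with equal engagement keep their order.
    """
    return sorted(timeline, key=lambda post_id: engagement[post_id],
                  reverse=True)
```

In a larger system the handler would publish the updated score to the ranking service instead of sorting timelines directly.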

15) Observability

EN

Monitor:

  • feed latency
  • fanout backlog
  • ranking model errors
  • stuck follower jobs
  • engagement rate trend

VI

Monitor:

  • feed latency
  • fan-out backlog
  • ranking errors
  • stuck jobs
  • engagement trend

16) Future Enhancements

EN

  • ML personalized ranking
  • exploring embeddings
  • feed diversification
  • cross-device feed sync

VI

  • user-dependent ranking
  • ML embeddings
  • diversified feed
  • multi-device sync