1) Problem Clarification / Làm rõ bài toán
EN
We need to design a social feed system similar to Facebook/Instagram/TikTok.
Core operations:
- publish a post
- fetch personalized timeline
- rank feed
- update the feed when follow relationships or engagement change
VI
Thiết kế hệ thống newsfeed giống Facebook/Instagram/TikTok.
Chức năng chính:
- đăng bài
- lấy timeline cá nhân hoá
- rank feed
- update feed theo follow/engagement
2) Requirements / Yêu cầu hệ thống
EN — Functional
✔ submit post
✔ distribute post to followers
✔ fetch user feed
✔ rank posts
✔ like/comment counter
✔ hide/report capability
VI — Chức năng
✔ publish post
✔ phân phối bài tới follower
✔ lấy feed cá nhân hoá
✔ rank post
✔ đếm like/comment
✔ report/hide
EN — Non-functional
✔ low latency (<100ms fetch)
✔ scalable to 100M DAU
✔ efficient fan-out
VI — Phi chức năng
✔ latency thấp (<100ms)
✔ scale tới 100M DAU
✔ fan-out hiệu quả
3) Key Architectural Problem / Vấn đề kiến trúc chính
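Fan-out vs fan-in trade-off:
- Fan-out (push): precompute each follower's timeline when a post is published. Reads are cheap, but one post costs one write per follower.
- Fan-in (pull): compute the timeline when the user opens the app. Writes are cheap, but each read must gather posts from every followed account.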
4) Approach Selection / Lựa chọn chiến lược
EN
Hybrid:
✔ Fan-out for regular users
✔ Fan-in for celebrities with millions of followers
VI
Hybrid:
✔ fan-out khi user bình thường (ít follower)
✔ fan-in khi celeb (1 bài 10M follower)
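The hybrid rule can be sketched as a per-author switch. This is a minimal illustration, not a definitive policy: the `Author` type, the function names, and the 1,000,000-follower cutoff are assumptions, since the text only says "millions of followers".

```python
from dataclasses import dataclass

# Hypothetical cutoff: the source only says "millions of followers".
CELEBRITY_FOLLOWER_THRESHOLD = 1_000_000


@dataclass(frozen=True)
class Author:
    user_id: str
    follower_count: int


def delivery_strategy(author: Author, threshold: int = CELEBRITY_FOLLOWER_THRESHOLD) -> str:
    """Return "fan_out" (push on write) or "fan_in" (pull on read)."""
    return "fan_in" if author.follower_count >= threshold else "fan_out"


def writes_saved(author: Author) -> int:
    """Timeline inserts avoided per post when the author is served via fan-in."""
    return author.follower_count if delivery_strategy(author) == "fan_in" else 0
```

In practice the threshold would probably be tuned against queue capacity and read latency rather than fixed in code.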
5) Architecture Overview / Kiến trúc tổng quan
User Post → Post Service → Distribution Service → Feed Store
↓ ranking
Feed Request → Feed API → Timeline Read DB / Cache
VI
Bài đăng → Post Service → Distribution → Feed Store
Request timeline → Feed API → read feed store / cache
6) Feed Storage Model / Mô hình lưu feed
EN
Use per-user timeline store:
FEED:<user_id>: [
(post_id, score, ts)
]
Entries are stored sorted by ranking score and timestamp.
VI
Mỗi user có timeline riêng:
FEED:<user_id>: [
(post_id, score, ts)
]
Sắp xếp theo score + thời gian.
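A minimal in-memory sketch of the `FEED:<user_id>` structure follows. It is a stand-in for a real store, not a database client. The `FeedStore` class, the `max_len` cap, and the tie-break order (score first, then newest timestamp) are assumptions made for illustration.

```python
import bisect
from collections import defaultdict
from typing import NamedTuple


class FeedEntry(NamedTuple):
    post_id: str
    score: float
    ts: int  # Unix seconds


class FeedStore:
    """In-memory stand-in for FEED:<user_id>; not a real database client."""

    def __init__(self, max_len: int = 500):  # max_len is an assumed cap
        self.max_len = max_len
        self._feeds = defaultdict(list)  # user_id -> sorted list of sort keys

    @staticmethod
    def _key(entry: FeedEntry):
        # Negate so ascending list order means highest score, then newest, first.
        return (-entry.score, -entry.ts, entry.post_id)

    def insert(self, user_id: str, entry: FeedEntry) -> None:
        feed = self._feeds[user_id]
        bisect.insort(feed, self._key(entry))
        del feed[self.max_len:]  # trim the lowest-ranked tail

    def page(self, user_id: str, offset: int = 0, limit: int = 20):
        rows = self._feeds[user_id][offset:offset + limit]
        return [FeedEntry(pid, -neg_score, -neg_ts) for neg_score, neg_ts, pid in rows]
```

Capping the list keeps per-user storage bounded; older posts beyond the cap would have to come from the pull path.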
7) Fan-out Execution / Thực thi fan-out
EN
When a post is published:
- fetch the follower list
- enqueue jobs
- consumer inserts post into each timeline store
VI
Khi post:
- lấy list follower
- enqueue tasks
- consumer insert vào feed follower
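The three steps above can be sketched with an in-memory queue. Everything here is hypothetical scaffolding: `FOLLOWERS`, `job_queue`, `timelines`, and `BATCH_SIZE` stand in for the follower graph, a real message queue, and the feed store, and the consumer runs synchronously so the example stays deterministic.

```python
from collections import defaultdict, deque

# Hypothetical in-memory stand-ins for the follower graph, job queue, and feed store.
FOLLOWERS = {"alice": ["bob", "carol", "dave"]}
job_queue = deque()
timelines = defaultdict(list)  # user_id -> [(post_id, score, ts)]

BATCH_SIZE = 2  # assumed batch size, kept tiny for the example


def publish(author_id, post_id, score, ts):
    """Producer: split the follower list into batches and enqueue one job per batch."""
    followers = FOLLOWERS.get(author_id, [])
    for i in range(0, len(followers), BATCH_SIZE):
        job_queue.append({
            "post": (post_id, score, ts),
            "followers": followers[i:i + BATCH_SIZE],
        })


def run_consumer():
    """Consumer: drain the queue and insert the post into each follower's timeline."""
    processed = 0
    while job_queue:
        job = job_queue.popleft()
        for follower_id in job["followers"]:
            timelines[follower_id].append(job["post"])
        processed += 1
    return processed
```

Batching the follower list means one slow or failed job affects only a slice of followers, not the whole fan-out.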
8) Ranking Model / Mô hình ranking
EN
Formula combines:
- recency
- affinity (relationship strength)
- popularity (likes/comments)
- predicted engagement ML score
VI
Ranking dùng:
- độ mới
- affinity (quan hệ người xem – người post)
- độ phổ biến (like/comment)
- ML scoring dự đoán engagement
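One way to combine the four signals is a weighted sum, sketched below. The text names the signals but not how they are combined, so the weights, the 6-hour half-life, the log damping, and the 2x weight on comments are all assumptions chosen only to make the example concrete.

```python
import math

# Assumed weights and half-life; the source names the signals but not how they combine.
WEIGHTS = {"recency": 0.3, "affinity": 0.3, "popularity": 0.2, "predicted": 0.2}
HALF_LIFE_HOURS = 6.0


def recency(age_hours: float) -> float:
    """Exponential decay: 1.0 for a brand-new post, 0.5 after one half-life."""
    return 0.5 ** (age_hours / HALF_LIFE_HOURS)


def popularity(likes: int, comments: int) -> float:
    """Log-damped engagement count squashed into [0, 1)."""
    raw = math.log1p(likes + 2 * comments)  # comments weighted 2x (assumption)
    return raw / (1.0 + raw)


def rank_score(age_hours, affinity, likes, comments, predicted_engagement):
    """Weighted sum of the four signals; affinity and predicted_engagement are in [0, 1]."""
    return (WEIGHTS["recency"] * recency(age_hours)
            + WEIGHTS["affinity"] * affinity
            + WEIGHTS["popularity"] * popularity(likes, comments)
            + WEIGHTS["predicted"] * predicted_engagement)
```

Because every term is bounded by its weight and the weights sum to 1, the score stays in [0, 1], which makes scores comparable across posts.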
9) Read Path (Feed Fetch) / Luồng đọc feed
EN
- Read feed store (sorted set / priority list)
- If needed, compute missing posts via fan-in pull
- TTL-based caching for fast scroll paging
VI
- đọc timeline store
- nếu thiếu thì fan-in
- cache TTL để scroll mượt
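The read path can be sketched as a cached page fetch that falls back to a pull when the stored timeline is too short. `read_store` and `pull_missing` are hypothetical callables standing in for the timeline store and the fan-in query, and the 30-second TTL is an assumed value.

```python
import time

CACHE_TTL_SECONDS = 30  # assumed TTL
_page_cache = {}        # (user_id, offset, limit) -> (expires_at, page)


def fetch_feed(user_id, offset, limit, read_store, pull_missing, now=time.time):
    """Return one feed page as a list of (post_id, score, ts), highest score first.

    read_store(user_id) and pull_missing(user_id) are hypothetical callables standing
    in for the precomputed timeline store and the fan-in pull.
    """
    key = (user_id, offset, limit)
    hit = _page_cache.get(key)
    if hit and hit[0] > now():
        return hit[1]  # served from cache, no store read

    entries = list(read_store(user_id))
    if len(entries) < offset + limit:
        # Stored timeline is too short for this page: pull the rest and de-duplicate.
        seen = {post_id for post_id, _, _ in entries}
        entries += [e for e in pull_missing(user_id) if e[0] not in seen]

    entries.sort(key=lambda e: (e[1], e[2]), reverse=True)
    page = entries[offset:offset + limit]
    _page_cache[key] = (now() + CACHE_TTL_SECONDS, page)
    return page
```

Caching by `(user_id, offset, limit)` keeps a scroll session stable for the TTL window, at the cost of showing slightly stale ordering.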
10) Write Path (Publishing Post) / Luồng ghi
EN
- insert post
- compute distribution fan-out tasks
- async processing
- partial failures are logged and retried
VI
- lưu post
- tính fan-out
- xử lý async
- partial fail lưu log + retry
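The "log and retry partial failures" step might look like the sketch below. It is synchronous and has no backoff, purely to keep it short; the `insert` callable and `max_attempts` are assumptions, and the approach only works if timeline inserts are idempotent.

```python
def deliver_with_retry(post, follower_ids, insert, max_attempts=3):
    """Try to insert `post` into every follower's timeline; retry only the failures.

    `insert(follower_id, post)` is a hypothetical callable that raises on failure.
    Returns (failed_ids, log): followers that still failed after max_attempts,
    and one log line per failed insert.
    """
    pending = list(follower_ids)
    log = []
    for attempt in range(1, max_attempts + 1):
        failed = []
        for follower_id in pending:
            try:
                insert(follower_id, post)
            except Exception as exc:  # broad on purpose: any failure is retried
                failed.append(follower_id)
                log.append(f"attempt={attempt} follower={follower_id} error={exc}")
        pending = failed
        if not pending:
            break
    return pending, log
```

Followers that are still in the returned `failed_ids` list would typically go to a dead-letter queue for later inspection.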
11) Handling High Follow Graphs (Celeb Problem)
EN
Celebrities generate massive write load: a single post means millions of timeline inserts.
Solution:
- fan-in: fetch their posts at feed request time
- precompute trending lists to speed up ranking
VI
Celeb: 1 bài = hàng triệu insert.
Giải pháp:
- fan-in khi đọc
- precompute trending list
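At read time, the pushed timeline and the pulled celebrity posts have to be merged. A minimal sketch using the standard library's `heapq.merge` follows; the function name, the argument shapes, and the assumption that every input list is already sorted by (score, ts) descending are illustrative.

```python
import heapq
from itertools import islice


def merged_feed(precomputed, celebrity_posts_by_author, followed_celebrities, limit=20):
    """Merge the pushed timeline with pulled celebrity posts, highest score first.

    Every input list holds (post_id, score, ts) tuples already sorted by
    (score, ts) descending, which is what heapq.merge requires with reverse=True.
    """
    sources = [precomputed]
    sources += [celebrity_posts_by_author.get(c, []) for c in followed_celebrities]
    merged = heapq.merge(*sources, key=lambda e: (e[1], e[2]), reverse=True)
    return list(islice(merged, limit))
```

Since `heapq.merge` is lazy, only the first `limit` entries are ever materialized, so the cost depends on the page size and the number of followed celebrities, not on how many posts each celebrity has.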
12) Storage & Indexing / Lưu trữ và index
EN
Feed store options:
- Redis sorted set
- Cassandra time series
- Elasticsearch ranking index
- RocksDB for embedding store
VI
Lưu bằng:
- Redis sorted set
- Cassandra time series
- ES ranking index
- RocksDB cho embedding
13) Cache Strategy / Chiến lược cache
EN
- feed caching
- post metadata caching
- invalidation on engagement updates
VI
- cache timeline
- cache post metadata
- invalidate khi engagement thay đổi
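Invalidation on engagement updates can be illustrated with a small read-through cache for post metadata. The `PostMetadataCache` class and its `load` callable are hypothetical; a production cache would also need eviction and a TTL.

```python
class PostMetadataCache:
    """Read-through cache for post metadata; `load` is a hypothetical loader callable."""

    def __init__(self, load):
        self._load = load
        self._entries = {}
        self.misses = 0

    def get(self, post_id):
        """Return cached metadata, loading it from the backing store on a miss."""
        if post_id not in self._entries:
            self.misses += 1
            self._entries[post_id] = self._load(post_id)
        return self._entries[post_id]

    def invalidate(self, post_id):
        """Call when a like/comment changes the post so the next read reloads it."""
        self._entries.pop(post_id, None)
```

For very hot posts, invalidating on every single like may cause more reloads than it is worth; batching or a short TTL is a common alternative.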
14) Engagement Event Pipeline / Pipeline tương tác
EN
Likes and comments update the ML score:
- publish engagement events
- update the ranking score
- re-sort the affected timelines
VI
Like/comment gửi event:
- update ML score
- reorder timeline
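The event flow can be sketched as a handler that bumps a counter, re-scores the post, and re-sorts one timeline. The `predicted_boost` function is a placeholder for the ML score with made-up coefficients, and the event shape (`post_id`, `type`) is an assumption.

```python
from collections import Counter


def predicted_boost(counts):
    """Placeholder for the ML score: a fixed bump per like/comment (assumption)."""
    return 0.01 * counts["like"] + 0.03 * counts["comment"]


def handle_event(event, timeline, counters, base_scores):
    """Apply one engagement event: bump the counter, re-score, re-sort the timeline.

    timeline is a list of (post_id, score, ts) and is updated in place.
    """
    post_id = event["post_id"]
    counts = counters.setdefault(post_id, Counter())
    counts[event["type"]] += 1
    new_score = base_scores[post_id] + predicted_boost(counts)
    timeline[:] = sorted(
        ((pid, new_score if pid == post_id else score, ts) for pid, score, ts in timeline),
        key=lambda e: (e[1], e[2]),
        reverse=True,
    )
    return new_score
```

Re-sorting every affected timeline on each event would be expensive at scale, so a real pipeline would likely batch events or re-rank lazily at read time.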
15) Observability / Giám sát
EN
Monitor:
- feed latency
- fan-out backlog
- ranking model errors
- stuck follower jobs
- engagement rate trend
VI
Theo dõi:
- latency feed
- backlog fan-out
- lỗi ranking
- job stuck
- engagement trend
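A few of these signals can be turned into simple threshold checks, sketched below. The 100 ms latency limit comes from the non-functional requirements; the backlog limit, the stuck-job age, and the alert names are assumptions.

```python
def percentile(samples, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100, samples is non-empty."""
    ordered = sorted(samples)
    rank = (pct * len(ordered) + 99) // 100  # integer ceil(pct * n / 100)
    return ordered[rank - 1]


def check_alerts(fetch_latencies_ms, fanout_backlog, oldest_job_age_s,
                 latency_slo_ms=100, backlog_limit=10_000, stuck_after_s=300):
    """Return the names of the alerts that should fire for this sample window."""
    alerts = []
    if percentile(fetch_latencies_ms, 99) >= latency_slo_ms:
        alerts.append("feed_latency_p99")
    if fanout_backlog > backlog_limit:
        alerts.append("fanout_backlog")
    if oldest_job_age_s > stuck_after_s:
        alerts.append("stuck_fanout_job")
    return alerts
```

Checking a high percentile, not the mean, keeps a handful of very slow fetches from being hidden by many fast ones.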
16) Future Enhancements / Mở rộng
EN
- ML-personalized ranking per user
- exploring ML embeddings
- feed diversification
- cross-device feed sync
VI
- ranking phụ thuộc user
- ML embedding
- diversified feed
- sync đa thiết bị
[…] Designing A Social News Feed System (Push vs Pull Architecture) […]