1) Problem Clarification
We need a large-scale system that delivers real-time events to users:
- chat messages
- alerts
- feed updates
- payment status
- order status
- live-stream reactions
2) Functional & Non-Functional Requirements
Functional
✔ deliver notifications to online users in real time
✔ fan-out to every device a user has connected
✔ reliable delivery with retry
✔ offline push support (FCM/APNs)
Non-functional
✔ horizontal scalability to millions of concurrent connections
✔ low latency (target: delivery in < 100 ms)
✔ fault tolerance
✔ monitoring (metrics) + rate limiting
3) Protocol Comparison
| Feature | WebSocket | SSE | Push (FCM/APNs) |
|---|---|---|---|
| Direction | bidirectional | server → client | server → device |
| Connection | persistent | persistent (plain HTTP) | none held by the app; the OS keeps the link to Google/Apple |
| Scaling cost | heavier | lighter | handled by Google/Apple |
| Use cases | chat, collaboration | feed, events | mobile, offline users |
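To make the SSE column concrete, here is a minimal sketch of how one event is framed on a `text/event-stream` response. The helper name `format_sse` is hypothetical; the `id:`, `event:` and `data:` fields and the blank-line terminator come from the SSE wire format.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None, event_id: Optional[str] = None) -> str:
    """Serialize one Server-Sent Events frame."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # Every line of a multi-line payload needs its own "data:" field.
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    # A blank line marks the end of the frame.
    return "\n".join(lines) + "\n\n"
```

The `id` field is what a reconnecting browser echoes back in the `Last-Event-ID` header, which gives the server a place to resume from.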
4) High-Level Architecture
Online flow:
Producer → Kafka → Notification Service → Channel Manager → WebSocket/SSE Gateway → Clients
Offline flow:
users without a live connection → Push Service (FCM/APNs)
5) Connection Management
A connection gateway is responsible for:
- holding millions of WebSocket/SSE connections
- heartbeats to detect dead connections
- client reconnection
- the user → connection mapping
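The user-to-connection mapping and heartbeat tracking can be sketched as a small in-memory registry for a single gateway node. The class name `ConnectionRegistry`, its methods, and the 30-second timeout are all illustrative assumptions, not part of the design above.

```python
import time
from typing import Dict, List, Optional, Set


class ConnectionRegistry:
    """In-memory user -> connection map for one gateway node (sketch)."""

    def __init__(self, heartbeat_timeout: float = 30.0) -> None:
        self.heartbeat_timeout = heartbeat_timeout  # seconds; assumed value
        self._conns_by_user: Dict[str, Set[str]] = {}
        self._user_by_conn: Dict[str, str] = {}
        self._last_seen: Dict[str, float] = {}

    def register(self, user_id: str, conn_id: str, now: Optional[float] = None) -> None:
        self._conns_by_user.setdefault(user_id, set()).add(conn_id)
        self._user_by_conn[conn_id] = user_id
        self._last_seen[conn_id] = time.monotonic() if now is None else now

    def heartbeat(self, conn_id: str, now: Optional[float] = None) -> None:
        # Ignore heartbeats from connections we no longer track.
        if conn_id in self._user_by_conn:
            self._last_seen[conn_id] = time.monotonic() if now is None else now

    def unregister(self, conn_id: str) -> None:
        user_id = self._user_by_conn.pop(conn_id, None)
        self._last_seen.pop(conn_id, None)
        if user_id is not None:
            conns = self._conns_by_user.get(user_id, set())
            conns.discard(conn_id)
            if not conns:
                self._conns_by_user.pop(user_id, None)

    def connections_for(self, user_id: str) -> List[str]:
        return sorted(self._conns_by_user.get(user_id, set()))

    def evict_stale(self, now: Optional[float] = None) -> List[str]:
        """Drop connections whose last heartbeat is older than the timeout."""
        now = time.monotonic() if now is None else now
        stale = sorted(
            c for c, seen in self._last_seen.items()
            if now - seen > self.heartbeat_timeout
        )
        for conn_id in stale:
            self.unregister(conn_id)
        return stale
```

A real gateway would likely share this mapping across nodes (for example in Redis) so that other services can find which gateway holds a user.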
6) Routing Notifications
Cases:
- user → single device
- user → multiple devices
- group notifications
- topic-based notifications
Building blocks:
- Redis pub/sub
- Kafka fan-out topics
- connection registry
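The four routing cases reduce to one step: resolve the target to a set of users, then to the connections those users hold. The sketch below does this in memory; the `Router` class and its three dictionaries are hypothetical stand-ins for the connection registry and for whatever store holds group membership and topic subscriptions.

```python
from typing import Dict, Set


class Router:
    """Resolve a notification target to the connection IDs that should receive it."""

    def __init__(self) -> None:
        self.user_conns: Dict[str, Set[str]] = {}     # user -> connection IDs
        self.group_members: Dict[str, Set[str]] = {}  # group -> users
        self.topic_subs: Dict[str, Set[str]] = {}     # topic -> subscribed users

    def resolve(self, kind: str, target: str) -> Set[str]:
        if kind == "user":
            users = {target}
        elif kind == "group":
            users = self.group_members.get(target, set())
        elif kind == "topic":
            users = self.topic_subs.get(target, set())
        else:
            raise ValueError(f"unknown target kind: {kind}")
        conns: Set[str] = set()
        for user in users:
            # A user with several devices contributes several connections.
            conns |= self.user_conns.get(user, set())
        return conns
```

With several gateway nodes, the resolved connections would then be grouped by gateway and published on a per-gateway channel, which is where Redis pub/sub or Kafka fan-out topics come in.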
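7) Delivery Semantics
Guarantees:
- WebSocket/SSE: best-effort delivery; a message sent while the connection is down is lost unless it is replayed
- Push: aim for at-least-once by retrying failed sends; FCM and APNs also hold messages for unreachable devices, but neither guarantees delivery, and retries can produce duplicates
- Ordering: per user, by partitioning on user ID so that one user's events stay in a single Kafka partition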
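Two pieces of this section lend themselves to a short sketch: keying by user so ordering holds within a partition, and deduplicating by notification ID so at-least-once retries are harmless. Both names below are hypothetical, and CRC32 is only a stand-in, since Kafka clients apply their own hash to the message key.

```python
import zlib
from collections import OrderedDict


def partition_for(user_id: str, num_partitions: int) -> int:
    """Map a user to a stable partition so that user's events stay ordered."""
    return zlib.crc32(user_id.encode("utf-8")) % num_partitions


class Deduplicator:
    """Remember the most recent notification IDs and reject repeats."""

    def __init__(self, capacity: int = 1000) -> None:
        self.capacity = capacity
        self._seen: "OrderedDict[str, None]" = OrderedDict()

    def first_time(self, notification_id: str) -> bool:
        if notification_id in self._seen:
            return False
        self._seen[notification_id] = None
        if len(self._seen) > self.capacity:
            # Evict the oldest ID to keep memory bounded.
            self._seen.popitem(last=False)
        return True
```

The bounded window is a trade-off: a duplicate that arrives after its ID has been evicted will be shown again, so the capacity should cover the longest retry window expected.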
8) Store-and-Forward for Offline Users
If the user is offline:
- store the notification in the DB
- deliver it the next time they connect
- fall back to push on mobile
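The store-and-forward path can be sketched as a per-user inbox plus one delivery decision. Everything here is illustrative: `OfflineInbox` keeps notifications in memory where the design calls for a DB, and `deliver` takes the online check and the two send functions as parameters rather than assuming any particular gateway or push client.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


class OfflineInbox:
    """Bounded per-user queue of undelivered notifications (in-memory stand-in for the DB)."""

    def __init__(self, max_per_user: int = 100) -> None:
        self.max_per_user = max_per_user
        self._queues: Dict[str, Deque[dict]] = {}

    def store(self, user_id: str, notification: dict) -> None:
        queue = self._queues.setdefault(user_id, deque(maxlen=self.max_per_user))
        queue.append(notification)  # the oldest entry is dropped once the queue is full

    def drain(self, user_id: str) -> List[dict]:
        """Return and clear everything queued for the user, oldest first."""
        return list(self._queues.pop(user_id, ()))


def deliver(
    user_id: str,
    notification: dict,
    is_online: Callable[[str], bool],
    send_live: Callable[[str, dict], None],
    send_push: Callable[[str, dict], None],
    inbox: OfflineInbox,
) -> str:
    if is_online(user_id):
        send_live(user_id, notification)
        return "live"
    # Offline: persist first, then try push as a fallback.
    inbox.store(user_id, notification)
    send_push(user_id, notification)
    return "stored"
```

On reconnect, the gateway would call `drain` and send the backlog before any new live traffic, so the user sees events in the order they were produced.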
9) Rate Limiting
Prevent spam and overload:
- per-user rate limit
- per-connection throughput limit
- burst control for viral posts
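One common way to get both a steady rate and a burst cap is a token bucket, sketched below. The section does not name an algorithm, so this is one possible choice rather than the design's own; the `TokenBucket` class and its parameters are hypothetical. Time is passed in explicitly to keep the example deterministic.

```python
class TokenBucket:
    """Allow `rate` events per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so an initial burst is allowed
        self.updated = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill according to the time elapsed since the last call.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

One bucket per user covers the per-user limit, and one per connection covers throughput. For a viral post, a bucket keyed by the post or topic caps the fan-out burst.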
10) Scaling WebSocket Gateways
Strategies:
- consistent hashing of user → gateway
- sticky connections
- L4 load balancer
- gateway cluster with shard awareness
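Consistent hashing of user → gateway can be sketched as a hash ring with virtual nodes. The `HashRing` class, the gateway names, and the choice of SHA-256 with 100 virtual nodes per gateway are assumptions made for illustration.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


class HashRing:
    """Consistent-hash ring that maps user IDs to gateway nodes."""

    def __init__(self, nodes: Iterable[str] = (), vnodes: int = 100) -> None:
        self.vnodes = vnodes
        self._ring: List[Tuple[int, str]] = []  # sorted (hash, node) pairs
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key: str) -> int:
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node: str) -> None:
        # Virtual nodes spread each gateway around the ring for better balance.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def lookup(self, user_id: str) -> str:
        if not self._ring:
            raise LookupError("no gateways in the ring")
        # First ring position at or after the user's hash, wrapping around.
        idx = bisect.bisect_left(self._ring, (self._hash(user_id),))
        return self._ring[idx % len(self._ring)][1]
```

The point of the ring is that removing a gateway only remaps the users that gateway was serving; everyone else keeps their assignment, which limits the reconnect storm when a node leaves.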
11) Failure Handling
- gateway node crash → clients reconnect
- Kafka outage → backpressure + retry
- notification loss → replay from the offline store on reconnect; dedupe logic drops the duplicates that retries and replays create
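When a gateway node crashes, all of its clients try to reconnect at about the same time. A common mitigation is exponential backoff with jitter on the client, sketched here. The function name, the 0.5 s base, and the 30 s cap are illustrative assumptions.

```python
import random
from typing import Callable


def backoff_delay(
    attempt: int,
    base: float = 0.5,
    cap: float = 30.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Full-jitter backoff: a random delay in [0, min(cap, base * 2**attempt)]."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling
```

Randomizing across the whole interval, rather than adding a small jitter to a fixed delay, spreads reconnects more evenly across the surviving gateways.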
12) Observability
Metrics:
- connection count
- reconnect rate
- message delivery latency
- push success/failure
- fan-out queue lag
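As a rough illustration of two of these metrics, the sketch below records delivery latencies and push outcomes in memory and reports a nearest-rank percentile and a failure rate. The `DeliveryMetrics` class is hypothetical; a production system would export these to a metrics backend rather than keep raw samples.

```python
import math
from typing import List


class DeliveryMetrics:
    """Collect delivery latency samples and push outcomes (in-memory sketch)."""

    def __init__(self) -> None:
        self.latencies_ms: List[float] = []
        self.push_ok = 0
        self.push_failed = 0

    def record_delivery(self, latency_ms: float) -> None:
        self.latencies_ms.append(latency_ms)

    def record_push(self, ok: bool) -> None:
        if ok:
            self.push_ok += 1
        else:
            self.push_failed += 1

    def latency_percentile(self, p: int) -> float:
        """Nearest-rank percentile, e.g. p=99 for p99 latency."""
        if not self.latencies_ms:
            raise ValueError("no latency samples recorded")
        ordered = sorted(self.latencies_ms)
        rank = max(1, math.ceil(p * len(ordered) / 100))
        return ordered[rank - 1]

    def push_failure_rate(self) -> float:
        total = self.push_ok + self.push_failed
        return self.push_failed / total if total else 0.0
```

Tracking a high percentile such as p99 alongside the median matters here because a latency target like the one in the requirements is usually missed in the tail first.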
13) Push Notification Integration
When the client is offline:
- send via Firebase Cloud Messaging (Android, web)
- send via Apple Push Notification service (iOS)
Keep the payload lightweight: both services cap it at roughly 4 KB.
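A minimal sketch of what the two payloads might look like, plus a size check before sending. The helper names are hypothetical. The JSON shapes follow the FCM HTTP v1 message body and the APNs `aps` dictionary as commonly documented, and the 4096-byte limit is an assumption to verify against current provider documentation. No network call is made here.

```python
import json
from typing import Any, Dict, Optional

MAX_PAYLOAD_BYTES = 4096  # assumed limit; check the FCM and APNs docs


def build_fcm_message(token: str, title: str, body: str,
                      data: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    return {
        "message": {
            "token": token,
            "notification": {"title": title, "body": body},
            # FCM data values are strings, so stringify everything.
            "data": {k: str(v) for k, v in (data or {}).items()},
        }
    }


def build_apns_payload(title: str, body: str,
                       data: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    payload: Dict[str, Any] = {"aps": {"alert": {"title": title, "body": body}}}
    payload.update(data or {})  # custom keys sit next to "aps"
    return payload


def fits_limit(payload: Dict[str, Any], limit: int = MAX_PAYLOAD_BYTES) -> bool:
    encoded = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return len(encoded) <= limit
```

A common pattern is to send only an ID and a short preview in the push and let the app fetch the full content when it opens, which keeps the payload well under the limit.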
14) Choosing Between WebSocket, SSE, Push
| Use Case | Recommended |
|---|---|
| Chat | WebSocket |
| Price ticker | SSE |
| Feed update | SSE |
| Mobile background | Push |
| Stock trading | WebSocket |
15) Future Enhancements
- edge WebSocket routing (e.g. Cloudflare Workers)
- hybrid push + WebSocket optimization
- prioritized notifications
- ML-based notification ranking