Load Balancer & Reverse Proxy (L4/L7, Consistent Hashing, Health Checks)

1) Problem Clarification / Làm rõ bài toán

EN

A load balancer distributes client traffic across multiple backend servers.

Why we need it

Avoid overload → prevents a single server from being overwhelmed.
Improve availability → if one server fails, others continue serving.
Reduce latency → route requests to the fastest or nearest server.
Handle failures gracefully → automatic failover and retry.
Support horizontal scaling → simply add more servers behind the LB.

VI

Load balancer phân phối request giữa nhiều server nhằm:

Tại sao cần

Tránh quá tải → không dồn hết request vào 1 server.
Tăng availability → 1 server chết vẫn không ảnh hưởng hệ thống.
Giảm latency → gửi request đến server nhanh nhất/gần nhất.
Xử lý lỗi tự động → failover, retry.
Scale linh hoạt → thêm bớt server dễ dàng.

2) L4 vs L7 Load Balancing

Layer 4 – Transport Layer (TCP/UDP LB)

EN

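Routes traffic using only IP addresses and ports (the TCP/UDP connection tuple).
Does NOT inspect the application payload (e.g. HTTP headers or body).
Lower per-connection overhead than L7, because no application-layer parsing is needed.
Works with any TCP/UDP protocol (HTTP, gRPC, Redis, MySQL).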

Use cases

High-throughput network traffic
Database load balancing
Games, streaming, custom protocols

VI

Định tuyến bằng IP + Port.
Không phân tích nội dung HTTP.
Nhanh nhất, nhẹ nhất.
Dùng cho mọi giao thức TCP/UDP.

Layer 7 – Application Layer (HTTP/HTTPS LB)

EN

Understands HTTP semantics: headers, path, host, method, cookies.
Can route based on:
- Path: /api/users/*
- Hostname: api.example.com
- Request method: GET vs POST
Can modify request/response headers.
Can enforce authentication and rate limiting.
Commonly used in API gateways and service meshes.

VI

Hiểu nội dung HTTP (header, path, cookie).
Route theo:
- URL
- Host
- Method
Có thể modify header/body.
Dùng trong API Gateway, service mesh.
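
As a rough illustration of the L7 routing criteria above (path, hostname, method), here is a minimal Python sketch of first-match rule evaluation. The `Rule` class, the `route` function, and the pool names are hypothetical and only for illustration; real proxies use their own configuration formats and matching semantics.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One L7 routing rule; a field left as None matches anything."""
    pool: str                      # hypothetical backend pool name
    host: Optional[str] = None     # exact Host header match
    path: Optional[str] = None     # glob pattern, e.g. "/api/users/*"
    method: Optional[str] = None   # HTTP method, e.g. "GET"


def route(rules, host, path, method, default="default-pool"):
    """Return the pool of the first rule whose set fields all match."""
    for rule in rules:
        if rule.host is not None and rule.host != host:
            continue
        if rule.path is not None and not fnmatchcase(path, rule.path):
            continue
        if rule.method is not None and rule.method != method.upper():
            continue
        return rule.pool
    return default


RULES = [
    Rule(pool="users-read", host="api.example.com", path="/api/users/*", method="GET"),
    Rule(pool="users-write", host="api.example.com", path="/api/users/*"),
    Rule(pool="api-misc", host="api.example.com"),
]

print(route(RULES, "api.example.com", "/api/users/42", "GET"))   # users-read
print(route(RULES, "api.example.com", "/api/users/42", "POST"))  # users-write
print(route(RULES, "www.example.com", "/", "GET"))               # default-pool
```

Rules are checked in order, so more specific rules should come first. An L4 balancer cannot make any of these decisions because it never parses the HTTP request.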

3) Reverse Proxy Overview

EN

A reverse proxy sits between the client and backend servers:

Client → Reverse Proxy → Backend Pool

Functions

Load balancing
Request routing
Caching
Compression
TLS termination
Web Application Firewall (WAF)
Hiding internal server topology

VI

Reverse proxy nằm giữa client và backend:

Client → Reverse Proxy → Backend

Chức năng

Load balancer
Routing
Cache
Nén
TLS termination
WAF
Ẩn topology nội bộ

4) Load Balancing Algorithms

EN

A) Round Robin

Rotates through the servers in order, one request each.
Easiest to implement; works well when all servers have equal capacity.

B) Weighted Round Robin

More powerful servers (more CPU/RAM) receive proportionally more requests.
Good for mixed-capacity clusters.

C) Least Connections

Sends each request to the server with the fewest active connections.
Good when requests vary in duration.

D) Least Response Time

Measures latency and pending requests, then picks the fastest backend.
Good for real-time/low-latency systems.

E) Hash-based (IP Hash / URL Hash)

The same key (client IP or URL) always maps to the same backend.
Gives stable routing for caching/CDN and for sticky sessions.
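
To make two of the algorithms above concrete, here is a minimal Python sketch of weighted round robin and least connections. The weighted picker uses the "smooth" variant, which interleaves picks instead of sending bursts to the heaviest server; this is one possible implementation, not the only one. Server names, weights, and connection counts are made up for illustration.

```python
class SmoothWeightedRoundRobin:
    """Weighted round robin that spreads picks evenly over time."""

    def __init__(self, weights):
        self.weights = dict(weights)                 # server -> static weight
        self.current = {name: 0 for name in self.weights}
        self.total = sum(self.weights.values())

    def pick(self):
        # Every server gains its weight; the leader is chosen and then
        # pays back the total, so it must "earn" its next turn.
        for name, weight in self.weights.items():
            self.current[name] += weight
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= self.total
        return chosen


def least_connections(active):
    """Return the server with the fewest active connections."""
    return min(active, key=active.get)


wrr = SmoothWeightedRoundRobin({"a": 5, "b": 1, "c": 1})
print([wrr.pick() for _ in range(7)])   # ['a', 'a', 'b', 'a', 'c', 'a', 'a']
print(least_connections({"a": 12, "b": 3, "c": 7}))  # b
```

Over one full cycle of 7 picks, server "a" receives 5 requests and "b" and "c" receive 1 each, matching the 5:1:1 weights.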

5) Consistent Hashing (Key Concept)

EN

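Used in distributed systems to minimize key movement when nodes join or leave: on average only about K/N keys move (K keys, N nodes). Plain modulo hashing (userId % N) is not consistent hashing, because changing N remaps most keys.

Applications

Session stickiness
Distributed caching (e.g. Memcached client-side hashing; Redis Cluster uses a related fixed hash-slot scheme)
Microservice sharding by key (e.g. userId)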

VI

Dùng khi cần giữ ổn định mapping key → server khi thay đổi số server.

Ưu điểm

Thêm/bớt server → chỉ một phần nhỏ key phải remap
Giảm load khi scale
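
The sketch below shows one common way to build a consistent hash ring in Python, using virtual nodes to even out the distribution. The `HashRing` class, the choice of MD5, and the default of 100 virtual nodes per server are assumptions for illustration, not a reference implementation.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes      # virtual nodes per server (assumed default)
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> server name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get(self, key):
        """Return the first server clockwise from the key's position."""
        if not self._points:
            raise LookupError("ring is empty")
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[i]]


keys = [f"user-{i}" for i in range(10_000)]
ring = HashRing(["s1", "s2", "s3", "s4"])
before = {k: ring.get(k) for k in keys}

ring.add("s5")
moved = [k for k in keys if ring.get(k) != before[k]]
print(f"consistent hashing moved {len(moved) / len(keys):.0%} of keys")

# Comparison: plain modulo hashing when going from 4 to 5 servers.
mod_moved = sum(1 for k in keys if _hash(k) % 4 != _hash(k) % 5)
print(f"modulo hashing moved {mod_moved / len(keys):.0%} of keys")
```

Going from 4 to 5 servers, the ring should move roughly one fifth of the keys, all of them onto the new server, while modulo hashing should move roughly four fifths.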

6) Health Checks & Failover

EN

LB continuously checks server health:

Types

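TCP check → can a connection be opened to the port?
HTTP check → expect a 2xx status (typically 200 OK) from a health endpoint
gRPC health check → standard grpc.health.v1.Health service

Failover Behavior

Unhealthy server is removed from the pool
Traffic is automatically redistributed to the remaining servers
Server is added back automatically once it passes checks again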

VI

LB liên tục kiểm tra server:

Loại check

TCP
HTTP 200
gRPC health

Failover

Server lỗi bị loại khỏi pool
Traffic tự dồn sang server khác
Tự add lại khi server khoẻ
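
The following Python sketch illustrates a TCP check and the remove/re-add behavior described above. It assumes "rise" and "fall" thresholds (a server changes state only after several consecutive contradicting results) to avoid flapping; the thresholds, class name, and defaults are assumptions, since the text does not specify them.

```python
import socket


def tcp_check(host, port, timeout=1.0):
    """TCP check: healthy if a connection to the port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class BackendHealth:
    """Tracks one backend; flips state only after a streak of results."""

    def __init__(self, rise=2, fall=3):
        self.rise = rise        # consecutive successes needed to re-add
        self.fall = fall        # consecutive failures needed to remove
        self.healthy = True
        self._streak = 0        # results contradicting the current state

    def record(self, ok):
        if ok == self.healthy:
            self._streak = 0
        else:
            self._streak += 1
            needed = self.fall if self.healthy else self.rise
            if self._streak >= needed:
                self.healthy = ok
                self._streak = 0
        return self.healthy


backend = BackendHealth(rise=2, fall=3)
results = [False, False, False, True, True]
print([backend.record(ok) for ok in results])
# [True, True, False, False, True]
```

In a real balancer, a scheduler would call something like `tcp_check` at a fixed interval and feed each result into `record`; only backends with `healthy == True` stay in the pool.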

7) Connection Management

EN

LB handles complex connection logic:

Keep-alive
Connection reuse
Concurrent connection limit
HTTP/2 multiplexing
Idle timeout
Slow client protection

VI

LB quản lý:

Keep-alive
Reuse connection
Giới hạn connection
HTTP/2 multiplexing
Timeout
Chống slow client

8) TLS Termination

EN

A load balancer can handle SSL/TLS to reduce backend load.

Modes

TLS Termination
- LB decrypts HTTPS → backend receives HTTP.
TLS Passthrough
- LB does not decrypt, backend handles TLS.
TLS Re-encryption
- LB decrypts → inspects → re-encrypts before sending to the backend.

VI

LB có thể xử lý SSL theo 3 kiểu:

Termination → backend nhận HTTP
Passthrough → backend tự giải mã
Re-encryption → giải mã + đọc + mã hoá lại

9) Global Load Balancing (GSLB)

EN

Used for multi-region systems.

Techniques

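GeoDNS → DNS answers depend on the client's (or its resolver's) approximate location
Anycast → same IP announced from multiple locations; the network routes each client to a nearby one
Latency-based routing → send the client to the region with the lowest measured latency
Region failover → switch to a backup region when the primary is unhealthy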

VI

Dùng cho hệ thống đa vùng (multi-region).

Kỹ thuật

GeoDNS
Anycast
Routing theo latency
Failover vùng khác
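
Latency-based routing combined with region failover can be sketched in a few lines of Python: choose the lowest-latency region among those currently healthy. The region names and latency figures are hypothetical, and a real GSLB would obtain them from probes or DNS measurements.

```python
def pick_region(latency_ms, healthy):
    """Return the healthy region with the lowest measured latency."""
    candidates = {r: ms for r, ms in latency_ms.items() if r in healthy}
    if not candidates:
        raise RuntimeError("no healthy region available")
    return min(candidates, key=candidates.get)


# Hypothetical latency measurements from one client's vantage point.
latency = {"us-east": 20, "eu-west": 95, "ap-south": 180}

print(pick_region(latency, {"us-east", "eu-west", "ap-south"}))  # us-east
print(pick_region(latency, {"eu-west", "ap-south"}))             # eu-west
```

The second call shows failover: once "us-east" drops out of the healthy set, traffic shifts to the next-best region.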

10) API Gateway vs Load Balancer

EN

API Gateway provides:

Authentication
Rate limiting
Request/response transformation
Logging
Routing rules
Caching
Token validation

Load Balancer provides:

Traffic distribution
Health checks
Connection pooling
Failover
TLS (SSL) termination

VI

API Gateway có:

Auth
Rate limit
Log
Routing thông minh
Biến đổi request

Load Balancer có:

Phân phối request
Health check
Connection pool
Failover

11) Observability (Giám sát LB)

EN

Key metrics:

Request rate (RPS/QPS)
p95/p99 latency
Backend error rate
Connection count
Health check failures
Queue time

VI

Các chỉ số cần theo dõi:

RPS/QPS
Độ trễ p95/p99
Lỗi backend
Số connection
Health check fail
Queue chờ
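
As a small worked example of the p95/p99 latency metrics above, this Python sketch computes percentiles with the nearest-rank method. Monitoring systems often use histograms or interpolation instead, so treat this as one simple definition under that assumption; the sample data is synthetic.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)   # 1-based rank
    return ordered[rank - 1]


latencies_ms = list(range(1, 101))    # synthetic samples: 1..100 ms
print(percentile(latencies_ms, 95))   # 95
print(percentile(latencies_ms, 99))   # 99
```

Tail percentiles matter more than the average for a load balancer, because a single slow backend may barely change the mean while clearly raising p99.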

12) Failure Scenarios

EN

Typical issues:

LB node crashes → use HA pair, VRRP/Keepalived
Backend overload → LB redistributes
DNS misconfiguration → clients can’t resolve LB
Slow backend → queue builds up
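Sticky session imbalance → a few backends receive most of the load
TLS certificate expiry → client handshakes fail; automate renewal and alert before expiry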

VI

Các lỗi thường gặp:

LB chết → dùng cặp dự phòng (HA)
Backend quá tải → LB route sang server khác
DNS lỗi → client không kết nối được
Backend chậm → queue dài
Sticky session bị lệch
SSL hết hạn

13) When to Use L4 or L7?

EN

Use L4 when:

Ultra-high throughput or millions of connections
Load balancing TCP/UDP
No need for HTTP inspection
Lowest latency

Use L7 when:

Need routing based on URL/host
API gateway features
Authentication, rate limiting
gRPC, GraphQL, microservices routing

VI

Dùng L4 khi:

Throughput rất lớn
Chỉ cần TCP/UDP
Không cần đọc nội dung HTTP

Dùng L7 khi:

Microservice
Routing thông minh (URL/host)
API Gateway
Auth, rate limit

14) Technologies

L4 Load Balancers

HAProxy (TCP mode)
NGINX stream module
Envoy L4 mode
AWS NLB (Network Load Balancer)
Google Cloud TCP LB

L7 Load Balancers

NGINX
Envoy Proxy
HAProxy (HTTP mode)
Traefik
Kong / Istio / Ambassador
AWS ALB (Application Load Balancer)
Cloudflare Reverse Proxy