
Sharding vs Partitioning

1) Problem Clarification / Làm rõ bài toán

EN

We need to understand how data is divided to scale systems. People often confuse sharding and partitioning, but they aren’t the same.

VI

Cần hiểu cách chia dữ liệu để scale hệ thống.
Nhiều người nhầm sharding và partitioning, nhưng không giống nhau.

2) Definition / Định nghĩa

EN

✔ Partitioning = dividing data inside a single node or logical instance
✔ Sharding = distributing data across multiple nodes / servers

VI

✔ Partitioning = chia data bên trong 1 node / instance
✔ Sharding = chia data sang nhiều node / nhiều server

3) Why Needed? / Tại sao cần?

EN

  • reduce contention
  • improve performance
  • scale storage capacity
  • reduce index scanning (each partition has a smaller index)

VI

  • giảm tranh chấp
  • tăng performance
  • mở rộng dung lượng
  • giảm index scan

4) Partitioning Types / Loại partitioning

EN

Range partitioning — based on value intervals
Hash partitioning — distribute evenly via hash
List partitioning — categorical grouping

All three work inside one DB instance.

VI

Partitioning bên trong DB:

  • range
  • hash
  • list (theo category)
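
The three partitioning types can be sketched in Python as plain functions that pick a partition for a row. This is only an illustration of the selection logic, not how any particular database implements it; the partition names, bounds, country codes, and the use of crc32 as the hash are all assumptions made for the example.

```python
import bisect
import zlib
from datetime import date

# Hypothetical partition layout, for illustration only.
RANGE_BOUNDS = [date(2024, 1, 1), date(2024, 2, 1), date(2024, 3, 1)]  # lower bounds
RANGE_NAMES = ["orders_2024_01", "orders_2024_02", "orders_2024_03"]
LIST_MAP = {"VN": "orders_asia", "JP": "orders_asia", "DE": "orders_europe"}
HASH_PARTITIONS = 4


def range_partition(day: date) -> str:
    """Range: pick the partition whose interval contains `day`.

    The last partition is treated as open-ended in this sketch.
    """
    idx = bisect.bisect_right(RANGE_BOUNDS, day) - 1
    if idx < 0:
        raise ValueError(f"{day} is below the lowest partition bound")
    return RANGE_NAMES[idx]


def hash_partition(key: str) -> int:
    """Hash: spread keys using a stable hash (crc32), not Python's salted hash()."""
    return zlib.crc32(key.encode("utf-8")) % HASH_PARTITIONS


def list_partition(country: str) -> str:
    """List: map a categorical value to its partition, with a catch-all default."""
    return LIST_MAP.get(country, "orders_default")
```

For example, `range_partition(date(2024, 2, 15))` selects `orders_2024_02`, and `list_partition("DE")` selects `orders_europe`.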

5) Sharding Types / Loại sharding

EN

Vertical sharding: split features/tables
Horizontal sharding: split rows across nodes

Horizontal sharding = scaled-out database.

VI

Vertical sharding: tách bảng theo domain
Horizontal sharding: chia hàng sang nhiều node

Horizontal sharding = database scale-out.

6) Routing Layer / Lớp định tuyến

EN

Sharding requires routing logic:

  • client-side routing
  • proxy router
  • central shard-map lookup

VI

Sharding cần routing:

  • client routing
  • proxy router
  • lookup shard map

Partitioning does not need a routing layer: the DB engine itself picks the partition.
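
A minimal sketch of the shard-map lookup idea, assuming keys are first hashed into a fixed number of logical buckets and the map assigns each bucket to a physical shard. The `Shard` and `ShardRouter` names, the DSN strings, the bucket count, and the round-robin assignment are hypothetical; a real shard map would typically live in an external catalog.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    name: str
    dsn: str  # hypothetical connection string


class ShardRouter:
    """Shard-map lookup: key -> logical bucket -> physical shard."""

    def __init__(self, shards: list[Shard], buckets: int = 64) -> None:
        if not shards:
            raise ValueError("at least one shard is required")
        self.buckets = buckets
        # Assumption: buckets are assigned round-robin at start-up.
        # Moving a bucket later only means editing this map.
        self.shard_map = {b: shards[b % len(shards)] for b in range(buckets)}

    def bucket_for(self, key: str) -> int:
        # Stable hash so every client computes the same bucket.
        return zlib.crc32(key.encode("utf-8")) % self.buckets

    def shard_for(self, key: str) -> Shard:
        return self.shard_map[self.bucket_for(key)]


shards = [Shard(f"shard-{i}", f"mysql://db{i}.example.internal/app") for i in range(4)]
router = ShardRouter(shards)
target = router.shard_for("user:42")  # the shard that owns this key
```

The same class could sit in the client (client-side routing) or behind a proxy; only where the lookup runs changes.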

7) Example / Ví dụ minh họa

EN

Partitioning usage:
A PostgreSQL table partitioned by month for reporting.

Sharding usage:
User IDs distributed across 32 MySQL clusters.

VI

Partitioning:
1 table PostgreSQL partition theo tháng.

Sharding:
User ID chia sang 32 cụm MySQL khác nhau.

8) Failover & Consistency / Lỗi & nhất quán

EN

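Partitioning failure: the problem stays local to one partition, but all partitions still share the same DB instance, so an instance outage affects all of them.

Sharding failure: a node is lost → the subset of users/data on that shard becomes unavailable.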

VI

Partitioning lỗi = lỗi nội bộ DB.
Sharding lỗi = 1 node mất → 1 phần dữ liệu không truy cập được.

9) Migration & Rebalancing / Migration & cân bằng lại

EN

Partitioning migration: merging/splitting partitions within 1 DB.

Sharding migration: resharding the user base across nodes → expensive and operationally complex.

VI

Partitioning migration = gộp/chia trong một database.
Sharding migration = chia lại dữ liệu giữa nhiều DB → phức tạp.
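
One way to see why resharding is expensive: the sketch below assumes integer user IDs placed with plain `id % shard_count` (an assumption for illustration, not something the scheme above prescribes) and counts how many keys change shards when the shard count changes.

```python
def moved_fraction(num_keys: int, old_shards: int, new_shards: int) -> float:
    """Fraction of integer keys whose shard changes under modulo placement."""
    moved = sum(1 for k in range(num_keys) if k % old_shards != k % new_shards)
    return moved / num_keys


# Adding a single shard to 32 relocates the vast majority of keys.
add_one = moved_fraction(100_000, 32, 33)

# Doubling the shard count relocates roughly half of them.
double = moved_fraction(100_000, 32, 64)
```

Under this placement, going from 32 to 33 shards moves well over 90% of keys, while doubling to 64 moves about half. That is one reason schemes with an indirection layer (logical buckets, a shard map) are often preferred: they limit how much data has to move.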

10) Access Patterns / Pattern truy cập

EN

Partitioning improves index locality.
Sharding brings constraints:

  • cross-shard joins must be avoided
  • fan-out queries when a query cannot be limited to one shard
  • a query coordinator to merge per-shard results

VI

Partitioning tăng locality trong DB.
Sharding bắt buộc:

  • tránh join cross shard
  • fan-out query
  • coordinator query
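
A minimal scatter-gather sketch of a fan-out query with a coordinator. The shards are hypothetical in-memory lists standing in for separate databases, and the function names are made up for this example; in a real system `count_on_shard` would be a query sent to one shard.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory shards: each holds (user_id, country) rows.
SHARDS = [
    [(1, "VN"), (4, "DE")],
    [(2, "VN"), (5, "JP")],
    [(3, "DE"), (6, "VN")],
]


def count_on_shard(rows: list[tuple[int, str]], country: str) -> int:
    """Per-shard part of the query; runs against a single shard."""
    return sum(1 for _, c in rows if c == country)


def fan_out_count(country: str) -> int:
    """Coordinator: send the query to every shard in parallel, then merge."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = list(pool.map(lambda rows: count_on_shard(rows, country), SHARDS))
    return sum(partials)


total_vn = fan_out_count("VN")  # counts matching rows across all three shards
```

A count merges trivially by summing. Joins across shards do not, which is why they are usually avoided by choosing a shard key that keeps related rows together.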

11) When to use Partitioning? / Khi dùng partitioning?

EN

  • large table performance
  • time-series
  • archived records

VI

Dùng partitioning khi:

  • bảng lớn
  • time-series
  • archive

12) When to use Sharding? / Khi dùng sharding?

EN

  • DB instance runs out of capacity
  • millions of users
  • write-heavy workloads

VI

Dùng sharding khi:

  • 1 DB không chứa nổi nữa
  • user rất lớn
  • workload write-heavy

13) Combined Approach / Kết hợp cả hai

EN

Real systems often use both:

Examples:
Cassandra = sharded + partitioned
ClickHouse = sharded cluster + partitioned storage

VI

Thực tế dùng cả hai:

Ví dụ:
Cassandra = sharding + partitioning
ClickHouse = sharding + partitioning

14) Architecture Lessons / Bài học kiến trúc

EN

  • partition first, shard later
  • avoid premature sharding
  • a shard map (catalog) is needed for routing
  • avoid cross-shard joins

VI

  • partition trước, shard khi cần
  • tránh sharding quá sớm
  • cần catalog routing
  • tránh join cross shard

15) Diagram Summary / Tóm tắt bằng sơ đồ

EN

Partitioning:
DB instance
 ├── table partition A
 ├── table partition B
 └── table partition C

Sharding:
Cluster
 ├── DB shard 1
 ├── DB shard 2
 └── DB shard 3

VI

Partitioning:
1 DB
 ├── partition A
 └── partition B

Sharding:
Cluster
 ├── node 1
 ├── node 2
 └── node 3