At a glance
A ranked table of your ten tables with the most partitions, where a partition is a logical slice of a MergeTree table defined by itsPARTITION BYexpression (commonly one per month, or one per day). Partition count is a design-health signal, not a real-time load signal. A table with hundreds or thousands of partitions almost always has a too-granular partition key (partitioning by hour or by a high-cardinality column instead of by month), which fragments data, inflates part counts, slows merges and queries, and pushes the table toward theToo Many Partscliff. This card is how you catch a partitioning mistake before it becomes an ingest outage.
| Data source | system.parts, aggregated: SELECT table, uniqExact(partition_id) AS partitions FROM system.parts WHERE active GROUP BY table ORDER BY partitions DESC LIMIT 10. The count is distinct partition IDs among active parts. |
| Metric basis | Distinct partitions, not parts. Many parts can belong to one partition; this card counts the partitions, which reflects the partition-key design. Pair with Active Parts (Top 10 Tables) for the parts view. |
| Why partitions matter | Each partition is merged independently; ClickHouse never merges parts across partition boundaries. Too many partitions means more, smaller merge units, more files on disk, slower partition pruning, and a faster climb toward the per-table parts cap. |
| Aggregation window | Real-time. Recomputed from system.parts on each refresh. |
| Alert threshold | > 1000 partitions on any table. A table breaching 1,000 partitions is a near-certain sign of an over-granular PARTITION BY and warrants a redesign before it destabilises ingest. |
| What does NOT count | (1) Inactive parts (the query filters WHERE active, so parts pending cleanup after a merge are excluded); (2) detached partitions (those are not active); (3) the parts-within-a-partition count (that is the Active Parts card); (4) tables outside the top 10. |
| Healthy shape | Monthly partitioning over a few years is tens of partitions. Daily partitioning over a year or two is hundreds and is often acceptable but worth watching. Anything in the thousands signals a key chosen at too fine a grain. |
| Sensitivity note | This is a Sensitivity card: the > 1000 default surfaces an over-partitioned table early, while it is still a tuning problem and not yet an outage. |
| Time window | RT (real-time, recomputed on each refresh) |
| Alert trigger | > 1000 partitions on any single table. |
| Roles | owner, platform, dba |
Calculation
The engine groups active parts by table and counts distinct partitions:partition_id is the internal identifier for the slice a part belongs to, derived from the table’s PARTITION BY expression. Counting uniqExact(partition_id) over active parts gives the number of live partitions per table. The card surfaces partitions as the ranked headline and carries active_parts and size as context columns, because the relationship between partitions and parts is what diagnoses the problem: a table with 1,200 partitions and 1,300 parts has ~1 part per partition (heavy granularity, light data per slice), which is the classic over-partitioning signature.
On a cluster, the card aggregates across nodes so a table’s total partition footprint is visible rather than one node’s slice. The > 1000 alert evaluates per table: any single table in the result set breaching the threshold turns the card red and names that table.
Worked example
A platform team runs ClickHouse for a clickstream pipeline. A newpage_events table was created two weeks ago. Snapshot taken on 17 Apr 26 at 10:05 UTC.
The card shows:
| Rank | Table | Partitions | Active parts | Size |
|---|---|---|---|---|
| 1 | page_events | 1,247 | 1,389 | 04 GB |
| 2 | order_events | 36 | 412 | 88 GB |
| 3 | session_summary | 24 | 190 | 31 GB |
| 4 | user_profiles | 1 | 22 | 9 GB |
| … |
page_events is red: 1,247 partitions on a table only two weeks old and just 04 GB in size. Compare it to order_events, which is far bigger (88 GB) but has only 36 partitions. The team checks the DDL:
- This is a design fault, not load. 1,247 partitions in two weeks is impossible from daily partitioning alone (that would be ~14). The
site_idin the partition key multiplied the slices: ~14 days x ~40 sites plus historical backfill produced ~1,200 partitions. The fix is thePARTITION BY, not more hardware. - The parts-to-partitions ratio confirms it. 1,389 parts across 1,247 partitions is ~1.1 parts per partition. ClickHouse cannot merge across partitions, so each tiny slice is stuck as its own near-unmergeable unit. That fragments the 04 GB across ~1,400 files and pushes the table toward the
Too Many Partscap even though the data volume is trivial. - It will become an outage if ignored. As ingest continues, parts climb. Pair with Too Many Parts Errors (24h): once a single partition’s part count breaches the cap, inserts to
page_eventsstart failing with code 252 and the pipeline stalls.
PARTITION BY to a coarse, low-cardinality expression (monthly, toYYYYMM(event_time)), and move site_id into the ORDER BY / primary key where it belongs for filtering. Because you cannot alter PARTITION BY on an existing table in place, the standard path is to create a correctly partitioned table and INSERT INTO ... SELECT the data across, then swap names. After the migration the partition count for page_events drops to single digits and the parts backlog clears as merges consolidate freely within the larger partitions.
Sibling cards platform teams should reference together
| Card | Why pair it with Partition Count | What the combination tells you |
|---|---|---|
| Active Parts (Top 10 Tables) | Parts are the other half of the fragmentation story. | High partitions plus high parts with a ~1:1 ratio is the over-partitioning signature; high parts within few partitions is a merge-throughput problem instead. |
| Too Many Parts Errors (24h) | The outage an over-partitioned table eventually causes. | A table topping this card today is the prime suspect when code-252 errors start tomorrow. |
| Merges In Progress | More partitions means more, smaller merge units. | Many partitions plus a high merge count means the scheduler is thrashing on tiny per-partition merges. |
| Database Disk Usage % | Fragmentation wastes space and inodes. | An over-partitioned table inflates file count and disk overhead out of proportion to its data. |
| Query Latency p95 (ms) | Too many partitions slows partition pruning and reads. | Latency creeping up on a specific table that also tops this card points straight at the partition key. |
| Inserts per Second (live) | Insert pattern shapes partition/part growth. | Frequent small inserts into an over-partitioned table is the fastest route to a parts cliff. |
| ClickHouse Health Score | The composite that weights ingest health. | A table over the partition threshold drags the score down ahead of any visible failure. |
Reconciling against the source
Where to look in ClickHouse itself:Why our number may legitimately differ from a manual read:system.partsfor the authoritative count:SELECT table, uniqExact(partition_id) AS partitions, count() AS parts FROM system.parts WHERE active GROUP BY table ORDER BY partitions DESC. This is exactly what the card runs.system.partsper partition to see the offender’s shape:SELECT partition, count() AS parts, sum(rows) FROM system.parts WHERE active AND table = 'page_events' GROUP BY partition ORDER BY parts DESC.SHOW CREATE TABLE <name>to read thePARTITION BYexpression that produced the partitions, the root-cause artefact.system.tablesforpartition_keyand engine metadata across tables. ClickHouse Cloud: the samesystem.partsreads apply; the Cloud console also surfaces table-level part and partition statistics per service.
| Reason | Direction | Why |
|---|---|---|
| Active-only filter | Vortex IQ lower | The card counts WHERE active. A manual count without that filter includes inactive parts pending cleanup and over-reports. |
| Distinct partitions vs parts | Different numbers | This card counts distinct partition_id; a naive count() on system.parts counts parts, which is a larger and different number. Match the aggregation. |
| Sampling instant | Marginal | During heavy ingest, partitions/parts churn between the card’s sample and a hand-run query, especially right after a backfill. |
| Cluster aggregation | Vortex IQ higher | The card aggregates across nodes; a single-node query sees only that node’s partitions. |
| Detached partitions | Vortex IQ lower | Detached partitions are not active and are excluded; if you ATTACH them, the count rises. |
| Card | Expected relationship | What causes divergence |
|---|---|---|
clickhouse.active-parts-top-10-tables | A table high here is usually high there too. | A high parts count concentrated in few partitions means a merge-throughput issue, not a partition-key issue; the two cards disagree by design in that case. |
clickhouse.too-many-parts-errors-24h | The over-partitioned table is the likely source of future code-252 errors. | If errors appear on a table not topping this card, the cause is insert frequency, not partition granularity. |
Known limitations / FAQs
What is the difference between a partition and a part? A partition is a logical slice defined by yourPARTITION BY expression (for example, all rows for one month). A part is a physical immutable directory of data on disk; many parts can belong to one partition, and ClickHouse merges parts within a partition over time but never across partitions. This card counts partitions (a design-health signal). The Active Parts (Top 10 Tables) card counts parts (a merge-throughput signal). You usually need both to diagnose a table.
Why is 1,000 partitions the threshold? My table has 800 and seems fine.
Healthy partitioning is coarse: monthly over a few years is tens, daily over a couple of years is hundreds. By the time a table reaches the high hundreds it is almost always partitioned at too fine a grain or by a high-cardinality column, and the cost (slower pruning, fragmented merges, file-count overhead) is already being paid even if nothing has failed yet. The > 1000 default is the point where a redesign is clearly warranted. This is a Sensitivity card, so lower the threshold per profile if you want earlier warning.
Can I just ALTER TABLE to change the partition key?
No. ClickHouse does not allow changing PARTITION BY on an existing table in place. The standard remedy is to create a new table with the correct (coarser) partition key, copy the data across with INSERT INTO new_table SELECT * FROM old_table, validate row counts, then RENAME to swap them. Plan this as a migration; it is the price of getting the partition key right after the fact, which is why catching an over-partitioned table early (when it is small) matters so much.
My table has thousands of partitions but they are all tiny. Is that really a problem?
Yes, often the worst kind. Tiny partitions mean each one holds very few parts and rows, and because merges never cross partition boundaries, ClickHouse cannot consolidate them into efficient larger units. You end up with thousands of near-unmergeable slices, a huge file count, slow partition pruning, and a fast climb toward the per-table parts cap, all for a trivial amount of data. The fragmentation, not the data volume, is the problem.
Should I partition by day to make DROP PARTITION retention easy?
Daily partitioning is a reasonable trade-off only if your data spans a bounded window (so the partition count stays in the low hundreds) and you genuinely use DROP PARTITION for retention. For longer-lived tables, prefer monthly partitioning and use TTL (TTL event_time + INTERVAL 90 DAY) for retention instead, which deletes at the part level without needing a fine partition grain. Partitioning purely for delete-convenience is the most common cause of over-partitioning.
Does a high-cardinality column in the partition key cause this?
Almost always, and it is the classic mistake. Partitioning by (date, site_id), (date, customer_id), or any column with many distinct values multiplies the partition count by that cardinality. The partition key should be low-cardinality and coarse (typically just a time bucket). High-cardinality columns belong in the ORDER BY / primary key, where they help query filtering without exploding the partition count. If a table tops this card, check SHOW CREATE TABLE for a multi-column or high-cardinality PARTITION BY first.
The card only shows 10 tables. What if my over-partitioned table is number 11?
The card ranks the top 10 by partition count, so the worst offenders surface first; an over-partitioned table is, by definition, near the top. If you suspect a table outside the top 10, run the system.parts aggregation manually without the LIMIT 10 to see the full ranking. In practice, any table serious enough to risk an outage is in the top few, which is what the alert (> 1000 on any table) is tuned to catch.