At a glance
How full each warehouse is, expressed as the share of its concurrency capacity currently in use: running queries divided by the warehouse’s maximum concurrency level. A gauge near 100% means the warehouse is at the edge of what it can run at once, so the next query queues instead of starting. Sustained saturation is the clearest live signal that a warehouse needs to be upsized (more compute per cluster) or made multi-cluster (more clusters to absorb concurrency). For a platform team it answers, in one glance: which warehouse is about to start making everyone wait?
| Field | Detail |
|---|---|
| What it tracks | Per-warehouse saturation: running_queries / max_concurrency_level, shown as a percentage on a gauge. 100% means every concurrency slot is occupied and further queries must queue. |
| Data source | Live warehouse state: running query count from QUERY_HISTORY / live monitoring, divided by the warehouse’s effective MAX_CONCURRENCY_LEVEL (from SHOW PARAMETERS). Sustained 100% indicates a need to upsize or go multi-cluster. |
| Time window | RT/1m. Real-time, evaluated on a rolling 1-minute basis so a single momentary spike does not trip the gauge. |
| Alert trigger | > 90%. When saturation holds above 90% for the rolling minute, the card flags amber/red and the sensitivity rule fires. |
| Units | Percent (0 to 100) per warehouse. The headline shows the most-saturated warehouse; the gauge is per-warehouse on drill-in. |
| Concurrency vs size | MAX_CONCURRENCY_LEVEL (default 8) controls how many queries run at once per cluster; warehouse size (X-Small to 6X-Large) controls how much compute each query gets. Saturation is about the former. |
| Roles | owner, platform, dba, sre |
Calculation
Saturation is concurrency utilisation, not CPU utilisation. Snowflake admits a bounded number of queries to run simultaneously on a warehouse; that ceiling is MAX_CONCURRENCY_LEVEL (default 8) per cluster. Per warehouse, over the rolling minute:
saturation (%) = running_queries / max_concurrency_level × 100
On a multi-cluster warehouse the denominator is MAX_CONCURRENCY_LEVEL × active clusters. When running_queries reaches MAX_CONCURRENCY_LEVEL, saturation pins at 100% and new queries enter the queue (visible as QUEUED_OVERLOAD_TIME in QUERY_HISTORY). The 1-minute smoothing prevents a brief burst from tripping the > 90% alert; only sustained saturation flags.
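A minimal sketch of the ratio and the rolling-minute smoothing described above. It assumes the gauge averages samples taken within the last 60 seconds; the sampling cadence, the averaging method, and the names saturation_pct, RollingSaturation and is_alerting are illustrative assumptions, not the card's actual implementation.

```python
from collections import deque
from statistics import mean

ALERT_THRESHOLD_PCT = 90.0  # the "> 90%" alert trigger


def saturation_pct(running_queries, max_concurrency_level=8):
    """Percentage of concurrency slots currently in use."""
    if max_concurrency_level <= 0:
        raise ValueError("max_concurrency_level must be positive")
    return 100.0 * running_queries / max_concurrency_level


class RollingSaturation:
    """Mean of the samples seen in the last window_s seconds (assumed smoothing)."""

    def __init__(self, window_s=60.0):
        self.window_s = window_s
        self._samples = deque()  # (timestamp_s, saturation_pct) pairs

    def add(self, timestamp_s, pct):
        """Record one sample and return the smoothed gauge value."""
        self._samples.append((timestamp_s, pct))
        # Evict samples that have aged out of the rolling window.
        while self._samples and self._samples[0][0] <= timestamp_s - self.window_s:
            self._samples.popleft()
        return mean(p for _, p in self._samples)


def is_alerting(smoothed_pct):
    """True when the smoothed value is above the alert trigger."""
    return smoothed_pct > ALERT_THRESHOLD_PCT


# Hypothetical samples: one brief burst to 8 running queries among readings of 4.
gauge = RollingSaturation()
for t, running in [(0, 4), (15, 4), (30, 8), (45, 4)]:
    smoothed = gauge.add(t, saturation_pct(running))
print(smoothed, is_alerting(smoothed))  # 62.5 False
```

The single 100% sample is diluted by the other readings in the window, so the burst alone does not trip the alert; four consecutive samples at 8 running queries would average 100% and flag.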
Worked example
A platform team runs Snowflake behind an Adobe Commerce store’s reporting layer. The reporting warehouse BI_WH is a Medium, single-cluster warehouse with MAX_CONCURRENCY_LEVEL = 8. Snapshot at 09 Jun 26, 09:05 UTC, during the Monday morning dashboard rush.
| Warehouse | Size | Clusters | MAX_CONCURRENCY_LEVEL | Running queries | Saturation |
|---|---|---|---|---|---|
| BI_WH | Medium | 1 | 8 | 8 | 100% |
| TRANSFORM_WH | Large | 1 | 8 | 3 | 38% |
| INGEST_WH | Small | 1 | 8 | 1 | 13% |
| ADHOC_WH | X-Small | 1 | 8 | 2 | 25% |
The headline shows the most-saturated warehouse: BI_WH at 100%, well past the 90% trigger, so the card is red.
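The percentages in the snapshot can be reproduced with a few lines of arithmetic. This is a sketch only: the half-up rounding to a whole percent is an assumption inferred from the table (37.5% shown as 38%, 12.5% shown as 13%), and display_pct is a hypothetical helper name.

```python
import math

MAX_CONCURRENCY_LEVEL = 8  # Snowflake default, as in the snapshot

# Running-query counts copied from the snapshot table.
running = {"BI_WH": 8, "TRANSFORM_WH": 3, "INGEST_WH": 1, "ADHOC_WH": 2}


def display_pct(running_queries, slots=MAX_CONCURRENCY_LEVEL):
    """Saturation rounded half-up to a whole percent (assumed display rule)."""
    return math.floor(100 * running_queries / slots + 0.5)


saturation = {wh: display_pct(n) for wh, n in running.items()}
headline = max(saturation, key=saturation.get)  # most-saturated warehouse

print(saturation)
# {'BI_WH': 100, 'TRANSFORM_WH': 38, 'INGEST_WH': 13, 'ADHOC_WH': 25}
print(headline, saturation[headline] > 90)
# BI_WH True
```

Python's built-in round() rounds halves to the nearest even number, so it would show INGEST_WH as 12%; the explicit half-up step is what matches the table.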
What is happening: every Monday at 09:00 the whole analytics team opens their dashboards at once. BI_WH admits 8 queries; the 9th onwards queue. Analysts experience this as dashboards that “hang” for 10 to 40 seconds before painting. The warehouse is not slow, it is full. Each individual query runs fine, but there are more concurrent queries than concurrency slots.
The fix is concurrency, not size. Upsizing BI_WH from Medium to Large would give each query more compute (faster individual runtime) but MAX_CONCURRENCY_LEVEL stays 8, so the 9th query still queues. The right lever is multi-cluster: raise MAX_CLUSTER_COUNT so that Snowflake adds clusters when concurrency surges.
- Saturation is a concurrency problem, so the cure is usually more clusters, not a bigger warehouse. Upsizing helps single slow queries; multi-cluster helps too many queries at once.
- Read the gauge with the queue card. Sustained 100% saturation and rising Avg Query Queue Depth per Warehouse together confirm real queueing, not just a healthy busy warehouse.
- Multi-cluster on ECONOMY mode is cheap insurance. It only adds clusters when concurrency demands it and scales back when the rush passes, so you pay for the surge, not for keeping it warm all day.
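To see why extra clusters help where a bigger warehouse does not, the capacity arithmetic can be sketched as below. It uses the simplified model from this page (capacity = MAX_CONCURRENCY_LEVEL × active clusters); the demand of 12 concurrent dashboard queries and the helper name admit are hypothetical, and real Snowflake admission also depends on what each query needs.

```python
MAX_CONCURRENCY_LEVEL = 8  # per cluster; unchanged by warehouse size


def admit(concurrent_demand, active_clusters,
          max_concurrency_level=MAX_CONCURRENCY_LEVEL):
    """Split demand into running and queued queries for a given cluster count."""
    capacity = max_concurrency_level * active_clusters
    running = min(concurrent_demand, capacity)
    return {
        "capacity": capacity,
        "running": running,
        "queued": concurrent_demand - running,
        "saturation_pct": 100.0 * running / capacity,
    }


# Hypothetical Monday rush: 12 dashboard queries hitting BI_WH at once.
for clusters in (1, 2):
    print(clusters, admit(12, clusters))
# 1 {'capacity': 8, 'running': 8, 'queued': 4, 'saturation_pct': 100.0}
# 2 {'capacity': 16, 'running': 12, 'queued': 0, 'saturation_pct': 75.0}
```

Warehouse size does not appear anywhere in the function: in this model, moving from Medium to Large leaves capacity at 8 and the same 4 queries queued.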
Sibling cards
| Card | Why pair it with Warehouse Saturation | What the combination tells you |
|---|---|---|
| Avg Query Queue Depth per Warehouse | The direct consequence of saturation: queries waiting. | 100% saturation with deep queueing equals real concurrency starvation, not a healthy busy warehouse. |
| Warehouse Queueing Sustained (>5 queries queued) | The alert that fires when saturation translates into a backlog. | Sustained saturation is the leading indicator; sustained queueing is the confirmed impact. |
| Query Latency p95 (ms) | Queue time inflates end-to-end latency even when execution is fast. | Rising p95 with high saturation means the wait, not the compute, is the bottleneck. |
| Queries per Hour (live) | The load driving saturation up. | A surge in query rate plus rising saturation pinpoints the concurrency pressure source. |
| Credits by Warehouse (7d) | The cost side of any upsize or multi-cluster decision. | Confirms which warehouse to invest concurrency budget in. |
| Snowflake Health Score | The composite that takes saturation as a capacity input. | Sustained saturation pulls the capacity dimension of the score down. |
Reconciling against the source
Where to look in Snowflake:
- Snowsight → Admin → Warehouses shows each warehouse’s live load, running and queued query counts, and cluster count.
- SHOW PARAMETERS LIKE 'MAX_CONCURRENCY_LEVEL' IN WAREHOUSE <name>; confirms the concurrency ceiling.
- Query SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_LOAD_HISTORY (columns AVG_RUNNING, AVG_QUEUED_LOAD, AVG_QUEUED_PROVISIONING) for the historical load profile.
- SHOW WAREHOUSES; reveals current state, size, and MIN/MAX_CLUSTER_COUNT.

Why our number may legitimately differ from Snowflake’s own view:
| Reason | Direction | Why |
|---|---|---|
| Capacity definition | Variable | We compute saturation against MAX_CONCURRENCY_LEVEL * active_clusters; WAREHOUSE_LOAD_HISTORY reports AVG_RUNNING as a fractional load, not a percentage of a fixed ceiling. |
| Smoothing window | Vortex IQ steadier | Snowsight shows instantaneous load; we smooth over a rolling minute, so a sub-minute spike shows higher in Snowsight than on our gauge. |
| Multi-cluster timing | Brief divergence | When a cluster is mid-spin-up, capacity is changing; our value uses the cluster count at sample time. |
| WAREHOUSE_LOAD_HISTORY latency | Historical lag | That ACCOUNT_USAGE view trails real time; the live gauge uses fresher monitoring, so do not expect minute-exact agreement against the historical table. |
| Query admission edge cases | Marginal | Snowflake may admit small queries beyond the nominal level under some conditions; counts can momentarily read slightly above 100%. |
Known limitations / FAQs
Is 100% saturation always bad?
No. A warehouse running at 100% with no meaningful queue is simply well-utilised, which is efficient. The concern is sustained 100% with queries piling into the queue, because that is when users feel the wait. Always read this gauge with Avg Query Queue Depth per Warehouse.
My warehouse is saturated. Should I make it bigger?
Usually not for saturation alone. Warehouse size (X-Small to 6X-Large) gives each query more compute; it does not raise MAX_CONCURRENCY_LEVEL, which is what saturation measures. To absorb more concurrent queries, convert to multi-cluster (raise MAX_CLUSTER_COUNT) so Snowflake adds clusters during surges. Upsize only if individual queries are also slow.
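The two answers above can be read as a small triage rule. The sketch below is one possible encoding, not product logic: the function name triage, the "any queue at all" cutoff, and the queries_slow flag are assumptions, and both inputs are taken to be sustained (smoothed) values.

```python
def triage(saturation_pct, avg_queue_depth, queries_slow):
    """Suggest a lever from the gauge, the queue card, and query speed."""
    saturated = saturation_pct > 90.0   # the card's alert trigger
    queueing = avg_queue_depth > 0.0    # assumed cutoff for "meaningful queue"

    if saturated and queueing:
        action = "raise MAX_CLUSTER_COUNT (multi-cluster)"
        if queries_slow:
            action += " and consider upsizing"
        return action
    if queries_slow:
        return "consider upsizing"
    if saturated:
        return "well-utilised, no action"
    return "no action"


print(triage(100.0, 4.0, queries_slow=False))
# raise MAX_CLUSTER_COUNT (multi-cluster)
print(triage(100.0, 0.0, queries_slow=False))
# well-utilised, no action
print(triage(25.0, 0.0, queries_slow=True))
# consider upsizing
```

In practice the queue cutoff would be tuned per warehouse, in the same spirit as the saturation threshold.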
What is the difference between saturation and CPU utilisation?
Saturation here is concurrency utilisation: how many of the warehouse’s query slots are in use. It is not a CPU or memory percentage. A warehouse can be 100% saturated (all 8 slots busy) while each query is light, or 25% saturated while two heavy queries hammer the compute. Snowflake does not expose raw CPU per warehouse, so concurrency is the meaningful saturation signal.
Why does the gauge sometimes read just over 100%?
Snowflake can briefly admit additional lightweight queries beyond the nominal MAX_CONCURRENCY_LEVEL under certain conditions, and there is a small timing window when a cluster is spinning up or down. We surface the real ratio, so transient readings slightly above 100% are expected and harmless.
Does multi-cluster scaling make saturation disappear?
It moves the ceiling. A multi-cluster warehouse adds clusters during a surge, multiplying capacity, so saturation drops as long as you have not hit MAX_CLUSTER_COUNT. If you are pinned at max clusters and still saturated, you genuinely need a higher cluster cap or workload separation.
Why the 90% trigger and not 100%?
Because 90% sustained is the early-warning point: it means you are one or two queries away from queueing on the next surge. Waiting for 100% means waiting until users are already feeling lag. Adjust the threshold per warehouse in the Sensitivity tab to match how bursty that workload is.
Should every warehouse be multi-cluster as a precaution?
No. Multi-cluster suits warehouses with spiky concurrency (interactive BI, analyst pools). For steady, serialised pipelines (ELT, ingest) where queries run one or two at a time, a single cluster is cheaper and never saturates. Match the topology to the workload shape, which Credits by Warehouse (7d) helps you reason about.