At a glance
How close your Databricks SQL warehouse is to its concurrency ceiling, expressed as a percentage. A SQL warehouse can serve a finite number of concurrent queries before new ones start queuing; saturation measures how full that pipe is right now. At 40% there is plenty of headroom; at 90% queries are about to queue, and once they queue, latency rises sharply and dashboards feel slow. For a platform team this is the gauge that tells you whether to scale out (add clusters to the warehouse), and the alert at 90% is the line where action becomes urgent rather than optional.
| Field | Detail |
|---|---|
| What it tracks | The ratio of in-use query slots to total available slots across the warehouse’s active clusters, as dbx_pool_saturation, rendered as a gauge from 0 to 100%. |
| Data source | The SQL Warehouses API monitoring endpoint (GET /api/2.0/sql/warehouses/{id} and the warehouse monitoring stats) plus system.compute.warehouse_events and query-history concurrency, where the system schema is enabled. Saturation is active concurrency divided by the warehouse’s max concurrency for its current cluster count. |
| Why it matters | Saturation is the leading indicator of query queuing. Latency and queue-time both stay flat until saturation nears 100%, then rise non-linearly. Catching the climb to 90% lets you scale before users feel it. |
| Time window | RT/1m: real-time gauge sampled on a one-minute cadence. |
| Alert trigger | > 90%. Sustained saturation above 90% means the warehouse is at its concurrency ceiling and queries are queuing or about to. |
| Sentiment | Lower is healthier for headroom, but very low sustained values (under 10%) suggest the warehouse is over-provisioned for its load. |
| Roles | owner, engineering, operations (DBA / platform / SRE) |
Calculation
A Databricks SQL warehouse runs one or more clusters, and each cluster admits a bounded number of concurrent queries (the platform targets roughly ten running queries per cluster before it considers scaling). Saturation is the live occupancy of that capacity:

saturation % = running queries / (active clusters × slots per cluster) × 100

Worked example
A platform team runs a SQL warehouse (size Medium, autoscaling 1 to 4 clusters) that powers internal BI dashboards plus an embedded analytics layer on a storefront. Snapshot taken on 18 Apr 26 between 09:00 and 09:30 UTC, the morning reporting peak.

| Time (UTC) | Clusters | Running queries | Total slots | Saturation % | Note |
|---|---|---|---|---|---|
| 09:02 | 1 | 7 | 10 | 70 | Warming up |
| 09:08 | 1 | 10 | 10 | 100 | At ceiling, queuing begins |
| 09:09 | 2 | 11 | 20 | 55 | Autoscale added a cluster |
| 09:21 | 2 | 19 | 20 | 95 | Alert: sustained > 90% |
| 09:24 | 3 | 21 | 30 | 70 | Scaled again, recovered |
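
The Saturation % column can be reproduced from the other columns. The sketch below is a minimal illustration, assuming the nominal ten slots per cluster quoted above; the function and constant names are hypothetical and are not part of any Databricks or Vortex IQ API.

```python
# Nominal per-cluster concurrency target from the text; an estimate, not an exact slot count.
SLOTS_PER_CLUSTER = 10
ALERT_THRESHOLD_PCT = 90.0


def saturation_pct(running_queries: int, clusters: int,
                   slots_per_cluster: int = SLOTS_PER_CLUSTER) -> float:
    """Running queries as a percentage of total slots across active clusters."""
    total_slots = clusters * slots_per_cluster
    if total_slots <= 0:
        raise ValueError("warehouse has no active clusters")
    return 100.0 * running_queries / total_slots


# (time, clusters, running queries) copied from the snapshot table.
snapshot = [
    ("09:02", 1, 7),
    ("09:08", 1, 10),
    ("09:09", 2, 11),
    ("09:21", 2, 19),
    ("09:24", 3, 21),
]

rows = [(t, saturation_pct(running, clusters)) for t, clusters, running in snapshot]
for t, pct in rows:
    # This flags single readings only; the "sustained" part of the alert is not modelled here.
    flag = " (above alert threshold)" if pct > ALERT_THRESHOLD_PCT else ""
    print(f"{t}  {pct:.0f}%{flag}")
```

Note that the 09:09 reading drops to 55% even though running queries rose from 10 to 11, because the second cluster doubled the denominator.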
- Confirm autoscaling is doing its job. It is: by 09:24 a third cluster is up and saturation falls to 70%. The 95% window was the autoscaler’s reaction lag, not a hard ceiling. The user-visible pain lasted about three minutes.
- Reduce the reaction lag. Raising the warehouse’s minimum cluster count from 1 to 2 during business hours means the morning peak starts with more headroom and the first burst does not hit 100%. This trades a small steady-state cost for smoother peaks.
- Check whether the load is queries or one heavy query. Here it was many concurrent dashboard refreshes, a true concurrency problem that scaling out solves. If it had been one giant query holding a slot, scaling out would not have helped, and the fix would live in Top 10 Slowest SQL Queries instead.
- Saturation is the cause; latency is the effect. The latency rise on SQL Query Latency p95 (ms) at 09:21 and this gauge hitting 95% are the same event. Watch saturation to act before latency degrades.
- A warehouse pinned at max clusters and still saturated is a different problem. When there is no headroom left to autoscale into, sustained high saturation means the warehouse is genuinely too small for its peak, and the answer is a larger warehouse size or a higher max-cluster ceiling, not patience.
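
The distinction between autoscaler lag and a warehouse that is genuinely too small can be expressed as a small decision rule. This is a hedged sketch of the reasoning in the points above, not the logic Vortex IQ or Databricks actually runs; the function name and the returned labels are hypothetical.

```python
def recommend_action(saturation: float, clusters: int, max_clusters: int,
                     threshold: float = 90.0) -> str:
    """Map a saturation reading to the response the notes above suggest."""
    if saturation <= threshold:
        return "ok"
    if clusters < max_clusters:
        # Headroom remains: the autoscaler can still add a cluster, so high
        # saturation is likely reaction lag. Raising the minimum cluster count
        # during peak hours shortens it.
        return "wait_or_raise_min_clusters"
    # Already at the max-cluster ceiling: there is nothing left to scale into.
    return "resize_or_raise_max_clusters"


# The 09:21 reading from the worked example: 95% on 2 of a possible 4 clusters.
print(recommend_action(95.0, clusters=2, max_clusters=4))
```

The rule deliberately ignores the single-heavy-query case, which saturation alone cannot detect; that check belongs with Top 10 Slowest SQL Queries.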
Sibling cards
| Card | Why pair it with SQL Warehouse Saturation | What the combination tells you |
|---|---|---|
| Active SQL Sessions | Sessions drive concurrency; more sessions push saturation up. | A session surge with rising saturation equals a genuine demand peak. |
| SQL Query Latency p95 (ms) | Latency is the user-visible effect of saturation. | p95 climbing as saturation passes 90% confirms queuing, not slow queries. |
| SQL Queries per Hour (live) | The throughput driving the gauge. | High QPH plus high saturation equals scale out; low QPH plus high saturation equals heavy queries. |
| Active SQL Warehouses | Tells you how many warehouses exist to spread load. | One saturated warehouse among many idle ones equals a routing/sizing imbalance. |
| Top 10 Slowest SQL Queries | Distinguishes concurrency pressure from one slot-hogging query. | A single slow query holding a slot saturates a small warehouse on its own. |
| Slow-Query Rate % | Slow queries occupy slots longer, inflating saturation. | Rising slow-query rate plus saturation equals queries holding slots too long. |
| Avg Cluster CPU Utilisation % | Confirms whether the warehouse hardware is also hot. | High saturation with low CPU means slots are full but compute is idle (queries waiting, not working). |
Reconciling against the source
Where to look in Databricks:
- Open SQL → SQL Warehouses → (your warehouse) → Monitoring. The live charts show Running queries, Queued queries, and Cluster count over time; saturation is running queries against slot capacity.
- Query SELECT * FROM system.compute.warehouse_events WHERE warehouse_id = '...' (where the system schema is enabled) for scale-up / scale-down events that change the denominator.
- The Query History view, filtered to the warehouse, shows per-query queue time, the direct symptom of saturation.
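
When reconciling a specific reading, the scale events tell you which cluster count, and therefore which denominator, applied at that moment. The sketch below assumes the events have already been fetched and reduced to (event time, cluster count) pairs sorted by time. The event rows are hypothetical values chosen to match the worked example, and the helper name is illustrative only.

```python
from bisect import bisect_right

# Hypothetical (event_time, cluster_count) pairs, sorted by time. The times
# are assumed; a real query would return full timestamps.
events = [("09:00", 1), ("09:09", 2), ("09:24", 3)]


def clusters_at(sample_time: str, scale_events) -> int:
    """Return the cluster count in effect at sample_time (latest event at or before it)."""
    times = [t for t, _ in scale_events]
    i = bisect_right(times, sample_time)
    if i == 0:
        raise ValueError("no scale event at or before the sample time")
    return scale_events[i - 1][1]


# Denominator for the 09:21 reading, using the nominal ten slots per cluster.
total_slots = clusters_at("09:21", events) * 10
print(total_slots)
```

Zero-padded HH:MM strings compare correctly as text, which keeps the example short; production code would compare real timestamps.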
Why our number may legitimately differ from the Databricks UI:
| Reason | Direction | Why |
|---|---|---|
| Denominator timing | Brief mismatch | Saturation depends on current cluster count; during an autoscale event the UI and our poll can briefly disagree on the slot total. |
| Sampling cadence | Smoothing | Vortex IQ samples on a one-minute cadence; the UI’s live chart can show sub-minute spikes we average out. |
| Slots-per-cluster model | Variable | The exact concurrency a cluster admits depends on query weight; we use the platform’s nominal target, so our percentage is an estimate of occupancy, not an exact slot count. |
| Queued vs running | Definition | Our gauge measures running occupancy; a warehouse can be 100% saturated with a long queue behind it, which the UI shows as a separate “queued” series. |
| Time zone | Display only | Chart axes render in workspace time in the UI and profile time in Vortex IQ; the percentages are identical. |
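
The sampling-cadence row is easiest to see with numbers. The sketch below uses made-up ten-second readings to show how a one-minute average can sit well below a peak that a sub-minute chart would display; the values are hypothetical and the exact aggregation used in production is an assumption here.

```python
from statistics import mean

# Hypothetical saturation readings taken every ten seconds within one minute.
sub_minute = [60, 60, 100, 100, 60, 60]

one_minute_value = mean(sub_minute)  # what a one-minute average would report
peak = max(sub_minute)               # what a sub-minute chart could show

print(f"one-minute average: {one_minute_value:.1f}%, sub-minute peak: {peak}%")
```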

| Card | Expected relationship | What causes divergence |
|---|---|---|
| Databricks SQL Spike vs Ecom Order Rate | A storefront traffic spike drives query volume and saturation. | Saturation rising with no order/traffic spike points at internal BI, not the storefront. |
| Slow SQL Queries During Checkout Window | Saturation during peak checkout slows embedded analytics. | High saturation off-peak is a reporting job, not a customer-facing risk. |