At a glance
Query Latency p95 (ms) is the 95th-percentile end-to-end execution time across all queries in the window: 95% of queries finish faster than this number, and the slowest 5% are above it. It is the most useful single read on the experience your heaviest, most-watched workloads feel, because dashboards and scheduled loads tend to sit in that upper tail. The median (p50) can look healthy while the p95 is quietly punishing your biggest reports. One important Snowflake-specific note: Snowflake reports query latency natively in seconds in most of its console views, not milliseconds, so the threshold here (5000ms equals 5 seconds) reflects an analytics use case where multi-second queries are normal, unlike a transactional database where 5 seconds would be catastrophic.
| What it tracks | The 95th percentile of TOTAL_ELAPSED_TIME across all queries in the selected period, expressed in milliseconds. |
| Data source | detail: Snowflake p95 measured in seconds, not ms, so the threshold reflects an analytics use case. The card normalises Snowflake’s native seconds into milliseconds for display; computed from TOTAL_ELAPSED_TIME in QUERY_HISTORY. |
| Time window | RT/1h (live read over the rolling last hour). |
| Alert trigger | > 5000ms (5 seconds). Sustained p95 above this pages the platform on-call. |
| Roles | owner, platform, SRE, data engineering |
Calculation
The card computes the 95th percentile ofTOTAL_ELAPSED_TIME (Snowflake’s wall-clock time from query submission to result return, inclusive of any queue and provisioning time) across every query that completed in the rolling hour. Snowflake stores this column in milliseconds internally, but its Snowsight Query History and Query Profile views display it in seconds; the card displays milliseconds for consistency with other latency cards, which is why the 5000ms threshold reads as 5 seconds when you cross-check in the console. Because p95 includes queue time, a spike can come from the warehouse being overloaded (queries waiting before they even run) as much as from the queries themselves being heavy. See the worked example below for how to tell the two apart.
Worked example
A data platform team runs a Snowflake account that serves a finance close dashboard, an hourly ELT pipeline, and an analyst worksheet pool. Snapshot taken on 22 Apr 26 at 16:20 BST, during the monthly finance close when dashboard usage triples.| Period | p50 latency | p95 latency | p99 latency | Read |
|---|---|---|---|---|
| Baseline (typical afternoon) | 480ms | 3,100ms | 9,200ms | Healthy, p95 under threshold |
| 16:20 today (finance close) | 510ms | 7,400ms | 41,000ms | p95 breached, p50 barely moved |
- The median is fine, so this is a tail problem, not a whole-distribution problem. p50 moved 30ms; the typical analyst worksheet still returns in half a second. The pain is concentrated in the heavy queries, the finance close dashboard tiles that scan large fact tables and the ELT models running concurrently.
- The cause is contention, not bad SQL. Drilling into Avg Query Queue Depth per Warehouse shows the
FINANCE_WHqueue depth sitting at 7 sustained. The heavy dashboard queries are not slow because they are badly written; they are slow because they are waiting in line behind the ELT job on a warehouse that is too small for the concurrent close-day load. - The fix is a warehouse change, not a query rewrite. Options, in order of speed: enable multi-cluster scaling on
FINANCE_WHso a second cluster spins up under queue pressure, or temporarily resize it from Medium to Large for the close window, or move the ELT job to its own warehouse so it stops competing with interactive dashboards.
- Always read p95 against p50. p95 up with p50 flat equals a tail/contention problem affecting the heavy workloads. p95 up and p50 up equals a whole-distribution shift, which points to an account-wide cause: result cache cold after a restart, statistics drift, or a broad data-volume increase.
- p95 includes queue time, so it is half a performance metric and half a capacity metric. Decompose it (execution time vs queue time) before deciding whether to optimise the query or resize the warehouse. The Avg Query Queue Depth per Warehouse card does this split for you.
- 5 seconds is the right alarm for analytics, not for OLTP. Snowflake is a warehouse, not a transactional store. A 5-second p95 on a transactional database would be a crisis; on an analytics warehouse running multi-table scans it is the sensible line between “normal heavy query” and “something is wrong”. Adjust in the Sensitivity tab if your workload is lighter or heavier.
Sibling cards to reference together
| Card | Why pair it with Query Latency p95 | What the combination tells you |
|---|---|---|
| Query Latency p50 (ms) | The median peer; the baseline the tail rides on. | p95 up, p50 flat equals tail/contention; both up equals whole-distribution shift. |
| Query Latency p99 (ms) | The extreme tail; the very worst queries. | p99 far above p95 means a small set of pathological queries dominates the worst case. |
| Avg Query Queue Depth per Warehouse | Splits p95 into execution vs queue time. | Deep queue plus high p95 equals an undersized warehouse, not bad SQL. |
| Warehouse Saturation % | Confirms whether running warehouses are maxed out. | High saturation plus high p95 confirms a capacity ceiling. |
| Slow-Query Rate % | The count-based companion to the percentile view. | A rising slow-query rate and rising p95 agree the tail is growing. |
| Top 10 Slowest Queries | Names the specific statements in the slow tail. | Use it to find the exact queries dragging p95 up. |
| Query Error Rate % | Statement timeouts appear as both slow and failed. | High p95 approaching the statement timeout means failures are next. |
| Snowflake Health Score | The composite that weights latency. | A sustained p95 breach drags the composite down even if other inputs are green. |
Reconciling against the source
Where to look in Snowflake’s own tooling:Snowsight to Monitoring to Query History, then sort by Duration and read the slowest queries (Snowsight shows duration in seconds, so divide our ms by 1000 to compare). Query Profile on any individual slow query to see the execution vs queue time breakdown.To reproduce p95 over the last hour (returns milliseconds, matching the card):SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORYor the liveINFORMATION_SCHEMA.QUERY_HISTORYtable function for the authoritative numbers.
| Reason | Direction | Why |
|---|---|---|
| Seconds vs milliseconds | Apparent 1000x gap | Snowsight displays seconds; the card displays milliseconds. 7.4s in Snowsight equals 7,400ms here. This is a display convention, not a real difference. |
| ACCOUNT_USAGE latency | Brief lag | ACCOUNT_USAGE can trail live activity by up to 45 minutes; the live read uses INFORMATION_SCHEMA but longer windows may use ACCOUNT_USAGE. |
| Queue time inclusion | Vortex IQ may read higher | TOTAL_ELAPSED_TIME includes queue and provisioning time; a UI view filtered to pure execution time will read lower. |
| Time zone | Window boundary shift | Snowsight uses account time zone; Vortex IQ aligns to your reporting time zone. |
| Card | Expected relationship | What causes divergence |
|---|---|---|
slow-analytics-queries-during-checkout-window | High p95 during peak ecom windows is higher-impact. | A p95 spike overlapping a sales event slows the reporting the business is actively watching. |
| Ecom order volume (Shopify / BigCommerce / Adobe) | No direct causal link. | A latency spike during a promotion delays merchandising’s ability to react to live sales. |
Known limitations / FAQs
Why is the threshold 5,000ms when my queries routinely take 8 seconds? Snowflake is an analytics warehouse, so multi-second queries are normal and 5 seconds is a deliberately conservative line. If your workload is genuinely heavy (large fact-table scans, complex joins), raise the threshold in the Sensitivity tab to a level you would not see during healthy operation. The default exists to catch contention and regression, not to flag every legitimately heavy query. My p95 is high but p50 is fine. Is anything actually wrong? It depends on what lives in your tail. If the slow 5% are background ELT jobs nobody is waiting on, a high p95 may be acceptable. If the slow 5% are interactive dashboard tiles your finance team refreshes during close, it is a real experience problem. Use Top 10 Slowest Queries to see who is in the tail before deciding whether to act. The card shows milliseconds but Snowsight shows seconds. Which is right? Both are correct; they are the same number in different units. The card normalises to milliseconds so it lines up with the p50 and p99 cards. Divide the card value by 1000 to compare against Snowsight’s Duration column. Does p95 include the time a query spent queued? Yes.TOTAL_ELAPSED_TIME is wall-clock time from submission to result, which includes queue and provisioning time. That is why a p95 spike can be a capacity problem (queries waiting) rather than a query-complexity problem. Decompose it with Avg Query Queue Depth per Warehouse.
Should I resize my warehouse or rewrite my queries when p95 spikes?
Decompose first. If most of the p95 is queue time, the warehouse is undersized: enable multi-cluster scaling or resize up. If most of it is execution time, the queries are heavy: use Query Profile to find expensive scans, add clustering keys, or prune the data scanned. Resizing a warehouse to fix badly written queries just makes expensive queries fail faster.
Why does my p95 drop sharply right after a warehouse restart, then climb?
Snowflake’s result cache and warehouse-local data cache are cold after a suspend or restart. The first queries pay full I/O cost (sometimes inflating p95), then as the cache warms repeat queries serve from cache and latency falls. A sawtooth p95 that tracks auto-suspend cycles is normal; tune AUTO_SUSPEND if the cold-start cost outweighs the credit savings of suspending.