Query Latency p95 (ms), Snowflake - Vortex IQ Help Centre

Card class: Hero • Category: Performance

At a glance

Query Latency p95 (ms) is the 95th-percentile end-to-end execution time across all queries in the window: 95% of queries finish faster than this number, and the slowest 5% are above it. It is the most useful single read on the experience your heaviest, most-watched workloads feel, because dashboards and scheduled loads tend to sit in that upper tail. The median (p50) can look healthy while the p95 is quietly punishing your biggest reports. One important Snowflake-specific note: Snowflake reports query latency natively in seconds in most of its console views, not milliseconds, so the threshold here (5000ms equals 5 seconds) reflects an analytics use case where multi-second queries are normal, unlike a transactional database where 5 seconds would be catastrophic.


What it tracks	The 95th percentile of `TOTAL_ELAPSED_TIME` across all queries in the selected period, expressed in milliseconds.
Data source	`detail`: Snowflake p95 measured in seconds, not ms, so the threshold reflects an analytics use case. The card normalises Snowflake’s native seconds into milliseconds for display; computed from `TOTAL_ELAPSED_TIME` in `QUERY_HISTORY`.
Time window	`RT/1h` (live read over the rolling last hour).
Alert trigger	`> 5000ms` (5 seconds). Sustained p95 above this pages the platform on-call.
Roles	owner, platform, SRE, data engineering

Calculation

The card computes the 95th percentile of TOTAL_ELAPSED_TIME (Snowflake’s wall-clock time from query submission to result return, inclusive of any queue and provisioning time) across every query that completed in the rolling hour. Snowflake stores this column in milliseconds internally, but its Snowsight Query History and Query Profile views display it in seconds; the card displays milliseconds for consistency with other latency cards, which is why the 5000ms threshold reads as 5 seconds when you cross-check in the console. Because p95 includes queue time, a spike can come from the warehouse being overloaded (queries waiting before they even run) as much as from the queries themselves being heavy. See the worked example below for how to tell the two apart.

Worked example

A data platform team runs a Snowflake account that serves a finance close dashboard, an hourly ELT pipeline, and an analyst worksheet pool. Snapshot taken on 22 Apr 26 at 16:20 BST, during the monthly finance close when dashboard usage triples.

Period	p50 latency	p95 latency	p99 latency	Read
Baseline (typical afternoon)	480ms	3,100ms	9,200ms	Healthy, p95 under threshold
16:20 today (finance close)	510ms	7,400ms	41,000ms	p95 breached, p50 barely moved

The p95 card turns red at 7,400ms (7.4 seconds), well over the 5,000ms threshold, while p50 is essentially unchanged at 510ms. The platform team’s read:

The median is fine, so this is a tail problem, not a whole-distribution problem. p50 moved 30ms; the typical analyst worksheet still returns in half a second. The pain is concentrated in the heavy queries, the finance close dashboard tiles that scan large fact tables and the ELT models running concurrently.
The cause is contention, not bad SQL. Drilling into Avg Query Queue Depth per Warehouse shows the FINANCE_WH queue depth sitting at 7 sustained. The heavy dashboard queries are not slow because they are badly written; they are slow because they are waiting in line behind the ELT job on a warehouse that is too small for the concurrent close-day load.
The fix is a warehouse change, not a query rewrite. Options, in order of speed: enable multi-cluster scaling on FINANCE_WH so a second cluster spins up under queue pressure, or temporarily resize it from Medium to Large for the close window, or move the ELT job to its own warehouse so it stops competing with interactive dashboards.

Decomposing the 7,400ms p95:
  - Execution time (actual compute): ~3,200ms (close to baseline p95)
  - Queue + provisioning time:        ~4,200ms (the new contention)
  => More than half the p95 is queue time, confirming a sizing problem
     rather than a query-complexity problem

Three takeaways:

Always read p95 against p50. p95 up with p50 flat equals a tail/contention problem affecting the heavy workloads. p95 up and p50 up equals a whole-distribution shift, which points to an account-wide cause: result cache cold after a restart, statistics drift, or a broad data-volume increase.
p95 includes queue time, so it is half a performance metric and half a capacity metric. Decompose it (execution time vs queue time) before deciding whether to optimise the query or resize the warehouse. The Avg Query Queue Depth per Warehouse card does this split for you.
5 seconds is the right alarm for analytics, not for OLTP. Snowflake is a warehouse, not a transactional store. A 5-second p95 on a transactional database would be a crisis; on an analytics warehouse running multi-table scans it is the sensible line between “normal heavy query” and “something is wrong”. Adjust in the Sensitivity tab if your workload is lighter or heavier.

Sibling cards to reference together

Card	Why pair it with Query Latency p95	What the combination tells you
Query Latency p50 (ms)	The median peer; the baseline the tail rides on.	p95 up, p50 flat equals tail/contention; both up equals whole-distribution shift.
Query Latency p99 (ms)	The extreme tail; the very worst queries.	p99 far above p95 means a small set of pathological queries dominates the worst case.
Avg Query Queue Depth per Warehouse	Splits p95 into execution vs queue time.	Deep queue plus high p95 equals an undersized warehouse, not bad SQL.
Warehouse Saturation %	Confirms whether running warehouses are maxed out.	High saturation plus high p95 confirms a capacity ceiling.
Slow-Query Rate %	The count-based companion to the percentile view.	A rising slow-query rate and rising p95 agree the tail is growing.
Top 10 Slowest Queries	Names the specific statements in the slow tail.	Use it to find the exact queries dragging p95 up.
Query Error Rate %	Statement timeouts appear as both slow and failed.	High p95 approaching the statement timeout means failures are next.
Snowflake Health Score	The composite that weights latency.	A sustained p95 breach drags the composite down even if other inputs are green.

Reconciling against the source

Where to look in Snowflake’s own tooling:

Snowsight to Monitoring to Query History, then sort by Duration and read the slowest queries (Snowsight shows duration in seconds, so divide our ms by 1000 to compare). Query Profile on any individual slow query to see the execution vs queue time breakdown. SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY or the live INFORMATION_SCHEMA.QUERY_HISTORY table function for the authoritative numbers.

To reproduce p95 over the last hour (returns milliseconds, matching the card):

SELECT PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY TOTAL_ELAPSED_TIME) AS p95_ms
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY(
  DATEADD('hour', -1, CURRENT_TIMESTAMP()), CURRENT_TIMESTAMP()));

Why our number may legitimately differ from Snowflake’s UI:

Reason	Direction	Why
Seconds vs milliseconds	Apparent 1000x gap	Snowsight displays seconds; the card displays milliseconds. 7.4s in Snowsight equals 7,400ms here. This is a display convention, not a real difference.
ACCOUNT_USAGE latency	Brief lag	`ACCOUNT_USAGE` can trail live activity by up to 45 minutes; the live read uses `INFORMATION_SCHEMA` but longer windows may use `ACCOUNT_USAGE`.
Queue time inclusion	Vortex IQ may read higher	`TOTAL_ELAPSED_TIME` includes queue and provisioning time; a UI view filtered to pure execution time will read lower.
Time zone	Window boundary shift	Snowsight uses account time zone; Vortex IQ aligns to your reporting time zone.

Cross-connector reconciliation:

Card	Expected relationship	What causes divergence
`slow-analytics-queries-during-checkout-window`	High p95 during peak ecom windows is higher-impact.	A p95 spike overlapping a sales event slows the reporting the business is actively watching.
Ecom order volume (Shopify / BigCommerce / Adobe)	No direct causal link.	A latency spike during a promotion delays merchandising’s ability to react to live sales.

Known limitations / FAQs

Why is the threshold 5,000ms when my queries routinely take 8 seconds? Snowflake is an analytics warehouse, so multi-second queries are normal and 5 seconds is a deliberately conservative line. If your workload is genuinely heavy (large fact-table scans, complex joins), raise the threshold in the Sensitivity tab to a level you would not see during healthy operation. The default exists to catch contention and regression, not to flag every legitimately heavy query. My p95 is high but p50 is fine. Is anything actually wrong? It depends on what lives in your tail. If the slow 5% are background ELT jobs nobody is waiting on, a high p95 may be acceptable. If the slow 5% are interactive dashboard tiles your finance team refreshes during close, it is a real experience problem. Use Top 10 Slowest Queries to see who is in the tail before deciding whether to act. The card shows milliseconds but Snowsight shows seconds. Which is right? Both are correct; they are the same number in different units. The card normalises to milliseconds so it lines up with the p50 and p99 cards. Divide the card value by 1000 to compare against Snowsight’s Duration column. Does p95 include the time a query spent queued? Yes. TOTAL_ELAPSED_TIME is wall-clock time from submission to result, which includes queue and provisioning time. That is why a p95 spike can be a capacity problem (queries waiting) rather than a query-complexity problem. Decompose it with Avg Query Queue Depth per Warehouse. Should I resize my warehouse or rewrite my queries when p95 spikes? Decompose first. If most of the p95 is queue time, the warehouse is undersized: enable multi-cluster scaling or resize up. If most of it is execution time, the queries are heavy: use Query Profile to find expensive scans, add clustering keys, or prune the data scanned. Resizing a warehouse to fix badly written queries just makes expensive queries fail faster. Why does my p95 drop sharply right after a warehouse restart, then climb? Snowflake’s result cache and warehouse-local data cache are cold after a suspend or restart. The first queries pay full I/O cost (sometimes inflating p95), then as the cache warms repeat queries serve from cache and latency falls. A sawtooth p95 that tracks auto-suspend cycles is normal; tune AUTO_SUSPEND if the cold-start cost outweighs the credit savings of suspending.

Tracked live in Vortex IQ Nerve Centre

Query Latency p95 (ms) is one of hundreds of KPI pulses Vortex IQ tracks across Snowflake and 70+ other ecommerce connectors. Nerve Centre runs the detection layer; Vortex Mind investigates the cause when something moves; Ask Viq lets you interrogate any number in plain English. Start for free or book a demo to see this metric running on your own data.

​At a glance

​Calculation

​Worked example

​Sibling cards to reference together

​Reconciling against the source

​Known limitations / FAQs

​Tracked live in Vortex IQ Nerve Centre