What this audit checks
Credit Burn & Cost
- Credit burn up >50% week-over-week without proportional query growth (runaway query or scheduled job overrun)
- Idle warehouse credits >10% of total (auto-suspend not firing; warehouse staying warm idle)
- Cost per query rising >25% week-over-week (query inefficiency or result-set bloat)
- Warehouse right-sizing opportunity - sustained low utilization (<30% for >7 days; downsize candidate)
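
The week-over-week checks above reduce to a percent-change calculation. A minimal Python sketch, assuming weekly totals have already been aggregated. The function names and the `tolerance_pct` parameter (how closely query growth must track credit growth to count as proportional) are hypothetical and not part of the audit definition:

```python
def pct_change_wow(current: float, previous: float) -> float:
    """Percent change from last week's total to this week's total."""
    if previous <= 0:
        raise ValueError("previous week's total must be positive")
    return (current - previous) * 100.0 / previous


def credit_burn_is_runaway(
    credits_now: float,
    credits_prev: float,
    queries_now: int,
    queries_prev: int,
    threshold_pct: float = 50.0,  # critical threshold from the list above
    tolerance_pct: float = 10.0,  # assumption: growth within 10 points is "proportional"
) -> bool:
    """Flag a credit spike that query growth does not explain."""
    burn = pct_change_wow(credits_now, credits_prev)
    query_growth = pct_change_wow(queries_now, queries_prev)
    return burn > threshold_pct and (burn - query_growth) > tolerance_pct
```

With these assumptions, a 60% credit increase alongside 2% query growth is flagged, while the same increase alongside 58% query growth is not.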
Performance & Queueing
- Query latency p95 >5 seconds sustained 15m (measure from INFORMATION_SCHEMA; ACCOUNT_USAGE data lags by 45min or more)
- Query queue depth >5 queries sustained 10m per warehouse (warehouse undersized; queries backing up)
- Slow-query rate >5% of total queries (TOTAL_ELAPSED_TIME > 5000ms in QUERY_HISTORY)
- Query error rate spike >1% in 1h window (syntax errors, permission failures, resource exhaustion)
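
As an illustration of the latency and slow-query signals above, the sketch below computes a p95 and a slow-query percentage from a list of elapsed times in milliseconds. It assumes the values were already fetched from QUERY_HISTORY (TOTAL_ELAPSED_TIME), and it uses the nearest-rank percentile method for simplicity, which can differ slightly from an interpolated percentile computed in SQL. The function names are hypothetical:

```python
import math


def percentile_nearest_rank(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def slow_query_pct(elapsed_ms: list[float], slow_threshold_ms: float = 5000) -> float:
    """Share of queries whose elapsed time exceeds the slow threshold, in percent."""
    if not elapsed_ms:
        raise ValueError("elapsed_ms must not be empty")
    slow = sum(1 for t in elapsed_ms if t > slow_threshold_ms)
    return 100.0 * slow / len(elapsed_ms)
```

For a sample of 94 queries at 100 ms and 6 queries at 6000 ms, this gives a p95 of 6000 ms and a slow-query rate of 6%, both past the critical thresholds.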
Capacity & Saturation
- Warehouse saturation >90% (running_queries / max_concurrency_level sustained; need upsize or multi-cluster)
- Disk usage >90% (database/schema storage approaching quota; table growth outpacing cleanup)
- Active sessions climbing sustained (connection pool pressure from application retry loops)
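
The saturation ratio and the "sustained" qualifier used above can be sketched as follows. This assumes evenly spaced samples, so that a duration such as 10m maps to a fixed number of consecutive samples. The function names and the sample-count parameter are hypothetical:

```python
def saturation_pct(running_queries: int, max_concurrency_level: int) -> float:
    """Warehouse saturation as running_queries / max_concurrency_level, in percent."""
    if max_concurrency_level <= 0:
        raise ValueError("max_concurrency_level must be positive")
    return 100.0 * running_queries / max_concurrency_level


def sustained_above(samples: list[float], threshold: float, min_samples: int) -> bool:
    """True only if each of the most recent min_samples readings exceeds the threshold."""
    if min_samples <= 0:
        raise ValueError("min_samples must be positive")
    if len(samples) < min_samples:
        return False  # not enough history to call it sustained
    return all(value > threshold for value in samples[-min_samples:])
```

Requiring every recent sample to exceed the threshold keeps a single spike from triggering the alert; a looser rule (for example, a majority of samples) is an equally reasonable design choice.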
Replication & Backup
- Cross-account replication lag >10 seconds (DATABASE_REPLICATION_USAGE_HISTORY measurement; stale replica risk)
- Time Travel snapshot >72 hours old (retention floor violation; recovery window shrinking)
- Failed login attempts >10 in 24h (LOGIN_HISTORY spike; brute force or stale credentials)
Severity thresholds
| Signal | Warn | Critical |
|---|---|---|
| credit_burn_pct_wow | 25 | 50 |
| idle_warehouse_credits_pct | 5 | 10 |
| cost_per_query_pct_wow | 15 | 25 |
| query_latency_p95_ms | 2000 | 5000 |
| query_queue_depth_sustained | 3 | 5 |
| slow_query_pct | 2 | 5 |
| warehouse_saturation_pct | 75 | 90 |
| disk_usage_pct | 75 | 90 |
| replication_lag_seconds | 5 | 10 |
| time_travel_age_hours | 48 | 72 |
| failed_login_count_24h | 5 | 10 |
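
A minimal sketch of how the thresholds above could be applied to a measured value. The threshold numbers are copied from the table. The strict greater-than comparison and the `classify` function name are assumptions, since the table does not say whether a value exactly at a threshold should trigger:

```python
# (warn, critical) pairs copied from the severity table.
THRESHOLDS: dict[str, tuple[float, float]] = {
    "credit_burn_pct_wow": (25, 50),
    "idle_warehouse_credits_pct": (5, 10),
    "cost_per_query_pct_wow": (15, 25),
    "query_latency_p95_ms": (2000, 5000),
    "query_queue_depth_sustained": (3, 5),
    "slow_query_pct": (2, 5),
    "warehouse_saturation_pct": (75, 90),
    "disk_usage_pct": (75, 90),
    "replication_lag_seconds": (5, 10),
    "time_travel_age_hours": (48, 72),
    "failed_login_count_24h": (5, 10),
}


def classify(signal: str, value: float) -> str:
    """Return "ok", "warn", or "critical" for a signal reading."""
    if signal not in THRESHOLDS:
        raise KeyError(f"unknown signal: {signal}")
    warn, critical = THRESHOLDS[signal]
    # Assumption: a value must strictly exceed a threshold to trigger it.
    if value > critical:
        return "critical"
    if value > warn:
        return "warn"
    return "ok"
```

Under the strict comparison, a p95 latency of exactly 5000 ms classifies as "warn" rather than "critical".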
Data sources
- Credit burn and warehouse cost trends (45min-3hr latency)
- Latency percentiles, queue depth, errors, elapsed time (45min-3hr latency)
- Real-time 7-day query performance; use for p95/p99 over ACCOUNT_USAGE when fresh data needed
- Per-warehouse credit consumption and idle-time detection
- Disk capacity, database-level and table-level growth trends
- Replication lag per database and target account
- Failed login detection; bursts indicate brute force or credential rotation issues