What this audit checks
Authentication & access
- API key valid (auth on /v2/getAccountDetails)
- Key type is a Main / account-level key, not a single-monitor key (full coverage)
- Monitor quota headroom > 15% (account up_monitors + down_monitors + paused_monitors vs plan monitor_limit)
- Request rate within the account-tier window (avoid 429 throttling)
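The quota-headroom check can be sketched as a small function over the `account` object returned by getAccountDetails. The field names (`monitor_limit`, `up_monitors`, `down_monitors`, `paused_monitors`) are assumed to match the v2 response and should be verified against a live reply. Counting down and paused monitors toward usage is also an assumption of this sketch.

```python
def quota_headroom_pct(account: dict) -> float:
    """Return the unused share of the monitor quota, as a percentage."""
    limit = account["monitor_limit"]
    if limit <= 0:
        raise ValueError("monitor_limit must be positive")
    # Assumption: every monitor counts against the plan, whatever its state.
    used = (
        account.get("up_monitors", 0)
        + account.get("down_monitors", 0)
        + account.get("paused_monitors", 0)
    )
    return 100.0 * (limit - used) / limit


def headroom_ok(account: dict, min_headroom_pct: float = 15.0) -> bool:
    """True when headroom is strictly above the audit threshold."""
    return quota_headroom_pct(account) > min_headroom_pct


# Illustrative values only, not real account data.
sample = {"monitor_limit": 50, "up_monitors": 38,
          "down_monitors": 2, "paused_monitors": 4}
print(quota_headroom_pct(sample))  # 12.0
print(headroom_ok(sample))         # False
```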
Uptime & SLA
- Any monitor fully down (status = down)
- Any monitor seems_down (intermittent / region-specific failure)
- Uptime ratio below SLA target (default < 99.5%) over the period
- SSL certificate expiring within 14 days (ssl_expiry_days)
- Monitors paused for more than 7 days and never re-enabled (silent blind spot)
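A minimal sketch of the status, SLA, and SSL checks above, run over monitor records shaped like getMonitors output. It assumes the v2 numeric status codes (8 = seems down, 9 = down), a single-period `custom_uptime_ratio` returned as a string, and a pre-computed `ssl_expiry_days` field named after the checklist signal. The function name `uptime_findings` is hypothetical.

```python
STATUS_SEEMS_DOWN = 8  # assumed v2 status code
STATUS_DOWN = 9        # assumed v2 status code


def uptime_findings(monitors, sla_target_pct=99.5, ssl_min_days=14):
    """Return (monitor name, finding) pairs for the uptime checks."""
    findings = []
    for m in monitors:
        name = m.get("friendly_name", str(m.get("id")))
        status = m.get("status")
        if status == STATUS_DOWN:
            findings.append((name, "down"))
        elif status == STATUS_SEEMS_DOWN:
            findings.append((name, "seems_down"))
        ratio = m.get("custom_uptime_ratio")
        if ratio is not None and float(ratio) < sla_target_pct:
            findings.append((name, "below_sla"))
        days = m.get("ssl_expiry_days")
        if days is not None and days <= ssl_min_days:
            findings.append((name, "ssl_expiring"))
    return findings


# Illustrative records only.
monitors = [
    {"friendly_name": "storefront", "status": 2,
     "custom_uptime_ratio": "99.980", "ssl_expiry_days": 60},
    {"friendly_name": "checkout", "status": 9,
     "custom_uptime_ratio": "99.100", "ssl_expiry_days": 10},
    {"friendly_name": "blog", "status": 8, "custom_uptime_ratio": "99.700"},
]
for finding in uptime_findings(monitors):
    print(finding)
```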
Performance & latency
- Average response time above 1500ms sustained
- p95 response time above 1500ms
- p99 response time above 3000ms
- Apdex below 0.85 (frustrated-experience threshold)
- Throughput dropped > 30% vs prior period (capacity / outage signal)
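For reference, the latency signals above can be derived from raw response-time samples roughly as follows. Apdex uses the standard formula (satisfied + tolerating / 2) / total, where tolerating means between T and 4T. The target `t_ms = 500` and the nearest-rank percentile method are assumptions of this sketch, since the checklist does not state either.

```python
import math


def apdex(samples_ms, t_ms=500.0):
    """Apdex score in [0, 1] for a target threshold T (t_ms)."""
    if not samples_ms:
        raise ValueError("no samples")
    satisfied = sum(1 for s in samples_ms if s <= t_ms)
    tolerating = sum(1 for s in samples_ms if t_ms < s <= 4 * t_ms)
    return (satisfied + tolerating / 2) / len(samples_ms)


def percentile(samples_ms, pct):
    """Nearest-rank percentile: smallest value covering pct% of samples."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Illustrative samples only.
samples = [120, 180, 240, 300, 450, 700, 900, 1600, 2100, 3200]
print(apdex(samples))           # 0.65
print(percentile(samples, 95))  # 3200
```

With these samples the Apdex of 0.65 would fall below the 0.85 threshold, and the p95 of 3200 ms would exceed the 1500 ms limit.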
Incident response & coverage (the blind-spot test)
- MTTA above 15 min (alerts firing but nobody acknowledging)
- MTTR above 60 min (slow recovery)
- Monitors without an alert contact attached (fires silently)
- Monitors flapping repeatedly in 24h (noisy threshold / real instability)
- Critical path uncovered - no monitor on the storefront / checkout URL
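One way to compute MTTA, MTTR, and the silent-monitor list is sketched below. The incident record shape (`started`, `acknowledged`, `resolved`) is hypothetical, since acknowledgement times typically come from an incident tool rather than the monitor API. The `alert_contacts` field on each monitor is assumed to be present when requested from getMonitors. Results are in milliseconds to match `mtta_ms` and `mttr_ms`.

```python
from datetime import datetime
from statistics import mean


def mean_delta_ms(incidents, start_key, end_key):
    """Mean time between two timestamps, in ms; None if nothing to average."""
    deltas = [
        (i[end_key] - i[start_key]).total_seconds() * 1000
        for i in incidents
        if i.get(end_key) is not None
    ]
    return mean(deltas) if deltas else None


def silent_monitors(monitors):
    """Names of monitors with no alert contact attached."""
    return [m["friendly_name"] for m in monitors if not m.get("alert_contacts")]


# Illustrative incidents only.
incidents = [
    {"started": datetime(2024, 5, 1, 10, 0),
     "acknowledged": datetime(2024, 5, 1, 10, 10),
     "resolved": datetime(2024, 5, 1, 10, 40)},
    {"started": datetime(2024, 5, 2, 14, 0),
     "acknowledged": datetime(2024, 5, 2, 14, 30),
     "resolved": datetime(2024, 5, 2, 15, 40)},
]
mtta_ms = mean_delta_ms(incidents, "started", "acknowledged")
mttr_ms = mean_delta_ms(incidents, "started", "resolved")
print(mtta_ms)  # 1200000.0 (20 min, above the 15 min limit)
print(mttr_ms)  # 4200000.0 (70 min, above the 60 min limit)
```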
Cross-channel: revenue-at-risk (the killer area)
- Storefront monitor down while a sibling commerce connector is live = compute revenue lost in $ (commerce.revenue_per_min × down_minutes × estimated_traffic_loss_pct)
- Checkout / cart URL monitor degraded (response > 3s) during peak hours
- Monitor down during a campaign push (sibling = google_ads / klaviyo) - paying for traffic that can’t load the page
- Conversion drop during outage windows (vs 90-day commerce baseline)
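The revenue-at-risk formula from the first item in this list can be written as a short worked example. It assumes `estimated_traffic_loss_pct` is expressed on a 0 to 100 scale, as the `_pct` suffix suggests. The function name and the sample figures are illustrative, not real data.

```python
def revenue_at_risk(revenue_per_min: float,
                    down_minutes: float,
                    estimated_traffic_loss_pct: float) -> float:
    """Estimated revenue lost during an outage, in the revenue currency."""
    if not 0 <= estimated_traffic_loss_pct <= 100:
        raise ValueError("estimated_traffic_loss_pct must be in [0, 100]")
    if down_minutes < 0:
        raise ValueError("down_minutes must be non-negative")
    return revenue_per_min * down_minutes * estimated_traffic_loss_pct / 100


# $42/min storefront, 18 minutes down, 80% of traffic unable to load.
print(revenue_at_risk(42.0, 18, 80))  # 604.8
```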
Severity thresholds
| Signal | Warn | Critical |
|---|---|---|
| apdex | 0.9 | 0.85 |
| error_rate_pct | 1 | 2 |
| avg_response_ms | 1000 | 1500 |
| p95_latency_ms | 1000 | 1500 |
| p99_latency_ms | 1500 | 3000 |
| sla_compliance_pct | 99.9 | 99.5 |
| services_down_count | 0 | 1 |
| services_degraded_count | 1 | 2 |
| incidents_open_count | 1 | 3 |
| ssl_expiry_days | 30 | 14 |
| mtta_ms | 300000 | 900000 |
| mttr_ms | 1800000 | 3600000 |
| throughput_change_pct_vsP | -15 | -30 |
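A possible way to apply this table in code is sketched below. The direction of each signal is inferred from the table itself: when the critical value is lower than the warn value, lower readings are treated as worse. Inclusive comparison at the boundary is an assumption, since the table does not say whether thresholds are strict. The `classify` helper is hypothetical.

```python
# (warn, critical) pairs copied from the severity table.
THRESHOLDS = {
    "apdex": (0.9, 0.85),
    "error_rate_pct": (1, 2),
    "avg_response_ms": (1000, 1500),
    "p95_latency_ms": (1000, 1500),
    "p99_latency_ms": (1500, 3000),
    "sla_compliance_pct": (99.9, 99.5),
    "services_down_count": (0, 1),
    "services_degraded_count": (1, 2),
    "incidents_open_count": (1, 3),
    "ssl_expiry_days": (30, 14),
    "mtta_ms": (300000, 900000),
    "mttr_ms": (1800000, 3600000),
    "throughput_change_pct_vsP": (-15, -30),
}


def classify(signal: str, value: float) -> str:
    """Return 'ok', 'warn', or 'critical' for one signal reading."""
    warn, critical = THRESHOLDS[signal]
    if critical < warn:  # lower is worse (apdex, SLA, SSL days, throughput)
        if value <= critical:
            return "critical"
        if value <= warn:
            return "warn"
    else:  # higher is worse (latency, error rate, counts, MTTA/MTTR)
        if value >= critical:
            return "critical"
        if value >= warn:
            return "warn"
    return "ok"


print(classify("apdex", 0.87))                     # warn
print(classify("p99_latency_ms", 3200))            # critical
print(classify("ssl_expiry_days", 45))             # ok
print(classify("throughput_change_pct_vsP", -20))  # warn
```

Under the inclusive assumption, a warn threshold of 0 (as for services_down_count) would fire on every run, so count-type signals may need a strict comparison instead.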
Data sources
- POST https://api.uptimerobot.com/v2/getAccountDetails - Auth + key sanity + monitor quota headroom
- POST https://api.uptimerobot.com/v2/getMonitors - Monitor inventory + up/down/paused states + response_times + logs + ssl + tags
- POST https://api.uptimerobot.com/v2/getAlertContacts - Alert-contact inventory for silent-monitor (no-contact) coverage checks
- POST https://api.uptimerobot.com/v2/getMWindows - Maintenance windows - exclude planned downtime from SLA + incident math
- POST https://api.uptimerobot.com/v2/getPSPs - Public status pages - verify a status page exists for the storefront
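Calling these endpoints might look like the sketch below, which uses only the Python standard library. It assumes the v2 API accepts a form-encoded POST body with `api_key` and `format=json`, and that getMonitors takes optional flags such as `response_times`, `logs`, `ssl`, and `alert_contacts`. Check these parameter names against the official API reference before relying on them. The helper names `build_request` and `call` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.uptimerobot.com/v2"


def build_request(endpoint: str, api_key: str, **params) -> urllib.request.Request:
    """Build (but do not send) a form-encoded POST for one v2 endpoint."""
    body = urllib.parse.urlencode(
        {"api_key": api_key, "format": "json", **params}
    ).encode("ascii")
    return urllib.request.Request(
        f"{BASE_URL}/{endpoint}",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def call(endpoint: str, api_key: str, **params) -> dict:
    """Send the request and decode the JSON reply."""
    with urllib.request.urlopen(build_request(endpoint, api_key, **params),
                                timeout=30) as resp:
        return json.load(resp)


# Build a getMonitors request with the optional detail flags (assumed names).
req = build_request("getMonitors", "YOUR_API_KEY",
                    response_times=1, logs=1, ssl=1, alert_contacts=1)
print(req.full_url)  # https://api.uptimerobot.com/v2/getMonitors
```

Keeping request construction separate from sending makes the payload easy to inspect without hitting the rate limit.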