Pingdom uptime data is only worth money when it’s joined to revenue. This audit answers four questions: (1) is the storefront and its critical paths actually up right now, (2) are we meeting the uptime SLA this period, (3) when something breaks, how fast do we acknowledge and resolve it (MTTA / MTTR), and (4) when a probe is down in a region, how much commerce revenue, as reported by the sibling ecommerce connector, is on fire per minute?

What this audit checks

Authentication & access

  • API token valid (auth on /api/3.1/credits)
  • X-Pingdom-Account header correct for multi-user sub-accounts
  • Check / TMS quota headroom > 15% (credits endpoint)
  • Token scope covers checks + actions + summary endpoints

Uptime & SLA

  • Any check fully down (status = down / unconfirmed_down)
  • Region-specific outage - probe failing only from some locations (CDN / DNS issue)
  • Uptime below SLA target (default < 99.5%) over the period
  • Degraded checks - slow or intermittent but not fully down
  • Paused checks not re-enabled after > 7 days (silent blind spot)
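
The SLA comparison above can be sketched as follows. This is a minimal illustration, not the audit's actual implementation: it assumes outage windows arrive as a list of states with `status`, `timefrom` and `timeto` (Unix seconds), which is the shape expected from the outage summary endpoint but should be verified against a real response. The function names are hypothetical.

```python
from typing import Iterable, Mapping

SLA_TARGET_PCT = 99.5  # default SLA target named in the checklist above


def uptime_pct(states: Iterable[Mapping]) -> float:
    """Percentage of monitored time spent 'up'.

    Assumed state shape (verify against a real response):
    {"status": "up" | "down" | "unknown", "timefrom": int, "timeto": int}
    Time in any status other than 'up' or 'down' is left out of the total.
    """
    up = down = 0
    for state in states:
        duration = max(0, state["timeto"] - state["timefrom"])
        if state["status"] == "up":
            up += duration
        elif state["status"] == "down":
            down += duration
    total = up + down
    # With no measured time there is nothing to breach, so report 100%.
    return 100.0 if total == 0 else 100.0 * up / total


def breaches_sla(states: Iterable[Mapping], target_pct: float = SLA_TARGET_PCT) -> bool:
    """True when measured uptime is below the SLA target."""
    return uptime_pct(states) < target_pct


states = [
    {"status": "up", "timefrom": 0, "timeto": 9900},
    {"status": "down", "timefrom": 9900, "timeto": 10000},
]
print(uptime_pct(states), breaches_sla(states))
```

Excluding unknown time from the denominator is a design choice; counting it as downtime gives a stricter reading of the same SLA.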

Performance & latency

  • Average response time above 1500ms sustained
  • p95 response time above 1500ms
  • p99 response time above 3000ms
  • Apdex below 0.85 (frustrated-experience threshold)
  • Throughput dropped > 30% vs prior period (capacity / outage signal)
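
As a rough sketch of how the percentile and Apdex signals above could be derived from raw response times: the snippet uses a nearest-rank percentile and the standard Apdex formula (satisfied + tolerating / 2) / total, where a sample is "satisfied" at or below T and "tolerating" up to 4T. The target T of 500 ms is an assumption for illustration; the audit does not state which T it uses.

```python
import math
from typing import Sequence


def percentile(samples_ms: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of response times in milliseconds."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def apdex(samples_ms: Sequence[float], t_ms: float = 500) -> float:
    """Apdex score: satisfied <= T, tolerating <= 4T, frustrated above 4T."""
    if not samples_ms:
        raise ValueError("no samples")
    satisfied = sum(1 for v in samples_ms if v <= t_ms)
    tolerating = sum(1 for v in samples_ms if t_ms < v <= 4 * t_ms)
    return (satisfied + tolerating / 2) / len(samples_ms)


samples = [100, 200, 300, 400, 600, 800, 1000, 1500, 2500, 3000]
print(percentile(samples, 95), apdex(samples))
```

With these samples, four requests are satisfied, four are tolerating and two are frustrated, so the score is 0.6, well below the 0.85 threshold.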

Incident response & coverage (the blind-spot test)

  • MTTA above 15 min (alerts firing but nobody acknowledging)
  • MTTR above 60 min (slow recovery)
  • Checks without a notification / integration channel (fires silently)
  • Checks flapping repeatedly in 24h (noisy threshold / real instability)
  • Critical path uncovered - no check on the storefront / checkout URL
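
MTTA and MTTR reduce to mean intervals between incident timestamps. The sketch below assumes each incident has already been reduced to a record with hypothetical keys `triggered_at`, `acknowledged_at` and `resolved_at` (Unix seconds, `None` when the event has not happened yet); how those records are assembled from alert events is not shown here.

```python
from statistics import mean
from typing import Mapping, Optional, Sequence

MTTA_CRITICAL_MS = 15 * 60 * 1000  # 15 min, from the checklist above
MTTR_CRITICAL_MS = 60 * 60 * 1000  # 60 min, from the checklist above


def mean_interval_ms(
    incidents: Sequence[Mapping], start_key: str, end_key: str
) -> Optional[float]:
    """Mean time between two timestamps, in milliseconds.

    Incidents whose end timestamp is missing are skipped rather than
    counted as zero, so open incidents do not flatter the average.
    """
    spans = [
        (i[end_key] - i[start_key]) * 1000
        for i in incidents
        if i.get(end_key) is not None
    ]
    return mean(spans) if spans else None


incidents = [
    {"triggered_at": 0, "acknowledged_at": 600, "resolved_at": 3000},
    {"triggered_at": 1000, "acknowledged_at": 2200, "resolved_at": 5800},
    {"triggered_at": 10000, "acknowledged_at": None, "resolved_at": None},
]
mtta = mean_interval_ms(incidents, "triggered_at", "acknowledged_at")
mttr = mean_interval_ms(incidents, "triggered_at", "resolved_at")
print(mtta, mttr, mttr > MTTR_CRITICAL_MS)
```

Skipping unacknowledged incidents keeps the mean honest but hides them; counting how many records were skipped is a sensible companion signal.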

Cross-channel: revenue-at-risk (the killer area)

  • Storefront check down with sibling commerce connector live = compute revenue lost in $ (commerce.revenue_per_min × down_minutes × estimated_traffic_loss_pct)
  • Checkout / cart URL check degraded (p95 > 3s) during peak hours
  • Probe failing in a region during a campaign push (sibling = google_ads / klaviyo) - paying for traffic that can’t load the page
  • Conversion drop during outage windows (vs 90D commerce baseline)
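
The revenue-at-risk formula from the first bullet can be worked through directly. This sketch assumes `estimated_traffic_loss_pct` is expressed on a 0 to 100 scale (the source does not say whether it is a percentage or a fraction), and the function name is hypothetical.

```python
def revenue_at_risk(
    revenue_per_min: float,
    down_minutes: float,
    estimated_traffic_loss_pct: float,
) -> float:
    """Estimated revenue lost over an outage window.

    revenue_per_min            : commerce baseline, currency units per minute
    down_minutes               : length of the outage window in minutes
    estimated_traffic_loss_pct : share of traffic affected, assumed 0-100
    """
    if not 0 <= estimated_traffic_loss_pct <= 100:
        raise ValueError("estimated_traffic_loss_pct must be between 0 and 100")
    return revenue_per_min * down_minutes * estimated_traffic_loss_pct / 100


# A regional outage: 120 per minute baseline, 12 minutes down, 40% of traffic hit.
print(revenue_at_risk(120.0, 12, 40))
```

For a full outage from every probe location the loss percentage is 100 and the estimate collapses to baseline revenue per minute times minutes down.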

Severity thresholds

Signal                       Warn       Critical
apdex                        0.9        0.85
error_rate_pct               1          2
avg_response_ms              1000       1500
p95_latency_ms               1000       1500
p99_latency_ms               1500       3000
sla_compliance_pct           99.9       99.5
services_down_count          0          1
services_degraded_count      1          2
incidents_open_count         1          3
mtta_ms                      300000     900000
mttr_ms                      1800000    3600000
throughput_change_pct_vsP    -15        -30
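
One way to apply the thresholds above is a small classifier. Two things here are assumptions rather than documented behaviour: the direction of each signal is inferred from whether the critical value sits above or below the warn value, and a threshold counts as crossed only when the value is strictly past it, following the "above" / "below" wording used in the checklist.

```python
# (warn, critical) pairs copied from the severity table above.
THRESHOLDS = {
    "apdex": (0.9, 0.85),
    "error_rate_pct": (1, 2),
    "avg_response_ms": (1000, 1500),
    "p95_latency_ms": (1000, 1500),
    "p99_latency_ms": (1500, 3000),
    "sla_compliance_pct": (99.9, 99.5),
    "services_down_count": (0, 1),
    "services_degraded_count": (1, 2),
    "incidents_open_count": (1, 3),
    "mtta_ms": (300000, 900000),
    "mttr_ms": (1800000, 3600000),
    "throughput_change_pct_vsP": (-15, -30),
}


def classify(signal: str, value: float) -> str:
    """Return 'ok', 'warn' or 'critical' for one signal reading."""
    warn, critical = THRESHOLDS[signal]
    if critical >= warn:
        # Higher is worse (latency, counts, MTTA / MTTR).
        if value > critical:
            return "critical"
        if value > warn:
            return "warn"
    else:
        # Lower is worse (apdex, SLA compliance, throughput change).
        if value < critical:
            return "critical"
        if value < warn:
            return "warn"
    return "ok"


print(classify("apdex", 0.87), classify("p99_latency_ms", 3200))
```

Under the strict reading, `services_down_count` warns at one check down and goes critical at two; if the audit treats thresholds as inclusive, the comparisons would need `>=` and `<=` instead.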

Data sources

  • GET https://api.pingdom.com/api/3.1/credits - Auth + token sanity + quota headroom
  • GET https://api.pingdom.com/api/3.1/checks - Check inventory + current up/down/paused states + tags
  • GET https://api.pingdom.com/api/3.1/summary.performance/{checkid} - Response-time percentiles + uptime per check
  • GET https://api.pingdom.com/api/3.1/summary.outage/{checkid} - Outage windows for SLA compliance + incident duration
  • GET https://api.pingdom.com/api/3.1/actions - Alert events for MTTA + top-alerting + acknowledgement
  • GET https://api.pingdom.com/api/3.1/results/{checkid} - Raw probe results for top error-type clustering
  • GET https://api.pingdom.com/api/3.1/tms/checks - Transaction (multi-step) checks for critical-path coverage
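
As an illustration of how the calls above might be assembled, the sketch below builds (but does not send) one request per data source using only the standard library. It assumes bearer-token authentication, which is what the Pingdom 3.1 API uses; the helper names are hypothetical, and the multi-user account header mentioned in the checklist is deliberately left out because its exact form should be confirmed against the API reference.

```python
import urllib.request
from typing import Iterable, List

BASE_URL = "https://api.pingdom.com/api/3.1"


def build_request(path: str, token: str) -> urllib.request.Request:
    """Prepare an authenticated GET request for one API path."""
    url = f"{BASE_URL}/{path.lstrip('/')}"
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers)


def audit_requests(token: str, check_ids: Iterable[int]) -> List[urllib.request.Request]:
    """One request per account-level source, plus three per check."""
    paths = ["credits", "checks", "actions", "tms/checks"]
    for check_id in check_ids:
        paths += [
            f"summary.performance/{check_id}",
            f"summary.outage/{check_id}",
            f"results/{check_id}",
        ]
    return [build_request(path, token) for path in paths]


requests_to_send = audit_requests("TOKEN", [42])
for req in requests_to_send:
    print(req.get_method(), req.full_url)
```

Each prepared request can be sent with `urllib.request.urlopen(req)`; calling `credits` first gives a cheap token sanity check before the per-check calls spend rate limit.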