Atlassian Statuspage audit profile, Vortex IQ

A status page that lies to customers is worse than no status page. This audit answers four questions: (1) does the public status page match reality right now, or are there components down with no published incident, or incidents open on healthy components; (2) are the incidents we publish handled with discipline (acknowledged and resolved within target, not left stale); (3) are we keeping the uptime SLA the page advertises; and (4) when a component is down while a commerce sibling is live, how much money is on fire per minute.

What this audit checks

Authentication & access

API key valid (auth on GET /v1/pages - returns the account’s pages)
page_id resolves to a page the key can read
Key has read scope on components, incidents, and metrics
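
The auth probe above could look roughly like the sketch below. It assumes the key is sent as `Authorization: OAuth <key>` and that each entry returned by GET /v1/pages carries an `id` field; the helper names `build_pages_request`, `page_is_readable`, and `fetch_pages` are hypothetical and not part of the audit itself.

```python
import json
import urllib.request

API_BASE = "https://api.statuspage.io/v1"


def build_pages_request(api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET /v1/pages auth probe."""
    return urllib.request.Request(
        f"{API_BASE}/pages",
        headers={
            # Assumption: OAuth-style header carrying the raw API key.
            "Authorization": f"OAuth {api_key}",
            "Accept": "application/json",
        },
        method="GET",
    )


def page_is_readable(pages: list, page_id: str) -> bool:
    """True if page_id is among the pages the key can list."""
    return any(p.get("id") == page_id for p in pages)


def fetch_pages(api_key: str) -> list:
    """Send the probe. urlopen raises urllib.error.HTTPError on a 4xx/5xx,
    so a rejected key surfaces as an exception rather than an empty list."""
    with urllib.request.urlopen(build_pages_request(api_key), timeout=10) as resp:
        return json.load(resp)
```

Keeping request construction separate from the network call makes the first two checks easy to verify offline; the read-scope check would repeat the same pattern against the components, incidents, and metrics endpoints.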

Status-page truthfulness (the customer-trust test)

Components in partial_outage / major_outage with NO open incident published (page is lying)
Open incident referencing components that are all operational (stale incident, page over-reporting)
Components stuck in under_maintenance > 24h (forgotten maintenance window)
only_show_if_degraded components that are degraded but hidden from the public page
System metrics with no data in > 60 min (stale performance display)
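
The first two truthfulness checks amount to a join between the component inventory and the unresolved incidents. The sketch below is one way to express that, under two assumptions: each unresolved incident exposes a `components` list of objects with an `id`, and "no open incident" is read as "no unresolved incident that lists the component". The function name `truthfulness_gaps` is hypothetical.

```python
OUTAGE_STATUSES = {"partial_outage", "major_outage"}


def truthfulness_gaps(components, unresolved_incidents):
    """Return (silent_outages, stale_incidents) as lists of ids.

    silent_outages: components in an outage status that no unresolved
                    incident references (the page is under-reporting).
    stale_incidents: unresolved incidents whose referenced components
                     are all currently operational (over-reporting).
    """
    status_by_id = {c["id"]: c["status"] for c in components}

    # Every component id mentioned by at least one unresolved incident.
    covered = {
        c["id"]
        for inc in unresolved_incidents
        for c in inc.get("components", [])
    }

    silent_outages = [
        c["id"]
        for c in components
        if c["status"] in OUTAGE_STATUSES and c["id"] not in covered
    ]

    # Incidents with no components attached are skipped, not flagged.
    stale_incidents = [
        inc["id"]
        for inc in unresolved_incidents
        if inc.get("components")
        and all(
            status_by_id.get(c["id"]) == "operational"
            for c in inc["components"]
        )
    ]
    return silent_outages, stale_incidents
```

Current component status is taken from the components inventory rather than from the snapshot embedded in the incident, so the comparison reflects what the public page shows now.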

Incident hygiene

Incidents in investigating state > 30 min without an identified/monitoring update
Mean time to acknowledge > 15 min (slow first update)
Mean time to resolve > 60 min (slow recovery)
Repeated incidents on the same component in 24h (noisy or unstable)
Top alerting components concentrating > 50% of incidents
Major/critical-impact incidents resolved without a postmortem
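
As an illustration of the timing checks above, the sketch below computes mean durations in milliseconds (matching the `mtta_ms` / `mttr_ms` signals) and flags incidents stuck in `investigating`. The field names `created_at`, `resolved_at`, and `status` are assumed to be present on each incident; `first_update_at` is a hypothetical field the caller would derive from the incident's update history, since the source does not define exactly what counts as the acknowledgement.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean


def _ts(value: str) -> datetime:
    # Normalise a trailing "Z" so fromisoformat also works before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def mean_ms(incidents, start_key, end_key):
    """Mean duration in ms between two timestamp fields.

    Incidents missing either field are ignored; returns None if none qualify.
    """
    spans = [
        (_ts(i[end_key]) - _ts(i[start_key])).total_seconds() * 1000
        for i in incidents
        if i.get(start_key) and i.get(end_key)
    ]
    return mean(spans) if spans else None


def stale_investigating(incidents, now, limit=timedelta(minutes=30)):
    """Ids of incidents still in 'investigating' more than `limit` after creation."""
    return [
        i["id"]
        for i in incidents
        if i.get("status") == "investigating"
        and now - _ts(i["created_at"]) > limit
    ]
```

With these helpers, MTTA would be `mean_ms(incidents, "created_at", "first_update_at")` and MTTR would be `mean_ms(incidents, "created_at", "resolved_at")`.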

Reliability & SLA health

Component-group availability below SLA target (< 99.5% rolling 30D)
More than one component in major_outage concurrently (correlated outage)
Components in degraded_performance sustained > 15 min
Apdex below 0.85 / p95 above 1500ms on published system metrics
Error rate above 2% on published system metrics
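
To make the 99.5% rolling-30D target concrete, the arithmetic can be sketched as below. Downtime minutes are taken as an input here; how downtime is derived from component status history is not specified in this profile, so that part is left to the caller. Both function names are hypothetical.

```python
def availability_pct(downtime_minutes: float, window_days: int = 30) -> float:
    """Availability over a rolling window, as a percentage."""
    total_minutes = window_days * 24 * 60
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def downtime_budget_minutes(target_pct: float, window_days: int = 30) -> float:
    """Minutes of downtime the window can absorb before dropping below target."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (100.0 - target_pct) / 100.0
```

A 30-day window is 43,200 minutes, so a 99.5% target leaves a budget of 216 minutes of downtime, and the 99.9% warn level leaves roughly 43 minutes.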

Cross-channel: revenue-at-risk (the killer area)

Component in major_outage with sibling commerce connector live = compute $/min lost (commerce.revenue_per_min × estimated_traffic_loss_pct) and total $ lost for the outage ($/min lost × outage_minutes)
Checkout / cart component degraded or down during peak commerce hours
Outage window overlapping a campaign push (sibling = google_ads / klaviyo) - paying for traffic that lands on a broken page
Conversion drop during published-incident windows (vs 90D baseline)
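
The revenue-at-risk calculation in the first item of this list can be sketched as follows. It assumes `estimated_traffic_loss_pct` is expressed on a 0 to 100 scale and that revenue per minute comes from the sibling commerce connector; the function name `revenue_at_risk` is hypothetical.

```python
def revenue_at_risk(revenue_per_min: float,
                    outage_minutes: float,
                    estimated_traffic_loss_pct: float):
    """Return (loss_per_minute, total_loss) for one outage window.

    estimated_traffic_loss_pct is assumed to be 0-100, not a 0-1 fraction.
    """
    loss_per_minute = revenue_per_min * estimated_traffic_loss_pct / 100
    total_loss = loss_per_minute * outage_minutes
    return loss_per_minute, total_loss
```

For example, a store earning $200 per minute that loses an estimated 50% of traffic during a 30-minute major outage is losing $100 per minute, or $3,000 over the window.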

Severity thresholds

Signal	Warn	Critical
`availability_pct`	99.9	99.5
`incidents_open_count`	1	3
`mtta_ms`	300000	900000
`mttr_ms`	1800000	3600000
`components_major_outage_count`	1	1
`components_degraded_count`	1	2
`untruthful_components_count`	0	1
`error_rate_pct`	1	2
`p95_latency_ms`	800	1500
`apdex`	0.9	0.85
`metric_staleness_min`	30	60
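
One way to read the table is as a per-signal classifier. The sketch below copies the thresholds verbatim and makes two assumptions that the table itself does not state: `availability_pct` and `apdex` breach when the value falls below the threshold while every other signal breaches when it rises above it, and comparisons are strict (mirroring the "> 15 min" and "< 99.5%" wording of the checks). The `severity` function is hypothetical.

```python
# signal: (warn, critical), copied from the table above.
THRESHOLDS = {
    "availability_pct": (99.9, 99.5),
    "incidents_open_count": (1, 3),
    "mtta_ms": (300000, 900000),
    "mttr_ms": (1800000, 3600000),
    "components_major_outage_count": (1, 1),
    "components_degraded_count": (1, 2),
    "untruthful_components_count": (0, 1),
    "error_rate_pct": (1, 2),
    "p95_latency_ms": (800, 1500),
    "apdex": (0.9, 0.85),
    "metric_staleness_min": (30, 60),
}

# Assumption: for these signals a lower value is worse.
LOWER_IS_WORSE = {"availability_pct", "apdex"}


def severity(signal: str, value: float) -> str:
    """Classify a reading as 'ok', 'warn', or 'critical'."""
    warn, critical = THRESHOLDS[signal]
    if signal in LOWER_IS_WORSE:
        def breached(limit):
            return value < limit
    else:
        def breached(limit):
            return value > limit
    # Check critical first so it wins when both thresholds are breached.
    if breached(critical):
        return "critical"
    if breached(warn):
        return "warn"
    return "ok"
```

Under the strict-comparison assumption, two components in major_outage at once classify as critical, which lines up with the "more than one component in major_outage concurrently" check.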

Data sources

GET https://api.statuspage.io/v1/pages - Auth + page inventory
GET https://api.statuspage.io/v1/pages/{page_id}/components - Component status inventory (healthy/degraded/down truthfulness)
GET https://api.statuspage.io/v1/pages/{page_id}/component-groups - Group availability rollups for SLA
GET https://api.statuspage.io/v1/pages/{page_id}/incidents - Incident inventory + MTTA / MTTR + revenue-at-risk join
GET https://api.statuspage.io/v1/pages/{page_id}/incidents/unresolved - Currently-open incidents for truthfulness cross-check
GET https://api.statuspage.io/v1/pages/{page_id}/metrics - Published system metrics (apdex / latency / error-rate / throughput)
GET https://api.statuspage.io/v1/pages/{page_id}/metrics/{metric_id}/data - Metric data series for freshness + threshold checks
