Datadog state × commerce-sibling baseline = $/hour at risk while the incident is open. The single most-valuable card in this manifest.
At a glance
The live financial cost of a Datadog incident, in pounds per hour, computed by multiplying a severity factor by the merchant’s commerce-sibling baseline revenue rate and an estimated traffic-loss percentage. The card translates “we have a SEV-1 open” into “you are losing £X,XXX/hour right now”. It is the single most valuable card in this manifest because it converts engineering jargon into the only number the founder cares about.
| Field | Detail |
|---|---|
| The formula | revenue_at_risk_per_hour = active_severity_factor × commerce_sibling.revenue_per_min(90D_avg) × 60 × estimated_traffic_loss_pct. Active severity factor is 0 if no incident is open, 0.25 for SEV-3, 0.50 for SEV-2, 1.00 for SEV-1. |
| API endpoints touched | Datadog Incidents (/api/v2/incidents?filter[state]=active) for active incident severity; commerce-sibling KPI endpoint for the 90-day revenue/min baseline. |
| Estimated traffic loss percentage | Calibrated per severity: SEV-1 assumes 35% traffic loss (severe but not full outage), SEV-2 assumes 15%, SEV-3 assumes 5%. Adjustable per merchant in Settings → Datadog → Revenue-at-Risk Calibration. |
| Why “estimated” and not measured | Measured loss requires the incident to be over before you can compare to baseline. Live loss must be estimated from severity and historical patterns. The estimate is calibrated against post-incident measured loss across all Vortex IQ merchants and is typically accurate within ±25%. |
| Aggregation window | Real-time, refreshed every 60 seconds while any incident is open. |
| Severity threshold | All severities; SEV-3 weighted at 25% of SEV-1 because the customer impact is genuinely smaller. |
| Alert pre-filtering | Test incidents ([TEST] titled, or tagged incident_type:test) excluded. Drill incidents must not generate phantom revenue-at-risk numbers that startle finance teams. |
| Log Management gating | Not used. The card consumes incident state and commerce-sibling baseline; both are independent of Logs. |
| Commerce-sibling required | This card needs a commerce platform (Shopify, BigCommerce, Adobe Commerce) connected to compute the baseline. If no commerce sibling is connected, the card displays “Connect a commerce platform to enable Revenue at Risk”. |
| Time zone | UTC for cross-connector arithmetic; baseline revenue/min uses the commerce-sibling’s 90-day rolling average over the same hour-of-week. |
| Time window | RT (real-time, refreshed every 60 seconds). Display window is “while incident is open”, which is typically 15-180 minutes. |
| Alert trigger | > $0. The card surfaces any non-zero value as a notification, because by definition zero means no incident is open. |
| Roles | owner, finance, operations |
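
The formula in the table above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the Vortex IQ implementation: the function name, the dictionary names, and the 65-per-minute baseline are hypothetical, while the severity factors and traffic-loss defaults are the ones listed in the table.

```python
# Default calibration from the At a glance table; both sets are merchant-tunable.
SEVERITY_FACTOR = {"SEV-1": 1.00, "SEV-2": 0.50, "SEV-3": 0.25}
TRAFFIC_LOSS_PCT = {"SEV-1": 0.35, "SEV-2": 0.15, "SEV-3": 0.05}


def revenue_at_risk_per_hour(severity, revenue_per_min):
    """Estimated revenue at risk per hour for one open incident.

    severity is None when no incident is open, which yields 0.
    revenue_per_min is the 90-day baseline for the current hour-of-week.
    """
    if severity is None:
        return 0.0
    factor = SEVERITY_FACTOR[severity]
    loss = TRAFFIC_LOSS_PCT[severity]
    return factor * revenue_per_min * 60 * loss


# Hypothetical baseline of 65 per minute (3,900 per hour).
for sev in (None, "SEV-3", "SEV-2", "SEV-1"):
    print(sev, round(revenue_at_risk_per_hour(sev, 65.0), 2))
```

Under the default calibration, a SEV-1 puts 35% of the hourly baseline at risk, a SEV-2 7.5%, and a SEV-3 1.25%, because the severity factor and the traffic-loss percentage are multiplied together.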
Calculation
Calculated automatically from your Datadog data. See the At a glance summary above for what the metric tracks and the worked example below for a typical reading.
Worked example
A US specialty foods brand on BigCommerce with Datadog monitoring web, checkout, and payment. The 90-day baseline is 1,200/hour overnight. Scenario A: SEV-2 search latency degradation, opened 13:42 GMT (mid-afternoon).
- The severity factor is calibrated, not arbitrary. SEV-1 = 1.00 because customers cannot complete the action; SEV-2 = 0.50 because most can but slowly; SEV-3 = 0.25 because the impact is indirect. These factors are tuned against historical post-incident measurements across all Vortex IQ merchants.
- The commerce-sibling baseline is timezone-aligned. The card uses the 90-day average revenue/min for the same hour of week the incident is currently in. A 13:00 Saturday baseline is different from a 13:00 Tuesday baseline. This prevents the card from over- or under-stating the cost during weekend or overnight hours.
- Estimates are intentional and honest. The card does not claim measured loss; it shows estimated loss while the incident is open, with a confidence band. Once the incident closes, the Conversion Drop During Incidents card surfaces the post-incident measured loss for reconciliation. Use both: estimated for live decisions, measured for post-mortems.
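
The hour-of-week baseline described in the notes above could be computed along these lines. This is a sketch only: how Vortex IQ actually aggregates the 90-day window is not documented here, so the function names, the (timestamp, amount) order shape, and the choice to exclude the current incomplete hour are all assumptions.

```python
from datetime import datetime, timedelta, timezone


def hour_of_week(ts):
    """Bucket a timezone-aware datetime into 0..167 in UTC (Monday 00:00 is 0)."""
    utc = ts.astimezone(timezone.utc)
    return utc.weekday() * 24 + utc.hour


def baseline_revenue_per_min(orders, now, window_days=90):
    """Average revenue per minute for the hour-of-week that `now` falls in.

    orders: iterable of (timezone-aware datetime, amount) pairs.
    The current, still-incomplete hour is excluded from the average.
    """
    end = now.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    start = end - timedelta(days=window_days)
    target = hour_of_week(end)
    occurrences = window_days // 7  # complete earlier occurrences of this hour
    if occurrences == 0:
        return 0.0
    total = sum(
        amount
        for ts, amount in orders
        if start <= ts < end and hour_of_week(ts) == target
    )
    return total / (occurrences * 60)


# Hypothetical data: 3,900 of revenue in each of the last 12 Saturday 13:00 UTC
# hours, plus one Tuesday order that must not affect a Saturday baseline.
now = datetime(2024, 6, 15, 13, 42, tzinfo=timezone.utc)  # a Saturday
this_hour = now.replace(minute=0)
orders = [
    (this_hour - timedelta(weeks=k) + timedelta(minutes=10), 3900.0)
    for k in range(1, 13)
]
orders.append((datetime(2024, 6, 11, 13, 5, tzinfo=timezone.utc), 9999.0))
print(baseline_revenue_per_min(orders, now))
```

Dividing by the number of matching hours in the window, rather than by the number of hours that happened to contain orders, keeps quiet hours with zero sales from inflating the average.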
Sibling cards merchants should reference together
| Card | Why pair it with Revenue at Risk | What the combination tells you |
|---|---|---|
| Active Incidents | The state input for the formula. | If Active Incidents is zero but Revenue at Risk shows a value, the incident has just resolved and the Revenue at Risk value is still cached. |
| Operational Health Score | The composite engineering view. | Composite below 70 plus Revenue at Risk above zero equals real, measurable, costly incident. |
| Conversion Drop During Incidents | The post-incident measured peer. | Live (estimated) versus measured (post-incident) lets you reconcile and recalibrate the severity factors. |
| Revenue Lost / Min (active incidents) | The per-minute version of this card. | Per-minute is for live ticker; per-hour is for executive comms. |
| Cart Abandonment During 5xx Spikes | Mechanism: how the revenue gets lost during incidents. | High abandonment plus high Revenue at Risk equals “incident is converting visitors to bouncers”; low abandonment equals “incident is keeping visitors away entirely”. |
| Shopify / BC / Adobe Total Revenue | The baseline-input source. | Use this card to validate the 90-day baseline the formula uses. |
| Critical-Path Tests Status | Independent confirmation: is the synthetic test failing? | Synthetic failing plus Revenue at Risk above zero equals confirmed customer impact; synthetic passing equals the incident is on a non-customer-facing path and the revenue-at-risk number may be over-stated. |
| GA4 Sessions | The traffic-loss validation source. | If GA4 sessions did not actually drop during the incident, traffic-loss percentage in the formula is too high; recalibrate. |
Reconciling against the vendor’s own dashboard
Where to look in Datadog: Datadog does NOT compute or display Revenue at Risk; this card is a Vortex IQ-only synthesis. The component inputs come from:
- Incidents for the active-incident severity (the formula’s state input).
- Service Catalog for the service the incident affects (used for traffic-loss calibration if you tune per-service factors).
The commerce-sibling baseline is fetched from the connected Shopify, BigCommerce, or Adobe Commerce platform via that platform’s Order API. Open that platform’s revenue dashboard for the 90-day average revenue/min during the same hour-of-week.
Why our number may legitimately differ from a hand-computed estimate:
| Reason | Direction | Why |
|---|---|---|
| Time zone alignment | Either | The baseline uses the same hour of week in UTC; if you compute by hand using a different timezone alignment, the baseline differs. |
| API rate limits | Brief gaps | The Incidents and commerce-sibling APIs are rate-limited; cached values may be 1-2 minutes stale. |
| Log indexing latency | Not applicable | Revenue at Risk does not consume logs. |
| Severity factor calibration | Either | The default factors (1.00, 0.50, 0.25) are merchant-tunable. If you have changed them, the displayed value reflects your tuning. |
| Commerce-sibling sync lag | Vortex IQ baseline lower for “today” | The 90-day rolling average lags the most-recent 5-15 minutes of orders that the commerce platform has not yet acknowledged via webhook. Resolves automatically. |
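
The time zone row above is the one most likely to trip up a hand calculation. The sketch below shows the same instant landing in different hour-of-week buckets depending on the time zone used; the fixed UTC-5 offset is a hypothetical merchant time zone chosen for illustration and ignores daylight saving.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical merchant offset: UTC-5, fixed for illustration (no DST handling).
LOCAL = timezone(timedelta(hours=-5))


def hour_of_week_label(ts, tz):
    """Return the hour bucket `ts` falls into when viewed in `tz`, e.g. 'Sat 02:00'."""
    return ts.astimezone(tz).strftime("%a %H:00")


incident_opened = datetime(2024, 6, 15, 2, 30, tzinfo=timezone.utc)
print(hour_of_week_label(incident_opened, timezone.utc))  # bucket aligned to UTC
print(hour_of_week_label(incident_opened, LOCAL))         # bucket of a local-time hand calculation
```

An early-Saturday UTC incident is still Friday evening for this merchant, so a hand calculation in local time would pick a different baseline hour than one aligned to UTC.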
| Card | Expected relationship | What causes the divergence |
|---|---|---|
| shopify.total_revenue / bigcommerce.total_revenue / adobe_commerce.total_revenue | The baseline source. The card’s hourly baseline should equal the commerce-sibling’s 90-day average revenue/hour for the current hour-of-week. | A divergence indicates the commerce-sibling’s API is returning incomplete data; usually a webhook backlog. |
| google_analytics.ga_sessions | Independent traffic-loss validator: did sessions actually drop during the incident? | If GA4 sessions are flat during a SEV-1, the 35% traffic-loss assumption is over-stated and the displayed value is too high. Recalibrate. |
| stripe.stripe_total_revenue | Cross-validates the commerce-sibling baseline (Stripe sees payments; commerce sees orders). | A 5-15% gap is normal (refunds, currency conversion); a larger gap means one side is mis-syncing. |
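
Two of the cross-checks in the table above reduce to simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the Stripe gap is measured relative to the commerce figure (an assumption, since the table does not say which side is the denominator), and gaps below 5% are treated as normal as well.

```python
def stripe_gap_pct(commerce_revenue, stripe_revenue):
    """Gap between the two revenue figures as a percentage of commerce revenue."""
    if commerce_revenue <= 0:
        raise ValueError("commerce revenue must be positive")
    return abs(commerce_revenue - stripe_revenue) / commerce_revenue * 100


def classify_stripe_gap(gap_pct, normal_max=15.0):
    """Gaps up to normal_max are treated as normal; larger gaps suggest a sync problem."""
    return "normal" if gap_pct <= normal_max else "investigate sync"


def observed_traffic_loss(baseline_sessions, incident_sessions):
    """Fraction of sessions lost versus baseline during the incident, floored at 0."""
    if baseline_sessions <= 0:
        raise ValueError("baseline sessions must be positive")
    return max(0.0, 1 - incident_sessions / baseline_sessions)


# Hypothetical figures for a SEV-1 that assumed 35% traffic loss.
print(classify_stripe_gap(stripe_gap_pct(100_000, 91_000)))
print(classify_stripe_gap(stripe_gap_pct(100_000, 70_000)))
print(observed_traffic_loss(1_000, 980) < 0.35)  # sessions nearly flat: assumption too high
```

If the observed session loss is well below the configured traffic-loss percentage, that is the signal the table describes for recalibrating.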
Known limitations / merchant FAQs
Why is this card the most valuable in the Datadog manifest?
Because it converts engineering jargon into the only number a non-engineering founder cares about. “Apdex 0.71, 1 SEV-1, p95 above 3,500 ms” means nothing to most owners; “£1,365/hour leaking right now” means everything. The card unblocks decisions: pause paid-media spend, post a status banner, escalate the incident, accept the cost. Each decision is informed by one shared number that finance, marketing, and engineering can all agree on.
The number seems wrong; the actual revenue drop was much smaller. Why?
The card shows estimated loss while the incident is open, calibrated by severity. After the incident closes, Conversion Drop During Incidents shows the post-incident measured loss. Compare the two; if measured is consistently lower than estimated for your store, the severity factors are too high for your specific traffic mix. Tune them in Settings → Datadog → Revenue-at-Risk Calibration. The defaults are calibrated for typical merchants but may over-state for stores with heavy international or low-conversion traffic.
My incident is on the warehouse-sync worker, which has nothing to do with shoppers. Why is the card showing a non-zero value?
The default formula does not distinguish between customer-facing and internal services because Vortex IQ does not know which is which without merchant-side tagging. To exclude internal-service incidents from the formula, tag those services with customer_facing:false in your Datadog Service Catalog; the engine will then weight those incidents at 0.
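
The effect of the customer_facing:false tag can be pictured as a guard in front of the severity factor. This is a hedged sketch: the function name and the representation of service tags as a list of strings are assumptions, and the way the engine actually reads Service Catalog tags is not described here.

```python
DEFAULT_FACTORS = {"SEV-1": 1.00, "SEV-2": 0.50, "SEV-3": 0.25}


def effective_severity_factor(severity, service_tags, factors=DEFAULT_FACTORS):
    """Severity factor for an incident, or 0 when the affected service is internal.

    service_tags: tags of the affected service, e.g. ["team:ops", "customer_facing:false"].
    """
    if "customer_facing:false" in service_tags:
        return 0.0
    return factors[severity]


# Hypothetical services: a tagged internal worker and an untagged checkout service.
print(effective_severity_factor("SEV-1", ["customer_facing:false"]))
print(effective_severity_factor("SEV-1", ["team:checkout"]))
```

Untagged services keep their normal factor, which mirrors the default behaviour described above: without the tag, every incident is treated as customer-facing.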
My commerce platform is not connected. Why does the card display a placeholder?
This card requires a commerce platform (Shopify, BigCommerce, Adobe Commerce) connected for the baseline. Without it, no revenue/min baseline exists to multiply by. Connect a commerce platform and the card populates automatically. If you do not use any of the supported platforms, the card cannot function and we recommend hiding it from the dashboard via Settings → Dashboard → Card visibility.
Why use 90-day baseline instead of last week or yesterday?
90 days smooths out promotional weeks, holidays, and seasonal effects. Yesterday’s revenue/min may be unusually high (a flash sale) or unusually low (a holiday). Last week may include a promotion that does not represent your steady-state. 90 days is the right window for “what would I have made if this incident had not happened” because it captures your normal cycle.
Datadog says everything is fine but the merchant dashboard reports loss.
The card responds to active incidents, not active alerts. If a real customer-impacting issue is happening but no incident has been declared, the card will show zero. This is the gap between “monitors are alerting” and “humans have acknowledged the alert is real”. Pair this card with Alerts Summary for the full picture: alerts firing plus zero incidents plus customer complaints equals “declare an incident now”.
The card’s number doubled in 5 minutes. What happened?
Almost always: a SEV-2 was upgraded to SEV-1. The severity factor doubles (0.50 to 1.00), which at least doubles the displayed cost; under the default calibration the traffic-loss assumption also rises from 15% to 35%, so the jump can be larger. Declaring a second incident also adds to the total: an open SEV-1 plus a new SEV-2 stack, each weighted by its own factor.
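
Stacking can be sketched as a plain sum over open incidents. The answer above implies a sum but does not spell out the exact rule, so treat this as an assumption; the function name and the 3,900-per-hour baseline are hypothetical, and the clamp described in the last FAQ is not modelled.

```python
SEVERITY_FACTOR = {"SEV-1": 1.00, "SEV-2": 0.50, "SEV-3": 0.25}
TRAFFIC_LOSS_PCT = {"SEV-1": 0.35, "SEV-2": 0.15, "SEV-3": 0.05}


def stacked_revenue_at_risk(open_severities, baseline_per_hour):
    """Sum the per-incident estimates for every open incident (no clamp applied)."""
    return sum(
        SEVERITY_FACTOR[sev] * TRAFFIC_LOSS_PCT[sev] * baseline_per_hour
        for sev in open_severities
    )


# Hypothetical baseline of 3,900 per hour.
print(round(stacked_revenue_at_risk(["SEV-1"], 3900.0), 2))
print(round(stacked_revenue_at_risk(["SEV-1", "SEV-2"], 3900.0), 2))
```

With no open incidents the sum is empty and the function returns 0, matching the rule that the severity factor is 0 when nothing is open.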
My Logs API returns 400 No valid indexes. Does this card still work?
Yes. Revenue at Risk does not consume logs. The card pulls Datadog Incidents API and the commerce-sibling KPI API; both are independent of Log Management.
Why is the card not showing during what I think is a real incident?
Three causes: (1) The incident has not been declared in Datadog Incident Management (it is just a PagerDuty page or Slack discussion); (2) The incident is tagged incident_type:test and is being filtered out; (3) Your Datadog account does not have Incident Management enabled (free tier). For case 1, configure your incident-tooling to create Datadog incidents automatically. For case 3, upgrade or use a different incident-source connector.
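
Cause 2 is easy to check by hand. The sketch below applies the pre-filter rule from the At a glance table to a list of incidents; the dictionary shape and function name are hypothetical, and matching “[TEST]” anywhere in the title (rather than only as a prefix) is an assumption.

```python
def is_test_incident(title, tags):
    """True for drill or test incidents that should not produce a revenue-at-risk value."""
    return "[TEST]" in title or "incident_type:test" in tags


# Hypothetical incidents, simplified to the two fields the filter needs.
incidents = [
    {"title": "[TEST] Quarterly failover drill", "tags": []},
    {"title": "Checkout latency", "tags": ["incident_type:test"]},
    {"title": "Payment gateway errors", "tags": ["team:payments"]},
]

counted = [i["title"] for i in incidents if not is_test_incident(i["title"], i["tags"])]
print(counted)
```

If a real incident is missing from the card, check whether its title or tags match either condition; the second incident above would be filtered out even though its title looks genuine.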
Can the card go above 100% of baseline revenue/hour?
No. The formula clamps at the baseline rate (multiplied by traffic-loss percentage), so the maximum displayed loss for a SEV-1 with 35% traffic loss is 35% of baseline revenue/hour. For full-outage scenarios where traffic loss approaches 100%, manually adjust the SEV-1 traffic-loss percentage to reflect reality (the default 35% is for severe-but-not-total degradation).