Revenue Lost / Min (active incidents), Datadog

Metrics type: Cross-Platform Metrics • Category: Monitoring

Live $/min loss while incidents are open. This is where incident cost stops being academic and becomes the COO’s number.

At a glance

The live, per-minute estimate of revenue being lost while a Datadog incident is open. Where Revenue at Risk shows the hourly rate, this card shows the per-minute ticker, which is what the COO and finance team want to read during a live incident. Every minute the displayed value persists is another minute of cost accumulating.


The formula	`revenue_lost_per_min = active_severity_factor × commerce_sibling.revenue_per_min(90D_avg) × estimated_traffic_loss_pct`. Same components as Revenue at Risk but expressed at minute resolution rather than hourly.
API endpoints touched	Datadog Incidents (`/api/v2/incidents?filter[state]=active`); commerce-sibling KPI endpoint for 90-day revenue/min.
Severity factor	SEV-1 = 1.00; SEV-2 = 0.50; SEV-3 = 0.25; multiple incidents stack additively.
Estimated traffic loss percentage	SEV-1 = 35%, SEV-2 = 15%, SEV-3 = 5%. Tunable in Settings → Datadog → Revenue-at-Risk Calibration.
Aggregation window	Real-time, refreshed every 60 seconds while incident is open. The displayed number is the current per-minute rate, not a cumulative total.
Severity threshold	All severities; SEV-3 is the smallest contributor but stacks with higher severities when multiple incidents are open.
Alert pre-filtering	Test incidents (`[TEST]` titled, or tagged `incident_type:test`) excluded.
Log Management gating	Not used. The card consumes incident state and commerce-sibling baseline; both are independent of Logs.
Commerce-sibling required	This card needs a commerce platform connected. Without one, the card displays “Connect a commerce platform to enable this card”.
Why per-minute and not per-hour	The live ticker creates urgency. “$23/minute leaking” feels different from “$1,380/hour at risk” even though they are the same number. During a live incident, the per-minute value is the heartbeat that keeps the response sharp. Pair with Revenue at Risk for the hourly view used in executive comms.
Time zone	UTC for cross-connector arithmetic; baseline revenue/min uses 90-day rolling average over the same hour-of-week.
Time window	`RT` (real-time, refreshed every 60 seconds). Display window is “while incident is open”.
Alert trigger	`> $0`; the card surfaces any non-zero value as a notification (zero means no incident is open).
Roles	owner, finance, operations
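
The formula, the additive stacking rule, and the test-incident pre-filter from the table above can be sketched as follows. This is a minimal illustration, not the Vortex IQ implementation: the `Incident` record, the function names, and the choice to match `[TEST]` anywhere in the title are assumptions, and the record is not the Datadog response schema. The factors and percentages are the defaults listed above.

```python
from dataclasses import dataclass, field

# Default calibration values from the At a glance table (merchant-tunable).
SEVERITY_FACTOR = {"SEV-1": 1.00, "SEV-2": 0.50, "SEV-3": 0.25}
TRAFFIC_LOSS_PCT = {"SEV-1": 0.35, "SEV-2": 0.15, "SEV-3": 0.05}


@dataclass
class Incident:
    """Hypothetical incident record; not the Datadog response schema."""
    title: str
    severity: str
    tags: list[str] = field(default_factory=list)


def is_test_incident(incident: Incident) -> bool:
    """Pre-filter: a [TEST] title (assumed: anywhere in it) or an incident_type:test tag."""
    return "[TEST]" in incident.title or "incident_type:test" in incident.tags


def revenue_lost_per_min(incidents: list[Incident], baseline_per_min: float) -> float:
    """Sum severity_factor x baseline x traffic_loss_pct over non-test incidents."""
    total = 0.0
    for incident in incidents:
        if is_test_incident(incident):
            continue  # test incidents never contribute to the card
        total += (
            SEVERITY_FACTOR[incident.severity]
            * baseline_per_min
            * TRAFFIC_LOSS_PCT[incident.severity]
        )
    return total
```

With a baseline of 160 per minute, a single SEV-1 gives roughly 56 per minute under these defaults, and an empty incident list gives zero, which is the "no incident is open" state.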

Calculation

Calculated automatically from your Datadog incident data and the connected commerce platform’s revenue baseline. See the At a glance summary above for what the metric tracks and the worked example below for a typical reading.

Worked example

A UK fashion brand on Shopify with Datadog APM. Baseline revenue at 14:00 GMT (peak): £160/min. A SEV-1 checkout outage opened at 14:23 GMT.

Severity factor (SEV-1):           1.00
Commerce baseline (£/min, 14:00):  £160/min
Estimated traffic loss (SEV-1):    35%
Revenue Lost / Min:                1.00 × £160 × 0.35 = £56/min

The card displays £56/min with the value highlighted red while the incident is open. Three things this enables:

Live cost framing for the response team. “We have a SEV-1” is engineering jargon; “We are losing £56 per minute right now” is finance language. Both teams now share a number. The COO can walk into the engineering Slack channel and ask “are we still losing £56/min?” instead of asking technical questions; the on-call has a clear KPI for “incident is over”.
The cumulative cost is computed automatically. After 25 minutes the cumulative number reads £1,400. After 90 minutes it reads £5,040. After 4 hours at a constant £56/min it would be £13,440. While the rate holds, the cumulative grows linearly until the incident closes; the per-minute rate changes only if severity changes (e.g. SEV-1 downgraded to SEV-2 mid-investigation) or the hour-of-week baseline shifts.
The per-minute frame discourages “let’s wait and see if it self-resolves” thinking. Without this card, the team may be tempted to spend 20 minutes investigating before deciding whether to roll back. With “£56/min leaking” displayed live, the team is more likely to roll back immediately and investigate later. The mental model shifts from “diagnose first” to “stop the bleeding first”.
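
The cumulative figures quoted above are the per-minute rate multiplied by the minutes the incident has been open. A small sketch, assuming the rate stays fixed for the whole incident; the function name `cumulative_loss` is illustrative only:

```python
def cumulative_loss(rate_per_min: float, minutes_open: int) -> float:
    """Cumulative loss for an incident whose per-minute rate never changes."""
    if minutes_open < 0:
        raise ValueError("minutes_open must be non-negative")
    return rate_per_min * minutes_open


# Checkpoints from the worked example at a constant £56/min.
for minutes in (25, 90, 240):
    print(f"{minutes:>3} min open: £{cumulative_loss(56, minutes):,.0f}")
```

The three checkpoints come out at £1,400, £5,040, and £13,440, the same values the card would show if neither severity nor baseline moved.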

Scenarios that change the per-minute value mid-incident:

Scenario A: Severity downgrades from SEV-1 to SEV-2 at minute 30
  - Pre-downgrade rate: £56/min
  - Post-downgrade rate: 0.50 × £160 × 0.15 = £12/min
  - Cumulative loss at minute 30: £1,680
  - If incident closes at minute 60: additional 30 × £12 = £360
  - Total: £2,040

Scenario B: Second SEV-2 opens during the SEV-1 (unrelated cause)
  - SEV-1 rate: £56/min
  - SEV-2 rate: £12/min (added)
  - Combined per-minute: £68/min until either resolves

Scenario C: Time-of-day shift (incident persists through peak to off-peak)
  - At 14:00 (peak baseline £160/min): £56/min lost
  - At 02:00 (off-peak baseline £40/min): £14/min lost
  - The same severity at different times produces different per-minute numbers
    because the underlying baseline shifts hour-by-hour
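
The three scenarios can be reproduced by treating an incident as a series of segments, each with its own per-minute rate. The sketch below is illustrative; the helper names `per_min_rate` and `total_loss` are assumptions, and the inputs are the figures from the scenarios above.

```python
def per_min_rate(severity_factor: float, baseline_per_min: float, traffic_loss_pct: float) -> float:
    """One incident's per-minute loss under the card's formula."""
    return severity_factor * baseline_per_min * traffic_loss_pct


def total_loss(segments: list[tuple[int, float]]) -> float:
    """Sum minutes x rate over consecutive (minutes, rate_per_min) segments."""
    return sum(minutes * rate for minutes, rate in segments)


sev1_peak = per_min_rate(1.00, 160, 0.35)     # SEV-1 at the £160/min peak baseline
sev2_peak = per_min_rate(0.50, 160, 0.15)     # SEV-2 at the same baseline
sev1_offpeak = per_min_rate(1.00, 40, 0.35)   # SEV-1 at the £40/min off-peak baseline

# Scenario A: 30 minutes as SEV-1, then 30 minutes as SEV-2.
scenario_a = total_loss([(30, sev1_peak), (30, sev2_peak)])
# Scenario B: two concurrent incidents stack additively.
scenario_b = sev1_peak + sev2_peak

print(f"A: £{scenario_a:,.0f} total")
print(f"B: £{scenario_b:.0f}/min combined")
print(f"C: £{sev1_offpeak:.0f}/min off-peak")
```

Modelling the incident as segments keeps the cumulative figure correct when either the severity or the baseline changes partway through.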

Three takeaways merchants should remember:

Per-minute and per-hour are the same number expressed differently. Use per-minute for live dashboards during an incident; use per-hour for executive briefings, status-page banners, and post-incident summaries. The per-minute figure is the live heartbeat; the per-hour figure is the executive frame.
The card encourages “stop the bleeding first” decisions. Engineering teams trained on “diagnose first, then fix” can be slow to roll back; the live cost ticker counters this with a clear, ongoing financial argument for immediate action.
The cumulative tally during a long incident is sobering. A 4-hour SEV-1 at £56/min is £13,440. Many merchants find that one bad incident per quarter costs more than the entire engineering tooling budget for the year. This is the card that justifies investments in deploy safety, automated rollback, and synthetic monitoring.

Sibling cards merchants should reference together

Card	Why pair it with Revenue Lost / Min	What the combination tells you
Revenue at Risk (live)	The hourly version of the same number.	Use this card for live dashboards; use Revenue at Risk for executive comms.
Active Incidents	The state input for the formula.	Active incidents drives the severity factor that drives this card.
Operational Health Score	The composite engineering view.	Composite below 70 plus this card non-zero equals a real, measurable, costly incident.
Conversion Drop During Incidents	The post-incident measured-loss peer.	Compare live-estimated vs measured to recalibrate the formula.
Cart Abandonment During 5xx Spikes	Mechanism: how the revenue gets lost during incidents.	High abandonment plus high per-minute loss equals “incident is converting visitors to bouncers”.
Checkout Service Health × Sales	The latency-vs-orders dual-axis.	Confirms the live observation that orders/min dropped during the latency window.
Shopify / BC / Adobe Total Revenue	The baseline-input source.	Use this to validate the 90-day baseline the formula uses.
GA4 Sessions	The traffic-loss validation source.	If GA4 sessions did not actually drop during the incident, the traffic-loss percentage is over-stated.

Reconciling against the vendor’s own dashboard

Where to look in Datadog: Datadog does NOT compute or display Revenue Lost / Min; this card is a Vortex IQ-only synthesis. The component inputs come from:

Incidents for the active-incident severity (the formula’s state input).
Service Catalog for the service the incident affects.

The commerce-sibling baseline is fetched from the connected Shopify, BigCommerce, or Adobe Commerce platform via that platform’s Order API.

Why our number may legitimately differ from a hand-computed estimate:

Reason	Direction	Why
Time zone alignment	Either	The baseline uses the same hour of week in UTC; if you compute by hand using a different timezone alignment, the number shifts.
API rate limits	Brief gaps	Both Datadog Incidents API and the commerce-sibling Order API are rate-limited; cached values may be 1-2 minutes stale.
Log indexing latency	Not applicable	This card does not consume logs.
Severity factor calibration	Either	Default factors (1.00, 0.50, 0.25) are merchant-tunable.
Commerce-sibling sync lag	Vortex IQ baseline lower for “today”	The 90-day rolling average lags the most-recent 5-15 minutes of orders not yet acknowledged via webhook.

Cross-connector reconciliation:

Card	Expected relationship	What causes the divergence
`shopify.total_revenue` / `bigcommerce.total_revenue` / `adobe_commerce.total_revenue`	The baseline source. The hourly baseline equals the commerce-sibling 90-day average for the current hour-of-week, and per-minute equals that divided by 60.	A divergence indicates the commerce-sibling API is returning incomplete data; usually a webhook backlog.
`google_analytics.ga_sessions`	Independent traffic-loss validator.	If GA4 sessions did not drop during a SEV-1, the 35% traffic-loss assumption is over-stated and the displayed value is too high.
`stripe.stripe_total_revenue`	Cross-validates the commerce-sibling baseline.	A 5-15% gap is normal (refunds, currency); larger gap means one side is mis-syncing.
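
For a hand check of the baseline, the relationship in the first row (the 90-day average for the current hour-of-week, divided by 60) might be computed as below. This is a sketch under stated assumptions: hourly revenue totals are already available as `(timestamp, revenue)` pairs with timezone-aware timestamps, the function name is hypothetical, and excluding the still-incomplete current hour is a choice made here rather than documented behaviour.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def hour_of_week_baseline_per_min(
    hourly_revenue: list[tuple[datetime, float]],
    now: datetime,
    window_days: int = 90,
) -> Optional[float]:
    """Average revenue/min for now's hour-of-week over the trailing window.

    hourly_revenue holds (hour_start, revenue_for_that_hour) pairs.
    Returns None when no completed matching hour falls inside the window.
    """
    now_utc = now.astimezone(timezone.utc)
    window_start = now_utc - timedelta(days=window_days)
    target = (now_utc.weekday(), now_utc.hour)  # hour-of-week key, in UTC

    matching = []
    for hour_start, revenue in hourly_revenue:
        start_utc = hour_start.astimezone(timezone.utc)
        # Keep only hours that have fully elapsed and sit inside the window.
        completed = start_utc + timedelta(hours=1) <= now_utc
        if completed and start_utc >= window_start:
            if (start_utc.weekday(), start_utc.hour) == target:
                matching.append(revenue)

    if not matching:
        return None
    hourly_average = sum(matching) / len(matching)
    return hourly_average / 60  # per-minute is the hourly figure divided by 60
```

Converting every timestamp to UTC before bucketing is what the time zone alignment row refers to: bucketing the same orders in local time moves them into a different hour-of-week and shifts the result.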

Known limitations / merchant FAQs

What is the difference between this card and Revenue at Risk?
Same number, different units. Revenue at Risk is per-hour (£1,380/hour); this card is per-minute (£23/min). Use this card during live incidents on dashboards and Slack pings; use Revenue at Risk for executive summaries and post-incident reports. The per-minute frame creates urgency; the per-hour frame fits executive comms.

Why per-minute, not per-second?
Per-second would be jittery (the formula has minute-level resolution because commerce baselines are minute-resolution, not second-resolution). Per-minute is the smallest meaningful unit. Per-second would also feel performative; per-minute is direct without being theatrical.

My commerce platform is not connected. What does the card show?
“Connect a commerce platform to enable Revenue Lost / Min”. The card requires a commerce sibling for the baseline. Without it, no revenue/min baseline exists to multiply by. Connect Shopify, BigCommerce, or Adobe Commerce.

The cumulative tally is shocking. Is the formula too aggressive?
Possibly, depending on your traffic mix. The default 35% traffic-loss assumption for SEV-1 is calibrated against typical merchants; if your specific store has a more loyal customer base (high return-rate, branded-search-heavy traffic), shoppers may tolerate slowness better and actual loss is lower. After a few real incidents, compare estimated (this card) vs measured (Conversion Drop During Incidents) and tune the percentage in Settings → Datadog → Revenue-at-Risk Calibration.

Does this card include refunds or chargebacks from the incident period?
No. The card shows revenue-not-captured during the incident, not revenue-captured-then-refunded. If the incident causes payment-confirmation failures that lead to disputes weeks later, those costs are tracked separately on Stripe Dispute Rate and similar payment-side cards.

What happens to the cumulative tally after the incident closes?
The cumulative is preserved as a “post-incident summary” entry on the same card for 7 days, then archived. The post-incident view shows: (1) Total cumulative loss, (2) Incident duration, (3) Post-incident measured loss for comparison, (4) The deploy or change that caused the incident if identified. This lets the merchant build institutional memory of incident costs.

My Logs API returns 400 No valid indexes. Does this card still work?
Yes. Revenue Lost / Min consumes incident state and commerce-sibling baseline; both are independent of Logs.

The card says £56/min but my Stripe dashboard shows revenue is barely below baseline. What is happening?
Two possible reasons: (1) Shoppers are queuing or retrying and will eventually complete checkout (delayed revenue, not lost); (2) The traffic-loss percentage in the formula is too aggressive for your store. After the incident closes, compare cumulative vs measured loss and recalibrate. Some retail categories (high-loyalty, low-impulse) tolerate incidents much better than others (impulse, discount, fast-fashion).

Why does the per-minute number change during an incident even though the severity stays the same?
Because the baseline revenue/min varies hour-by-hour. An incident that persists from peak (£160/min baseline) into off-peak (£40/min baseline) will see the displayed loss decrease as the incident continues, even though the engineering problem is unchanged. This is correct: actual revenue loss really is lower at low-traffic hours.

Can I see this card for closed incidents?
The post-incident summary is available for 7 days on the card itself; the full historical view is available on the Vortex IQ Incident History page (Settings → Datadog → Incident History). For the post-mortem write-up, copy the cumulative loss number and the measured-loss number from Conversion Drop During Incidents.
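
The recalibration advice in the FAQs above (compare the estimated loss with the measured loss, then tune the percentage) could be approached by solving the card’s formula for the traffic-loss percentage. This is only one plausible approach, not a documented Vortex IQ procedure; the function name, the clamp to the 0 to 1 range, and the example figures are assumptions for illustration.

```python
def implied_traffic_loss_pct(
    measured_loss: float,
    severity_factor: float,
    baseline_per_min: float,
    minutes_open: float,
) -> float:
    """Solve the card's formula for traffic_loss_pct given a measured loss.

    measured_loss is the total revenue shortfall observed over the incident,
    for example as read from Conversion Drop During Incidents.
    """
    exposure = severity_factor * baseline_per_min * minutes_open
    if exposure <= 0:
        raise ValueError("severity factor, baseline, and duration must be positive")
    # Clamp so that noisy measurements cannot produce a nonsensical setting.
    return min(max(measured_loss / exposure, 0.0), 1.0)


# Hypothetical incident: 60-minute SEV-1 at a £160/min baseline, £1,920 measured loss.
pct = implied_traffic_loss_pct(1920, severity_factor=1.00, baseline_per_min=160, minutes_open=60)
print(f"Implied traffic loss: {pct:.0%} (default for SEV-1 is 35%)")
```

A single incident is a noisy sample, so averaging the implied percentage over several incidents of the same severity before changing the setting is likely the safer habit.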

Tracked live in Vortex IQ Nerve Centre

Revenue Lost / Min (active incidents) is one of hundreds of KPI pulses Vortex IQ tracks across Datadog and 70+ other ecommerce connectors. Nerve Centre runs the detection layer; Vortex Mind investigates the cause when something moves; Ask Viq lets you interrogate any number in plain English. Start for free or book a demo to see this metric running on your own data.
