Conversion Drop During Incidents, Datadog

Metrics type: Cross-Platform Metrics • Category: Monitoring

At a glance

This card shows the measured conversion-rate drop during each Datadog incident window, compared with the baseline for the same hour-of-week over the prior 90 days. Where Revenue at Risk and Revenue Lost / Min show estimated loss while incidents are open, this card shows what the loss actually was once the data is in. It is the reconciliation card.


API endpoints touched	Datadog Incidents (`/api/v2/incidents`) for incident-window timestamps and severity; commerce-sibling KPI for orders/min and sessions/min during the window vs the baseline.
Metric basis	`(baseline_conversion_rate − incident_conversion_rate) / baseline_conversion_rate × 100`. Conversion rate = orders / sessions for the same window; baseline is the 90-day average for the same hour-of-week.
Aggregation window	Per-incident (one bar per resolved incident). The card displays the last 90 days of incidents as a bar chart.
Severity threshold	Bars colour-coded by Datadog severity: SEV-1 = red, SEV-2 = amber, SEV-3 = grey. The 10% drop alert threshold applies regardless of severity.
Alert pre-filtering	Test incidents excluded; incidents shorter than 5 minutes excluded (statistical noise dominates); incidents where session data is missing for either window excluded.
Log Management gating	Not used. The card consumes incident state and commerce-sibling KPI; both are independent of Logs.
Why post-incident, not live	Live conversion rate during a short incident is statistically noisy: a 15-minute incident at typical merchant traffic produces wide confidence intervals. Post-incident measurement waits for the dust to settle and for orders to flow through to the commerce platform’s API, then computes the drop with stable inputs. Use this card for institutional learning; use Revenue Lost / Min for live decisions.
Commerce-sibling required	Required. Conversion rate needs orders (numerator) and sessions (denominator). Without a commerce-sibling and a sessions-source connected, the card displays “Connect a commerce platform and GA4 to enable this card”.
Time zone	UTC for cross-connector arithmetic; baseline uses the same hour-of-week from the prior 90 days.
Time window	`90D` (rolling 90 days, one bar per incident)
Alert trigger	`> 10% drop`: an incident that produces a conversion drop of more than 10% is flagged for post-incident review.
Roles	owner, marketing
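
The pre-filtering and alert rules in the table above could be expressed roughly as follows. This is a minimal sketch: the `IncidentWindow` record and its field names are hypothetical, not a Datadog or Vortex IQ schema. Only the 5-minute cutoff and the 10% threshold come from the table.

```python
from dataclasses import dataclass
from typing import Optional

MIN_DURATION_MIN = 5    # incidents shorter than this are excluded (from the table)
ALERT_DROP_PCT = 10.0   # drops above this are flagged for review (from the table)


@dataclass
class IncidentWindow:
    """Hypothetical record shape, used only for illustration."""
    incident_id: str
    severity: str
    duration_min: float
    is_test: bool
    incident_sessions: Optional[int]  # None when session data is missing
    baseline_sessions: Optional[int]  # None when session data is missing


def is_chartable(inc: IncidentWindow) -> bool:
    """Apply the three exclusion rules: test, too short, missing sessions."""
    if inc.is_test:
        return False
    if inc.duration_min < MIN_DURATION_MIN:
        return False
    if inc.incident_sessions is None or inc.baseline_sessions is None:
        return False
    return True


def is_flagged(drop_pct: float) -> bool:
    """The alert applies regardless of severity; strictly greater than 10%."""
    return drop_pct > ALERT_DROP_PCT
```

Under these assumptions a 3-minute incident is dropped before any conversion maths runs, and a drop of exactly 10% is not flagged.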

Calculation

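Calculated automatically from your Datadog incident data and the connected commerce and session sources, using the Metric basis formula in the At a glance summary above. See that summary for what the metric tracks and the worked example below for a typical reading.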
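
As an illustration, the Metric basis formula from the At a glance summary can be written as two small functions. This is a sketch, not the engine's actual code; the function names are hypothetical.

```python
def conversion_rate(orders: int, sessions: int) -> float:
    """Conversion rate = orders / sessions for the same window."""
    if sessions <= 0:
        raise ValueError("sessions must be positive")
    return orders / sessions


def conversion_drop_pct(baseline_rate: float, incident_rate: float) -> float:
    """(baseline - incident) / baseline * 100, as in the Metric basis row."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (baseline_rate - incident_rate) / baseline_rate * 100


# A baseline of 1.92% against an incident-window rate of 0.43%
drop = conversion_drop_pct(0.0192, 0.0043)
print(round(drop, 1))  # 77.6
```

The rates can be passed as fractions or as percentages, as long as both arguments use the same unit.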

Worked example

Consider a US specialty foods brand on BigCommerce with Datadog APM. The card covers the last 90 days of resolved incidents; the eight most recent are shown here.

Date	Incident	Severity	Duration	Baseline conv	Incident conv	Drop
25 Apr 26	INC-2841 Checkout payment-pool	SEV-1	47 min	1.92%	0.43%	-78%
18 Apr 26	INC-2812 Search latency	SEV-2	92 min	1.88%	1.51%	-20%
09 Apr 26	INC-2785 Stripe upstream 502	SEV-1	78 min	1.86%	0.52%	-72%
02 Apr 26	INC-2759 DB pool exhaustion	SEV-2	35 min	1.91%	1.42%	-26%
28 Mar 26	INC-2724 CDN cache miss storm	SEV-3	26 min	1.89%	1.79%	-5%
19 Mar 26	INC-2698 Recommendations slow	SEV-3	41 min	1.88%	1.81%	-4%
11 Mar 26	INC-2671 Shopify webhook backlog	SEV-2	58 min	1.87%	1.44%	-23%
04 Mar 26	INC-2645 Search cluster restart	SEV-3	18 min	1.93%	1.86%	-4%

Three things this card surfaces that no single tool can:

The empirical relationship between severity and conversion drop. The data confirms what the severity-factor formula assumes: SEV-1 incidents produce 60-80% conversion drops; SEV-2 produces 15-30%; SEV-3 produces 3-8%. This brand’s pattern is consistent with the default formula. If a brand’s pattern diverges (e.g. SEV-2s consistently producing 50%+ drops), the formula’s traffic-loss percentages should be tuned upward.
The two SEV-1s in 90 days were both upstream-driven. INC-2841 was payment-pool exhaustion; INC-2785 was Stripe’s PSP layer. Neither was a code bug in the merchant’s own services. Action insight: invest in upstream-failure resilience (payment-PSP fallbacks, circuit breakers around third-party calls, queue-based decoupling) rather than in code-quality alone.
Cumulative quarterly cost. Adding up: 78% × 47 min × $80/min = $2,930; 20% × 92 min × $76/min = $1,398; 72% × 78 min × $80/min = $4,490; 26% × 35 min × $76/min = $691; remaining 4 incidents combined ~$1,200; **total ≈ $10,710 in 90 days**. Annualised that is roughly $43,000 of incident cost. Material.
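
The cumulative-cost arithmetic above can be reproduced with a short script. This is a sketch using only the figures quoted in the worked example; the variable names are illustrative, and the per-minute dollar figures are taken as given.

```python
# (incident, measured drop as a fraction, duration in minutes, $/min used in the example)
itemised_incidents = [
    ("INC-2841", 0.78, 47, 80),
    ("INC-2812", 0.20, 92, 76),
    ("INC-2785", 0.72, 78, 80),
    ("INC-2759", 0.26, 35, 76),
]
REMAINING_FOUR_USD = 1200  # the example's rounded figure for the other four incidents


def incident_cost(drop: float, minutes: float, usd_per_min: float) -> float:
    """Measured cost of one incident: drop x duration x revenue per minute."""
    return drop * minutes * usd_per_min


itemised_usd = sum(incident_cost(d, m, r) for _, d, m, r in itemised_incidents)
total_90d_usd = itemised_usd + REMAINING_FOUR_USD
annualised_usd = total_90d_usd * 4  # four 90-day periods, as a rough annualisation

print(round(total_90d_usd), round(annualised_usd))
```

The unrounded total comes out a few dollars above the $10,710 quoted above because the text rounds each incident before summing; the annualised figure still lands at roughly $43,000.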

Calibration check (estimated vs measured):
  - INC-2841 estimate from Revenue Lost / Min: $1,365/hour × 47 min = $1,069
  - INC-2841 measured from this card: 78% × 47 min × $80/min = $2,930
  - Estimate was ~36% of measured (under-stated)
  - The 35% traffic-loss assumption for SEV-1 was too conservative for this brand
  - Recommendation: bump merchant's SEV-1 traffic-loss to 50% in Settings → Datadog → Revenue-at-Risk Calibration
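
The calibration check can be sketched as a ratio of the live estimate to the measured cost. The figures are the ones quoted for INC-2841; the variable names are illustrative only.

```python
# Live estimate: $1,365/hour, pro-rated over a 47-minute incident
estimated_usd = 1365 / 60 * 47

# Measured afterwards: 78% drop x 47 min x $80/min
measured_usd = 0.78 * 47 * 80

# Share of the measured loss that the live estimate captured
ratio = estimated_usd / measured_usd

print(round(estimated_usd), round(measured_usd), f"{ratio:.0%}")
```

A ratio well below 100% means the live cards under-stated the loss, which is the signal to raise the traffic-loss assumption for that severity.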

Three takeaways merchants should remember:

Use this card for post-incident reviews and quarterly planning, not for live decisions. Live decisions need Revenue Lost / Min. Post-incident reviews need this card. Both feed institutional learning; their roles are different.
The bar chart visualises severity-vs-impact. SEV-1 = tall red bars; SEV-2 = medium amber; SEV-3 = short grey. Patterns to watch: (1) Are SEV-3s producing larger-than-expected drops? Probably mis-classified severities. (2) Are SEV-2s clustering on a specific service? Probably an under-invested area. (3) Is the trend rising over the 90-day window? Probably a new class of regression introduced recently.
The card calibrates the live cards. Compare this card’s measured drops to the live cards’ estimates after each incident; if they consistently diverge, tune the severity factors. The estimates exist to guide live decisions; this card exists to keep the estimates honest.

Sibling cards merchants should reference together

Card	Why pair it with Conversion Drop During Incidents	What the combination tells you
Revenue at Risk (live)	The live-estimated peer.	Estimated vs measured comparison drives calibration of severity factors.
Revenue Lost / Min	The per-minute live counterpart.	Use Revenue Lost / Min for live decisions; this card for post-incident review.
Active Incidents	The state stream that drives the bar chart.	Each bar in this card corresponds to one incident.
Cart Abandonment During 5xx Spikes	Mechanism: how shoppers translated incidents into lost orders.	High abandonment plus high conversion drop equals “shoppers who started checkout could not finish”; low abandonment plus high conversion drop equals “shoppers never reached checkout”.
Checkout Service Health × Sales	The latency-vs-orders dual-axis.	Confirms the timing of the conversion drop within each incident window.
Operational Health Score	The composite that aggregates incident severity.	Composite drops correspond to the bars on this card; the timing is the same.
GA4 Sessions	The denominator source for conversion rate.	If GA4 sessions during the incident were also reduced, the conversion rate denominator was smaller, which can mask or amplify the drop.
Shopify / BC / Adobe Conversion Rate	The platform-side conversion rate.	Validates this card’s incident-window numbers against the merchant’s normal conversion-rate dashboard.

Reconciling against the vendor’s own dashboard

Where to look in Datadog: Datadog does NOT compute or display Conversion Drop During Incidents; this card is a Vortex IQ-only synthesis. The component inputs come from:

Incidents for the resolved-incident timestamps and severity (the bar-chart x-axis).
Service Catalog for the service the incident affected.

The conversion-rate side of this card comes from the connected commerce platform’s KPI; open that platform’s analytics for the same incident windows.

Why our number may legitimately differ from a hand-computed estimate:

Reason	Direction	Why
Time zone alignment	Either	Baseline uses same hour-of-week in UTC; hand calculations using different timezones produce different baselines.
API rate limits	Brief gaps	Datadog Incidents and commerce-sibling APIs are rate-limited; cached values may be 1-2 minutes stale at the start of incident reconciliation.
Log indexing latency	Not applicable	This card does not consume logs.
Session-source choice	Either	Default uses GA4 sessions for the conversion-rate denominator; if GA4 is unavailable, the engine falls back to commerce-platform internal sessions, which are typically lower (no bots, no non-tracked visits). The choice affects the absolute conversion rate but not the percentage drop.
Short-incident exclusion	Vortex IQ shorter list	Incidents under 5 minutes are excluded from the chart because statistical noise dominates; Datadog UI shows them.
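
To illustrate the time zone row in the table above, the sketch below shows how one instant falls into different hour-of-week buckets in UTC and in a local zone. The `hour_of_week` helper, the start time, and the fixed UTC-4 offset are assumptions for illustration; a real calculation would use a proper time zone database rather than a fixed offset.

```python
from datetime import datetime, timedelta, timezone


def hour_of_week(ts: datetime) -> int:
    """Bucket 0..167 in the datetime's own zone: Monday 00:00 is 0."""
    if ts.tzinfo is None:
        raise ValueError("timezone-aware datetime required")
    return ts.weekday() * 24 + ts.hour


UTC = timezone.utc
LOCAL_FIXED = timezone(timedelta(hours=-4))  # illustrative fixed offset only

# Hypothetical incident start: Saturday 25 Apr 2026, 02:30 UTC
start_utc = datetime(2026, 4, 25, 2, 30, tzinfo=UTC)

utc_bucket = hour_of_week(start_utc)                            # Saturday 02:00 bucket
local_bucket = hour_of_week(start_utc.astimezone(LOCAL_FIXED))  # Friday 22:00 bucket

print(utc_bucket, local_bucket)  # 122 118
```

A hand calculation keyed on the local bucket would average a different set of baseline hours (here, Friday evenings rather than Saturday early mornings), which is why the two baselines can differ in either direction.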

Cross-connector reconciliation:

Card	Expected relationship	What causes the divergence
`shopify.total_revenue` / `bigcommerce.total_revenue` / `adobe_commerce.total_revenue`	Conversion drop multiplied by orders/min equals the revenue impact.	Confirms or contradicts the live estimates.
`google_analytics.ga_sessions`	The denominator source.	If GA4 sessions were also reduced during the incident (e.g. GA4 tag-fire failure), the conversion rate denominator is smaller, which biases the apparent drop. Pair with GA4 Property Health to confirm GA4 was healthy during the window.
`stripe.stripe_payment_health_score`	Payment-PSP cascade peer.	Conversion drops co-occurring with payment-health drops indicate the cause was payment-side (e.g. INC-2785 in the worked example was a Stripe upstream issue).

Known limitations / merchant FAQs

Why does this card show 90 days, not just recent?
Because individual incidents are noisy and the pattern only emerges across multiple incidents. With 90 days of bars, you can spot trends (“our SEV-2s used to cost 15% conversion, now they cost 25%, what changed?”), repeated services (“60% of our SEV-1s are payment-related, invest there”), and severity calibration (“our SEV-3s cost more than our formula assumes, retune”). Single incidents don’t tell the story; 90 days of incidents do.

My incident is over but it has not appeared on the bar chart yet. Why?
Three usual delays: (1) The incident has not been formally resolved in Datadog Incident Management (still in stable state); (2) The commerce-sibling order webhook is still flushing; this card waits 30 minutes after incident close to ensure the order data is complete; (3) The session denominator (GA4 or platform-native) is still flushing. After 30 minutes post-resolution, the bar should appear; longer means investigate the upstream data sources.

My commerce platform shows higher conversion drop than this card. Why?
Two reasons: (1) The commerce platform may compute conversion using a different denominator (some include POS sessions; this card defaults to web-only); (2) The commerce platform may use the merchant’s timezone for windowing while this card uses UTC, producing slight boundary differences. Confirm by aligning timezones and definitions; usually the gap closes to 1-2%.

Should I tune the severity factors based on this card?
Yes, after 5+ incidents of each severity. With <5 incidents the sample is too small. Once you have a meaningful sample, compare the average measured drop per severity vs the formula’s traffic-loss percentage. If your SEV-1s average a 60% drop and the formula assumes 35%, increase the SEV-1 traffic-loss to 60%. The live cards will then estimate more accurately.

Datadog says incidents are resolved but my dashboard still shows degradation.
Three possible causes: (1) Datadog incident was closed prematurely (the underlying issue persisted); (2) The commerce-sibling order webhook is processing the backlog of orders that came in during the incident, making post-incident metrics look slow to recover; (3) Customer-side caching: shoppers who experienced errors are slow to retry. The bar in this card uses the closed-incident timestamps; if reality differs, the engineering team should re-open the incident.

What is the “10% drop” alert trigger for?
The card is post-incident. The trigger flags incidents that produced more than 10% conversion drop and surfaces them for post-incident review. Incidents below 10% drop are acknowledged but not flagged; above 10% they trigger an automatic post-mortem template in the engineering tooling. The threshold is calibrated to filter noise; below 10% the drop is often within statistical confidence and may not be incident-driven at all.

My Logs API returns 400 No valid indexes. Does this card still work?
Yes. The card consumes incident state and commerce-sibling KPI; both are independent of Logs.

My commerce platform is connected but GA4 is not. Does this card work?
Partially. The conversion-rate denominator falls back to commerce-platform-native sessions (Shopify Online Store sessions, BigCommerce visit count). These are typically lower than GA4 sessions because they exclude bots and untracked visitors, which makes the apparent conversion rate higher. The percentage drop is largely unaffected because both periods use the same denominator source. For absolute conversion rate, connect GA4.

My incident was 3 minutes long. Why is it excluded?
Statistical noise dominates at very short windows. A 3-minute incident may have only 2-3 minutes of impacted shopper traffic, which is too small a sample for a reliable conversion-rate calculation. Vortex IQ excludes incidents shorter than 5 minutes from the bar chart. To see them, use the Datadog Incident UI directly.

Can I see incident-cost rolled up to a quarterly or annual view?
Yes. The Vortex IQ Incident History page (Settings → Datadog → Incident History) aggregates incident costs by quarter and year using the same data that drives this card. Useful for board reports and engineering investment decisions (“we spent $43K on incidents last year; investing $50K in deploy-safety tooling pays back in 14 months”).
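
The FAQ's severity-tuning guidance (compare the average measured drop per severity with the formula's traffic-loss percentage, but only once there are at least five incidents of that severity) could be sketched like this. The function name and the sample data are hypothetical; only the five-incident minimum comes from the FAQ.

```python
from collections import defaultdict
from statistics import mean

MIN_SAMPLE = 5  # the FAQ's minimum number of incidents per severity


def suggested_traffic_loss(measured_drops, min_sample=MIN_SAMPLE):
    """Average measured drop (%) per severity, skipping under-sampled severities.

    measured_drops: iterable of (severity, drop_pct) pairs, one per incident.
    """
    by_severity = defaultdict(list)
    for severity, drop_pct in measured_drops:
        by_severity[severity].append(drop_pct)
    return {
        severity: mean(drops)
        for severity, drops in by_severity.items()
        if len(drops) >= min_sample
    }


# Hypothetical history: five SEV-1s (enough to tune) and three SEV-2s (not enough)
history = [("SEV-1", d) for d in (58, 62, 60, 61, 59)] + [("SEV-2", d) for d in (20, 26, 23)]
print(suggested_traffic_loss(history))  # {'SEV-1': 60}
```

Severities with fewer than five incidents are left out of the result rather than tuned on a small sample.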

Tracked live in Vortex IQ Nerve Centre

Conversion Drop During Incidents is one of hundreds of KPI pulses Vortex IQ tracks across Datadog and 70+ other ecommerce connectors. Nerve Centre runs the detection layer; Vortex Mind investigates the cause when something moves; Ask Viq lets you interrogate any number in plain English. Start for free or book a demo to see this metric running on your own data.
