At a glance
The post-incident measured conversion-rate drop during Datadog incident windows, compared to the same hour-of-week baseline from the prior 90 days. Where Revenue at Risk and Revenue Lost / Min show estimated loss while incidents are open, this card shows what the loss actually was once data is in. The reconciliation card.
| API endpoints touched | Datadog Incidents (/api/v2/incidents) for incident-window timestamps and severity; commerce-sibling KPI for orders/min and sessions/min during the window vs the baseline. |
| Metric basis | (baseline_conversion_rate − incident_conversion_rate) / baseline_conversion_rate × 100. Conversion rate = orders / sessions for the same window; baseline is the 90-day average for the same hour-of-week. |
| Aggregation window | Per-incident (one bar per resolved incident). The card displays the last 90 days of incidents as a bar chart. |
| Severity threshold | Bars colour-coded by Datadog severity: SEV-1 = red, SEV-2 = amber, SEV-3 = grey. The 10% drop alert threshold applies regardless of severity. |
| Alert pre-filtering | Test incidents excluded; incidents shorter than 5 minutes excluded (statistical noise dominates); incidents where session data is missing for either window excluded. |
| Log Management gating | Not used. The card consumes incident state and commerce-sibling KPI; both are independent of Logs. |
| Why post-incident, not live | Live conversion rate during a short incident is statistically noisy: a 15-minute incident at typical merchant traffic produces wide confidence intervals. Post-incident measurement waits for the dust to settle and for the orders to flow through to the commerce platform’s API, then computes the drop with stable inputs. Use this card for institutional learning; use Revenue Lost / Min for live decisions. |
| Commerce-sibling required | Required. Conversion rate needs orders (numerator) and sessions (denominator). Without a commerce-sibling and a sessions-source connected, the card displays “Connect a commerce platform and GA4 to enable this card”. |
| Time zone | UTC for cross-connector arithmetic; baseline uses the same hour-of-week from the prior 90 days. |
| Time window | 90D (rolling 90 days, one bar per incident) |
| Alert trigger | > 10% drop: an incident producing a conversion drop of more than 10% is flagged for post-incident review. |
| Roles | owner, marketing |
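The Metric basis and Time zone rows can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the card's actual implementation: the function names and the order and session counts are hypothetical, and only the formula, the UTC hour-of-week bucketing and the 10% trigger come from the table above.

```python
from datetime import datetime, timezone


def conversion_rate(orders: int, sessions: int) -> float:
    """Orders divided by sessions for one window."""
    if sessions <= 0:
        # The card excludes incidents whose session data is missing.
        raise ValueError("sessions must be positive")
    return orders / sessions


def hour_of_week(ts: datetime) -> int:
    """UTC hour-of-week bucket: 0 = Monday 00:00, 167 = Sunday 23:00."""
    ts_utc = ts.astimezone(timezone.utc)
    return ts_utc.weekday() * 24 + ts_utc.hour


def conversion_drop_pct(baseline_rate: float, incident_rate: float) -> float:
    """(baseline - incident) / baseline x 100, as in the Metric basis row."""
    return (baseline_rate - incident_rate) / baseline_rate * 100


# Hypothetical counts for one incident window and its hour-of-week baseline.
baseline = conversion_rate(orders=96, sessions=5000)
incident = conversion_rate(orders=20, sessions=4650)
drop = conversion_drop_pct(baseline, incident)
flagged = drop > 10  # the card's alert trigger

# Baseline bucket used for an incident that started on this timestamp.
bucket = hour_of_week(datetime(2026, 4, 25, 14, 30, tzinfo=timezone.utc))
```

In a fuller version the baseline rate would be averaged over every sample in the prior 90 days that shares the incident's `hour_of_week` bucket.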
Calculation
Calculated automatically from your Datadog data. See the At a glance summary above for what the metric tracks and the worked example below for a typical reading.
Worked example
A US specialty foods brand on BigCommerce with Datadog APM. The card shows the last 90 days of resolved incidents (showing the most-recent eight).

| Date | Incident | Severity | Duration | Baseline conv | Incident conv | Drop |
|---|---|---|---|---|---|---|
| 25 Apr 26 | INC-2841 Checkout payment-pool | SEV-1 | 47 min | 1.92% | 0.43% | -78% |
| 18 Apr 26 | INC-2812 Search latency | SEV-2 | 92 min | 1.88% | 1.51% | -20% |
| 09 Apr 26 | INC-2785 Stripe upstream 502 | SEV-1 | 78 min | 1.86% | 0.52% | -72% |
| 02 Apr 26 | INC-2759 DB pool exhaustion | SEV-2 | 35 min | 1.91% | 1.42% | -26% |
| 28 Mar 26 | INC-2724 CDN cache miss storm | SEV-3 | 26 min | 1.89% | 1.79% | -5% |
| 19 Mar 26 | INC-2698 Recommendations slow | SEV-3 | 41 min | 1.88% | 1.81% | -4% |
| 11 Mar 26 | INC-2671 Shopify webhook backlog | SEV-2 | 58 min | 1.87% | 1.44% | -23% |
| 04 Mar 26 | INC-2645 Search cluster restart | SEV-3 | 18 min | 1.93% | 1.86% | -4% |
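
The Drop column can be re-derived from the two conversion-rate columns. The sketch below copies the rates from the table and applies the Metric basis formula and the 10% alert trigger; the variable and function names are illustrative only.

```python
# (incident, severity, baseline conv %, incident conv %) copied from the table.
ROWS = [
    ("INC-2841", "SEV-1", 1.92, 0.43),
    ("INC-2812", "SEV-2", 1.88, 1.51),
    ("INC-2785", "SEV-1", 1.86, 0.52),
    ("INC-2759", "SEV-2", 1.91, 1.42),
    ("INC-2724", "SEV-3", 1.89, 1.79),
    ("INC-2698", "SEV-3", 1.88, 1.81),
    ("INC-2671", "SEV-2", 1.87, 1.44),
    ("INC-2645", "SEV-3", 1.93, 1.86),
]

ALERT_THRESHOLD_PCT = 10


def drop_pct(baseline: float, incident: float) -> float:
    """(baseline - incident) / baseline x 100."""
    return (baseline - incident) / baseline * 100


# Rounded drop per incident, matching the table's whole-percent display.
drops = {inc: round(drop_pct(base, conv)) for inc, _, base, conv in ROWS}

# Incidents above the 10% trigger are flagged for post-incident review.
flagged = [
    inc for inc, _, base, conv in ROWS
    if drop_pct(base, conv) > ALERT_THRESHOLD_PCT
]

for inc, sev, _, _ in ROWS:
    marker = "review" if inc in flagged else "-"
    print(f"{inc}  {sev}  -{drops[inc]}%  {marker}")
```

All five SEV-1 and SEV-2 incidents clear the 10% trigger; the three SEV-3 incidents do not.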
- The empirical relationship between severity and conversion drop. The data confirms what the severity-factor formula assumes: SEV-1 incidents produce 60-80% conversion drops; SEV-2 produces 15-30%; SEV-3 produces 3-8%. This brand’s pattern is consistent with the default formula. If a brand’s pattern diverges (e.g. SEV-2s consistently producing 50%+ drops), the formula’s traffic-loss percentages should be tuned upward.
- The two SEV-1s in 90 days were both upstream-driven. INC-2841 was payment-pool exhaustion; INC-2785 was Stripe’s PSP layer. Neither was a code bug in the merchant’s own services. Action insight: invest in upstream-failure resilience (payment-PSP fallbacks, circuit breakers around third-party calls, queue-based decoupling) rather than in code-quality alone.
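- Cumulative quarterly cost. Costing each incident from its measured drop and duration: INC-2841 (78% drop, 47 min): $2,930; INC-2812 (20%, 92 min): $1,398; INC-2785 (72%, 78 min): $4,490; INC-2759 (26%, 35 min): $691. Including the remaining four incidents, the total is roughly $10,710 in 90 days. Annualised, that is roughly $43,000 of incident cost. Material.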
- Use this card for post-incident reviews and quarterly planning, not for live decisions. Live decisions need Revenue Lost / Min. Post-incident reviews need this card. Both feed institutional learning; their roles are different.
- The bar chart visualises severity-vs-impact. SEV-1 = tall red bars; SEV-2 = medium amber; SEV-3 = short grey. Patterns to watch: (1) Are SEV-3s producing larger-than-expected drops? Probably mis-classified severities. (2) Are SEV-2s clustering on a specific service? Probably an under-invested area. (3) Is the trend rising over the 90-day window? Probably a new class of regression introduced recently.
- The card calibrates the live cards. Compare this card’s measured drops to the live cards’ estimates after each incident; if they consistently diverge, tune the severity factors. The estimates exist to guide live decisions; this card exists to keep the estimates honest.
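
The arithmetic behind the Cumulative quarterly cost bullet can be checked as follows. The four itemised amounts and the 10,710 figure are taken from that bullet, treating 10,710 as the 90-day total (which is what the roughly 43,000 annualised figure implies). The share attributed to the remaining four incidents is derived here by subtraction, and multiplying by four 90-day periods per year is an assumption about how the annualised figure was produced.

```python
# Itemised incident costs as listed in the bullet.
top_four_costs = {
    "INC-2841": 2930,
    "INC-2812": 1398,
    "INC-2785": 4490,
    "INC-2759": 691,
}

# Treated as the 90-day total across all eight incidents (see lead-in).
stated_90_day_total = 10710

top_four_total = sum(top_four_costs.values())

# Derived, not stated in the source: what the other four incidents add.
implied_remaining = stated_90_day_total - top_four_total

# Assumption: four 90-day periods per year.
annualised = stated_90_day_total * 4

print(top_four_total, implied_remaining, annualised)
```

The two SEV-1 incidents account for most of the total, which is consistent with the upstream-resilience point above.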
Sibling cards merchants should reference together
| Card | Why pair it with Conversion Drop During Incidents | What the combination tells you |
|---|---|---|
| Revenue at Risk (live) | The live-estimated peer. | Estimated vs measured comparison drives calibration of severity factors. |
| Revenue Lost / Min | The per-minute live counterpart. | Use Revenue Lost / Min for live decisions; this card for post-incident review. |
| Active Incidents | The state stream that drives the bar chart. | Each bar in this card corresponds to one incident. |
| Cart Abandonment During 5xx Spikes | Mechanism: how shoppers translated incidents into lost orders. | High abandonment plus high conversion drop equals “shoppers who started checkout could not finish”; low abandonment plus high conversion drop equals “shoppers never reached checkout”. |
| Checkout Service Health × Sales | The latency-vs-orders dual-axis. | Confirms the timing of the conversion drop within each incident window. |
| Operational Health Score | The composite that aggregates incident severity. | Composite drops correspond to the bars on this card; the timing is the same. |
| GA4 Sessions | The denominator source for conversion rate. | If GA4 sessions during the incident were also reduced, the conversion rate denominator was smaller, which can mask or amplify the drop. |
| Shopify / BC / Adobe Conversion Rate | The platform-side conversion rate. | Validates this card’s incident-window numbers against the merchant’s normal conversion-rate dashboard. |
Reconciling against the vendor’s own dashboard
Where to look in Datadog: Datadog does NOT compute or display Conversion Drop During Incidents; this card is a Vortex IQ-only synthesis. The component inputs come from:

- Incidents for the resolved-incident timestamps and severity (the bar-chart x-axis).
- Service Catalog for the service the incident affected.

The conversion-rate side of this card comes from the connected commerce platform’s KPI; open that platform’s analytics for the same incident windows.

Why our number may legitimately differ from a hand-computed estimate:
| Reason | Direction | Why |
|---|---|---|
| Time zone alignment | Either | Baseline uses same hour-of-week in UTC; hand calculations using different timezones produce different baselines. |
| API rate limits | Brief gaps | Datadog Incidents and commerce-sibling APIs are rate-limited; cached values may be 1-2 minutes stale at the start of incident reconciliation. |
| Log indexing latency | Not applicable | This card does not consume logs. |
| Session-source choice | Either | Default uses GA4 sessions for the conversion-rate denominator; if GA4 is unavailable, the engine falls back to commerce-platform internal sessions, which are typically lower (no bots, no non-tracked visits). The choice affects the absolute conversion rate but not the percentage drop. |
| Short-incident exclusion | Vortex IQ shorter list | Incidents under 5 minutes are excluded from the chart because statistical noise dominates; Datadog UI shows them. |

| Card | Expected relationship | What causes the divergence |
|---|---|---|
| shopify.total_revenue / bigcommerce.total_revenue / adobe_commerce.total_revenue | Conversion drop multiplied by orders/min equals the revenue impact. | Confirms or contradicts the live estimates. |
| google_analytics.ga_sessions | The denominator source. | If GA4 sessions were also reduced during the incident (e.g. GA4 tag-fire failure), the conversion rate denominator is smaller, which biases the apparent drop. Pair with GA4 Property Health to confirm GA4 was healthy during the window. |
| stripe.stripe_payment_health_score | Payment-PSP cascade peer. | Conversion drops co-occurring with payment-health drops indicate the cause was payment-side (e.g. INC-2785 in the worked example was a Stripe upstream issue). |
Known limitations / merchant FAQs
Why does this card show 90 days, not just recent incidents?
Because individual incidents are noisy and the pattern only emerges across multiple incidents. With 90 days of bars, you can spot trends (“our SEV-2s used to cost 15% conversion, now they cost 25%, what changed?”), repeated services (“60% of our SEV-1s are payment-related, invest there”), and severity calibration (“our SEV-3s cost more than our formula assumes, retune”). Single incidents don’t tell the story; 90 days of incidents do.
My incident is over but it has not appeared on the bar chart yet. Why?
Three usual delays: (1) The incident has not been formally resolved in Datadog Incident Management (still in stable state); (2) The commerce-sibling order webhook is still flushing; this card waits 30 minutes after incident close to ensure the order data is complete; (3) The session denominator (GA4 or platform-native) is still flushing. After 30 minutes post-resolution, the bar should appear; longer means investigate the upstream data sources.
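The waiting rules in this answer can be expressed as a small check. This is a sketch under assumptions: the function name, the status strings and the `"resolved"` state label are illustrative, and only the 30-minute settle delay comes from the text above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# The card waits 30 minutes after incident close before drawing the bar.
SETTLE_DELAY = timedelta(minutes=30)


def bar_status(state: str, resolved_at: Optional[datetime], now: datetime) -> str:
    """Explain whether an incident's bar should be visible yet."""
    if state != "resolved" or resolved_at is None:
        return "waiting: incident not yet resolved in Datadog"
    if now - resolved_at < SETTLE_DELAY:
        return "waiting: order and session data still settling"
    return "ready"


resolved = datetime(2026, 4, 25, 15, 0, tzinfo=timezone.utc)
print(bar_status("stable", None, resolved))
print(bar_status("resolved", resolved, resolved + timedelta(minutes=10)))
print(bar_status("resolved", resolved, resolved + timedelta(minutes=45)))
```

If the status is still not `ready` well past the 30-minute mark, the upstream order and session sources are the next place to look.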
My commerce platform shows higher conversion drop than this card. Why?
Two reasons: (1) The commerce platform may compute conversion using a different denominator (some include POS sessions; this card defaults to web-only); (2) The commerce platform may use the merchant’s timezone for windowing while this card uses UTC, producing slight boundary differences. Confirm by aligning timezones and definitions; usually the gap closes to 1-2%.
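The timezone effect can be illustrated by bucketing the same instant in UTC and in a merchant-local zone. The sketch assumes a fixed UTC-5 offset for the merchant and a hypothetical incident start time; neither comes from the card's configuration.

```python
from datetime import datetime, timedelta, timezone, tzinfo

# Hypothetical merchant timezone, modelled as a fixed UTC-5 offset.
MERCHANT_TZ = timezone(timedelta(hours=-5))


def hour_of_week_in(ts: datetime, tz: tzinfo) -> int:
    """Hour-of-week bucket (0 = Monday 00:00) for ts viewed in tz."""
    local = ts.astimezone(tz)
    return local.weekday() * 24 + local.hour


# Hypothetical incident start: Saturday 02:15 UTC.
incident_start = datetime(2026, 4, 25, 2, 15, tzinfo=timezone.utc)

utc_bucket = hour_of_week_in(incident_start, timezone.utc)   # Saturday 02:00
local_bucket = hour_of_week_in(incident_start, MERCHANT_TZ)  # Friday 21:00

print(utc_bucket, local_bucket)
```

A hand calculation that labels this incident "Friday 9 pm" and then looks up Friday 21:00 in UTC-keyed data would use a baseline from a different hour than the card's, which is why aligning timezones is the first reconciliation step.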
Should I tune the severity factors based on this card?
Yes, after 5+ incidents of each severity. With <5 incidents the sample is too small. Once you have a meaningful sample, compare the average measured drop per severity vs the formula’s traffic-loss percentage. If your SEV-1s average a 60% drop and the formula assumes 35%, increase the SEV-1 traffic-loss to 60%. The live cards will then estimate more accurately.
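A calibration check along these lines might look like the sketch below. It uses the Drop values from the worked example and the five-incident minimum from this answer; the function name and the report layout are illustrative, not part of the product.

```python
from collections import defaultdict

MIN_SAMPLE = 5  # tune only after 5+ incidents of a given severity

# (severity, measured drop %) from the worked example table.
measured = [
    ("SEV-1", 78), ("SEV-2", 20), ("SEV-1", 72), ("SEV-2", 26),
    ("SEV-3", 5), ("SEV-3", 4), ("SEV-2", 23), ("SEV-3", 4),
]


def calibration_report(rows, min_sample=MIN_SAMPLE):
    """Mean measured drop per severity, plus whether the sample is big enough."""
    by_severity = defaultdict(list)
    for severity, drop in rows:
        by_severity[severity].append(drop)
    return {
        severity: {
            "n": len(drops),
            "mean_drop": sum(drops) / len(drops),
            "enough_data": len(drops) >= min_sample,
        }
        for severity, drops in sorted(by_severity.items())
    }


report = calibration_report(measured)
for severity, stats in report.items():
    print(severity, stats)
```

With two SEV-1s and three each of SEV-2 and SEV-3, none of the severities in the worked example reaches the five-incident minimum yet, so the averages are indicative only.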
Datadog says incidents are resolved but my dashboard still shows degradation.
Three possible causes: (1) The Datadog incident was closed prematurely (the underlying issue persisted); (2) The commerce-sibling order webhook is processing the backlog of orders that came in during the incident, making post-incident metrics look slow to recover; (3) Customer-side caching: shoppers who experienced errors are slow to retry. The bar in this card uses the closed-incident timestamps; if reality differs, the engineering team should re-open the incident.
What is the “10% drop” alert trigger for? The card is post-incident.
The trigger flags incidents that produced more than 10% conversion drop and surfaces them for post-incident review. Incidents below 10% drop are acknowledged but not flagged; above 10% they trigger an automatic post-mortem template in the engineering tooling. The threshold is calibrated to filter noise; below 10% the drop is often within statistical confidence and may not be incident-driven at all.
My Logs API returns 400 No valid indexes. Does this card still work?
Yes. The card consumes incident state and commerce-sibling KPI; both are independent of Logs.
My commerce platform is connected but GA4 is not. Does this card work?
Partially. The conversion-rate denominator falls back to commerce-platform-native sessions (Shopify Online Store sessions, BigCommerce visit count). These are typically lower than GA4 sessions because they exclude bots and untracked visitors, which makes the apparent conversion rate higher. The percentage drop is largely unaffected because both periods use the same denominator source. For absolute conversion rate, connect GA4.
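The reason the percentage drop is largely unaffected can be seen with a small calculation. The counts are hypothetical, and the sketch assumes the fallback source reports the same fraction of GA4 sessions (here 80%) in both the baseline and the incident window, so the factor cancels out of the ratio.

```python
def drop_pct(base_orders, base_sessions, inc_orders, inc_sessions):
    """Percentage conversion drop between a baseline and an incident window."""
    base_rate = base_orders / base_sessions
    inc_rate = inc_orders / inc_sessions
    return (base_rate - inc_rate) / base_rate * 100


# Hypothetical: platform-native sessions are 80% of GA4 sessions in both windows.
RATIO = 0.8

ga4_drop = drop_pct(96, 5000, 20, 4650)
platform_drop = drop_pct(96, 5000 * RATIO, 20, 4650 * RATIO)

ga4_baseline_rate = 96 / 5000
platform_baseline_rate = 96 / (5000 * RATIO)

print(round(ga4_drop, 1), round(platform_drop, 1))
print(ga4_baseline_rate, platform_baseline_rate)
```

If the ratio between the two session sources differs between the windows, for example because bot traffic spikes during the incident, the two drops will no longer match exactly.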
My incident was 3 minutes long. Why is it excluded?
Statistical noise dominates at very short windows. A 3-minute incident may have only 2-3 minutes of impacted shopper traffic, which is too small a sample for a reliable conversion-rate calculation. Vortex IQ excludes incidents shorter than 5 minutes from the bar chart. To see them, use the Datadog Incident UI directly.
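The exclusion rules (test incidents, incidents under 5 minutes, missing session data) can be sketched as a single eligibility check. The `Incident` fields and the function name are hypothetical; the three rules are the ones listed in the Alert pre-filtering row.

```python
from dataclasses import dataclass
from typing import Optional

MIN_DURATION_MIN = 5  # incidents shorter than this are excluded


@dataclass
class Incident:
    incident_id: str
    duration_min: float
    is_test: bool
    baseline_sessions: Optional[int]
    incident_sessions: Optional[int]


def exclusion_reason(inc: Incident) -> Optional[str]:
    """Return why an incident is left off the chart, or None if it qualifies."""
    if inc.is_test:
        return "test incident"
    if inc.duration_min < MIN_DURATION_MIN:
        return "shorter than 5 minutes"
    if not inc.baseline_sessions or not inc.incident_sessions:
        return "session data missing for one of the windows"
    return None


# Hypothetical incidents illustrating each rule.
examples = [
    Incident("A", 3, False, 5000, 300),
    Incident("B", 47, False, 5000, 4650),
    Incident("C", 20, True, 5000, 2000),
    Incident("D", 20, False, None, 2000),
]
for inc in examples:
    print(inc.incident_id, exclusion_reason(inc) or "included")
```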
Can I see incident-cost rolled up to a quarterly or annual view?
Yes. The Vortex IQ Incident History page (Settings → Datadog → Incident History) aggregates incident costs by quarter and year using the same data that drives this card. Useful for board reports and engineering investment decisions (“at roughly $43K a year in incident cost, $50K in deploy-safety tooling pays back in 14 months”).