Card class: Hero
Category: Monitoring

At a glance

Count of currently open New Relic Alert incidents at CRITICAL or HIGH priority. The single number that answers “is anything actively broken right now?” Zero means quiet; anything above zero is a duty-engineer signal.
What it counts: Number of incidents with state triggered or acknowledged (not yet closed) at priority CRITICAL or HIGH in New Relic Alerts. Each violating condition produces an incident; correlated incidents may be grouped into an issue by Applied Intelligence (the issue contains 1 to N incidents).
NerdGraph endpoint: NerdGraph actor.account.aiIssues.issues(filter: {states: [ACTIVATED]}) { issues { issueId, priority, conditionFamilyId, eventType } }. The card returns count(issues WHERE priority IN ['CRITICAL', 'HIGH']).
Metric basis: Live count, not a window aggregate. Updates every 30s on the standard sync cadence; near-real-time on Webhook-driven Pro plans.
Aggregation window: Real-time. The number flips up the moment an incident opens and flips down the moment it closes (or is acknowledged-then-closed).
Severity threshold: CRITICAL and HIGH only by default. MEDIUM and LOW are excluded because they typically encode “informational” rather than “customer-affecting” rules. To include all severities, override the manifest scope.
Filtered hosts / services: All entities in scope. Restrict via filter: { entityGuids: [...] } per-merchant during onboarding to focus on storefront, checkout, and payment services.
Browser vs APM scope: Browser, APM, Infrastructure, Synthetics, and Logs all contribute. The count is the union across all NR product lines. To restrict to APM-only, filter eventType = 'APPLICATION'.
Sample basis: Not sampled. Incidents are first-class entities; every triggered condition produces exactly one incident regardless of event-stream sampling.
Time zone: Real-time count is timezone-independent. Incident createdAt and closedAt follow UTC in NerdGraph.
Time window: RT
Alert trigger: >0 CRITICAL/HIGH (any open incident at the two highest severities). Flips green-to-red the moment a P1/P2 fires.
Roles: owner, engineering, operations

Calculation

Calculated automatically from your New Relic data. See the At a glance summary above for what the metric tracks and the worked example below for a typical reading.
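
As a rough illustration of the counting rule summarised in At a glance, the sketch below filters a list of already-fetched issues by priority. It is a minimal sketch, not the card's actual implementation: the function name, the dictionary shape, and the extra "iss-example" row are assumptions made for illustration, and fetching the issues from NerdGraph is left out.

```python
from typing import Iterable, Mapping

# Default severity scope described in "At a glance".
DEFAULT_PRIORITIES = frozenset({"CRITICAL", "HIGH"})


def count_active_incidents(
    issues: Iterable[Mapping[str, str]],
    priorities: frozenset = DEFAULT_PRIORITIES,
) -> int:
    """Count issues whose priority falls inside the configured scope.

    `issues` is assumed to be the list of currently activated issues,
    each carrying at least a "priority" key (hypothetical shape).
    """
    return sum(1 for issue in issues if issue.get("priority") in priorities)


# Illustrative data only; "iss-example" is a made-up MEDIUM issue.
sample = [
    {"issueId": "iss-7af2c1", "priority": "CRITICAL"},
    {"issueId": "iss-7af2d4", "priority": "HIGH"},
    {"issueId": "iss-7af2e0", "priority": "HIGH"},
    {"issueId": "iss-example", "priority": "MEDIUM"},
]

print(count_active_incidents(sample))  # 3
```

Passing a wider set of priorities to the same function mimics overriding the manifest scope to include all severities.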

Worked example

A BigCommerce + Cloud Run storefront, monitored by New Relic across 8 services and 22 hosts. The card shows 3 at 11:42 on 02 May 26. The duty engineer expands the card and sees:
Issue ID | Priority | Service | Condition | Open since
iss-7af2c1 | CRITICAL | checkout-api | error rate > 5% for 5m | 11:38 (4m ago)
iss-7af2d4 | HIGH | checkout-api | p95 latency > 2000ms for 5m | 11:39 (3m ago)
iss-7af2e0 | HIGH | payment-bridge | DB connection pool > 90% for 3m | 11:40 (2m ago)
Three incidents, but Applied Intelligence has correlated them into one issue family because they share the checkout-api entity and started within a 4-minute window. Reading: a single root cause is producing three symptom-incidents.

Conversion impact translation: checkout-api is the most revenue-critical path. With error rate at 5%+ and p95 over 2s, the customer experience on checkout is materially degraded. Mapping to revenue:
  • Healthy checkout completion rate: ~62% of carts.
  • During this incident: roughly 30% drop in completion (customers who hit a 5xx or wait >5s typically abandon).
  • Affected customers: ~140/min in business-hours traffic.
  • Lost completions: 140 x 0.62 x 0.30 = ~26 lost orders/min.
  • Average checkout AOV: £95.
  • Exposure: ~£2,470/min of risked GMV.
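
The per-minute arithmetic in the list above can be reproduced with a short sketch. The function and parameter names are illustrative assumptions; the inputs are the figures quoted in the list, and lost orders are rounded to a whole number before pricing, as the list does.

```python
def exposure_per_minute(
    affected_per_min: float,
    healthy_completion: float,
    completion_drop: float,
    aov_gbp: float,
) -> tuple:
    """Return (lost orders per minute, risked GMV per minute in GBP)."""
    # Customers who would normally complete, times the share now abandoning.
    lost_orders = round(affected_per_min * healthy_completion * completion_drop)
    return lost_orders, lost_orders * aov_gbp


lost, gmv = exposure_per_minute(140, 0.62, 0.30, 95)
print(lost, gmv)  # 26 2470
```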
If the engineer takes 18 minutes from alert to mitigation (typical for a deploy rollback), total exposure is ~£44k. If the incident covers an evening peak (3x traffic), exposure scales proportionally to ~£132k.

The composite Operational Health Score drops by 60 points (3 incidents x 20-point amplifier) plus knock-on Apdex / error-rate degradation, taking it from a baseline 87 to roughly 25, well below the 70 alert threshold and clearly visible to non-engineers in real time. This is the value of the count-of-incidents reading: a single number a CEO can interpret, even though three engineers are working three different angles of the same root cause.

When the issue resolves at 11:54, the count flips back to 0 and the score recovers within 5 minutes (Apdex / error-rate are 5-min rolling windows).
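
The duration scaling and the score deduction in this worked example can be sketched as follows. This is a simplified model under stated assumptions: the function names are hypothetical, the score is floored at 0 (an assumption, not something the example states), and the knock-on Apdex / error-rate degradation is not modelled, so the score lands slightly above the "roughly 25" quoted. The unrounded peak figure also comes out a little above the rounded ~£132k.

```python
def total_exposure(per_minute_gbp: int, minutes_to_mitigate: int,
                   traffic_multiplier: int = 1) -> int:
    """Risked GMV over the whole incident, scaled linearly with traffic."""
    return per_minute_gbp * minutes_to_mitigate * traffic_multiplier


def score_after_incidents(baseline: int, open_incidents: int,
                          deduction_per_incident: int = 20) -> int:
    """Apply the per-incident deduction; flooring at 0 is an assumption."""
    return max(0, baseline - open_incidents * deduction_per_incident)


print(total_exposure(2470, 18))      # 44460  (~£44k)
print(total_exposure(2470, 18, 3))   # 133380 (evening peak, 3x traffic)
print(score_after_incidents(87, 3))  # 27, before knock-on degradation
```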

Sibling cards merchants should reference together

Card | Why pair it with Active Incidents
Currently Triggered Conditions | Granular view: each condition (vs each grouped issue). Open this when the count is >1 to see which underlying conditions are firing.
Alerts Firing | Flat list of firing alerts (not de-duplicated). Useful when an issue family bundles unrelated incidents and the granular detail matters.
Mean Time To Acknowledge | Outcome metric. If active incidents stay >0 for long stretches without acknowledgement, MTTA reveals a paging-tooling problem.
Mean Time To Resolve | Outcome metric. Pair to see whether incidents are short-and-sharp (good) or long-and-grinding (bad).
Operational Health Score | Composite parent. Each open P1 deducts 20 points from the score.
Datadog Active Incidents | Cross-connector peer. Two platforms typically agree within +/- 1 because most incidents have peer alerts on both.
PagerDuty Open Incidents | Downstream incident-management view. NR fires the alert; PD escalates. Discrepancies indicate routing problems.
Shopify Sales / Min | Revenue-side outcome. Watch sales/min co-move with incident count to quantify customer impact.

Reconciling against the vendor’s own dashboard

Where to look in New Relic:

Why our number may legitimately differ from New Relic’s own screens:
Reason | Direction of divergence
Issue grouping latency. Applied Intelligence groups correlated incidents into issues; grouping happens 30 to 90s after individual incidents fire. Vortex IQ count may briefly show 3 unrelated issues that AI later collapses into 1. | Vortex IQ may briefly read higher
Sync cadence. Vortex IQ syncs every 30s by default. NR’s UI is real-time. During the 30s gap a fast-resolved incident may show in NR’s history but never appear in Vortex IQ. | Vortex IQ may miss flutter-incidents
Account timezone vs UTC. Open-since timestamps in this card are UTC; NR UI follows account timezone. Doesn’t affect the count, only the displayed timestamps. | Display only
NerdGraph rate limits. Default 3,000 points / minute / account. The aiIssues query is cheap (<5 points) so rate-limiting is rare; under heavy investigation it can stale by 60s. | Stale, not wrong
Severity scope. Vortex IQ counts CRITICAL + HIGH by default; NR’s “Open issues” view shows all severities including MEDIUM and LOW. Numbers differ accordingly. | NR usually higher than Vortex IQ
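
To make the sync-cadence row concrete, the sketch below lists which polls would observe an incident given its open and close times. It is a toy model under loud assumptions: polls are taken to fire at exact multiples of the interval starting from second 0 (the real poll phase is not documented here), times are whole seconds, and the function name is hypothetical.

```python
def polls_that_see(opened_at_s: int, closed_at_s: int,
                   interval_s: int = 30) -> list:
    """Return the poll times (seconds) at which the incident is still open.

    The incident is treated as open on the half-open span
    [opened_at_s, closed_at_s).
    """
    # Round the open time up to the next poll tick (integer ceiling).
    first_poll = -(-opened_at_s // interval_s) * interval_s
    return list(range(first_poll, closed_at_s, interval_s))


print(polls_that_see(5, 20))  # [] : opened and closed between two polls
print(polls_that_see(5, 65))  # [30, 60]
```

An empty list corresponds to the flutter-incident case: the incident appears in NR’s history but is never counted by a 30s poller.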
Cross-connector reconciliation: NR Alerts and Datadog Monitors are independent alerting systems. A merchant with both connectors typically has peer alerts on each (the same condition wired to both platforms), so the active incident counts should track within +/- 1 most of the time. Persistent gaps of 3+ are diagnostic: the most common cause is one platform missing an alert that fires on the other (a recently-added Datadog monitor with no NR equivalent, or vice versa). Audit the alert inventory on both sides.

NR Alerts vs PagerDuty open-incidents: NR fires, PD escalates. PD count should equal NR count plus any alerts routed from other sources (e.g., status pages, log management tools). A NR open count of 3 with PD showing 5 is normal if 2 of the PD incidents come from non-NR sources; PD showing 1 when NR shows 3 is a routing problem (NR alerts not reaching PD).
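
The reconciliation rules above can be expressed as two small checks. This is a sketch under assumptions: the function names and return labels are invented for illustration, a gap of exactly 2 is labelled "watch" because the text gives no rule for it, and a single snapshot cannot establish that a gap is persistent, so the result should be read over repeated samples.

```python
def compare_nr_datadog(nr_count: int, dd_count: int) -> str:
    """Classify the gap between New Relic and Datadog open counts."""
    gap = abs(nr_count - dd_count)
    if gap <= 1:
        return "in agreement"
    if gap >= 3:
        return "audit alert inventories"
    return "watch"  # gap of 2: no rule given in the text (assumption)


def compare_nr_pagerduty(nr_count: int, pd_count: int,
                         pd_non_nr_count: int = 0) -> str:
    """PD should equal the NR count plus incidents from non-NR sources."""
    expected = nr_count + pd_non_nr_count
    if pd_count == expected:
        return "consistent"
    if pd_count < expected:
        return "possible routing problem"
    return "unattributed PagerDuty incidents"


print(compare_nr_datadog(3, 3))       # in agreement
print(compare_nr_pagerduty(3, 5, 2))  # consistent
print(compare_nr_pagerduty(3, 1))     # possible routing problem
```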

Known limitations / merchant FAQs

NR vs Datadog: should I configure alerts on both?
Yes if both are connected. Peer alerts give you redundancy when one platform’s alerting backend has an outage (rare, but it happens). The cost is alert-noise discipline: you must keep alert definitions in sync, or the two platforms drift apart and the comparison loses meaning.

Apdex math: does an Apdex breach trigger an incident?
Only if you’ve configured an Apdex condition. NR doesn’t auto-create incidents from low Apdex; you create a condition in Alerts that says “incident if Apdex < 0.7 for 5 minutes”. Many teams forget this and rely on error-rate / latency alerts alone, missing the satisfaction-weighted signal. We recommend at least one Apdex condition per critical service.

NRQL retention vs incident retention: how long can I look back?
NRQL Transaction data has 8-day full-resolution retention on standard plans. Incidents are first-class entities in NerdGraph and retained 13 months by default, regardless of the underlying event retention. So you can review incident history (this card’s count over time) for a year even when the raw data is rolled up.

NR and Datadog disagree: NR shows 3 incidents, DD shows 1, who’s right?
Both. The most common cause is a difference in alert-condition coverage. NR may have 3 separate conditions (error rate, latency, DB pool) all firing on the same root cause; DD may have a single composite condition that bundles those signals into one incident. Neither is wrong; the counting policy differs. Compare condition inventories; align them if a single-number reading across both platforms matters to you.

Sampling: are incidents sampled?
No. Incidents are first-class entities; every triggered condition produces exactly one incident regardless of event-stream sampling. Even on heavily-sampled accounts, the count is accurate.

Multi-account: I have a US and EU NR account, can I see one combined count?
Vortex IQ reads one NR account per integration. Connect each as a separate integration and stack the cards in the Nerve Centre; the stack panel sums correctly across accounts.

Ingest cost vs visibility tradeoff: do incident alerts cost ingest?
Alert evaluation runs against ingested events (Transaction, Logs, Infrastructure, etc.) so the underlying ingest cost matters, but the alerting layer itself is free on standard NR plans. Reducing ingest by sampling does NOT reduce alert accuracy if you’re alerting on rates (sample-corrected) rather than absolute counts. Avoid alerting on raw counts of sampled events; alert on the percentage instead.

Alert tuning playbook: my count flutters between 0 and 1 every few minutes, how do I quiet it?
The flutter source is almost always a single condition with a too-tight threshold. Open Recently Flapped Conditions to identify it. Three fixes in order: (a) add a “for at least 3 minutes” clause to the condition; (b) raise the threshold by 10 to 20% if the baseline is in noise territory; (c) use NR’s adaptive baseline condition type instead of a fixed threshold (auto-tunes to the service’s normal behaviour).

One issue groups 5 incidents, does the count show 1 or 5?
It shows 1. The card counts issues (the AI-grouped containers), not raw incidents. This matches how a duty engineer reads the alerting screen: one issue means one root cause to investigate, even if it surfaced through five different conditions. To see the raw count, use Alerts Firing.
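
The difference between the issue count and the raw incident count can be shown with a tiny sketch. The data shape, the "incidentIds" field name, and the incident identifiers are hypothetical, chosen only to illustrate one issue grouping five incidents.

```python
# One AI-grouped issue containing five incidents (illustrative data only).
issues = [
    {
        "issueId": "iss-7af2c1",
        "incidentIds": ["inc-1", "inc-2", "inc-3", "inc-4", "inc-5"],
    },
]

# What this card reports: one entry per grouped issue.
issue_count = len(issues)

# What a flat, non-de-duplicated view would report.
incident_count = sum(len(issue["incidentIds"]) for issue in issues)

print(issue_count, incident_count)  # 1 5
```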

Tracked live in Vortex IQ Nerve Centre

Active Incidents is one of hundreds of KPI pulses Vortex IQ tracks across New Relic and 70+ other ecommerce connectors. Nerve Centre runs the detection layer; Vortex Mind investigates the cause when something moves; Ask Viq lets you interrogate any number in plain English. Start for free or book a demo to see this metric running on your own data.