Card class: Hero
Category: Monitoring

At a glance

The 95th-percentile response time across all instrumented services: 19 of every 20 shopper requests are at least this fast, but the slowest 1-in-20 may be slower. For a merchant, this is “for the unlucky 5% of shoppers, how long do they wait?” Above 1.5 seconds abandonment becomes measurable; above 3 seconds the page feels broken and shoppers abandon at high rates.
API endpoint: Datadog Metrics API, GET /api/v1/query with p95:trace.servlet.request{*} (or the runtime equivalent: trace.aspnet_core.request, trace.express.request, trace.flask.request, etc).
Metric basis: APM span duration percentile, NOT individual request timing. Datadog computes p95 from histograms sampled at the agent.
Aggregation window: 1-minute rollup at source; the card displays the rolling 5-minute p95 against the 7-day comparison window.
Severity threshold: P1 = p95 above 3,000 ms (revenue impact likely); P2 = p95 above 1,500 ms (alert trigger); P3 = p95 above 800 ms (warning).
Alert pre-filtering: Synthetic test traffic (@user_agent:Datadog/Synthetic) and health-check endpoints (/health, /ping, /metrics) are excluded by default. Without this, your synthetic test cadence dominates the percentile distribution.
Log Management gating: Not used. Latency is APM-derived; the card returns valid values regardless of whether Logs is enabled.
Why p95 and not average: Average latency hides the long tail. A p50 of 200 ms looks fine even when the slowest 5% is 8 seconds. p95 is the merchant-meaningful number because it captures the experience of the unlucky shoppers most likely to bounce.
Why p95 and not p99: p99 is too noisy at typical merchant traffic levels; a single GC pause or cold start skews the number. p99 is appropriate for sites above 100,000 req/min. See p99 Response Time.
Filtered hosts / services: All instrumented services in aggregate. For per-service breakdown see Top Slow Endpoints.
Time zone: Account timezone for chart axes; UTC for cross-connector windowing.
Time window: T/7D vsP (today vs prior 7-day average)
Alert trigger: > 1500ms, p95 above 1.5 seconds sustained for 5 minutes pages on-call.
Sentiment key: avg_response
Roles: owner, engineering
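
The severity thresholds and alert trigger above can be sketched in a few lines. This is an illustrative sketch, not Vortex IQ's implementation: the names `classify_severity` and `should_page` are hypothetical, and it assumes one p95 reading per minute, so that five consecutive readings above 1,500 ms stand in for "sustained for 5 minutes".

```python
from typing import Optional, Sequence

# Thresholds in milliseconds, taken from the Severity threshold row above.
P1_MS = 3000
P2_MS = 1500
P3_MS = 800


def classify_severity(p95_ms: float) -> Optional[str]:
    """Map a p95 reading to P1/P2/P3, or None if it is below every threshold."""
    if p95_ms > P1_MS:
        return "P1"
    if p95_ms > P2_MS:
        return "P2"
    if p95_ms > P3_MS:
        return "P3"
    return None


def should_page(per_minute_p95_ms: Sequence[float], sustain_minutes: int = 5) -> bool:
    """True if the most recent `sustain_minutes` readings all exceed the P2 threshold.

    Assumes one reading per minute (a hypothetical input shape).
    """
    recent = per_minute_p95_ms[-sustain_minutes:]
    return len(recent) == sustain_minutes and all(v > P2_MS for v in recent)
```

Under these assumptions a single spike above 1,500 ms classifies as P2 but does not page; only five consecutive readings above the threshold do.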

Calculation

Calculated automatically from your Datadog data. See the At a glance summary above for what the metric tracks and the worked example below for a typical reading.
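
For readers who want to reproduce the raw reading, the query named in At a glance could be assembled roughly as follows. This is a minimal sketch that only builds the request and does not send it. The function names are hypothetical, the host shown is the default Datadog site (other sites use a different host), and the synthetic-traffic and health-check exclusions are left out because the tag names involved depend on your setup.

```python
import time
from typing import Dict, Optional
from urllib.parse import urlencode

# Default Datadog site; an assumption, since accounts on other sites use another host.
DATADOG_API_BASE = "https://api.datadoghq.com"


def build_p95_query_url(metric: str = "trace.servlet.request",
                        window_seconds: int = 300,
                        now: Optional[int] = None) -> str:
    """Build a GET /api/v1/query URL for a rolling p95 window (5 minutes by default)."""
    end = int(time.time()) if now is None else int(now)
    params = {
        "from": end - window_seconds,   # epoch seconds
        "to": end,                      # epoch seconds
        "query": f"p95:{metric}{{*}}",  # e.g. p95:trace.servlet.request{*}
    }
    return f"{DATADOG_API_BASE}/api/v1/query?{urlencode(params)}"


def auth_headers(api_key: str, app_key: str) -> Dict[str, str]:
    """Headers the Datadog API expects for authenticated metric queries."""
    return {"DD-API-KEY": api_key, "DD-APPLICATION-KEY": app_key}
```

Swap the `metric` argument for the runtime equivalent (for example `trace.flask.request`) if your services are not servlet-based.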

Worked example

A UK supplements brand on Shopify with Datadog APM on the storefront, checkout, and search services. The team upgraded a third-party recommendations widget on 22 Apr 26 and missed that it added a 600 ms server-side fetch to every product-detail page.
| Hour (UTC) | p50 | p95 | p99 | Throughput |
| --- | --- | --- | --- | --- |
| 09:00 (before deploy) | 180 ms | 800 ms | 1,420 ms | 4,100 req/min |
| 11:00 (deploy at 10:50) | 210 ms | 1,200 ms | 2,100 ms | 4,250 req/min |
| 13:00 | 240 ms | 1,650 ms | 3,200 ms | 3,890 req/min |
| 15:00 | 260 ms | 1,890 ms | 3,800 ms | 3,520 req/min |
| 16:00 (rolled back) | 195 ms | 850 ms | 1,500 ms | 4,180 req/min |
Two stories the table tells. First, p50 barely moved (180 ms to 260 ms is invisible to most shoppers); the regression hit the tail, which is exactly what p95 is designed to surface. Second, throughput dropped from 4,100 to 3,520, a 14% decline. That is shoppers abandoning slow pages and not coming back. Same period: Shopify orders/min dropped from 32 to 24. Conversion rate dropped from 1.86% to 1.43%, a 23% relative drop.
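
The percentages in the paragraph above follow from a simple relative-drop calculation. A small sketch, using the figures quoted there (the helper name `relative_drop` is illustrative):

```python
def relative_drop(before: float, after: float) -> float:
    """Fractional decline from `before` to `after` (0.14 means a 14% drop)."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


throughput_drop = relative_drop(4100, 3520)  # req/min, 09:00 vs 15:00
conversion_drop = relative_drop(1.86, 1.43)  # conversion rate, in percent
conversion_drop_points = 1.86 - 1.43         # absolute drop, in percentage points

print(f"throughput: {throughput_drop:.0%}, conversion: {conversion_drop:.0%} "
      f"({conversion_drop_points:.2f} percentage points)")
```

Keeping the relative drop and the percentage-point drop as separate values avoids mixing the two when reporting conversion changes.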
Revenue impact (estimated):
  - Slow window: 11:00 to 16:00, 5 hours
  - Baseline orders/min: 32
  - Observed orders/min during slowdown: 26 (average across the window)
  - Lost orders ≈ (32 − 26) × 60 × 5 = 1,800 orders
  - At AOV £58: lost revenue ≈ £104,400
  - Cost of the recommendations widget: ~£300/month
  - Net impact: catastrophic
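
The estimate above is straightforward arithmetic and can be re-run with your own figures. A sketch, assuming a constant baseline and a single average observed rate across the window (the function name is illustrative):

```python
from typing import Tuple


def estimate_lost_revenue(baseline_orders_per_min: float,
                          observed_orders_per_min: float,
                          window_hours: float,
                          average_order_value: float) -> Tuple[float, float]:
    """Return (lost_orders, lost_revenue) for a slowdown window."""
    lost_per_min = baseline_orders_per_min - observed_orders_per_min
    lost_orders = lost_per_min * 60 * window_hours
    return lost_orders, lost_orders * average_order_value


# Figures from the worked example: 32 vs 26 orders/min, 5 hours, AOV of 58 (GBP).
lost_orders, lost_revenue = estimate_lost_revenue(32, 26, 5, 58)
print(lost_orders, lost_revenue)
```

This treats every missing order as lost rather than deferred, so it is best read as an upper-bound estimate.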
Three takeaways merchants should remember:
  1. p95 is the first metric to read when conversion drops without a clear cause. Average latency, error rate, and uptime all looked fine in this incident; only p95 told the story. The slowest 5% of shoppers are the ones most likely to bounce, and they are invisible to average-based metrics.
  2. 600 ms is the magic number. Industry research (Akamai, Google) consistently finds that every 100 ms of added p95 costs roughly 1% of conversion at typical ecommerce sites, so 600 ms predicts a drop of roughly 6%. This brand added 600 ms and fared worse: conversion fell from 1.86% to 1.43%, which is 0.43 percentage points in absolute terms and 23% in relative terms.
  3. Third-party fetches are the most common p95 regression cause: recommendations widgets, A/B test SDKs, chat widgets, review widgets, paid-social pixels, fraud-detection scripts. Each adds a small amount of latency individually; collectively they dominate the tail. Run Top Slow Endpoints once a quarter and audit anything in the top 10 that is third-party.
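
To make the first takeaway concrete, the sketch below computes mean, median and p95 over a synthetic sample. The data is invented for illustration and is not taken from the incident, and the nearest-rank method used here is one common percentile definition, not necessarily the one Datadog applies.

```python
import math
from statistics import mean, median
from typing import Sequence


def percentile_nearest_rank(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Synthetic sample: 94 fast requests at 200 ms and 6 slow ones at 8,000 ms.
durations_ms = [200] * 94 + [8000] * 6

avg_ms = mean(durations_ms)
p50_ms = median(durations_ms)
p95_ms = percentile_nearest_rank(durations_ms, 95)
print(avg_ms, p50_ms, p95_ms)
```

In this sample the median stays at the fast value and the mean only partly reflects the slow requests, while p95 lands squarely on the slow tail.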

Sibling cards merchants should reference together

| Card | Why pair it with p95 Latency | What the combination tells you |
| --- | --- | --- |
| p99 Response Time | The deeper-tail view. Above 100k req/min, p99 is the right percentile; below, p95 is more stable. | Both moving together equals general slowdown; only p99 moving equals isolated bad-luck requests. |
| Apdex Score | Apdex amalgamates p95 and error rate into a single shopper-perception number. | If Apdex is fine but p95 is high, your tolerance threshold may be too generous; tighten it. |
| Top Slow Endpoints | The breakdown card. When p95 jumps, this card tells you which endpoint is responsible. | One endpoint dominating equals a code-path regression; many endpoints equals an infrastructure regression. |
| Database Query Latency p95 | The most common cause of application p95 regression. | App p95 up + DB p95 up equals slow query or pool exhaustion; app up + DB flat equals upstream/CPU/lock contention. |
| Throughput (req/s) | Latency and throughput trade off. | Latency up + throughput down equals capacity-limited; latency up + throughput flat equals slowness without queue building. |
| Deploy Markers vs Latency | Overlay deploy events on the latency chart. | Almost every p95 regression aligns with a recent deploy. |
| Page Load p95 | Browser-side peer. APM measures server time; RUM measures end-to-end shopper time. | RUM p95 minus APM p95 equals network + browser render time; widening gap equals CDN, third-party, or asset bloat. |
| Shopify / BC / Adobe Total Revenue | The merchant-impact card. | Sustained p95 above 1,500 ms typically corresponds to a 5-15% revenue dip within an hour. |
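
The Database Query Latency p95 and Throughput rows above amount to a small decision table. A sketch of that logic, with the caveat that the function name is hypothetical and that deciding whether a metric counts as "up" or "down" is left to the caller:

```python
from typing import List


def diagnose_p95_regression(app_p95_up: bool,
                            db_p95_up: bool,
                            throughput_down: bool) -> List[str]:
    """Return the readings the sibling-card table suggests for a p95 regression."""
    if not app_p95_up:
        return []  # no application latency regression to explain

    hints = []
    # Database Query Latency p95 row.
    if db_p95_up:
        hints.append("slow query or pool exhaustion")
    else:
        hints.append("upstream/CPU/lock contention")
    # Throughput (req/s) row.
    if throughput_down:
        hints.append("capacity-limited")
    else:
        hints.append("slowness without queue building")
    return hints
```

For example, `diagnose_p95_regression(True, False, True)` points at upstream, CPU or lock contention on a capacity-limited system.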

Reconciling against the vendor’s own dashboard

Where to look in Datadog:
  - APM → Service List for per-service latency percentiles.
  - APM → Traces filtered by @duration:>1500ms for individual slow request examples.
  - Dashboards → APM Overview for the time-series of p50, p95, p99.
  - APM → Service Map to see latency by upstream/downstream dependency.
Why our number may legitimately differ from Datadog’s UI:
| Reason | Direction | Why |
| --- | --- | --- |
| Time zone | Boundary days off | Datadog UI displays in account timezone; Vortex IQ uses UTC for cross-connector windowing. |
| API rate limits | Brief gaps | The Metrics query API is rate-limited; on burst minutes a polled value may use the cached prior result. |
| Log indexing latency | Not applicable | Latency is APM-derived, not log-derived. |
| Monitor state cache | Up to 60 seconds | Monitor state refreshes once per minute. |
| Span sampling | Both directions | If your APM uses head-based sampling at <100%, the p95 is computed from the sample which may slightly differ from the population. Tail-based sampling that prefers slow spans inflates Datadog’s p95 vs raw. |
Cross-connector reconciliation:
| Card | Expected relationship | What causes the divergence |
| --- | --- | --- |
| google_analytics.ga_property_health and RUM peers | Datadog APM measures server-side timing; RUM and GA4 measure browser-side end-to-end timing. RUM client-side vs APM server-side discrepancy is healthy and expected. | RUM p95 minus APM p95 equals network + browser render + third-party scripts. A 1.5x-2x ratio (RUM higher) is typical and healthy. |
| shopify.total_revenue / bigcommerce.total_revenue / adobe_commerce.total_revenue | Inverse: when p95 sustains above 1,500 ms, conversion typically drops 5-15% within the hour. | If p95 is up but revenue is steady, the slowness is on a non-revenue path (admin, workers, internal API). |
| Datadog logs | Subset relationship: log indexing latency is a different metric (data freshness) from app latency. | Do not conflate the two. |
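
The RUM-versus-APM relationship in the first row can be checked with a few lines of arithmetic. A sketch, assuming both p95 values are in milliseconds and cover the same window; the function name and the returned keys are illustrative:

```python
from typing import Dict, Union


def rum_apm_gap(rum_p95_ms: float, apm_p95_ms: float) -> Dict[str, Union[float, bool]]:
    """Compare browser-side (RUM) and server-side (APM) p95 readings."""
    if apm_p95_ms <= 0:
        raise ValueError("apm_p95_ms must be positive")
    ratio = rum_p95_ms / apm_p95_ms
    return {
        # Network + browser render + third-party script time, per the table above.
        "gap_ms": rum_p95_ms - apm_p95_ms,
        "ratio": ratio,
        # The table describes a 1.5x-2x ratio (RUM higher) as typical and healthy.
        "within_typical_range": 1.5 <= ratio <= 2.0,
    }
```

A ratio well above 2x is a prompt to look at CDN behaviour, third-party scripts or asset size, since server-side timing alone no longer explains what shoppers experience.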

Known limitations / merchant FAQs

My average response time is 250 ms. Why is the dashboard amber?
Because p95 is amber, not the average. Average hides the long tail. The 5% of shoppers in the slowest bucket are typically the ones most likely to abandon the cart. A p95 of 1,800 ms with a p50 of 250 ms is a real problem even though the “average shopper experience” looks fine.

What is a healthy p95 for an ecommerce site?
Below 800 ms is excellent. 800-1,200 ms is good. 1,200-1,500 ms is the warning zone. Above 1,500 ms is the alert zone (page abandonment becomes measurable). Above 3,000 ms shoppers abandon at high rates and the site feels broken. These thresholds are calibrated against industry research; your specific brand may tolerate slightly more (luxury) or less (high-velocity discount) before conversion impact.

Datadog says p95 is fine but customers are complaining the site is slow.
The classic Datadog blind spot. APM measures server-side timing, NOT browser-side experience. Three places to check: (1) Page Load p95, the RUM equivalent that includes network and browser render time; (2) Browser Test Latency p95, a synthetic browser bot that runs the full page load including third-party scripts; (3) Look at your CDN cache hit rate, a degraded CDN will not affect APM but will tank shopper experience. Also check whether a third-party script (chat widget, A/B SDK, fraud check) added a blocking JS execution that delays interactivity.

Why does my p95 spike at 03:00 UTC every day?
Two common causes: (1) Your nightly batch jobs (sitemap regeneration, search-index rebuild, fraud-pattern training) run at low-traffic hours and the few requests that do arrive during the batch wait behind the batch’s DB locks; (2) Your APM agent’s metric flush may coincide with another scheduled task. Check whether the spike is driven by 1-2 endpoints (which is the batch case) or all endpoints (which is the agent case).

Should I optimise for p95 or p99?
For most merchants, p95. p99 is too noisy at typical traffic levels (under 100,000 req/min) because a single GC pause moves it 30-50%. p95 is stable enough to alert on and meaningful enough that improvements correspond to measurable shopper-experience gains. Optimise p99 only if you are at high traffic AND p95 is already excellent (<400 ms). For most stores, p99 is decoration.

My Logs API returns 400 No valid indexes. Does this card still work?
Yes. p95 latency is APM-derived, not log-derived. Log Management gating only affects log-volume cards. Vortex IQ logs the gating event once at INFO level and skips log-only cards.

Why does my p95 differ between weekdays and weekends?
Most ecommerce sites see lower p95 on weekends because traffic is lower and queues do not build. If your weekend p95 is higher, look for: (1) reduced ops staff causing slower issue resolution, (2) batch jobs scheduled for weekends when traffic is light but still affecting users, (3) infrastructure auto-scaling that scales down too aggressively for weekend traffic patterns.

Datadog measures p95 per-service; how is the headline computed?
Datadog’s API supports p95:trace.servlet.request{*} which computes the percentile across all instrumented services together. This is the right number for a merchant headline because it captures the experience across whatever endpoint a shopper happened to hit. For per-service breakdown, use Top Slow Endpoints. Note: Datadog computes per-service percentiles independently then merges; the headline is NOT a weighted average of service p95s, it is a true cross-service percentile.

My multi-region site has different p95 in different regions. Which one is shown?
The headline is computed across all regions. For per-region breakdown, use Uptime by Region for synthetic-test results, or filter the Datadog query by @datacenter: tag in the Datadog UI. Vortex IQ does not currently expose per-region APM percentiles in the headline (planned for a future release).

Tracked live in Vortex IQ Nerve Centre

p95 Response Time is one of hundreds of KPI pulses Vortex IQ tracks across Datadog and 70+ other ecommerce connectors. Nerve Centre runs the detection layer; Vortex Mind investigates the cause when something moves; Ask Viq lets you interrogate any number in plain English. Start for free or book a demo to see this metric running on your own data.