At a glance
The 95th-percentile response time across all instrumented services: 19 of every 20 shopper requests complete at least this fast, and the slowest 1 in 20 take longer. For a merchant, this answers “for the unlucky 5% of shoppers, how long do they wait?” Above 1.5 seconds abandonment becomes measurable; above 3 seconds the site feels broken and shoppers abandon at high rates.
| API endpoint | Datadog Metrics API, GET /api/v1/query with p95:trace.servlet.request{*} (or the runtime equivalent: trace.aspnet_core.request, trace.express.request, trace.flask.request, etc). |
| Metric basis | APM span duration percentile, NOT individual request timing. Datadog computes p95 from histograms sampled at the agent. |
| Aggregation window | 1-minute rollup at source; the card displays the rolling 5-minute p95 against the 7-day comparison window. |
| Severity threshold | P1 = p95 above 3,000 ms (revenue impact likely); P2 = p95 above 1,500 ms (alert trigger); P3 = p95 above 800 ms (warning). |
| Alert pre-filtering | Synthetic test traffic (@user_agent:Datadog/Synthetic) and health-check endpoints (/health, /ping, /metrics) are excluded by default. Without this, your synthetic test cadence dominates the percentile distribution. |
| Log Management gating | Not used. Latency is APM-derived; the card returns valid values regardless of whether Logs is enabled. |
| Why p95 and not average | Average latency hides the long tail. A p50 of 200 ms looks fine even when the slowest 5% is 8 seconds. p95 is the merchant-meaningful number because it captures the experience of the unlucky shoppers most likely to bounce. |
| Why p95 and not p99 | p99 is too noisy at typical merchant traffic levels; a single GC pause or cold start skews the number. p99 is appropriate for sites above 100,000 req/min. See p99 Response Time. |
| Filtered hosts / services | All instrumented services in aggregate. For per-service breakdown see Top Slow Endpoints. |
| Time zone | Account timezone for chart axes; UTC for cross-connector windowing. |
| Time window | T/7D vsP (today vs prior 7-day average) |
| Alert trigger | > 1,500 ms: a p95 above 1.5 seconds sustained for 5 minutes pages on-call. |
| Sentiment key | avg_response |
| Roles | owner, engineering |
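
The API endpoint row above can be sketched as a request builder. This is a minimal illustration, not the card's actual implementation: the function name `build_p95_request`, the 300-second default window, and the default `datadoghq.com` site are assumptions, and the synthetic and health-check exclusions from the pre-filtering row are not modelled. The request is only constructed, never sent.

```python
import time
from urllib.parse import urlencode
from urllib.request import Request


def build_p95_request(api_key, app_key, window_s=300, now=None,
                      metric="trace.servlet.request", site="datadoghq.com"):
    """Build (but do not send) a GET /api/v1/query request for the p95 series.

    window_s, metric and site are illustrative defaults; swap the metric for
    the runtime equivalent (for example trace.flask.request) as needed.
    """
    end = int(now if now is not None else time.time())
    params = {
        "from": end - window_s,           # window start, epoch seconds
        "to": end,                        # window end, epoch seconds
        "query": f"p95:{metric}{{*}}",    # e.g. p95:trace.servlet.request{*}
    }
    url = f"https://api.{site}/api/v1/query?{urlencode(params)}"
    headers = {"DD-API-KEY": api_key, "DD-APPLICATION-KEY": app_key}
    return Request(url, headers=headers, method="GET")


req = build_p95_request("<api-key>", "<app-key>", now=1_000_000)
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would execute the query; keeping construction separate from sending makes the window and query string easy to inspect first.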
Calculation
Calculated automatically from your Datadog data. See the At a glance summary above for what the metric tracks and the worked example below for a typical reading.
Worked example
A UK supplements brand on Shopify with Datadog APM on the storefront, checkout, and search services. The team upgraded a third-party recommendations widget on 22 Apr 26 and missed that it added a 600 ms server-side fetch to every product-detail page.
| Hour (UTC) | p50 | p95 | p99 | Throughput |
|---|---|---|---|---|
| 09:00 (before deploy) | 180 ms | 800 ms | 1,420 ms | 4,100 req/min |
| 11:00 (deploy at 10:50) | 210 ms | 1,200 ms | 2,100 ms | 4,250 req/min |
| 13:00 | 240 ms | 1,650 ms | 3,200 ms | 3,890 req/min |
| 15:00 | 260 ms | 1,890 ms | 3,800 ms | 3,520 req/min |
| 16:00 (rolled back) | 195 ms | 850 ms | 1,500 ms | 4,180 req/min |
- p95 is the first metric to read when conversion drops without a clear cause. Average latency, error rate, and uptime all looked fine in this incident; only p95 told the story. The slowest 5% of shoppers are the ones most likely to bounce, and they are invisible to average-based metrics.
- 600 ms is the magic number. Industry research (Akamai, Google) consistently finds that every 100 ms of added p95 costs roughly 1% of conversion at typical ecommerce sites. This brand added 600 ms and lost ~6% of conversion in absolute terms (which is 23% in relative terms because their baseline conversion was 1.86%).
- Third-party fetches are the most common p95 regression cause. Recommendations widgets, A/B test SDKs, chat widgets, review widgets, paid-social pixels, fraud-detection scripts. They each add a small amount of latency individually; collectively they dominate the tail. Run Top Slow Endpoints once a quarter and audit anything in the top 10 that is third-party.
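
To connect the worked example to the severity thresholds in the At a glance table, the sketch below classifies each hourly p95 reading. The function name `classify_p95` and the "OK" label for readings at or below 800 ms are assumptions for illustration; the thresholds and readings come from the tables above.

```python
def classify_p95(p95_ms):
    """Map a p95 reading (ms) to the severity bands from the At a glance table."""
    if p95_ms > 3000:
        return "P1"   # revenue impact likely
    if p95_ms > 1500:
        return "P2"   # alert trigger
    if p95_ms > 800:
        return "P3"   # warning
    return "OK"       # hypothetical label for "no threshold crossed"


# Hourly p95 readings from the worked example table (UTC).
readings = [
    ("09:00", 800),
    ("11:00", 1200),
    ("13:00", 1650),
    ("15:00", 1890),
    ("16:00", 850),
]

for hour, p95 in readings:
    print(hour, p95, classify_p95(p95))
# 09:00 800 OK
# 11:00 1200 P3
# 13:00 1650 P2
# 15:00 1890 P2
# 16:00 850 P3
```

Note that the thresholds are strict ("above"), so the 09:00 reading of exactly 800 ms does not trigger a warning.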
Sibling cards merchants should reference together
| Card | Why pair it with p95 Latency | What the combination tells you |
|---|---|---|
| p99 Response Time | The deeper-tail view. Above 100k req/min, p99 is the right percentile; below, p95 is more stable. | Both moving together equals general slowdown; only p99 moving equals isolated bad-luck requests. |
| Apdex Score | Apdex amalgamates p95 and error rate into a single shopper-perception number. | If Apdex is fine but p95 is high, your tolerance threshold may be too generous; tighten it. |
| Top Slow Endpoints | The breakdown card. When p95 jumps, this card tells you which endpoint is responsible. | One endpoint dominating equals a code-path regression; many endpoints equals an infrastructure regression. |
| Database Query Latency p95 | The most common cause of application p95 regression. | App p95 up + DB p95 up equals slow query or pool exhaustion; app up + DB flat equals upstream/CPU/lock contention. |
| Throughput (req/s) | Latency and throughput trade off. | Latency up + throughput down equals capacity-limited; latency up + throughput flat equals slowness without queue building. |
| Deploy Markers vs Latency | Overlay deploy events on the latency chart. | Almost every p95 regression aligns with a recent deploy. |
| Page Load p95 | Browser-side peer. APM measures server time; RUM measures end-to-end shopper time. | RUM p95 minus APM p95 equals network + browser render time; widening gap equals CDN, third-party, or asset bloat. |
| Shopify / BC / Adobe Total Revenue | The merchant-impact card. | Sustained p95 above 1,500 ms typically corresponds to a 5-15% revenue dip within an hour. |
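
The Database Query Latency and Throughput rows in the table above amount to a small decision rule. The sketch below encodes it as a plain function; the name `diagnose` and the boolean inputs are hypothetical, and how "up", "down", or "flat" is decided (for example against the 7-day comparison window) is left to the caller.

```python
def diagnose(app_p95_up, db_p95_up, throughput_down):
    """Return the readings suggested by the sibling-card table.

    Each argument is a boolean the caller derives from the relevant card.
    """
    hints = []
    if not app_p95_up:
        return hints  # the table only covers the case where app p95 has risen
    # Database Query Latency p95 row
    if db_p95_up:
        hints.append("slow query or pool exhaustion")
    else:
        hints.append("upstream/CPU/lock contention")
    # Throughput row
    if throughput_down:
        hints.append("capacity-limited")
    else:
        hints.append("slowness without queue building")
    return hints


print(diagnose(app_p95_up=True, db_p95_up=False, throughput_down=True))
# ['upstream/CPU/lock contention', 'capacity-limited']
```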
Reconciling against the vendor’s own dashboard
Where to look in Datadog:
- APM → Service List for per-service latency percentiles.
- APM → Traces filtered by @duration:>1500ms for individual slow request examples.
- Dashboards → APM Overview for the time-series of p50, p95, p99.
- APM → Service Map to see latency by upstream/downstream dependency.
Why our number may legitimately differ from Datadog’s UI:
| Reason | Direction | Why |
|---|---|---|
| Time zone | Boundary days off | Datadog UI displays in account timezone; Vortex IQ uses UTC for cross-connector windowing. |
| API rate limits | Brief gaps | The Metrics query API is rate-limited; on burst minutes a polled value may use the cached prior result. |
| Log indexing latency | Not applicable | Latency is APM-derived, not log-derived. |
| Monitor state cache | Up to 60 seconds | Monitor state refreshes once per minute. |
| Span sampling | Both directions | If your APM uses head-based sampling at <100%, the p95 is computed from the sample which may slightly differ from the population. Tail-based sampling that prefers slow spans inflates Datadog’s p95 vs raw. |
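
The time zone row is the easiest divergence to reproduce. The sketch below shows one request timestamp landing in different calendar days depending on whether it is bucketed in UTC or in the account timezone. The fixed UTC+1 offset and the timestamp are assumptions chosen for illustration; a real account would use its configured zone.

```python
from datetime import datetime, timedelta, timezone

# Assumption: an account timezone one hour ahead of UTC (fixed offset, no DST rules).
ACCOUNT_TZ = timezone(timedelta(hours=1))


def day_bucket(ts_utc, tz):
    """Return the calendar day (ISO string) a UTC timestamp falls into for a given zone."""
    return ts_utc.astimezone(tz).date().isoformat()


ts = datetime(2026, 4, 21, 23, 30, tzinfo=timezone.utc)  # hypothetical request time

print(day_bucket(ts, ACCOUNT_TZ))    # 2026-04-22 (account-timezone chart axis)
print(day_bucket(ts, timezone.utc))  # 2026-04-21 (UTC windowing)
```

Requests in the hour around midnight therefore count towards different days in the two views, which is why daily figures can disagree at the boundaries while the underlying data is identical.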
| Card | Expected relationship | What causes the divergence |
|---|---|---|
| google_analytics.ga_property_health and RUM peers | Datadog APM measures server-side timing; RUM and GA4 measure browser-side end-to-end timing. RUM client-side vs APM server-side discrepancy is healthy and expected. | RUM p95 minus APM p95 equals network + browser render + third-party scripts. A 1.5x-2x ratio (RUM higher) is typical and healthy. |
| shopify.total_revenue / bigcommerce.total_revenue / adobe_commerce.total_revenue | Inverse: when p95 sustains above 1,500 ms, conversion typically drops 5-15% within the hour. | If p95 is up but revenue is steady, the slowness is on a non-revenue path (admin, workers, internal API). |
| Datadog logs | Subset relationship: log indexing latency is a different metric (data freshness) from app latency. | Do not conflate the two. |
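
The RUM-versus-APM relationship in the table above is simple arithmetic, sketched here with made-up readings. The function name `rum_apm_gap` and the sample values are hypothetical; the 1.5x-2x band is the "typical and healthy" range quoted in the table.

```python
def rum_apm_gap(rum_p95_ms, apm_p95_ms):
    """Return (gap_ms, ratio, within_typical_band) for a RUM/APM p95 pair."""
    gap_ms = rum_p95_ms - apm_p95_ms      # network + browser render + third-party scripts
    ratio = rum_p95_ms / apm_p95_ms
    within_typical_band = 1.5 <= ratio <= 2.0
    return gap_ms, ratio, within_typical_band


# Hypothetical readings: server-side p95 of 800 ms, browser-side p95 of 1,400 ms.
print(rum_apm_gap(1400, 800))  # (600, 1.75, True)
```

A ratio drifting well above 2x with a flat APM p95 points at the browser-side causes listed in the table rather than at the application.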
Known limitations / merchant FAQs
My average response time is 250 ms. Why is the dashboard amber?
Because p95 is amber, not the average. Average hides the long tail. The 5% of shoppers in the slowest bucket are typically the ones most likely to abandon the cart. A p95 of 1,800 ms with a p50 of 250 ms is a real problem even though the “average shopper experience” looks fine.
What is a healthy p95 for an ecommerce site?
Below 800 ms is excellent. 800-1,200 ms is good. 1,200-1,500 ms is the warning zone. Above 1,500 ms is the alert zone (page abandonment becomes measurable). Above 3,000 ms shoppers abandon at high rates and the site feels broken. These thresholds are calibrated against industry research; your specific brand may tolerate slightly more (luxury) or less (high-velocity discount) before conversion impact.
Datadog says p95 is fine but customers are complaining the site is slow.
The classic Datadog blind spot. APM measures server-side timing, NOT browser-side experience. Three places to check: (1) Page Load p95, the RUM equivalent that includes network and browser render time; (2) Browser Test Latency p95, a synthetic browser bot that runs the full page load including third-party scripts; (3) Look at your CDN cache hit rate, a degraded CDN will not affect APM but will tank shopper experience. Also check whether a third-party script (chat widget, A/B SDK, fraud check) added a blocking JS execution that delays interactivity.
Why does my p95 spike at 03:00 UTC every day?
Two common causes: (1) Your nightly batch jobs (sitemap regeneration, search-index rebuild, fraud-pattern training) run at low-traffic hours and the few requests that do arrive during the batch wait behind the batch’s DB locks; (2) Your APM agent’s metric flush may coincide with another scheduled task. Check whether the spike is driven by 1-2 endpoints (which is the batch case) or all endpoints (which is the agent case).
Should I optimise for p95 or p99?
For most merchants, p95. p99 is too noisy at typical traffic levels (under 100,000 req/min) because a single GC pause moves it 30-50%. p95 is stable enough to alert on and meaningful enough that improvements correspond to measurable shopper-experience gains. Optimise p99 only if you are at high traffic AND p95 is already excellent (<400 ms). For most stores, p99 is decoration.
My Logs API returns 400 No valid indexes. Does this card still work?
Yes. p95 latency is APM-derived, not log-derived. Log Management gating only affects log-volume cards. Vortex IQ logs the gating event once at INFO level and skips log-only cards.
Why does my p95 differ between weekdays and weekends?
Most ecommerce sites see lower p95 on weekends because traffic is lower and queues do not build. If your weekend p95 is higher, look for: (1) reduced ops staff causing slower issue resolution, (2) batch jobs scheduled for weekends when traffic is light but still affecting users, (3) infrastructure auto-scaling that scales down too aggressively for weekend traffic patterns.
Datadog measures p95 per-service; how is the headline computed?
Datadog’s API supports p95:trace.servlet.request{*} which computes the percentile across all instrumented services together. This is the right number for a merchant headline because it captures the experience across whatever endpoint a shopper happened to hit. For per-service breakdown, use Top Slow Endpoints. Note: Datadog computes per-service percentiles independently then merges; the headline is NOT a weighted average of service p95s, it is a true cross-service percentile.
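
To illustrate why a cross-service percentile is not the same as a weighted average of per-service p95s, the sketch below compares the two on made-up traffic. The service names, request counts, and durations are hypothetical, and the nearest-rank percentile used here is a simple stand-in, not a description of how Datadog computes its percentiles.

```python
def p95_nearest_rank(durations_ms):
    """Nearest-rank 95th percentile: the value at rank ceil(0.95 * n)."""
    ordered = sorted(durations_ms)
    rank = (95 * len(ordered) + 99) // 100   # integer ceil(0.95 * n)
    return ordered[rank - 1]


# Hypothetical traffic: a fast, busy service and a slow, quieter one.
services = {
    "storefront": [100] * 900,   # 900 requests at 100 ms
    "search": [2000] * 100,      # 100 requests at 2,000 ms
}

total = sum(len(d) for d in services.values())

# Weighted average of the per-service p95s (weights = request counts).
weighted_avg = sum(p95_nearest_rank(d) * len(d) for d in services.values()) / total

# Percentile over all requests pooled together.
pooled = p95_nearest_rank([ms for d in services.values() for ms in d])

print(weighted_avg)  # 290.0
print(pooled)        # 2000
```

With 10% of all requests taking 2,000 ms, the slowest 5% of shoppers all wait 2,000 ms, which the pooled percentile reports and the weighted average hides.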
My multi-region site has different p95 in different regions. Which one is shown?
The headline is computed across all regions. For per-region breakdown, use Uptime by Region for synthetic-test results, or filter the Datadog query by @datacenter: tag in the Datadog UI. Vortex IQ does not currently expose per-region APM percentiles in the headline (planned for a future release).