At a glance
Search Latency p50 (ms) is the median time the cluster takes to answer a search query: half of all queries finish faster, half slower. It is the steady-state “typical experience” number. Unlike the tail percentiles, which are expected to spike, p50 should stay low and stable; a rising median tells a DBA that the whole query mix is getting slower, not just the unlucky outliers.
What it tracks
The card reports the 50th-percentile query service time for the selected period, refreshed in real time on a rolling 5-minute window (RT/5m). It is derived from the same source as its tail siblings: the change in indices.search.query_time_in_millis divided by the change in indices.search.query_total, both exposed by the Elasticsearch node stats API, with the distribution reconstructed across the window so that a true median can be reported rather than a flat average. Because p50 reflects the bulk of traffic, it is the right number for capacity planning and for spotting broad regressions (a heavier query mix, a mapping change, a hot-thread storm) before they reach the tail. Pair it with Search Latency p95 (ms) and Search Latency p99 (ms): a healthy cluster keeps p50 well below p95, and a gap that narrows because p50 is climbing means the slowness once confined to the tail is now dragging the whole distribution up. This card carries no alert threshold of its own; it is the context number that makes the tail-latency alerts readable.
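To see why a median and a flat average can tell different stories, the sketch below summarises one window of per-query latencies. It is an illustration only: the sample values and the latency_summary helper are hypothetical, and it assumes individual query timings are available, which the node stats counters alone do not provide.

```python
import statistics


def latency_summary(samples_ms):
    """Summarise per-query latencies (ms) collected over one window."""
    ordered = sorted(samples_ms)
    # quantiles(n=100) returns 99 cut points: index 49 is p50, index 94 is p95.
    cuts = statistics.quantiles(ordered, n=100, method="inclusive")
    return {
        "mean": statistics.fmean(ordered),
        "p50": cuts[49],
        "p95": cuts[94],
    }


# Hypothetical window: nine typical queries and one slow outlier.
window = [12, 14, 15, 15, 16, 17, 18, 20, 22, 400]
summary = latency_summary(window)

# Gap between the tail and the typical query; a shrinking gap with a
# rising p50 points at a broad slowdown rather than a few outliers.
gap_ms = summary["p95"] - summary["p50"]
```

In this sample the single 400 ms query pulls the mean well above the median, while p50 stays close to what most queries experienced.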
Reconciling against the source
Confirm the figure against Elasticsearch’s own tooling with GET /_nodes/stats/indices/search (the query_time_in_millis and query_total counters), or watch the median directly in the Kibana Stack Monitoring “Search” panel. On a managed service such as Elastic Cloud or Amazon OpenSearch Service, the same series appears under the cluster’s search-latency monitoring chart. Small differences are expected: Vortex IQ reconstructs a percentile over the 5-minute window, whereas a raw counter ratio gives a window average, which slow outliers usually pull above the median.
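A rough way to compute the raw counter ratio yourself is to take two node stats responses a few minutes apart and divide the change in query time by the change in query count. The sketch below assumes the responses have already been fetched and parsed as JSON; the helper names and the sample counter values are hypothetical, and the result is a window average, not the median the card shows.

```python
def search_totals(node_stats):
    """Sum the search counters across every node in a node stats response."""
    time_ms = 0
    count = 0
    for node in node_stats["nodes"].values():
        search = node["indices"]["search"]
        time_ms += search["query_time_in_millis"]
        count += search["query_total"]
    return time_ms, count


def window_average_ms(earlier, later):
    """Average query time (ms) between two snapshots, or None if undefined."""
    t0, n0 = search_totals(earlier)
    t1, n1 = search_totals(later)
    queries = n1 - n0
    if queries <= 0:
        # No queries in the window, or counters went backwards (node restart).
        # Per-node resets are not handled separately in this sketch.
        return None
    return (t1 - t0) / queries


# Hypothetical snapshots taken five minutes apart on a two-node cluster.
earlier = {"nodes": {
    "a": {"indices": {"search": {"query_time_in_millis": 1000, "query_total": 100}}},
    "b": {"indices": {"search": {"query_time_in_millis": 2000, "query_total": 200}}},
}}
later = {"nodes": {
    "a": {"indices": {"search": {"query_time_in_millis": 1600, "query_total": 130}}},
    "b": {"indices": {"search": {"query_time_in_millis": 2900, "query_total": 260}}},
}}

avg = window_average_ms(earlier, later)
```

Because the counters are cumulative since node start, only the difference between snapshots is meaningful; dividing the raw totals would give a lifetime average instead.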