MariaDB Health Score, MariaDB - Vortex IQ Help Centre

Card class: Hero • Category: Executive Overview

At a glance

The MariaDB Health Score is a single 0 to 100 composite that rolls up the instance’s most important operational signals (connection-pool saturation, query latency, error rate, replication lag, buffer-pool hit rate, disk usage, and Galera quorum where present) into one number a platform lead can read at a glance. It answers “is the database fundamentally healthy right now, and was it healthy over the last week?” A score of 100 means every input is inside its green band. A score below 70 means at least one critical input has degraded enough to warrant attention before it becomes an incident.


What it tracks	A weighted composite health index for the MariaDB instance (or Galera cluster) for the selected period. The score blends the sub-metrics that each have their own card, so a drop here always traces back to a specific input.
Data source	Derived inside Vortex IQ from the same `SHOW GLOBAL STATUS`, `information_schema`, and Galera `wsrep_*` counters that feed the underlying cards. No separate query: it reuses the polled inputs.
Time window	`RT/7D`. The gauge shows the live composite; the trend shows the rolling 7-day band so you can tell “always been like this” from “degraded this week”.
Alert trigger	`<70`. A composite below 70 flags amber on the Executive Overview and pages the on-call rota if sustained.
Calculation basis	Weighted average of normalised sub-scores. Each input is mapped to a 0 to 100 band against its own threshold (for example pool saturation 0% maps to 100, 90%+ maps to 0), then combined by weight.
Sensitivity	This is a sensitivity card: the thresholds and input weights are tunable per profile in the Sensitivity tab so the score reflects your own baseline rather than a generic default.
What does NOT move it	Cosmetic or non-operational counters (uptime in days, total queries served lifetime) are excluded; they do not indicate health.
Roles	owner, engineering, operations

Calculation

The score is a weighted blend of the instance’s critical operational inputs, each first normalised to a 0 to 100 sub-score against its own threshold, then averaged by weight. The inputs and the direction that hurts the score are:

Input (sibling card)	Source signal	Direction that lowers the score
Connection pool saturation	`Threads_connected` / `max_connections`	Saturation rising toward 90%+
Query error rate	`(Aborted_clients + Connection_errors_%) / Questions`, summing every `Connection_errors_%` counter	Error rate above 1%
Query latency p95 / p99	statement digests in `performance_schema`	p95 above 200ms, p99 above 500ms
Buffer-pool hit rate	`Innodb_buffer_pool_read_requests` vs `..._reads`	Hit rate below 95%
Replication lag	`Seconds_Behind_Master` (async replication) or `wsrep_local_recv_queue` (Galera)	Lag above 10s
Disk usage	data directory free space	Usage above 90%
Galera quorum (clustered only)	`wsrep_cluster_status`, `wsrep_cluster_size`	Status not `Primary`, or node count below expected

Each sub-score sits at 100 when the input is comfortably inside its green band and falls toward 0 as the input crosses its alert threshold. The composite is the weighted mean, so one badly degraded critical input (a non-Primary Galera state, for example) can pull the headline well below 70 even when everything else is green. Because every input has its own card, a low score is always explainable: open the Executive Overview, find the red sub-metric, and drill into its card. Calculated automatically from your MariaDB data; see the worked example for a typical reading.
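As a rough illustration of the normalise-then-weight calculation described above, the sketch below maps a raw reading to a 0 to 100 sub-score and then takes the weighted mean. The straight-line curve between the band edges, the band edges other than the 0% and 90% pool-saturation example, and the numeric weights are all assumptions made for illustration. The real band curves and profile weights live in the Sensitivity tab and will give different numbers.

```python
def sub_score(value, green, red):
    """Map a raw reading to 0-100: 100 at or inside `green`, 0 at or beyond `red`.

    Works for inputs where higher is worse (green < red, e.g. saturation)
    and where lower is worse (green > red, e.g. buffer-pool hit rate).
    The straight line between the two edges is an assumption.
    """
    fraction = (value - green) / (red - green)   # 0 at the green edge, 1 at the red edge
    fraction = min(1.0, max(0.0, fraction))      # clamp readings outside the band
    return 100.0 * (1.0 - fraction)


def composite(sub_scores, weights):
    """Weighted mean of sub-scores; both arguments are dicts keyed by input name."""
    total_weight = sum(weights[name] for name in sub_scores)
    return sum(sub_scores[name] * weights[name] for name in sub_scores) / total_weight


# Hypothetical bands and weights, not Vortex IQ defaults.
scores = {
    "pool_saturation": sub_score(45.0, green=0.0, red=90.0),   # 50.0
    "buffer_pool_hit": sub_score(99.0, green=99.0, red=95.0),  # 100.0
    "disk_usage": sub_score(95.0, green=70.0, red=90.0),       # 0.0
}
weights = {"pool_saturation": 2.0, "buffer_pool_hit": 1.0, "disk_usage": 1.0}
print(round(composite(scores, weights), 1))  # 50.0
```

Here a single input at 0 pulls the composite to 50 even though another input scores 100, which is the behaviour the paragraph above describes for a badly degraded input.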

Worked example

A platform team runs a 3-node Galera cluster behind a high-traffic Magento storefront. Snapshot taken on 14 Apr 26 at 19:40 BST during an evening promotional push.

Input	Reading	Sub-score	Weight
Connection pool saturation	78% (climbing)	55	high
Query error rate	0.3%	92	high
Query latency p95	240ms (over 200ms band)	60	medium
Buffer-pool hit rate	99.1%	100	medium
Replication / Galera lag	flow control paused 2%	90	high
Disk usage	71%	100	medium
Galera quorum	`Primary`, size 3/3	100	critical

The composite lands at 74, amber but not yet alerting. The headline gauge sits just above the 70 line and the 7-day trend shows the score has slipped from a steady 88 over the past three hours. The platform lead reads three things:

The slip is real, not noise. The 7-day band makes clear this instance normally runs at 88, so a drop to 74 during the promo is a genuine degradation, not the usual evening shape.
Two inputs are dragging the score. Pool saturation (55) and p95 latency (60) are the culprits; everything else is green. The story is “traffic is pushing the connection pool and queries are starting to queue”, a classic load-driven pattern rather than a fault.
Action is preventative, not reactive. Because the score is still above 70, the team has runway: raise `max_connections` headroom or add a read replica before saturation crosses 90% and the cluster starts refusing connections at checkout.

Composite framing for this snapshot:
  - Live score: 74 (amber, threshold <70)
  - 7-day baseline: 88
  - Largest drags: pool saturation 55, p95 latency 60
  - Recommended action: add read capacity before saturation hits 90%
  - If ignored: saturation crosses 90% -> "Too many connections" at checkout -> revenue impact
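One way to find the "largest drags" in a snapshot like this is to rank each input by its weighted shortfall from 100. The sketch below uses the sub-scores and weight labels from the table above. The numeric values assigned to the labels (medium 1, high 2, critical 3) are placeholders chosen for illustration and are not tuned to reproduce the 74 headline; the ranking, not the absolute number, is the point.

```python
# Sub-scores and weight labels from the worked-example table.
snapshot = [
    ("Connection pool saturation", 55, "high"),
    ("Query error rate", 92, "high"),
    ("Query latency p95", 60, "medium"),
    ("Buffer-pool hit rate", 100, "medium"),
    ("Replication / Galera lag", 90, "high"),
    ("Disk usage", 100, "medium"),
    ("Galera quorum", 100, "critical"),
]

# Hypothetical numeric values for the weight labels (assumption).
LABEL_WEIGHT = {"medium": 1.0, "high": 2.0, "critical": 3.0}


def rank_drags(rows):
    """Return (name, weighted shortfall) pairs, largest drag first."""
    drags = [(name, LABEL_WEIGHT[label] * (100 - score)) for name, score, label in rows]
    return sorted(drags, key=lambda item: item[1], reverse=True)


for name, shortfall in rank_drags(snapshot)[:2]:
    print(f"{name}: {shortfall:.0f}")
```

With these placeholder weights the top two entries are connection pool saturation and p95 latency, matching the reading in the text.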

Three takeaways:

A composite is a starting point, never an endpoint. The number tells you “look”, the sub-metric cards tell you “where”. Always drill from the score into the red input before acting.
Read the gauge with the 7-day trend. A score of 74 means very different things for an instance that normally runs at 75 versus one that normally runs at 92. The trend supplies the baseline.
One critical input can dominate. A non-Primary Galera state or a disk above 90% can sink the headline on its own, however green the rest is. This is by design: those conditions are existential for the database.

Sibling cards

Card	Why pair it with MariaDB Health Score	What the combination tells you
Connection Pool Saturation %	Highest-weight load input into the composite.	A low score during a traffic peak almost always traces to rising saturation here.
Query Error Rate %	Error-side input.	Score down plus error rate up equals a fault, not just load; investigate failing statements.
Query Latency p95 (ms)	Latency input.	Score down plus p95 up equals queries queueing; check slow-query rate and buffer pool.
InnoDB / XtraDB Buffer Pool Hit Rate %	Memory-efficiency input.	A falling hit rate drags latency and the composite together; often a sizing problem.
Async Replication Lag (seconds)	Replication input.	Lag spikes pull the composite down and threaten read-after-write consistency.
Database Disk Usage %	Capacity input with hard ceiling.	Disk above 90% can sink the score alone; a full disk halts writes entirely.
Galera Cluster Status	Existential quorum input on clustered instances.	A non-Primary status collapses the composite because a node outside the Primary component stops serving queries.
Queries per Second (live)	Load context (not a direct input).	Read the score against QPS to separate “healthy under load” from “unhealthy at rest”.

Reconciling against the source

Where to look on the server: There is no single native command that emits a “health score”: it is a Vortex IQ composite. To reconcile, verify each input independently against MariaDB’s own tooling, then confirm the headline moves in step.

`SHOW GLOBAL STATUS;` for the raw counters (`Threads_connected`, `Aborted_clients`, `Connection_errors_%`, `Innodb_buffer_pool_read_requests`, `Innodb_buffer_pool_reads`).
`SHOW VARIABLES LIKE 'max_connections';` to confirm the saturation denominator.
`SHOW ALL SLAVES STATUS\G` (or `SHOW REPLICA STATUS\G` on newer builds) for replication lag.
`SHOW STATUS LIKE 'wsrep_%';` for Galera quorum and flow-control inputs.
`SELECT DIGEST_TEXT, AVG_TIMER_WAIT FROM performance_schema.events_statements_summary_by_digest ORDER BY AVG_TIMER_WAIT DESC;` for the latency inputs.
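Once the counters are in hand, three of the inputs can be recomputed by hand. The sketch below assumes the `SHOW GLOBAL STATUS` rows have already been loaded into a dict of name to value (the server returns values as strings). The function name, the sample values, and the hit-rate formula `1 - reads / read_requests` are illustrative assumptions. The counters are also cumulative since server start, so a figure computed this way may differ from one based on per-poll deltas.

```python
def raw_inputs(status, max_connections):
    """Derive three inputs from a dict of SHOW GLOBAL STATUS name -> value."""
    # Sum every Connection_errors_* counter, mirroring LIKE 'Connection_errors_%'.
    conn_errors = sum(
        int(value) for name, value in status.items()
        if name.startswith("Connection_errors_")
    )
    questions = int(status["Questions"])
    read_requests = int(status["Innodb_buffer_pool_read_requests"])
    disk_reads = int(status["Innodb_buffer_pool_reads"])
    errors = int(status["Aborted_clients"]) + conn_errors
    return {
        "pool_saturation_pct": 100.0 * int(status["Threads_connected"]) / max_connections,
        "error_rate_pct": 100.0 * errors / questions if questions else 0.0,
        "buffer_pool_hit_pct": 100.0 * (1 - disk_reads / read_requests) if read_requests else 100.0,
    }


# Made-up sample values for illustration only.
sample = {
    "Threads_connected": "390",
    "Aborted_clients": "20",
    "Connection_errors_max_connections": "10",
    "Connection_errors_internal": "0",
    "Questions": "10000",
    "Innodb_buffer_pool_read_requests": "1000000",
    "Innodb_buffer_pool_reads": "9000",
}
for name, value in raw_inputs(sample, max_connections=500).items():
    print(f"{name}: {value:.1f}")
```

Compare each figure with the matching sibling card before comparing the headline score.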

Why our number may legitimately differ from a hand calculation:

Reason	Direction	Why
Profile weights	Variable	The composite uses your configured input weights; a manual unweighted average will differ.
Normalisation curves	Variable	Each input is mapped through a band curve, not a linear scale; the midpoint is not 50 for every input.
Poll timing	Brief	The composite reuses the last polled value of each input; a sub-metric sampled seconds later can shift the score marginally.
Galera presence	Structural	On a non-clustered instance the Galera inputs are dropped and weights re-normalise across the remaining inputs.
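The re-normalisation in the last row can be pictured as dropping the inputs that do not apply and rescaling the remaining weights so they still sum to 1. This is a minimal sketch under that assumption; the input names and weight values are hypothetical, not Vortex IQ defaults.

```python
def renormalise(weights, present):
    """Keep only the inputs that apply and rescale their weights to sum to 1."""
    kept = {name: weight for name, weight in weights.items() if name in present}
    total = sum(kept.values())
    return {name: weight / total for name, weight in kept.items()}


# Hypothetical profile weights (assumption).
profile = {
    "pool_saturation": 2.0,
    "error_rate": 2.0,
    "latency": 1.0,
    "buffer_pool": 1.0,
    "disk": 1.0,
    "replication": 2.0,
    "galera_quorum": 3.0,
}

# A standalone server has no replication or Galera inputs.
standalone = renormalise(
    profile, {"pool_saturation", "error_rate", "latency", "buffer_pool", "disk"}
)
print(sorted(standalone))
print(round(sum(standalone.values()), 6))
```

A hand calculation that keeps the original weights for a standalone server will therefore understate each remaining input's share.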

Managed-service note: On Amazon RDS for MariaDB, Azure Database for MariaDB, or MariaDB SkySQL the provider exposes equivalent signals (connection count, replica lag, CPU and memory; `DatabaseConnections` and `ReplicaLag` on RDS) as console metrics. There is no native composite to compare against; reconcile input by input.

Known limitations / FAQs

Why is my health score 74 when every alert is green?
The composite turns amber before individual cards cross their hard alert thresholds. A 74 means one or more inputs are in the warning zone (for example pool saturation at 78%, below the 90% alert but well off the green band). That is the point of the score: it gives you runway to act before a sub-metric trips its own alert.

The score dropped but I cannot tell which input caused it.
Open the Executive Overview and scan the sub-metric cards for the one in amber or red. The composite is always explainable from its inputs; if two inputs moved together (latency and buffer-pool hit rate, say) they usually share a root cause. Use Vortex Mind to trace the upstream cause.

Can I change which inputs count and how much they weigh?
Yes. This is a sensitivity card. In the Sensitivity tab you can adjust each input’s weight and its threshold band per profile. Teams that run read-heavy reporting replicas often raise the buffer-pool and latency weights; teams on Galera raise the quorum weight to make a non-Primary state dominate.

My instance is a single node with no replication or Galera. Does the score still work?
Yes. Replication and Galera inputs are dropped and the weights re-normalise across the remaining inputs (saturation, errors, latency, buffer pool, disk). The score is then a clean read on a standalone server.

Why does the score sometimes sit at 100 for days then drop sharply?
A healthy instance inside every green band scores 100 and stays there until an input crosses into its warning zone. The drop is sharp because crossing a threshold moves that input’s sub-score quickly through the band curve. The 7-day trend makes these step changes easy to spot.

Should I page on a score of 69?
The <70 trigger is the default amber boundary, not an automatic page. Whether 69 pages depends on your sustained-duration setting in the Sensitivity tab. A momentary dip to 69 during a deploy is normal; a score parked below 70 for several poll cycles is worth waking someone for. Tune the sustained window to your tolerance.

Does a high score guarantee there is no problem?
No. The score only reflects the inputs it measures. A logical fault (a bad migration, a corrupt index, a runaway report query that has not yet pushed latency past its band) can exist at a score of 95. Treat the score as a strong negative signal (low score means definitely investigate) rather than an absolute all-clear.
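The sustained-duration behaviour described in the paging question can be sketched as a check that only fires when the most recent poll readings all sit below the threshold. The function name and the default of three polls are hypothetical; the real window is whatever is configured in the Sensitivity tab.

```python
def should_page(scores, threshold=70, sustained_polls=3):
    """True when the latest `sustained_polls` readings are all below `threshold`."""
    if len(scores) < sustained_polls:
        return False  # not enough history to call the breach sustained
    return all(score < threshold for score in scores[-sustained_polls:])


print(should_page([88, 69, 75, 72]))  # False: a momentary dip that recovered
print(should_page([74, 69, 68, 66]))  # True: parked below 70 for three polls
```

A longer window trades slower paging for fewer false alarms during deploys.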

Tracked live in Vortex IQ Nerve Centre

MariaDB Health Score is one of hundreds of KPI pulses Vortex IQ tracks across MariaDB and 70+ other ecommerce connectors. Nerve Centre runs the detection layer; Vortex Mind investigates the cause when something moves; Ask Viq lets you interrogate any number in plain English. Start for free or book a demo to see this metric running on your own data.
