Database Disk Usage %, MariaDB - Vortex IQ Help Centre

Card class: Hero • Category: Executive Overview

At a glance

Database Disk Usage % is the proportion of the data volume that the MariaDB instance has consumed: data files, indexes, binary logs, the InnoDB redo log, temporary files, and the undo log, divided by the total size of the volume holding the data directory. It is the single most consequential capacity metric on the instance, because when a MariaDB data volume fills completely, the server cannot write. Writes fail, replication stalls, and on many builds the server stops accepting transactions entirely until space is freed. This card exists to make sure that never happens by surprise.


What it tracks	Used space as a percentage of total space on the volume that holds the MariaDB data directory (`datadir`), for the selected period. Includes data, indexes, binary logs, redo and undo logs, and temp files.
Data source	The filesystem free/used figures for the `datadir` volume, cross-referenced with `information_schema.TABLES` (DATA_LENGTH + INDEX_LENGTH) and binary-log sizing. On managed services, the provider’s storage metric.
Time window	`RT` (real-time, refreshed on each poll).
Alert trigger	`>90%`. Above 90% the card flags red on the Executive Overview, because the remaining headroom can vanish fast under write load or a large binary-log accumulation.
Why it matters	A full data volume is one of the few failures that takes a database fully offline for writes. There is no graceful degradation: at 100% the server stops writing. The 90% alert is an early-warning line, not a comfort zone.
Sensitivity	Sensitivity card: the alert threshold is tunable per profile, but lowering it below 90 is more common than raising it, since fast-growing instances need more runway.
Roles	owner, engineering, operations

Calculation

The metric is used bytes / total bytes on the filesystem volume that contains the MariaDB datadir, expressed as a percentage. Crucially this is volume-level, not just the size of the tables, because several things share that volume and grow independently:

Consumer	What it is	Why it can grow unexpectedly
Table data + indexes	The `.ibd` files for InnoDB tables	Normal growth, plus bloat from deleted-but-not-reclaimed rows
Binary logs	The `binlog` files used for replication and point-in-time recovery	Accumulate until `expire_logs_days` / `binlog_expire_logs_seconds` purges them; a stalled replica can pin them indefinitely
InnoDB redo log	The fixed-size write-ahead log	Sized by config, usually stable
Undo log	Multi-version concurrency control rollback segments	Can balloon if a very long-running transaction prevents purge
Temp files	On-disk temporary tables and sort buffers	Spike during large sorts, `ALTER TABLE`, or schema migrations

Because of binary-log accumulation and undo-log growth, a volume can fill even when table sizes are flat. The card reads the live filesystem figure so it captures all of these at once. The >90% alert is set where it is because the last 10% can disappear in minutes during a bulk import, a long migration, or a replica outage that pins binary logs. Calculated automatically from your MariaDB data; see the worked example for a typical reading.
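
As a rough illustration of the calculation above, the sketch below computes used bytes / total bytes as a percentage and applies the >90% alert line. The function names are assumptions made for this example, not the card's actual implementation; `shutil.disk_usage` is the Python standard-library call for filesystem figures.

```python
import shutil

ALERT_THRESHOLD_PCT = 90.0  # the card's default alert line (>90%)


def usage_percent(used_bytes: float, total_bytes: float) -> float:
    """Used space as a percentage of total space on the volume."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def is_alert(percent: float, threshold: float = ALERT_THRESHOLD_PCT) -> bool:
    """True when usage is strictly above the alert threshold."""
    return percent > threshold


def volume_usage_percent(datadir: str) -> float:
    """Read live filesystem figures for the volume that holds `datadir`.

    This is volume-level: it counts everything on the filesystem,
    not only the table files.
    """
    usage = shutil.disk_usage(datadir)  # named tuple: total, used, free
    return usage_percent(usage.used, usage.total)
```

Pass the path reported by `SELECT @@datadir;` to `volume_usage_percent`. The result can differ slightly from the `Use%` column of `df`, which works from the space available to unprivileged users and so treats root-reserved blocks differently.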

Worked example

A platform team runs a MariaDB primary on a 500 GB data volume backing a Shopify-connected order-history warehouse. Snapshot taken on 02 May 26 at 03:15 BST during an overnight batch load.

Consumer	Size	% of volume
Table data + indexes	360 GB	72%
Binary logs	78 GB	16%
Undo + redo + temp	18 GB	4%
Free	44 GB	8% (used = 92%)

The card headline reads 92% disk usage, red against the >90% band. The on-call engineer reads three things:

It is the binary logs, not the tables, that crossed the line. Table growth is steady; the jump came from 78 GB of binary logs. A read replica went offline two days ago and has not reconnected, so the primary is retaining every binary log the replica has not yet consumed. The logs cannot be purged while the replica still needs them.
The runway is short. At 8% free on a volume taking an overnight batch load, the headroom is hours, not days. If the batch writes another 50 GB before the logs are freed, the volume fills and the server stops writing, taking the order pipeline down.
There are two valid fixes, one fast and one correct. Fast: extend the volume (trivial on a managed service, a resize on self-hosted). Correct: bring the dead replica back so binary logs purge naturally, or, if the replica is gone for good, drop it from the topology so binlog_expire_logs_seconds can reclaim the 78 GB. The team does both: extends the volume to buy time, then fixes the replica.

Headroom framing for this snapshot:
  - Volume: 500 GB, used 456 GB (91.2%; the rounded rows above sum to 92%), free 44 GB
  - Dominant growth: binary logs 78 GB, pinned by an offline replica
  - Batch load remaining tonight: ~50 GB
  - Outcome if untouched: volume fills mid-batch -> writes stop -> order pipeline down
  - Fast fix: extend volume now
  - Correct fix: reconnect or remove the offline replica to purge binlogs
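
The headroom arithmetic for this snapshot can be reproduced with a few lines. This is a minimal sketch using only the sizes from the table above; the variable names are illustrative. Note that the unrounded used figure works out to 456 GB (91.2%), slightly below the 92% obtained by adding the rounded rows.

```python
VOLUME_GB = 500

# Sizes from the worked-example snapshot, in GB.
consumers = {
    "Table data + indexes": 360,
    "Binary logs": 78,
    "Undo + redo + temp": 18,
}
BATCH_REMAINING_GB = 50  # approximate load still to be written tonight

used_gb = sum(consumers.values())
free_gb = VOLUME_GB - used_gb
used_pct = 100 * used_gb / VOLUME_GB

# Share of the volume taken by each consumer.
shares = {name: 100 * size / VOLUME_GB for name, size in consumers.items()}

# The batch fits only if the remaining load is no larger than the free space.
shortfall_gb = max(0, BATCH_REMAINING_GB - free_gb)
fills_mid_batch = shortfall_gb > 0

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}% of volume")
print(f"Used {used_gb} GB ({used_pct:.1f}%), free {free_gb} GB")
print(f"Fills mid-batch: {fills_mid_batch} (short by {shortfall_gb} GB)")
```

With 44 GB free and roughly 50 GB still to load, the batch comes up about 6 GB short, which is why the volume fills before the load completes unless space is freed or added first.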

Three takeaways:

Disk usage is not the same as table size. Binary logs and undo logs can fill a volume while your tables are flat. Always check what is actually consuming the space before assuming you need a bigger database.
A pinned binary log is the classic silent filler. An offline or lagging replica stops binary logs from purging, and they grow without bound. Pair this card with the replication cards: a disk climb plus replication lag is almost always binlog retention.
The last 10% is the dangerous part. Below 90% you have planning time; above 90% you have response time. Treat the alert as “act now”, not “schedule a ticket”, because the consequence of hitting 100% is a write outage, not a slowdown.

Sibling cards

Card	Why pair it with Database Disk Usage	What the combination tells you
Async Replication Lag (seconds)	Lagging or dead replicas pin binary logs.	Disk climbing plus replication lag equals binlog retention; fix the replica to reclaim space.
Last Successful Backup (hours ago)	Backups can need temporary space and a full disk blocks them.	A full disk often coincides with a failed backup; both are capacity emergencies.
Memory Usage %	Large on-disk temp files spill from memory pressure.	High memory plus disk growth equals queries spilling to disk; tune sorts and temp tables.
MariaDB Health Score	Disk above 90% can sink the composite on its own.	A health-score drop with everything else green usually points straight here.
Failover Readiness	A standby that is also low on disk cannot safely take over.	Primary disk high plus standby disk high equals no safe failover target.
Slow-Query Rate %	Disk-bound temp tables slow queries.	Disk pressure plus slow queries equals on-disk sorts; the volume is now a performance bottleneck too.
Galera Cluster Size	A node that fills its disk drops out of the cluster.	Disk full on one node plus shrinking cluster size equals a node evicted for being out of space.

Reconciling against the source

Where to look on the server:

  - At the OS level, `df -h` on the volume holding `datadir` is the ground truth for used-versus-total. This is what the card reports against.
  - `SELECT table_schema, ROUND(SUM(data_length + index_length)/1024/1024/1024, 1) AS gb FROM information_schema.TABLES GROUP BY table_schema ORDER BY gb DESC;` to attribute table-and-index space by schema.
  - `SHOW BINARY LOGS;` to list binary logs and their sizes; sum them to see how much of the volume they hold.
  - `SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';` (or `expire_logs_days` on older builds) to confirm the retention policy that should be purging them.
  - `SELECT * FROM information_schema.INNODB_TRX ORDER BY trx_started;` to find a long-running transaction that may be inflating the undo log.
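
Summing the binary logs by hand is tedious when there are many files. The sketch below assumes you have already fetched the rows of `SHOW BINARY LOGS;` as (Log_name, File_size) pairs with whatever client you use; the function names and the sample rows are made up for illustration and are not real server output.

```python
from typing import Iterable, Tuple

GIB = 1024 ** 3


def total_binlog_bytes(rows: Iterable[Tuple[str, int]]) -> int:
    """Sum the File_size column (bytes) across SHOW BINARY LOGS rows."""
    return sum(size for _name, size in rows)


def binlog_share_percent(rows: Iterable[Tuple[str, int]], volume_bytes: int) -> float:
    """Binary-log bytes as a percentage of the total volume size."""
    if volume_bytes <= 0:
        raise ValueError("volume_bytes must be positive")
    return 100.0 * total_binlog_bytes(rows) / volume_bytes


# Hypothetical rows in the (Log_name, File_size) shape.
sample_rows = [
    ("mariadb-bin.000101", 1 * GIB),
    ("mariadb-bin.000102", 1 * GIB),
    ("mariadb-bin.000103", 512 * 1024 ** 2),
]

print(f"{total_binlog_bytes(sample_rows) / GIB:.1f} GiB of binary logs")
print(f"{binlog_share_percent(sample_rows, 10 * GIB):.1f}% of a 10 GiB volume")
```

The sizes are divided by 1024 three times to match the GB figures produced by the schema query above.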

Why our number may legitimately differ:

Reason	Direction	Why
Volume vs table sum	Card higher	The card reports filesystem usage (data + binlogs + logs + temp); summing `information_schema.TABLES` alone undercounts because it omits binary and transaction logs.
Reserved blocks	Card higher	Some filesystems reserve a percentage of blocks for root; `df` accounts for them, a raw table sum does not.
Sparse / fragmented files	Variable	InnoDB tablespace files can hold free pages from deleted rows; on-disk size exceeds live data until `OPTIMIZE TABLE` reclaims it.
Poll timing	Brief	A bulk load between the card poll and your manual `df` will show different figures.
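
The first row of the table is the one that most often explains a mismatch. As a hedged sketch, assuming you already have three figures in hand (filesystem used space, the `information_schema.TABLES` sum, and the binary-log total), the remainder is whatever the redo log, undo log, temp files, and filesystem overhead account for. The function name is illustrative, and the figures are those of the worked example.

```python
def unattributed_gb(fs_used_gb: float, tables_gb: float, binlogs_gb: float) -> float:
    """Space used on the volume that neither tables nor binary logs explain.

    What is left is redo/undo logs, temp files and filesystem overhead.
    """
    return fs_used_gb - tables_gb - binlogs_gb


# Worked-example figures: 456 GB used, 360 GB tables, 78 GB binary logs.
gap = unattributed_gb(fs_used_gb=456, tables_gb=360, binlogs_gb=78)
table_only_pct = 100 * 360 / 500   # what a table sum alone would suggest
volume_pct = 100 * 456 / 500       # what the filesystem actually shows

print(f"Unattributed: {gap} GB")
print(f"Table sum alone: {table_only_pct:.1f}% vs volume: {volume_pct:.1f}%")
```

A large or growing remainder points at the undo log or temp files rather than at table growth.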

Managed-service note: On Amazon RDS for MariaDB the equivalent is the FreeStorageSpace CloudWatch metric, which reports free bytes rather than percent used (RDS grows storage automatically only when storage autoscaling is enabled); on Azure Database for MariaDB it is the Storage percent metric. These provider metrics are the canonical figure on managed instances because you do not have OS shell access to run df.

Known limitations / FAQs

My tables only total 360 GB but the card shows 92% of a 500 GB volume. Where did the rest go?
The volume holds more than tables. Binary logs (used for replication and point-in-time recovery), the InnoDB redo and undo logs, and on-disk temporary files all share the data volume. The most common surprise is binary-log accumulation: run `SHOW BINARY LOGS;` and sum the sizes. If they are large, an offline or lagging replica is usually pinning them.

The disk filled overnight with no change in traffic. How?
Three usual causes: (1) a replica went offline and binary logs stopped purging; (2) a very long-running transaction prevented InnoDB from purging undo-log history, which then grew; (3) a large `ALTER TABLE` or batch import wrote gigabytes of temporary files. Check `SHOW BINARY LOGS;`, `information_schema.INNODB_TRX`, and the temp directory in that order.

What actually happens at 100%?
Writes fail. InnoDB cannot extend tablespaces or write to its logs, so transactions error out, replication on the primary stalls, and on many configurations the server effectively halts write activity until space is freed. Reads may continue for a while, but the instance is functionally down for the application. This is why the alert is at 90, not 98.

How do I free space quickly in an emergency?
In order of speed and safety: (1) extend the volume (instant on managed services, a resize on self-hosted); (2) purge old binary logs with `PURGE BINARY LOGS BEFORE ...` once you have confirmed no replica still needs them; (3) drop or truncate disposable staging tables; (4) `OPTIMIZE TABLE` to reclaim space from heavily deleted tables (but this needs temporary space, so do not run it at 99%). Extending the volume is almost always the right first move because it is reversible and buys time.

Can I just raise the alert above 90%?
You can in the Sensitivity tab, but think carefully. The 90% line exists because the last 10% can vanish in minutes under write load or binlog growth. Fast-growing instances usually want a lower threshold for more runway, not a higher one. Raise it only if the volume is large enough that 10% is still many hours of headroom.

Does shrinking a table reclaim disk immediately?
Not necessarily. Deleting rows marks pages free inside the InnoDB tablespace file but does not return the space to the filesystem; the file stays the same size and reuses the free pages for new rows. To return space to the OS you must `OPTIMIZE TABLE` (which rebuilds the file) or, for whole tables, drop them. This is why the card can show high usage even after a big delete.

Why is the managed-service storage metric slightly different from this card?
On managed services the provider’s storage metric is the canonical figure and the card reports against it. Small differences come from poll timing and from the provider counting some internal overhead (snapshots, WAL on Aurora) that a raw datadir view would not. Treat the provider metric as truth on managed instances and reconcile the card to it.
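
The answer on raising the alert threshold turns on how many hours the space above the threshold represents. A minimal sketch of that estimate follows; the growth rate is a hypothetical input you would take from your own monitoring, and the function names are illustrative.

```python
def headroom_gb(total_gb: float, used_pct: float) -> float:
    """Free space left on the volume at a given usage percentage."""
    return total_gb * (100.0 - used_pct) / 100.0


def hours_until_full(total_gb: float, used_pct: float, growth_gb_per_hour: float) -> float:
    """Hours until the volume fills at a constant growth rate.

    Returns infinity when the volume is not growing.
    """
    if growth_gb_per_hour <= 0:
        return float("inf")
    return headroom_gb(total_gb, used_pct) / growth_gb_per_hour


# Hypothetical growth rate of 10 GB/hour under heavy write load.
small = hours_until_full(total_gb=500, used_pct=90, growth_gb_per_hour=10)
large = hours_until_full(total_gb=2000, used_pct=90, growth_gb_per_hour=10)
print(f"500 GB volume at 90%: {small:.1f} hours of runway")
print(f"2000 GB volume at 90%: {large:.1f} hours of runway")
```

At the same assumed growth rate, the 500 GB volume has 5 hours left when the alert fires and the 2000 GB volume has 20. A constant rate is a simplification: bulk imports and pinned binary logs grow in bursts, so treat the result as an upper bound on response time.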

Tracked live in Vortex IQ Nerve Centre

Database Disk Usage % is one of hundreds of KPI pulses Vortex IQ tracks across MariaDB and 70+ other ecommerce connectors. Nerve Centre runs the detection layer; Vortex Mind investigates the cause when something moves; Ask Viq lets you interrogate any number in plain English. Start for free or book a demo to see this metric running on your own data.
