Database Disk Usage %, MongoDB - Vortex IQ Help Centre

Card class: Hero • Category: Executive Overview

At a glance

Database Disk Usage % is the percentage of provisioned storage your MongoDB deployment is consuming on its data volume. It is the single most consequential capacity number a DBA watches, because MongoDB does not degrade gracefully when it runs out of disk: it stops accepting writes. A full data volume is not a slowdown, it is an outage. This card is the early-warning gauge that turns a silent, weeks-long fill into an action with days of runway to spare.


What it tracks	Used storage as a percentage of provisioned capacity on the data volume, including data files, indexes, the oplog, and WiredTiger overhead.
Data source	`dbStats` and `db.stats()` storage figures (`storageSize`, `indexSize`, `fsTotalSize`, `fsUsedSize`) cross-checked against the host or volume capacity. On Atlas, corroborated by the `Disk Space Used` / `Disk Space Free` metrics and the cluster’s configured storage tier.
Time window	`RT` (real-time). The headline is the present fill level; the trend line shows the fill rate, which is what gives you runway.
Alert trigger	`> 90%`. Crossing 90% escalates to the Nerve Centre alert feed because the remaining headroom is now days, not weeks, and write availability is at risk.
Why it matters	A full disk halts writes. Orders, sessions, inventory updates, anything that writes stops cold. Reads may continue but the application is effectively down for any mutation.
Reading the value	Below 70%: comfortable. 70% to 90%: plan. Above 90%: act now. The rate of climb matters as much as the level: a steady 85% is calmer than an 80% climbing two points a day.
Roles	owner, engineering, operations
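
The reading bands in the table can be expressed as a small lookup. This is a minimal sketch, not the card's actual implementation: the function name `usageBand` is hypothetical, and it assumes exactly 90% still counts as "plan" because the alert trigger is written as strictly greater than 90%.

```javascript
// Hypothetical helper: maps a disk-usage percentage to the reading bands
// (below 70 comfortable, 70 to 90 plan, above 90 act now).
function usageBand(percent) {
  if (!Number.isFinite(percent) || percent < 0 || percent > 100) {
    throw new RangeError("percent must be a number between 0 and 100");
  }
  if (percent > 90) return "act now"; // alert trigger is "> 90%"
  if (percent >= 70) return "plan";
  return "comfortable";
}
```

A level of 86%, for instance, falls in the "plan" band even though the alert has not yet fired.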

Calculation

The card expresses used storage as a fraction of provisioned capacity:

Disk Usage % = fsUsedSize / fsTotalSize x 100

fsUsedSize and fsTotalSize come from the filesystem stats exposed alongside dbStats, so the number reflects the actual data volume the mongod process writes to, not just the logical size of the documents. That distinction matters because the used figure includes several things beyond raw documents:

Data files (storageSize), the on-disk size of collections after WiredTiger compression.
Indexes (indexSize), which on write-heavy or many-indexed collections can rival or exceed the data itself.
The oplog, a capped collection that consumes a fixed slice of disk on every replica-set member.
WiredTiger overhead, including the journal and space held by the storage engine that has not yet been returned to the filesystem after deletes.
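
As a rough illustration of the formula, the sketch below computes the percentage from a dbStats-shaped object. Only the field names `fsUsedSize` and `fsTotalSize` come from MongoDB; the function name and the sample figures are assumptions for illustration, and the values are assumed to arrive as plain numbers in bytes.

```javascript
// Sketch: derive Disk Usage % from a dbStats-shaped document.
// In mongosh the argument would typically be the result of db.stats().
function diskUsagePercent(stats) {
  const { fsUsedSize, fsTotalSize } = stats;
  if (!(fsTotalSize > 0)) {
    throw new RangeError("fsTotalSize must be a positive number");
  }
  return (fsUsedSize / fsTotalSize) * 100;
}

// Illustrative figures: 430 GB used on a 500 GB volume.
const GB = 1024 ** 3;
const sample = { fsUsedSize: 430 * GB, fsTotalSize: 500 * GB };
console.log(diskUsagePercent(sample).toFixed(1) + "%");
```

Because the inputs are filesystem totals, the result covers data files, indexes, the oplog, and storage-engine overhead together, which is why it can differ from a sum of document sizes.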

That last point is the one that surprises teams: deleting documents does not immediately shrink disk usage. WiredTiger marks the space as reusable internally but does not always release it back to the operating system, so a large delete can leave the percentage almost unchanged. Reclaiming it requires compact or a resync.

The alert fires above 90% because that is the threshold where remaining runway compresses from weeks to days. The number alone is not the whole signal: the engine also surfaces the fill trend, so a flat 88% reads very differently from an 85% climbing steadily. Runway, not the instantaneous level, is what drives the action.
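
One way to quantify a fill trend is a least-squares slope over recent samples. The sketch below is an assumption about how such a trend could be estimated, not a description of how the card computes it; the sample shape `{ day, percent }` and the function name are hypothetical.

```javascript
// Sketch: estimate the fill rate in percentage points per day
// from polled samples, using an ordinary least-squares slope.
function fillRatePerDay(samples) {
  if (samples.length < 2) {
    throw new RangeError("need at least two samples");
  }
  const n = samples.length;
  const meanDay = samples.reduce((sum, p) => sum + p.day, 0) / n;
  const meanPct = samples.reduce((sum, p) => sum + p.percent, 0) / n;
  let numerator = 0;
  let denominator = 0;
  for (const p of samples) {
    numerator += (p.day - meanDay) * (p.percent - meanPct);
    denominator += (p.day - meanDay) ** 2;
  }
  if (denominator === 0) {
    throw new RangeError("samples must cover more than one point in time");
  }
  return numerator / denominator;
}
```

A flat series at 88% yields a slope of zero, while a series rising from 85% by two points a week yields roughly 0.29 points a day, which is the distinction the paragraph above draws.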

Worked example

A platform team runs a replica set on 500 GB volumes backing an event-logging and order service. The Database Disk Usage % card has crept from 80% to 86% over three weeks, about two points a week. Snapshot taken on 09 Jun 26 at 10:15 BST. It is below the 90% alert line, but the trend is the story.

Signal	Value	Note
`fsUsedSize`	430 GB	of 500 GB provisioned
Disk usage	86%	climbing ~2 points/week
Largest collection (`events`)	180 GB data + 95 GB index	indexes nearly as large as data
Oplog	50 GB	fixed
Estimated runway to 90%	~14 days	at current fill rate

The DBA does the runway maths and confirms about two weeks before the 90% alert and roughly seven before the volume is full and writes stop.

Runway calculation:
  - free space now              = 500 - 430 = 70 GB
  - fill rate                   ~ 10 GB / week
  - headroom to 90% (450 GB)    = 20 GB -> ~2 weeks
  - headroom to 100%            = 70 GB -> ~7 weeks
  - decision deadline           = before the 90% alert, not after
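
The same arithmetic can be sketched as a function. The name `runwayWeeks` and its parameters are illustrative rather than part of any Vortex IQ or MongoDB API, and the sketch assumes a constant fill rate.

```javascript
// Sketch of the runway calculation: weeks until usage reaches a
// threshold, assuming the fill rate stays constant.
function runwayWeeks(totalGb, usedGb, fillRateGbPerWeek, thresholdPercent) {
  if (fillRateGbPerWeek <= 0) {
    return Infinity; // flat or shrinking usage never reaches the threshold
  }
  const thresholdGb = totalGb * (thresholdPercent / 100);
  const headroomGb = Math.max(0, thresholdGb - usedGb);
  return headroomGb / fillRateGbPerWeek;
}

// Figures from the worked example: 500 GB volume, 430 GB used, ~10 GB/week.
const weeksToAlert = runwayWeeks(500, 430, 10, 90);
const weeksToFull = runwayWeeks(500, 430, 10, 100);
console.log({ weeksToAlert, weeksToFull });
```

Returning `Infinity` for a flat fill rate is a design choice that keeps a steady volume from producing a false deadline.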

Two paths are open.

First, reclaim: the events collection has a 30-day retention policy, but a chunk of old data never expired because the TTL index was dropped during an earlier migration. Restoring the TTL index lets MongoDB age out roughly 60 GB, though the space will only return to the filesystem after a compact or resync.

Second, grow: expand the volume from 500 GB to 750 GB, which on a managed service is an online operation with no downtime.

The team chooses both, fixing the TTL index for the long term and growing the volume for immediate breathing room. Two takeaways:

Deleting data does not free disk immediately. WiredTiger keeps the freed space internally. If the percentage barely moves after a big delete, that is expected; run compact on a secondary first, then step down the primary and repeat, or resync the member to return space to the OS.
Watch the rate, not just the level. A flat 88% with no growth is a managed situation. An 80% climbing two points a day is five days from the 90% alert and ten days from a full volume. The alert line is a backstop; the trend is the planning tool.

Sibling cards

Card	Why pair it with Database Disk Usage %	What the combination tells you
MongoDB Health Score	Disk past 90% caps the composite.	A health-score dip driven by disk equals a durability risk, not a performance one.
WiredTiger Cache Hit Rate %	Storage engine companion.	Low cache hit plus high disk equals a working set that no longer fits in memory.
WiredTiger Dirty Cache %	Eviction pressure context.	High dirty cache plus full disk equals checkpoint and eviction stress.
Memory Resident (MB)	Working-set sizing.	Growing data with flat memory means more spilling to disk.
Last Successful Backup (hours ago)	Backups need their own headroom.	A full data volume can also block backup staging.
Operations per Second (live)	Write load drives the fill rate.	High write ops plus rising disk equals a faster runway burn.
Replica Lag (seconds)	A near-full secondary can stall replication.	Disk pressure on a secondary can present first as lag.

Reconciling against the source

Where to look in MongoDB’s own tooling:

db.stats() and db.runCommand({ dbStats: 1 }) give storageSize, indexSize, and the filesystem totals; fsUsedSize / fsTotalSize is the same ratio this card reports.
db.collection.stats() breaks the usage down per collection so you can find the heavy hitter, including the index-to-data ratio.
At the OS level, df -h on the data path is the ground truth for the volume; it will agree with fsTotalSize / fsUsedSize barring other tenants on the same volume.
The Atlas Metrics tab exposes Disk Space Used, Disk Space Free, and Disk Space Percent Free per node; the configured storage tier is the denominator.
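
To find the heavy hitter, per-collection figures can be ranked by total footprint. The sketch below assumes each input object carries the `ns`, `storageSize`, and `totalIndexSize` fields that collection stats report; the function name, the namespaces, and the `orders` figures are hypothetical, while the `events` figures echo the worked example.

```javascript
// Sketch: rank collections by on-disk footprint and compute the
// index-to-data ratio from collection-stats-shaped objects.
function heaviestCollections(collStatsList) {
  return collStatsList
    .map((s) => ({
      ns: s.ns,
      totalBytes: s.storageSize + s.totalIndexSize,
      // null when the collection has no data on disk yet
      indexToDataRatio: s.storageSize > 0 ? s.totalIndexSize / s.storageSize : null,
    }))
    .sort((a, b) => b.totalBytes - a.totalBytes);
}

const GB = 1024 ** 3;
const ranked = heaviestCollections([
  { ns: "app.orders", storageSize: 40 * GB, totalIndexSize: 10 * GB }, // made-up figures
  { ns: "app.events", storageSize: 180 * GB, totalIndexSize: 95 * GB },
]);
console.log(ranked[0].ns, ranked[0].indexToDataRatio.toFixed(2));
```

A ratio approaching 1 means indexes occupy nearly as much disk as the data, which is worth checking before deciding whether to reclaim or grow.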

Why our number may legitimately differ from a manual read:

Reason	Direction	Why
Logical vs physical	Confusion, not divergence	`dataSize` (logical, uncompressed) is usually larger than `storageSize` (compressed on-disk); this card uses filesystem usage, the physical truth.
Reclaimed-but-not-released space	Card higher after deletes	WiredTiger holds freed space internally; `df` and this card still count it until `compact` or resync.
Per-node differences	Members can diverge	Oplog window and index build state differ per member, so two replica-set nodes can show different percentages.
Shared volume	Card higher than `storageSize` + `indexSize`	If other processes share the data volume, filesystem usage exceeds what MongoDB alone accounts for.
Sampling interval	Marginal lag on spikes	A sudden bulk load between polls appears on the next sample, not instantly.
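
A quick way to size the gap between filesystem usage and what MongoDB itself accounts for is to subtract the summed `storageSize` and `indexSize` of every database from `fsUsedSize`. This is a rough sketch under stated assumptions: the function name is hypothetical, all databases are assumed to live on the same volume, and the remainder also includes the journal and other engine files, so a gap is a hint rather than proof of a shared volume.

```javascript
// Sketch: bytes in use on the volume that are not explained by the
// data and index files of the databases supplied.
function unaccountedBytes(perDbStats) {
  if (perDbStats.length === 0) {
    throw new RangeError("need at least one dbStats document");
  }
  const accounted = perDbStats.reduce(
    (sum, s) => sum + s.storageSize + s.indexSize,
    0
  );
  // fsUsedSize describes the whole filesystem, so it is read once
  // (assumes every database sits on the same volume).
  return perDbStats[0].fsUsedSize - accounted;
}
```

Include the `local` database in the input if the oplog should count as accounted space.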

Known limitations / FAQs

I deleted millions of documents but the percentage barely moved. Is the card wrong?
No, this is expected WiredTiger behaviour. Deleting documents marks the space as reusable inside the storage engine but does not return it to the filesystem, so df and this card still count it as used. To actually reclaim disk, run compact on the collection (do it on a secondary, then step down and repeat to avoid blocking the primary) or resync the member from scratch. Until then, the percentage reflects physical occupancy, which has not changed.

What happens when the disk actually fills?
MongoDB stops accepting writes. Reads may continue, but inserts, updates, and deletes fail, and on a replica set a full primary can trigger a step-down. This is why the alert is at 90% and not at 99%: by the time you are at 99% you may already be hours from a write outage. Treat the 90% crossing as a deadline to either reclaim space or grow the volume.

Should I alert earlier than 90%?
If your fill rate is fast or your provisioning lead time is long, yes. The 90% default is a backstop. Many teams set a profile-level threshold at 80% in the Sensitivity tab so the warning arrives with more runway. The right line depends on how quickly you can add storage: an Atlas cluster that resizes online in minutes can tolerate a higher threshold than a self-hosted volume that needs a maintenance window.

Why does the indexed size sometimes rival the data size?
On collections with many indexes, especially compound or multikey indexes on large documents, the index footprint can approach or exceed the data footprint. This is normal and not a bug, but it does mean disk planning must account for indexes, not just documents. Use db.collection.stats() to see the indexSizes breakdown and prune indexes that the query planner never uses; pair with COLLSCAN Operations (24h) to confirm an index is genuinely unused before dropping it.

Does the oplog count toward this number?
Yes. The oplog is a capped collection stored on the same data volume, so its fixed size is part of used storage on every replica-set member. A large oplog (sized for a long replication window) is a deliberate trade-off: more recovery runway in exchange for permanent disk. If disk is tight and your replication window is generous, shrinking the oplog is one lever, though it reduces how long a secondary can be offline before needing a full resync.

On a sharded cluster, what does this card show?
By default it reflects the deployment the connector is scoped to. For per-shard disk health, scope the connector to individual shard members or read the per-shard figures, because one shard can be near-full while the cluster average looks comfortable. An uneven fill across shards often points at a poor shard key; cross-reference Shard Balance Skew %.

Can backups fail because of disk usage?
Yes. Some backup methods stage data locally or need temporary space, and a near-full volume can cause a backup to fail or run long. If Last Successful Backup (hours ago) starts climbing at the same time disk usage approaches its ceiling, suspect a space-starved backup process and free headroom before assuming the backup job itself is broken.
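
The sharded-cluster caveat can be illustrated with a small summary that reports both the average and the fullest shard. This is a sketch with hypothetical shard names and made-up figures; it assumes one `{ name, fsUsedSize, fsTotalSize }` reading per shard, taken from whichever member you choose to sample.

```javascript
// Sketch: compare the cluster-average disk usage with the fullest shard.
function shardDiskSummary(shards) {
  if (shards.length === 0) {
    throw new RangeError("need at least one shard reading");
  }
  const withPercent = shards.map((s) => ({
    name: s.name,
    percent: (s.fsUsedSize / s.fsTotalSize) * 100,
  }));
  const average =
    withPercent.reduce((sum, s) => sum + s.percent, 0) / withPercent.length;
  const fullest = withPercent.reduce((a, b) => (b.percent > a.percent ? b : a));
  return { average, fullest };
}

// Made-up readings in GB: the average looks comfortable, one shard does not.
const summary = shardDiskSummary([
  { name: "shardA", fsUsedSize: 460, fsTotalSize: 500 },
  { name: "shardB", fsUsedSize: 300, fsTotalSize: 500 },
  { name: "shardC", fsUsedSize: 290, fsTotalSize: 500 },
]);
console.log(summary.average.toFixed(0), summary.fullest.name);
```

Alerting on the fullest shard rather than the average avoids the masking effect described above.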

Tracked live in Vortex IQ Nerve Centre

Database Disk Usage % is one of hundreds of KPI pulses Vortex IQ tracks across MongoDB and 70+ other ecommerce connectors. Nerve Centre runs the detection layer; Vortex Mind investigates the cause when something moves; Ask Viq lets you interrogate any number in plain English. Start for free or book a demo to see this metric running on your own data.
