At a glance
Database Disk Usage % is the percentage of provisioned storage your MongoDB deployment is consuming on its data volume. It is the single most consequential capacity number a DBA watches, because MongoDB does not degrade gracefully when it runs out of disk: it stops accepting writes. A full data volume is not a slowdown, it is an outage. This card is the early-warning gauge that turns a silent, weeks-long fill into an action with days of runway to spare.
| Field | Detail |
|---|---|
| What it tracks | Used storage as a percentage of provisioned capacity on the data volume, including data files, indexes, the oplog, and WiredTiger overhead. |
| Data source | dbStats and db.stats() storage figures (storageSize, indexSize, fsTotalSize, fsUsedSize) cross-checked against the host or volume capacity. On Atlas, corroborated by the Disk Space Used / Disk Space Free metrics and the cluster’s configured storage tier. |
| Time window | RT (real-time). The headline is the present fill level; the trend line shows the fill rate, which is what gives you runway. |
| Alert trigger | > 90%. Crossing 90% escalates to the Nerve Centre alert feed because the remaining headroom is now days, not weeks, and write availability is at risk. |
| Why it matters | A full disk halts writes. Orders, sessions, inventory updates, anything that writes stops cold. Reads may continue but the application is effectively down for any mutation. |
| Reading the value | Below 70% is comfortable, 70% to 90% means plan, above 90% means act now. The rate of climb matters as much as the level: a steady 85% is calmer than an 80% climbing two points a day. |
| Roles | owner, engineering, operations |
Calculation
The card expresses used storage as a fraction of provisioned capacity:

Disk Usage % = (fsUsedSize / fsTotalSize) × 100

fsUsedSize and fsTotalSize come from the filesystem stats exposed alongside dbStats, so the number reflects the actual data volume the mongod process writes to, not just the logical size of the documents. That distinction matters because the used figure includes several things beyond raw documents:
- Data files (storageSize), the on-disk size of collections after WiredTiger compression.
- Indexes (indexSize), which on write-heavy or many-indexed collections can rival or exceed the data itself.
- The oplog, a capped collection that consumes a fixed slice of disk on every replica-set member.
- WiredTiger overhead, including the journal and space held by the storage engine that has not yet been returned to the filesystem after deletes. That space is released only by a compact or a resync.
The alert fires above 90% because that is the threshold where remaining runway compresses from weeks to days. The number alone is not the whole signal: the engine also surfaces the fill trend, so a flat 88% reads very differently from an 85% climbing steadily. Runway, not the instantaneous level, is what drives the action.
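As a minimal sketch, the ratio and the reading bands described in this card can be written as two small functions. The function names are illustrative, not part of the product, and the code assumes the stats document carries fsUsedSize and fsTotalSize as plain numbers, as returned by db.stats() in mongosh.

```javascript
// Sketch only: mirrors the ratio and reading bands described in this card.
// `stats` is assumed to be the document returned by db.stats(),
// with fsUsedSize and fsTotalSize present as plain numbers.
function diskUsagePercent(stats) {
  const used = Number(stats.fsUsedSize);
  const total = Number(stats.fsTotalSize);
  if (!(total > 0) || Number.isNaN(used)) {
    throw new Error("fsUsedSize / fsTotalSize missing or invalid");
  }
  return (used * 100) / total;
}

// Bands from "Reading the value": below 70, 70 to 90, above 90.
function readingBand(percent) {
  if (percent > 90) return "act now";
  if (percent >= 70) return "plan";
  return "comfortable";
}

// In mongosh (hypothetical usage): readingBand(diskUsagePercent(db.stats()))
```

Exactly 90% still reads as "plan" here because the alert is defined as strictly greater than 90%.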
Worked example
A platform team runs a replica set on 500 GB volumes backing an event-logging and order service. The Database Disk Usage % card has crept from 71% to 86% over three weeks. Snapshot taken on 09 Jun 26 at 10:15 BST. It is below the 90% alert line, but the trend is the story.

| Signal | Value | Note |
|---|---|---|
| fsUsedSize | 430 GB | of 500 GB provisioned |
| Disk usage | 86% | climbing ~2 points/week |
| Largest collection (events) | 180 GB data + 95 GB index | indexes nearly as large as data |
| Oplog | 50 GB | fixed |
| Estimated runway to 90% | ~14 days | at current fill rate |

Two options are on the table. First, reclaim: the events collection has a 30-day retention policy, but a chunk of old data never expired because the TTL index was dropped during an earlier migration. Restoring the TTL index lets MongoDB age out roughly 60 GB, though the space will only return to the filesystem after a compact or resync. Second, grow: expand the volume from 500 GB to 750 GB, which on a managed service is typically an online operation with no downtime. The team chooses both, fixing the TTL index for the long term and growing the volume for immediate breathing room.
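The effect of each option on the percentage is simple arithmetic. The sketch below uses only the figures from this example (430 GB used, 500 GB provisioned, roughly 60 GB reclaimable, 750 GB after growth); the function name is illustrative, and the result ignores data that keeps arriving in the meantime.

```javascript
// Illustrative arithmetic only, using the worked-example figures.
// Reclaimed space counts only once compact or a resync has returned it
// to the filesystem.
function usageAfter({ usedGB, totalGB, reclaimedGB = 0, newTotalGB = totalGB }) {
  return ((usedGB - reclaimedGB) * 100) / newTotalGB;
}

const reclaimOnly = usageAfter({ usedGB: 430, totalGB: 500, reclaimedGB: 60 });
const growOnly = usageAfter({ usedGB: 430, totalGB: 500, newTotalGB: 750 });
const both = usageAfter({ usedGB: 430, totalGB: 500, reclaimedGB: 60, newTotalGB: 750 });

console.log(reclaimOnly.toFixed(1), growOnly.toFixed(1), both.toFixed(1));
// prints: 74.0 57.3 49.3
```

Reclaiming alone moves the card back into the planning band; growing alone or doing both puts it below 70%.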
Two takeaways:
- Deleting data does not free disk immediately. WiredTiger keeps freed space internally. If the percentage barely moves after a big delete, that is expected; run compact (on a secondary first, then step down the primary and repeat) or resync the member to return space to the OS.
- Watch the rate, not just the level. A flat 88% with no growth is a managed situation. An 80% climbing two points a day crosses the 90% alert line in five days and fills the disk in ten. The alert line is a backstop; the trend is the planning tool.
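Runway can be estimated with a straight-line projection. This is a sketch under the assumption that the fill rate stays constant, which real workloads rarely guarantee; the function name is illustrative and is not how the card itself is documented to compute its trend.

```javascript
// Days until `targetPct` is reached, assuming a constant fill rate
// expressed in percentage points per day (a simplifying assumption).
function runwayDays(currentPct, targetPct, pointsPerDay) {
  if (pointsPerDay <= 0) return Infinity; // flat or shrinking: no projected crossing
  return Math.max(0, (targetPct - currentPct) / pointsPerDay);
}

// Worked example: 86% climbing ~2 points per week, runway to the 90% line.
const exampleRunway = runwayDays(86, 90, 2 / 7); // about 14 days

// An 80% reading climbing two points a day:
const toAlert = runwayDays(80, 90, 2);  // 5 days to the alert line
const toFull = runwayDays(80, 100, 2);  // 10 days to a full disk
```

A flat reading returns Infinity, which matches the point that a steady 88% is a managed situation while a lower but climbing number is not.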
Sibling cards
| Card | Why pair it with Database Disk Usage % | What the combination tells you |
|---|---|---|
| MongoDB Health Score | Disk past 90% caps the composite. | A health-score dip driven by disk equals a durability risk, not a performance one. |
| WiredTiger Cache Hit Rate % | Storage engine companion. | Low cache hit plus high disk equals a working set that no longer fits in memory. |
| WiredTiger Dirty Cache % | Eviction pressure context. | High dirty cache plus full disk equals checkpoint and eviction stress. |
| Memory Resident (MB) | Working-set sizing. | Growing data with flat memory means more spilling to disk. |
| Last Successful Backup (hours ago) | Backups need their own headroom. | A full data volume can also block backup staging. |
| Operations per Second (live) | Write load drives the fill rate. | High write ops plus rising disk equals a faster runway burn. |
| Replica Lag (seconds) | A near-full secondary can stall replication. | Disk pressure on a secondary can present first as lag. |
Reconciling against the source
Where to look in MongoDB’s own tooling:
- db.stats() and db.runCommand({ dbStats: 1 }) give storageSize, indexSize, and the filesystem totals; fsUsedSize / fsTotalSize is the same ratio this card reports.
- db.collection.stats() breaks the usage down per collection so you can find the heavy hitter, including the index-to-data ratio.
- At the OS level, df -h on the data path is the ground truth for the volume; it will agree with fsTotalSize / fsUsedSize barring other tenants on the same volume.
- The Atlas Metrics tab exposes Disk Space Used, Disk Space Free, and Disk Space Percent Free per node; the configured storage tier is the denominator.

Why our number may legitimately differ from a manual read:

| Reason | Direction | Why |
|---|---|---|
| Logical vs physical | Confusion, not divergence | dataSize (logical, uncompressed) is usually larger than storageSize (compressed on-disk); this card uses filesystem usage, the physical truth. |
| Reclaimed-but-not-released space | Card higher after deletes | WiredTiger holds freed space internally; df and this card still count it until compact or resync. |
| Per-node differences | Members can diverge | Oplog window and index build state differ per member, so two replica-set nodes can show different percentages. |
| Shared volume | Card higher than dbStats | If other processes share the data volume, filesystem usage exceeds what MongoDB alone accounts for. |
| Sampling interval | Marginal lag on spikes | A sudden bulk load between polls appears on the next sample, not instantly. |
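To see how much of the filesystem figure MongoDB's own per-database numbers account for, one rough approach is to sum storageSize and indexSize across every database on the node and compare the total with fsUsedSize. The sketch below assumes an array holding one db.stats() result per database (including local, where the oplog lives) and assumes all databases sit on the same volume; the function name is illustrative.

```javascript
// `allDbStats`: assumed array of db.stats() documents, one per database on the node.
// The remainder is not an error figure: it covers the journal, diagnostic
// files, any other tenants on the volume, and similar overhead.
function unattributedBytes(allDbStats) {
  if (allDbStats.length === 0) {
    throw new Error("need at least one dbStats document");
  }
  const attributed = allDbStats.reduce(
    (sum, s) => sum + s.storageSize + s.indexSize,
    0
  );
  // Same volume assumed, so every document reports the same fsUsedSize.
  const fsUsed = allDbStats[0].fsUsedSize;
  return { attributed, fsUsed, unattributed: fsUsed - attributed };
}
```

A large and growing unattributed figure suggests looking at what else shares the volume before assuming the database is the cause.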
Known limitations / FAQs
I deleted millions of documents but the percentage barely moved. Is the card wrong?
No, this is expected WiredTiger behaviour. Deleting documents marks the space as reusable inside the storage engine but does not return it to the filesystem, so df and this card still count it as used. To actually reclaim disk, run compact on the collection (do it on a secondary, then step down and repeat to avoid blocking the primary) or resync the member from scratch. Until then, the percentage reflects physical occupancy, which has not changed.
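Before scheduling a compact, it helps to check how much space the collection is actually holding for reuse. The sketch below reads the "file bytes available for reuse" counter from the wiredTiger "block-manager" section of a collection stats document; treat the exact field path as an assumption to verify against your MongoDB version, and note that it covers the collection's data file only, not its indexes. The function name is illustrative.

```javascript
// `collStats`: assumed to be the document returned by db.<collection>.stats().
// Returns the bytes WiredTiger is holding for reuse in the collection's
// data file, or null when the counter is not present.
function reusableBytes(collStats) {
  const blockManager = (collStats.wiredTiger || {})["block-manager"] || {};
  const value = blockManager["file bytes available for reuse"];
  return typeof value === "number" ? value : null;
}

// In mongosh (hypothetical usage): reusableBytes(db.events.stats())
```

If the figure is small relative to the volume, compact will not move the card much and growing the volume is the more effective lever.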
What happens when the disk actually fills?
MongoDB stops accepting writes. Reads may continue, but inserts, updates, and deletes fail, and on a replica set a full primary can trigger a step-down. This is why the alert is at 90% and not at 99%: by the time you are at 99% you may already be hours from a write outage. Treat the 90% crossing as a deadline to either reclaim space or grow the volume.
Should I alert earlier than 90%?
If your fill rate is fast or your provisioning lead time is long, yes. The 90% default is a backstop. Many teams set a profile-level threshold at 80% in the Sensitivity tab so the warning arrives with more runway. The right line depends on how quickly you can add storage: an Atlas cluster that resizes online in minutes can tolerate a higher threshold than a self-hosted volume that needs a maintenance window.
Why does the index size sometimes rival the data size?
On collections with many indexes, especially compound or multikey indexes on large documents, the index footprint can approach or exceed the data footprint. This is normal and not a bug, but it does mean disk planning must account for indexes, not just documents. Use db.collection.stats() to see the indexSizes breakdown and prune indexes that the query planner never uses; pair with COLLSCAN Operations (24h) to confirm an index is genuinely unused before dropping it.
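A small helper can turn the indexSizes breakdown into a ranked list and an index-to-data ratio. This is a sketch that assumes a collection stats document with indexSizes (index name to bytes) and storageSize, as returned by db.collection.stats(); the function name is illustrative.

```javascript
// `collStats`: assumed to be the document returned by db.<collection>.stats().
// Ranks indexes by on-disk size and compares total index size with data size.
function indexFootprint(collStats) {
  const indexes = Object.entries(collStats.indexSizes || {})
    .map(([name, bytes]) => ({ name, bytes }))
    .sort((a, b) => b.bytes - a.bytes); // largest index first
  const totalIndexBytes = indexes.reduce((sum, i) => sum + i.bytes, 0);
  return {
    indexes,
    totalIndexBytes,
    indexToDataRatio: totalIndexBytes / collStats.storageSize,
  };
}

// In mongosh (hypothetical usage): indexFootprint(db.events.stats())
```

A ratio near or above 1 means indexes occupy about as much disk as the data, which is the situation this question describes; the ranked list shows where to start looking for pruning candidates.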
Does the oplog count toward this number?
Yes. The oplog is a capped collection stored on the same data volume, so its fixed size is part of used storage on every replica-set member. A large oplog (sized for a long replication window) is a deliberate trade-off: more recovery runway in exchange for permanent disk. If disk is tight and your replication window is generous, shrinking the oplog is one lever, though it reduces how long a secondary can be offline before needing a full resync.
On a sharded cluster, what does this card show?
By default it reflects the deployment the connector is scoped to. For per-shard disk health, scope the connector to individual shard members or read the per-shard figures, because one shard can be near-full while the cluster average looks comfortable. An uneven fill across shards often points at a poor shard key; cross-reference Shard Balance Skew %.
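The gap between a comfortable average and a near-full shard is easy to demonstrate. The sketch below takes per-shard disk percentages that you have already collected; the shard names and values in the example are hypothetical, and the function name is illustrative.

```javascript
// `percentByShard`: assumed map of shard name to disk usage percentage.
function shardDiskSummary(percentByShard) {
  const entries = Object.entries(percentByShard);
  if (entries.length === 0) throw new Error("no shards supplied");
  const mean = entries.reduce((sum, [, pct]) => sum + pct, 0) / entries.length;
  const [fullestShard, maxPct] = entries.reduce((a, b) => (b[1] > a[1] ? b : a));
  return { mean, fullestShard, maxPct };
}

// Hypothetical figures: the average looks fine, one shard is past the alert line.
const summary = shardDiskSummary({ shardA: 55, shardB: 60, shardC: 92 });
// summary.mean is 69, summary.fullestShard is "shardC", summary.maxPct is 92
```

Alerting on the maximum rather than the mean is the safer default, since writes routed to the fullest shard fail regardless of how much room the others have.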
Can backups fail because of disk usage?
Yes. Some backup methods stage data locally or need temporary space, and a near-full volume can cause a backup to fail or run long. If Last Successful Backup (hours ago) starts climbing at the same time disk usage approaches its ceiling, suspect a space-starved backup process and free headroom before assuming the backup job itself is broken.