At a glance
How many database operations took longer than the slow-query threshold in the last 15 minutes. MongoDB’s database profiler records any operation whose execution time exceeds the slowms setting (100 milliseconds by default), capturing the query shape, the collection, the duration, and whether an index was used. This card counts those profiler entries over a rolling 15-minute window. A handful of slow operations is normal on any busy cluster; a rising count is the earliest, cheapest signal that something is degrading, whether a missing index, a collection scan, lock contention, or cache pressure, usually well before it shows up as user-visible latency or errors. The card raises a sensitivity alert at >10: more than ten slow operations in fifteen minutes means the slowness is no longer an occasional outlier but a pattern worth investigating now.
| What it tracks | The number of operations recorded by the database profiler whose execution time exceeded the slowms threshold, counted over the most recent 15 minutes. |
| Data source | Profiler entries with millis > slowms threshold (default 100ms). The profiler writes one document per slow operation to the capped system.profile collection in each profiled database; Vortex IQ counts the entries whose millis field exceeds the configured slowms. |
| Time window | 15m (rolling 15-minute window). The count covers slow operations recorded in the last fifteen minutes, so it reflects current behaviour rather than a lifetime total. |
| Alert trigger | >10. More than ten slow operations within the 15-minute window raises a sensitivity alert: the slowness is sustained enough to be a pattern, not a one-off. |
| What counts | Any profiled operation over threshold: finds, updates, deletes, aggregations, getmores, and commands. The duration measured is the operation’s total time including any waiting for locks. |
| What does NOT count | Operations on databases where the profiler is off (profiling is per-database, not cluster-wide), operations faster than slowms, and operations that were evicted from the capped system.profile collection before they were read (a very high slow-op rate can roll the cap). |
| Roles | owner, engineering, platform, dba |
Calculation
The value is a count of profiler documents over the slow threshold within the window.
- slowms is the threshold in milliseconds above which an operation is considered slow. The default is 100ms. It is set per mongod (and visible via the profiling status), so the card reflects whatever threshold the deployment actually uses. If a team has lowered slowms to 50ms to catch more, the count will be higher; if raised to 200ms, lower.
- The profiling level controls what gets written. Level 1 logs only operations slower than slowms (the usual production setting and exactly what this card needs). Level 2 logs every operation regardless of speed; the card still only counts those over slowms. Level 0 disables profiling, in which case nothing new is written to system.profile and this card reads zero.
- The profiler collection is capped (system.profile, 1 MB by default), so on a deployment generating a very high rate of slow operations, the oldest entries can be overwritten before they are read. In that scenario the true slow-op rate is higher than the count shown; the card surfaces a lower bound. This is rare in practice because a deployment producing enough slow ops to roll a 1 MB cap inside 15 minutes is already deep in the red on the alert.
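
The counting rule above can be sketched in plain JavaScript over profiler-shaped documents held in memory. This is an illustration under stated assumptions, not the card's actual implementation: the function name countSlowOps, the sample entries, and the choice to treat the window as open at the start and closed at the end are all hypothetical.

```javascript
// Sketch only: counts profiler-shaped documents held in memory.
// `countSlowOps` and its parameters are illustrative names, not a product API.
const WINDOW_MS = 15 * 60 * 1000; // the card's rolling 15-minute window

function countSlowOps(entries, slowms, now, windowMs = WINDOW_MS) {
  const end = now.getTime();
  const start = end - windowMs;
  return entries.filter((e) => {
    const t = e.ts.getTime();
    // Strictly over the threshold, and inside the window
    // (boundary handling here is an assumption).
    return e.millis > slowms && t > start && t <= end;
  }).length;
}

// Hypothetical entries: only `millis` and `ts` matter for the count.
const now = new Date("2026-04-16T13:05:00Z"); // 14:05 BST
const entries = [
  { millis: 140, ts: new Date("2026-04-16T13:01:00Z") }, // slow, in window
  { millis: 310, ts: new Date("2026-04-16T12:55:00Z") }, // slow, in window
  { millis: 100, ts: new Date("2026-04-16T13:02:00Z") }, // at threshold: not counted
  { millis: 90, ts: new Date("2026-04-16T13:03:00Z") },  // fast (recorded only at level 2)
  { millis: 250, ts: new Date("2026-04-16T12:40:00Z") }, // slow, but older than 15 minutes
];

const slowCount = countSlowOps(entries, 100, now);
console.log(slowCount); // 2
```

Against a live profiled database, roughly the same filter could be expressed in mongosh as db.system.profile.countDocuments({ millis: { $gt: slowms }, ts: { $gt: since } }), where slowms and since are values you supply.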
This card is the leading indicator for the Performance category. It moves before Query Latency p95 (ms) and well before Query Error Rate %, because a query is slow long before it is slow enough to time out and error.
Worked example
A platform team runs a replica set behind a catalogue and search service. Profiling is at level 1 with the default slowms of 100ms. Snapshot taken on 16 Apr 26 at 14:05 BST.
| 15-minute window | Slow ops | Top offending shape | Reading |
|---|---|---|---|
| 13:20 to 13:35 | 3 | find on orders by customerId | Normal background noise |
| 13:35 to 13:50 | 6 | find on products by tags | Edging up |
| 13:50 to 14:05 | 17 | find on products by tags (COLLSCAN) | Alert fires; one shape dominates |
- One query shape accounts for most of it. Fourteen of the seventeen slow operations are the same find on the products collection filtering by tags. The profiler entries show COLLSCAN in the plan, meaning every one of these queries is scanning the whole collection because there is no index on tags. This lines up with COLLSCAN Operations (24h) also climbing.
- Why now? A marketing change started linking to tag-filtered product pages an hour ago, so a query shape that was previously rare is now running on every page load. The collection is large enough that the scan crosses 100ms every time.
- The fix is an index, not more hardware. Each slow find reads the entire products collection from cache (or disk if it spills), which is why WiredTiger Cache Hit Rate % is also dipping: the repeated full scans are churning the working set. An index on tags turns each scan into an index lookup, dropping the per-query time from ~140ms to single-digit milliseconds.
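
The "one shape dominates" reading can be reproduced by grouping slow entries by an approximate shape. The sketch below is illustrative only: shapeKey and countByShape are hypothetical helpers, the shop database name is made up, and a shape is approximated as operation, namespace, filter field names, and plan summary. The three non-dominant entries are invented to make up the seventeen.

```javascript
// Sketch only: `shapeKey` and `countByShape` are illustrative helper names.
// A "shape" is approximated as op + namespace + filter field names + plan summary.
function shapeKey(entry) {
  const filter = (entry.command && entry.command.filter) || {};
  const filterKeys = Object.keys(filter).sort().join(",");
  return `${entry.op} ${entry.ns} {${filterKeys}} ${entry.planSummary}`;
}

function countByShape(entries) {
  const counts = new Map();
  for (const e of entries) {
    const key = shapeKey(e);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Most frequent shape first.
  return [...counts.entries()]
    .map(([shape, count]) => ({ shape, count }))
    .sort((a, b) => b.count - a.count);
}

// Illustrative data for the 13:50 to 14:05 window: 14 of 17 slow ops share
// one shape. The remaining three entries are made up for the example.
const slowOps = [
  ...Array.from({ length: 14 }, () => ({
    op: "query",
    ns: "shop.products",
    command: { find: "products", filter: { tags: "sale" } },
    planSummary: "COLLSCAN",
    millis: 140,
  })),
  ...Array.from({ length: 3 }, () => ({
    op: "query",
    ns: "shop.orders",
    command: { find: "orders", filter: { customerId: 42 } },
    planSummary: "IXSCAN { customerId: 1 }",
    millis: 120,
  })),
];

const ranked = countByShape(slowOps);
console.log(ranked[0]); // { shape: 'query shop.products {tags} COLLSCAN', count: 14 }
```

For the fix itself, an index on tags could be created in mongosh with db.products.createIndex({ tags: 1 }); if tags is an array field, MongoDB builds this as a multikey index.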
Sibling cards
| Card | Why pair it with Slow Ops | What the combination tells you |
|---|---|---|
| Top 10 Slow Operations | The detailed breakdown behind the count. | The count tells you how many; this table tells you which shapes, so you know what to index or rewrite. |
| COLLSCAN Operations (24h) | The most common cause of slow ops. | A slow-ops spike that tracks a COLLSCAN spike almost always points to a missing index, the easiest class of fix. |
| Query Latency p95 (ms) | The user-facing consequence. | Slow ops is the leading indicator; p95 latency is the lagging confirmation that users are feeling it. |
| Query Latency p99 (ms) | The tail that slow ops feeds. | A handful of very slow operations inflate p99 long before they move the median. |
| WiredTiger Cache Hit Rate % | The cache pressure that slow scans cause. | Repeated full scans evict the working set; a falling hit rate alongside slow ops points at scan-driven cache churn. |
| Query Error Rate % | The end-stage when slow becomes failed. | Slow ops that keep climbing eventually time out and convert to errors; watch both during an incident. |
| MongoDB Health Score | The composite that factors slow operations. | A sustained slow-ops breach drags the composite down before any single user complains. |
Reconciling against the source
Where to look in MongoDB’s own tooling:
- db.system.profile.find({ millis: { $gt: 100 } }).sort({ ts: -1 }) in mongosh against a profiled database returns the raw slow-operation entries this card counts, with the query shape, duration, plan summary, and timestamp.
- db.getProfilingStatus() confirms the profiling level and the slowms value in effect, so you can verify the threshold the card is counting against.
- The mongod log also records slow operations (lines tagged Slow query) independently of the profiler, controlled by the same slowms; this is a useful cross-check if profiling is off but logging is on.
- db.currentOp({ "secs_running": { $gte: 1 } }) shows operations that are slow right now and still running, which the profiler only records once they finish.
- On MongoDB Atlas, the Performance Advisor and the Profiler tab surface slow queries with the same slowms basis, and the Query Profiler view groups them by shape, which is the managed equivalent of the Top 10 Slow Operations card.
Why our number may legitimately differ from the profiler:
| Reason | Direction | Why |
|---|---|---|
| slowms value | Variable | If the deployment’s slowms differs from 100ms, the count reflects that threshold, not the nominal 100ms in the card title. Confirm with db.getProfilingStatus(). |
| Profiling per database | Our value lower | Profiling is enabled per database; operations on a database with profiling off are not counted. The mongod log may still show them. |
| Capped collection roll | Our value lower | A very high slow-op rate can overwrite the capped system.profile before entries are read, so the count is a lower bound during a severe event. |
| Window edges | Marginal | Our 15-minute window and your manual find time range rarely align to the second; entries near the boundary may be in one and not the other. |
Known limitations / FAQs
The card reads zero but I know my queries are slow. Why?
The most likely reason is that profiling is disabled (level 0) on the database in question. Profiling is set per database, not cluster-wide, so a newly created database inherits the default and may not be profiled. Check with db.getProfilingStatus() and enable level 1 with db.setProfilingLevel(1) to capture operations over slowms. The mongod log records slow queries independently, so cross-check there too.
Does enabling the profiler slow my database down?
Profiling at level 1 has negligible overhead because it only writes a document for operations that were already slow; the cost is one small insert per slow op into a capped collection. Level 2 (log everything) does add measurable overhead on a busy database because it writes a document for every operation, so it is a diagnostic setting, not a production one. This card works with level 1, the recommended production setting.
My slowms is set to 50ms, not 100ms. Does the card still make sense?
Yes, the card counts against whatever slowms the deployment actually uses; the 100ms in the title is the MongoDB default. A lower slowms will produce a higher count because it captures operations between 50ms and 100ms that the default would ignore. If you want the count to mean the same thing across deployments, standardise slowms and adjust the alert threshold accordingly.
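
A small arithmetic sketch shows how the same workload yields different counts under different slowms values. The durations are hypothetical, and countOver is an illustrative helper. Note that at level 1 a deployment only records operations over its own slowms, so this compares what three differently configured deployments would each report.

```javascript
// Sketch only: hypothetical operation durations (ms) in one 15-minute window.
const durations = [60, 75, 95, 120, 180, 240];

// Count operations strictly over a given slowms threshold.
const countOver = (millisList, slowms) =>
  millisList.filter((ms) => ms > slowms).length;

const byThreshold = {
  50: countOver(durations, 50),   // 6: also catches the 60, 75 and 95 ms operations
  100: countOver(durations, 100), // 3: the MongoDB default
  200: countOver(durations, 200), // 1: only the 240 ms operation
};
console.log(byThreshold);
```

With these figures, a fixed alert threshold would be crossed much sooner on the 50ms deployment than on the 200ms one, which is why the threshold should move with slowms.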
Many different query shapes are slow, not just one. What does that mean?
A single dominant shape usually points at one missing index or a bad query, an easy fix. A count spread across many unrelated shapes points at a systemic cause: cache pressure (the working set no longer fits, so everything reads from disk), lock contention, replication lag stealing resources from a secondary, or an under-provisioned member. Escalate to WiredTiger Cache Hit Rate %, Connections In Use, and the capacity cards rather than chasing individual queries.
The count keeps climbing and now I am seeing errors. Are they the same problem?
Usually yes, in sequence. A query that is slow can become a query that times out as load increases or the working set grows; the slow op converts to an error. This is why Query Error Rate % often follows a slow-ops spike with a lag. Fix the slow operations (index, rewrite, or capacity) and the errors typically resolve with them.
Can I change the alert threshold of 10?
Yes. Ten slow ops per 15 minutes is the generic default. Sensitivity thresholds are configurable per profile in the Sensitivity tab. A small, lightly loaded cluster may want a lower threshold so it catches degradation earlier; a large, high-throughput cluster that always carries some slow tail may want a higher one so the alert reflects a genuine change rather than its normal baseline.
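
The alert rule and the effect of a per-profile threshold can be sketched as follows. This is a hedged illustration, not the product's configuration format: the profile names and every threshold other than the default of 10 are hypothetical, and shouldAlert is an illustrative function name.

```javascript
// Sketch only: profile names and all thresholds except the default of 10
// are hypothetical, chosen to illustrate the trade-off.
const DEFAULT_THRESHOLD = 10;
const thresholds = {
  default: DEFAULT_THRESHOLD,
  smallCluster: 5,     // lightly loaded: catch degradation earlier
  highThroughput: 25,  // always carries some slow tail
};

// The alert fires when the count is strictly greater than the threshold (">10").
function shouldAlert(slowOpCount, profile = "default") {
  const threshold = thresholds[profile] ?? DEFAULT_THRESHOLD;
  return slowOpCount > threshold;
}

// Slow-op counts from the worked example's three windows.
const windowCounts = [3, 6, 17];
const alerts = windowCounts.map((count) => shouldAlert(count));
console.log(alerts); // [ false, false, true ]
```

Under these assumed values, the 13:35 to 13:50 window (6 slow ops) would already alert on the smallCluster profile, while the 17-op window would not alert on highThroughput.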