At a glance
This card reports how long it has been, in hours, since your most overdue busy table was last vacuumed. PostgreSQL relies on autovacuum to clean up dead row versions and, critically, to advance the transaction-ID horizon that keeps the database from hitting wraparound. When a hot table has not been vacuumed in more than 24 hours, autovacuum is starved: it is not keeping up, has been disabled, or is being blocked. This is a PostgreSQL-distinctive risk with no exact equivalent in other databases, and at its extreme it ends in the single scariest PostgreSQL failure mode: transaction-ID wraparound, where the database forces itself into a read-only shutdown to protect your data. For a DBA this is a slow-burning, high-consequence alarm.
| What it tracks | The maximum time since last (auto)vacuum across the instance’s busy tables, expressed in hours. The “hottest table not vacuumed in >24h” is the headline. |
| Data source | last_autovacuum and last_vacuum from pg_stat_user_tables, weighted by write activity (n_tup_ins + n_tup_upd + n_tup_del) so the “hottest” table is the one that both changes a lot and has gone longest without a vacuum. TXID horizon is read from age(relfrozenxid) in pg_class. |
| Time window | RT (real-time, evaluated on each poll, roughly every 60 seconds). |
| Alert trigger | >24h. A busy table that has gone more than 24 hours without a vacuum fires the card; a rarely-written table that legitimately does not need vacuuming does not, because it carries no autovacuum debt. |
| Roles | dba, platform, sre |
Calculation
The card finds the busy table that has gone longest without a vacuum:last_autovacuum of “never” or weeks ago, and that is perfectly fine: with no dead tuples accumulating, it does not need a vacuum. The card weights by write activity so it surfaces the table that is both changing fast and not being cleaned, which is the table that is accumulating dead tuples and, separately, letting its frozen-XID horizon drift.
There are two distinct dangers the age signals, and they share a cause:
- Bloat. Every
UPDATEandDELETEleaves a dead row version behind. Vacuum reclaims that space for reuse. Without it, the table and its indexes grow, sequential scans read more pages, and the buffer cache holds more dead weight. This shows up next in Top Tables by Dead Tuples. - Transaction-ID wraparound. PostgreSQL uses a 32-bit transaction counter. Vacuum “freezes” old rows so their transaction IDs can be safely reused. If freezing falls far enough behind (
age(relfrozenxid)approachingautovacuum_freeze_max_age, default 200 million), PostgreSQL escalates to an aggressive anti-wraparound autovacuum, and if that still cannot keep up, it shuts the database into read-only mode to prevent data loss. The age-in-hours is the early, human-readable warning that this process is not keeping pace.
Worked example
A platform team runs PostgreSQL 14 backing an order-processing service. Theorder_events table is append-and-update heavy: roughly 4 million writes a day. Snapshot taken on 18 Apr 26 at 06:00 BST.
| Table | Writes/day | last_autovacuum | Age (h) | age(relfrozenxid) | State |
|---|---|---|---|---|---|
| order_events | 4.0M | 16 Apr, 03:12 | 50.8 | 121M | BREACH |
| order_line_items | 3.6M | 18 Apr, 04:55 | 1.1 | 18M | healthy |
| customers | 40k | 17 Apr, 22:30 | 7.5 | 31M | healthy |
| country_codes (static) | 0 | never | n/a | 0.2M | ignored (not busy) |
order_events. The headline reads Oldest Autovacuum Age 50.8h (BREACH). The DBA reads:
- The hottest table is the starved one.
order_eventstakes 4 million writes a day and has not been vacuumed in over two days. Every one of those updates and deletes left a dead tuple behind that has not been reclaimed. That is a lot of bloat building quietly. - The frozen-XID horizon is drifting.
age(relfrozenxid)of 121 million is over half of the 200-million defaultautovacuum_freeze_max_age. It is not an emergency yet, but it is heading the wrong way, and on the busiest table. Left alone, an anti-wraparound autovacuum will eventually kick in hard. - Autovacuum is not disabled globally, just losing on this table. Other busy tables (
order_line_itemsat 1.1h) are being vacuumed fine. So the cause is table-specific: eitherorder_eventsexceeds the cost-limit budget autovacuum will spend per run, or something is blocking it.
- Age-in-hours is the early warning; wraparound is the catastrophe. You will never want to read this card at the wraparound stage. 24 hours on a busy table is the nudge to look now, while the fix is a one-line
ALTER TABLEand not an emergency single-user-modeVACUUM. - A single starved table usually points to a single blocker. When one busy table lags while others are clean, suspect a long-lived transaction pinning the snapshot, or per-table autovacuum settings that are too conservative for that table’s write rate.
- This is PostgreSQL-specific. Engines without PostgreSQL’s MVCC-plus-32-bit-XID design do not have this exact failure mode. Treat autovacuum health as a first-class operational signal, not an afterthought.
Sibling cards
| Card | Why pair it with Oldest Autovacuum Age | What the combination tells you |
|---|---|---|
| Top Tables by Dead Tuples | The bloat consequence of a starved vacuum. | Old age plus high dead tuples on the same table confirms real accumulation. |
| Idle-in-Transaction Backends | The most common reason vacuum cannot reclaim. | A long idle-in-tx session pins the snapshot and starves vacuum database-wide. |
| Database Disk Usage % | Bloat from un-vacuumed tables consumes disk. | Rising age plus rising disk equals dead tuples eating storage. |
| Buffer Cache Hit Rate % | Bloated tables waste cache on dead pages. | A falling hit rate alongside high bloat suggests dead weight in the cache. |
| Query Latency p95 (ms) | Bloat slows scans and index lookups. | Latency creeping up on a bloated table is the performance tax of starved vacuum. |
| WAL Lag Bytes (primary -> standby) | Aggressive catch-up vacuums generate WAL. | A vacuum storm clearing backlog can briefly push WAL lag up. |
| PostgreSQL Health Score | The composite includes autovacuum-current as a factor. | A starved table pulls the composite down before any query feels it. |
Reconciling against the source
Where to look in PostgreSQL’s own tooling:RunWhy our number may legitimately differ from PostgreSQL’s own view:SELECT relname, last_autovacuum, last_vacuum, n_dead_tup, n_live_tup, autovacuum_count FROM pg_stat_user_tables ORDER BY last_autovacuum ASC NULLS FIRST;. The top rows are your oldest vacuums; ignore the static tables with no dead tuples. Check the wraparound horizon withSELECT relname, age(relfrozenxid) AS xid_age FROM pg_class WHERE relkind = 'r' ORDER BY xid_age DESC LIMIT 10;and compare againstSHOW autovacuum_freeze_max_age;. See whether an autovacuum is running right now withSELECT pid, query, now() - xact_start AS running_for FROM pg_stat_activity WHERE query LIKE 'autovacuum:%';. On a managed service, the provider surfaces aMaximumUsedTransactionIDsmetric (RDS / Aurora) or an equivalent XID-age chart (Cloud SQL, Azure), which is the same wraparound horizon the card watches.
| Reason | Direction | Why |
|---|---|---|
| Busy-table weighting | Card ignores some old timestamps | A static table with an ancient last_autovacuum is excluded because it carries no autovacuum debt; a raw query sorted by timestamp would list it first. |
| last_autovacuum vs last_vacuum | Card uses the more recent | A manual VACUUM you ran resets last_vacuum, not last_autovacuum; the card takes the greater of the two, so a recent manual run clears the age. |
| Stats reset | Card may read “fresh” after a reset | pg_stat_reset() or a stats-file loss blanks last_autovacuum; the card then has no history until the next vacuum runs. |
| Poll timing | Brief skew | An autovacuum completing between polls updates the timestamp; the card reflects it on the next cycle. |
| Managed-service abstraction | Different framing | RDS / Cloud SQL headline the wraparound horizon (XID age) rather than hours-since-vacuum; both describe the same starvation, just on different axes. |
Known limitations / FAQs
My static lookup table shows last_autovacuum of “never”. Why isn’t the card firing? Because that table is not busy. A table with no inserts, updates, or deletes accumulates no dead tuples and does not need vacuuming, so “never vacuumed” is correct and harmless. The card weights by write activity and only fires on tables that are both changing and overdue. A “never” on a static table is expected, not a problem. Autovacuum is enabled and other tables are fine, but one busy table keeps ageing. Why? Two usual suspects. First, a long-running transaction (often idle-in-transaction) holds an old snapshot open, and vacuum cannot reclaim any dead tuple newer than that snapshot anywhere in the database; the hottest table notices first. Second, the table is large enough that autovacuum’s cost limit throttles each run before it finishes. Check for long transactions first, then tune the table’s per-table autovacuum settings to trigger sooner and spend more cost budget. What is transaction-ID wraparound and should I worry now at 24 hours? At 24 hours of vacuum age you are not at wraparound; you are at the early-warning stage where fixing it is cheap. Wraparound is the end state where the 32-bit transaction counter cannot safely advance because too many old rows are unfrozen; PostgreSQL protects your data by forcing the database read-only until an emergency vacuum completes. The whole point of this card is to keep you so far from that cliff that you never see it. Can I just disable autovacuum and run VACUUM on a schedule instead? You can, but it is rarely a good idea, and it makes this card more important, not less. Autovacuum exists precisely so freezing keeps pace with write rate automatically. A fixed schedule cannot adapt to traffic spikes, and a missed scheduled run is exactly how wraparound incidents happen. If you do run manual vacuums, the card creditslast_vacuum, so it will reflect your scheduled runs.
Does a manual VACUUM clear this card?
Yes. The card uses the more recent of last_autovacuum and last_vacuum, so a manual VACUUM on the offending table resets the age to near zero immediately. For a persistent breach, the manual vacuum is the short-term relief and per-table autovacuum tuning is the durable fix.
Why does the age sometimes jump back up right after a vacuum finishes?
If a long-running transaction is still open, the vacuum runs but cannot remove the newest dead tuples (those visible to the old snapshot), so on a very hot table the dead-tuple count and effective debt climb again quickly. The age timestamp resets, but the underlying starvation returns until the blocking transaction ends. This is why the card pairs so tightly with Idle-in-Transaction Backends.
On a managed service, does the provider handle vacuum for me?
The provider runs standard autovacuum with sensible defaults, but it does not tune per-table settings for your workload, and it cannot remove the long-transaction blocker that is the most common cause of starvation. Managed services still suffer autovacuum starvation and wraparound risk; the provider exposes the XID-age metric for exactly this reason. Treat this card as your in-context view of a risk the managed service does not solve on its own.