At a glance
WAL Lag Bytes is the volume of write-ahead log, measured in bytes, that the primary has generated but a streaming standby has not yet received or replayed. It is computed as pg_wal_lsn_diff(primary_lsn, standby_lsn): the byte distance between the primary’s current write position and where the standby has caught up to. A small, stable number is healthy. A number that climbs and keeps climbing means the standby is falling behind, and the further behind it gets, the more data you stand to lose in a failover and the longer a promotion will take to replay.
| Source columns | pg_stat_replication on the primary (sent_lsn, write_lsn, flush_lsn, replay_lsn) compared against pg_current_wal_lsn(), differenced with pg_wal_lsn_diff(). The card surfaces the gap to each connected standby. |
| Metric basis | Byte distance in the WAL stream, not time. Bytes are the truth source; seconds-of-lag (see Replication Lag (seconds)) is derived and can read zero on an idle primary even when bytes are outstanding. |
| What “lag” means here | By default the card measures the send/flush gap (bytes not yet shipped to the standby). The replay gap (bytes shipped but not yet applied) is tracked separately and surfaced when it diverges from the send gap, which is the signature of a standby that is receiving fine but replaying slowly. |
| Aggregation window | Real-time, sampled every refresh cycle. Sustained growth across consecutive samples is the concerning pattern, not a single spike during a bulk write. |
| Multiple standbys | One reading per connected standby. The headline shows the worst (largest) lag across all standbys, since failover readiness is gated by your best-positioned replica. |
| Time window | RT (real-time, sampled every refresh cycle). |
| Alert trigger | > 1GB of outstanding WAL to any standby. A gigabyte of unshipped WAL is roughly the point where a wal_keep_size overrun and slot bloat become real risks. |
| Roles | DBA, platform engineering, SRE. |
Calculation
The engine queries pg_stat_replication on the primary and, for each connected standby, computes the byte gap with pg_wal_lsn_diff().
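A minimal sketch of such a query, assuming PostgreSQL 10 or later and taking each gap as the distance from pg_current_wal_lsn(). The exact statement the engine runs is not reproduced here; the column aliases simply follow the names used in this section.

```sql
-- Run on the primary. One row per connected standby.
-- Each gap is measured from the primary's current WAL position, so the
-- three values are cumulative rather than independent.
-- Non-privileged roles see NULL LSNs; use a superuser or a pg_monitor member.
SELECT application_name,
       client_addr,
       state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn)   AS send_lag_bytes,
       pg_wal_lsn_diff(pg_current_wal_lsn(), flush_lsn)  AS flush_lag_bytes,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication
ORDER BY send_lag_bytes DESC NULLS LAST;
```

With this ordering the first row is the standby with the largest send gap, which corresponds to the worst-standby reading the headline shows.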
send_lag_bytes is what the headline reports by default (bytes the primary has not yet sent). flush_lag_bytes adds the bytes received but not yet durably written on the standby. replay_lag_bytes adds the bytes written but not yet applied. The three values are nested: replay lag is always greater than or equal to flush lag, which is greater than or equal to send lag. When the replay gap balloons while the send gap stays small, the network is fine but the standby cannot keep up with apply, often because a long-running read query on the standby is blocking WAL replay (a hot-standby conflict).
An LSN (log sequence number) is a position in the WAL stream expressed as a hex pair such as 3A/7F2C8B40. pg_wal_lsn_diff() subtracts two LSNs and returns the byte distance, so the card is reading the literal number of bytes between two points in the log.
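To make the arithmetic concrete, the sketch below diffs the example LSN against two illustrative positions chosen for this example (they are not taken from a real system). The part before the slash is the high 32 bits of a 64-bit position and the part after is the low 32 bits, so the subtraction carries across the slash.

```sql
-- Same high word: plain hex subtraction of the low words.
-- 0x7F2C8B40 - 0x7F000000 = 0x2C8B40 = 2919232 bytes
SELECT pg_wal_lsn_diff('3A/7F2C8B40'::pg_lsn, '3A/7F000000'::pg_lsn)
       AS bytes_same_high_word;

-- Different high word: each step in the high part is 2^32 bytes.
-- 0x3B00000000 - 0x3A7F2C8B40 = 2161341632 bytes
SELECT pg_wal_lsn_diff('3B/00000000'::pg_lsn, '3A/7F2C8B40'::pg_lsn)
       AS bytes_across_high_word;

-- The pg_lsn type also supports subtraction directly.
SELECT '3A/7F2C8B40'::pg_lsn - '3A/7F000000'::pg_lsn AS bytes_via_operator;
```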
Worked example
A platform team runs a primary with two streaming standbys: standby-a (same availability zone, serves read traffic) and standby-b (cross-region, kept for disaster recovery). Snapshot taken on 14 Apr 26 at 09:40 BST during a nightly bulk reindex job.
| Standby | State | Send lag | Flush lag | Replay lag |
|---|---|---|---|---|
| standby-a | streaming | 12 MB | 14 MB | 18 MB |
| standby-b | streaming | 1.4 GB | 1.4 GB | 1.4 GB |
The > 1GB alert has tripped on standby-b. The team reads the two rows very differently:
- standby-a is healthy. Twelve megabytes of send lag with eighteen megabytes of replay lag is normal churn during a bulk write. The replay gap is only marginally larger than the send gap, so apply is keeping pace. No action.
- standby-b has fallen behind across the board. Send, flush, and replay are all stuck at 1.4 GB and equal to each other. Because the send gap itself is large, this is not a slow-apply problem on the standby; it is a shipping problem. The bytes are not leaving the primary fast enough for the cross-region link.
max_wal_size and confirms the replication slot is reserving WAL (so the standby is not abandoned), or accepts that standby-b will need a rebuild. The DR standby being 1.4 GB behind means a failover to it right now would lose the last several minutes of writes, which is the recovery-point objective (RPO) breach this card exists to catch.
Three takeaways:
- Bytes, not seconds, are the honest measure. On an idle primary the seconds-of-lag card can read zero while gigabytes sit unshipped, because no new commits are arriving to timestamp. Always read WAL lag in bytes when you are sizing failover risk.
- The shape of the three numbers tells you where the bottleneck is. Large send gap = shipping/network problem. Small send gap but large replay gap = apply problem on the standby, usually a hot-standby query conflict.
- A reserved replication slot is double-edged. It guarantees the primary keeps WAL the standby still needs, which protects the standby, but if the standby never catches up the primary’s pg_wal directory grows without bound and you risk filling the data disk. Pair this card with Database Disk Usage %.
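The second takeaway can be sketched as a CASE expression over the three gaps. The rows below are the worked-example figures converted to bytes; using the 1 GB alert threshold as the single cut-off for both the send gap and the replay gap is an assumption made for illustration, not the card's documented rule.

```sql
WITH gaps (standby, send_lag_bytes, flush_lag_bytes, replay_lag_bytes) AS (
    VALUES
        ('standby-a', 12582912,   14680064,   18874368),   -- 12 MB, 14 MB, 18 MB
        ('standby-b', 1503238553, 1503238553, 1503238553)  -- about 1.4 GB each
)
SELECT standby,
       CASE
           -- Large send gap: bytes are not leaving the primary fast enough.
           WHEN send_lag_bytes   > 1073741824 THEN 'shipping / network'
           -- Send gap is small but replay is far behind: standby cannot apply.
           WHEN replay_lag_bytes > 1073741824 THEN 'apply on standby'
           ELSE 'within threshold'
       END AS likely_bottleneck
FROM gaps
ORDER BY standby;
-- standby-a -> within threshold
-- standby-b -> shipping / network
```

Against a live system the VALUES list would be replaced by the per-standby gaps computed from pg_stat_replication.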
Sibling cards
| Card | Why pair it with WAL Lag Bytes | What the combination tells you |
|---|---|---|
| Replication Lag (seconds) | The time-based view of the same gap. | Bytes high but seconds low equals an idle primary with unshipped WAL; bytes and seconds both high equals a standby genuinely behind on live traffic. |
| Replication Lag Exceeds Threshold or Standby Unreachable | The alert-list escalation of this metric. | When WAL lag crosses threshold or a standby drops to state=BROKEN, this is where the page fires. |
| Active Streaming Replicas | The topology count. | If a standby disappears from the replica count, its WAL lag stops being reported, which can mask the problem rather than resolve it. |
| Failover Readiness | The promotion-readiness composite. | High WAL lag on every standby means no replica is safe to promote without data loss. |
| Database Disk Usage % | The disk pressure a reserved slot creates. | A reserved slot feeding a stuck standby grows pg_wal; watch disk as lag persists. |
| PostgreSQL Health Score | The executive composite that weights replication health. | Sustained WAL lag pulls the composite down even while latency and errors look fine. |
Reconciling against the source
Where to look in PostgreSQL directly: run SELECT application_name, pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes FROM pg_stat_replication; on the primary for the per-standby byte gap. On a standby, SELECT pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn(); shows received vs replayed positions locally. Check the configured replication slots with SELECT slot_name, active, restart_lsn, pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes FROM pg_replication_slots; to see how much WAL the primary is retaining for each consumer.
Why our number may legitimately differ from a raw query:
| Reason | Direction | Why |
|---|---|---|
| Sample timing | Marginal | The card samples on its refresh cycle; a query you run by hand a second later sees a different LSN, especially under heavy write load. |
| Send vs replay basis | Variable | The headline defaults to the send gap; a query selecting replay_lsn reports the larger replay gap. Compare like for like. |
| Worst-standby headline | Higher | The card shows the largest lag across all standbys; a query filtered to one standby shows only that standby’s gap. |
| Managed-service metric | Variable | On RDS / Aurora the console’s ReplicaLag CloudWatch metric is reported in seconds, not bytes, and is computed differently from pg_wal_lsn_diff. Treat it as a corroborating signal, not an exact match. |
On managed services: RDS and Aurora expose the ReplicaLag / AuroraReplicaLag metrics (time-based, not bytes) and the read-replica list in the RDS console; Cloud SQL surfaces replication/replica_lag in Cloud Monitoring; Azure Database for PostgreSQL exposes physical_replication_delay_in_bytes, which is the closest managed-service match to this card. Use the byte-based metric where the provider offers one.
Known limitations / FAQs
The seconds-of-lag card reads zero but WAL Lag Bytes shows 800 MB. Which is right?
Both are right; they measure different things. Seconds-of-lag is derived from the timestamp of the last replayed transaction. On an idle or low-write primary there are no fresh commits to timestamp, so the seconds value collapses to zero even though unshipped bytes exist. Bytes are the honest measure of how much data a failover would lose. When sizing recovery-point risk, trust the bytes.
My send gap is tiny but the replay gap is huge. What does that mean?
The standby is receiving WAL fine but cannot apply it fast enough. The usual cause is a hot-standby conflict: a long-running read query on the standby holds a snapshot that blocks WAL replay (PostgreSQL pauses apply rather than cancelling the query, unless max_standby_streaming_delay forces a cancellation). Check pg_stat_activity on the standby for long-running queries, and review max_standby_streaming_delay.
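A sketch of those standby-side checks, assuming PostgreSQL 10 or later. The 80-character truncation and the row limit are arbitrary choices for readability.

```sql
-- Run on the standby, not the primary.
-- Oldest open transactions first: these are the likeliest to hold back replay.
SELECT pid,
       usename,
       state,
       now() - xact_start AS xact_age,
       left(query, 80)    AS query_head
FROM pg_stat_activity
WHERE backend_type = 'client backend'
  AND xact_start IS NOT NULL
ORDER BY xact_start
LIMIT 10;

-- Cumulative counts of queries cancelled by recovery conflicts, per database.
SELECT datname, confl_snapshot, confl_lock, confl_bufferpin
FROM pg_stat_database_conflicts;

-- How long replay waits before cancelling conflicting queries (-1 waits forever).
SHOW max_standby_streaming_delay;
```

A rising confl_snapshot count alongside a large replay gap is consistent with the snapshot-conflict pattern described above.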
Why is the threshold 1 GB rather than a time?
Bytes are the basis the card actually measures, and a byte threshold behaves consistently regardless of write rate. One gigabyte is the point where typical wal_keep_size settings start to overrun and where a reserved replication slot begins materially growing the primary’s pg_wal directory. Tune it to your own wal_keep_size and disk headroom in the Sensitivity tab.
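Before tuning the threshold it helps to see what the primary is configured to retain. A minimal sketch; wal_keep_size and max_slot_wal_keep_size exist from PostgreSQL 13 onward, so older versions simply return fewer rows.

```sql
-- Run on the primary. For these settings the unit column is MB,
-- so a setting of 1024 corresponds to the card's 1 GB default.
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('wal_keep_size', 'max_slot_wal_keep_size', 'max_wal_size')
ORDER BY name;
```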
A standby vanished from the card entirely. Is lag zero now?
No, it is unknown, which is worse. pg_stat_replication only lists connected standbys, so a disconnected standby produces no row and no lag reading. The disappearance itself is the alarm. Cross-check with Active Streaming Replicas and the standby-unreachable alert.
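Where standbys connect through physical replication slots, the slot outlives the connection, which gives one way to spot a standby that has dropped out of pg_stat_replication. A sketch under that assumption; a standby configured without a slot leaves no row in either view.

```sql
-- Run on the primary.
-- Physical slots with no walsender currently attached to them.
SELECT s.slot_name, s.active, s.restart_lsn
FROM pg_replication_slots AS s
LEFT JOIN pg_stat_replication AS r ON r.pid = s.active_pid
WHERE s.slot_type = 'physical'
  AND r.pid IS NULL;
```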
Does a replication slot make this safe to ignore?
A reserved replication slot guarantees the primary will not recycle WAL the standby still needs, so the standby will eventually catch up rather than requiring a rebuild. But it shifts the risk: the retained WAL accumulates in the primary’s pg_wal directory and can fill the data disk if the standby never recovers. Watch disk usage whenever lag persists with a reserved slot.
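A sketch for watching the retained WAL directly, assuming PostgreSQL 13 or later for the wal_status column and a role with superuser or pg_monitor rights for pg_ls_waldir().

```sql
-- Per-slot retention state; wal_status is one of
-- 'reserved', 'extended', 'unreserved', or 'lost'.
SELECT slot_name,
       active,
       wal_status,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained
FROM pg_replication_slots;

-- Current footprint of the pg_wal directory on the primary.
SELECT count(*)                   AS wal_files,
       pg_size_pretty(sum(size))  AS pg_wal_size
FROM pg_ls_waldir();
```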
Can WAL lag be negative?
Briefly, a standby’s reported LSN can appear ahead of the value sampled from the primary because of sampling skew between the two reads. The engine clamps such transient negatives to zero; a persistent negative would indicate a clock or sampling fault and is treated as a no-read.
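The clamp can be sketched with GREATEST. The two LSNs are illustrative literals with the standby deliberately ahead of the sampled primary position; how the engine implements the clamp internally is not stated here.

```sql
-- Raw difference is negative because the second LSN is ahead of the first.
SELECT pg_wal_lsn_diff('3A/7F000000'::pg_lsn, '3A/7F2C8B40'::pg_lsn)
       AS raw_lag_bytes;        -- -2919232

-- Clamp transient negatives to zero.
SELECT GREATEST(
           pg_wal_lsn_diff('3A/7F000000'::pg_lsn, '3A/7F2C8B40'::pg_lsn),
           0
       ) AS clamped_lag_bytes;  -- 0
```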