At a glance
The state of the Galera cluster as the polled node sees it, read from the wsrep_cluster_status status variable. The healthy value is Primary, meaning the node is part of a quorum-holding majority and is allowed to accept writes. Any other value (Non-Primary or Disconnected) means the node has lost contact with a majority of the cluster and will refuse writes to protect data consistency. For a DBA this is the binary “can my application still write to the database?” verdict, and it is one of the most distinctive signals MariaDB Galera offers over standalone MySQL.
| Status variable | wsrep_cluster_status from SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status'. A string: Primary, Non-Primary, or Disconnected. |
| Metric basis | Galera Primary-Component membership verdict, NOT a connection test or ping. It reflects whether the node believes it is in the quorum-holding partition. |
| Aggregation window | Real-time, polled on the Nerve Centre refresh cycle. The value is instantaneous. |
| Healthy value | Primary. The node is in the majority partition and writes are accepted. |
| What it means | Primary = healthy quorum; Non-Primary = split-brain risk, node refuses writes; Disconnected = node cannot reach the group at all. |
| What does NOT change it | (1) High query latency; (2) a full disk (that is a separate failure mode); (3) async-replica lag; (4) router health. The variable is purely about Galera group membership. |
| Time window | RT (real-time, polled each refresh cycle) |
| Alert trigger | != Primary. Any value other than Primary is an immediate write-availability incident. |
| Roles | owner, engineering, operations |
Calculation
The card runs SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status' against the connected node and surfaces the string verbatim. There is no derivation; Galera sets this value itself as the group-communication layer evaluates quorum.
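The three possible values map to states as follows:
| Value | State | Writes? |
|---|---|---|
| Primary | Node is in the quorum-holding partition. | Yes |
| Non-Primary | Node has lost contact with a majority of the cluster. | No |
| Disconnected | Node cannot reach the Galera group at all. | No |
The card treats only Primary as healthy. This is deliberately strict: a Non-Primary node is not a degraded-but-usable state, it is a node that has stopped serving writes entirely. The card therefore reads as a clean binary for dashboards: green when Primary, red otherwise. It pairs naturally with Galera Cluster Size, which explains why a node has gone Non-Primary (membership fell below the quorum floor).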
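The strict green/red mapping described above can be sketched in a few lines. This is an illustrative sketch only, not the card's actual implementation; the names CardState, card_state, and should_alert are hypothetical.

```python
from enum import Enum


class CardState(Enum):
    """Hypothetical dashboard colours for the card."""
    GREEN = "green"
    RED = "red"


def card_state(wsrep_cluster_status: str) -> CardState:
    """Green only for an exact 'Primary'; every other value is red.

    The comparison is deliberately exact, so Non-Primary, Disconnected,
    an empty string, or any unexpected value all fall through to red.
    """
    if wsrep_cluster_status.strip() == "Primary":
        return CardState.GREEN
    return CardState.RED


def should_alert(wsrep_cluster_status: str) -> bool:
    """Mirror of the alert trigger in the table above: value != Primary."""
    return card_state(wsrep_cluster_status) is CardState.RED
```

Treating anything that is not exactly Primary as red means an unrecognised or missing value fails closed rather than showing a false green.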
Worked example
A platform team runs a 3-node MariaDB Galera cluster split across two availability zones: db-galera-01 and db-galera-02 in zone A, db-galera-03 in zone B. On 22 May 26 at 14:05 BST a network partition severs zone A from zone B.
| Node | Zone | Peers it can see | wsrep_cluster_status | Writes? |
|---|---|---|---|---|
| db-galera-01 | A | db-galera-02 (2 of 3) | Primary | Yes |
| db-galera-02 | A | db-galera-01 (2 of 3) | Primary | Yes |
| db-galera-03 | B | none (1 of 3) | Non-Primary | No |
- Galera is doing exactly the right thing. The 2-node majority in zone A retained quorum and stays writable. The lone node in zone B correctly went Non-Primary rather than accept conflicting writes. This is split-brain prevention working as designed, not a database bug.
- The application must route to the majority. If the load balancer or MaxScale is still sending writes to db-galera-03, those writes are now being rejected with WSREP has not yet prepared node for application use. The fix is routing, not the database: point writes at the zone-A Primary partition.
- There is no data loss, only a stalled minority. When the network heals, db-galera-03 rejoins the Primary Component, performs an IST to catch up on the writes it missed, and returns to Primary. The team should not force-bootstrap zone B as its own Primary; doing so would create a genuine split-brain with two divergent datasets.
Once the partition heals and db-galera-03 reads Primary again, the card returns to green and writes resume cluster-wide. The lesson the team should carry: a Non-Primary reading is a routing emergency, not a repair emergency; never force a minority node back to Primary to “fix” the card.
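The quorum arithmetic behind the table in this example can be sketched as a simple majority check. This is a simplification under stated assumptions: every node carries equal weight and the count is taken against the full cluster size, whereas Galera's own calculation is weighted and made relative to the last known Primary Component. The function names has_quorum and expected_status are hypothetical.

```python
def has_quorum(visible_members: int, cluster_size: int) -> bool:
    """True when a partition holds strictly more than half of the nodes.

    visible_members counts the node itself plus the peers it can still reach.
    Assumes equal node weights (a simplification of Galera's weighted quorum).
    """
    if not 0 < visible_members <= cluster_size:
        raise ValueError("visible_members must be between 1 and cluster_size")
    return 2 * visible_members > cluster_size


def expected_status(visible_members: int, cluster_size: int) -> str:
    """Status a reachable-but-partitioned node would be expected to report."""
    return "Primary" if has_quorum(visible_members, cluster_size) else "Non-Primary"


# The partition from the worked example: members visible to each node, out of 3.
partition = {"db-galera-01": 2, "db-galera-02": 2, "db-galera-03": 1}
for node, visible in partition.items():
    print(node, expected_status(visible, cluster_size=3))
# db-galera-01 Primary
# db-galera-02 Primary
# db-galera-03 Non-Primary
```

Note that an exact half does not count as a majority under this rule: in a 4-node cluster split 2 and 2, neither side would keep quorum.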
Sibling cards to reference together
| Card | Why pair it with Galera Cluster Status | What the combination tells you |
|---|---|---|
| Galera Cluster Size | Explains why a node went Non-Primary. | Size below quorum floor is the usual cause of a Non-Primary status. |
| Galera Cluster Not in Primary State or Node Lost | The alert-list card that fires on this exact condition. | A Non-Primary reading should always appear as a row in this feed. |
| Galera Flow Control Paused % | Precursor signal: a struggling node before it drops out. | Sustained flow control can precede a node leaving and a status flip. |
| Failover Readiness | Whether a standby can take over writes. | Non-Primary plus no healthy standby equals a hard write outage. |
| MariaDB Health Score | The composite that takes cluster state as a major input. | A Non-Primary node sharply drops the composite. |
| Connection Errors (24h) | Apps hitting a Non-Primary node log write rejections. | A status flip often co-occurs with a spike in connection/write errors. |
| Async Replication Lag (seconds) | Downstream async replicas read from the cluster. | A Non-Primary source can stall async replicas feeding from it. |
Reconciling against the source
Where to look in MariaDB’s own tooling: Run SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status'; on each node; this is the exact variable the card reads. Run SHOW GLOBAL STATUS LIKE 'wsrep_ready'; (ON/OFF) and LIKE 'wsrep_connected'; for the companion readiness flags. Check wsrep_local_state_comment for the human-readable node state (Synced, Donor/Desynced, Joining, Initialized). On a managed service, the provider console (for example SkySQL or your cloud MariaDB cluster view) shows the same Primary/Non-Primary topology.
Why our reading may legitimately differ between nodes:
| Reason | Direction | Why |
|---|---|---|
| Which node you query | Can differ during a partition | Each node reports its own view. During a split, majority nodes read Primary while the minority reads Non-Primary. Vortex IQ polls the configured endpoint. |
| Poll timing | Brief lag | A status flip between polls is not reflected until the next refresh cycle. |
| Just-started node | Transient Disconnected | A booting node reads Disconnected until it connects to the group, while wsrep_local_state_comment moves through Joining; this is normal startup, not a fault. |
| Router masking | None to value | MaxScale may stop routing to a Non-Primary node, but the backend node still reports its true status. |
| Source | Expected relationship | What causes divergence |
|---|---|---|
| wsrep_ready | Should be ON whenever status is Primary. | If status is Primary but wsrep_ready is OFF, the node is in a transitional state (for example desynced as a donor) and is not serving normally. |
| Provider console topology | Should agree on which partition is Primary. | A console may lag the live group-comms decision by a few seconds during a fast partition. |
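The cross-check between wsrep_cluster_status and wsrep_ready can be scripted against each node. The sketch below assumes a DB-API 2.0 cursor from whichever MariaDB/MySQL driver you already use, and assumes that cursor returns rows as (Variable_name, Value) tuples; the function names read_wsrep_status and reconcile, and the finding messages, are hypothetical.

```python
from __future__ import annotations

# Status variables named in this section.
WSREP_VARIABLES = (
    "wsrep_cluster_status",
    "wsrep_ready",
    "wsrep_connected",
    "wsrep_local_state_comment",
)


def read_wsrep_status(cursor) -> dict[str, str]:
    """Fetch the wsrep status variables from one node as a name -> value dict.

    `cursor` is assumed to be a DB-API 2.0 cursor that yields plain tuples.
    """
    names = ", ".join(f"'{name}'" for name in WSREP_VARIABLES)
    cursor.execute(f"SHOW GLOBAL STATUS WHERE Variable_name IN ({names})")
    return {name: value for name, value in cursor.fetchall()}


def reconcile(status: dict[str, str]) -> list[str]:
    """Return human-readable findings; an empty list means nothing to flag."""
    cluster = status.get("wsrep_cluster_status")
    if cluster is None:
        return ["wsrep_cluster_status not reported by this node"]
    if cluster != "Primary":
        return [f"node reports {cluster}: not writable"]
    if status.get("wsrep_ready") != "ON":
        # Primary but not ready: the transitional case from the table above.
        return ["Primary but wsrep_ready is not ON: transitional state"]
    return []
```

Running this against every node, rather than only the configured endpoint, is what exposes a partition: majority nodes return no findings while the minority returns a not-writable finding.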
Known limitations / FAQs
My node says Non-Primary but the database process is running fine. Is it broken?
The process is healthy; the node has simply lost quorum and is refusing writes on purpose to prevent split-brain. This is Galera protecting your data, not a crash. The fix is to restore connectivity so the node rejoins a majority, or to route your application at the majority partition. Never confuse “process up” with “writable”.
Reads still work on a Non-Primary node, why?
By default a Non-Primary node rejects both reads and writes (it returns WSREP has not yet prepared node). If reads appear to work, you likely have wsrep_dirty_reads=ON set, which permits stale reads from a node that has fallen out of the cluster. That is acceptable for some reporting use cases but dangerous for anything that then writes back; understand the trade-off before relying on it.
Can I force a Non-Primary node back to Primary?
You can, with SET GLOBAL wsrep_provider_options='pc.bootstrap=YES', but you almost never should. Forcing a minority node to bootstrap creates a second independent Primary with a divergent write history, which is the genuine split-brain catastrophe Galera was preventing. Only bootstrap deliberately when you have confirmed all other nodes are truly dead and you are intentionally recovering from the most-advanced survivor.
What is the difference between Non-Primary and Disconnected?
Non-Primary means the node can talk to some peers but they do not form a majority. Disconnected means the node cannot reach the Galera group at all (every peer is unreachable, or the node has just started and not yet connected). Both are write-unavailable; Disconnected usually points at a network/firewall problem on the Galera ports, while Non-Primary points at a quorum split.
How fast does the card detect a flip?
As fast as the poll cycle. Galera itself decides quorum within its group-communication timeout (sub-second to a few seconds), and the card surfaces the new value on the next Nerve Centre refresh. For the strictest real-time signal, pair this card with the alert-list card, which is designed to page on the transition.
Does this card apply to a standalone MariaDB server?
No. wsrep_cluster_status only exists when the Galera (wsrep) provider is loaded. A standalone server has no concept of cluster status; for single-server write availability rely on uptime, disk, and connection-error cards instead.
During a rolling upgrade one node briefly shows Joining, not Primary. Should I worry?
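No. Joining is a node state reported by wsrep_local_state_comment rather than a value of wsrep_cluster_status. A rejoining node starts as Disconnected, then sits in Joining (during IST/SST) before it is a synced member of the Primary Component. That sequence is the expected recovery path. Only worry if a node stays stuck in Joining for an unusually long time, which usually signals a slow or failing SST.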