Some lesser-known quorum options
Proxmox do not really cater for cluster deployments at a small scale of 2-4 nodes and their out-of-the-box configuration always assumes that High Availability could be put to use. It is very likely for this reason that some great features of Corosync configuration 1 are left out of the official documentation entirely.
Tip
You might want to read more on how Proxmox utilise Corosync in a separate post.
Quorum provider service
Proxmox need a quorum provider service, votequorum
2 , to prevent data corruption in situations where a cluster splits into two or more partitions and a member of one partition would modify shared data unchecked by the members it can no longer see, i.e. those in a detached partition. This is signified by the always-populated corosync.conf
section:
quorum {
  provider: corosync_votequorum
}
Other key: value
pairs could be specified here. One of the notable values of importance is expected_votes
, which in a standard PVE deployment is not explicit:
votequorum requires an expected_votes value to function, this can be provided in two ways. The number of expected votes will be automatically calculated when the nodelist { } section is present in corosync.conf or expected_votes can be specified in the quorum { } section.
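For instance, on a hypothetical 4-node cluster without a nodelist { } section, the expectation could be stated explicitly instead, a minimal sketch:

quorum {
  provider: corosync_votequorum
  expected_votes: 4
}

Standard PVE deployments do ship a nodelist { } section listing every node, so in practice the value gets calculated automatically instead.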
The quorum value is then calculated as a majority of the sum of nodelist { node { quorum_votes: } }
values. You can see the live calculated value on any node: 3
corosync-quorumtool
---8<---
Votequorum information
----------------------
Expected votes: 4
Highest expected: 4
Total votes: 4
Quorum: 3
Flags: Quorate
---8<---
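The output above corresponds to a 4-node cluster in which every node carries a single vote, as defined in the nodelist, an illustrative excerpt with hypothetical node names and addresses:

nodelist {
  node {
    name: node1
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 10.0.0.1
  }
  # three more node { } entries follow, each with quorum_votes: 1
}

With 4 votes in total, the majority, and therefore the quorum, works out to 3.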
Tip
The Proxmox-specific tooling 4 makes use of this output as well with pvecm status
. It is also this value that you are temporarily changing with pvecm expected
, which actually makes use of corosync-quorumtool -e
.
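For example, if a 4-node cluster is temporarily reduced to 2 nodes for maintenance, the expectation could be lowered to match the nodes actually present, a sketch of the equivalent invocations:

pvecm expected 2
# which, under the hood, amounts to:
corosync-quorumtool -e 2

Needless to say, lowering the expected votes weakens the very protection described above, so it should only ever be a temporary measure.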
The options
These can be added to the quorum {}
section:
The two-node cluster
The option two_node: 1
is meant for clusters made up of 2 nodes. It causes each node to assume it is quorate, even on its own, once it has successfully booted up and seen the other node at least once. This has quite some merit: a node that disappears can be considered to have gone down, and it is therefore safe for the remaining one to continue operating on its own. If you run this simple cluster setup, your remaining node does not have to lose quorum when the other one is down.
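A minimal sketch of such a configuration, assuming two nodes each carrying a single vote:

quorum {
  provider: corosync_votequorum
  two_node: 1
}

Note that, per the votequorum documentation, enabling two_node automatically enables wait_for_all, which is exactly what enforces the "must have seen the other node at least once" behaviour.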
Auto tie-breaker
The option auto_tie_breaker: 1
(ATB) allows one of two equally sized partitions to retain quorum deterministically. For example, a 4-node cluster split into two 2-node partitions would ordinarily leave neither side quorate, but ATB allows one of them to be picked as quorate, by default the one containing the lowest nodeid
. This can be tweaked with the tunable auto_tie_breaker_node: lowest|highest|<list of node IDs>
.
This could also be your go-to option if you are running a 2-node cluster with one of the nodes in a “master” role and the other one powered off most of the time.
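A sketch of an ATB setup that, hypothetically, always prefers the node with ID 1, e.g. the “master” of the two:

quorum {
  provider: corosync_votequorum
  auto_tie_breaker: 1
  auto_tie_breaker_node: 1
}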
Last man standing
The option last_man_standing: 1
(LMS) allows the cluster to dynamically adapt to scenarios where nodes go down for prolonged periods by recalculating the expected_votes
value. In a 10-node cluster where e.g. 3 nodes have not been seen for longer than a specified period (by default 10 seconds, tunable via last_man_standing_window
in milliseconds), the new expected_votes
value becomes 7. This can cascade down to as few as 2 nodes remaining quorate. If you also enable ATB, it could go all the way down to a single node.
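A sketch combining LMS with ATB and a longer, hypothetical 30-second window:

quorum {
  provider: corosync_votequorum
  last_man_standing: 1
  last_man_standing_window: 30000
  auto_tie_breaker: 1
}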
Warning
This option should not be used in HA clusters as implemented by Proxmox.
Notes
If you are also looking for how to safely disable HA on a Proxmox cluster, there is a guide.