What disk layout should I use for my ZFS HA pool and how does this impact reservations and heartbeat drives?


For ZFS file systems there are essentially two main approaches in use, RAID Z2 and a mirrored stripe (RAID 10). To give a brief overview of these two schemes, let's see how they lay out when we have six drives of 1TB each (note that within any pool, any drives used for reservations or to heartbeat through are still usable for data, i.e. NO dedicated drives are required; the cluster software happily co-exists with ZFS pools).

RAID Z2

RAID Z2 uses two parity drives and at least two data drives, so the minimum number of drives is four. With six drives this equates to the following layout, with roughly 4TB of usable space:


    D4 | D3 | D2 | D1 | P2 | P1      (four data drives, two parity drives)

With this configuration up to two drives (parity or data) can be lost while pool integrity is still maintained; any further drive loss will result in the pool becoming faulted (essentially unreadable/unimportable).
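As an illustration of how a pool with this layout might be created from the command line (the pool name "tank" and the device names disk0 to disk5 are placeholders, not anything mandated by the cluster software):

    # six drives in a single RAID-Z2 vdev; names below are placeholders
    zpool create tank raidz2 disk0 disk1 disk2 disk3 disk4 disk5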

In order to place reservations on this drive layout it is necessary to reserve three drives (say P1, P2 and D1); with only three of the six drives left unreserved, and the pool needing at least four drives to import, no other node will be able to successfully import the pool as there are not enough unreserved drives to read valid data from.

With reservations in place on drives P1, P2 and D1, this leaves drives D2, D3 and D4 free to use for disk heartbeats. The RSF-1 cluster software is aware of the on-disk ZFS structure and is able to heartbeat through these drives without affecting pool integrity.
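As background, drive reservations of this kind are commonly implemented using SCSI persistent reservations; assuming that mechanism (an assumption made here for illustration rather than a description of RSF-1 internals), the reservation state of a drive can be inspected with the sg_persist utility from sg3_utils:

    # device path is a placeholder; run on a node with sg3_utils installed
    sg_persist --in --read-reservation /dev/sda
    sg_persist --in --read-keys /dev/sda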

RAID 10

RAID 10 is a combination of mirroring and striping; firstly mirrored vdevs are created (RAID 1) and these are then striped together (RAID 0). With six drives we have a choice of mirror layout depending on the amount of redundancy desired. These two schemes can be visualised as follows, firstly two three-way mirrors striped together:


    vdev1 (three-way mirror): D0  D1  D2
    vdev2 (three-way mirror): D3  D4  D5

In this example two mirrors have been created (D0/D1/D2 and D3/D4/D5), giving a total capacity of 2TB. This layout allows a maximum of two drives to fail in any single vdev (for example D0 and D2 in vdev1, or D0 in vdev1 and D3 in vdev2, etc.); the pool can survive up to four drive failures as long as at least one drive remains in each of vdev1 and vdev2, but if all of the drives on one side of the stripe fail (for example D3, D4 and D5) then the pool would fault.
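A sketch of how this layout might be created (pool and device names are again placeholders):

    # two three-way mirror vdevs striped together
    zpool create tank mirror disk0 disk1 disk2 mirror disk3 disk4 disk5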

The reservations for this layout would be placed on all drives in either vdev1 or vdev2, leaving three drives free for heartbeats.

Alternatively the drives could be laid out as three two-way mirrors striped together:


    vdev1 (two-way mirror): D0  D1
    vdev2 (two-way mirror): D2  D3
    vdev3 (two-way mirror): D4  D5

In this example three mirrors have been created (D0/D1, D2/D3 and D4/D5), giving a total capacity of 3TB, with a maximum of one drive failure tolerated in any single vdev. Reservations will be placed on either vdev1, vdev2 or vdev3, leaving four drives available for heartbeating.
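A sketch of how this layout might be created (pool and device names are again placeholders):

    # three two-way mirror vdevs striped together
    zpool create tank mirror disk0 disk1 mirror disk2 disk3 mirror disk4 disk5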

In all of the above scenarios it is NOT necessary to configure reservations or heartbeats by hand; when a pool is added to a cluster, the cluster software will interrogate the pool structure and automatically work out the number of drives it needs to reserve, with any remaining drives utilised for heartbeats. Note that for each clustered pool a maximum of only two heartbeat drives are configured (any more is overkill).
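The pool structure the cluster software interrogates is the same vdev layout visible from the command line; for example (pool name assumed):

    # lists each vdev (raidz2 or mirror) and its member drives
    zpool status tank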

 
