ID #1070

What are the prop_zpool_fail_mode properties for?

The failmode property of a zpool controls how the pool handles I/O after it has gone into a 'faulted' state. There are 3 options:

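  1. wait - all I/O from clients will hang until device connectivity is restored and the errors are cleared (this is the ZFS default)
  2. continue - new write requests fail with an I/O error, while reads from any remaining healthy devices are still allowed
  3. panic - as soon as the pool goes faulted, ZFS triggers a kernel panic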

For RSF-1 clusters, panic should be used rather than the ZFS default of wait. With panic, if a pool goes faulted because of a faulty controller card, broken fibre cable, etc., the active node panics and the service can automatically fail over to the other node.

RSF-1 is configured by default to change the failmode of all zpools to 'panic' each time a service starts. If this behaviour is not wanted for any reason, it can be changed by altering RSF database properties:

The property prop_zpool_fail_mode controls the failmode on a cluster-wide basis. If one pool needs a different failmode, create a new property named prop_zpool_fail_mode_<pool>. Note that <pool> is a pool name, not a service name; if more than one pool is attached to a service, a separate property can be created for each pool.

To modify a property (e.g. to change the global setting to wait), run:

rsfcdb update prop_zpool_fail_mode wait

To add a new pool-specific property (e.g. to set the failmode of the pool tank to continue), run:

rsfcdb create prop_zpool_fail_mode_tank continue

Possible values for the global setting are 'wait', 'continue', 'panic' and 'none'. 'none' means RSF will not set the failmode of pools on import, so they will retain the failmode setting that they already had.

Possible values for the pool-specific settings are 'wait', 'continue', 'panic', 'none' and 'default'. 'default' effectively disables the pool-specific setting, so that pool uses the global value of prop_zpool_fail_mode. 'none' means RSF will not set the failmode of this pool at all.


For example, if there are 5 pools in a cluster (pool1, pool2, pool3, pool4 and pool5), the properties:

prop_zpool_fail_mode       : panic
prop_zpool_fail_mode_pool1 : wait
prop_zpool_fail_mode_pool2 : default
prop_zpool_fail_mode_pool3 : none
prop_zpool_fail_mode_pool4 : continue

mean that the following failmodes are applied:

pool1 - wait
pool2 - panic (default, so the global value is used)
pool3 - no failmode setting used (keeps its original setting)
pool4 - continue
pool5 - panic (no pool-specific property, so the global value is used)
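
The precedence rules above can be sketched as a small shell function. This is only an illustration of the lookup order described in this article, not RSF-1's actual implementation; the function name effective_failmode and the output word "unchanged" are hypothetical and are not part of RSF-1 or rsfcdb.

```shell
# Hypothetical helper, not part of RSF-1: shows which failmode would be
# applied to a pool, given the global property value and the (optional)
# pool-specific property value.
# Usage: effective_failmode GLOBAL_VALUE [POOL_VALUE]
effective_failmode() {
    global_value=$1
    # A pool with no pool-specific property behaves like 'default'.
    pool_value=${2:-default}

    # 'default' defers to the global prop_zpool_fail_mode value.
    if [ "$pool_value" = "default" ]; then
        effective=$global_value
    else
        effective=$pool_value
    fi

    case $effective in
        wait|continue|panic) echo "$effective" ;;
        # 'none' means the failmode is not set, so the pool keeps its own.
        none) echo "unchanged" ;;
        *) echo "invalid value: $effective" >&2; return 1 ;;
    esac
}

# The five pools from the example above (global value: panic).
effective_failmode panic wait       # pool1: wait
effective_failmode panic default    # pool2: panic
effective_failmode panic none       # pool3: unchanged
effective_failmode panic continue   # pool4: continue
effective_failmode panic            # pool5: panic
```

The function takes the values as arguments rather than reading them from the RSF database, so it can be run anywhere without a cluster.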

Last update: 2014-01-29 15:49
Author: Matt
Revision: 1.1

