
General FAQs - Linux

Multiple kernel partition messages appearing in syslog from the Udev sub-system

By default, the Udev daemon (systemd-udevd) communicates with the kernel and receives device uevents directly from it each time a device is added or removed, or changes its state.

Because of the way RSF-1 writes its heartbeats into the ZFS label, the udev sub-system sees each heartbeat as a device state change, and spurious messages are written to syslog each time a heartbeat is transmitted. This results in multiple messages appearing in syslog of the form:

Aug 10 17:22:24 nodea kernel: [2422456.906302]  sdf: sdf1 sdf9
Aug 10 17:22:24 nodea kernel: [2422456.013538]  sdg: sdg1 sdg9
Aug 10 17:22:25 nodea kernel: [2422458.418906]  sdf: sdf1 sdf9
Aug 10 17:22:25 nodea kernel: [2422458.473936]  sdg: sdg1 sdg9
Aug 10 17:22:25 nodea kernel: [2422459.427251]  sdf: sdf1 sdf9
Aug 10 17:22:25 nodea kernel: [2422459.487747]  sdg: sdg1 sdg9
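
To confirm that these messages coincide with udev activity, the event stream can be watched directly with udevadm; with heartbeats active, a steady flow of change events for the affected sd* devices would be expected:

# udevadm monitor --udev --subsystem-match=block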

The underlying reason is that udev watches block devices by binding to the IN_CLOSE_WRITE event from inotify; each time a writable file descriptor for the device is closed, this event fires and a rescan of the device is triggered.
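
This behaviour can be reproduced independently of udev using the inotifywait tool from the inotify-tools package (assuming it is installed); each heartbeat write to the device node should show up as a CLOSE_WRITE event:

# inotifywait -m -e close_write /dev/sdf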

Furthermore, newer versions of the ZFS Event Daemon (ZED) listen for udev events (to manage disk insertion/removal, etc.) and so catch the udev events generated by the disk heartbeats; ZED then attempts to determine which pool (if any) the disk belongs to, resulting in unnecessary I/O.
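
On most Linux distributions ZED runs as the zfs-zed service (the exact unit name may vary by distribution); checking whether it is active indicates whether this extra I/O applies:

# systemctl status zfs-zed.service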

The solution is to add a udev rule that overrides this default behaviour and disables monitoring of the sd* block devices. Add the following to the udev rules file /etc/udev/rules.d/50-rsf.rules [1]:

ACTION!="remove", KERNEL=="sd*", OPTIONS:="nowatch"

Finally, reload the udev rules to activate the fix.
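
A typical way to do this is with udevadm, re-triggering the block subsystem so the new rule is applied to existing devices:

# udevadm control --reload-rules
# udevadm trigger --subsystem-match=block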

Thanks to Hervé BRY of Geneanet for this submission.


REST service fails to start due to port conflict

The RSF-1 REST service (rsf-rest) uses port 4330 by default. If this port is in use by another service (for example, pmlogger sometimes attempts to bind to port 4330), then the RSF-1 REST service will fail to start.
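
To identify the process currently holding the port, the listening sockets can be inspected (ss is assumed to be available; lsof -i :4330 is an alternative):

# ss -tlnp | grep 4330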

To check the service status, run systemctl status rsf-rest.service and inspect the resulting output for errors; here is an example where port 4330 is already in use:

# systemctl status rsf-rest.service

● rsf-rest.service - RSF-1 REST API Service
   Loaded: loaded (/usr/lib/systemd/system/rsf-rest.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2022-07-14 08:23:00 EDT; 5s ago
  Process: 4271 ExecStart=/opt/HAC/RSF-1/bin/python /opt/HAC/RSF-1/lib/python/rest_api_app.pyc >/dev/null (code=exited, status=1/FAILURE)
 Main PID: 4271 (code=exited, status=1/FAILURE)

Jul 14 08:23:00 mgc81 python[4271]:     return future.result()
Jul 14 08:23:00 mgc81 python[4271]:   File "/opt/HAC/Python/lib/python3.9/site-packages/aiohttp/web.py", line 413, in _run_app
Jul 14 08:23:00 mgc81 python[4271]:     await site.start()
Jul 14 08:23:00 mgc81 python[4271]:   File "/opt/HAC/Python/lib/python3.9/site-packages/aiohttp/web_runner.py", line 121, in start
Jul 14 08:23:00 mgc81 python[4271]:     self._server = await loop.create_server(
Jul 14 08:23:00 mgc81 python[4271]:   File "/opt/HAC/Python/lib/python3.9/asyncio/base_events.py", line 1506, in create_server
Jul 14 08:23:00 mgc81 python[4271]:     raise OSError(err.errno, 'error while attempting '
Jul 14 08:23:00 mgc81 python[4271]: OSError: [Errno 98] error while attempting to bind on address ('0.0.0.0', 4330): address already in use
Jul 14 08:23:00 mgc81 systemd[1]: rsf-rest.service: Main process exited, code=exited, status=1/FAILURE
Jul 14 08:23:00 mgc81 systemd[1]: rsf-rest.service: Failed with result 'exit-code'.

The simplest way to resolve this is to change the port the RSF-1 REST service listens on. To do this, run the following commands on each node in the cluster (in this example the port is changed to 4335):

# /opt/HAC/RSF-1/bin/rsfcdb update privPort 4335
# systemctl restart rsf-rest.service

The RSF-1 REST service will now restart and listen on the new port. A status check should now show the service as active and running:

# systemctl status rsf-rest.service
● rsf-rest.service - RSF-1 REST API Service
   Loaded: loaded (/usr/lib/systemd/system/rsf-rest.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2022-07-14 09:22:57 EDT; 2s ago
 Main PID: 52579 (python)
    Tasks: 1 (limit: 49446)
   Memory: 31.8M
   CGroup: /system.slice/rsf-rest.service
           └─52579 /opt/HAC/RSF-1/bin/python /opt/HAC/RSF-1/lib/python/rest_api_app.pyc >/dev/null

Jul 14 09:22:57 mgc81 systemd[1]: Started RSF-1 REST API Service.

This can be confirmed by navigating to the Webapp via the new port: https://<ip of node>:4335
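
Alternatively, the new port can be probed from the command line; here -k skips certificate validation, which may be needed if the cluster uses a self-signed certificate:

# curl -k https://<ip of node>:4335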


  1. Udev rules are defined in files with the .rules extension. There are two main locations in which these files can be placed: /usr/lib/udev/rules.d is used for system-installed rules, while /etc/udev/rules.d/ is reserved for custom rules. In this example we've used the name 50-rsf.rules, but any suitable file name can be used.