General FAQs - Linux
Multiple kernel partition messages appearing in syslog from the Udev sub-system
By default, the Udev daemon (systemd-udevd) communicates with the kernel and receives device uevents directly from it each time a device is removed or added, or a device changes its state.
Because RSF-1 writes its disk heartbeats using the ZFS label, the udev sub-system sees each write as a state change and erroneously updates syslog each time a heartbeat is transmitted. This can result in multiple messages appearing in syslog of the form:
Aug 10 17:22:24 nodea kernel: [2422456.906302] sdf: sdf1 sdf9
Aug 10 17:22:24 nodea kernel: [2422456.013538] sdg: sdg1 sdg9
Aug 10 17:22:25 nodea kernel: [2422458.418906] sdf: sdf1 sdf9
Aug 10 17:22:25 nodea kernel: [2422458.473936] sdg: sdg1 sdg9
Aug 10 17:22:25 nodea kernel: [2422459.427251] sdf: sdf1 sdf9
Aug 10 17:22:25 nodea kernel: [2422459.487747] sdg: sdg1 sdg9
This happens because udev watches block devices by binding to the IN_CLOSE_WRITE event from inotify, and each time it receives this event a rescan of the device is triggered.
Furthermore, newer versions of the ZFS Event Daemon listen to udev events (to manage disk insertion, removal and so on). The daemon catches the udev events generated by the disk heartbeats and then attempts to find which pool (if any) the disk belongs to, resulting in unnecessary I/O.
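To gauge how often these rescans occur on a node, the partition messages can be counted per device. The following is a minimal sketch, not part of RSF-1: the function name count_rescans is hypothetical, and it assumes the messages have the syslog form shown above and are read from standard input.

```shell
# Count kernel partition-rescan lines per device, for example
#   "Aug 10 17:22:24 nodea kernel: [2422456.906302] sdf: sdf1 sdf9"
# Reads syslog-formatted text on stdin and prints "<count> <device>" lines.
count_rescans() {
    grep -E 'kernel: \[ *[0-9]+\.[0-9]+\] +sd[a-z]+: sd[a-z]+[0-9]' \
        | sed -E 's/.*\] +(sd[a-z]+):.*/\1/' \
        | sort | uniq -c
}

# Example (the log path is an assumption and varies by distribution):
#   count_rescans < /var/log/syslog
```

A count that keeps climbing for the heartbeat disks while the cluster is otherwise idle suggests the udev watch is still active on those devices.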
The solution to this is to add a udev rule that overrides this default behaviour and disables monitoring of the sd* block devices. Add the following to the udev rules file /etc/udev/rules.d/50-rsf.rules [1]:
ACTION!="remove", KERNEL=="sd*", OPTIONS:="nowatch"
Finally, reload the udev rules to activate the fix.
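The two steps above (creating the rules file and reloading udev) could be scripted roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the commands must be run as root, and the udevadm trigger step is an assumption that the new option should also be applied to disks that are already present.

```shell
# Location of the rules file; override RULES_FILE to use a different name.
RULES_FILE="${RULES_FILE:-/etc/udev/rules.d/50-rsf.rules}"

# Write the nowatch rule from the text above to the given file.
write_rsf_rule() {
    printf '%s\n' 'ACTION!="remove", KERNEL=="sd*", OPTIONS:="nowatch"' > "$1"
}

# Ask udevd to re-read its rules, then replay change events for block
# devices so existing disks are processed with the new rule (assumption).
reload_udev_rules() {
    udevadm control --reload-rules
    udevadm trigger --subsystem-match=block --action=change
}

# As root:
#   write_rsf_rule "$RULES_FILE" && reload_udev_rules
```

Restricting the trigger to the block subsystem is a design choice to avoid replaying events for unrelated devices; a plain udevadm trigger would also work but touches every device.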
Thanks to Hervé BRY of Geneanet for this submission.
REST service fails to start due to port conflict
The RSF-1 REST service (rsf-rest) uses port 4330 by default. If this port is in use by another service (for example, pmlogger sometimes attempts to bind to port 4330), the RSF-1 REST service will fail to start.
To check the service status, run the command systemctl status rsf-rest.service and check the resulting output for any errors. Here is an example where port 4330 is already in use:
# systemctl status rsf-rest.service
● rsf-rest.service - RSF-1 REST API Service
Loaded: loaded (/usr/lib/systemd/system/rsf-rest.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Thu 2022-07-14 08:23:00 EDT; 5s ago
Process: 4271 ExecStart=/opt/HAC/RSF-1/bin/python /opt/HAC/RSF-1/lib/python/rest_api_app.pyc >/dev/null (code=exited, status=1/FAILURE)
Main PID: 4271 (code=exited, status=1/FAILURE)
Jul 14 08:23:00 mgc81 python[4271]: return future.result()
Jul 14 08:23:00 mgc81 python[4271]: File "/opt/HAC/Python/lib/python3.9/site-packages/aiohttp/web.py", line 413, in _run_app
Jul 14 08:23:00 mgc81 python[4271]: await site.start()
Jul 14 08:23:00 mgc81 python[4271]: File "/opt/HAC/Python/lib/python3.9/site-packages/aiohttp/web_runner.py", line 121, in start
Jul 14 08:23:00 mgc81 python[4271]: self._server = await loop.create_server(
Jul 14 08:23:00 mgc81 python[4271]: File "/opt/HAC/Python/lib/python3.9/asyncio/base_events.py", line 1506, in create_server
Jul 14 08:23:00 mgc81 python[4271]: raise OSError(err.errno, 'error while attempting '
Jul 14 08:23:00 mgc81 python[4271]: OSError: [Errno 98] error while attempting to bind on address ('0.0.0.0', 4330): address already in use
Jul 14 08:23:00 mgc81 systemd[1]: rsf-rest.service: Main process exited, code=exited, status=1/FAILURE
Jul 14 08:23:00 mgc81 systemd[1]: rsf-rest.service: Failed with result 'exit-code'.
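Before changing the port, it can help to confirm which process is holding port 4330. The sketch below filters the output of ss -ltnp for a given port; the helper name listeners_on_port is hypothetical, and it assumes the usual column layout where the fourth field is the local address and port.

```shell
# Print listening TCP sockets bound to the given port.
# Reads `ss -ltnp` output on stdin; field 4 is "Local Address:Port".
listeners_on_port() {
    awk -v port="$1" 'NR > 1 && $4 ~ (":" port "$")'
}

# As root (so that process names are shown):
#   ss -ltnp | listeners_on_port 4330
```

If the last column names a process such as pmlogger, that service owns the port and the REST service cannot bind to it.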
The simplest way to resolve this is to change the port the RSF-1 REST service listens on. To do this, run the following commands on each node in the cluster (in this example the port is changed to 4335):
# /opt/HAC/RSF-1/bin/rsfcdb update privPort 4335
# systemctl restart rsf-rest.service
The RSF-1 REST service will now restart and listen on the new port. A status check should now show the service as active and running:
# systemctl status rsf-rest.service
● rsf-rest.service - RSF-1 REST API Service
Loaded: loaded (/usr/lib/systemd/system/rsf-rest.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2022-07-14 09:22:57 EDT; 2s ago
Main PID: 52579 (python)
Tasks: 1 (limit: 49446)
Memory: 31.8M
CGroup: /system.slice/rsf-rest.service
└─52579 /opt/HAC/RSF-1/bin/python /opt/HAC/RSF-1/lib/python/rest_api_app.pyc >/dev/null
Jul 14 09:22:57 mgc81 systemd[1]: Started RSF-1 REST API Service.
This can be confirmed by navigating to the webapp on the new port: https://<ip of node>:4335
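The same check can be made from a shell without a browser. This is a minimal sketch: check_webapp is a hypothetical helper, and the -k option assumes the node presents a self-signed certificate that curl would otherwise reject.

```shell
# Succeed if something answers HTTPS on the given host and port.
# -k skips certificate verification (assumes a self-signed certificate).
check_webapp() {
    curl -k --silent --show-error --max-time 5 --output /dev/null "https://$1:$2/"
}

# Example, substituting the address of a cluster node:
#   check_webapp <ip of node> 4335 && echo "REST service reachable"
```

A non-zero exit status means no HTTPS response was received within five seconds, which usually points to the service not listening on that port or to a firewall in between.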
[1] Udev rules are defined in files with the .rules extension. There are two main locations in which those files can be placed: /usr/lib/udev/rules.d is the directory used for system-installed rules; /etc/udev/rules.d/ is reserved for custom rules. In this example we've used the name 50-rsf.rules, but any suitable file name can be used.