When dealing with network-based filesystems like GlusterFS exports mounted via NFS, we often face a race condition between service startup and mount attempts. The systemd dependency chain (Requires= + After=) only ensures that the service's process has started, not that the service is fully operational.
Your existing unit definitions show proper structural dependencies:
# glusterfsd.service
[Unit]
After=network.target glusterd.service
# remote-fs.target
[Unit]
Requires=glusterfsd.service
After=glusterfsd.service remote-fs-pre.target
systemd marks a service as "active" once its start-up command has run, but the Gluster NFS server needs additional time before it answers export requests. We therefore need an explicit readiness check.
Create a helper service that actively verifies NFS availability:
# /etc/systemd/system/gluster-nfs-ready.service
[Unit]
Description=GlusterFS NFS Readiness Check
After=glusterfsd.service
[Service]
Type=oneshot
ExecStart=/usr/bin/bash -c 'until showmount -e localhost &>/dev/null; do sleep 1; done'
TimeoutSec=300
[Install]
WantedBy=remote-fs.target
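Then extend remote-fs.target with a drop-in so the target waits for the check. This orders only the target itself; a mount unit that must wait for the exports needs its own After=gluster-nfs-ready.service: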
# /etc/systemd/system/remote-fs.target.d/10-gluster-wait.conf
[Unit]
Requires=gluster-nfs-ready.service
After=gluster-nfs-ready.service
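The until loop in gluster-nfs-ready.service has no upper bound of its own and relies on TimeoutSec= to stop it. As a minimal sketch, the same polling can be bounded inside the shell. The function name wait_until and its argument order are illustrative assumptions, not part of any existing tool.

```shell
# Hypothetical helper: retry a command until it succeeds or the attempts run out.
# Usage: wait_until MAX_ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
wait_until() {
    local max_attempts="$1" interval="$2" attempt=1
    shift 2
    while [ "$attempt" -le "$max_attempts" ]; do
        # Only the exit status matters, so the command's output is discarded.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        # No need to sleep after the final failed attempt.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$interval"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

A wrapper script could call it as wait_until 300 1 showmount -e localhost, which roughly mirrors the 300-second limit above while returning a clear failure status of its own.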
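For mount units specifically, the NFS retry= mount option makes mount.nfs itself keep retrying a failed mount. This is a feature of the NFS mount helper (see nfs(5)), not of systemd. Note that retry= is given in minutes and timeo= in tenths of a second: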
# /etc/systemd/system/stor.mount
[Unit]
After=glusterfsd.service
ConditionPathExists=/stor
[Mount]
What=node04:/stor
Where=/stor
Type=nfs
Options=retry=10,timeo=30
[Install]
WantedBy=remote-fs.target
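The units of these options are easy to misread: per nfs(5), retry= is counted in minutes and timeo= in tenths of a second. The sketch below converts an Options= string into seconds; nfs_opt_value and describe_nfs_timing are hypothetical helper names introduced only for this illustration.

```shell
# Print the value of option NAME from a comma-separated mount option string.
# Returns 1 if the option is absent or has no value.
nfs_opt_value() {
    printf '%s\n' "$1" | tr ',' '\n' | awk -F= -v name="$2" '
        $1 == name && NF == 2 && !found { print $2; found = 1 }
        END { exit !found }'
}

# Report retry= (minutes) and timeo= (tenths of a second) in seconds.
describe_nfs_timing() {
    local opts="$1" retry timeo
    if retry=$(nfs_opt_value "$opts" retry); then
        echo "retry window: $((retry * 60))s"
    fi
    if timeo=$(nfs_opt_value "$opts" timeo); then
        # Integer arithmetic only: whole seconds, then the tenths digit.
        echo "request timeout: $((timeo / 10)).$((timeo % 10))s"
    fi
}
```

For the options in stor.mount, describe_nfs_timing 'retry=10,timeo=30' reports a 600-second retry window and a 3.0-second request timeout.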
For enterprise deployments, consider moving the check into a standalone probe script that bounds its own retries:
# /etc/systemd/system/gluster-nfs-probe.service
[Unit]
After=glusterfsd.service
Before=remote-fs.target
[Service]
Type=oneshot
ExecStart=/usr/local/bin/gluster-nfs-probe.sh
TimeoutSec=0
[Install]
WantedBy=multi-user.target
Sample probe script (/usr/local/bin/gluster-nfs-probe.sh):
#!/bin/bash
# Poll the local export list until /stor appears or the retries run out.
MAX_RETRIES=30
INTERVAL=2
for i in $(seq 1 "$MAX_RETRIES"); do
    if showmount -e localhost | grep -q '/stor'; then
        exit 0
    fi
    sleep "$INTERVAL"
done
exit 1
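One caveat with grep -q '/stor' is that it also matches a longer export such as /storage. A hedged sketch of a stricter test compares the first column of the showmount output exactly; export_is_listed is a hypothetical function name, and the sketch assumes the usual showmount -e layout of one header line followed by "path clients" rows.

```shell
# Return 0 only if EXPORT_PATH is listed exactly in the export list of HOST.
# Usage: export_is_listed EXPORT_PATH [HOST]
export_is_listed() {
    local export_path="$1" host="${2:-localhost}"
    # NR > 1 skips the "Export list for ..." header line.
    showmount -e "$host" 2>/dev/null |
        awk -v path="$export_path" '
            NR > 1 && $1 == path { found = 1 }
            END { exit !found }'
}
```

In the probe script, the if line could then read if export_is_listed /stor; then, leaving the rest of the retry loop unchanged.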
Use these commands to verify the solution:
systemd-analyze critical-chain remote-fs.target
journalctl -u gluster-nfs-ready.service -u remote-fs.target --since "5 minutes ago"
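Beyond the unit ordering, it can help to confirm that /stor actually ended up mounted as NFS. The sketch below reads the kernel mount table; is_nfs_mounted is a hypothetical helper, and its optional second argument exists only so the function can be tried against a sample file instead of /proc/self/mounts.

```shell
# Return 0 if MOUNT_POINT is present in the mount table with an nfs* type.
# Usage: is_nfs_mounted MOUNT_POINT [MOUNT_TABLE_FILE]
is_nfs_mounted() {
    local mount_point="$1" table="${2:-/proc/self/mounts}"
    # Mount table columns: device, mount point, fs type, options, dump, pass.
    awk -v mp="$mount_point" '
        $2 == mp && $3 ~ /^nfs/ { found = 1 }
        END { exit !found }' "$table"
}
```

Running is_nfs_mounted /stor && echo mounted after boot gives a quick yes or no answer that complements the journal output.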
When working with distributed filesystems like GlusterFS, we often encounter timing issues where NFS exports aren't immediately available after the service starts. The standard After= and Requires= directives in systemd only ensure the service process has started, not that it's fully operational.
Your current unit files show good practice with proper dependency declarations:
[Unit]
Description=GlusterFS brick processes (stopping only)
After=network.target glusterd.service
And in remote-fs.target:
[Unit]
Requires=glusterfsd.service
After=glusterfsd.service remote-fs-pre.target
The logs clearly show the race condition:
Apr 14 16:16:22 node04 systemd[1]: Started GlusterFS
Apr 14 16:16:22 node04 systemd[1]: Mounting /stor...
Apr 14 16:16:23 node04 mount[2960]: mount.nfs: mounting node04:/stor failed
Systemd considers the service "started" when the process launches, but Gluster's NFS exports take additional time to become available.
Solution 1: Health Check Script
Create a helper service that verifies NFS availability:
[Unit]
Description=GlusterFS NFS readiness check
After=glusterfsd.service
Before=remote-fs.target
[Service]
Type=oneshot
ExecStart=/usr/local/bin/check-gluster-nfs.sh
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
Sample check script:
#!/bin/bash
# Try once per second, for up to 30 attempts, to find /stor in the export list.
for i in {1..30}; do
    if showmount -e localhost | grep -q /stor; then
        exit 0
    fi
    sleep 1
done
exit 1
Solution 2: Mount Unit Retry Logic
Modify your mount unit so that mount.nfs retries on its own (retry= is in minutes, timeo= in tenths of a second):
[Unit]
Description=GlusterFS NFS Mount
After=glusterfsd.service
Requires=glusterfsd.service
[Mount]
What=node04:/stor
Where=/stor
Type=nfs
Options=soft,retry=5,timeo=10,retrans=1
Solution 3: Systemd Path Unit
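Trigger the mount when a readiness file appears. A path unit activates the service of the same name unless Unit= names another unit, so point it at the mount unit (stor.mount for /stor):
[Unit]
Description=Watch for Gluster NFS readiness
[Path]
PathExists=/var/run/gluster-nfs.ready
Unit=stor.mount
[Install]
WantedBy=multi-user.target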
For production systems, I recommend combining approaches:
[Unit]
Description=GlusterFS NFS Mount
After=glusterfs-nfs-ready.service
Requires=glusterfs-nfs-ready.service
[Mount]
What=node04:/stor
Where=/stor
Type=nfs
Options=soft,retry=5,timeo=10
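With the readiness service (saved as glusterfs-nfs-ready.service so that its name matches the mount unit's After= and Requires= lines):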
[Unit]
Description=GlusterFS NFS readiness
After=glusterfsd.service
Before=remote-fs.target
[Service]
Type=oneshot
ExecStart=/usr/bin/bash -c 'until showmount -e localhost | grep -q /stor; do sleep 1; done'
ExecStart=/usr/bin/touch /var/run/gluster-nfs.ready
RemainAfterExit=yes
This ensures proper sequencing while providing monitoring capabilities through the systemd journal.