Troubleshooting GlusterFS “Failed to Get Volume File” Error During Remote Mount


When attempting to mount a GlusterFS volume from a remote server (eros in this case), the client fails with the critical error:

[2015-02-04 15:02:56.065574] E [glusterfsd-mgmt.c:1494:mgmt_getspec_cbk] 0-glusterfs: failed to get the 'volume file' from server
[2015-02-04 15:02:56.065650] E [glusterfsd-mgmt.c:1596:mgmt_getspec_cbk] 0-mgmt: failed to fetch volume file (key:/storage)
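
When the client log is long, it can help to pull out the volfile key the client actually requested, since a mistyped volume name shows up there. The following is a minimal sketch that assumes the log line format shown above; the function name extract_volfile_key and the log path in the usage comment are illustrative and not part of GlusterFS.

```shell
# Print the volfile key from every "failed to fetch volume file" line on stdin.
extract_volfile_key() {
    sed -n 's/.*failed to fetch volume file (key:\([^)]*\)).*/\1/p'
}

# Usage (the log path is an assumption; adjust it to your mount point):
#   extract_volfile_key < /var/log/glusterfs/mnt-storage.log | sort -u
```

If the printed key is not the volume name you expect (here /storage), the mount command itself is the first thing to correct.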

Before diving deeper, verify basic connectivity and confirm that the volume is healthy:

# On the client: check that glusterd's management port is reachable
telnet eros 24007
# On the server: verify volume status
gluster volume status storage
# On the server: check volume info
gluster volume info storage

The error typically occurs due to:

  • Network connectivity issues (firewall, routing)
  • Version mismatch between client and server
  • Permission/authentication problems
  • Incorrect volume configuration
  • DNS resolution failures

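Run these on the server (the tcpdump capture is also worth running on the client):

# Check which ports the gluster processes are listening on
ss -tulnp | grep gluster
# Verify brick status
gluster volume status storage detail
# Watch the client-server handshake (replace eth0 with your interface)
tcpdump -i eth0 port 24007 -vv -nn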

Verify the glusterd configuration file on the server:

# /etc/glusterfs/glusterd.vol (server)
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
    option transport.socket.keepalive-count 5
end-volume
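
To compare this file across servers without reading it by eye, a small helper can print the value of a single option. This is a sketch that assumes the simple "option <name> <value>" layout shown above; volfile_option is a hypothetical helper name, not a GlusterFS command.

```shell
# Print the value of one "option" line from a glusterd-style volfile.
# $1 = option name, $2 = file (defaults to /etc/glusterfs/glusterd.vol)
volfile_option() {
    awk -v name="$1" '$1 == "option" && $2 == name { print $3 }' \
        "${2:-/etc/glusterfs/glusterd.vol}"
}

# Usage:
#   volfile_option working-directory
#   volfile_option transport-type
```

Running the same call on each server and comparing the output is a quick way to spot a node whose working directory or transport type has drifted.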

On CentOS/RHEL systems running firewalld, open the required ports on the server:

# Required ports for GlusterFS
firewall-cmd --add-port=24007-24008/tcp --permanent
firewall-cmd --add-port=49152-49251/tcp --permanent
firewall-cmd --reload
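
After reloading, it is worth confirming that both ranges are really open. The sketch below checks a port list in the space-separated format printed by firewall-cmd --list-ports. The helper name missing_gluster_ports is made up for this example, and it only does an exact match on the two ranges used above, so a wider range that happens to cover them would still be reported as missing.

```shell
# Print each required GlusterFS port range that is absent from the given
# space-separated port list (the format of `firewall-cmd --list-ports`).
missing_gluster_ports() {
    local opened=" $1 " required
    for required in 24007-24008/tcp 49152-49251/tcp; do
        case "$opened" in
            *" $required "*) ;;          # range is present, nothing to report
            *) echo "$required" ;;       # range is missing
        esac
    done
}

# Usage on the server:
#   missing_gluster_ports "$(firewall-cmd --list-ports)"
```

No output means both ranges are present in the active zone.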

Try these mount variations:

# Basic mount
mount -t glusterfs eros:/storage /mnt/storage

# Using FUSE directly
glusterfs --volfile-server=eros --volfile-id=/storage /mnt/storage

# With debug logging
glusterfs --log-level=DEBUG --volfile-server=eros --volfile-id=/storage /mnt/storage
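
Because the glusterfs client can return before the volume file fetch has failed, it helps to confirm afterwards that the mount point is really backed by GlusterFS. A minimal sketch, assuming a Linux client where FUSE mounts of Gluster volumes appear in /proc/mounts with the filesystem type fuse.glusterfs; is_gluster_mounted is a hypothetical helper name.

```shell
# Succeed if the given mount point is currently a GlusterFS FUSE mount.
# $1 = mount point, $2 = mounts table (defaults to /proc/mounts)
is_gluster_mounted() {
    awk -v mp="$1" '$2 == mp && $3 == "fuse.glusterfs" { found = 1 }
                    END { exit !found }' "${2:-/proc/mounts}"
}

# Usage:
#   is_gluster_mounted /mnt/storage && echo "mounted" || echo "not mounted"
```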

This case involves GlusterFS 3.6.2. Keep these compatibility points in mind:

  • Clients should ideally match server version
  • Protocol changes occurred between 3.6.x and 3.7.x
  • Consider upgrading to a supported version if possible

Additional checks worth running:

  1. Verify /etc/hosts entries on all nodes
  2. Check SELinux contexts if enabled
  3. Test with both IP and hostname
  4. Review /var/log/glusterfs logs on both ends
  5. Try a different network path if possible
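
For the /etc/hosts check in step 1, a short helper can show which address a hosts file maps to the server name, so the result can be compared across nodes and reused for the IP-versus-hostname test in step 3. This is a sketch only: hosts_lookup is a hypothetical name, and it reads the hosts file directly rather than going through the full resolver.

```shell
# Print the address(es) that a hosts-format file maps to the given name.
# $1 = host name, $2 = hosts file (defaults to /etc/hosts)
hosts_lookup() {
    awk -v n="$1" '
        { sub(/#.*/, "") }    # strip comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == n) print $1 }
    ' "${2:-/etc/hosts}"
}

# Usage:
#   hosts_lookup eros
```

Empty output means the name is not in the hosts file and resolution depends on DNS. More than one line of output means there are conflicting entries.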

When attempting to mount a GlusterFS volume from a remote fileserver to a local development machine, you might encounter the frustrating "failed to get the 'volume file' from server" error. This issue typically manifests when the client cannot retrieve the volume specification from the glusterd management daemon.

Before diving deeper, perform these basic checks:

# Test TCP connectivity to glusterd (from the client)
telnet eros 24007

# Verify the service is running (on the server)
systemctl status glusterd.service

# Query volume info from the client through the server's glusterd
gluster --remote-host=eros volume info storage

When basic checks pass, try these deeper diagnostics:

# Increase client log level for detailed debugging
# (-l sets the log file; the log level is -L or --log-level)
glusterfs --log-level=DEBUG --volfile-server=eros --volfile-id=/storage /mnt/gluster

# Check server-side logs simultaneously
journalctl -u glusterd -f

# Verify the volume definition exists (on the server)
ls -la /var/lib/glusterd/vols/storage
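
The same check can be scripted so it takes the key exactly as it appears in the client error (with or without the leading slash). A minimal sketch, assuming glusterd keeps one directory per volume under vols/ in its working directory, as the ls command above implies; volume_defined is a hypothetical helper name.

```shell
# Succeed if glusterd's working directory holds a definition for the volume.
# $1 = volume name or volfile key (a leading "/" is ignored)
# $2 = glusterd working directory (defaults to /var/lib/glusterd)
volume_defined() {
    local volume=${1#/} workdir=${2:-/var/lib/glusterd}
    [ -n "$volume" ] && [ -d "$workdir/vols/$volume" ]
}

# Usage on the server:
#   volume_defined /storage || echo "glusterd has no volume named storage"
```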

Common misconfigurations that trigger this error include:

  • Mismatched versions between client and server
  • Incorrect ownership or permissions on /var/lib/glusterd
  • Firewall rules blocking the management ports (24007-24008) or the brick ports (49152 and up)
  • DNS resolution failures between nodes
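
For the version mismatch case, comparing the release series on both ends is a quick first test. The sketch below assumes the first line of glusterfs --version looks like "glusterfs 3.6.2 built on ..."; the helper names gluster_series and same_series are made up for this example, and the ssh call in the usage comment assumes you have shell access to the server.

```shell
# Read `glusterfs --version` output on stdin and print the release series,
# e.g. "3.6" for "glusterfs 3.6.2 built on ...".
gluster_series() {
    sed -n '1s/^glusterfs \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed when both version strings parse and share the same release series.
same_series() {
    local a b
    a=$(printf '%s\n' "$1" | gluster_series)
    b=$(printf '%s\n' "$2" | gluster_series)
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage:
#   same_series "$(glusterfs --version)" "$(ssh eros glusterfs --version)" \
#       || echo "client and server are on different release series"
```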

For a typical resolution, follow these steps:

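# On the Gluster server: glusterd runs as root, so its working directory should be root-owned
sudo chown -R root:root /var/lib/glusterd
sudo systemctl restart glusterd

# On the client machine (skip the rm if the client is itself a Gluster peer,
# because that directory then holds live volume definitions)
sudo rm -rf /var/lib/glusterd/vols/storage
sudo gluster --mode=script --remote-host=eros volume info storage
sudo mount -t glusterfs eros:/storage /mnt/gluster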

When dealing with cross-DC deployments:

# Adjust TCP keepalive for high-latency networks (not persistent across reboots)
sudo sysctl -w net.ipv4.tcp_keepalive_time=1800
sudo sysctl -w net.ipv4.tcp_keepalive_intvl=60

# For AWS environments, ensure security groups allow:
# - Inbound TCP 24007-24008 from client IPs
# - Ephemeral ports 49152-65535 bidirectional
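
To see what the keepalive values above mean in practice: Linux gives up on a silent connection after tcp_keepalive_time seconds of idle time plus tcp_keepalive_probes unanswered probes sent tcp_keepalive_intvl seconds apart. The sketch below just does that arithmetic. The helper name is made up, and the probe count of 9 used in the note is the usual kernel default, so check your own value.

```shell
# Worst-case seconds before the kernel drops a silent peer:
# idle time + (probe interval * number of probes).
# $1 = tcp_keepalive_time, $2 = tcp_keepalive_intvl, $3 = tcp_keepalive_probes
keepalive_detect_seconds() {
    echo $(( $1 + $2 * $3 ))
}

# Usage with the live values:
#   keepalive_detect_seconds \
#       "$(sysctl -n net.ipv4.tcp_keepalive_time)" \
#       "$(sysctl -n net.ipv4.tcp_keepalive_intvl)" \
#       "$(sysctl -n net.ipv4.tcp_keepalive_probes)"
```

With 1800 and 60 from above and 9 probes, the result is 2340 seconds, or 39 minutes.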

Create this diagnostic script for future troubleshooting:

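#!/bin/bash
# Quick client-side health check for a GlusterFS mount.
SERVER="eros"
VOLUME="storage"

check_connectivity() {
    # Open a TCP connection to glusterd's management port. Bash's /dev/tcp
    # returns as soon as the connection succeeds or fails, whereas telnet
    # stays open on success until timeout kills it with a non-zero status.
    timeout 2 bash -c "exec 3<>/dev/tcp/$SERVER/24007" 2>/dev/null \
        || { echo "Connection failed"; exit 1; }
}

check_volume() {
    gluster --remote-host="$SERVER" volume info "$VOLUME" &>/dev/null \
        || { echo "Volume inaccessible"; exit 1; }
}

check_mount() {
    mount | grep -q "$SERVER:/$VOLUME" \
        || { echo "Not mounted"; exit 1; }
}

check_connectivity
check_volume
check_mount
echo "All checks passed"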