When your DRBD cluster shows either the WFConnection (Waiting for Connection) or the StandAlone state, communication between the nodes has broken down. The key indicators in your /proc/drbd output reveal:
// Node 1 (Primary)
1: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
ns:0 nr:0 dw:0 dr:912 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:20
// Node 2 (Secondary)
1: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:48
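To read these fields in a script rather than by eye, a small helper can pull one key out of a status line. This is a minimal sketch that assumes the DRBD 8.x /proc/drbd layout shown above; drbd_field is a hypothetical name, not part of drbd-utils.

```shell
# drbd_field KEY: read /proc/drbd-style text on stdin and print the value
# of the first "KEY:value" token found (for example cs, ro, ds or oos).
# Hypothetical helper; assumes the DRBD 8.x /proc/drbd layout.
drbd_field() {
    awk -v k="$1" '
        {
            for (i = 1; i <= NF; i++) {
                # A token matches only if it starts with "KEY:".
                if (index($i, k ":") == 1) {
                    print substr($i, length(k) + 2)
                    exit
                }
            }
        }'
}
```

On a live node it could be called as: drbd_field cs < /proc/drbd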
Before diving deep, verify basic network connectivity:
# On both nodes:
ping <peer-ip>
nc -zv <peer-ip> <drbd-port>
# DRBD sockets are owned by the kernel, so filter by port rather than by process name
ss -tln | grep <drbd-port>
iptables -L -n | grep <drbd-port>
Common pitfalls include:
- Firewall rules blocking the DRBD port (7789 in this example; DRBD conventionally uses TCP ports from 7788 upward)
- Network interfaces not being properly initialized
- Incorrect IP addresses in /etc/drbd.d/r1.res
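The last pitfall can be checked by listing the address entries of the resource file on each node and comparing them with the addresses actually configured on the interfaces. A minimal sketch, assuming one address statement per line as in typical .res files; drbd_addresses is a hypothetical helper name.

```shell
# drbd_addresses FILE: print every IP:PORT value that appears in an
# "address" statement of a DRBD resource file, one per line.
# Hypothetical helper; assumes "address [family] IP:PORT;" on a single line.
drbd_addresses() {
    awk '$1 == "address" { sub(/;$/, "", $NF); print $NF }' "$1"
}
```

Running drbd_addresses /etc/drbd.d/r1.res on both nodes should print the same two lines, and one of them should match an address reported by ip -o addr show on that node.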
1. Reset DRBD States
First, completely tear down the resource on both nodes:
drbdadm down r1
rmmod drbd
modprobe drbd
drbdadm up r1
2. Force Primary Designation
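On the node that should be primary, promote the resource. Add --force only if the plain promotion is refused, because it overrides DRBD's check that the local data is up to date:
drbdadm primary r1
# Only if the command above is refused:
drbdadm primary --force r1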
3. Establish Connection
On the secondary node, start the connection attempt and watch the state change:
drbdadm connect r1
watch -n1 cat /proc/drbd
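The watch loop above can also be scripted, for example to gate a later step on the connection coming up. A minimal sketch: wait_for_cs is a hypothetical helper, not part of drbd-utils, and it assumes the DRBD 8.x /proc/drbd layout in which each resource line carries a cs: token followed by a space.

```shell
# wait_for_cs FILE STATE TIMEOUT: poll FILE once per second until some line
# contains "cs:STATE "; return 0 on success, 1 after TIMEOUT seconds.
# Hypothetical helper; FILE is normally /proc/drbd.
wait_for_cs() {
    local file="$1" state="$2" timeout="$3" elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if grep -q "cs:$state " "$file"; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

For example: wait_for_cs /proc/drbd Connected 30 || echo "r1 did not connect within 30 seconds"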
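If still failing, enable detailed logging and follow the kernel log, where DRBD reports its connection events:
# This debug_level path is not present on every DRBD build; the guard skips it if missing
[ -w /proc/sys/drbd/debug_level ] && echo 1 > /proc/sys/drbd/debug_level
# On Debian/Ubuntu the kernel log usually lands in /var/log/syslog instead
tail -f /var/log/messages | grep drbd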
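Rather than reading the live log by eye, the stream can be piped through a small filter that tags the three message patterns this section lists (timeout, authentication, network). A minimal sketch: classify_drbd_log and its labels are hypothetical, and the message strings are taken from this answer as-is, so they may differ between DRBD versions.

```shell
# classify_drbd_log: read log lines on stdin and print a labelled copy of
# each line that matches one of the known patterns; other lines are dropped.
# Hypothetical helper; pattern strings are copied from this answer.
classify_drbd_log() {
    while IFS= read -r line; do
        case "$line" in
            *"Handshake unsuccessful"*)       echo "timeout: $line" ;;
            *"Packet authentication failed"*) echo "auth: $line" ;;
            *"sendmsg failed with 111"*)      echo "network: $line" ;;
        esac
    done
}
```

For example: tail -f /var/log/messages | classify_drbd_log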
Common log patterns to watch for:
# Connection timeout:
"Handshake unsuccessful"
# Authentication failures:
"Packet authentication failed"
# Network issues:
"sendmsg failed with 111"
Double-check your /etc/drbd.d/r1.res:
resource r1 {
    protocol C;
    startup {
        # "both" is only appropriate for dual-primary setups (allow-two-primaries);
        # for a Primary/Secondary pair, name the intended primary instead.
        become-primary-on node1;
    }
    net {
        cram-hmac-alg "sha1";
        shared-secret "your-secret";
        after-sb-0pri discard-zero-changes;
    }
    on node1 {
        device /dev/drbd1;
        disk /dev/sdb1;
        address 192.168.1.10:7789;
        meta-disk internal;
    }
    on node2 {
        device /dev/drbd1;
        disk /dev/sdb1;
        address 192.168.1.11:7789;
        meta-disk internal;
    }
}
For persistent issues, consider packet capture:
tcpdump -i eth0 port 7789 -w drbd.pcap
# Analyze with Wireshark or:
tcpdump -r drbd.pcap -n
Key things to verify in packet captures:
- Are DRBD handshake packets being exchanged?
- Is there any TCP retransmission?
- Are packets being dropped at either end?
Before attempting fixes, verify these critical points:
# Check network connectivity between nodes
ping <partner_ip>
nc -zv <partner_ip> <drbd_port>
# Verify DRBD config consistency (compare the local file with the peer's copy)
ssh <partner_ip> cat /etc/drbd.d/r1.res | diff /etc/drbd.d/r1.res -
# Check for kernel module issues
lsmod | grep drbd
dmesg | grep -i drbd
1. Force Disconnect and Reconnect
First attempt a clean restart sequence:
# On both nodes:
drbdadm disconnect r1
drbdadm down r1
modprobe -r drbd
modprobe drbd
drbdadm up r1
2. Manual Connection Establishment
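If automatic connection fails (typically after a split brain), initiate the connection manually. Use --discard-my-data only on the node whose changes you are willing to lose, never on the primary:
# On the secondary node (its divergent changes will be overwritten):
drbdadm secondary r1
drbdadm connect --discard-my-data r1
# On the primary node (the connect is only needed if it is StandAlone):
drbdadm primary r1
drbdadm connect r1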
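When standard recovery fails, enable detailed logging:
# Increase DRBD debug level (this path is not present on every DRBD build; the guard skips it if missing)
[ -w /proc/sys/debug/drbd ] && echo 7 > /proc/sys/debug/drbd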
# Monitor connection attempts in real-time
watch -n 1 'cat /proc/drbd; drbd-overview'
Ensure your resource configuration contains these critical elements:
resource r1 {
    protocol C;
    startup {
        wfc-timeout 30;
        outdated-wfc-timeout 20;
    }
    net {
        cram-hmac-alg "sha1";
        shared-secret "your-secret";
        after-sb-0pri discard-zero-changes;
    }
    on node1 {
        address 10.0.0.1:7788;
        device /dev/drbd1;
        disk /dev/sda1;
        meta-disk internal;
    }
    on node2 {
        address 10.0.0.2:7788;
        device /dev/drbd1;
        disk /dev/sda1;
        meta-disk internal;
    }
}
Network-level blocks often cause persistent WFConnection states. Open the TCP port named in your resource file (7788 in the configuration above) on both nodes:
# For firewalld (RHEL/CentOS)
firewall-cmd --add-port=7788/tcp --permanent
firewall-cmd --reload
# For iptables (Debian/Ubuntu; /etc/iptables/rules.v4 is the file restored by the iptables-persistent package)
iptables -A INPUT -p tcp --dport 7788 -j ACCEPT
iptables-save > /etc/iptables/rules.v4
After successful reconnection, verify proper sync status:
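# If the status subcommand is unavailable in your drbd-utils, use: drbdadm cstate r1; drbdadm dstate r1
drbdadm status r1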
cat /proc/drbd
# Expected healthy output:
1: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
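The expected output above can be turned into a pass/fail check for monitoring or for the end of a recovery script. A minimal sketch, assuming the DRBD 8.x /proc/drbd layout; drbd_healthy is a hypothetical helper name.

```shell
# drbd_healthy: read /proc/drbd-style text on stdin and succeed (exit 0) only
# if at least one line shows cs:Connected together with ds:UpToDate/UpToDate.
# Hypothetical helper; with several resources, filter to one resource first,
# because a single healthy line is enough to make this check pass.
drbd_healthy() {
    grep -Eq 'cs:Connected .*ds:UpToDate/UpToDate'
}
```

For example: drbd_healthy < /proc/drbd && echo "r1 is connected and in sync"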