Understanding and Resolving DRBD “UpToDate/Diskless” Status in Linux Cluster Configurations


When working with DRBD (Distributed Replicated Block Device) in high-availability clusters, one might encounter the "UpToDate/Diskless" status that seems counterintuitive at first glance. Let's examine this specific scenario where one node shows "UpToDate/Diskless" while its peer shows "Diskless/UpToDate".

DRBD uses a specific notation for its status output, with each slash-separated pair ordered local node first, then peer:

[connection state] [local role]/[peer role] [local disk state]/[peer disk state]

In our example:

server1# drbd1: Connected Primary/Secondary UpToDate/Diskless
server2# drbd1: Connected Secondary/Primary Diskless/UpToDate

Yes, this is completely normal in specific DRBD configurations. The "Diskless" state indicates that:

  • The node has no local backing storage for this resource
  • It's still fully capable of participating in replication
  • All I/O is forwarded to the peer node
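
You can query each of these fields for a resource directly with drbdadm (the resource name drbd1 is taken from the example above):

# Query role, connection state, and disk state:
drbdadm role drbd1     # e.g. Primary/Secondary
drbdadm cstate drbd1   # e.g. Connected
drbdadm dstate drbd1   # e.g. UpToDate/Diskless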

Several legitimate configurations can result in this status:

# Example /etc/drbd.conf configuration that might cause this:
resource drbd1 {
  device /dev/drbd1;
  on server1 {
    disk /dev/sdb1;
    meta-disk internal;
    address 10.0.0.1:7789;
  }
  on server2 {
    disk none;  # 'none' marks intentional diskless operation (no backing device)
    address 10.0.0.2:7789;
  }
}
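
To bring such a resource up, only the node with backing storage needs DRBD metadata. A plausible sequence, using the names from the example above:

# On server1 (the node with the backing disk):
drbdadm create-md drbd1
drbdadm up drbd1

# On server2 (diskless, no metadata to create):
drbdadm up drbd1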

When working with Diskless nodes:

  • All writes must traverse the network to reach the peer's disk
  • Reads are also served by the peer, so every read incurs network latency
  • Consider DRBD Proxy if link latency is high
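
Since every read and write on a diskless node crosses the network, it is worth measuring the link before committing to this layout. A quick check, using the peer address from the example configuration:

# Round-trip latency from the diskless node to its peer:
ping -c 20 10.0.0.1

# Sustained throughput (requires iperf3 on both nodes):
iperf3 -s                  # on server1
iperf3 -c 10.0.0.1 -t 10   # on server2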

If this status appears unexpectedly:

# Check DRBD configuration differences:
diff <(ssh server1 cat /etc/drbd.conf) <(ssh server2 cat /etc/drbd.conf)

# Verify device existence:
lsblk | grep -A10 sdb

# Check kernel messages:
dmesg | grep -i drbd
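
It can also help to compare the configuration as DRBD actually parsed it, rather than the raw files:

# Show the configuration as drbdadm parsed it:
drbdadm dump drbd1

# Or dump every resource if you are unsure of the name:
drbdadm dump all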

The Diskless state becomes problematic when:

  • It appears suddenly in a previously working setup
  • It is accompanied by a "StandAlone" connection state
  • Disk failures are reported in the system logs
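
A quick way to separate the harmless case from the problematic one is to check the connection state alongside the disk state; if the resource has dropped to StandAlone, a reconnect attempt is a reasonable first step (split-brain recovery is a separate, more careful procedure):

# Check whether replication is still running:
drbdadm cstate drbd1   # "StandAlone" means replication has stopped

# Try to re-establish the connection and watch the kernel log:
drbdadm connect drbd1
dmesg | tail -20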

When you examine the full cluster status with drbd-overview, you see a mixed state that warrants explanation:

[root@server1 ~]# drbd-overview
1:drbd   Connected Secondary/Primary UpToDate/UpToDate C r----
2:drbd1  Connected Primary/Secondary UpToDate/Diskless C r----

[root@server2 ~]# drbd-overview
1:drbd   Connected Primary/Secondary UpToDate/UpToDate C r----
2:drbd1  Connected Secondary/Primary Diskless/UpToDate C r----

The UpToDate/Diskless status indicates perfectly normal operation in specific DRBD configurations. Here's the breakdown:

  • UpToDate: the local node holds the most recent version of the data
  • Diskless: the peer node has no local backing storage attached for this resource

This configuration often appears in these setups:

# Example drbd.conf configuration that might lead to Diskless nodes
resource drbd1 {
    protocol C;
    disk {
        on-io-error detach;
    }
    on server1 {
        device /dev/drbd1;
        disk /dev/sdb1;
        address 192.168.1.1:7789;
        meta-disk internal;
    }
    on server2 {
        device /dev/drbd1;
        disk none;   # 'none' marks intentional diskless operation
        address 192.168.1.2:7789;
    }
}
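
If you change the configuration on a running cluster, drbdadm can apply the difference without taking the resource down; assuming the resource name from the example:

# Validate the config, then apply it to the running resource:
drbdadm dump drbd1 > /dev/null && drbdadm adjust drbd1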

While often normal, Diskless state can indicate issues when:

  • The node should have storage assigned
  • Unexpected failover occurred
  • Disk errors forced DRBD into diskless mode
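
If disk errors forced the node into Diskless (for example via the on-io-error detach policy above), the backing device can be reattached once the underlying problem is fixed. A sketch, assuming the failed disk from the example was replaced and its metadata was lost with it:

# On the affected node, recreate metadata and reattach the disk:
drbdadm create-md drbd1
drbdadm attach drbd1   # DRBD resynchronizes the device from the peer

# Watch the resync progress:
cat /proc/drbd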

Check disk status with:

# Verify underlying disk devices
lsblk
cat /proc/drbd
dmesg | grep -i error

If intentionally running diskless nodes:

  • Ensure proper failover configuration
  • Monitor network latency between nodes
  • Consider using DRBD Proxy for WAN connections
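
For the monitoring point, a minimal cron-friendly sketch is shown below; the resource name and the expected state are assumptions to adapt to your cluster:

#!/bin/bash
# Warn via syslog if the disk state of drbd1 is not what we expect.
EXPECTED="Diskless/UpToDate"   # expected on the intentionally diskless node
STATE=$(drbdadm dstate drbd1)
if [ "$STATE" != "$EXPECTED" ]; then
    logger -p daemon.warning "DRBD drbd1 disk state is $STATE (expected $EXPECTED)"
fi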

To manually change states (use with caution):

# On primary node:
drbdadm primary --force drbd1

# On secondary node:
drbdadm secondary drbd1
drbdadm connect drbd1

Always verify changes with drbd-overview after state modifications.
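
For example, after a role change you can watch the status until both nodes settle:

# Re-check cluster state after the change:
drbd-overview
watch -n1 cat /proc/drbd   # interrupt with Ctrl-C once the states look right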