Xen DomU Root FS Going Read-Only During iSCSI VIP Failover: Diagnosis and Fixes

When working with Xen virtualization on openSUSE 11.1 using iSCSI SAN storage with virtual IP failover, we've encountered an intermittent but critical issue where DomU root filesystems suddenly become read-only during SAN failover events. The symptoms are particularly puzzling because:

The mount command shows the filesystem as mounted rw
The underlying block device remains writable from Dom0
The issue resolves with a simple DomU restart

The Dom0 syslog shows the following sequence during a failover:

kernel: connection1:0: iscsi: detected conn error (1011)
iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
iscsid: connection1:0 is operational after recovery (1 attempts)

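The iSCSI layer recovers on its first attempt, so the initiator itself is healthy; something breaks at the Xen/DomU boundary instead. The most likely cause is the window between the error and the recovery: an I/O request that fails in Dom0 during that window is handed to the DomU as a block error, and the guest filesystem reacts to it.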
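To see how often the connection dropped and whether every drop was followed by a recovery, the relevant lines can be pulled out of the Dom0 syslog. The following is a minimal sketch, assuming the message wording quoted above; the function name iscsi_failover_events is illustrative and other open-iscsi versions may phrase the messages differently.

```shell
#!/bin/bash
# iscsi_failover_events: read syslog text on stdin, print only the iSCSI
# connection-error and recovery lines, then a one-line summary count.
# The patterns follow the messages quoted above (an assumption about
# the exact wording used by this open-iscsi version).
iscsi_failover_events() {
    awk '
        /detected conn error/           { errors++;     print; next }
        /is operational after recovery/ { recoveries++; print; next }
        END {
            printf "errors=%d recoveries=%d\n", errors, recoveries
        }
    '
}
```

A typical call would be iscsi_failover_events < /var/log/messages. More errors than recoveries suggests that at least one outage lasted long enough for I/O to fail.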

We need to examine several layers:

1. Xen Block Front/Back Drivers

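The Xen block device protocol between Dom0 and DomU does not retry failed requests: a backend I/O error is passed straight to the guest. Start by confirming how the disk is exported in the DomU configuration. The xm disk specification has only three fields (backend device, frontend device, mode). It has no timeout or retry fields, so an entry such as 'timeout=300,retry=10' is not valid and would be parsed as a second, malformed disk:

disk = [
  'phy:/dev/iscsi/root_disk,xvda,w'   # backend device, frontend name, mode
]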

2. SCSI Error Handling

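The Linux SCSI layer may escalate a temporary error once its command timer expires. Try raising these sysfs attributes in Dom0 for the iSCSI-backed disk (sdX is a placeholder):

# SCSI command timeout in seconds
echo 180 > /sys/block/sdX/device/timeout
# max_retries is not exposed by every kernel; write it only if it exists
[ -w /sys/block/sdX/device/max_retries ] && echo 5 > /sys/block/sdX/device/max_retries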
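When several iSCSI disks are involved, the same timeout can be applied in a loop. This is a minimal sketch under stated assumptions: the function name set_scsi_timeout and the SYSFS_BLOCK override are hypothetical helpers introduced here, and the device names must be supplied by hand (for example from the "Attached scsi disk" lines of iscsiadm -m session -P 3).

```shell
#!/bin/bash
# set_scsi_timeout SECONDS DEVICE...
# Writes SECONDS to the SCSI command timeout of each named disk (e.g. sdb).
# SYSFS_BLOCK defaults to /sys/block; it is overridable only so that the
# function can be tried against a scratch directory first.
set_scsi_timeout() {
    local seconds=$1; shift
    local base=${SYSFS_BLOCK:-/sys/block}
    local dev attr
    for dev in "$@"; do
        attr="$base/$dev/device/timeout"
        if [ -w "$attr" ]; then
            echo "$seconds" > "$attr"
            echo "$dev: timeout set to $(cat "$attr")s"
        else
            # Device missing or attribute not writable: report and move on
            echo "$dev: no writable timeout attribute, skipped" >&2
        fi
    done
}
```

For example, set_scsi_timeout 180 sdb sdc. Values written to sysfs do not survive a reboot or a device re-scan, so they need to be reapplied (for instance from a boot script) to stay in effect.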

3. Filesystem Behavior

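Ext3 and ext4 react to an I/O error according to the errors= mount option. With errors=remount-ro, a single failed write is enough to flip the filesystem to read-only. The entry below shows the options involved; barrier=0 and data=writeback weaken crash consistency, so treat them as a trade-off rather than a fix: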

/dev/xvda1 / ext4 errors=remount-ro,barrier=0,data=writeback 0 1

After extensive testing, these approaches have shown success:

Solution A: Xen Configuration Tweaks

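Add these to your DomU config file. Whether the xen_blkfront parameters exist depends on the guest kernel, so confirm them with modinfo xen_blkfront in the DomU before relying on them: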

extra = "xen_blkfront.max=64 xen_blkfront.ring_ref=32"
on_poweroff = "preserve"
on_crash = "preserve"

Solution B: iSCSI Session Management

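Create a failover monitoring script in Dom0:

#!/bin/bash
# Poll the iSCSI session state and re-attach the root disk after an outage.
# DISK is a placeholder; in practice each DomU has its own backing device.
DISK=phy:/dev/iscsi/root_disk
while true; do
  # -P 1 prints the session state; plain "-m session" does not include it
  if ! iscsiadm -m session -P 1 | grep -q "iSCSI Session State: LOGGED_IN"; then
    # Skip the xm list header line and Domain-0
    for vm in $(xm list | awk 'NR > 1 && $1 != "Domain-0" {print $1}'); do
      xm block-attach "$vm" "$DISK" xvda w
    done
    sleep 30
  fi
  sleep 5
done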

For production systems, consider these additional safeguards:

Implement multipath I/O for iSCSI connections
Reduce the SAN failover detection time in LeftHand SAN/iQ
Monitor for SCSI command timeouts in DomU kernel logs
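The last safeguard, watching the DomU kernel log, can be sketched as a simple filter. The patterns below are typical of 2.6-era kernels with ext3 and are an assumption about what your guest logs; the function name scan_io_errors is illustrative.

```shell
#!/bin/bash
# scan_io_errors: filter kernel log text on stdin for messages that usually
# accompany a failed write and the read-only remount that follows.
# Exit status is 0 if anything matched and 1 otherwise (grep's behaviour),
# which makes the function usable as a cron or monitoring check.
scan_io_errors() {
    grep -E 'I/O error|Aborting journal|Remounting filesystem read-only|EXT[34]-fs error'
}
```

Inside the guest, dmesg | scan_io_errors prints any matching lines.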

When the issue occurs, gather these diagnostics:

# From Dom0:
dmesg | grep -i scsi
iscsiadm -m session -P 3
xenstore-ls | grep -A10 "device/vbd"

# From affected DomU:
cat /proc/xen/xenbus
grep -H "" /sys/block/xvda/device/*

To restate the problem concretely: during a SAN failover, some (not all) DomU root filesystems switch to read-only. Inside an affected DomU the symptoms look like this:

# Symptoms observed in DomU
$ touch /test
touch: cannot touch '/test': Read-only file system
$ mount | grep root
/dev/sda1 on / type ext3 (rw,errors=remount-ro)

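The apparent paradox is that the filesystem is listed as read-write while behaving as read-only. The most likely explanation is that mount reports the contents of /etc/mtab, which is not updated when the kernel remounts the root filesystem read-only after an error; /proc/mounts shows the actual state. Dom0 retains full read-write access to the underlying block device throughout.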
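The kernel's own view of the root mount can be read from /proc/mounts. The following is a minimal sketch; the function name root_mount_mode and its optional file argument are hypothetical helpers added for illustration.

```shell
#!/bin/bash
# root_mount_mode [MOUNTS_FILE]
# Prints "ro" or "rw" for the root filesystem according to the kernel's
# mount table. MOUNTS_FILE defaults to /proc/mounts; the argument exists
# so the function can be run against a saved copy.
root_mount_mode() {
    local mounts=${1:-/proc/mounts}
    # Field 2 is the mount point; field 4 is the option list, which begins
    # with ro or rw. The last entry for "/" is the one in effect.
    awk '$2 == "/" { split($4, opts, ","); mode = opts[1] }
         END { if (mode == "") exit 1; print mode }' "$mounts"
}
```

Run root_mount_mode in the DomU and compare the result with what mount prints; a mismatch confirms the stale /etc/mtab explanation.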

Let's examine the storage stack architecture:

Storage Layer: LeftHand SAN/iQ cluster with VIP failover
Dom0 Layer: open-iscsi initiator connecting to SAN VIP
Virtualization Layer: Xen block device passthrough to DomU
Guest Layer: DomU filesystem mounted from virtual block device

The Dom0 syslog reveals critical timing information during failover events:

kernel: connection1:0: iscsi: detected conn error (1011)
iscsid: Kernel reported iSCSI connection 1:0 error (1011) state (3)
iscsid: connection1:0 is operational after recovery (1 attempts)

The issue stems from the interaction between several components:

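The Xen block frontend/backend pair does not retry a failed request; a temporary I/O error in Dom0 reaches the DomU as a block error
The DomU filesystem (ext3, mounted with errors=remount-ro) remounts read-only as soon as it detects such an error
open-iscsi recovers the session, but requests that had already failed before recovery completed do not appear to be reissued to the upper layers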

Here are concrete steps to address the issue:

Solution 1: Adjust Xen Block Device Timeouts

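The fragment below is sometimes suggested for raising block backend timeouts. The suspend_timeout and retry_timeout keys are not documented xm or xl options, so verify them against your toolstack version on a test guest before relying on them: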

# In DomU configuration file
device_model_args_hvm = [
    'vdev=blkback,suspend_timeout=60,retry_timeout=120',
    ...
]

Solution 2: Tune iSCSI Parameters

Configure more resilient iSCSI settings in /etc/iscsi/iscsid.conf:

node.session.timeo.replacement_timeout = 120
node.conn[0].timeo.noop_out_interval = 30
node.conn[0].timeo.noop_out_timeout = 120
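Values in iscsid.conf act as defaults for node records created at discovery time, so a target that was discovered earlier keeps its old settings until its record is updated and the session is logged in again. The sketch below only prints the iscsiadm update commands for review rather than running them; the function name iscsi_timeout_update_cmds is illustrative, and the target IQN and portal are supplied by the caller.

```shell
#!/bin/bash
# iscsi_timeout_update_cmds TARGET_IQN PORTAL
# Prints (does not run) the iscsiadm commands that would apply the three
# timeout values above to an existing node record. Review the output,
# then pipe it to sh if it looks right.
iscsi_timeout_update_cmds() {
    local target=$1 portal=$2 setting
    for setting in \
        "node.session.timeo.replacement_timeout=120" \
        "node.conn[0].timeo.noop_out_interval=30" \
        "node.conn[0].timeo.noop_out_timeout=120"
    do
        # Split each "name=value" pair at the first "="
        printf "iscsiadm -m node -T '%s' -p '%s' -o update -n '%s' -v '%s'\n" \
            "$target" "$portal" "${setting%%=*}" "${setting#*=}"
    done
}
```

For example, with a made-up target: iscsi_timeout_update_cmds iqn.2003-10.com.example:vol1 10.0.0.10:3260. The real IQN and portal can be read from iscsiadm -m node.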

Solution 3: Filesystem Mount Options

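Modify the DomU /etc/fstab so that a transient error does not trigger a read-only remount. Be aware that errors=continue lets the filesystem keep writing after a genuine error and barrier=0 disables write barriers, so both trade data safety for availability: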

/dev/sda1 / ext3 defaults,errors=continue,barrier=0 0 1

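Create a monitoring script in Dom0 to detect and recover affected DomUs. Because xm console attaches interactively and does not return, the check below queries each guest over SSH instead; it assumes every DomU name resolves as a hostname and accepts key-based root logins from Dom0:

#!/bin/bash
# Skip the xm list header line and Domain-0
for vm in $(xm list | awk 'NR > 1 && $1 != "Domain-0" {print $1}'); do
    # /proc/mounts shows the real mount state; the option list begins with ro or rw
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$vm" \
        "grep -q ' / [^ ]* ro[ ,]' /proc/mounts"; then
        echo "Recovering $vm"
        xm shutdown -w "$vm"    # -w waits until the domain has shut down
        xm create "/etc/xen/$vm.cfg"
    fi
done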

For production systems, consider these architectural improvements:

Implement multipath I/O for iSCSI connections
Use Xen storage repositories instead of direct block devices
Consider LVM-based storage with periodic metadata backups
