How to Force FSCK on KVM Guest VM When Losing Console/Root Access


When a KVM guest's filesystem is corrupted and needs fsck, but you have lost both console access and the root credentials, the usual in-guest methods are unavailable. The repair then has to be done from the hypervisor, working directly on the guest's disk image.

Before proceeding, ensure:

  • The VM is powered off (virsh shutdown vm_name, or virsh destroy vm_name if the guest no longer responds)
  • You have root access to the hypervisor
  • You know the VM's disk image location (check with: virsh dumpxml vm_name | grep "source file")
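
As a small sketch of the last prerequisite, the disk image path can also be taken from the block device listing rather than the raw XML. This assumes a libvirt version that provides virsh domblklist with the --details flag; the function name list_disk_sources is made up for illustration, and paths containing whitespace are not handled.

```shell
#!/bin/bash
# list_disk_sources (hypothetical helper): read `virsh domblklist --details`
# output on stdin and print the source path of every entry whose Device
# column is "disk". Entries without a source (shown as "-") are skipped.
# Assumption: image paths contain no whitespace.
list_disk_sources() {
    awk '$2 == "disk" && $4 != "-" { print $4 }'
}

# Typical use on the hypervisor (vm_name is a placeholder):
#   virsh domblklist --details vm_name | list_disk_sources
```

Filtering on the Device column keeps CD-ROM and floppy entries out of the result, which matters when a later step opens the image read-write.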

Here's how to access the guest's disk from the hypervisor:

# Install necessary tools (CentOS 6.1)
yum install -y kpartx e2fsprogs

# Create mount point
mkdir /mnt/guest_fs

# Map partitions (for qcow2 images)
modprobe nbd max_part=8
qemu-nbd -c /dev/nbd0 /var/lib/libvirt/images/vm_disk.qcow2
kpartx -a /dev/nbd0

# Now check which partition contains the root filesystem
fdisk -l /dev/nbd0

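# Optional: mount read-only to confirm this is the right partition (example for /dev/nbd0p1)
mount -o ro /dev/mapper/nbd0p1 /mnt/guest_fs
ls /mnt/guest_fs
umount /mnt/guest_fs

With the partition identified and unmounted again (never run fsck on a mounted filesystem), run the check with the options that match the filesystem type: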

# For ext3/ext4 filesystems (-f forces a check even if the filesystem is marked clean)
fsck -f -y /dev/mapper/nbd0p1

# For XFS, fsck does nothing useful; use xfs_repair instead
xfs_repair /dev/mapper/nbd0p1
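
Because the right tool depends on the filesystem type, a script can pick the command from the type string that blkid reports (blkid -o value -s TYPE device). The following is only a sketch: the function name repair_command and the wording of its messages are illustrative, and it prints the command instead of running it.

```shell
#!/bin/bash
# repair_command (hypothetical helper): print a suggested repair command
# for a filesystem type as reported by `blkid -o value -s TYPE <device>`.
# It only prints; nothing is executed against the device.
repair_command() {
    local fstype=$1 dev=$2
    case "$fstype" in
        ext2|ext3|ext4)
            echo "e2fsck -f -y $dev" ;;
        xfs)
            echo "xfs_repair $dev" ;;
        LVM2_member)
            echo "# $dev is an LVM physical volume; activate the volume group and check its logical volumes" ;;
        *)
            echo "# no known repair tool for type '$fstype' on $dev"
            return 1 ;;
    esac
}

# Typical use (device name is an example):
#   repair_command "$(blkid -o value -s TYPE /dev/mapper/nbd0p1)" /dev/mapper/nbd0p1
```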

For more advanced scenarios, consider using libguestfs tools:

# Install guestfish
yum install -y libguestfs-tools

# Run interactive repair
guestfish --rw -a /var/lib/libvirt/images/vm_disk.qcow2

# Inside guestfish shell:
> run
> list-filesystems
> e2fsck /dev/sda1 forceall:true
> exit

For frequent needs, create a repair script:

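#!/bin/bash
# Usage: repair-vm.sh vm_name   (assumes the root filesystem is partition 1)
set -u
VM_NAME=$1
# Take the first file-backed source in the domain XML; check that it is the right disk
VM_DISK=$(virsh dumpxml "$VM_NAME" | grep -oP "source file='\K[^']+" | head -n 1)

virsh destroy "$VM_NAME"
modprobe nbd max_part=8
qemu-nbd -c /dev/nbd0 "$VM_DISK"
kpartx -a /dev/nbd0
fsck -f -y /dev/mapper/nbd0p1
kpartx -d /dev/nbd0
qemu-nbd -d /dev/nbd0
virsh start "$VM_NAME"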
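
A repair script may also want to look at what fsck reported before restarting the VM. fsck returns a bitmask exit status (documented in the fsck man page), and the sketch below decodes it into readable lines. The function name describe_fsck_status is an assumption made for this example.

```shell
#!/bin/bash
# describe_fsck_status (hypothetical helper): decode the bitmask exit
# status of fsck into one line of text per bit that is set.
describe_fsck_status() {
    local status=$1
    if [ "$status" -eq 0 ]; then echo "no errors"; return 0; fi
    if [ $((status & 1)) -ne 0 ]; then echo "errors corrected"; fi
    if [ $((status & 2)) -ne 0 ]; then echo "system should be rebooted"; fi
    if [ $((status & 4)) -ne 0 ]; then echo "errors left uncorrected"; fi
    if [ $((status & 8)) -ne 0 ]; then echo "operational error"; fi
    if [ $((status & 16)) -ne 0 ]; then echo "usage or syntax error"; fi
    if [ $((status & 32)) -ne 0 ]; then echo "canceled by user request"; fi
    if [ $((status & 128)) -ne 0 ]; then echo "shared-library error"; fi
    return 0
}

# Typical use after a check (device name is an example):
#   fsck -f -y /dev/mapper/nbd0p1; describe_fsck_status $?
```

A status of 1 means errors were found and fixed, so treating every non-zero status as a failure would abort after a successful repair; a status with bit 4 set is the case where restarting the VM is probably unwise.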

Keep in mind:

  • Always back up VM disks before attempting repairs
  • Some setups require special handling (LVM volumes, btrfs, etc.)
  • For production systems, consider taking a snapshot before the repair
  • Monitor the VM closely after the repair for any residual issues

If you encounter:

# "Device or resource busy" errors:
umount /mnt/guest_fs
kpartx -d /dev/nbd0
qemu-nbd -d /dev/nbd0

# Filesystem type not recognized:
file -s /dev/mapper/nbd0p1

When a Linux guest VM crashes and needs a filesystem check (fsck), the usual approach is to use the console or SSH into the machine. When neither console access nor root credentials are available, the check has to be run from the hypervisor instead.

Before proceeding, ensure you have:

1. Root access to the KVM host

2. Sufficient disk space for potential recovery operations

3. The guest VM powered off (critical for filesystem integrity)


# Check VM status
virsh list --all
# Force the VM off if it is still running (virsh shutdown is gentler when the guest responds)
virsh destroy vm_name

The most reliable approach is to use the libguestfs tools, which are designed for manipulating VM disk images without booting the guest:


# Install libguestfs-tools on CentOS 6
yum install libguestfs-tools

# Open the guest disk read-write in the guestfish shell
guestfish --rw -a /var/lib/libvirt/images/vm_disk.qcow2

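Once in the guestfish shell (its fsck command takes the filesystem type as the first argument; use the type that list-filesystems reports, ext4 here as an example):


> run
> list-filesystems
> fsck ext4 /dev/sda1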
> exit

For cases where libguestfs isn't available, we can use qemu-nbd:


# Load nbd module
modprobe nbd max_part=8

# Connect disk image
qemu-nbd --connect=/dev/nbd0 /var/lib/libvirt/images/vm_disk.qcow2

# Check partition table
fdisk -l /dev/nbd0

# Run fsck
fsck -y /dev/nbd0p1  # Adjust partition number as needed

# Disconnect when done
qemu-nbd --disconnect /dev/nbd0
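
The commands above hard-code /dev/nbd0, which fails if another image is already attached there. A possible way to pick a free device is sketched below. It relies on the assumption that the nbd driver exposes /sys/block/nbdN/pid only while a device is connected; the function name first_free_nbd is invented for this example, and the sysfs root is a parameter so the logic can be tried against a fake directory tree.

```shell
#!/bin/bash
# first_free_nbd (hypothetical helper): print the first /dev/nbdN, in
# glob order, that has no client attached.
# Assumption: /sys/block/nbdN/pid exists only while nbdN is connected.
first_free_nbd() {
    local sysroot=${1:-/sys/block} dir
    for dir in "$sysroot"/nbd*; do
        [ -d "$dir" ] || continue          # no nbd devices: module not loaded
        if [ ! -e "$dir/pid" ]; then
            echo "/dev/${dir##*/}"
            return 0
        fi
    done
    return 1                               # every device is in use
}

# Typical use (image path is an example):
#   dev=$(first_free_nbd) && qemu-nbd --connect="$dev" /var/lib/libvirt/images/vm_disk.qcow2
```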

For guests using LVM, additional steps are required:


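# Scan for LVM volumes (replace X with the partition number of the physical volume)
pvscan --cache /dev/nbd0pX
vgchange -ay

# Now fsck each logical volume
fsck -y /dev/mapper/vg_name-lv_name

# Deactivate the volume group before disconnecting the image
vgchange -an vg_name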
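
One detail that trips up scripts here: device-mapper doubles any hyphen inside a volume group or logical volume name when it builds the /dev/mapper node, so the path is not always a plain vg-lv concatenation. The sketch below builds the path from the two names; dm_path is a made-up helper name.

```shell
#!/bin/bash
# dm_path (hypothetical helper): build the /dev/mapper path for a logical
# volume. Hyphens inside the VG or LV name are doubled, and a single
# hyphen separates the two names.
dm_path() {
    local vg lv
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    echo "/dev/mapper/${vg}-${lv}"
}

# Typical use (names are examples):
#   fsck -y "$(dm_path vg_guest lv_root)"
```

The /dev/vg_name/lv_name symlinks avoid the escaping altogether, which may be simpler when they are present.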

Always create a backup before attempting recovery operations. For qcow2 images:


# Create backup
qemu-img convert -O qcow2 vm_disk.qcow2 vm_disk_backup.qcow2

# For raw images:
cp --sparse=always vm_disk.raw vm_disk_backup.raw

To have the guest run fsck itself on its next boot instead (these commands are run from the hypervisor while the VM is still powered off):


# Method 1: Create /forcefsck
guestfish -i -a vm_disk.qcow2 touch /forcefsck

# Method 2: Modify fstab options
guestfish -i -a vm_disk.qcow2 vi /etc/fstab
# Change the pass number (sixth field) from 0 to 1 for the root filesystem or 2 for other partitions
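
For a non-interactive variant of the fstab change, the pass field can be rewritten with a small filter. This is a sketch only: set_fsck_pass is a hypothetical name, it rewrites matched lines with single-space separators, and it leaves alone any entry that omits the fifth and sixth fields.

```shell
#!/bin/bash
# set_fsck_pass (hypothetical helper): read fstab text on stdin and print
# it with the pass number (sixth field) of one mount point changed.
# Comment lines and entries with fewer than six fields pass through as-is.
set_fsck_pass() {
    local mountpoint=$1 pass=$2
    awk -v mp="$mountpoint" -v pass="$pass" '
        /^[[:space:]]*#/ || NF < 6 { print; next }
        $2 == mp { $6 = pass }
        { print }
    '
}

# Typical use on a copy of the guest file (file names are examples):
#   set_fsck_pass / 1 < fstab.orig > fstab.new
```

The guest's /etc/fstab could be copied out and back with virt-copy-out and virt-copy-in from libguestfs-tools, so the edit happens on an ordinary file on the hypervisor.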

If encountering "Device or resource busy" errors:


# Find processes that still hold the device open
lsof | grep /dev/nbd0
# Or lazily detach the mount point if the image was mounted
umount -l /mnt/guest_fs