When a KVM guest's filesystem is corrupted and needs fsck, but you have lost both console access and root credentials, in-guest repair is not an option. The disk image has to be repaired from the hypervisor instead.
Before proceeding, ensure:
- The VM is powered off (virsh destroy vm_name forces an immediate power-off; try virsh shutdown vm_name first if the guest still responds)
- You have root access to the hypervisor
- You know the VM's disk image location (check with: virsh dumpxml vm_name | grep "source file")
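The disk lookup in the last prerequisite can be wrapped in a small helper. This is a minimal sketch: the function name disk_paths_from_xml is made up for illustration, and it assumes the domain XML lists file-backed disks as source file attributes, which is how virsh dumpxml normally prints them. CD-ROM images attached as files are printed as well, so pick the right line.

```shell
# Print every file-backed disk path found in libvirt domain XML read
# from stdin. Accepts single- or double-quoted attribute values.
# (disk_paths_from_xml is a hypothetical helper name.)
disk_paths_from_xml() {
    sed -n "s/.*<source file=[\"']\([^\"']*\)[\"'].*/\1/p"
}

# Typical use (vm_name is a placeholder):
#   virsh dumpxml vm_name | disk_paths_from_xml
```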
Here's how to access the guest's disk from the hypervisor:
# Install necessary tools (CentOS 6.1)
yum install -y kpartx e2fsprogs
# Map partitions (for qcow2 images)
modprobe nbd max_part=8
qemu-nbd -c /dev/nbd0 /var/lib/libvirt/images/vm_disk.qcow2
kpartx -a /dev/nbd0
# Check which partition contains the root filesystem
fdisk -l /dev/nbd0
With the partitions mapped but not mounted, run fsck with appropriate options. Repairing a mounted filesystem can damage it further, so mount only after the check has finished:
# For ext3/ext4 filesystems
fsck -y /dev/mapper/nbd0p1
# For XFS (requires different approach)
xfs_repair /dev/mapper/nbd0p1
# Optionally mount the repaired partition to inspect it (example for nbd0p1)
mkdir -p /mnt/guest_fs
mount /dev/mapper/nbd0p1 /mnt/guest_fs
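Choosing between fsck and xfs_repair depends on the filesystem type, which blkid can report for a mapped partition. The sketch below only prints the command it would suggest; the function name repair_command_for is hypothetical, and using e2fsck -f -y (force a full check, answer yes to every prompt) in place of plain fsck -y is an assumption rather than something the steps above require.

```shell
# Print a suggested repair command for a filesystem type string, i.e. the
# value reported by: blkid -o value -s TYPE <device>
# (repair_command_for is a hypothetical helper; it prints, it does not run.)
repair_command_for() {
    local fstype="$1" device="$2"
    case "$fstype" in
        ext2|ext3|ext4)
            echo "e2fsck -f -y $device" ;;
        xfs)
            echo "xfs_repair $device" ;;
        LVM2_member)
            echo "# $device is an LVM physical volume: activate the VG and check each LV" ;;
        *)
            echo "# no repair command known for type '$fstype' on $device"
            return 1 ;;
    esac
}

# Typical use:
#   repair_command_for "$(blkid -o value -s TYPE /dev/mapper/nbd0p1)" /dev/mapper/nbd0p1
```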
For more advanced scenarios, consider using libguestfs tools:
# Install guestfish
yum install -y libguestfs-tools
# Run interactive repair
guestfish --rw -a /var/lib/libvirt/images/vm_disk.qcow2
# Inside guestfish shell:
> run
> list-filesystems
> e2fsck /dev/sda1 forceall:true
> exit
For frequent needs, create a repair script:
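#!/bin/bash
# Usage: repair.sh vm_name
VM_NAME=$1
# Take the first file-backed disk listed in the domain XML
VM_DISK=$(virsh dumpxml "$VM_NAME" | grep -oP "source file='\K[^']+" | head -n 1)
virsh destroy "$VM_NAME"
# Make sure the nbd module is loaded with partition support
modprobe nbd max_part=8
qemu-nbd -c /dev/nbd0 "$VM_DISK"
kpartx -a /dev/nbd0
fsck -y /dev/mapper/nbd0p1
kpartx -d /dev/nbd0
qemu-nbd -d /dev/nbd0
virsh start "$VM_NAME"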
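A repair script like this restarts the guest whatever fsck reported. fsck's exit status is a bitmask documented in fsck(8), so a script can decode it before deciding to boot. The two helper names below are hypothetical, and treating "errors corrected" and "reboot suggested" as safe to boot is a judgment call, not a rule.

```shell
# Translate an fsck exit status (bitmask from fsck(8)) into readable text.
# (describe_fsck_status and fsck_status_is_safe are hypothetical helpers.)
describe_fsck_status() {
    local status="$1" msg=""
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    if [ $((status & 1)) -ne 0 ]; then msg="$msg, errors corrected"; fi
    if [ $((status & 2)) -ne 0 ]; then msg="$msg, reboot suggested"; fi
    if [ $((status & 4)) -ne 0 ]; then msg="$msg, errors left uncorrected"; fi
    if [ $((status & 8)) -ne 0 ]; then msg="$msg, operational error"; fi
    if [ $((status & 16)) -ne 0 ]; then msg="$msg, usage or syntax error"; fi
    if [ $((status & 32)) -ne 0 ]; then msg="$msg, cancelled by user"; fi
    if [ $((status & 128)) -ne 0 ]; then msg="$msg, shared library error"; fi
    echo "${msg#, }"
}

# Succeeds only if no bit other than 1 (corrected) or 2 (reboot) is set.
fsck_status_is_safe() {
    [ $(( $1 & ~3 )) -eq 0 ]
}

# Typical use inside a repair script:
#   fsck -y /dev/mapper/nbd0p1; rc=$?
#   echo "fsck: $(describe_fsck_status "$rc")"
#   fsck_status_is_safe "$rc" && virsh start "$VM_NAME"
```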
Keep in mind:
- Always back up VM disks before attempting repairs
- Some storage layouts require special handling (LVM volumes, btrfs, etc.)
- For production systems, consider snapshotting before repair
- Monitor the VM closely after repair for any residual issues
If you encounter:
# "Device or resource busy" errors:
umount /mnt/guest_fs
kpartx -d /dev/nbd0
qemu-nbd -d /dev/nbd0
# Filesystem type not recognized:
file -s /dev/mapper/nbd0p1
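A "Device or resource busy" error usually means something on the nbd device is still mounted. As a rough sketch, the filter below lists mount points backed by a given nbd device from /proc/mounts-style text. The function name mounts_using_nbd is hypothetical, and it assumes partitions show up either as /dev/nbdNpM or as /dev/mapper/nbdNpM, as in the commands above.

```shell
# Print the mount point of every filesystem backed by the named nbd device.
# Reads /proc/mounts-formatted text on stdin; $1 is a bare name like nbd0.
# (mounts_using_nbd is a hypothetical helper name.)
mounts_using_nbd() {
    awk -v dev="$1" '
        $1 ~ ("^/dev/(mapper/)?" dev "(p[0-9]+)?$") { print $2 }
    '
}

# Typical use: unmount whatever is listed, then detach the device.
#   mounts_using_nbd nbd0 < /proc/mounts
```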
When a Linux guest VM crashes and requires filesystem checking (fsck), the typical approach would be to access the console or SSH into the machine. However, when neither console access nor root credentials are available, we need to leverage hypervisor-level solutions.
Before proceeding, ensure you have:
1. Root access to the KVM host
2. Sufficient disk space for potential recovery operations
3. The guest VM powered off (critical for filesystem integrity)
# Check VM status
virsh list --all
# Shut down the VM if running
virsh destroy vm_name
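Because touching the image while the guest still runs risks further corruption, it can help to confirm the state before continuing. The sketch below polls virsh domstate until it reports "shut off". The function name wait_for_shutoff and the 30-second default timeout are assumptions for illustration.

```shell
# Wait until `virsh domstate` reports "shut off" for the given domain.
# $1 = domain name, $2 = timeout in seconds (default 30, an arbitrary choice).
# Returns 0 once the domain is off, 1 if the timeout expires first.
# (wait_for_shutoff is a hypothetical helper name.)
wait_for_shutoff() {
    local vm="$1" timeout="${2:-30}" waited=0
    while [ "$(virsh domstate "$vm" 2>/dev/null)" != "shut off" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Typical use:
#   virsh destroy vm_name
#   wait_for_shutoff vm_name 60 || { echo "vm_name is still running" >&2; exit 1; }
```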
The most reliable approach is to use the libguestfs tools, which are designed for manipulating VM disk images without booting the guest:
# Install libguestfs-tools on CentOS 6
yum install libguestfs-tools
# Run fsck on the guest disk
guestfish --rw -a /var/lib/libvirt/images/vm_disk.qcow2
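Once in guestfish shell:
> run
> list-filesystems
# guestfish's fsck takes the filesystem type first; use the type that list-filesystems reported
> fsck ext4 /dev/sda1
> exit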
For cases where libguestfs isn't available, we can use qemu-nbd:
# Load nbd module
modprobe nbd max_part=8
# Connect disk image
qemu-nbd --connect=/dev/nbd0 /var/lib/libvirt/images/vm_disk.qcow2
# Check partition table
fdisk -l /dev/nbd0
# Run fsck
fsck -y /dev/nbd0p1 # Adjust partition number as needed
# Disconnect when done
qemu-nbd --disconnect /dev/nbd0
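The commands above hard-code /dev/nbd0, which fails if that device is already attached to another image. A possible way to pick an unused device is sketched below. It assumes the kernel exposes a pid attribute under /sys/class/block/nbdN only while the device is connected; the function name find_free_nbd is hypothetical, and the optional directory argument exists mainly so the logic can be exercised without real devices.

```shell
# Print the first nbd device that is not currently connected.
# $1 = sysfs directory to scan (default /sys/class/block).
# Assumption: a connected nbd device has a "pid" attribute, a free one does not.
# (find_free_nbd is a hypothetical helper name.)
find_free_nbd() {
    local sys_root="${1:-/sys/class/block}" dir name
    for dir in "$sys_root"/nbd*; do
        [ -d "$dir" ] || continue
        name="${dir##*/}"
        case "$name" in
            *p[0-9]*) continue ;;   # skip partitions such as nbd0p1
        esac
        if [ ! -e "$dir/pid" ]; then
            echo "/dev/$name"
            return 0
        fi
    done
    return 1
}

# Typical use:
#   NBD=$(find_free_nbd) || { echo "no free nbd device" >&2; exit 1; }
#   qemu-nbd --connect="$NBD" /var/lib/libvirt/images/vm_disk.qcow2
```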
For guests using LVM, additional steps are required:
# Scan for LVM volumes
pvscan --cache /dev/nbd0pX
vgchange -ay
# Now fsck each logical volume
fsck -y /dev/mapper/vg_name-lv_name
# Deactivate the volume group again before disconnecting the image
vgchange -an vg_name
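One detail that trips people up with the /dev/mapper/vg_name-lv_name form: device-mapper doubles any hyphen inside a volume group or logical volume name, so vg-guest/lv-root appears as vg--guest-lv--root. The helper below (dm_path is a hypothetical name) builds the path accordingly.

```shell
# Build the /dev/mapper path for a logical volume. Hyphens inside the VG or
# LV name are doubled, because a single hyphen separates the two names.
# (dm_path is a hypothetical helper name.)
dm_path() {
    local vg lv
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    echo "/dev/mapper/${vg}-${lv}"
}

# Typical use:
#   fsck -y "$(dm_path vg_name lv_name)"
```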
Always create a backup before attempting recovery operations. For qcow2 images:
# Create backup
qemu-img convert -O qcow2 vm_disk.qcow2 vm_disk_backup.qcow2
# For raw images:
cp --sparse=always vm_disk.raw vm_disk_backup.raw
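A backup only helps if it fits on the target. As a rough pre-flight check, the sketch below compares the image's apparent size with the free space in the destination directory. The function name has_room_for_copy is hypothetical, --apparent-size assumes GNU du, and the comparison is deliberately conservative because a sparse copy may need less.

```shell
# Succeed if dest_dir has more free space than the image's apparent size.
# $1 = image file, $2 = destination directory.
# (has_room_for_copy is a hypothetical helper; requires GNU du.)
has_room_for_copy() {
    local image="$1" dest_dir="$2" need_kb free_kb
    need_kb=$(du -k --apparent-size "$image" | awk '{ print $1 }')
    free_kb=$(df -Pk "$dest_dir" | awk 'NR == 2 { print $4 }')
    [ "$free_kb" -gt "$need_kb" ]
}

# Typical use:
#   has_room_for_copy vm_disk.raw /backup && cp --sparse=always vm_disk.raw /backup/
```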
To force fsck on next boot (when you regain access):
# Method 1: Create /forcefsck
guestfish -i -a vm_disk.qcow2 touch /forcefsck
# Method 2: Modify fstab options
guestfish -i -a vm_disk.qcow2 vi /etc/fstab
# Change the pass number (sixth field) from 0 to 1 for the root filesystem, or to 2 for other partitions
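The fstab edit in Method 2 can also be expressed as a filter, which is easier to review than a manual edit. This sketch rewrites the sixth field for one mount point in fstab text read from stdin. The function name set_fsck_pass is hypothetical, and awk rebuilds the changed line with single spaces, so column alignment on that line is lost.

```shell
# Set the fsck pass number (sixth fstab field) for one mount point.
# $1 = mount point, $2 = pass number. Reads fstab text on stdin and writes
# the result to stdout; comments and short lines pass through untouched.
# (set_fsck_pass is a hypothetical helper name.)
set_fsck_pass() {
    awk -v mp="$1" -v pass="$2" '
        /^[ \t]*#/ || NF < 6 { print; next }
        $2 == mp { $6 = pass }
        { print }
    '
}

# Typical use on a copy of the guest's fstab:
#   set_fsck_pass / 1 < fstab.orig > fstab.new
```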
If encountering "Device or resource busy" errors:
# Ensure all handles are released
lsof | grep /dev/nbd0
# Or lazily detach the mount point (it is released once nothing is using it)
umount -l /mnt/guest_fs