How to Force Check and Repair /dev/xvda1 Filesystem Errors on a Mounted Ubuntu EC2 Instance



When logging into an Ubuntu EC2 instance, you might encounter the warning:

*** /dev/xvda1 should be checked for errors ***

This indicates potential filesystem corruption that needs attention. The challenge arises because:

  • The root filesystem (/dev/xvda1) is mounted
  • Standard fsck operations fail on mounted filesystems
  • Unmounting fails due to active processes

Attempting sudo umount /dev/xvda1 fails with:

umount: /: target is busy

Running lsof on / shows what is holding the filesystem, for example the ext4 journaling thread:

jbd2/xvda  172  root  cwd  DIR  202,1  4096  2 /
jbd2/xvda  172  root  rtd  DIR  202,1  4096  2 /
jbd2/xvda  172  root  txt  unknown  /proc/172/exe

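Killing these processes (kill -SIGKILL 172) has no effect: jbd2 is a kernel thread that services the ext4 journal, and kernel threads ignore signals sent from user space. The root filesystem therefore has to be checked either at boot time or while the volume is attached to another instance.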
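To tell a kernel thread such as jbd2 apart from an ordinary process before trying to kill it, one option is to test the PF_KTHREAD bit in the flags field of /proc/<pid>/stat. The sketch below assumes a Linux procfs; is_kernel_thread is a hypothetical helper name, not a standard utility.

```shell
#!/bin/sh
# Return 0 if the PID is a kernel thread, 1 if it is a normal process,
# 2 if the process cannot be inspected. Hypothetical helper, Linux only.
is_kernel_thread() {
    pid=$1
    [ -r "/proc/$pid/stat" ] || return 2   # no such process, or no procfs

    # The line looks like: "172 (jbd2/xvda1-8) S 2 0 0 0 -1 69238880 ..."
    # Drop everything up to the closing ") " so that a command name
    # containing spaces cannot shift the remaining fields.
    stat=$(cat "/proc/$pid/stat") || return 2
    rest=${stat##*) }

    # After the cut the fields are: state ppid pgrp session tty_nr tpgid flags
    set -- $rest
    flags=$7

    # PF_KTHREAD is 0x00200000 (2097152) in the kernel's per-process flags.
    [ $(( flags & 2097152 )) -ne 0 ]
}
```

For example, is_kernel_thread 172 && echo "kernel thread, signals will not stop it" would flag the jbd2 entry from the lsof output above.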

Method 1: Using AWS Systems Manager (Recommended)

For production instances, the safest approach is:

1. Create an AMI backup of your instance
2. Launch the AWS Systems Manager (SSM) console
3. Navigate to "Automation" and select the "AWS-FileSystemCheck" runbook, if it is listed for your account and Region
4. Specify your instance ID and the device (/dev/xvda1)
5. Execute the automation

This performs an offline filesystem check without manual intervention.

Method 2: Manual Forced Check on Next Boot

For instances where SSM isn't available:

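# Create a forcefsck flag file (honored by older init systems; it may be
# ignored for the root filesystem on recent systemd-based releases)
sudo touch /forcefsck

# Set the maximum mount count to 1 so that a check is due at the next boot.
# The setting persists, so raise it again afterwards (e.g. tune2fs -c 100).
sudo tune2fs -c 1 /dev/xvda1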

# Reboot the instance
sudo reboot

After reboot, check the results:

sudo grep -i fsck /var/log/boot.log
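Boot logs are not always kept, so it can help to confirm the result from the superblock as well. The sketch below condenses the check-related fields that tune2fs -l prints (Filesystem state, Mount count, Maximum mount count, Last checked); fsck_summary is a hypothetical helper name and it only reads text on stdin.

```shell
#!/bin/sh
# Summarise the check-related superblock fields from `tune2fs -l` output
# read on stdin. Hypothetical helper; it does not touch the device itself.
fsck_summary() {
    # Split on ": " followed by padding, so the colons inside the
    # "Last checked" timestamp are left alone.
    awk -F': +' '
        $1 == "Filesystem state"    { state = $2 }
        $1 == "Mount count"         { count = $2 }
        $1 == "Maximum mount count" { max = $2 }
        $1 == "Last checked"        { checked = $2 }
        END {
            printf "state=%s\n", state
            printf "mounts=%s/%s\n", count, max
            printf "last_checked=%s\n", checked
        }'
}
```

Typical use would be sudo tune2fs -l /dev/xvda1 | fsck_summary. A "Last checked" time that matches the reboot and a mount count reset to 1 suggest the check actually ran.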

Method 3: Using EC2 Serial Console

When the instance cannot be reached over SSH, or the check has to be run interactively:

1. Enable EC2 Serial Console for the instance
2. Connect via serial console
3. Reboot into single-user mode:
   - Interrupt boot process
   - Edit kernel command line to add "single"
   - Continue boot
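4. Remount the root filesystem read-only, then run fsck manually:
   mount -o remount,ro /
   fsck -fy /dev/xvda1
5. Reboot normally (do this right away if fsck reports that it changed the filesystem)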
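The exit status of fsck says whether the reboot in the last step is required. It is a bitwise OR of the conditions documented in fsck(8), which the following sketch decodes; explain_fsck_status is a hypothetical helper name.

```shell
#!/bin/sh
# Decode an fsck exit status (a bitmask, per fsck(8)) into one line per
# condition. Hypothetical helper for interactive use.
explain_fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    [ $(( status & 1 ))   -ne 0 ] && echo "errors corrected"
    [ $(( status & 2 ))   -ne 0 ] && echo "reboot required"
    [ $(( status & 4 ))   -ne 0 ] && echo "errors left uncorrected"
    [ $(( status & 8 ))   -ne 0 ] && echo "operational error"
    [ $(( status & 16 ))  -ne 0 ] && echo "usage or syntax error"
    [ $(( status & 32 ))  -ne 0 ] && echo "cancelled by user"
    [ $(( status & 128 )) -ne 0 ] && echo "shared-library error"
    return 0
}
```

On the serial console this could be used as fsck -fy /dev/xvda1; explain_fsck_status $?.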

To reduce the chance of running into this again:

  • Schedule regular checks:
    sudo tune2fs -c 100 /dev/xvda1  # Check every 100 mounts
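  • Monitor volume health through EBS volume status checks (smartctl cannot read SMART data from an EBS-backed /dev/xvda):
    aws ec2 describe-volume-status --volume-ids <volume-id>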
  • Configure CloudWatch alarms for disk metrics
  • Implement proper shutdown procedures

If you still encounter issues:

# Check filesystem type
sudo file -sL /dev/xvda1

# Verify mount options
sudo mount | grep xvda1

# Remount root read-write if it was switched to read-only after errors
sudo mount -o remount,rw /
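Before remounting, it may be worth confirming that the root filesystem really is read-only. A minimal sketch that reads mount-table text in /proc/mounts format on stdin; root_mount_mode is a hypothetical helper name.

```shell
#!/bin/sh
# Print "ro" or "rw" for the filesystem mounted on /, given text in
# /proc/mounts format on stdin. Hypothetical helper.
root_mount_mode() {
    awk '$2 == "/" {
             # Field 4 is the comma-separated option list. Match "ro"
             # exactly so that errors=remount-ro is not mistaken for it.
             mode = "rw"
             n = split($4, opts, ",")
             for (i = 1; i <= n; i++)
                 if (opts[i] == "ro") mode = "ro"
         }
         # If several entries cover /, the last one listed wins.
         END { if (mode != "") print mode }'
}
```

Usage: root_mount_mode < /proc/mounts. An "ro" result on a system that normally runs read-write usually means the kernel remounted the filesystem after detecting errors, in which case a check is a better first step than forcing it back to read-write.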

Remember that forced checks on production systems should be scheduled during maintenance windows due to potential downtime.


When logging into an Ubuntu EC2 instance, you might encounter this nagging message:

*** /dev/xvda1 should be checked for errors ***

This message appears when the filesystem has been flagged for a check (for example after errors were recorded, or once the mount-count or time limit was reached), but fsck cannot safely repair it while the partition is mounted as root.

The usual solutions don't work here:

# These won't work on a mounted root partition
sudo fsck /dev/xvda1
sudo umount /dev/xvda1

Running lsof shows kernel threads such as jbd2 (the journaling block device thread) holding the filesystem:

jbd2/xvda  172  root  cwd  DIR  202,1  4096  2 /
jbd2/xvda  172  root  rtd  DIR  202,1  4096  2 /

Even kill -SIGKILL won't terminate these essential kernel threads.

For root filesystems, we need to schedule a filesystem check at boot time:

# Request a filesystem check on the next reboot
# (flag file; it may be ignored on recent systemd-based releases)
sudo touch /forcefsck

Boot-time checks only run if the fsck pass field (the sixth field) in /etc/fstab is non-zero; for the root filesystem it should be 1:

# Edit /etc/fstab and make sure the last field (fsck pass) is 1, not 0
/dev/xvda1 / ext4 defaults 0 1
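A quick way to read the current pass value for / without opening an editor is sketched below. root_fsck_pass is a hypothetical helper name; it takes an fstab path as an optional argument and defaults to /etc/fstab.

```shell
#!/bin/sh
# Print the fsck pass field (6th column) of the entry mounted on /.
# A missing field is reported as 0, matching how fstab(5) treats it.
# Hypothetical helper; reads the file only.
root_fsck_pass() {
    awk '$1 !~ /^#/ && $2 == "/" { print ($6 == "" ? 0 : $6) }' "${1:-/etc/fstab}"
}
```

If root_fsck_pass prints 0, the boot-time check is skipped regardless of the mount count or flag files.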

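On AWS EC2, you can also check the volume offline from a second instance:

1. Stop the instance (do not terminate it)
2. Detach the root volume
3. Attach it to another instance in the same Availability Zone as a secondary volume
4. Run fsck on the unmounted volume:
   sudo fsck -y /dev/xvdf1  # Adjust device name
5. Detach the volume, reattach it to the original instance as its root device, and start the instance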
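The same steps can be written out as AWS CLI calls. The sketch below only prints the commands so they can be reviewed before anything is run. print_offline_fsck_plan is a hypothetical helper, the IDs are whatever you pass in, and /dev/sdf and /dev/sda1 are assumed device names (check the root device name of your own instance before reattaching).

```shell
#!/bin/sh
# Print (not execute) the AWS CLI calls for an offline fsck of a root
# volume. Arguments: original instance ID, volume ID, rescue instance ID.
# Device names /dev/sdf and /dev/sda1 are assumptions; adjust as needed.
print_offline_fsck_plan() {
    instance=$1
    volume=$2
    rescue=$3
    cat <<EOF
aws ec2 stop-instances --instance-ids $instance
aws ec2 wait instance-stopped --instance-ids $instance
aws ec2 detach-volume --volume-id $volume
aws ec2 wait volume-available --volume-ids $volume
aws ec2 attach-volume --volume-id $volume --instance-id $rescue --device /dev/sdf
# on the rescue instance: sudo fsck -y /dev/xvdf1
aws ec2 detach-volume --volume-id $volume
aws ec2 wait volume-available --volume-ids $volume
aws ec2 attach-volume --volume-id $volume --instance-id $instance --device /dev/sda1
aws ec2 start-instances --instance-ids $instance
EOF
}
```

Printing the plan first keeps the stop and detach steps deliberate, which matters on a production instance; taking a snapshot of the volume before running fsck is a sensible extra precaution.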

Consider these best practices:

# Enable automatic filesystem checks
sudo tune2fs -c 100 /dev/xvda1  # Check every 100 mounts
sudo tune2fs -i 30d /dev/xvda1  # Check every 30 days

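Regularly monitor filesystem health (smartctl is omitted here because EBS volumes exposed as /dev/xvda do not provide SMART data):

sudo tune2fs -l /dev/xvda1 | grep -i 'filesystem state'
sudo dmesg | grep -i error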
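A plain grep for "error" in the kernel log tends to be noisy. As a narrower alternative, the sketch below counts ext4 error reports per device, assuming the usual "EXT4-fs error (device NAME): ..." message format; ext4_errors_by_device is a hypothetical helper name.

```shell
#!/bin/sh
# Count "EXT4-fs error (device NAME)" lines per device in kernel log
# text read on stdin. Hypothetical helper; prints "NAME: COUNT" lines.
ext4_errors_by_device() {
    sed -n 's/.*EXT4-fs error (device \([^)]*\)).*/\1/p' |
        sort |
        uniq -c |
        awk '{ print $2 ": " $1 }'
}
```

Usage: sudo dmesg | ext4_errors_by_device. No output means no ext4 errors were found in the current kernel ring buffer.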