Optimizing KVM Guest I/O Performance: Troubleshooting Slow Disk Operations Compared to Host


When running virtualized environments, we typically expect near-native performance from KVM guests. However, your benchmark results showing 30-70% slower I/O in the guest than on the host indicate a configuration issue worth investigating.

Your current setup uses a file-type disk even though the source is an LVM block device (more on that in the alternatives below):

<disk type='file' device='disk'>
  <driver name='qemu' type='raw'/>
  <source file='/dev/vgkvmnode/lv2'/>
  <target dev='vda' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</disk>

Several factors could contribute to the performance gap:

  • Cache Settings: Missing cache configuration in the disk definition
  • IO Threads: Not using dedicated I/O threads
  • Virtio Queues: The default single-queue configuration might be suboptimal
  • NUMA Alignment: Potential NUMA node misalignment (a quick check follows this list)
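
For the NUMA point, a quick way to check whether the guest's memory and vCPUs land on the same host node (a sketch; guest1 is a placeholder domain name, and the numactl package may need to be installed):

# Show the host's NUMA topology
numactl --hardware
# Show the guest's current NUMA memory policy and vCPU pinning
virsh numatune guest1
virsh vcpupin guest1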

First, modify your disk configuration to include cache settings and an I/O thread. Note that <iothreads> is a domain-level element in libvirt, not a child of <disk>; the disk's <driver> then references a thread by index:

<domain type='kvm'>
  ...
  <iothreads>1</iothreads>
  ...
  <disk type='file' device='disk'>
    <driver name='qemu' type='raw' cache='none' io='native' iothread='1'/>
    <source file='/dev/vgkvmnode/lv2'/>
    <target dev='vda' bus='virtio'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
  </disk>
  ...
</domain>
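
To apply the change and confirm it took effect (guest1 is a placeholder domain name):

# Edit the XML; changes take effect on the next boot of the guest
virsh edit guest1
virsh shutdown guest1 && virsh start guest1
# Verify the driver line now carries the new attributes
virsh dumpxml guest1 | grep -E "iothread|cache"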

Add these parameters to your guest's XML configuration for better performance. Note that the <hyperv> enlightenments only benefit Windows guests; for a Linux guest like this one, that block can be omitted:

<domain type='kvm'>
  ...
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
    </hyperv>
  </features>
  <cpu mode='host-passthrough' check='none'/>
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
  ...
</domain>
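
The <memoryBacking><hugepages/></memoryBacking> element only works if the host reserves hugepages before the guest starts. A minimal sketch, assuming a 2 GiB guest and the default 2 MiB hugepage size:

# Reserve 1024 x 2 MiB = 2 GiB of static hugepages on the host
sysctl -w vm.nr_hugepages=1024
# Verify the reservation (HugePages_Total / HugePages_Free)
grep -i hugepages /proc/meminfo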

On the host and inside the guest, consider these optimizations:

# Inside the guest: raise the block-layer request queue depth for the
# virtio disk (vdX devices only exist in the guest; replace X accordingly)
echo 256 > /sys/block/vdX/queue/nr_requests

# On the host: reduce swap pressure
sysctl -w vm.swappiness=10

# On the host: disable transparent hugepages (this is system-wide, not
# per-process; static hugepages reserved for the guest are unaffected)
echo never > /sys/kernel/mm/transparent_hugepage/enabled
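
Since guest writes ultimately land on the host's RAID10 array, the host's I/O scheduler matters as well. On a CentOS 6-era kernel, deadline is usually a better fit than the default cfq for this workload (sdX below is a placeholder for each RAID member disk):

# Switch the scheduler on each member disk; the active one shows in brackets
echo deadline > /sys/block/sdX/queue/scheduler
cat /sys/block/sdX/queue/scheduler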

After applying these changes, rerun your benchmarks to verify improvements. The most critical metrics to watch are:

  • Single-threaded read/write operations
  • Concurrent I/O performance
  • Latency under load
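
A single iozone run inside the guest exercises the single-threaded metrics plus random I/O (assuming iozone is installed and the target filesystem sits on the virtio disk); for concurrency, the dd loop further down serves the same purpose:

# Sequential write/read (-i 0 -i 1) and random read/write (-i 2) on a
# 1 GiB file with 1 MiB records; -e includes fsync/flush time so the
# page cache does not inflate the write numbers
iozone -e -s 1g -r 1m -i 0 -i 1 -i 2 -f /tmp/iozone.tmp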

If performance remains suboptimal, consider these alternatives:

<!-- For raw performance (fastest, but buffered writes can be lost on host crash): -->
<driver name='qemu' type='raw' cache='writeback'/>

<!-- For safety with decent performance: -->
<driver name='qemu' type='qcow2' cache='writethrough'/>

<!-- For LVM passthrough (usually the right choice for an LV-backed disk): -->
<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source dev='/dev/vgkvmnode/lv2'/>
  <target dev='vda' bus='virtio'/>
</disk>
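
If you go the qcow2 route, preallocating metadata at image-creation time recovers much of the gap to raw (the path and size below are placeholders):

# Create a metadata-preallocated qcow2 image
qemu-img create -f qcow2 -o preallocation=metadata /var/lib/libvirt/images/guest1.qcow2 50G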

Verify your virtio drivers are properly installed in the guest:

# On guest system:
lsmod | grep virtio
# Load any missing modules; -a loads several in one call (without it, the
# extra names are treated as parameters to the first module). virtio and
# virtio_ring are normally pulled in automatically as dependencies.
modprobe -a virtio_blk virtio_net virtio_pci
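
If the modules are loaded but the disk still appears as sda or hda, the guest has fallen back to emulated IDE/SATA instead of virtio. A quick check:

# A virtio disk shows up as vdX:
lsblk -o NAME,SIZE,TYPE
ls -d /sys/block/vd*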

When comparing raw storage performance between my KVM host and guest systems, I consistently observed 30-70% slower I/O operations in the guest environment. The host runs CentOS 6.3 with four 1 TB SATA HDDs in a software RAID10 array, while the guest uses LVM storage on virtio.

I conducted comprehensive tests using both iozone and dd to measure different aspects of storage performance:

# Single process read test
dd if=/dev/vgkvmnode/lv2 of=/dev/null bs=1M count=1024 iflag=direct

# Concurrent read test (four readers at non-overlapping 1 GiB offsets)
for i in {1..4}; do
  dd if=/dev/vgkvmnode/lv2 of=/dev/null bs=1M count=1024 iflag=direct skip=$((i*1024)) &
done
wait
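
For the write side (the figures quoted at the end of this post are writes), a non-destructive sketch that targets a scratch file rather than the raw LV:

# Single-process direct write test: 1 GiB to a scratch file, then clean up
dd if=/dev/zero of=/tmp/ddwrite.tmp bs=1M count=1024 oflag=direct
rm -f /tmp/ddwrite.tmp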

The virtio disk configuration in the libvirt XML was the one quoted at the top of this post.

After extensive testing, these areas showed potential for improvement:

  1. Virtio Multi-Queue (queues sets the number of virtqueues, not their depth; one per vCPU is a common choice):
    <driver name='qemu' type='raw' queues='4' ioeventfd='on'/>
    
  2. CPU Pinning (pin all four vCPUs, one per physical core):
    <vcpu placement='static' cpuset='0-3'>4</vcpu>
    <cputune>
      <vcpupin vcpu='0' cpuset='0'/>
      <vcpupin vcpu='1' cpuset='1'/>
      <vcpupin vcpu='2' cpuset='2'/>
      <vcpupin vcpu='3' cpuset='3'/>
    </cputune>
    
  3. Cache Mode:
    <driver name='qemu' type='raw' cache='none'/>
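
After restarting the guest, the settings can be confirmed from the host (guest1 is a placeholder name):

# Each vCPU should report the physical CPU it is pinned to
virsh vcpuinfo guest1
virsh dumpxml guest1 | grep -E "queues|vcpupin"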
    

For enterprise environments, these kernel parameters improved performance:

# Add to /etc/sysctl.conf
vm.dirty_ratio = 20
vm.dirty_background_ratio = 10
vm.swappiness = 10
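
These values take effect at the next boot; to apply them immediately:

# Reload settings from /etc/sysctl.conf without rebooting
sysctl -p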

After implementing these changes, concurrent write performance improved from 21.5 MB/s to 38.7 MB/s in the guest environment. The key was balancing virtio queues with available CPU cores and optimizing the host's I/O scheduler.