LVM Data Distribution, Failure Handling, and Best Practices for Disaster Recovery in Linux Systems


When you create a 200MB logical volume spanning two 100MB physical drives (sda and sdb), LVM concatenates the drives using linear allocation by default (striping only happens if you request it with -i/--stripes). Here's what happens when you write a 150MB file:

# Example LVM setup commands:
pvcreate /dev/sda /dev/sdb
vgcreate VolumeGroup1 /dev/sda /dev/sdb
# (metadata overhead leaves a bit under 200MB on real disks; -l 100%FREE avoids the round figure failing)
lvcreate -L 200M -n LogicalVolume1 VolumeGroup1
mkfs.ext4 /dev/VolumeGroup1/LogicalVolume1
mount /dev/VolumeGroup1/LogicalVolume1 /mnt/lvm_volume

The logical volume is divided into extents (4MB by default). With linear allocation, the first 100MB of the LV maps to sda and the remainder to sdb, so how much of the 150MB file lands on each drive depends on which LV blocks the filesystem picks. The LVM metadata, stored in each physical volume's header (with text backups in /etc/lvm/backup/), tracks which physical extents back which parts of the logical volume.
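
To verify the extent mapping yourself, something like the following should work (VG and LV names as in the example above):

# Show each LV segment and the physical device backing it
lvs --segments -o +devices VolumeGroup1

# Show per-PV extent usage, including which LV owns each extent range
pvdisplay -m /dev/sda /dev/sdb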

Without RAID, LVM provides no redundancy. If sdb fails:

  • All data on sdb is lost
  • Data on sda remains intact, but the logical volume as a whole is broken: anything touching extents on sdb returns I/O errors, and the filesystem on top is likely damaged (see the partial-activation sketch after this list)
  • The system itself usually keeps running, but processes using the volume see I/O errors and the filesystem may be remounted read-only
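
If you need to salvage what is still readable on sda, the VG can usually be activated in degraded form. A minimal sketch, assuming the device and volume names from the example above:

# Check which PVs LVM can still see
pvscan

# Activate the VG despite the missing PV (data that lived on sdb is gone)
vgchange -ay --activationmode partial VolumeGroup1

# Mount read-only and copy out what the filesystem can still deliver
mount -o ro /dev/VolumeGroup1/LogicalVolume1 /mnt/lvm_volume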

To control physical placement, use allocation policies:

# Restrict allocation to sda (listing a PV limits allocation to it, so the size must fit on that PV):
lvcreate -L 100M -n LogicalVolume1 VolumeGroup1 /dev/sda

# Or set the allocation policy for an existing LV (cling keeps new extents on the PVs already used):
lvchange --alloc cling VolumeGroup1/LogicalVolume1
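
If extents have already ended up on the wrong drive, they can be relocated online. A sketch with the same device names, assuming sda has enough free extents:

# Move all allocated extents off sdb onto sda (runs online; can take a while)
pvmove /dev/sdb /dev/sda

# Confirm how the extents are now distributed
pvs -o +pv_used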

Best practice suggests:

  1. Volume Group Organization:
    # For production systems:
    vgcreate vg_ssd /dev/nvme0n1
    vgcreate vg_hdd /dev/sda /dev/sdb
    
    # For flexibility:
    vgcreate vg_main /dev/sd[a-z]
  2. Logical Volume Sizing:
    # Leave 10-20% free space in the VG for snapshots (verify with vgs, shown after this list)
    lvcreate -L 80G -n lv_web vg_ssd
    lvcreate -l 80%FREE -n lv_db vg_hdd
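
Before carving up a VG it's worth confirming the headroom actually exists. A quick check using the hypothetical VG names above:

# Report size and remaining free space per volume group
vgs -o vg_name,vg_size,vg_free,lv_count vg_ssd vg_hdd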

Essential LVM backup commands:

# Backup metadata (LVM also writes these backups automatically on metadata changes):
vgcfgbackup VolumeGroup1

# Replace a failed disk: recreate the PV with the failed PV's UUID (taken from the
# backup file), then restore the metadata. This restores only the metadata layout,
# not the data that was on the failed disk:
pvcreate --uuid <UUID-of-failed-PV> --restorefile /etc/lvm/backup/VolumeGroup1 /dev/sdc
vgcfgrestore -f /etc/lvm/backup/VolumeGroup1 VolumeGroup1
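
To see which metadata backups and archives exist before restoring, something along these lines:

# List metadata backup/archive files known for this VG
vgcfgrestore --list VolumeGroup1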

For critical data, combine LVM with:

# RAID1 for mirroring:
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda /dev/sdb
pvcreate /dev/md0
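
Alternatively, LVM can mirror at the logical-volume level without md. A rough sketch (sizes are illustrative and assume both PVs have room for the mirror plus its RAID metadata):

# One mirror copy per PV, so losing a single disk loses no data
lvcreate --type raid1 -m 1 -L 80M -n lv_mirrored VolumeGroup1

# Watch the mirror sync progress
lvs -a -o name,copy_percent,devices VolumeGroup1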

When you create a 200MB logical volume (LV) across two 100MB physical volumes (PVs) sda and sdb, LVM uses extents (typically 4MB blocks) to distribute data. For a 150MB file:


# Check physical extent distribution
lvdisplay -m /dev/VolumeGroup1/LogicalVolume1

Output would show something like:


--- Segments ---
Logical extents 0 to 24:
  Type                linear
  Physical volume     /dev/sda
  Physical extents    0 to 24

Logical extents 25 to 37:
  Type                linear
  Physical volume     /dev/sdb
  Physical extents    0 to 12

This means the first 100MB (25 extents × 4MB) sits on sda and the remaining ~50MB (13 extents × 4MB = 52MB) on sdb. The LVM metadata stored in the physical volume headers (with text backups under /etc/lvm/backup) tracks this mapping.

Without RAID, if sdb fails:

  • All data on sdb is lost (the 50MB portion)
  • sda's data remains intact but the LV becomes corrupted
  • You'll get I/O errors when accessing beyond the first 100MB

To check which PVs LVM can still see:


pvscan

Use lvcreate with explicit placement:


# Allocate first 100MB strictly on sda
lvcreate -L 100M -n LogicalVolume1 VolumeGroup1 /dev/sda

# Extend using sdb
lvextend -L +100M /dev/VolumeGroup1/LogicalVolume1 /dev/sdb
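
Note that lvextend only grows the LV itself; the ext4 filesystem inside still has to be resized to use the new space:

# Grow the filesystem to fill the extended LV
resize2fs /dev/VolumeGroup1/LogicalVolume1

# (future extensions can do both steps at once with lvextend -r)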

For more granular control, use --alloc policy:


lvcreate -L 200M -n LogicalVolume2 --alloc cling VolumeGroup1

Best practices I've found:

  1. Volume Group Design:
    # Create separate VGs for different storage tiers
    vgcreate fast_vg /dev/nvme0n1
    vgcreate slow_vg /dev/sda /dev/sdb
  2. Logical Volume Allocation:
    # Allocate 80% of space initially
    lvcreate -l 80%VG -n lv_data VolumeGroup1
  3. Snapshot Strategy:
    # Create a snapshot sized at ~20% of the origin (rollback/cleanup shown after this list)
    lvcreate -s -n db_backup -L 20G /dev/VolumeGroup1/db_volume
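
Classic snapshots fill up as the origin changes and become invalid once full, so they need watching. A hedged sketch using the names from item 3:

# Snapshot fill level appears in the Data% column; an invalid snapshot shows up here too
lvs VolumeGroup1

# Roll the origin back to the snapshot's state (the merge finishes on next activation if the origin is busy)
lvconvert --merge VolumeGroup1/db_backup

# Or discard the snapshot once it's no longer needed
lvremove VolumeGroup1/db_backup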

When a PV fails:


# 1. Remove the failed PV from the VG
#    (--force is required if any LV still has segments on the missing PV; that data is unrecoverable)
vgreduce --removemissing VolumeGroup1

# 2. Replace hardware
pvcreate /dev/sdc

# 3. Add to VG
vgextend VolumeGroup1 /dev/sdc

# 4. Recreate the LV and restore its contents from file-level backups
#    (LVM cannot reconstruct the extents that were on the failed disk)
lvcreate -L 200M -n LogicalVolume1 VolumeGroup1
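
After the rebuild, it's worth confirming the VG no longer reports missing devices before putting the volume back into service:

# Both should report a consistent VG with no missing PVs
vgs -o vg_name,pv_count,lv_count,vg_attr VolumeGroup1
vgck VolumeGroup1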