Live Migration and Cloning of a Running Linux Server Without Downtime: Practical Solutions



When you need to replicate a production Linux server while it is actively serving users, offline imaging tools such as dd or Clonezilla are a poor fit because:

  • Open files and running processes may cause corruption
  • File system changes during copy lead to inconsistencies
  • Network services can't be interrupted

A dependable approach is to run rsync in repeated passes, optionally reading from a filesystem snapshot for consistency:


# First pass (initial sync)
rsync -aAXHv --delete --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} / root@newserver:/

# Subsequent passes (delta sync)
while true; do
    rsync -aAXHv --delete --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} / root@newserver:/
    sleep 300
done
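
The endless loop above never tells you when the two machines are close enough to cut over. A minimal sketch of a bounded variant follows. It assumes that the number of lines printed by rsync's -i (--itemize-changes) option, one per changed item, is an acceptable measure of the remaining delta. The names converge and count_changes, and the limits in the example, are illustrative and not part of the original procedure.

```shell
#!/bin/bash
# converge MAX_PASSES THRESHOLD CMD...
# Runs CMD (which must print how many items it changed) until that number
# is at or below THRESHOLD, or until MAX_PASSES passes have been made.
converge() {
    local max_passes=$1 threshold=$2
    shift 2
    local pass=1 changed
    while [ "$pass" -le "$max_passes" ]; do
        changed=$("$@" | tr -d ' ')
        echo "pass $pass: $changed item(s) changed"
        if [ "$changed" -le "$threshold" ]; then
            return 0
        fi
        pass=$((pass + 1))
    done
    return 1
}

# Hypothetical wrapper around the rsync call used above; -i prints one
# line per transferred or deleted item, which wc -l then counts.
count_changes() {
    rsync -aAXHi --delete \
        --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} \
        / root@newserver:/ | wc -l
}

# Example: at most 10 passes, stop once 20 or fewer items changed
#   converge 10 20 count_changes && echo "ready for the final sync"
```

The count only reflects what rsync managed to report, so a pass that aborts early can look smaller than it really was; checking rsync's exit status as well would make the sketch stricter.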

If your server uses LVM, you can create snapshots for consistency:


# Create snapshot volume
lvcreate -L10G -s -n root_snapshot /dev/vg/root

# Mount snapshot
mkdir /mnt/snapshot
mount -o ro /dev/vg/root_snapshot /mnt/snapshot

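# Rsync from snapshot (exclude the target's virtual and runtime filesystems,
# otherwise --delete would try to empty them on the new server)
rsync -aAXHv --delete --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} /mnt/snapshot/ root@newserver:/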

# Cleanup
umount /mnt/snapshot
lvremove /dev/vg/root_snapshot
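
A classic LVM snapshot stays valid only while its copy-on-write space (10G above) has room; if it fills during a long rsync, the snapshot is invalidated. A minimal sketch of a fill check follows. It assumes the lvs field data_percent reports the snapshot's usage with a "." decimal separator; the function name and the 80 percent limit are illustrative.

```shell
#!/bin/bash
# snapshot_has_room USED_PERCENT LIMIT
# Succeeds while USED_PERCENT (for example "42.17") is below the integer LIMIT.
snapshot_has_room() {
    local used=${1%%.*}      # drop the fractional part
    local limit=$2
    [ "${used:-0}" -lt "$limit" ]
}

# Example (run periodically while rsync is copying):
#   used=$(lvs --noheadings -o data_percent vg/root_snapshot | tr -d ' ')
#   snapshot_has_room "$used" 80 || echo "snapshot nearly full" >&2
```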

For servers with databases, additional steps are needed:


# MySQL example
mysqldump --single-transaction --all-databases | ssh root@newserver "mysql"

# PostgreSQL example
pg_dumpall -U postgres | ssh root@newserver "psql -U postgres"
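
In pipelines like the two above, the shell normally reports only the exit status of the last command, so a dump that dies halfway can still look successful. A minimal sketch using bash's pipefail option follows; dump_cmd, load_cmd, and migrate_db are placeholder names, not part of either database's tooling.

```shell
#!/bin/bash
# Placeholders: replace the bodies with the dump and load commands in use.
dump_cmd() { mysqldump --single-transaction --all-databases; }
load_cmd() { ssh root@newserver mysql; }

# Fails if either side of the pipe fails. The subshell keeps pipefail
# from leaking into the calling shell.
migrate_db() {
    ( set -o pipefail; dump_cmd | load_cmd )
}

# Example:
#   migrate_db || echo "database copy failed" >&2
```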

After replication, you'll need to:

  1. Update network configuration (/etc/network/interfaces or /etc/sysconfig/network-scripts/)
  2. Modify hostname (/etc/hostname and /etc/hosts)
  3. Regenerate SSH host keys (rm /etc/ssh/ssh_host_* && dpkg-reconfigure openssh-server on Debian/Ubuntu; rm /etc/ssh/ssh_host_* && ssh-keygen -A on other distributions)
  4. Check service configurations (Apache, Nginx, etc.) for hardcoded IPs
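
Steps 2 and 3 can be scripted. A minimal sketch follows, assuming the old name appears in /etc/hosts as a whole whitespace-separated field; rename_host and its optional root-directory argument are illustrative. Lines it rewrites have their whitespace normalized to single spaces, and fully qualified names that merely contain the old name are left alone.

```shell
#!/bin/bash
# rename_host OLD NEW [ROOT]
# Rewrites ROOT/etc/hostname and exact-match fields in ROOT/etc/hosts.
rename_host() {
    local old=$1 new=$2 root=${3:-}
    local tmp rc
    printf '%s\n' "$new" > "$root/etc/hostname" || return 1
    tmp=$(mktemp) || return 1
    awk -v old="$old" -v new="$new" '{
        for (i = 1; i <= NF; i++) if ($i == old) $i = new
        print
    }' "$root/etc/hosts" > "$tmp" && cat "$tmp" > "$root/etc/hosts"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Example on the new server:
#   rename_host oldname newname
#   rm -f /etc/ssh/ssh_host_* && ssh-keygen -A   # fresh host keys
```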

For enterprise environments consider:

  • DRBD for block-level replication
  • GlusterFS for distributed storage
  • Cluster solutions like Pacemaker/Corosync



When you need to clone a production Linux server that can't be shut down, imaging a mounted filesystem with dd falls short: the filesystem keeps changing during the copy, so the resulting image is inconsistent. The solution lies in live migration techniques and filesystem-aware cloning tools.

For servers using LVM, we can create consistent snapshots while the system runs:

# Create snapshot volume (adjust size as needed)
lvcreate -L10G -s -n server_snapshot /dev/vg00/lv_root

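# Create raw image file, then remove the snapshot so it stops consuming space
dd if=/dev/vg00/server_snapshot of=/mnt/backup/server.img bs=4M status=progress
lvremove /dev/vg00/server_snapshot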

# On target server:
dd if=server.img of=/dev/sdX bs=4M
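
A raw image is only useful if it is intact. A minimal sketch for comparing two copies by SHA-256 follows, assuming sha256sum from GNU coreutils is available; same_content is an illustrative name.

```shell
#!/bin/bash
# same_content FILE_A FILE_B
# Exit 0 if both have identical contents, 1 if they differ, 2 on a read error.
same_content() {
    local a b
    a=$(sha256sum < "$1") || return 2
    b=$(sha256sum < "$2") || return 2
    [ "$a" = "$b" ]
}

# Example: a snapshot does not change, so while it still exists the image
# should match it exactly
#   same_content /dev/vg00/server_snapshot /mnt/backup/server.img
```

The same function works for checking a copy of server.img after it has been transferred, as long as both files are reachable from one machine.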

For non-LVM systems, rsync combined with a brief filesystem freeze provides an alternative:

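# On source server. /backup must be a mount point on a different filesystem:
# a write to the frozen root filesystem would block until it is thawed.
mkdir -p /backup
fsfreeze -f / && {
    rsync -aAXHSv --delete --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found","/backup/*"} / /backup/
    fsfreeze -u /   # always thaw, even if rsync failed
}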

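# On target server (after basic OS installation). The excludes keep --delete
# from emptying the target's own virtual and runtime filesystems.
rsync -aAXHSv --delete --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found"} backup_user@source:/backup/ /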
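
Writing the copy onto the filesystem that is frozen would hang, because writes to a frozen filesystem block until it is thawed. A minimal sketch of a pre-flight check follows, assuming GNU stat (its %d format prints the device number of the filesystem holding the path); on_same_fs is an illustrative name.

```shell
#!/bin/bash
# on_same_fs PATH_A PATH_B
# Exit 0 if both paths report the same device number, 1 if they differ,
# 2 if either path cannot be examined.
on_same_fs() {
    local a b
    a=$(stat -c %d "$1") || return 2
    b=$(stat -c %d "$2") || return 2
    [ "$a" = "$b" ]
}

# Example: continue only when the answer is a definite "different" (1)
#   on_same_fs / /backup
#   [ $? -eq 1 ] || { echo "/backup is not on a separate filesystem" >&2; exit 1; }
```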

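For mission-critical systems requiring near-real-time cloning, DRBD replicates a block device continuously. The filesystem has to be mounted through the DRBD device rather than directly from the backing disk, so plan this ahead of time instead of applying it to a root volume that is already mounted: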

# Install DRBD on both servers
apt install drbd-utils

# Configure /etc/drbd.d/server-clone.res (identical on both servers; each "on" name must match that machine's uname -n)
resource server-clone {
  protocol C;
  device /dev/drbd0;
  # Internal metadata is stored at the end of each backing device, so that space must be free
  meta-disk internal;
  
  on source-server {
    address 192.168.1.100:7788;
    disk /dev/vg00/lv_root;
  }
  
  on target-server {
    address 192.168.1.101:7788;
    disk /dev/sdb1;
  }
}

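# Initialize and start replication (run on both servers)
drbdadm create-md server-clone
drbdadm up server-clone

# On the source server only: promote it so the initial full sync starts
drbdadm primary --force server-clone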
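
DRBD matches each "on" section against the machine's node name (the output of uname -n), and a mismatch is a common reason for drbdadm to refuse to bring a resource up. A minimal sketch of a check for that follows, assuming the "on NAME {" layout used above with whitespace between the tokens; res_has_host is an illustrative name.

```shell
#!/bin/bash
# res_has_host RES_FILE [HOST]
# Exit 0 if RES_FILE contains an "on HOST" section; HOST defaults to uname -n.
res_has_host() {
    local file=$1 host=${2:-$(uname -n)}
    awk -v h="$host" '
        $1 == "on" && $2 == h { found = 1 }
        END { exit !found }
    ' "$file"
}

# Example:
#   res_has_host /etc/drbd.d/server-clone.res || echo "no section for $(uname -n)" >&2
```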

After cloning, critical checks include:

  • Comparing md5sum of critical binaries (/bin, /sbin, /usr)
  • Verifying service configurations match
  • Testing network connectivity and service availability
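
The first check can be sketched as a checksum manifest that is built the same way for both trees and then compared. This assumes find and sha256sum are available (md5sum would work the same way); tree_manifest and trees_match are illustrative names.

```shell
#!/bin/bash
# tree_manifest DIR
# Prints "checksum  ./relative/path" for every regular file, sorted by path.
tree_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2 )
}

# trees_match DIR_A DIR_B
# Exit 0 only if both directories exist and their manifests are identical.
trees_match() {
    [ -d "$1" ] && [ -d "$2" ] &&
        [ "$(tree_manifest "$1")" = "$(tree_manifest "$2")" ]
}

# Example with the clone mounted locally:
#   trees_match /usr /mnt/clone/usr || echo "binaries differ" >&2
# Across hosts, save the tree_manifest output on each side and diff the files.
```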

For large enterprise environments:

  • Schedule cloning during low-traffic periods
  • Consider using enterprise tools like Veeam or Rubrik
  • Implement verification scripts to ensure clone integrity

Example verification script:

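#!/bin/bash
# Compare critical directories between this server and the mounted clone
results=$(mktemp)   # private temp file instead of a predictable /tmp path
diff -rq /etc/ /mnt/clone/etc/ > "$results" 2>&1
if [ -s "$results" ]; then
  echo "Configuration differences detected!"
  cat "$results"
fi
rm -f "$results"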