During a recent server migration, I encountered a counterintuitive phenomenon when transferring a 20GB KVM vdisk image between CentOS 6.5 servers. Despite expectations that compression would accelerate transfers, the reality proved otherwise:
# Baseline uncompressed transfer
scp vm1-root.img root@192.168.161.62:/mnt/vdisks/ → 23 MB/s
# With compression (-C flag)
scp -C vm1-root.img root@192.168.161.62:/mnt/vdisks/ → 11 MB/s
Further experiments revealed even more performance variations:
# bzip2 pipeline approach
bzip2 -c vm1-root.img | ssh root@192.168.161.62 "bzip2 -d -c > /mnt/vdisks/vm1-root.img" → 5 MB/s
# netcat (nc) method
# Receiver:
nc -l 5678 > /mnt/vdisks/vm1-root.img
# Sender:
nc 192.168.161.62 5678 < vm1-root.img → 40 MB/s
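Worth noting: netcat sends the data unencrypted and unauthenticated, so it is only appropriate on a trusted LAN.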
The performance degradation occurs due to several computational factors:
- Single-threaded compression in SSH/SCP (typically zlib at level 6)
- CPU-bound operations competing with network I/O (see the quick check after this list)
- Buffering limitations in the compression pipeline
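To confirm the CPU bottleneck on your own hardware, a rough local check is to time gzip at its default level 6 (gzip stands in for SSH's zlib here; the numbers won't match scp -C exactly, but the order of magnitude will):
# How fast can one core compress the image? The output is discarded.
time gzip -6 -c vm1-root.img > /dev/null
If the resulting rate is below your raw network throughput, compression can only slow the transfer down.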
For maximum throughput with compressible data:
# Parallel bzip2 approach using pbzip2
pbzip2 -c vm1-root.img | ssh root@destination "pbzip2 -d -c > target.img"
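One caveat: pbzip2 only decompresses in parallel when the archive was created by pbzip2 itself, since it relies on independently compressed blocks; streams produced by plain bzip2 fall back to a single thread.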
# For uncompressed transfers, consider:
rsync -av --progress vm1-root.img root@destination:/mnt/vdisks/
# Network-optimized transfer with mbuffer (start the receiver first)
# Receiver:
mbuffer -m 1G -I 5678 > /mnt/vdisks/vm1-root.img
# Sender:
mbuffer -m 1G -O 192.168.161.62:5678 < vm1-root.img
For virtual disk images with significant empty space:
# Punch holes in zero-filled regions so the file becomes sparse
# (note: --dig-holes requires util-linux 2.25+, newer than stock CentOS 6.5)
fallocate --dig-holes vm1-root.img
# Alternative transfer method for sparse files:
tar -cSf - vm1-root.img | ssh root@destination "tar -xSf - -C /mnt/vdisks"
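Before reaching for sparse-aware tools, it is worth checking how sparse the image actually is; GNU du can compare the logical size against the blocks actually allocated:
du -h --apparent-size vm1-root.img   # logical size (20G)
du -h vm1-root.img                   # space actually allocated on disk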
| Method | Throughput | CPU Load | Best For |
|---|---|---|---|
| scp (default) | 23 MB/s | Low | General transfers |
| scp -C | 11 MB/s | High | Text/log files |
| netcat | 40 MB/s | Low | High-speed LAN |
| pbzip2 pipe | 18-25 MB/s* | High | Highly compressible data |

*Varies significantly with CPU cores and compression ratio
Pulling the numbers together, here is the full benchmark run for the same 20GB image:
# Compression benchmarks
scp -C vm1-root.img user@host:/path/ # 11 MB/s
bzip2 -c file | ssh host "bzip2 -d -c > out" # 5 MB/s
scp -c arcfour -C file user@host:/path/ # 13 MB/s
scp file user@host:/path/ # 23 MB/s (fastest scp variant)
# Alternative methods
cat file | ssh host "cat > out" # 26 MB/s
# Receiver:
nc -l 5678 > out # 40 MB/s (overall winner)
# Sender:
nc host 5678 < file
Compression shines in specific scenarios:
- Highly compressible text files (logs, configs)
- Slow network connections (WAN transfers)
- When using fast algorithms like LZ4 (see the sketch after this list)
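As a sketch of the LZ4 case (assuming the lz4 CLI is installed on both ends; it is not in the stock CentOS 6.5 repos and typically comes from EPEL or a source build):
# LZ4 trades ratio for speed, so it rarely becomes the bottleneck
lz4 -c vm1-root.img | ssh root@192.168.161.62 "lz4 -d -c > /mnt/vdisks/vm1-root.img"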
The performance degradation occurs because:
- SCP's compression algorithm (zlib) is CPU-intensive
- Modern CPUs can saturate gigabit networks without compression
- Encryption (AES) competes with compression for CPU cycles
- Vdisk images often contain already-compressed or high-entropy data that zlib cannot shrink further (a quick compressibility check follows this list)
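To estimate compressibility before committing to a pipeline, compress a sample of the image and compare sizes (the 100MB sample size here is arbitrary):
# If the output is close to 104857600 bytes, compression won't help
head -c 100M vm1-root.img | gzip -6 | wc -c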
For large binary files:
# Parallel compression with pigz (multi-threaded gzip)
pigz -c file | ssh host "pigz -d -c > out"
# Fastest raw transfer (netcat)
receiver$ nc -l 5678 | pv > out.img
sender$ pv file | nc receiver 5678
# Modern alternative: rsync (compression is off by default; add -z only on slow links)
rsync -a --progress file user@host:/path/
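One practical advantage of rsync for multi-gigabyte images is resumability; a sketch using --partial (keep interrupted files) and --inplace (update the existing target rather than writing a temporary copy):
rsync -a --progress --partial --inplace vm1-root.img user@host:/mnt/vdisks/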
For production environments:
- Consider bbcp (point-to-point parallel transfer)
- Use fpart + rsync for massive file sets
- Run iperf3 tests before large transfers (see the sketch below)
- Leverage filesystem snapshots (LVM/ZFS) for consistent copies
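A minimal iperf3 run to measure the raw link before committing to a 20GB copy (iperf3 must be installed on both hosts; on CentOS 6 it typically comes from EPEL):
# On the receiver:
iperf3 -s
# On the sender (defaults to a 10-second TCP test):
iperf3 -c 192.168.161.62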
Always benchmark with your specific workload; results vary dramatically between mostly empty vdisk images and fully allocated ones.