SCP Compression Performance Analysis: When -C Slows Down Large File Transfers



During a recent server migration, I encountered a counterintuitive phenomenon when transferring a 20GB KVM vdisk image between CentOS 6.5 servers. Despite expectations that compression would accelerate transfers, the reality proved otherwise:

# Baseline uncompressed transfer
scp vm1-root.img root@192.168.161.62:/mnt/vdisks/       # 23 MB/s

# With compression (-C flag)
scp -C vm1-root.img root@192.168.161.62:/mnt/vdisks/    # 11 MB/s

Two further experiments widened the gap in both directions:

# bzip2 pipeline approach: 5 MB/s
bzip2 -c vm1-root.img | ssh root@192.168.161.62 "bzip2 -d -c > /mnt/vdisks/vm1-root.img"

# netcat (nc) method: 40 MB/s
# Receiver:
nc -l 5678 > /mnt/vdisks/vm1-root.img
# Sender:
nc 192.168.161.62 5678 < vm1-root.img

The performance degradation occurs due to several computational factors:

  • Single-threaded compression in SSH/SCP (typically zlib at level 6)
  • CPU-bound operations competing with network I/O
  • Buffering limitations in the compression pipeline
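
One way to reason about these factors is a back-of-envelope model: a compressed transfer runs as a pipeline, so its effective rate is roughly the slower of the compressor's input rate and the wire rate multiplied by the compression ratio. The sketch below assumes that simple model (it ignores encryption cost and buffering). The function name effective_rate and its three arguments are illustrative, not part of any tool, and the sample inputs are hypothetical.

```shell
#!/bin/sh
# Estimate the effective rate of a compress-then-send pipeline.
# Assumption: throughput is bounded by the slowest stage.
#   $1 = compressor input rate in MB/s (single core)
#   $2 = network rate in MB/s
#   $3 = compression ratio (original size / compressed size, >= 1)
effective_rate() {
    awk -v c="$1" -v n="$2" -v r="$3" 'BEGIN {
        wire = n * r              # uncompressed MB/s the link can carry
        printf "%.1f\n", (c < wire) ? c : wire
    }'
}

# Hypothetical inputs: a slow compressor on a faster link,
# then a fast compressor on the same link.
effective_rate 11 23 1.2
effective_rate 60 23 3.0
```

When the compressor is slower than the link, the ratio stops mattering because the compressor is the bottleneck, which is consistent with the scp -C result above.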

For maximum throughput with compressible data:

# Parallel bzip2 approach using pbzip2
pbzip2 -c vm1-root.img | ssh root@destination "pbzip2 -d -c > target.img"

# For uncompressed transfers, consider:
rsync -av --progress vm1-root.img root@destination:/mnt/vdisks/

# Network-optimized transfer with mbuffer (start the receiver first)
# Receiver:
mbuffer -m 1G -I 5678 -o /mnt/vdisks/vm1-root.img
# Sender:
mbuffer -m 1G -i vm1-root.img -O destination:5678

For virtual disk images with significant empty space:

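# Detect zero-filled blocks and punch holes in place
# (modifies the file; needs a util-linux version that supports --dig-holes)
fallocate --dig-holes vm1-root.img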

# Alternative transfer method for sparse files:
tar -cSf - vm1-root.img | ssh root@destination "tar -xSf - -C /mnt/vdisks"
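
Before choosing a sparse-aware method, it helps to check how sparse the image actually is. The sketch below compares allocated blocks with the apparent size using GNU stat. The function name allocated_percent is made up for illustration, and the code assumes a Linux system with GNU coreutils.

```shell
#!/bin/sh
# Print what percentage of a file's apparent size is allocated on disk.
# Assumes GNU stat: %s = apparent size in bytes, %b = allocated blocks,
# %B = size in bytes of each block reported by %b.
allocated_percent() {
    stat -c '%s %b %B' "$1" | awk '{
        if ($1 == 0) { print 0; exit }
        printf "%d\n", ($2 * $3 * 100) / $1
    }'
}

# Demo: a freshly truncated file is one large hole,
# so little or nothing should be allocated for it.
demo=$(mktemp)
truncate -s 10M "$demo"
allocated_percent "$demo"
rm -f "$demo"
```

A low percentage suggests the tar -S approach (or another sparse-aware tool) will skip most of the image; a value near 100 means the file is almost fully allocated and sparse handling gains little.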

Method          Throughput      CPU Load   Best For
scp (default)   23 MB/s         Low        General transfers
scp -C          11 MB/s         High       Text/log files
netcat          40 MB/s         Low        High-speed LAN
pbzip2 pipe     18-25 MB/s*     High       Highly compressible data

*Varies significantly with CPU cores and compression ratio


The full set of measurements for the same 20GB image, including two additional variants (the arcfour cipher and a plain cat pipe over ssh):

# Compression benchmarks
scp -C vm1-root.img user@host:/path/        # 11 MB/s
bzip2 -c file | ssh host "bzip2 -d -c > out" # 5 MB/s
scp -c arcfour -C file user@host:/path/      # 13 MB/s
scp file user@host:/path/                   # 23 MB/s (fastest)

# Alternative methods
cat file | ssh host "cat > out"              # 26 MB/s
nc -l 5678 > out                             # receiver
nc host 5678 < file                          # sender: 40 MB/s (winner)

Compression shines in specific scenarios:

  • Highly compressible text files (logs, configs)
  • Slow network connections (WAN transfers)
  • When using fast algorithms like LZ4 (through a pipe, since scp -C itself only offers zlib)

The performance degradation occurs because:

  1. SCP's compression algorithm (zlib) is CPU-intensive
  2. Modern CPUs can saturate gigabit networks without compression
  3. Encryption (AES) competes with compression for CPU cycles
  4. Vdisk images often contain compressed filesystems or other already-compressed data, which zlib cannot shrink much further
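
Whether point 4 applies to a given image can be checked cheaply by compressing a sample rather than the whole file. A minimal sketch, assuming gzip at level 6 as a stand-in for the zlib compression ssh uses. The helper name compressed_percent is hypothetical, and the sample only covers the start of the file, so it may not represent the rest.

```shell
#!/bin/sh
# Compress the first N bytes of a file with gzip -6 and print the
# compressed size as a percentage of the sample size.
#   $1 = file, $2 = sample size in bytes (default 64 MiB)
compressed_percent() {
    file=$1
    want=${2:-67108864}
    # Actual sample size (the file may be shorter than requested)
    got=$(head -c "$want" "$file" | wc -c)
    [ "$got" -gt 0 ] || { echo "empty sample" >&2; return 1; }
    packed=$(head -c "$want" "$file" | gzip -6 -c | wc -c)
    echo $(( packed * 100 / got ))
}
```

A result near 100 means the sample is effectively incompressible, so -C would only cost CPU time; a small number suggests compression may pay off if the compressor can keep up with the link.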

For large binary files:

# Parallel compression with pigz (multi-threaded gzip)
pigz -c file | ssh host "pigz -d -c > out"

# Fastest raw transfer (netcat)
receiver$ nc -l 5678 | pv > out.img
sender$ pv file | nc receiver 5678

# Modern alternative (rsync with no compression)
rsync -a --progress --no-compress file user@host:/path/

For production environments:

  • Consider bbcp (point-to-point parallel transfer)
  • Use fpart + rsync for massive file sets
  • Run iperf3 to measure raw network throughput before large transfers
  • Leverage filesystem snapshots (LVM/ZFS) for consistent copies

Always benchmark with your specific workload - results vary dramatically between empty vdisk images and fully allocated ones.
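
A small helper keeps such benchmarks comparable by converting a byte count and an elapsed time into MB/s. This is only a sketch: throughput_mbs is an illustrative name, it uses decimal megabytes (10^6 bytes), and the commented usage lines reuse the host and path from the examples above and assume GNU stat.

```shell
#!/bin/sh
# Convert bytes transferred and elapsed seconds into MB/s (1 MB = 10^6 bytes).
throughput_mbs() {
    awk -v b="$1" -v s="$2" 'BEGIN {
        if (s <= 0) { print "n/a"; exit }
        printf "%.1f\n", b / s / 1000000
    }'
}

# Usage sketch (not run here): time a transfer, then report its rate.
# start=$(date +%s)
# scp vm1-root.img root@192.168.161.62:/mnt/vdisks/
# end=$(date +%s)
# throughput_mbs "$(stat -c %s vm1-root.img)" "$((end - start))"
```

Timing whole transfers with one-second resolution is coarse, but for multi-gigabyte images the error is small relative to the differences between methods.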