Optimizing Large File Transfers Between Linux Servers: Solutions for 75GB MySQL Snapshots Over 10 Mbps Links



Transferring multi-gigabyte files between data centers presents unique challenges, especially when database snapshots are involved. While recently moving a 75GB MySQL LVM snapshot from LA to NY over a 10 Mbps MPLS link, I saw transfer rates as low as 20-30 KB/s using standard tools like rsync and scp.

First, let's verify the actual network capacity with some controlled tests:

# Create 4.8GB test file
dd if=/dev/zero of=test_file bs=1M count=4800

# Measure transfer speed
time scp -C test_file user@remote:/path/to/destination
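scp layers encryption and its own flow control on top of TCP, so it can understate what the link itself can do. A raw-TCP test with netcat takes the tools out of the equation (a sketch; port 12345 and remote_host are placeholders, and netcat flags vary by variant):

```shell
# On receiver: accept a connection and discard the bytes
nc -l 12345 > /dev/null

# On sender: push 1GB of zeros through the link and time it
# (-N closes the connection on EOF; omit on netcat variants without it)
time sh -c 'dd if=/dev/zero bs=1M count=1024 | nc -N remote_host 12345'

# If iperf3 happens to be installed on both ends, it reports throughput directly:
#   receiver: iperf3 -s
#   sender:   iperf3 -c remote_host -t 30
```

Compare the netcat figure against the scp figure above: a large gap points at protocol and cipher overhead rather than the link itself.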

1. Parallelized rsync

Traditional rsync runs a single stream; when the source holds many files, GNU parallel can drive several rsync workers at once:

# Install parallel if needed
sudo apt-get install parallel

# Split and transfer in parallel
find /source -type f | parallel -j 4 rsync -az --progress {} user@remote:/destination
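Note that find | parallel only spreads multiple files across workers; a single 75GB snapshot still rides one rsync stream. One workaround (a sketch, assuming scratch space for the chunks; the file names are illustrative) is to split the snapshot first:

```shell
# Split the snapshot into 1GB chunks
split -b 1G snapshot.img snapshot.part.

# Push chunks with 4 concurrent rsync workers (--partial keeps
# interrupted chunks so a retry resumes instead of restarting)
ls snapshot.part.* | parallel -j 4 rsync -az --partial {} user@remote:/destination/

# Reassemble on the remote side:
#   ssh user@remote 'cat /destination/snapshot.part.* > /destination/snapshot.img'
```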

2. BBCP (Advanced File Copy)

BBCP, developed at SLAC, opens multiple parallel TCP streams, which makes it well suited to high-latency, high-bandwidth links:

# Install bbcp
wget ftp://ftp.slac.stanford.edu/software/bbcp/bbcp.tgz
tar xvf bbcp.tgz
cd bbcp/src
make

# Transfer with bbcp
# -z: reverse connection, -s 16: parallel streams, -w 2M: window size
./bbcp -z -s 16 -w 2M -T 'ssh -x -a -oFallBackToRsh=no' /path/to/large/file user@remote:/destination

3. Tar over Netcat (For Resilient Transfers)

While often recommended for small files, we can adapt this for large transfers:

# On receiver (traditional netcat variants need -p: nc -l -p 12345):
nc -l 12345 | tar xzvf -

# On sender:
tar czf - /path/to/files | nc remote_host 12345

TCP Tuning

These sysctl adjustments significantly improved my transfer rates:

# Add to /etc/sysctl.conf
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_sack = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_low_latency = 0
net.ipv4.route.flush = 1

# Apply changes
sysctl -p
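Why these numbers help: TCP can keep at most one window of data in flight per round trip, so throughput is capped at window / RTT, and the window must cover the bandwidth-delay product. A quick check, assuming a 10 Mbit/s link and roughly 70 ms LA-NY round trip (both assumptions; measure your own RTT with ping):

```shell
# BDP = bandwidth (bytes/s) * RTT (s)
awk 'BEGIN {
    bw_bits = 10000000   # assumed 10 Mbit/s link
    rtt     = 0.070      # assumed ~70 ms round trip
    printf "BDP: %d bytes\n", bw_bits / 8 * rtt
}'
# ~87500 bytes: even this modest link needs more window than the
# classic 64KB default. On a 1 Gbit/s link the BDP is ~8.75MB,
# which is why rmem_max/wmem_max are raised to 16MB above.
```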

Physical Media Fallback

For truly massive transfers, sometimes shipping physical media is the most reliable solution. Verify integrity with checksums:

# Create checksum for verification
sha256sum large_file.tgz > large_file.tgz.sha256

# After transfer:
sha256sum -c large_file.tgz.sha256

Monitoring

For any method, proper monitoring is crucial:

# Progress viewer for any file operation
pv large_file.tgz | ssh user@remote "cat > /destination/large_file.tgz"

# Or with rsync:
rsync --progress --stats -az /source user@remote:/destination

When transferring large files (like your 75GB MySQL LVM snapshot) between data centers over a 10 Mbps link, painfully slow speeds of 20-30 KB/s with scp/rsync are a common complaint. Here's a technical dive into solving this.

First, validate the actual bandwidth using basic tools:


# Create test file
dd if=/dev/zero of=testfile bs=1M count=1024

# Measure transfer speed
time scp testfile user@remote:/path/

1. Parallelized SCP

Break the file into chunks and transfer them concurrently; several TCP streams together can outrun a single window-limited connection:


# Split file (on source)
split -b 1G largefile.tar.gz largefile.part.

# Parallel transfer
for part in largefile.part.*; do
  scp -C "$part" user@remote:/destination/ &
done
wait

# Reassemble (on destination)
cat largefile.part.* > largefile.tar.gz
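Chunking adds a failure mode: one truncated or corrupted part silently corrupts the reassembled archive. Checksumming each part before concatenation is cheap insurance (md5sum shown; sha256sum works the same way):

```shell
# On source: record one checksum per chunk
md5sum largefile.part.* > parts.md5

# Ship parts.md5 along with the chunks, then on the destination:
md5sum -c parts.md5 && cat largefile.part.* > largefile.tar.gz
```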

2. Tar Over SSH (No Intermediate File)

Stream directly without creating intermediate files:


# Single command pipeline
tar czf - /path/to/data | ssh user@remote "tar xzf - -C /destination"
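One caveat: gzip in that pipeline is single-threaded and can itself become the bottleneck on a multi-core sender. pigz (parallel gzip) is a drop-in replacement for the compression side, assuming it is installed on both ends:

```shell
# pigz compresses on all cores; its output is ordinary gzip,
# so the receiver may use either pigz -d or gunzip
tar cf - /path/to/data | pigz | ssh user@remote "pigz -d | tar xf - -C /destination"
```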

3. BBCP (Alternative to SCP)

Install and use BBCP for better performance:


# On both servers
wget ftp://ftp.slac.stanford.edu/software/bbcp/bbcp.tgz
tar xvf bbcp.tgz
cd bbcp/src
make

# Transfer
./bbcp -s 16 -w 2M -P 5 user@source:/path/file user@dest:/path/

TCP Tuning Parameters

Try these sysctl settings on both servers:


# Add to /etc/sysctl.conf
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_sack = 1
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.route.flush = 1

# Apply changes
sysctl -p

For unreliable long-haul transfers:

  • UDP-based tools: Aspera, UDPcast
  • Resumable tools: lftp, axel
  • Compression alternatives: pigz (parallel gzip) with netcat
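Of the resumable tools, lftp is the easiest to try because it can ride the SSH service you already have. Its pget command fetches a file in parallel segments and, with -c, resumes after a dropped link (a sketch; the host and paths are placeholders, and it assumes SFTP access to the source box):

```shell
# 8 parallel segments, resumable with -c if the connection drops
lftp -e "pget -n 8 -c /path/to/largefile.tar.gz -o /local/dest/largefile.tar.gz; quit" sftp://user@remote_host
```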

If network limitations persist:


# Checksum verification before/after physical transfer
sha256sum largefile.tar.gz > largefile.sha256
# Ship drive...
sha256sum -c largefile.sha256

Remember that WAN acceleration devices might be needed for consistent high-speed transfers between data centers.