Optimizing Rsync Performance for Small File Transfers: Advanced Techniques for System Administrators


When dealing with thousands of small files (typically <1KB each), rsync's default configuration becomes inefficient due to:

  • High protocol overhead per file
  • Excessive metadata operations
  • Frequent checksum calculations

For my production servers, I use this optimized command:

# -H preserves hard links, -X extended attributes; --numeric-ids skips
# uid/gid name mapping on the receiver; --no-i-r builds the full file
# list up front instead of recursing incrementally; --partial-dir keeps
# interrupted transfers resumable; --bwlimit=0 means no bandwidth cap.
rsync -azHX --delete --numeric-ids \
      --info=progress2 --no-i-r \
      --partial-dir=.rsync-partial \
      --bwlimit=0 --compress-level=1 \
      /home/user/ user@10.1.1.1::backup

Compression Strategy (-z with level 1):

--compress-level=1 # Faster than default (6)
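
To verify the trade-off on your own data, time both levels against separate destination directories so the second run is not a no-op (paths and module are the ones used above):

# Hypothetical A/B test: level 1 vs the zlib default (6)
time rsync -az --compress-level=1 /home/user/ user@10.1.1.1::backup/z1-test
time rsync -az --compress-level=6 /home/user/ user@10.1.1.1::backup/z6-test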

Transfer Protocol Choice:

# For LAN transfers (faster but unencrypted): talk to an rsync daemon
rsync -az /home/user/ rsync://user@10.1.1.1/backup/

# For WAN transfers (slower but secure): tunnel over SSH with a fast cipher
rsync -az -e "ssh -T -c aes128-gcm@openssh.com -o Compression=no -x" \
      /home/user/ user@remote:/backup/

For critical deployments, I combine rsync with GNU parallel:

find /home/user/ -type f | parallel -j 8 -X rsync -azHX --relative {} user@10.1.1.1::backup
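
One caveat: with per-file batches, -H cannot preserve hard links that span batches. A directory-level split keeps related files in a single rsync run (a sketch, assuming a conventional layout under /home/user/):

# Parallelize per top-level directory; hard links within a directory survive
# because each directory is handled by one rsync process
find /home/user/ -mindepth 1 -maxdepth 1 -type d | \
parallel -j 8 rsync -azHX --relative {} user@10.1.1.1::backup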

Tool                Data size (10k files)   Transfer time
rsync (optimized)   ~500MB                  3m42s
tar over ssh        ~500MB                  2m15s
fpart + rsync       ~500MB                  1m53s

For our enterprise backup system, we use this wrapper script:

#!/bin/bash
TARGET="user@10.1.1.1::backup"
THREADS=$(nproc)
LOG="/var/log/rsync_$(date +%Y%m%d).log"
PARTS=$(mktemp -d)

# Write one file list per partition (max 1000 files / 10MB each, *.tmp excluded)
fpart -f 1000 -s 10M -x "*.tmp" -o "$PARTS/part" /home/user/

# One rsync per partition list; the lists hold absolute paths, so the source is /.
# --delete is omitted: it does not behave as expected with --files-from,
# so deletions should be handled in a separate pass.
ls "$PARTS"/part.* | parallel -j "$THREADS" \
    rsync -azHX --files-from={} / "$TARGET" >> "$LOG" 2>&1

rm -rf "$PARTS"
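
The fpart distribution also ships fpsync, which wraps this same partition-and-parallel-rsync pattern in a single tool and is worth evaluating before maintaining a custom script.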

When dealing with thousands of small files, rsync's default behavior can become inefficient due to:

  • High overhead from per-file metadata operations
  • Excessive protocol negotiation
  • Unnecessary checksum calculations

Here are the most effective rsync flags for small file optimization:

rsync -az --partial --inplace --no-whole-file \
--max-size=1M --min-size=1 \
--info=progress2 --human-readable \
--delete --compress-level=1 \
/home/user/ user@10.1.1.1::backup

Key optimizations:

  • --compress-level=1: Faster compression with minimal CPU overhead
  • --inplace: Avoids temporary file creation (files are rewritten in place, so readers may see inconsistent content mid-transfer)
  • --no-whole-file: Forces the delta-transfer algorithm even where rsync would default to whole-file copies (e.g., local transfers)
  • --info=progress2: Better progress reporting

For local networks, the rsync daemon protocol is generally faster than SSH because it skips encryption and session setup. Compare the two invocations:

# SSH version (slower but more secure)
rsync -e "ssh -T -c aes128-gcm@openssh.com -o Compression=no -x" \
-azP /source/ user@remote:/dest/

# rsync daemon version (faster)
rsync -azP rsync://user@remote/module/path /local/path

File Batching with tar

Combine small files into a tar stream during transfer:

# On source:
tar -cf - /source/dir | pv | \
ssh user@remote "tar -xf - -C /destination"

# With progress and compression (compress after pv so the byte count
# matches the uncompressed size reported by du):
tar -cf - /source | pv -s $(du -sb /source | awk '{print $1}') | gzip | \
ssh user@remote "tar -zxf - -C /dest"
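
On hosts where zstd is installed on both ends, swapping it in for gzip usually gives better throughput at a similar ratio (a sketch, not part of the original pipeline):

# zstd with all cores (-T0) on the sender; plain decompress on the receiver
tar -cf - /source | pv -s $(du -sb /source | awk '{print $1}') | zstd -T0 | \
ssh user@remote "zstd -d | tar -xf - -C /dest"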

Parallel rsync with xargs

Process multiple files simultaneously:

cd /source
find . -type f -print0 | \
xargs -0 -n 100 -P 8 sh -c 'rsync -az --relative "$@" user@remote:/dest/' _

xargs appends each batch of 100 filenames to the end of the command, so the small sh -c shim is needed to keep the sources ahead of the destination; --relative preserves the directory layout relative to /source.

Optimize these filesystem parameters on both source and destination:

  • Increase inotify watchers (used by watch-based tools such as lsyncd; persisted in the sketch after this list): sysctl fs.inotify.max_user_watches=524288
  • Use modern filesystems (XFS, ZFS) with block and inode sizes suited to many small files
  • Disable atime updates: mount -o remount,noatime /path
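
To make the inotify and atime settings survive a reboot, something like the following works (file and device names are illustrative):

# Persist the inotify limit via a sysctl drop-in, then reload
echo 'fs.inotify.max_user_watches=524288' > /etc/sysctl.d/99-smallfiles.conf
sysctl --system
# Persist noatime by adding it to the filesystem's /etc/fstab entry, e.g.:
# /dev/sdb1  /srv/backup  xfs  defaults,noatime  0  2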

For truly massive small-file operations, evaluate these tools:

  • Unison: Bidirectional sync with conflict resolution
  • lsyncd: Real-time synchronization daemon
  • Rclone: Cloud-optimized file transfers with tunable parallelism (see the example below)
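
As a quick illustration, Rclone's transfer concurrency is directly tunable, which suits many-small-file workloads; the remote name below is a placeholder for a configured rclone remote:

# Raise parallel transfers and metadata checkers for small-file workloads
rclone sync /home/user/ remote:backup --transfers=32 --checkers=16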