Optimizing rsync for Handling Renamed Files/Directories While Maintaining Sync Efficiency



When using rsync for directory replication, one common frustration is dealing with renamed files or directories. The -u (update) flag only skips files that are newer on the receiving side; it does nothing to detect renames. rsync matches files by path, so a renamed file looks like a brand-new file in the source, while the copy under the old name is left behind at the destination. For example:

# Source directory contains:
# renamedFile.txt (previously existing as newfile2.txt)
# Destination still has:
# newfile2.txt
rsync -zvru --bwlimit=1024 /mymounts/test1/ /mymounts/test2

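rsync's --fuzzy option does not rename anything, but it can make a rename cheaper. When a file is missing at the destination, rsync looks in the same destination directory for a basis file with the same size and modification time, or with a similar name, and uses it to speed up the delta transfer. Combine this with --delete-after so the old name is removed only after the transfer; a plain --delete may remove the fuzzy-match candidate before it can be used:

rsync -zvru --fuzzy --delete-after --bwlimit=1024 /mymounts/test1/ /mymounts/test2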

Important considerations:

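  • --fuzzy has a performance cost, because rsync searches the destination directory for a basis candidate for every missing file
  • Use --delete-after (or filename exclusions) rather than plain --delete, which can remove fuzzy-match candidates before they are used
  • Candidates are chosen by size, modification time, and name similarity, so a poor basis file may be picked among very similar files; the result is still correct, only the savings shrink
  • When both paths are local, rsync defaults to --whole-file and skips the delta algorithm, so --fuzzy only pays off together with --no-whole-file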
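Before running anything that deletes, it can help to preview the result. The following is a minimal sketch, not a definitive recipe: preview_sync is a hypothetical helper name, and the paths in the usage comment are the ones from the question. It relies only on rsync's --dry-run and --itemize-changes options.

```shell
#!/bin/bash
# preview_sync SRC DST  (hypothetical helper name)
# Lists what rsync would transfer and delete, without changing anything.
preview_sync() {
    # Normalise SRC to end in exactly one slash so its contents are synced.
    local src=${1%/}/ dst=$2
    # --dry-run: make no changes; --itemize-changes: one summary line per item.
    rsync -ru --fuzzy --delete-after --dry-run --itemize-changes "$src" "$dst"
}

# Example: preview_sync /mymounts/test1 /mymounts/test2
```

Lines marked as deletions in the output are files that exist only at the destination, which is where the old name of a renamed file shows up.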

For critical operations where you want to keep whatever the sync replaces or deletes, combine with backup functionality:

rsync -zvru --fuzzy --delete-after --backup --backup-dir=/mymounts/VerControl \
--bwlimit=1024 /mymounts/test1/ /mymounts/test2

This configuration will:

  • Use the file under its old name as a transfer basis via --fuzzy
  • Move replaced and deleted files to VerControl instead of discarding them
  • Maintain your bandwidth limitation

When working with large directories or slow connections, consider these optimizations:

# Skip files whose size already matches, ignoring timestamps
# (faster, but misses edits that leave the size unchanged)
rsync -zvru --fuzzy --delete-after --size-only --bwlimit=1024 \
/mymounts/test1/ /mymounts/test2

# Skip files that already exist unchanged in /mymounts/VerControl
rsync -zvru --fuzzy --delete-after --compare-dest=/mymounts/VerControl \
--bwlimit=1024 /mymounts/test1/ /mymounts/test2

For automated syncing during business hours:

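#!/bin/bash
# Write one log file per day; quote the variable so unusual paths are safe.
LOG_FILE="/var/log/rsync_$(date +%Y%m%d).log"
echo "Starting sync at $(date)" >> "$LOG_FILE"

rsync -zvru --fuzzy --delete-after --backup --backup-dir=/mymounts/VerControl \
--bwlimit=1024 /mymounts/test1/ /mymounts/test2 >> "$LOG_FILE" 2>&1
status=$?

# Record rsync's exit status so failed runs are visible in the log.
echo "Sync finished at $(date) with exit status $status" >> "$LOG_FILE"
exit "$status"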

Remember to:

  • Set appropriate permissions for the log directory
  • Implement log rotation for long-term use
  • Consider locking mechanisms for concurrent execution
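For the locking point, one common approach on Linux is flock from util-linux. This is a minimal sketch under that assumption; run_locked and the lock file path in the usage comment are hypothetical names chosen for illustration.

```shell
#!/bin/bash
# run_locked LOCKFILE COMMAND [ARGS...]  (hypothetical helper name)
# Runs COMMAND only if no other process holds the lock on LOCKFILE.
run_locked() {
    local lock=$1
    shift
    (
        # Take an exclusive, non-blocking lock on file descriptor 9.
        flock -n 9 || { echo "another sync is already running" >&2; exit 1; }
        "$@"
    ) 9>"$lock"
}

# Example: run_locked /tmp/rsync_sync.lock /path/to/sync_script.sh
```

The lock is released automatically when the subshell exits, so a crashed run does not leave a stale lock behind.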

When using rsync for directory synchronization, one common frustration occurs when dealing with renamed files or directories. The default behavior treats renames as separate operations: deletion of the old name and creation under the new name. This results in unnecessary data transfers, especially problematic under bandwidth constraints.

Given your example directory structures:

# Source:
/mymounts/test1/some stuff/
new directory  newfile1.txt  newfile3.txt  renamedFile.txt

# Destination: 
/mymounts/test2/some stuff/
new directory  newfile1.txt  newfile2.txt  newfile3.txt

The standard rsync command:

rsync -zvru --bwlimit=1024 /mymounts/test1/ /mymounts/test2

Would unnecessarily transfer the contents of renamedFile.txt as if it were a new file, rather than recognizing it as simply a renamed version of newfile2.txt.

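Rsync's --fuzzy (-y) option does not detect renames as such, but it lets rsync reuse the data of the old file:

rsync -zvruy --bwlimit=1024 /mymounts/test1/ /mymounts/test2

This makes rsync:

  1. Notice each file that is missing at the destination
  2. Look in the same destination directory for a basis file with the same size and modification time, or a similar name
  3. Use that basis file in the delta transfer and write the result under the new name; the old name stays until a delete option removes it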
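If an actual rename at the destination is wanted before rsync runs, a small pre-pass can match files by content hash. The following is a minimal sketch, assuming bash 4 or later and GNU coreutils (sha256sum); apply_renames is a hypothetical helper, not an rsync feature. It hashes every file on both sides, so it is best run where both trees are cheap to read.

```shell
#!/bin/bash
# apply_renames SRC DST  (hypothetical helper, not part of rsync)
# Moves destination files to the name their content has in the source,
# so a later rsync run finds them already in place.
apply_renames() {
    local src=${1%/} dst=${2%/} path rel hash new
    declare -A by_hash=()
    # Index source files by SHA-256; the value is the path relative to SRC.
    while IFS= read -r -d '' path; do
        hash=$(sha256sum < "$path" | cut -d' ' -f1)
        by_hash[$hash]=${path#"$src"/}
    done < <(find "$src" -type f -print0)
    # Look at destination files whose name no longer exists in the source.
    while IFS= read -r -d '' path; do
        rel=${path#"$dst"/}
        if [ -e "$src/$rel" ]; then
            continue
        fi
        hash=$(sha256sum < "$path" | cut -d' ' -f1)
        new=${by_hash[$hash]:-}
        # Rename only when the content matches and the new name is free.
        if [ -n "$new" ] && [ ! -e "$dst/$new" ]; then
            mkdir -p "$(dirname "$dst/$new")"
            mv -- "$path" "$dst/$new"
            echo "renamed: $rel -> $new"
        fi
    done < <(find "$dst" -type f -print0)
}

# Example: apply_renames /mymounts/test1 /mymounts/test2
```

Files inside a renamed directory are handled one by one; the emptied old directory is left for a later rsync run with a delete option to remove.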

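To also remove the old name from the destination, add --delete-after, which deletes only once the transfer has finished so the old file can still serve as a fuzzy basis:

rsync -zvruy --delete-after --bwlimit=1024 /mymounts/test1/ /mymounts/test2

This ensures:

  • Files under their old names remain available as basis files during the transfer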
  • Truly deleted files are removed from destination
  • Bandwidth limits are respected

To archive replaced and deleted files in /mymounts/VerControl, add --backup together with a delete option:

rsync -zvruy --delete-after --backup --backup-dir=/mymounts/VerControl \
--bwlimit=1024 /mymounts/test1/ /mymounts/test2

This approach provides:

  1. Cheaper transfer of renamed files
  2. Automatic archiving of overwritten and deleted files
  3. The most recent previous version of each file; a later run overwrites an older backup stored under the same name
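To keep more than one previous version, each run can write its backups to a separate timestamped directory. This is a minimal sketch under that assumption; sync_with_history and the directory naming scheme are hypothetical choices, while --backup and --backup-dir are the rsync options already used above.

```shell
#!/bin/bash
# sync_with_history SRC DST BACKUP_BASE  (hypothetical helper name)
# Each run stores replaced and deleted files in its own timestamped
# directory under BACKUP_BASE, so earlier backups are not overwritten.
sync_with_history() {
    local src=${1%/}/ dst=$2 base=${3%/}
    local stamp
    stamp=$(date +%Y%m%d_%H%M%S)
    rsync -ru --fuzzy --delete-after --backup \
        --backup-dir="$base/$stamp" "$src" "$dst"
}

# Example: sync_with_history /mymounts/test1 /mymounts/test2 /mymounts/VerControl
```

The backup directories grow without bound, so some pruning of old timestamps would be needed for long-term use.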

With large directories, --fuzzy can increase CPU usage. For better performance:

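rsync -zvruy --no-whole-file --inplace \
--bwlimit=1024 /mymounts/test1/ /mymounts/test2

Key optimizations:

  • --no-whole-file: Forces the delta-transfer algorithm, which rsync turns off by default when both paths are local
  • --inplace: Writes updates directly into the destination file instead of a temporary copy, reducing disk I/O and space, at the risk of a partially updated file if the transfer is interrupted
  • -y (--fuzzy) is given once; repeating it changes its meaning and is not needed here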