How to Use rsync with find to Transfer Files Modified After a Specific Date



When managing backups or synchronizing files between systems, we often need to transfer only files modified after a certain date. This is particularly useful for incremental backups or when dealing with large directories where transferring everything would be inefficient.

A common approach combines find, which filters files by modification time, with rsync, which handles the transfer. rsync interprets every entry read via --files-from relative to the source directory, so find should run from inside that directory and print relative paths:

(cd /source/path && find . -type f -newermt "2023-10-01" -print0) | rsync -av --files-from=- --from0 /source/path /destination/path

Let's examine each component:

  • -type f: Only look for files (exclude directories)
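  • -newermt "2023-10-01": Files modified after 00:00 local time on October 1, 2023 (so files changed on October 1 itself are included)
  • -print0: Null-terminated output, so filenames containing spaces or newlines are handled safely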
  • --files-from=-: Read file list from stdin
  • --from0: Input is null-terminated
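
To try the pipeline safely, the following is a minimal sandbox sketch. It assumes bash and a find that supports -newermt (GNU or BSD); the directory layout and file names are made up for illustration, and the rsync step only runs if rsync is installed.

```shell
#!/usr/bin/env bash
# Sandbox demo: everything happens inside a temporary directory.
set -eu

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/src/sub dir" "$demo/dst"

# One file older than the cutoff, two newer (one has spaces in its path).
touch -t 202309151200 "$demo/src/old.txt"
touch -t 202310051200 "$demo/src/new.txt"
touch -t 202310061200 "$demo/src/sub dir/also new.txt"

# List paths relative to the source root, as --files-from expects.
selected=$(cd "$demo/src" && find . -type f -newermt "2023-10-01" | sort)
printf '%s\n' "$selected"

# Feed the same selection to rsync, null-terminated, if rsync is available.
if command -v rsync >/dev/null 2>&1; then
    (cd "$demo/src" && find . -type f -newermt "2023-10-01" -print0) |
        rsync -a --files-from=- --from0 "$demo/src" "$demo/dst"
    ls -R "$demo/dst"
fi
```

Only new.txt and "sub dir/also new.txt" should be selected; old.txt falls before the cutoff and is left out.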

For files modified in the last 3 days:

(cd /data/projects && find . -type f -mtime -3 -print0) | rsync -av --files-from=- --from0 /data/projects backup-server:/backups

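For a specific age range (between 3 and 7 days old). find truncates a file's age to whole days before comparing, so -mtime +2 means "at least 3 full days old", while -mtime +3 would skip files that are between 3 and 4 days old:

(cd /var/log && find . -type f -mtime +2 -mtime -7 -print0) | rsync -av --files-from=- --from0 /var/log archive-server:/old-logs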
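
The whole-day truncation behind -mtime is easy to get wrong, so here is a small sketch showing which file ages each expression selects. It assumes GNU findutils and coreutils (touch -d with relative dates, find -printf); the ages helper and the file names are hypothetical.

```shell
#!/usr/bin/env bash
set -eu

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT

# Create files whose names record their age in hours.
for hours in 1 50 80 100 200; do
    touch -d "$hours hours ago" "$demo/$(printf 'age_%03dh' "$hours")"
done

# Hypothetical helper: print the names matched by the given find tests.
ages() {
    (cd "$demo" && find . -type f "$@" -printf '%f\n' | sort | paste -sd ' ' -)
}

echo "-mtime -3           : $(ages -mtime -3)"            # age_001h age_050h
echo "-mtime +3 -mtime -7 : $(ages -mtime +3 -mtime -7)"  # age_100h
echo "-mtime +2 -mtime -7 : $(ages -mtime +2 -mtime -7)"  # age_080h age_100h
```

The 80-hour-old file counts as 3 whole days, so it is only picked up by -mtime +2, not by -mtime +3.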

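Directory structure is preserved automatically, because --files-from implies --relative. To also transfer directories themselves (for example, newly created empty ones), list them alongside the files. Listed directories are created without their contents, since -a does not imply -r when --files-from is used:

(cd /source && find . \( -type f -o -type d \) -newermt "2023-10-01" -print0) | rsync -av --files-from=- --from0 /source /destination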

Exclude certain file patterns:

(cd /source && find . -type f -newermt "2023-10-01" ! -name "*.tmp" -print0) | rsync -av --files-from=- --from0 /source /destination
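
Exclusions can be stacked. The sketch below, assuming a find that supports -newermt, combines several ! -name tests with -prune to skip an entire directory; the cache directory and file names are invented for the demo, and -print would become -print0 when piping into rsync.

```shell
#!/usr/bin/env bash
set -eu

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/src/cache" "$demo/src/docs"
touch "$demo/src/docs/report.txt" "$demo/src/docs/scratch.tmp" \
      "$demo/src/docs/notes.bak" "$demo/src/cache/blob.bin"

# -prune stops find from descending into ./cache at all;
# each ! -name test then drops one filename pattern from what remains.
kept=$(cd "$demo/src" && find . -path ./cache -prune -o \
        -type f -newermt "2023-10-01" ! -name '*.tmp' ! -name '*.bak' -print | sort)
printf '%s\n' "$kept"
```

Pruning is usually cheaper than filtering afterwards, because find never walks the excluded directory.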

For very large directories, you might want to:

  • Run find during off-peak hours
  • Consider using -maxdepth to limit directory traversal
  • Add -xdev to find to stay on one filesystem (rsync's -x does the same when rsync itself recurses)

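Note that rsync has no built-in option for filtering by modification time; flags such as --min-age and --max-age belong to rclone, not rsync. The closest rsync feature is --update (-u), which skips files that are newer on the receiver, but it compares against the destination rather than against a fixed date:

rsync -av --update /source/ /destination/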

To recap the core technique: rsync alone cannot filter by date, so find identifies the files modified after the target date and passes the list to rsync:

(cd /source/path && find . -type f -newermt "2023-10-01" -print0) | rsync -av --files-from=- --from0 /source/path /destination/path

Let's break this down:

  • -newermt "2023-10-01" finds files newer than October 1, 2023
  • -print0 emits null-terminated names, so spaces and newlines in filenames are handled correctly
  • --files-from=- tells rsync to read files from stdin
  • --from0 indicates null-terminated input

To sync files modified in the last 3 days:

(cd /home/user/documents && find . -type f -mtime -3 -print0) | rsync -av --files-from=- --from0 /home/user/documents /backup/location

Here, -mtime -3 finds files modified within the last 3 days (72 hours).

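--files-from already implies --relative, so every path in the list is recreated under the destination. If you pass absolute paths and use / as the source root, the full source path is preserved (the command below creates /destination/source/...). Note that -newermt "3 days ago" relies on GNU find's date parsing:

find /source -type f -newermt "3 days ago" -print0 | rsync -av --relative --files-from=- --from0 / /destination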
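
The choice of source root decides the layout on the destination. This sketch, assuming bash and an installed rsync (the transfer is skipped otherwise), copies the same file twice: once with a relative list rooted at the tree, once with an absolute list rooted at /. The directory names are hypothetical.

```shell
#!/usr/bin/env bash
set -eu

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/src/a" "$demo/dst_rel" "$demo/dst_abs"
touch "$demo/src/a/file.txt"

if command -v rsync >/dev/null 2>&1; then
    # Relative list, root = the source tree: result is dst_rel/a/file.txt
    (cd "$demo/src" && find . -type f -print0) |
        rsync -a --files-from=- --from0 "$demo/src" "$demo/dst_rel"

    # Absolute list, root = /: the whole source path is recreated under dst_abs
    find "$demo/src" -type f -print0 |
        rsync -a --files-from=- --from0 / "$demo/dst_abs"

    (cd "$demo" && find dst_rel dst_abs -type f | sort)
fi
```

The relative form is usually what a backup wants; the absolute form is handy when one destination collects files from several unrelated source trees.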

For more control, you can:

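  1. Exclude certain file types:
    (cd /source && find . -type f -newermt "2023-10-01" ! -name "*.tmp" -print0) | rsync -av --files-from=- --from0 /source /dest
    
  2. Include recently modified directories as well (they are created without their contents, because -a does not imply -r with --files-from; --dirs is already implied and shown only for clarity):
    (cd /source && find . \( -type f -o -type d \) -newermt "2023-10-01" -print0) | rsync -av --dirs --files-from=- --from0 /source /dest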
    
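A few practical notes:

  • Test with --dry-run first to verify which files will be copied
  • find walks the entire tree on every run, so on large trees this can be slower than a plain rsync, which already skips files whose size and modification time are unchanged
  • Remember that -mtime counts whole 24-hour periods back from the current time, not calendar days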
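
One way to make the dry-run habit automatic is a small wrapper that only transfers when explicitly told to. The function below is a hypothetical helper, not part of rsync: sync_newer and the APPLY variable are names invented here, and it assumes bash plus a find that supports -newermt.

```shell
#!/usr/bin/env bash
# sync_newer SRC DST CUTOFF [extra rsync options...]
# Hypothetical helper: performs a dry run unless APPLY=1 is set.
sync_newer() {
    local src=$1 dst=$2 cutoff=$3
    shift 3
    # Prepend --dry-run to any extra options unless the caller opted in.
    if [ "${APPLY:-0}" != 1 ]; then
        set -- --dry-run "$@"
    fi
    (cd "$src" && find . -type f -newermt "$cutoff" -print0) |
        rsync -av "$@" --files-from=- --from0 "$src" "$dst"
}

# Preview first, then transfer for real:
#   sync_newer /data/projects /backups "2023-10-01"
#   APPLY=1 sync_newer /data/projects /backups "2023-10-01"
```

Defaulting to the harmless mode means a mistyped cutoff or destination shows up in the preview rather than in the backup.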

While less precise, you can approximate date filtering with rsync's pattern matching:

rsync -av --include="*/" --include="*_202310[0-9][0-9]_*" --exclude="*" /source/ /dest/

This would match files with dates in October 2023 in their names.
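
Before relying on a name pattern, it can help to preview what it matches. For this particular pattern, find's -name glob should behave like the rsync include rule, since both treat * and [0-9] the same way within a single file name; that equivalence is an assumption of this sketch, and the log file names are made up.

```shell
#!/usr/bin/env bash
set -eu

demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/logs"
touch "$demo/logs/app_20231015_full.log" \
      "$demo/logs/app_20230915_full.log" \
      "$demo/logs/app_2023101_x.log"

# Only names containing _202310 followed by exactly two digits and _ match.
matched=$(cd "$demo" && find . -type f -name '*_202310[0-9][0-9]_*' | sort)
printf '%s\n' "$matched"
```

The September file and the file with a truncated date are both rejected, which also shows the limit of this method: it trusts the name, not the actual modification time.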