How to Use cp Command to Only Copy Newer or Modified Files in Linux


When you copy files between directories or disks with the plain cp command in Linux, every file is copied again and existing destination files are overwritten. This is inefficient for incremental backups or directory synchronization, where only some files have changed.

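GNU cp, the version shipped with most Linux distributions, has a built-in -u (--update) flag for this case:

cp -u -r /source/directory/. /destination/directory/

The trailing /. copies the contents of the source directory. Without it, cp places a nested directory/ inside a destination that already exists.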

Key behaviors of the -u flag:

  • Copies files that don't exist in the destination
  • Overwrites a destination file only if the source has a newer modification time
  • Leaves destination files untouched when they are the same age as the source or newer
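
The three behaviors above can be tried safely in a throwaway directory. This is a minimal sketch assuming GNU cp; the file names and the temporary directory are made up for illustration.

```shell
#!/bin/sh
# Throwaway demo of `cp -u`; every path below is temporary and hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/dst"

# Case 1: source is newer than destination, so it should be overwritten.
echo "new content" > "$demo/src/changed.txt"
echo "old content" > "$demo/dst/changed.txt"
touch -t 202001010000 "$demo/dst/changed.txt"

# Case 2: destination is newer than source, so it should be left alone.
echo "source copy" > "$demo/src/kept.txt"
touch -t 202001010000 "$demo/src/kept.txt"
echo "local edit" > "$demo/dst/kept.txt"

# Case 3: file is missing in the destination, so it should be copied.
echo "brand new" > "$demo/src/added.txt"

cp -u -r "$demo/src/." "$demo/dst/"

cat "$demo/dst/changed.txt"   # prints "new content"
cat "$demo/dst/kept.txt"      # prints "local edit"
cat "$demo/dst/added.txt"     # prints "brand new"
```

Note that -u compares modification times only. It does not compare file contents, so a destination file with a newer timestamp is kept even if its content differs.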

For more robust file synchronization, consider using rsync:

rsync -avh --update /source/directory/ /destination/directory/

Benefits over cp -u:

  • Can show per-file progress when run with --progress
  • Over a network, transfers only the changed parts of each file
  • Provides more control through additional flags such as --delete, --exclude, and --dry-run

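Imagine you're backing up your development project daily. After the initial full backup, subsequent backups should only transfer changed files. For -u to skip anything, each run must target the same destination directory; copying into a fresh, empty directory transfers every file again:

# First full backup
mkdir -p /backups/myapp
cp -r ~/projects/myapp/. /backups/myapp/

# Next day's incremental backup into the same directory
cp -u -r ~/projects/myapp/. /backups/myapp/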

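You can check which files were actually copied by adding the -v (verbose) flag. Some GNU coreutils releases also print a "skipped" line for each file that -u leaves alone, so the grep filter removes those: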

cp -uvr /source/ /destination/ | grep -v "skipped"
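
To see what that verbose listing looks like, the sketch below runs the same command on two hypothetical files, one changed and one unchanged. It assumes GNU cp; the exact wording of the verbose lines can differ between versions, so the example only checks which file names appear.

```shell
#!/bin/sh
# Throwaway demo; paths and file names are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/dst"

# report.txt changed since the last copy: the destination is older.
echo "v2" > "$demo/src/report.txt"
echo "v1" > "$demo/dst/report.txt"
touch -t 202001010000 "$demo/dst/report.txt"

# notes.txt is unchanged: -p gives the destination the same timestamp.
echo "same" > "$demo/src/notes.txt"
cp -p "$demo/src/notes.txt" "$demo/dst/notes.txt"

# Capture the verbose listing, dropping any "skipped" lines.
copied=$(cp -uvr "$demo/src/." "$demo/dst/" | grep -v "skipped")
printf '%s\n' "$copied"   # mentions report.txt but not notes.txt
```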

When performing disk-to-disk backups or synchronizing directories, we often need to copy only files that have changed or are newer than their destination counterparts. The cp command defined by POSIX has no option for this, but there are several effective solutions, including the GNU-specific -u flag.

The most robust tool for this task is rsync, which is specifically designed for efficient file transfers:

rsync -av --update source_directory/ destination_directory/

Key options:
- -a: Archive mode (recursive; preserves permissions, timestamps, symlinks, and, where permitted, ownership)
- -v: Verbose output
- --update: Skip files whose destination copy is newer than the source

For systems without rsync, you can combine find and cp:

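find source_dir -type f -exec sh -c '
  for file do
    # Map source_dir/sub/file to destination_dir/sub/file
    dest="destination_dir/${file#source_dir/}"
    # Copy when the destination is missing or older than the source
    if [ ! -f "$dest" ] || [ "$file" -nt "$dest" ]; then
      mkdir -p "$(dirname "$dest")"   # create missing subdirectories first
      cp -p "$file" "$dest"
    fi
  done
' sh {} +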

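If you're using GNU coreutils (common on Linux systems), cp has a built-in update flag:

cp -rup source_directory/. destination_directory/

Using source_directory/. rather than source_directory/* also copies hidden files, which the shell glob skips by default.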

Where:
- -u: Copy only when the source is newer than the destination or the destination is missing
- -r: Recursive copy
- -p: Preserve file attributes
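
One caveat when the source is given as a `*` glob: the shell expands it before cp runs, and by default the expansion skips hidden files. The following sketch, using hypothetical file names in a temporary directory and assuming GNU cp, compares the glob form with the `source/.` form.

```shell
#!/bin/sh
# Throwaway demo; paths and file names are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/glob_dst" "$demo/dot_dst"
echo "visible" > "$demo/src/app.conf"
echo "hidden"  > "$demo/src/.env"

# Shell glob: `*` does not match names that start with a dot.
cp -rup "$demo/src/"* "$demo/glob_dst/"

# `src/.` names the directory contents, dotfiles included.
cp -rup "$demo/src/." "$demo/dot_dst/"

ls -A "$demo/glob_dst"   # lists app.conf only
ls -A "$demo/dot_dst"    # lists .env and app.conf
```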

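For large directories:
1. rsync is usually the best choice over a network, where its delta-transfer algorithm sends only the changed parts of files; for local copies it transfers whole files by default
2. The find method is slower because it starts extra processes for each copied file, but it needs only find, sh, and cp (the -nt test is supported by most shells)
3. GNU cp -u offers a good balance of simplicity and speed for local copies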

For more control over the copy process:

rsync -avh --progress --delete --update source/ destination/

Additional options:
- --delete: Remove files in destination not present in source (destructive; preview first with --dry-run)
- --progress: Show transfer progress
- -h: Human-readable output
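
Because --delete removes files, it can be worth previewing a run first. The sketch below adds rsync's -n (--dry-run) flag so that nothing is changed; the directory and file names are hypothetical, and the rsync call is guarded in case the tool is not installed.

```shell
#!/bin/sh
# Throwaway demo; paths and file names are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/source" "$demo/destination"
echo "keep"  > "$demo/source/keep.txt"
echo "stale" > "$demo/destination/stale.txt"   # exists only in the destination

if command -v rsync >/dev/null 2>&1; then
  # -n (--dry-run) reports what would be copied or deleted without doing it.
  rsync -avhn --delete --update "$demo/source/" "$demo/destination/"
else
  echo "rsync not installed; skipping preview"
fi

# The dry run changed nothing: stale.txt is still the only file here.
ls "$demo/destination"
```

Dropping -n from the same command performs the real synchronization, including the deletions.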

Special scenarios to consider:

# Preserve hard links
rsync -avH --update source/ destination/

# Handle sparse files efficiently
rsync -avS --update source/ destination/

# Stay on one filesystem (do not descend into other mount points)
rsync -avx --update source/ destination/