Local Large Directory Copy: cp vs rsync for Preserving Permissions and Symlinks



When dealing with large local directory copies (1.8TB in this case), the choice between cp and rsync boils down to three key requirements:

  • Preservation of permissions and ownership (uid/gid)
  • Proper handling of symbolic links
  • Performance considerations for massive data transfers

For maintaining permissions and ownership, both tools can work but with different approaches:

# cp command with archive and no-dereference flags
cp -a source/ destination/

# Equivalent rsync command
rsync -aHAX source/ destination/

The -a flag in both commands preserves:

  1. File permissions
  2. Ownership and group information
  3. Timestamps
  4. Symbolic links (as links, not dereferenced)
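To see what -a preserves in practice, here is a minimal sketch on a throwaway tree. It assumes GNU coreutils (stat -c, mktemp -d); the $work directory and the file names are made up for illustration.

```shell
# Build a scratch tree (hypothetical names) with a restricted file and a symlink.
work=$(mktemp -d)
mkdir -p "$work/src"
echo "data" > "$work/src/file.txt"
chmod 640 "$work/src/file.txt"
ln -s file.txt "$work/src/link"

# Archive copy: recursive, keeps mode and timestamps, copies links as links.
# Ownership of files owned by other users is only preserved when run as root.
cp -a "$work/src" "$work/dst"

# Collect what arrived at the destination.
dst_mode=$(stat -c '%a' "$work/dst/file.txt")    # octal permission bits
dst_target=$(readlink "$work/dst/link")           # where the copied link points
src_mtime=$(stat -c '%Y' "$work/src/file.txt")   # modification time, epoch seconds
dst_mtime=$(stat -c '%Y' "$work/dst/file.txt")

echo "mode=$dst_mode link target=$dst_target"
echo "src mtime=$src_mtime dst mtime=$dst_mtime"
```

Running the same checks against a tree copied with rsync -a should give the same picture for these four attributes.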

For initial local copies where the destination is empty:

Metric              | cp                                 | rsync
Raw speed           | Usually faster (plain read/write)  | Usually slower (file-list building, separate sender/receiver processes)
Memory usage        | Lower                              | Higher (file list maintenance)
Progress reporting  | None                               | Available with --progress or --info=progress2

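If choosing rsync, avoid options that add work to a purely local copy:

rsync -aHAX --whole-file source/ destination/

Notes on the relevant options:

  • --whole-file (-W): Skips the delta-transfer algorithm and copies each file in full. rsync already defaults to this when both paths are local; its opposite, --no-whole-file, forces delta transfer, which helps over a network but slows local copies.
  • --inplace: Writes directly to destination files instead of to a temporary file that is renamed into place. It gains little when the destination is empty and leaves partially written files behind if the copy is interrupted.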

After copy completion, verify integrity with:

# Compare file contents recursively (reads both trees in full, so expect this to be slow at 1.8TB)
diff -r source/ destination/

# Check permission preservation
ls -l source/file.txt
ls -l destination/file.txt

# Verify symlinks
readlink source/symlink
readlink destination/symlink
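Checking one file and one symlink by hand does not scale to a whole tree. One possible approach, sketched below, is to dump a metadata manifest for each tree and diff the two. It assumes GNU find (for -printf); the manifest helper, the $work directory, and the demo files are hypothetical names used only to make the sketch runnable.

```shell
# Print one line per entry: relative path, type, octal mode, uid, gid, symlink target.
# "manifest" is a hypothetical helper; -printf requires GNU find.
manifest() {
    (cd "$1" && find . -printf '%P\t%y\t%m\t%U\t%G\t%l\n' | LC_ALL=C sort)
}

# Scratch demo tree so the sketch runs on its own.
work=$(mktemp -d)
mkdir -p "$work/source/sub"
echo "hello" > "$work/source/sub/file.txt"
chmod 600 "$work/source/sub/file.txt"
ln -s sub/file.txt "$work/source/symlink"
cp -a "$work/source" "$work/destination"

manifest "$work/source" > "$work/source.list"
manifest "$work/destination" > "$work/destination.list"

# diff exits 0 when the two manifests are identical.
if diff "$work/source.list" "$work/destination.list" > "$work/metadata.diff"; then
    metadata_status="identical"
else
    metadata_status="differs"
fi
echo "metadata: $metadata_status"
```

For a real copy, point manifest at the actual source and destination paths. Timestamps are left out of the manifest here; adding %T@ to the format string would include modification times as well.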

For this specific case (local copy, empty destination, need for permission preservation):

# Best option
cp -a source/ destination/

# Alternative with progress reporting
rsync -aHAX --info=progress2 source/ destination/

The simpler cp -a will generally be faster while meeting all requirements. Reserve rsync for cases needing:

  • Progress reporting
  • Potential resume capability
  • More detailed verification

Looking at the same copy in more detail, the choice between cp and rsync for a large directory tree (1.8TB in this case) on local storage depends on several technical considerations:

  • Preservation of file metadata (permissions, ownership, timestamps)
  • Handling of symbolic links
  • Performance considerations for large transfers
  • Verification mechanisms

The standard cp command requires specific flags to maintain metadata:

cp -a /source /destination  # -a preserves all attributes (same as -dR --preserve=all)

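With rsync, nothing is preserved unless you ask for it; -a covers the common attributes, while hard links, ACLs, and extended attributes additionally need -H, -A, and -X:

rsync -a /source/ /destination/  # -a is -rlptgoD (recursive, links, perms, times, group, owner, devices and special files)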

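For initial local copies, cp can be slightly faster because it skips rsync's file-list building and sender/receiver process overhead. The figures below are indicative only; actual times depend on disk throughput and file count: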

time cp -a big_directory/ backup/  # Average 1.8TB transfer: ~2.5 hours
time rsync -a big_directory/ backup/ # Average 1.8TB transfer: ~2.8 hours

However, rsync offers significant advantages for subsequent runs by only copying changed files.

When dealing with complex directory structures:

# Preserving hard links (both commands support this; cp -a already implies --preserve=links)
cp -R --preserve=links /source /destination
rsync -aH /source/ /destination/

# Handling sparse files efficiently
cp --sparse=always /source/large_file /destination/
rsync -S /source/large_file /destination/
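Whether hard links and holes actually survived can be checked after the copy. The sketch below does this on a scratch tree, assuming GNU coreutils (stat -c, truncate); the $work directory and file names are invented for the example.

```shell
work=$(mktemp -d)
mkdir -p "$work/src"

# Two names for one inode (a hard link) and a 10 MiB file that is entirely a hole.
echo "shared" > "$work/src/a"
ln "$work/src/a" "$work/src/b"
truncate -s 10M "$work/src/sparse.img"

# -a already implies --preserve=links; --sparse=always recreates holes where possible.
cp -a --sparse=always "$work/src" "$work/dst"

# The same inode number for both names means the hard link survived the copy.
inode_a=$(stat -c '%i' "$work/dst/a")
inode_b=$(stat -c '%i' "$work/dst/b")

# Apparent size in bytes (%s) versus blocks actually allocated (%b).
sparse_size=$(stat -c '%s' "$work/dst/sparse.img")
sparse_blocks=$(stat -c '%b' "$work/dst/sparse.img")

echo "a inode=$inode_a, b inode=$inode_b"
echo "sparse.img: $sparse_size bytes apparent, $sparse_blocks blocks allocated"
```

If the two inode numbers differ, the copy duplicated the data instead of linking it. A block count far below what the apparent size would need suggests the holes were kept, although the exact number depends on the filesystem.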

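rsync provides built-in checksum comparison that cp lacks. Adding -n (dry run) and -i (itemize changes) reports differences without copying anything:

rsync -ac /source/ /destination/  # -c compares file checksums instead of size and mtime
rsync -ac --checksum-choice=xxh128 /source/ /destination/  # Faster hash; needs a recent rsync 3.2.x built with xxHash support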
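Because cp has no verification step of its own, a tool-independent alternative is to hash the source tree once and check the destination against that list. This is only a sketch, assuming GNU coreutils and findutils (sha256sum, sort -z, xargs -0); the $work directory and sample files are hypothetical.

```shell
# Scratch demo tree so the sketch runs on its own.
work=$(mktemp -d)
mkdir -p "$work/source/sub"
echo "alpha" > "$work/source/a.txt"
echo "beta"  > "$work/source/sub/b.txt"
cp -a "$work/source" "$work/destination"

# Hash every regular file in the source, with paths relative to the tree root.
(cd "$work/source" && find . -type f -print0 | sort -z | xargs -0 sha256sum) > "$work/source.sha256"

# Re-hash the destination against that list; --quiet prints only mismatches.
if (cd "$work/destination" && sha256sum --check --quiet "$work/source.sha256"); then
    verify_status="ok"
else
    verify_status="mismatch"
fi
echo "verify: $verify_status"
```

This reads every file in both trees, so on 1.8TB it takes roughly as long as the copy itself. It only covers regular file contents, not permissions or symlink targets.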

For a one-time local copy into an empty destination where verification isn't required, cp -a is sufficient. In other cases, especially when:

  • You want progress reporting or checksum-based comparison
  • You might need to resume interrupted transfers
  • Future incremental updates are possible

rsync -aHAX is the better choice despite marginally longer initial transfer times.