When dealing with large local directory copies (1.8TB in this case), the choice between cp and rsync boils down to three key requirements:
- Preservation of permissions and ownership (uid/gid)
- Proper handling of symbolic links
- Performance considerations for massive data transfers
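For maintaining permissions and ownership, both tools work, each with its own flags:
# cp in archive mode (-a implies -d, so symlinks are not dereferenced)
# Note: with GNU cp, if destination/ already exists this creates destination/source;
# use "source/." as the first argument to copy only the contents
cp -a source/ destination/
# Roughly equivalent rsync command (-H hard links, -A ACLs, -X extended attributes)
rsync -aHAX source/ destination/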
The -a flag in both commands preserves:
- File permissions
- Ownership and group information
- Timestamps
- Symbolic links (as links, not dereferenced)
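The list above can be checked on a small scratch tree before committing to the full 1.8TB copy. The following is a minimal sketch, assuming GNU coreutils (`stat -c` and `touch -d` are GNU-specific); every path and file name in it is hypothetical.

```shell
#!/bin/sh
# Scratch-tree check that cp -a keeps mode, mtime, and symlinks.
# All paths are hypothetical and live under a temporary directory.
work=$(mktemp -d)
mkdir -p "$work/source/sub"
echo "payload" > "$work/source/sub/data.txt"
chmod 640 "$work/source/sub/data.txt"
touch -d '2020-01-02 03:04:05' "$work/source/sub/data.txt"
ln -s sub/data.txt "$work/source/link"

# destination does not exist yet, so it becomes a copy of source
cp -a "$work/source" "$work/destination"

src_mode=$(stat -c '%a' "$work/source/sub/data.txt")
dst_mode=$(stat -c '%a' "$work/destination/sub/data.txt")
src_mtime=$(stat -c '%Y' "$work/source/sub/data.txt")
dst_mtime=$(stat -c '%Y' "$work/destination/sub/data.txt")
dst_link=$(readlink "$work/destination/link")

echo "mode:    $src_mode -> $dst_mode"
echo "mtime:   $src_mtime -> $dst_mtime"
echo "symlink: link -> $dst_link"
```

Ownership is not exercised here: changing a file's owner requires root, so a uid/gid check only means something when both the test and the real copy run as root.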
For initial local copies where the destination is empty:
Metric | cp | rsync |
---|---|---|
Raw speed | Faster (single process, plain read/write) | Slower (separate sender and receiver processes, plus a whole-file checksum on each transferred file) |
Memory usage | Lower | Higher (file list maintenance) |
Progress reporting | None | Available with --progress |
If choosing rsync, keep it tuned for a pure local copy:
rsync -aHAX --whole-file --inplace source/ destination/
Key optimizations:
- --whole-file: Skips the delta-transfer algorithm. This is already rsync's default when both paths are local; --no-whole-file would force delta transfers, which help over a network but slow local copies.
- --inplace: Writes directly to destination files instead of to a temporary file that is then renamed into place
After copy completion, verify integrity with:
# Compare directory trees, including file contents (slow on 1.8TB, since every file is read twice)
diff -r source/ destination/
# Check permission preservation
ls -l source/file.txt
ls -l destination/file.txt
# Verify symlinks
readlink source/symlink
readlink destination/symlink
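Spot checks with ls and readlink do not scale to a large tree. One way to compare metadata for every entry is to build a manifest per tree and diff the two. The sketch below assumes GNU find (for `-printf`) and file names without embedded newlines; the `manifest` helper and the scratch paths are hypothetical.

```shell
#!/bin/sh
# Build a metadata manifest for a tree: type, mode, uid, gid, symlink target, path.
# Assumes GNU find (-printf) and file names without embedded newlines.
manifest() {
    (cd "$1" && find . -printf '%y %m %U %G %l\t%P\n' | LC_ALL=C sort)
}

# Hypothetical scratch trees standing in for source/ and destination/.
work=$(mktemp -d)
mkdir -p "$work/source/dir"
echo "hello" > "$work/source/dir/file.txt"
chmod 600 "$work/source/dir/file.txt"
ln -s dir/file.txt "$work/source/symlink"
cp -a "$work/source" "$work/destination"

manifest "$work/source" > "$work/source.manifest"
manifest "$work/destination" > "$work/destination.manifest"

# diff prints any entry whose type, mode, owner, group, or link target differs.
if diff "$work/source.manifest" "$work/destination.manifest"; then
    echo "metadata matches"
else
    echo "metadata differs"
fi
```

Timestamps are left out of the manifest to keep it short; adding `%T@` to the format string would include modification times as well.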
For this specific case (local copy, empty destination, need for permission preservation):
# Best option
cp -a source/ destination/
# Alternative with progress reporting
rsync -aHAX --info=progress2 source/ destination/
The simpler cp -a will generally be faster while meeting all requirements. Reserve rsync for cases needing:
- Progress reporting
- Potential resume capability
- More detailed verification
When dealing with a large directory tree (1.8TB in this case) on local storage, the choice between cp and rsync boils down to several technical considerations:
- Preservation of file metadata (permissions, ownership, timestamps)
- Handling of symbolic links
- Performance considerations for large transfers
- Verification mechanisms
The standard cp command requires specific flags to maintain metadata:
cp -a /source /destination # -a preserves all attributes (same as -dR --preserve=all)
With rsync, -a preserves the core metadata, though not hard links, ACLs, or extended attributes (add -H, -A, and -X for those):
rsync -a /source/ /destination/ # -a includes -rlptgoD (recursive, links, perms, times, group, owner, devices and special files)
For initial local copies, cp can be slightly faster because it skips rsync's per-file checksum and its sender/receiver process overhead:
time cp -a big_directory/ backup/ # Average 1.8TB transfer: ~2.5 hours
time rsync -a big_directory/ backup/ # Average 1.8TB transfer: ~2.8 hours
However, rsync offers significant advantages for subsequent runs by only copying changed files.
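The incremental behaviour is easy to observe on a scratch tree. This is a minimal sketch, assuming rsync is on the PATH; the `count_transferred` helper and the directory names are hypothetical. It counts how many regular files each run actually transfers.

```shell
#!/bin/sh
# Count regular files rsync transfers on repeated runs.
# Requires rsync on PATH; paths are hypothetical scratch directories.
count_transferred() {
    # --itemize-changes prints one line per changed item; lines starting
    # with ">f" are regular files being transferred to the destination.
    rsync -a --itemize-changes "$1" "$2" | grep -c '^>f'
}

work=$(mktemp -d)
mkdir -p "$work/big_directory"
echo "one" > "$work/big_directory/a.txt"
echo "two" > "$work/big_directory/b.txt"

first=$(count_transferred "$work/big_directory/" "$work/backup/")
second=$(count_transferred "$work/big_directory/" "$work/backup/")
echo "three" >> "$work/big_directory/a.txt"   # size changes, so rsync notices
third=$(count_transferred "$work/big_directory/" "$work/backup/")

echo "first run: $first, unchanged rerun: $second, after one edit: $third"
```

The unchanged rerun should transfer nothing because rsync's default quick check skips files whose size and modification time already match.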
When dealing with complex directory structures:
# Preserving hard links (both commands support this)
cp -R --preserve=links /source /destination # -R is required to copy a directory
rsync -aH /source/ /destination/ # -H alone does not recurse; combine it with -a
# Handling sparse files efficiently
cp --sparse=always /source/large_file /destination/
rsync -S /source/large_file /destination/
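Whether hard links and sparseness survived a copy can be confirmed with stat. The following is a small sketch, assuming GNU coreutils (`truncate`, `stat -c`, `cp --sparse`); all file names are hypothetical.

```shell
#!/bin/sh
# Check that hard links and sparseness survive a copy.
# Assumes GNU coreutils (truncate, stat -c, cp --sparse); paths are hypothetical.
work=$(mktemp -d)
mkdir "$work/source"
echo "shared" > "$work/source/original"
ln "$work/source/original" "$work/source/hardlink"
truncate -s 64M "$work/source/large_file"   # 64 MiB apparent size, no data written

cp -a --sparse=always "$work/source" "$work/destination"

# The same inode number for both names means the hard link was preserved.
inode_a=$(stat -c '%i' "$work/destination/original")
inode_b=$(stat -c '%i' "$work/destination/hardlink")

# %s is the apparent size in bytes, %b the number of allocated blocks.
src_sparse=$(stat -c '%s bytes, %b blocks' "$work/source/large_file")
dst_sparse=$(stat -c '%s bytes, %b blocks' "$work/destination/large_file")

echo "inodes in copy: $inode_a $inode_b"
echo "source large_file:      $src_sparse"
echo "destination large_file: $dst_sparse"
```

The allocated block count depends on the filesystem, so compare the two block figures with each other rather than against a fixed number.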
rsync provides built-in checksum comparison that cp lacks:
rsync -ac /source/ /destination/ # Compare files by checksum instead of size and mtime
rsync -ac --checksum-choice=xxh128 /source/ /destination/ # Faster xxHash checksum (needs a recent rsync built with xxHash support)
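To see what -c adds, the sketch below plants a difference that keeps the same size and modification time, then compares a default dry run with a checksum dry run. It assumes rsync is on the PATH and GNU or BSD touch with `-r`; all paths are hypothetical scratch directories.

```shell
#!/bin/sh
# Show a difference that the default size+mtime check misses but -c catches.
# Requires rsync on PATH; all paths are hypothetical scratch directories.
work=$(mktemp -d)
mkdir "$work/source"
echo "good data" > "$work/source/file.bin"
rsync -a "$work/source/" "$work/destination/"

# Simulate silent corruption: same length, same mtime, different bytes.
printf 'bad  data\n' > "$work/destination/file.bin"
touch -r "$work/source/file.bin" "$work/destination/file.bin"

# -n makes both runs dry runs, so nothing is modified.
quick=$(rsync -an --itemize-changes "$work/source/" "$work/destination/" | grep -c '^>f')
by_checksum=$(rsync -anc --itemize-changes "$work/source/" "$work/destination/" | grep -c '^>f')

echo "flagged by size+mtime: $quick, flagged by checksum: $by_checksum"
```

Used this way, a checksum dry run acts as a read-only verification pass: any file it lists differs in content between the two trees.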
For one-time local copies where absolute performance is critical and verification isn't required, cp -a is sufficient. For all other cases, especially when:
- Metadata preservation is crucial
- You might need to resume interrupted transfers
- Future incremental updates are possible
rsync -a is the superior choice despite marginally longer initial transfer times.