How to Copy Files While Preserving Directory Structure in Linux/Unix Systems


3 views

When working with version control systems like SVN, developers often need to copy modified files to another location while maintaining their original directory hierarchy. The naive approach of simply using cp fails because it doesn't automatically create missing parent directories in the target path.

Many developers first attempt solutions like:

svn status -q | awk '{ print $2 }' | xargs -d \\\\n -I '{}' cp '{}' /tmp/xen/'{}'

This fails because:

  1. The target path concatenation isn't handled properly
  2. cp doesn't automatically create intermediate directories
  3. Path parsing can break with spaces or special characters

Here are three robust approaches to solve this problem:

1. Using tar (Recommended)

This method preserves all attributes and handles complex directory structures:

svn status -q | awk '{ print $2 }' | xargs tar cf - | (cd /tmp/xen/; tar xvf -)

2. With rsync

For more control over the copy operation:

svn status -q | awk '{ print $2 }' | xargs -d '\n' -I {} rsync -R {} /tmp/xen/

3. Using cp with mkdir

A more manual approach with shell scripting:

svn status -q | awk '{ print $2 }' | while read file; do
    mkdir -p "/tmp/xen/$(dirname "$file")"
    cp "$file" "/tmp/xen/$file"
done

For production use, consider these additional factors:

  • File names with spaces or special characters
  • Symbolic links preservation
  • Permission maintenance
  • Large directory trees

The tar method generally performs best for:

  • Large numbers of files
  • Deep directory structures
  • Distributed systems


When working with SVN repositories, we often need to copy modified files while preserving their directory structure. The naive approach:

svn status -q | awk '{ print $2 }' | xargs -d \\\\n -I '{}' cp '{}' /tmp/xen/

fails because it doesn't create the necessary parent directories. Attempting to include the path in the destination:

cp '{}' /tmp/xen/'{}'

results in "no such file or directory" errors since cp doesn't automatically create directories.

The most elegant solution uses tar to preserve directory structures:

svn status -q | awk '{ print $2 }' | xargs tar cf - | (cd /tmp/xen/; tar xvf -)

This approach:

  • Lists modified files from SVN
  • Creates a tar archive on stdout
  • Extracts to target directory while preserving paths

For those preferring rsync (though slightly less efficient for this specific case):

svn status -q | awk '{ print $2 }' | rsync -av --files-from=- . /tmp/xen/

Or using GNU cp with --parents flag (Linux only):

svn status -q | awk '{ print $2 }' | xargs -I '{}' cp --parents '{}' /tmp/xen/

The tar pipe method is particularly efficient because:

  1. It handles the entire operation in a single pipeline
  2. No intermediate files are created
  3. Minimal process spawning compared to multiple cp commands

To make the command more robust:

svn status -q | awk '{ print $2 }' | grep -v '^$' | xargs tar cf - | (cd /tmp/xen/ && tar xvf -)

The added grep filters empty lines, and using && ensures the extraction only happens if cd succeeds.

For a Python project where you want to copy only modified .py files:

svn status -q | awk '$2 ~ /\\.py$/ { print $2 }' | xargs tar cf - | (cd /tmp/pybackup/; tar xvf -)

This demonstrates how to combine file type filtering with the directory preservation technique.