When managing large directory structures in Linux, there are frequent scenarios where we need to:
- Migrate folder hierarchies between storage locations
- Prepare template directory structures for new projects
- Create test environments mirroring production folder layouts
The naive approach of cp -r would copy all the files as well, which becomes problematic when dealing with thousands of files across hundreds of directories.
This is a reliable approach using only standard tools:
cd /source/path && find . -type d -exec mkdir -p /destination/path/{} \;
Breakdown:
cd /source/path
: Switches into the source first, so find emits relative paths; without this, the full source path would be recreated under the destination
find . -type d
: Only matches directories, searching from the current (source) directory
-exec mkdir -p
: Creates each directory (with parents if needed)
/destination/path/{}
: Maintains the relative path structure (note: substituting {} inside a longer argument is a widely supported extension rather than strict POSIX)
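A quick way to sanity-check the pattern is a throwaway tree; the paths below are made up for illustration:

```shell
# Build a small sample hierarchy; src/dst stand in for the real paths
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/app/models" "$src/app/views" "$src/logs"
touch "$src/app/models/user.txt"   # a file that should NOT be replicated

# Replicate directories only, keeping paths relative via cd
(cd "$src" && find . -type d -exec mkdir -p "$dst/{}" \;)

find "$dst" -type f | wc -l   # prints 0: no files were copied
```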
When you need more control over permissions and attributes:
rsync -a -f"+ */" -f"- *" /source/path/ /destination/path/
Advantages:
- Preserves original timestamps and permissions
- Filters can be adjusted for more complex inclusion/exclusion
- Efficient for very large directory trees
Let's say you want to recreate a web application's directory structure without copying node_modules:
# Create staging environment (cd first so only relative paths land in ~/staging)
cd /var/www/production && find . -type d -not -path "*node_modules*" -exec mkdir -p ~/staging/{} \;
# Verify structure
tree -d ~/staging | head -15
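The exclusion can be exercised on a disposable tree as well; the directory names here are hypothetical stand-ins:

```shell
proj=$(mktemp -d)     # stands in for /var/www/production
stage=$(mktemp -d)    # stands in for ~/staging
mkdir -p "$proj/src/controllers" "$proj/node_modules/lodash"

# -not -path prunes any directory whose path mentions node_modules
(cd "$proj" && find . -type d -not -path "*node_modules*" -exec mkdir -p "$stage/{}" \;)
```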
For directories with thousands of subfolders:
- Method 1 (find) is generally faster for simple cases
- Method 2 (rsync) is better when maintaining metadata is crucial
- Both methods consume minimal disk space as they don't copy file contents
For extremely large directory trees (10k+ folders), use GNU parallel:
cd /source && find . -type d -print0 | parallel -0 mkdir -p /destination/{}
This can speed up the process on multi-core systems, though directory creation is metadata-bound, so the actual gain depends on the filesystem.
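If GNU parallel isn't installed, xargs -P offers similar concurrency with a tool that ships almost everywhere (a sketch with throwaway paths; -P is a GNU/BSD extension rather than POSIX):

```shell
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/a/b/c" "$src/d"

# -0 pairs with -print0 to handle unusual filenames; -P 4 runs up to
# four mkdir processes at once. mkdir -p tolerates a parent being
# created concurrently by a sibling job.
(cd "$src" && find . -type d -print0 | xargs -0 -P 4 -I {} mkdir -p "$dst/{}")
```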
When working with large codebases or data projects, we often need to replicate directory structures without transferring the actual files. This becomes crucial when:
- Setting up identical project structures for new team members
- Creating test environments without duplicating large datasets
- Migrating directory hierarchies between servers with limited storage
For complete directory structure replication, rsync is the most robust tool:
rsync -av -f"+ */" -f"- *" /source/path/ /destination/path/
Breakdown of parameters:
-a
: Archive mode (preserves attributes)
-v
: Verbose output
-f"+ */"
: Include only directories
-f"- *"
: Exclude all files
For systems without rsync, or when needing more control (note that the -printf format below requires GNU find):
find /source/path -type d -printf "%P\n" | xargs -I {} mkdir -p "/destination/path/{}"
This pipeline:
- Finds all directories recursively
- Prints relative paths
- Creates matching directory structure
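A minimal check of the %P trick, again with made-up temp paths:

```shell
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/data/raw" "$src/data/processed"

# %P strips the starting-point prefix, so only relative paths reach mkdir
find "$src" -type d -printf "%P\n" | xargs -I {} mkdir -p "$dst/{}"
```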
When dealing with thousands of directories:
- Rsync is generally faster for large hierarchies
- The find approach may be more memory-efficient on low-resource systems
- For extremely large structures, consider splitting the operation
Here's how I recently used this technique to migrate a Node.js project structure:
# Preserve the node_modules structure but skip actual files
rsync -av -f"+ */" -f"- *" ./old_project/node_modules/ ./new_project/node_modules/
# Verify the structure
tree -d ./new_project/node_modules | head -10
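Since tree isn't installed everywhere, comparing sorted find listings verifies a replica using only standard tools. A sketch with made-up paths:

```shell
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/node_modules/express/lib"
touch "$src/node_modules/express/lib/router.js"

# Replicate the directory skeleton
(cd "$src" && find . -type d -exec mkdir -p "$dst/{}" \;)

# Identical directory listings mean the structure was replicated faithfully
a=$(mktemp)
b=$(mktemp)
(cd "$src" && find . -type d | sort) > "$a"
(cd "$dst" && find . -type d | sort) > "$b"
diff "$a" "$b" && echo "directory structures match"
```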