When working with directory structures in Linux, you'll often encounter scenarios where you need to compress each subdirectory into its own archive. The standard tar -czf approach creates a single archive containing all subdirectories, which isn't always what you want.
Here's the most efficient method I've found after years of sysadmin work:
find /path/to/directory -maxdepth 1 -mindepth 1 -type d -exec tar -czf {}.tar.gz {} \;
Let's break down why this works:
- -maxdepth 1: only descends to immediate subdirectories
- -mindepth 1: excludes the parent directory itself
- -type d: only matches directories
- -exec ... \;: runs tar once for each directory found
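To make the effect concrete, here's a hypothetical before-and-after (the directory names are made up):
# Given this layout:
#   /path/to/directory/alpha/
#   /path/to/directory/beta two/
# the command creates, alongside each subdirectory:
#   /path/to/directory/alpha.tar.gz
#   /path/to/directory/beta two.tar.gz
Note that tar stores each member under the path it was given (minus the leading slash), so these archives contain path/to/directory/alpha/... rather than just alpha/.... Run find from inside the parent directory, or use tar's -C option, if you want archives rooted at the subdirectory name.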
For those who prefer more readable shell scripts:
for dir in /path/to/directory/*/; do
    dirname=$(basename "$dir")
    # -C switches into the parent first, so the archive contains just "dirname/..."
    tar -czf "${dirname}.tar.gz" -C /path/to/directory "$dirname"
done
Some edge cases to consider:
# For directories with spaces (NUL-delimited, so any file name is safe):
find . -maxdepth 1 -mindepth 1 -type d -print0 | while IFS= read -r -d '' dir; do
    # Strip find's leading "./" so the archive gets a clean name
    tar -czf "${dir#./}.tar.gz" "$dir"
done
# Excluding certain directories:
find . -maxdepth 1 -mindepth 1 -type d ! -name "exclude_this*" -exec tar ... \;
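One plausible completion of that command, reusing the tar invocation from earlier (the "exclude_this*" pattern is illustrative):
find . -maxdepth 1 -mindepth 1 -type d ! -name "exclude_this*" -exec tar -czf {}.tar.gz {} \;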
When dealing with thousands of subdirectories, consider:
- Using GNU parallel to process multiple directories simultaneously across CPU cores (example below)
- Adding --use-compress-program=pigz for faster, multi-threaded compression (requires pigz to be installed); see the sketch after the parallel example
- Using tar's --exclude option (or -X/--exclude-from with a pattern file) to skip unnecessary files
# Parallel processing example:
find . -maxdepth 1 -mindepth 1 -type d | parallel -j 4 'tar -czf {}.tar.gz {}'
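Here's a sketch of the pigz and exclusion variants; it assumes GNU tar and that pigz is on the PATH, and the *.log pattern is purely illustrative:
# Multi-threaded compression via pigz (drop -z; --use-compress-program
# replaces gzip as the compression filter):
find . -maxdepth 1 -mindepth 1 -type d -exec tar --use-compress-program=pigz -cf {}.tar.gz {} \;
# Skipping unnecessary files inside each archive:
find . -maxdepth 1 -mindepth 1 -type d -exec tar --exclude='*.log' -czf {}.tar.gz {} \;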
Always verify your archives:
for archive in *.tar.gz; do
    if ! tar -tzf "$archive" >/dev/null; then
        echo "Corrupt archive: $archive" >&2
    fi
done
This pattern of compressing each subdirectory into its own archive file, while preserving the original directory hierarchy, is particularly useful for:
- Creating incremental backups
- Distributing modular components
- Preparing datasets for transfer
For more complex scenarios with nested directories, we might want an archive for every directory at every depth, each named after its full path:
find path/to/parent -mindepth 1 -type d -exec sh -c 'tar -czvf "${1%/}.tar.gz" "$1"' _ {} \;
Because each directory name is passed to sh as a positional parameter rather than spliced into the command string, this approach properly handles directories containing spaces and special characters; -mindepth 1 keeps the parent directory itself from being archived.
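As a quick illustration with a hypothetical tree (the names are made up):
# Given:
#   path/to/parent/a/
#   path/to/parent/a/b/
# the command creates:
#   path/to/parent/a.tar.gz      (contains a/, including a/b/)
#   path/to/parent/a/b.tar.gz    (contains b/ only)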
For frequent use, consider adding this helper to your .bashrc:
# Archive each immediate subdirectory of the given path
function tar-subdirs() {
    if [ -z "$1" ]; then
        echo "Usage: tar-subdirs /path/to/directory"
        return 1
    fi
    find "$1" -maxdepth 1 -mindepth 1 -type d -print0 | while IFS= read -r -d '' dir; do
        tar -czvf "${dir}.tar.gz" "$dir"
    done
}
Usage becomes simply:
tar-subdirs /path/to/directory
As before, verify the resulting archives with the tar -tzf loop shown earlier before relying on them.