When working with large sets of files in Linux/Unix environments, we often need to compress each file into its own archive so that every archive name is derived from the original filename. This differs from standard bulk compression, where multiple files are bundled into a single archive.
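For contrast, the standard bulk approach looks like this (the archive name all_files.tar.gz is arbitrary, chosen here only for illustration):
# Bulk compression: every file in the directory ends up inside one archive
tar -czvf all_files.tar.gz *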
The simplest solution uses a basic for loop in bash:
for file in *; do
    if [ -f "$file" ]; then
        tar -czvf "${file}.tar.gz" "$file"
    fi
done
For more robust handling of filenames with spaces or special characters:
find . -maxdepth 1 -type f -print0 | while IFS= read -r -d '' file; do
    tar -czvf "${file}.tar.gz" "$file"
done
When dealing with thousands of files, GNU parallel can significantly speed up the process:
find . -maxdepth 1 -type f -print0 | parallel -0 -j8 'tar -czvf {}.tar.gz {}'
To prevent re-compressing existing .tar.gz files:
for file in *; do
    if [ -f "$file" ] && [[ ! "$file" =~ \.tar\.gz$ ]]; then
        tar -czvf "${file}.tar.gz" "$file"
    fi
done
For different compression needs, consider these variations:
# Using bzip2 compression
for f in *; do [ -f "$f" ] && tar -cjvf "$f.tar.bz2" "$f"; done
# Using xz compression (higher compression ratio, slower)
for f in *; do [ -f "$f" ] && tar -cJvf "$f.tar.xz" "$f"; done
# Creating uncompressed tar archives
for f in *; do [ -f "$f" ] && tar -cvf "$f.tar" "$f"; done
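If your tar build includes zstd support (newer GNU tar releases, with zstd available on the system), the same pattern extends to zstd; the .tar.zst suffix below is a common convention, not a requirement:
# Using zstd compression (fast, good ratio); requires tar built with zstd support
for f in *; do [ -f "$f" ] && tar --zstd -cvf "$f.tar.zst" "$f"; done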
For regular use, create a reusable bash script:
#!/bin/bash
# Compress every regular file in the target directory into its own .tar.gz,
# skipping files that are already gzipped tar archives.
if [ $# -ne 1 ]; then
    echo "Usage: $0 directory"
    exit 1
fi
cd "$1" || exit 1
for file in *; do
    if [ -f "$file" ] && [[ ! "$file" =~ \.tar\.gz$ ]]; then
        echo "Compressing $file..."
        tar -czvf "${file}.tar.gz" "$file"
    fi
done
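Assuming the script is saved as compress_each.sh (a filename chosen here for illustration) and made executable, typical usage looks like:
chmod +x compress_each.sh
./compress_each.sh /var/log/myapp   # hypothetical target directory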
This technique becomes essential when working with log files, data exports, or any collection of files where each one needs to be compressed separately while keeping its original name. Common scenarios include:
- Preparing attachments for email systems with size limits
- Archiving server logs by individual date (see the sketch after this list)
- Creating compressed backups of configuration files
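For the log-archiving scenario, here is a minimal sketch, assuming rotated logs named like app-2024-01-15.log (adjust the glob to match your own naming scheme):
# Compress each day's rotated log into its own archive, skipping days already archived
for log in app-????-??-??.log; do
    [ -f "$log" ] && [ ! -e "${log}.tar.gz" ] && tar -czf "${log}.tar.gz" "$log"
done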
At its core, the most straightforward solution is still a simple bash loop:
#!/bin/bash
for file in *; do
[[ -f "$file" ]] && tar -czvf "${file}.tar.gz" "$file"
done
For production environments, we should add proper validation:
#!/bin/bash
shopt -s nullglob    # expand to nothing (instead of a literal *) in an empty directory
for file in *; do
    if [[ -f "$file" ]]; then
        if [[ ! -e "${file}.tar.gz" ]]; then
            if tar -czf "${file}.tar.gz" "$file"; then
                echo "Successfully compressed: $file"
            else
                echo "Failed to compress: $file" >&2
            fi
        else
            echo "Skipped (archive exists): ${file}.tar.gz" >&2
        fi
    fi
done
Using GNU parallel significantly speeds up compression for directories with many files:
#!/bin/bash
compress_file() {
    local file="$1"
    tar -czf "${file}.tar.gz" "$file" && echo "Compressed $file" || echo "Failed $file" >&2
}
export -f compress_file
# NUL-delimited filenames keep spaces and newlines safe, matching the earlier find examples
find . -maxdepth 1 -type f -not -name "*.tar.gz" -print0 | parallel -0 compress_file
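If GNU parallel is not installed, xargs with the -P option (a GNU/BSD extension rather than strict POSIX) gives a similar effect; a rough sketch:
# Roughly equivalent with xargs: up to 8 jobs at once, NUL-delimited filenames
find . -maxdepth 1 -type f -not -name "*.tar.gz" -print0 | \
    xargs -0 -P 8 -I{} tar -czf "{}.tar.gz" "{}"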
For cross-platform compatibility or integration with larger Python applications:
import os
import tarfile
for filename in os.listdir('.'):
    if os.path.isfile(filename) and not filename.endswith('.tar.gz'):
        with tarfile.open(f"{filename}.tar.gz", "w:gz") as tar:
            tar.add(filename)
        print(f"Created {filename}.tar.gz")
To skip certain files (like existing archives or temporary files):
#!/bin/bash
exclude_patterns=("*.tar.gz" "*.tmp" "temp_*")
for file in *; do
    skip=false
    for pattern in "${exclude_patterns[@]}"; do
        # $pattern is left unquoted so it is treated as a glob pattern, not a literal string
        if [[ "$file" == $pattern ]]; then
            skip=true
            break
        fi
    done
    if [[ "$skip" == false && -f "$file" ]]; then
        tar -czf "${file}.tar.gz" "$file"
    fi
done
After creating archives, always verify them:
#!/bin/bash
for archive in *.tar.gz; do
    if tar -tzf "$archive" >/dev/null 2>&1; then
        echo "Verified: $archive"
    else
        echo "Corrupted: $archive" >&2
    fi
done