ZFS snapshots are immutable by design; this is both their greatest strength and an occasional pain point. When you need to permanently remove sensitive data or reclaim space held by thousands of snapshots, standard filesystem operations won't cut it. The rm command only affects the live filesystem, leaving the data intact in every existing snapshot.
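To see the problem in numbers, take a snapshot, delete the files from the live filesystem, and compare the space accounting. A minimal sketch, using the same pool/dataset names as below:
zfs snapshot pool/dataset@before
rm -rf /pool/dataset/cache                        # only removes the files from the live filesystem
zfs list -o name,used,referenced pool/dataset     # REFER shrinks, USED barely moves
zfs get usedbysnapshots pool/dataset              # the "freed" blocks are now charged to snapshots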
To actually free space, we need to reconstruct the snapshot history without the unwanted files. Here's the technical workflow:
# List all snapshots for the dataset
zfs list -H -t snapshot -r pool/dataset | awk '{print $1}'
# For each snapshot:
# 1. Create a clone
zfs clone pool/dataset@snapshot pool/clone_temp
# 2. Delete target files in clone
rm -rf /pool/clone_temp/cache/*
# 3. Promote the clone (the origin snapshot and any older ones move to the clone)
zfs promote pool/clone_temp
# 4. Create a new snapshot of the cleaned data
zfs snapshot pool/clone_temp@cleaned
# 5. Replace the original dataset with the clone, then drop the unwanted snapshot
#    (note: changes written to pool/dataset after @snapshot are discarded by this swap)
zfs destroy pool/dataset
zfs rename pool/clone_temp pool/dataset
zfs destroy pool/dataset@snapshot
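If you want to watch what the promote step actually does to the clone relationship, inspect the origin property around it (run before the rename in step 5). A sketch with the same hypothetical names:
zfs get -H -o value origin pool/clone_temp   # pool/dataset@snapshot before promotion
zfs promote pool/clone_temp
zfs get -H -o value origin pool/clone_temp   # now "-": the clone stands on its own
zfs get -H -o value origin pool/dataset      # now a snapshot owned by pool/clone_temp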
Here's a Bash script that automates cache directory removal across all snapshots:
#!/bin/bash
DATASET="pool/dataset"
TMP_CLONE="pool/temp_clone_$$"
TARGET_DIR="cache"
for SNAP in $(zfs list -H -o name -t snapshot -r "$DATASET"); do
    echo "Processing $SNAP"

    # Create temporary clone of this snapshot
    zfs clone "$SNAP" "$TMP_CLONE"

    # Remove target directories from the clone's mountpoint
    MOUNTPOINT=$(zfs get -H -o value mountpoint "$TMP_CLONE")
    rm -rf "${MOUNTPOINT:?}/${TARGET_DIR}"

    # Promote the clone. This moves $SNAP (and any older snapshots) onto the
    # clone and turns $DATASET into a dependent clone of it.
    zfs promote "$TMP_CLONE"

    # Take the cleaned snapshot on the promoted clone
    zfs snapshot "${TMP_CLONE}@${SNAP#*@}_cleaned"

    # Cleanup: the processed snapshot now belongs to the promoted clone.
    # These destroys fail while $DATASET still depends on it; swap the
    # datasets (destroy/rename) as in the manual workflow above first.
    zfs destroy "${TMP_CLONE}@${SNAP#*@}"
    zfs destroy "$TMP_CLONE"
done
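To spot-check the result, every mounted dataset exposes its snapshots read-only under the hidden .zfs/snapshot directory. Assuming the dataset is mounted at /pool/dataset, something like this confirms no cache directories survived:
for d in /pool/dataset/.zfs/snapshot/*/; do
    [ -e "${d}cache" ] && echo "cache still present in ${d}"
done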
For large datasets with many snapshots, consider these optimizations:
- Process snapshots in reverse chronological order (see the listing sketch below)
- Use zfs send | zfs receive for large-scale operations
- Perform operations during low-usage periods
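For the ordering point above, zfs list can sort on the creation property directly, so a loop can consume snapshots newest-first without extra plumbing:
# -S sorts descending by the given property (newest snapshots first)
zfs list -H -o name -t snapshot -r -S creation pool/dataset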
The community-developed zfs-remove-file tool provides a more streamlined solution:
git clone https://github.com/ewwhite/zfs-remove-file.git
cd zfs-remove-file
./zfs-remove-file -r /pool/dataset cache
Remember that these operations will temporarily consume additional disk space during the clone/promote process. Always ensure you have adequate free space before proceeding.
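A quick pre-flight capacity check, assuming the pool is simply named pool:
zpool list -o name,size,allocated,free,capacity pool
zfs list -o name,available pool/dataset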
When dealing with ZFS snapshots, traditional file deletion methods like rm -rf don't actually free disk space, because the data remains referenced by earlier snapshots. This becomes particularly problematic when you need to remove a specific directory pattern (such as "cache" folders) across many snapshots to reclaim space.
While the clone-promote-snapshot approach can hide files, it doesn't physically remove the data blocks from the pool. The original snapshot chain still holds references to the deleted files, preventing true space reclamation.
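You can quantify how much space a snapshot is actually pinning before touching anything: zfs destroy has a dry-run mode that only reports what would be reclaimed. For example, with a hypothetical snapshot name:
zfs destroy -nv tank/dataset@snap1
# prints "would reclaim ..." without destroying anything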
Here's a method that actually frees space by recreating snapshots without the unwanted files:
# 1. List all snapshots containing cache directories
#    (snapshot contents are visible read-only under .zfs/snapshot)
zfs list -H -o name -t snapshot tank/dataset | \
while read -r snap; do
    if find "/tank/dataset/.zfs/snapshot/${snap#*@}" -type d -name cache -print -quit | grep -q .; then
        echo "$snap"
    fi
done > affected_snapshots.txt
# 2. Process each snapshot chronologically
while read -r snap; do
    # Clone targets cannot contain '@'; derive a filesystem name
    clone="${snap%@*}_temp"

    # Create writable clone
    zfs clone "$snap" "$clone"

    # Remove cache directories inside the clone
    mp=$(zfs get -H -o value mountpoint "$clone")
    find "${mp:?}" -type d -name "cache" -exec rm -rf {} +

    # Take a cleaned snapshot on the clone
    zfs snapshot "${clone}@${snap#*@}_cleaned"

    # Promote the clone so it takes over the snapshot history. The original
    # snapshot now lives on the clone and the parent dataset depends on it,
    # so swap the datasets (destroy/rename) before this destroy can succeed.
    zfs promote "$clone"
    zfs destroy "${clone}@${snap#*@}"
done < affected_snapshots.txt
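Once the loop finishes, confirm the reclamation with the space breakdown view, which separates what the live dataset uses (USEDDS) from what the snapshots still pin (USEDSNAP):
zfs list -o space tank/dataset
# NAME  AVAIL  USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD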
For more complex cases, consider rebuilding a clean copy with ZFS replication. Note that a send stream is an opaque binary format, so it cannot be filtered in transit with tools like grep; the pruning has to happen on the receiving dataset:
# Seed a clean dataset from the first snapshot
zfs send tank/dataset@snap1 | \
    mbuffer -q -s 128k -m 1G | \
    zfs recv -u tank/clean_dataset

# Prune the unwanted directories on the copy, then re-create the snapshot cleanly
zfs mount tank/clean_dataset
rm -rf /tank/clean_dataset/cache
zfs destroy tank/clean_dataset@snap1
zfs snapshot tank/clean_dataset@snap1

# Later snapshots cannot be received incrementally once the copy has diverged;
# copy their contents across (e.g. from .zfs/snapshot) and snapshot the clean
# dataset again for each one.
- Always test with non-production data first
- Maintain proper backups before bulk operations, and consider placing holds on snapshots you must not lose (see the sketch after this list)
- Consider the performance impact of running this on a busy pool
- The process may temporarily require additional storage space
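As an extra safeguard while experimenting, ZFS holds prevent a snapshot from being destroyed until the hold is released. A sketch with a hypothetical tag name:
zfs hold keep tank/production@important      # destroying is refused while the hold exists
zfs holds tank/production@important          # list active holds
zfs release keep tank/production@important   # allow destruction again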
For regular maintenance, consider wrapping this in a script with proper error handling and logging:
#!/bin/bash
set -euo pipefail
DATASET="tank/production"
LOG_FILE="/var/log/zfs_cache_clean.log"
function log {
echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}
log "Starting cache directory cleanup"
MOUNTPOINT=$(zfs get -H -o value mountpoint "$DATASET")

zfs list -H -o name -t snapshot "$DATASET" | \
while read -r snap; do
    # Snapshot contents are visible read-only under .zfs/snapshot
    if find "${MOUNTPOINT}/.zfs/snapshot/${snap#*@}" -type d -name cache -print -quit | grep -q .; then
        log "Processing snapshot: $snap"
        TEMP_CLONE="${snap%@*}_temp_$(date +%s)"
        zfs clone "$snap" "$TEMP_CLONE" || {
            log "Clone failed for $snap"
            exit 1
        }
        CLONE_MP=$(zfs get -H -o value mountpoint "$TEMP_CLONE")
        find "${CLONE_MP:?}" -type d -name "cache" -exec rm -rf {} + || {
            log "Deletion failed in $TEMP_CLONE"
            exit 1
        }
        zfs promote "$TEMP_CLONE" || {
            log "Promote failed for $TEMP_CLONE"
            exit 1
        }
        NEW_SNAP="${TEMP_CLONE}@${snap#*@}_cleaned"
        zfs snapshot "$NEW_SNAP" || {
            log "Snapshot creation failed for $NEW_SNAP"
            exit 1
        }
        # After promotion the processed snapshot belongs to the clone and
        # $DATASET depends on it; swap the datasets (destroy/rename) as in
        # the manual workflow before this destroy can succeed.
        zfs destroy "${TEMP_CLONE}@${snap#*@}" || {
            log "Original snapshot destruction failed: ${TEMP_CLONE}@${snap#*@}"
            exit 1
        }
        log "Successfully processed $snap"
    fi
done
log "Cache cleanup completed successfully"