How to Permanently Delete Specific Files from All ZFS Snapshots


ZFS snapshots are immutable by design - this is both their greatest strength and an occasional pain point. When you need to permanently remove sensitive data or reclaim space across thousands of snapshots, standard filesystem operations won't cut it. The rm command only affects the live filesystem, leaving the data intact in every existing snapshot.
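
You can see why directly on any dataset with visible snapshots: after an rm, the file disappears from the live tree but remains readable through the hidden .zfs directory, and the snapshots' space accounting doesn't budge. A quick demonstration (file and snapshot names here are illustrative):

rm /pool/dataset/secrets.txt

# The file is gone from the live filesystem...
ls /pool/dataset/secrets.txt                                  # fails

# ...but every snapshot still serves it, and USED/REFER are unchanged
ls /pool/dataset/.zfs/snapshot/daily-2024-01-01/secrets.txt
zfs list -t snapshot -o name,used,refer -r pool/dataset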

To actually free space, we need to reconstruct the snapshot history without the unwanted files. Here's the technical workflow:

# List all snapshots for the dataset, oldest first
zfs list -H -o name -t snapshot -s creation -r pool/dataset

# For each snapshot:
# 1. Create a writable clone
zfs clone pool/dataset@snapshot pool/clone_temp

# 2. Delete the target files in the clone
rm -rf /pool/clone_temp/cache/*

# 3. Promote the clone (the original snapshot migrates onto it)
zfs promote pool/clone_temp

# 4. Create a new snapshot of the cleaned clone
zfs snapshot pool/clone_temp@cleaned

# 5. Destroy the original snapshot via its post-promote name
zfs destroy pool/clone_temp@snapshot
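
Once the original snapshot is gone, it's worth confirming that the blocks were actually released. A quick check, using the same illustrative names as above:

# USEDSNAP should drop once the old snapshot is destroyed
zfs list -o space pool/clone_temp
zpool list pool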

Here's a Bash script that automates cache directory removal across all snapshots:

#!/bin/bash
set -euo pipefail

DATASET="pool/dataset"
TARGET_DIR="cache"

# Process snapshots oldest-first: each promote then migrates only the
# snapshot currently being handled onto its clone
zfs list -H -o name -t snapshot -s creation -r "$DATASET" | \
while read -r SNAP; do
    echo "Processing $SNAP"
    SNAPNAME="${SNAP#*@}"
    TMP_CLONE="${SNAP%@*}_tmp_${SNAPNAME}"   # '@' is not valid in dataset names

    # Create a writable clone of the snapshot
    zfs clone "$SNAP" "$TMP_CLONE"

    # Remove the target directory (assumes the clone gets a real mountpoint)
    MOUNTPOINT=$(zfs get -H -o value mountpoint "$TMP_CLONE")
    rm -rf "${MOUNTPOINT:?}/${TARGET_DIR}"

    # Snapshot the cleaned state under a distinct name
    zfs snapshot "${TMP_CLONE}@${SNAPNAME}_cleaned"

    # Promote the clone; the original snapshot now belongs to it
    zfs promote "$TMP_CLONE"

    # Destroy the original snapshot via its post-promote name to free its
    # blocks; keep the clone, since it carries the cleaned snapshot
    zfs destroy "${TMP_CLONE}@${SNAPNAME}"
done

For large datasets with many snapshots, consider these optimizations:

  • Process snapshots oldest-first (creation order), so each promote migrates only the snapshot being handled
  • Use zfs send | zfs receive for large-scale rebuilds (see the sketch below)
  • Perform operations during low-usage periods
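
For the send/receive route, one minimal sketch is to snapshot the already-cleaned live filesystem and replicate that single point to a fresh dataset. Note that this deliberately drops the old snapshot history; the dataset names are illustrative:

zfs snapshot pool/dataset@migrate
zfs send pool/dataset@migrate | zfs recv -u pool/dataset_clean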

The community-developed zfs-remove-file tool provides a more streamlined solution:

git clone https://github.com/ewwhite/zfs-remove-file.git
cd zfs-remove-file
./zfs-remove-file -r /pool/dataset cache

Remember that these operations will temporarily consume additional disk space during the clone/promote process. Always ensure you have adequate free space before proceeding.
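
A quick way to confirm headroom before starting (pool name illustrative):

zpool list -o name,size,allocated,free,capacity pool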


When dealing with ZFS snapshots, traditional file deletion methods like rm -rf don't actually free up disk space because the data remains referenced in previous snapshots. This becomes particularly problematic when you need to remove specific directory patterns (like "cache" folders) across multiple snapshots to reclaim space.

Cloning and promoting by itself only hides the files from the new line of history; it doesn't physically remove the data blocks from the pool. The original snapshots still hold references to the deleted files, and true space reclamation happens only once those snapshots are destroyed.
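
You can watch those references directly: the usedbysnapshots property shows how much data is pinned by snapshots, and it only shrinks as the referencing snapshots are destroyed:

zfs get usedbysnapshots tank/dataset
zfs list -t snapshot -o name,used,refer tank/dataset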

Here's a method that actually frees space by recreating snapshots without the unwanted files:


# 1. List all snapshots whose contents include a cache directory,
#    checked through the dataset's hidden .zfs/snapshot tree
#    (assumes tank/dataset is mounted at /tank/dataset)
zfs list -H -o name -t snapshot -s creation tank/dataset | \
while read -r snap; do
    if [ -d "/tank/dataset/.zfs/snapshot/${snap#*@}/cache" ]; then
        echo "$snap"
    fi
done > affected_snapshots.txt

# 2. Process each snapshot chronologically
while read -r snap; do
    NAME="${snap#*@}"
    CLONE="${snap%@*}_temp_${NAME}"   # '@' is not valid in dataset names

    # Create a writable clone
    zfs clone "$snap" "$CLONE"

    # Remove cache directories inside the clone's mountpoint
    MNT=$(zfs get -H -o value mountpoint "$CLONE")
    find "$MNT" -type d -name "cache" -prune -exec rm -rf {} +

    # Snapshot the cleaned state under a distinct name
    zfs snapshot "${CLONE}@${NAME}_clean"

    # Promote the clone so it takes over the origin snapshot
    zfs promote "$CLONE"

    # Destroy the original snapshot (now attached to the clone); keep
    # the clone, since it carries the cleaned snapshot
    zfs destroy "${CLONE}@${NAME}"
done < affected_snapshots.txt
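
Once the loop finishes, verify that the pinned space was actually released before deleting affected_snapshots.txt:

zfs list -o name,used,usedbysnapshots tank/dataset
zfs list -H -o name -t snapshot tank/dataset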

For more complex cases, note that a zfs send stream is an opaque block-level format: it cannot be filtered per file, and piping it through grep would corrupt it. To rebuild history without certain paths, filter at the file level instead, for example with rsync reading each snapshot's contents from the hidden .zfs tree:


# Build a clean dataset by replaying each snapshot's contents,
# minus the cache directories (dataset and snapshot names illustrative)
zfs create tank/clean_dataset

for name in snap1 snap2; do
    rsync -a --delete --exclude='cache/' \
        "/tank/dataset/.zfs/snapshot/${name}/" /tank/clean_dataset/
    zfs snapshot "tank/clean_dataset@${name}"
done

Keep these precautions in mind:

  • Always test with non-production data first (see the rehearsal sketch after this list)
  • Maintain proper backups before bulk operations
  • Consider performance impact during heavy operations
  • The process may temporarily require additional storage space
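
A minimal rehearsal sketch, assuming the pool has room for a full copy: replicate the dataset and its snapshots to a throwaway name and run the cleanup there first (names illustrative):

zfs snapshot -r tank/dataset@rehearsal
zfs send -R tank/dataset@rehearsal | zfs recv tank/dataset_testbed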

For regular maintenance, consider wrapping this in a script with proper error handling and logging:


#!/bin/bash
set -euo pipefail

DATASET="tank/production"
MOUNTPOINT=$(zfs get -H -o value mountpoint "$DATASET")
LOG_FILE="/var/log/zfs_cache_clean.log"

function log {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}

log "Starting cache directory cleanup"

# Oldest-first, so each promote migrates only the snapshot in hand
zfs list -H -o name -t snapshot -s creation "$DATASET" | \
while read -r snap; do
    NAME="${snap#*@}"

    # Skip snapshots without a cache directory (checked via .zfs)
    [ -d "${MOUNTPOINT}/.zfs/snapshot/${NAME}/cache" ] || continue

    log "Processing snapshot: $snap"

    # '@' is not valid in dataset names, so derive the clone name
    # from the dataset and the snapshot's short name
    TEMP_CLONE="${DATASET}_temp_${NAME}"
    zfs clone "$snap" "$TEMP_CLONE" || {
        log "Clone failed for $snap"
        exit 1
    }

    CLONE_MNT=$(zfs get -H -o value mountpoint "$TEMP_CLONE")
    find "$CLONE_MNT" -type d -name "cache" -prune -exec rm -rf {} + || {
        log "Deletion failed in $TEMP_CLONE"
        exit 1
    }

    zfs snapshot "${TEMP_CLONE}@${NAME}_clean" || {
        log "Snapshot recreation failed for ${TEMP_CLONE}@${NAME}_clean"
        exit 1
    }

    # Promote the clone; the original snapshot migrates onto it
    zfs promote "$TEMP_CLONE" || {
        log "Promote failed for $TEMP_CLONE"
        exit 1
    }

    # Destroy the original snapshot via its post-promote name; the
    # clone is kept because it carries the cleaned snapshot
    zfs destroy "${TEMP_CLONE}@${NAME}" || {
        log "Original snapshot destruction failed: $snap"
        exit 1
    }

    log "Successfully processed $snap"
done

log "Cache cleanup completed successfully"

log "Cache cleanup completed successfully"