ZFS snapshots are incredibly useful for data protection, but without proper cleanup, they can accumulate and consume valuable storage space. Many administrators create snapshots automatically through cron jobs but neglect the equally important task of pruning old snapshots.
Here's a robust bash script that handles snapshot retention automatically. It works by:
- Listing snapshots for a given filesystem
- Sorting them chronologically
- Keeping the specified number of most recent snapshots
- Destroying the older ones
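#!/bin/bash
# Configuration
KEEP_DAILY=14   # Keep 14 daily snapshots
KEEP_WEEKLY=8   # Keep 8 weekly snapshots
DATASETS=("tank" "sastank")

for dataset in "${DATASETS[@]}"; do
    # Daily snapshots cleanup.
    # The ^ anchor stops "tank@" from also matching "sastank@".
    # sort -r puts the newest names first (names must sort chronologically);
    # tail then skips the snapshots to keep.
    # xargs -r (GNU) skips zfs destroy entirely when nothing is left to prune.
    zfs list -t snapshot -o name -H |
        grep "^${dataset}@AutoD-" |
        sort -r |
        tail -n +$((KEEP_DAILY + 1)) |
        xargs -r -n 1 zfs destroy -vr

    # Weekly snapshots cleanup
    zfs list -t snapshot -o name -H |
        grep "^${dataset}@AutoW-" |
        sort -r |
        tail -n +$((KEEP_WEEKLY + 1)) |
        xargs -r -n 1 zfs destroy -vr
done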
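One subtlety when filtering snapshots by name: the text tank@AutoD- also occurs inside sastank@AutoD-..., so an unanchored pattern for one pool can pick up the other pool's snapshots and skew the retention count. The short Python sketch below contrasts a substring test with a prefix test. The snapshot names and the helper names matches_substring and matches_prefix are made up for illustration.

```python
def matches_substring(name, dataset, tag):
    """Mimics an unanchored grep: true if the pattern appears anywhere."""
    return f"{dataset}@{tag}-" in name


def matches_prefix(name, dataset, tag):
    """Mimics an anchored grep (^pattern): true only at the start of the name."""
    return name.startswith(f"{dataset}@{tag}-")


# Hypothetical snapshot names, for illustration only.
names = [
    "tank@AutoD-2024-01-15",
    "sastank@AutoD-2024-01-15",
    "tank/home@AutoD-2024-01-15",
]

loose = [n for n in names if matches_substring(n, "tank", "AutoD")]
strict = [n for n in names if matches_prefix(n, "tank", "AutoD")]

print("substring match:", loose)   # includes the sastank snapshot
print("prefix match:   ", strict)  # only the tank snapshot
```

In the shell scripts, the equivalent of the prefix test is a grep pattern that starts with ^.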
For more granular control across different datasets, consider this enhanced version:
#!/bin/bash
# Requires bash 4+ (associative arrays and ${var^^})
declare -A RETENTION=(
    ["tank/daily"]=14
    ["tank/weekly"]=8
    ["sastank/daily"]=2
    ["sastank/weekly"]=4
)

for prefix in "${!RETENTION[@]}"; do
    IFS='/' read -r dataset frequency <<< "$prefix"
    keep=${RETENTION[$prefix]}

    # First letter of the frequency, upper-cased: daily -> D, weekly -> W
    letter=${frequency:0:1}
    letter=${letter^^}

    zfs list -t snapshot -o name -H |
        grep "^${dataset}@Auto${letter}-" |
        sort -r |
        tail -n +$((keep + 1)) |
        xargs -r -n 1 zfs destroy -vr
done
To run this automatically, add to your crontab:
# Clean up old snapshots daily at 3 AM
0 3 * * * /path/to/zfs_snapshot_cleanup.sh
Before deploying to production:
- Test with the -n flag first (dry run)
- Consider implementing snapshot locking during deletion
- Log deletions for audit purposes
- Monitor disk space after implementation
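The first and third points can be combined in a small wrapper. The sketch below is a rough illustration rather than a finished tool: destroy_command, destroy, and the logger name zfs_cleanup are hypothetical names, and the flags mirror the -vr used in the scripts above, with zfs destroy's -n added for a dry run. Dry run is the default, so a real deletion has to be requested explicitly.

```python
import logging
import subprocess

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("zfs_cleanup")  # hypothetical logger name


def destroy_command(snapshot, dry_run=True):
    """Build the argument list for zfs destroy; -n turns it into a dry run."""
    flags = "-nvr" if dry_run else "-vr"
    return ["zfs", "destroy", flags, snapshot]


def destroy(snapshot, dry_run=True, runner=subprocess.run):
    """Log the command for the audit trail, then hand it to runner.

    runner defaults to subprocess.run and can be swapped out in tests.
    """
    cmd = destroy_command(snapshot, dry_run)
    log.info("running: %s", " ".join(cmd))
    return runner(cmd, check=True)
```

Calling destroy("tank@AutoD-2024-01-01") only reports what would be removed. Passing dry_run=False performs the deletion.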
For those preferring Python:
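import subprocess
from datetime import datetime


def cleanup_snapshots(dataset, prefix, keep):
    # List every snapshot name. Filtering in Python avoids shell=True and
    # grep's non-zero exit status when nothing matches.
    names = subprocess.run(
        ['zfs', 'list', '-t', 'snapshot', '-o', 'name', '-H'],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()

    # Prefix match, so 'tank@' does not also match 'sastank@'.
    marker = f'{dataset}@{prefix}-'
    snapshots = [n for n in names if n.startswith(marker)]

    # Names are expected to end in a date, e.g. tank@AutoD-2024-01-15.
    sorted_snaps = sorted(
        snapshots,
        key=lambda x: datetime.strptime(x[len(marker):], '%Y-%m-%d'),
        reverse=True)

    for snap in sorted_snaps[keep:]:
        subprocess.run(['zfs', 'destroy', '-vr', snap], check=True)


# Usage
cleanup_snapshots('tank', 'AutoD', 14)
cleanup_snapshots('sastank', 'AutoD', 2)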
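Because the selection logic decides what gets destroyed, it helps to keep it in a pure function that can be checked without touching a pool. The following is one possible sketch. The name select_expired and the sample snapshot names are hypothetical, and it assumes names of the form dataset@prefix-YYYY-MM-DD, as in the example above.

```python
from datetime import datetime


def select_expired(names, dataset, prefix, keep, date_format="%Y-%m-%d"):
    """Return the snapshots to destroy: everything except the `keep` newest."""
    marker = f"{dataset}@{prefix}-"
    dated = []
    for name in names:
        if not name.startswith(marker):
            continue  # other dataset or other snapshot series
        try:
            stamp = datetime.strptime(name[len(marker):], date_format)
        except ValueError:
            continue  # suffix is not a date; leave such snapshots alone
        dated.append((stamp, name))
    dated.sort(reverse=True)  # newest first
    return [name for _, name in dated[keep:]]


# Hypothetical listing, for illustration only.
sample = [
    "tank@AutoD-2024-01-01",
    "tank@AutoD-2024-01-03",
    "tank@AutoD-2024-01-02",
    "tank@AutoD-manual",
    "sastank@AutoD-2024-01-01",
]
print(select_expired(sample, "tank", "AutoD", keep=2))
```

With this split, the code that calls zfs destroy only ever sees names that passed both the prefix check and the date parse.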
The recurring difficulty is balancing retention history against available storage. The variants below automate pruning using each snapshot's creation time instead of its name.
Before implementing any solution, ensure:
- You have zfs command line access
- Snapshots follow a consistent naming pattern
- You understand your retention requirements per dataset
Here's a bash script that handles automatic snapshot pruning:
#!/bin/bash
# Configuration
KEEP_DAILY=14   # Keep last 14 daily snapshots
KEEP_WEEKLY=8   # Keep last 8 weekly snapshots
DATASETS=("tank" "sastank")   # Array of datasets to manage

for dataset in "${DATASETS[@]}"; do
    # Different retention policies per dataset
    if [ "$dataset" == "tank" ]; then
        keep=$KEEP_WEEKLY
    else
        keep=$KEEP_DAILY
    fi

    # Get this dataset's own snapshots (-d 1), newest first (-S creation)
    snapshots=$(zfs list -H -d 1 -t snapshot -o name -S creation "$dataset" | grep "@AutoD-")

    # Count total snapshots. grep -c reports 0 for an empty list,
    # whereas echo "" | wc -l would report 1.
    total=$(printf '%s' "$snapshots" | grep -c "@AutoD-")

    # Calculate how many to delete
    to_delete=$((total - keep))

    if [ "$to_delete" -gt 0 ]; then
        echo "Deleting $to_delete old snapshots from $dataset"
        # The list is newest first, so the oldest entries are at the end
        echo "$snapshots" | tail -n "$to_delete" | while read -r snapshot; do
            echo "Destroying $snapshot"
            zfs destroy "$snapshot"
        done
    else
        echo "No snapshots to delete from $dataset (has $total, keeping $keep)"
    fi
done
For more precise control based on snapshot dates:
#!/bin/bash
# Keep snapshots newer than this many days
RETAIN_DAYS=14
DATASETS=("tank" "sastank")
CURRENT_DATE=$(date +%s)

for dataset in "${DATASETS[@]}"; do
    # Adjust retention per dataset
    if [ "$dataset" == "sastank" ]; then
        retain_days=2   # Only keep 2 days for sastank
    else
        retain_days=$RETAIN_DAYS
    fi

    # Get this dataset's own snapshots (-d 1) with creation time.
    # -p prints creation as epoch seconds, so no date string has to be parsed;
    # -H separates the two fields with a tab.
    zfs list -Hp -d 1 -t snapshot -o name,creation "$dataset" | grep "@AutoD-" |
    while IFS=$'\t' read -r snapshot creation_epoch; do
        age_days=$(( (CURRENT_DATE - creation_epoch) / 86400 ))
        if [ "$age_days" -gt "$retain_days" ]; then
            echo "Destroying $snapshot (age: $age_days days)"
            zfs destroy "$snapshot"
        fi
    done
done
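The age calculation is easy to get wrong by a day, so it can be worth checking on canned input. The sketch below assumes the listing comes from zfs list -Hp -o name,creation, which prints one tab-separated name and epoch timestamp per line. The function names parse_listing and older_than are hypothetical. It uses the same whole-day integer division and strict greater-than comparison as the shell script.

```python
import time

SECONDS_PER_DAY = 86400


def parse_listing(text):
    """Parse 'name<TAB>epoch' lines into (name, epoch) tuples."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        name, epoch = line.split("\t")
        rows.append((name, int(epoch)))
    return rows


def older_than(rows, retain_days, now=None):
    """Return names whose age in whole days exceeds retain_days."""
    now = int(time.time()) if now is None else now
    return [
        name
        for name, created in rows
        if (now - created) // SECONDS_PER_DAY > retain_days
    ]
```

Note that a snapshot exactly retain_days old is kept. Only snapshots at least one full day beyond the limit are returned.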
To run the cleanup weekly, add to crontab:
0 3 * * 0 /path/to/zfs_snapshot_cleanup.sh >> /var/log/zfs_snapshot_cleanup.log 2>&1
- Test scripts with echo before actual deletion
- Consider implementing snapshot locking during critical operations
- Maintain backup of your retention script
- Monitor disk space and adjust retention policies as needed
For those preferring not to write custom scripts:
- sanoid/syncoid: Popular Perl-based ZFS snapshot management
- pyznap: Python-based solution with more features
- zfs-auto-snapshot: Ubuntu package with built-in expiration