ZFS deduplication relies on the deduplication table (DDT), which stores a cryptographic checksum for every deduplicated block. For good performance the DDT must fit largely in RAM, making its size crucial for system planning. Each DDT entry typically consumes about 320 bytes of memory: a 256-bit SHA-256 checksum plus reference counts and block-pointer metadata.
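For capacity planning, you can turn that 320-byte figure into a rough RAM estimate from the amount of unique data you expect to store. This is a hedged sketch: the 320-byte entry size and the 128K average block size are assumptions, not measurements from your pool.

```python
def estimate_ddt_ram_bytes(unique_data_bytes, avg_block_size=128 * 1024,
                           bytes_per_entry=320):
    """Estimate DDT RAM: roughly one ~320-byte entry per unique block (assumption)."""
    unique_blocks = unique_data_bytes // avg_block_size
    return unique_blocks * bytes_per_entry

# 1 TiB of unique data at 128K recordsize -> 8,388,608 entries -> 2.5 GiB of RAM
ram = estimate_ddt_ram_bytes(1 * 1024**4)
print(ram / 1024**3, "GiB")  # 2.5 GiB
```

Smaller block sizes (e.g. 8K zvols) multiply the entry count, and therefore the RAM footprint, accordingly.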
The most accurate way to examine current DDT usage is through the zdb command. Here's the detailed procedure:
# Run as root or with sudo
sudo zdb -DD poolname
# Sample output analysis:
DDT-sha256-zap-duplicate: 1053 entries, size 308 on disk, 320 in core
DDT-sha256-zap-unique: 125478 entries, size 312 on disk, 320 in core
To calculate current RAM usage:
unique_entries = [value from DDT-sha256-zap-unique]
duplicate_entries = [value from DDT-sha256-zap-duplicate]
total_ram_usage = (unique_entries + duplicate_entries) * 320
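Plugging in the sample zdb output above, the arithmetic works out like this:

```python
unique_entries = 125478    # from DDT-sha256-zap-unique
duplicate_entries = 1053   # from DDT-sha256-zap-duplicate

total_ram_usage = (unique_entries + duplicate_entries) * 320
print(total_ram_usage)                  # 40489920 bytes
print(total_ram_usage / (1024 * 1024))  # ~38.6 MiB
```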
For a quicker (but less detailed) overview:
sudo zpool status -D poolname
# Output includes:
dedup: DDT entries 126531, size 318B on disk, 320B in core
Consider this Python script to automate DDT monitoring:
#!/usr/bin/env python3
import subprocess

def get_ddt_stats(pool):
    """Parse 'zdb -DD' output and return (total_entries, ram_mb)."""
    cmd = f"sudo zdb -DD {pool} | grep 'DDT-sha256'"
    output = subprocess.check_output(cmd, shell=True).decode()
    stats = {}
    for line in output.split('\n'):
        if 'entries' in line:
            # e.g. "DDT-sha256-zap-unique: 125478 entries, size 312 on disk, 320 in core"
            head = line.split(',')[0]                # "DDT-sha256-zap-unique: 125478 entries"
            entry_type = head.split(':')[0].strip()  # "DDT-sha256-zap-unique"
            entries = int(head.split()[1])           # 125478
            stats[entry_type] = entries
    total_entries = sum(stats.values())
    ram_mb = (total_entries * 320) / (1024 * 1024)
    return total_entries, ram_mb
If the DDT grows too large:
- Consider increasing zfs_dedup_prefetch (default 0) to improve performance
- Monitor arcstats for dedup-related memory pressure
- Evaluate whether the dedup ratio justifies the memory overhead
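To watch for memory pressure, the arcstats kstat can be parsed directly. This is a hedged sketch assuming the standard Linux/OpenZFS kstat layout (two header lines, then name/type/data columns); the exact field names available vary between ZFS versions.

```python
def parse_kstat(text):
    """Parse a kstat body (name / type / data columns) into a dict."""
    stats = {}
    for line in text.splitlines()[2:]:  # skip the two kstat header lines
        parts = line.split()
        if len(parts) == 3:
            try:
                stats[parts[0]] = int(parts[2])
            except ValueError:
                pass  # skip non-numeric data fields
    return stats

def read_arcstats(path='/proc/spl/kstat/zfs/arcstats'):
    """Read ARC statistics from the SPL kstat interface (Linux only)."""
    with open(path) as f:
        return parse_kstat(f.read())
```

Comparing fields like `size` against the DDT estimate from zdb shows how much of the ARC the dedup table is consuming.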
For regular monitoring, add this to your cron:
# Daily DDT size logging
0 3 * * * root zdb -DD poolname | grep -A4 'DDT-sha256' >> /var/log/zfs_ddt.log
When working with ZFS deduplication, the deduplication table (DDT) is the critical in-memory structure that tracks block checksums, and its size directly impacts RAM requirements. For zvols that can't leverage snapshots/clones (like those formatted with ext4/XFS), dedup becomes particularly memory-intensive.
The most accurate method uses zdb with pool inspection. First, ensure the pool isn't heavily active during this operation:
sudo zdb -DD poolname
Example output analysis:
DDT-sha256-zap-duplicate: 4553 entries, size 448 on disk, 144 in core
DDT-sha256-zap-unique: 78412 entries, size 512 on disk, 160 in core
Total DDT entries: 82965
On-disk DDT size: 42.2MB
In-core DDT size: 13.2MB
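The in-core total can be cross-checked from the per-type lines. Note that the two entry types report different per-entry core sizes, so this is more precise than the flat 320-byte rule of thumb (decimal MB assumed here):

```python
duplicate_bytes = 4553 * 144   # duplicate entries * bytes in core
unique_bytes = 78412 * 160     # unique entries * bytes in core

in_core_bytes = duplicate_bytes + unique_bytes
print(in_core_bytes)                   # 13201552
print(round(in_core_bytes / 1e6, 1))   # 13.2 (MB)
```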
Each DDT entry consumes ~320 bytes in RAM (varies by ZFS version). To estimate future needs:
ddt_entries=$(sudo zdb -DD poolname | awk '/Total DDT entries/ {print $4}')
ram_estimate=$((ddt_entries * 320 / 1024 / 1024))
echo "Estimated RAM usage: ${ram_estimate}MB"
When dealing with zvols formatted with other filesystems:
- Expect higher entry counts due to FS block alignment
- Monitor the growth rate with zpool get dedupratio
- Consider zfs set dedup=off for temporary benchmarking
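For tracking the ratio over time, `zpool get -H -o value dedupratio` prints a bare value like `1.45x`; this small helper converts it to a float:

```python
import subprocess

def parse_dedupratio(value):
    """Convert zpool's '1.45x' format to a float."""
    return float(value.strip().rstrip('x'))

def get_dedupratio(pool):
    """Query the current dedup ratio for a pool via zpool get."""
    out = subprocess.check_output(
        ['zpool', 'get', '-H', '-o', 'value', 'dedupratio', pool]).decode()
    return parse_dedupratio(out)
```

Logging this value alongside the DDT entry count makes it easy to judge whether the ratio still justifies the memory overhead.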
For continuous monitoring, parse the kstats under /proc/spl/kstat/zfs/ or use this Python snippet:
import subprocess

def get_ddt_mem(pool):
    """Return the in-core DDT size in MB from 'zdb -DD' output."""
    output = subprocess.check_output(['zdb', '-DD', pool]).decode()
    for line in output.splitlines():
        # e.g. "In-core DDT size: 13.2MB"
        if line.lower().startswith('in-core ddt size'):
            return float(line.split()[-1].rstrip('MB'))
    return 0.0