How to Check Current ZFS Deduplication Table (DDT) Size for RAM Optimization



ZFS deduplication relies on the deduplication table (DDT), which maps block checksums to on-disk locations and reference counts. The DDT is stored on disk, but it must be cached in RAM (via the ARC) for acceptable write performance, which makes its size crucial for system planning. Each entry typically consumes about 320 bytes in core (a 256-bit checksum such as SHA-256, plus block pointers and reference counts).
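
As a rough sizing sketch (hypothetical numbers, assuming 128 KiB records and one DDT entry per unique block), 1 TiB of unique data works out to roughly 2.5 GiB of in-core DDT:

# Hypothetical: 1 TiB of unique data at 128 KiB recordsize
blocks=$(( (1 << 40) / (128 * 1024) ))            # ~8.4 million unique blocks
echo "$(( blocks * 320 / 1024 / 1024 )) MiB DDT"  # ~2560 MiB in core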

The most accurate way to examine current DDT usage is through the zdb command. Here's the detailed procedure:

# Run as root or with sudo
sudo zdb -DD poolname

# Sample output analysis:
DDT-sha256-zap-duplicate: 1053 entries, size 308 on disk, 320 in core
DDT-sha256-zap-unique: 125478 entries, size 312 on disk, 320 in core

To calculate current RAM usage:

unique_entries = [value from DDT-sha256-zap-unique]
duplicate_entries = [value from DDT-sha256-zap-duplicate]
total_ram_usage = (unique_entries + duplicate_entries) * 320
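
A minimal shell sketch of the same calculation, assuming the output format shown above:

sudo zdb -DD poolname | \
    awk '/^DDT-.*entries/ {sum += $2} END {printf "DDT in core: %.1f MiB\n", sum * 320 / 1048576}'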

For a quicker (but less detailed) overview:

sudo zpool status -D poolname

# Output includes:
dedup: DDT entries 126531, size 318B on disk, 320B in core
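
To extract just the entry count from that line (assuming the format shown):

sudo zpool status -D poolname | awk '/DDT entries/ {gsub(",", "", $4); print $4}'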

Consider this Python script to automate DDT monitoring:

#!/usr/bin/env python3
import subprocess

def get_ddt_stats(pool):
    """Return (total DDT entries, estimated in-core size in MiB) for a pool."""
    cmd = f"sudo zdb -DD {pool} | grep -A4 'DDT-sha256'"
    output = subprocess.check_output(cmd, shell=True).decode()

    stats = {}
    for line in output.splitlines():
        if 'entries' in line:
            # e.g. "DDT-sha256-zap-unique: 125478 entries, size 312 on disk, 320 in core"
            header = line.split(',')[0]                # "DDT-sha256-zap-unique: 125478 entries"
            entry_type = header.split(':')[0].strip()  # table name
            entries = int(header.split()[1])           # entry count follows the name
            stats[entry_type] = entries

    total_entries = sum(stats.values())
    ram_mb = (total_entries * 320) / (1024 * 1024)     # ~320 bytes per entry in core
    return total_entries, ram_mb

if __name__ == "__main__":
    entries, mb = get_ddt_stats("poolname")
    print(f"{entries} DDT entries, ~{mb:.1f} MiB in core")

If the DDT grows too large:

  • Consider enabling zfs_dedup_prefetch (default 0) to improve performance (see the sketch after this list)
  • Monitor arcstats for dedup-related memory pressure
  • Evaluate whether the dedup ratio justifies the memory overhead
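
On Linux the prefetch tunable is exposed as a module parameter; a quick sketch (the sysfs path is Linux-specific and may vary by platform):

# Read the current dedup prefetch setting
cat /sys/module/zfs/parameters/zfs_dedup_prefetch
# Enable prefetch until the next reboot
echo 1 | sudo tee /sys/module/zfs/parameters/zfs_dedup_prefetch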

For regular monitoring, add this to your cron:

# Daily DDT size logging (system crontab / /etc/cron.d format, note the user field)
0 3 * * * root zdb -DD poolname | grep -A4 'DDT-sha256' >> /var/log/zfs_ddt.log
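
Each run appends without a timestamp, so a dated variant (same cron.d format, same hypothetical log path) is easier to analyze later:

0 3 * * * root (date; zdb -DD poolname | grep -A4 'DDT-sha256') >> /var/log/zfs_ddt.log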

When working with ZFS deduplication, the deduplication table (DDT) is the critical structure that tracks block checksums, and its size directly determines RAM requirements. For zvols formatted with a foreign filesystem such as ext4 or XFS, which can't take advantage of ZFS snapshots/clones for space savings, dedup is often the main sharing mechanism available, and it is particularly memory-intensive.

The most accurate method is again zdb's pool inspection. Because zdb reads pool state directly, avoid running it while the pool is heavily active:

sudo zdb -DD poolname

Example output, with totals derived from it (zdb prints only the per-table lines; the totals below are entries × per-entry size):

DDT-sha256-zap-duplicate: 4553 entries, size 448 on disk, 144 in core
DDT-sha256-zap-unique: 78412 entries, size 512 on disk, 160 in core

Total DDT entries: 82965
On-disk DDT size: ~42.2MB
In-core DDT size: ~13.2MB

Each DDT entry consumes ~320 bytes in RAM (varies by ZFS version). To estimate future needs:

ddt_entries=$(sudo zdb -DD poolname | awk '/^DDT-.*entries/ {sum += $2} END {print sum}')
ram_estimate=$((ddt_entries * 320 / 1024 / 1024))
echo "Estimated RAM usage: ${ram_estimate}MB"

When dealing with zvols formatted with other filesystems:

  • Expect higher entry counts due to FS block alignment
  • Monitor growth rate with zpool get dedupratio (see the commands after this list)
  • Consider zfs set dedup=off for temporary benchmarking
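
Both checks from the list are one-liners (the dataset name below is hypothetical):

# Track the pool-wide dedup ratio over time
zpool get -H -o value dedupratio poolname
# Disable dedup for new writes while benchmarking; existing
# deduped blocks stay shared until rewritten
sudo zfs set dedup=off poolname/zvolname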

For continuous monitoring, watch the ZFS kstats under /proc/spl/kstat/zfs/ (arcstats is the most useful for memory pressure; the txgs kstat tracks transaction groups, not the DDT) or use this Python snippet:

import subprocess

def get_ddt_mem(pool):
    """Estimate total in-core DDT size in bytes from zdb -DD output."""
    output = subprocess.check_output(['zdb', '-DD', pool]).decode()
    total = 0
    for line in output.splitlines():
        # e.g. "DDT-sha256-zap-unique: 78412 entries, size 512 on disk, 160 in core"
        if line.startswith('DDT-') and 'in core' in line:
            fields = line.split()
            total += int(fields[1]) * int(fields[-3])  # entries * per-entry in-core bytes
    return total
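
A hypothetical invocation, assuming the snippet is saved as ddt_mem.py in the current directory (zdb needs root):

sudo python3 -c "from ddt_mem import get_ddt_mem; print(get_ddt_mem('poolname'))"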