How to Get Uncompressed Directory Size on ZFS with Compression Enabled in FreeBSD



When working with ZFS filesystems that have compression enabled (using lz4, gzip, or another algorithm), du reports the compressed, on-disk size by default, while ls -l reports the logical size of individual files. This becomes problematic when you need the original, uncompressed size of whole directories - particularly for capacity planning or migration scenarios.

FreeBSD provides several ways to get accurate uncompressed size information:

# Method 1: Using zfs get with dataset properties
zfs get -r compressratio,referenced,used pool/dataset
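
ZFS also tracks the pre-compression figures directly: the logicalused and logicalreferenced properties report space consumption as if compression were disabled, which is usually the quickest answer at the dataset level:

# Logical (uncompressed) space, straight from ZFS accounting
zfs get logicalused,logicalreferenced pool/dataset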

# Method 2: The most direct directory-level solution: FreeBSD's du
# supports -A, which reports the apparent (uncompressed) size
du -shA /path/to/directory
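
Note that -A is the BSD spelling; GNU coreutils du calls the same option --apparent-size.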

For more practical daily use, consider these alternatives:

# Option A: Comparing physical and logical space across datasets
zfs list -r -o name,used,logicalused,compressratio pool/dataset
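
As a sanity check, logicalused divided by used should come out close to the reported compressratio; metadata and allocation overhead account for small differences.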

# Option B: Custom script summing logical file sizes with stat(1)
#!/bin/sh
# On FreeBSD, stat -f %z prints a file's logical (uncompressed) size in bytes
find "$1" -type f -exec stat -f %z {} + | awk '{sum+=$1} END {print sum}'
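
Assuming the script is saved as sum_logical.sh (a name chosen here for illustration), its total should roughly match du's apparent-size output:

# Both report apparent bytes; du rounds to whole kilobytes
sh sum_logical.sh /var/log
du -skA /var/log | awk '{print $1 * 1024}'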

Understanding uncompressed sizes is critical for:

  • Planning storage expansions
  • Estimating backup sizes
  • Migrating data to non-ZFS systems
  • Comparing compression efficiency

Walking every file this way can be I/O-intensive for large directories. For better performance, parallelize the per-file stat calls:

# Sample optimized command using parallel processing
find /large/directory -type f -print0 | \
xargs -0 -P8 stat -f %z | \
awk '{sum+=$1} END {print sum}'

When working with ZFS on FreeBSD (or any UNIX-like system), one common challenge is determining the actual, uncompressed size of data when compression is enabled. du shows the compressed, on-disk size by default, which doesn't reflect the original data size.

Running du -sh directory gives you the compressed size - the physical space used on disk. (ls -l, by contrast, already reports each file's logical size.) What we need is the logical size of the whole tree - how much space the data would occupy without compression.
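
A quick way to see the difference, assuming the target path lives on a dataset with compression enabled (/data/zeros is an illustrative path):

# Write 100 MB of highly compressible data, then compare both views
dd if=/dev/zero of=/data/zeros bs=1m count=100
du -h  /data/zeros    # physical size: tiny, thanks to compression
du -hA /data/zeros    # apparent size: ~100M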

ZFS provides several ways to get this information:

Method 1: zfs get

First check the compression ratio for the entire dataset:

zfs get compressratio pool/dataset
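
For scripting, add -H to drop headers and -p to print exact byte counts; logicalused is the pre-compression total (a minimal example, assuming the dataset is named pool/dataset):

# Raw logical byte count, suitable for scripts
zfs get -Hp -o value logicalused pool/dataset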

Method 2: zdb for Detailed Analysis

For more precise, object-level information, dump a specific file's dnode (the ls -i workflow further below shows how to find the object number):

zdb -dddd pool/dataset object_number

Method 3: Practical Script for Directories

Here's a shell script that calculates the uncompressed size, disk usage, and compression ratio for every file in a directory:

#!/bin/sh
# For each file, report the logical size, the on-disk size, and the
# effective compression ratio between the two.
find "$1" -type f -exec sh -c '
  for f; do
    size=$(stat -f %z "$f")                      # logical bytes
    disk_size=$(du -k "$f" | awk "{print \$1}")  # physical kilobytes
    [ "$disk_size" -eq 0 ] && disk_size=1        # guard against divide-by-zero
    ratio=$(echo "scale=2; $size / ($disk_size * 1024)" | bc)
    printf "%s: %d bytes (disk: %dK, ratio: %sx)\n" "$f" "$size" "$disk_size" "$ratio"
  done
' sh {} +
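
Saved as, say, per_file_ratio.sh (an illustrative name), it can be pointed at any directory:

sh per_file_ratio.sh /usr/local/etc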

For developers needing more control, you can access ZFS internal structures:

# Get object number for a file
ls -i /path/to/file

# Then query ZFS
zdb -ddddd pool/dataset object_number
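
In the resulting dnode dump, lsize is the logical (uncompressed) size and dsize the physical on-disk size, so the ratio between the two is the per-object compression factor.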

Remember that calculating uncompressed sizes for large directories can be I/O intensive. For production systems, consider:

  • Running during off-peak hours
  • Caching results when possible
  • Sampling instead of full scans for large datasets (see the sketch below)
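
A minimal sketch of the sampling approach, assuming file names contain no spaces or newlines (the path and the 200-file sample size are arbitrary):

# Estimate total logical size: mean of a random sample of files
# multiplied by the total file count
total=$(find /large/dataset -type f | wc -l)
avg=$(find /large/dataset -type f | sort -R | head -n 200 | \
      xargs stat -f %z | awk '{s+=$1} END {print s/NR}')
awk -v t="$total" -v a="$avg" 'BEGIN {printf "estimated logical bytes: %.0f\n", t*a}'

The estimate is only as good as the sample; trees dominated by a few huge files need a larger one.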