ZVOL Space Overconsumption: Diagnosing and Resolving Unexpected Disk Usage in ZFS Volumes


During routine storage monitoring on our FreeBSD 10.0-CURRENT system, we encountered a puzzling scenario where a 100GB ZFS volume (ZVOL) reported consuming 176GB of physical storage. The configuration showed no snapshots, reservations, or child datasets that could explain this behavior:

zfs get -H -o property,value,source volsize,used,referenced,written,logicalused zroot/DATA/vtest
volsize               100G                   local
used                  176G                   -
referenced            176G                   -
written               176G                   -
logicalused           87.2G                  -
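
To rule out snapshots, reservations, and child datasets explicitly, the built-in space breakdown is more direct than grepping individual properties:

# Break 'used' down into snapshot, dataset, refreservation, and child components
zfs list -o space zroot/DATA/vtest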

First, we verified the pool health and configuration:

zpool status -v
zpool list
zfs list

The output showed a healthy RAIDZ2 pool with no errors or capacity issues. The discrepancy appeared specifically with ZVOLs using 8K volblocksize, while 16K and larger block sizes showed normal usage patterns.
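
Because allocation overhead for small blocks also depends on the pool's sector-size assumption, it is worth confirming ashift as well (a quick check, assuming the pool is named zroot):

# Show the ashift recorded for each top-level vdev
zdb -C zroot | grep ashift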

We conducted controlled tests with different volblocksize values:

# Test case with 8K blocks (problematic)
zfs create -V 100G -o volblocksize=8K zroot/DATA/vtest-8k

# Test case with 16K blocks (normal behavior)
zfs create -V 100G -o volblocksize=16K zroot/DATA/vtest-16k
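
One hedged way to populate both test volumes identically before comparing usage (device paths follow FreeBSD's /dev/zvol naming; the count is arbitrary, it just needs to be the same for both):

# Write the same amount of data to each test volume
dd if=/dev/zero of=/dev/zvol/zroot/DATA/vtest-8k bs=1M count=90000
dd if=/dev/zero of=/dev/zvol/zroot/DATA/vtest-16k bs=1M count=90000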

After populating both volumes with identical data using dd, we observed:

zfs list -o name,volsize,used,volblocksize | grep vtest
zroot/DATA/vtest-8k   100G  201G       8K
zroot/DATA/vtest-16k  100G  102G      16K

The root cause appears to be per-block overhead: every block ZFS writes carries its own metadata and is allocated at the pool's granularity, so with smaller block sizes:

  1. More blocks are needed for the same logical size
  2. Each block carries fixed-size metadata (block pointers and indirect blocks)
  3. The checksum (fletcher4 in this case) is stored per block

For 8K blocks, the combined overhead can exceed 100% of the logical data size, which matches the roughly 2x usage observed here (176G used for 87.2G of logical data).
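
The effect is compounded on RAIDZ2, where parity and padding are charged per block rather than per volume. As a hedged illustration (assuming 4K sectors, i.e. ashift=12, which is an assumption about this pool): an 8K volblock spans two data sectors, RAIDZ2 adds two parity sectors, and the allocation is padded up to a multiple of parity+1 sectors:

# Illustrative worst case for one 8K volblock on RAIDZ2 with 4K sectors (ashift=12):
# 2 data + 2 parity = 4 sectors, padded to a multiple of (parity+1)=3 -> 6 sectors
echo "$((6 * 4096 / 1024))K on disk for 8K of logical data"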

Based on our findings, we recommend these approaches:

# Recommended: create volumes with a larger block size up front
# (volblocksize is fixed at creation time and cannot be changed with zfs set)
zfs create -V 100G -o volblocksize=16K zroot/DATA/vtest-new

# Alternative: enable compression so compressible data allocates fewer blocks
zfs set compression=lz4 zroot/DATA/vtest

# To migrate an existing 8K volume, copy its contents into the new 16K volume
# (a plain zfs send/recv recreates the volume with its original volblocksize)
dd if=/dev/zvol/zroot/DATA/vtest of=/dev/zvol/zroot/DATA/vtest-new bs=1M
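
After migrating, it is easy to confirm that the new volume reports the expected ratio (zroot/DATA/vtest-new is the illustrative target from the example above):

# Compare physical and logical usage on the 16K copy
zfs get volblocksize,used,logicalused zroot/DATA/vtest-new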

While larger block sizes reduce metadata overhead, they may impact performance for certain workloads:

Block Size    Space Efficiency    Random IOPS    Sequential Throughput
8K            Poor                Best           Good
16K           Good                Good           Excellent
128K          Best                Poor           Best

For cases requiring precise analysis of space usage:

# Calculate metadata overhead ratio
zfs get -Hp used,logicalused zroot/DATA/vtest | \
awk '$2=="used"{u=$3} $2=="logicalused"{l=$3} END{print "Overhead:",(u-l)/l*100"%"}'

# Monitor live allocations per pool (typed args require CTF in the zfs kernel module)
dtrace -n 'fbt::metaslab_alloc:entry { @bytes[stringof(args[0]->spa_name)] = sum(args[2]); }'

When working with ZFS volumes (ZVOLs), administrators occasionally encounter puzzling storage reporting where the actual disk usage exceeds the configured volsize. This phenomenon appears particularly pronounced when using smaller volblocksize values (like 8K) compared to larger ones (16K or 128K).

From the diagnostic outputs, we observe several critical data points:

zroot/DATA/vtest-3:
  volsize: 100G
  volblocksize: 8K
  used: 201G
  logicalused: 100G

zroot/DATA/vtest-16:
  volsize: 100G
  volblocksize: 16K  
  used: 102G
  logicalused: 100G

The discrepancy stems from ZFS's per-block allocation strategy combined with the COW (Copy-On-Write) mechanism. When using small block sizes (8K):

  • Each block carries its own metadata, so more blocks mean more overhead
  • The COW design means blocks are never overwritten in place; modified data goes to freshly allocated blocks
  • Fragmentation becomes more severe with smaller blocks (a quick pool-wide check is shown below)
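
A quick way to watch that last point over time (the pool name zroot is assumed from the earlier examples):

# Pool-wide allocation and fragmentation summary
zpool list -o name,size,allocated,free,fragmentation zroot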

To confirm this behavior, you can create test volumes:

# Create test volumes with different block sizes
zfs create -V 100G -o volblocksize=8k zroot/DATA/test8k
zfs create -V 100G -o volblocksize=16k zroot/DATA/test16k

# Write data to both volumes
dd if=/dev/zero of=/dev/zvol/zroot/DATA/test8k bs=1M count=100000
dd if=/dev/zero of=/dev/zvol/zroot/DATA/test16k bs=1M count=100000

# Check space usage
zfs list -o name,volsize,volblocksize,used,logicalused zroot/DATA/test8k zroot/DATA/test16k

For production systems where this overhead is unacceptable:

  1. Use a larger volblocksize (16K or higher) when the workload allows it
  2. Enable compression (lz4 recommended) so less physical data needs to be allocated
  3. Consider refreservation for critical volumes (a brief sketch follows this list)
  4. Monitor fragmentation regularly with zpool list -v
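
A minimal sketch of item 3, assuming the 16K test volume created above is the one worth protecting:

# Guarantee space for the volume and see how much of 'used' the guarantee accounts for
zfs set refreservation=100G zroot/DATA/test16k
zfs get refreservation,usedbyrefreservation zroot/DATA/test16k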

For deeper analysis of block allocation patterns:

# Show detailed block statistics
zdb -bbbb zroot/DATA/vtest

# Simulate dedup to get a pool-wide block histogram
zdb -S zroot

This behavior becomes problematic when:

  • Actual usage unexpectedly exceeds the capacity you budgeted for in the pool
  • Performance degrades due to excessive metadata operations
  • You're running close to pool capacity limits (a simple check like the one below helps catch this early)
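
A minimal hedged capacity check that can run from cron (the pool name and the 80% threshold are assumptions):

# Warn when the pool crosses 80% capacity
CAP=$(zpool list -H -o capacity zroot | tr -d '%')
[ "$CAP" -ge 80 ] && echo "zroot is at ${CAP}% capacity" | mail -s "ZFS capacity warning" root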