Implementing Transparent Compression on ext4: Stable Solutions and Workarounds



Many applications rely on ext4-specific features (journaling, extended attributes, etc.), but developers often need storage compression for space efficiency. ZFS and btrfs offer native compression; ext4 does not. Here are the practical options.

Your observation about ZFS zvol reporting double usage is correct. When creating ext4 on a ZFS volume:

# Create a 100G zvol (add -s for a sparse volume; without it ZFS reserves the full size)
zfs create -V 100G pool/ext4_vol

# ext4 immediately allocates metadata structures
mkfs.ext4 /dev/zvol/pool/ext4_vol
mount /dev/zvol/pool/ext4_vol /mnt/ext4

# Actual behavior (illustrative numbers):
du -hs /mnt/ext4      # Shows 1GB used
zfs list              # Shows 2GB allocated

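This happens because ZFS sees only the blocks ext4 has written, not which of them ext4 still considers in use. Blocks freed inside ext4 stay allocated in the zvol unless discards are passed down (for example with fstrim or the discard mount option), and on raidz pools a small volblocksize adds further parity and padding overhead. For production systems, consider the alternatives below.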
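
To see which layer is reporting what, it can help to compare a file's apparent size with the space the filesystem has allocated for it, which is what du adds up. The following is a minimal sketch, assuming Linux (where st_blocks is counted in 512-byte units); size_report is a hypothetical helper name, not part of any tool mentioned here.

```python
import os
import tempfile

def size_report(path):
    """Return (apparent_bytes, allocated_bytes) for one file."""
    st = os.stat(path)
    # Assumption: Linux reports st_blocks in 512-byte units,
    # independent of the filesystem's own block size.
    return st.st_size, st.st_blocks * 512

if __name__ == "__main__":
    # Write a throwaway file and compare the two views of its size.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * 1_000_000)
        demo_path = f.name
    apparent, allocated = size_report(demo_path)
    print(f"apparent={apparent} allocated={allocated}")
    os.unlink(demo_path)
```

Running this against files on the ext4 mount gives the ext4 view; the zvol's own accounting still has to come from zfs list.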

One experimental option is the e4compress patchset. It is not part of the mainline kernel or e2fsprogs, so verify the repository and the feature flags below before relying on them:

# Compile and insert module
git clone https://github.com/kdave/e4compress
cd e4compress
make -C /lib/modules/$(uname -r)/build M=$PWD
insmod e4compress.ko

# Format with compression
mkfs.ext4 -O compression /dev/sdX1
mount -o compress /dev/sdX1 /mnt

Current limitations:

  • Only LZO compression supported
  • Requires kernel 5.10+ with custom patches
  • No inode-level compression control

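For stability, consider a stacked setup instead: a read-only squashfs image with a writable overlayfs layer on top. New writes land uncompressed in the upper layer until the image is rebuilt.

# Create compressed backing store
mkfs.ext4 /dev/sdX1
mount /dev/sdX1 /backing
# (populate /backing with data, then build the image)
# lz4 -Xhc favors speed; -comp xz trades speed for a higher ratio
mksquashfs /backing /compressed.img -comp lz4 -Xhc

# Mount with overlay (upperdir and workdir must be on the same filesystem)
mount /compressed.img /mnt -t squashfs -o loop
mount -t overlay overlay -o lowerdir=/mnt,upperdir=/overlay,workdir=/work /final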

For proper quota accounting with compression:

# With e4compress (run tune2fs on the unmounted device)
tune2fs -Q usrquota,grpquota /dev/sdX1
mount /dev/sdX1 /mnt
quotaon /mnt

# Quotas will report uncompressed sizes
repquota -u /mnt

Important: All solutions should be tested with your specific workload. For database applications, compression may actually reduce performance due to CPU overhead.
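
One way to act on this advice is to measure ratio and throughput on a sample of your own data before committing to a layout. The sketch below is an illustration only: it uses the zlib and lzma modules from the Python standard library as stand-ins, because LZO and LZ4 are not in the standard library, and the measure and compare helpers are hypothetical names.

```python
import lzma
import time
import zlib

def measure(name, compress, data):
    """Compress data once; return (name, ratio, throughput in MB/s)."""
    start = time.perf_counter()
    out = compress(data)
    elapsed = time.perf_counter() - start
    ratio = len(data) / len(out)
    throughput = len(data) / 1e6 / elapsed if elapsed > 0 else float("inf")
    return name, ratio, throughput

def compare(data):
    """Run a few stdlib codecs over the same bytes."""
    codecs = [
        ("zlib-1", lambda d: zlib.compress(d, 1)),  # fast, lower ratio
        ("zlib-9", lambda d: zlib.compress(d, 9)),  # slower, higher ratio
        ("xz", lambda d: lzma.compress(d)),         # LZMA2, slowest of the three
    ]
    return [measure(n, c, data) for n, c in codecs]

if __name__ == "__main__":
    # Stand-in sample; replace with bytes read from a real data file.
    sample = b"INSERT INTO t VALUES (1, 'example row');\n" * 50_000
    for name, ratio, mbps in compare(sample):
        print(f"{name:8s} ratio={ratio:6.1f}x  {mbps:8.1f} MB/s")
```

Synthetic, repetitive samples like the one above exaggerate the ratio, so the numbers are only meaningful when the input is representative of the real workload.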

Based on current technology maturity:

Solution              Stability      Compression ratio
e4compress            Experimental   ~2x (LZO)
squashfs + overlayfs  Stable         3-4x (LZMA)
ZFS zvol              Stable         Not effective

For mission-critical systems, the stacked approach provides the best balance between ext4 compatibility and compression benefits.


Many applications require ext4-specific features while also needing compressed storage. After testing several approaches, I've found that achieving stable, production-ready transparent compression with ext4 presents unique challenges.

My initial attempt used ZFS as the underlying storage with ext4 on top:

zpool create pool raidz2 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde2 /dev/sdf1 /dev/sdg1 /dev/sdh2 /dev/sdi1
zfs set recordsize=128k pool
zfs create -p -V15100GB pool/test
zfs set compression=lz4 pool/test
mkfs.ext4 -m1 -O 64bit,has_journal,extents,huge_file,flex_bg,uninit_bg,dir_nlink /dev/zvol/pool/test

The results were disappointing:

# du -hs /mnt/test
1.1T    /mnt/test
# zfs list
NAME        USED  AVAIL  REFER  MOUNTPOINT
pool       15.2T  2.70G   290K  /pool
pool/test  15.2T  13.1T  2.14T  -
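
To put a number on the gap, the human-readable sizes above can be converted to bytes and compared. This is a small sketch using the figures from the output; parse_size and overhead are hypothetical helper names, and it assumes both du -h and zfs list use 1024-based units.

```python
# Binary unit multipliers, assuming du -h and zfs list both use powers of 1024.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4, "P": 1024**5}

def parse_size(text):
    """Convert a du/zfs-style size such as '2.14T' to bytes."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in UNITS:
        return float(text[:-1]) * UNITS[suffix]
    return float(text)

def overhead(du_size, zfs_refer):
    """Ratio of what ZFS references to what du sees inside ext4."""
    return parse_size(zfs_refer) / parse_size(du_size)

print(round(overhead("1.1T", "2.14T"), 2))  # → 1.95
```

So the zvol references almost twice what ext4 reports, before counting the 15.2T shown under USED.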

I also considered two other options:

Fusecompress: While functional, stability issues make it unsuitable for production environments.

LessFS: Potentially interesting but lacks native ext4 integration. The documentation doesn't clearly explain if it can work in conjunction with ext4.

A critical issue emerges with compressed filesystems: quota handling. When compression is enabled, you want the system to benefit from space savings, not the end user. Current implementations show these problems:

  • User quotas apply to uncompressed sizes
  • df -h reports uncompressed sizes
  • No clear way to make compression transparent to quotas

For systems where quotas are essential, consider these approaches:

# Option 1: Implement quota at application level
def check_quota(user, file):
    compressed_size = get_compressed_size(file)
    if user.quota_remaining < compressed_size:
        raise QuotaExceededError
    
# Option 2: Use btrfs with qgroups (btrfs has no usrquota/grpquota mount options and lacks ext4 features)
mkfs.btrfs -f /dev/sdX
mount -o compress-force=zstd:3 /dev/sdX /mnt
btrfs quota enable /mnt
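
Option 1 can be made concrete with a short sketch. Everything here is an assumption for illustration: the User class, the in-memory quota counter, and zlib as the way to estimate compressed size are stand-ins, not part of any quota tool named above.

```python
import zlib

class QuotaExceededError(Exception):
    """Raised when a write would exceed the user's remaining quota."""

class User:
    def __init__(self, name, quota_remaining):
        self.name = name
        self.quota_remaining = quota_remaining  # bytes

def compressed_size(data, level=6):
    """Estimate stored size by compressing the payload with zlib."""
    return len(zlib.compress(data, level))

def charge_quota(user, data):
    """Charge the compressed size against the quota, or raise."""
    size = compressed_size(data)
    if user.quota_remaining < size:
        raise QuotaExceededError(
            f"{user.name}: need {size} bytes, have {user.quota_remaining}"
        )
    user.quota_remaining -= size
    return size
```

Charging len(data) instead of the compressed size would give the opposite policy, where the operator rather than the user keeps the space savings.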

The Linux kernel may eventually support native compression for ext4, similar to btrfs or f2fs. Until then, these are the most promising directions:

  1. Kernel patches implementing ext4 compression
  2. Improved FUSE-based solutions with better quota handling
  3. Filesystem stacking approaches (like overlayfs with compression)

For now, if ext4 features are absolutely required, the best approach might be implementing compression at the application level while using ext4 for storage.
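
As a rough illustration of the application-level route, the sketch below writes and reads gzip-compressed files on an ordinary ext4 mount using only the Python standard library. The helper names are hypothetical, and gzip is just one possible codec.

```python
import gzip
import os
import tempfile

def write_compressed(path, data, level=6):
    """Write data gzip-compressed; return the number of bytes on disk."""
    with gzip.open(path, "wb", compresslevel=level) as f:
        f.write(data)
    return os.path.getsize(path)

def read_compressed(path):
    """Read back and decompress a file written by write_compressed."""
    with gzip.open(path, "rb") as f:
        return f.read()

if __name__ == "__main__":
    payload = b"log line: request served\n" * 10_000
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "data.gz")
        on_disk = write_compressed(target, payload)
        assert read_compressed(target) == payload
        print(f"raw={len(payload)} on_disk={on_disk}")
```

The trade-off is that only applications using these helpers see the data transparently; other tools see ordinary .gz files, and ext4 quotas count the compressed size.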