Recovering Corrupted ZFS Pools on FreeBSD: A Deep Dive into Metadata Repair and Data Restoration Techniques



When dealing with ZFS pool corruption on legacy FreeBSD systems (particularly version 7.2 with ZFS v6), we're often facing a scenario like the following:

# Typical zpool status output showing corruption
$ zpool status
  pool: zpool01
 state: UNAVAIL
status: One or more devices are faulted
action: The pool cannot be imported due to damaged devices or data.
config:
    zpool01      UNAVAIL  insufficient replicas
      raidz1-0   UNAVAIL  corrupted data
        da5      ONLINE
        da6      ONLINE
        da7      ONLINE
        da8      ONLINE
      raidz1-1   ONLINE
        da1      ONLINE
        da2      ONLINE
        da3      ONLINE
        da4      ONLINE

The key diagnostic tool here is zdb, though we need to use it carefully:

# Check individual disk labels
$ zdb -l /dev/da5
------------------------------------
LABEL 0
------------------------------------
failed to unpack label 0
------------------------------------
LABEL 1
------------------------------------
failed to unpack label 1
------------------------------------
LABEL 2
------------------------------------
version: 6
...

When standard recovery fails, we need to examine the uberblocks and vdev configuration directly:

# Search for valid uberblocks
$ zdb -uuu zpool01
Traversing all blocks to verify checksums ...
    children[0]:
        type: 'disk'
        id: 0
        guid: 4795262086800816238
        path: '/dev/da5'
        whole_disk: 0
        DTL: 202
        labels: 0 1 2 3 
        status: ZIO_CHECKSUM_GOOD

When some labels are corrupted but others remain:

# Example of copying surviving label regions between devices (each label is 256 KB = 512 sectors;
# the copied label still carries da8's GUID and paths, so treat this only as a starting point,
# and practice on forensic images first)
$ dd if=/dev/da8 of=/dev/da5 bs=512 count=512 skip=0 seek=0  # Copy L0 (first 256 KB)
$ dd if=/dev/da8 of=/dev/da5 bs=512 count=512 skip=131072000 seek=131072000  # Copy L2 (offset = device size minus 512 KB, in sectors)

For pools with damaged RAIDZ vdevs, consider these approaches (worked examples follow below):

  • Create a new pool with identical geometry as a reference
  • Use zdb -R to extract readable data
  • Attempt a forced import with zpool import -F

# Attempt forced import with recovery mode
$ zpool import -F -f -N -R /mnt/recovery zpool01
$ zpool status -v zpool01
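
The zdb -R route reads individual blocks directly from a vdev, which is only useful once block addresses are known (for example, from zdb dumps of a partially importable pool); the offset and size below are placeholders, and the :r raw flag may not exist in the old ZFS v6 zdb:

# Read one block (vdev:offset:size, values in hex) and dump the raw bytes to a file
$ zdb -R zpool01 0:400000:20000:r > /safe/storage/block.raw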

If you must keep older ZFS implementations in service, schedule regular scrubs to catch corruption early:

# Regular scrubbing schedule
$ crontab -e
0 3 * * 0 /sbin/zpool scrub zpool01
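
A companion health check between scrubs can catch degradation early; a minimal sketch, assuming local mail delivery works (the script path is an assumption):

#!/bin/sh
# /usr/local/sbin/zfs-health.sh -- mail the pool status when anything is unhealthy
if ! /sbin/zpool status -x | grep -q "all pools are healthy"; then
    /sbin/zpool status -v | mail -s "ZFS pool problem on $(hostname)" root
fi

Run it from cron alongside the weekly scrub job.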

As noted above, on legacy FreeBSD systems (particularly 7.2 with ZFS v6) pool corruption like this usually means metadata damage rather than actual data loss. The key indicators in this case are:

pool: zpool01
state: UNAVAIL
status: corrupted data in raidz1 vdev

Start with low-level device inspection using zdb:

# Check individual device labels
zdb -lll /dev/da5
zdb -lll /dev/da6
zdb -lll /dev/da7
zdb -lll /dev/da8

# Compare with working devices
zdb -lll /dev/da1
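
To avoid comparing four long dumps by eye, a short loop can summarize how many labels fail to unpack on each member (a simple sketch):

# Count "failed to unpack label" occurrences per device (0 = all four labels readable)
for d in da1 da5 da6 da7 da8; do
    printf "/dev/%s: " "$d"
    zdb -lll /dev/$d 2>&1 | grep -c "failed to unpack label"
done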

The critical error "failed to unpack label" on multiple devices suggests damaged ZFS labels. Each device carries four 256 KB label copies:

  • L0/L1: first 512 KB of the device
  • L2/L3: last 512 KB of the device
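
Because L2/L3 sit relative to the end of the device, their byte offsets must be computed from the actual media size; a quick sketch using diskinfo(8):

# mediasize in bytes is the third field of diskinfo output
size=$(diskinfo /dev/da5 | awk '{print $3}')
# L2 begins 512 KB before the end of the device, L3 begins 256 KB before the end
echo "L2 offset: $((size - 524288))  L3 offset: $((size - 262144))"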

When labels are damaged but data is intact, we can attempt label reconstruction:

# First make forensic copies of damaged devices (keep going past read errors)
dd if=/dev/da5 of=/safe/storage/da5.img bs=1M conv=noerror,sync

# If a device is to be re-added as a blank member, wipe its damaged labels first;
# note that labelclear destroys labels rather than rebuilding them, and may not
# exist in the old ZFS v6 userland
zpool labelclear -f /dev/da5
zdb -l /dev/da5
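
Destructive experiments are safer against the image copy; mdconfig(8) can expose it as a device so the same zdb and dd commands apply (a sketch, reusing the image path from above):

# Attach the forensic image as a memory-backed device (prints e.g. md0)
mdconfig -a -t vnode -f /safe/storage/da5.img
zdb -l /dev/md0

# Detach when finished
mdconfig -d -u 0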

Keep in mind that the recovery-mode import flags postdate ZFS v6 on FreeBSD 7.2: the old userland only supports a plain forced import, while -F (rewind) and -X (extreme rewind) require attaching the disks to a host running a newer ZFS:

# Legacy system: a plain forced import is all that is available
zpool import -f zpool01

# Newer ZFS: rewind to an earlier transaction group when the import fails
zpool import -f -F zpool01

# Newer ZFS, last resort: extreme rewind (slow, searches much older uberblocks)
zpool import -f -FX zpool01
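
If the disks (or their images) can be attached to a host running a newer FreeBSD release, a read-only import is the least risky way to look around; a hedged sketch:

# Read-only import writes nothing to the damaged pool (newer ZFS only)
zpool import -o readonly=on -f -R /mnt/recovery zpool01
zpool status -v zpool01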

Label reconstruction starts from a healthy label as a reference. Note that zdb only dumps labels as text; it has no mode for writing a label back (zdb -R reads blocks, it does not write them), so applying a rebuilt label ultimately means a raw dd write:

# Step 1: Dump a reference label (with uberblocks) from a working device
zdb -lu /dev/da1 > /tmp/healthy_label

# Step 2: Identify the per-device fields that must differ on da5: 'path' and the
# leaf vdev 'guid' (a 64-bit integer recovered from a surviving da5 label, not a
# freshly generated UUID)
grep -E "path|guid" /tmp/healthy_label

# Step 3: Rebuild the label nvlist with those values and write it to the label
# offsets with dd, practicing on the forensic image rather than the raw disk
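
A minimal sketch of that final write, assuming a rebuilt 256 KB label blob exists (/tmp/label0.bin is a hypothetical file produced by separate tooling) and targeting the forensic image made earlier:

# Write the rebuilt label into the L0 region (first 256 KB) of the image copy
dd if=/tmp/label0.bin of=/safe/storage/da5.img bs=512 count=512 seek=0 conv=notrunc

# Confirm that zdb can now unpack label 0 from the image
zdb -l /safe/storage/da5.img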

When dealing with multi-vdev pools, remember that a pool cannot import at all with a missing top-level data vdev; the -m flag (newer ZFS) only tolerates missing log devices:

# Import while tolerating a missing log device (all data vdevs must be present)
zpool import -m zpool01

# If the import succeeds, bring repaired member disks back online and clear error counts
zpool online zpool01 da5
zpool clear zpool01
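
Whichever import finally succeeds, copy the data off before attempting any further repair; a simple sketch assuming the pool is mounted under /mnt/recovery and the destination path is hypothetical:

# Get the data out first, then worry about restoring redundancy
mkdir -p /safe/storage/zpool01-dump
tar -C /mnt/recovery -cf - . | tar -C /safe/storage/zpool01-dump -xpf -
zpool status -v zpool01   # re-check error counts after the copy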

Remember to document all steps before attempting recovery operations. For reference, each 256 KB label (including version 6 pools) is laid out as follows, and the MOS is located through the active uberblock's block pointer rather than at a fixed sector:

  • 0–8 KB: blank space
  • 8–16 KB: boot block header
  • 16–128 KB: name/value pair list (pool and vdev configuration)
  • 128–256 KB: uberblock array
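
Within each label the uberblock array occupies the last 128 KB, and every valid uberblock begins with the magic value 0x00bab10c; a hedged sketch for eyeballing that region of label 0:

# Dump the uberblock array of L0 (bytes 128 KB..256 KB of the device) and look for the magic
dd if=/dev/da5 bs=1k skip=128 count=128 2>/dev/null | od -A x -t x8 | grep -i bab10c | head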