How to Inspect and Analyze Contents of ZFS ARC and L2ARC Caches



ZFS employs two primary caching mechanisms: ARC (Adaptive Replacement Cache) in RAM and L2ARC (Level 2 ARC) on fast storage devices. While these caches significantly improve performance, many administrators wonder what exactly gets cached and how to inspect it.

Several utilities can help examine cache contents:

# Basic cache statistics
arcstat.py
zfs-stats

# Dump the vdev label of an L2ARC device
zdb -l /dev/disk/by-id/your-l2arc-device

The ARC primarily caches:

  • Recently and frequently accessed data blocks (the MRU and MFU lists)
  • Metadata (directory entries, ZPL structures)
  • Deduplication tables

To view ARC statistics:

# Show ARC breakdown (illumos/Solaris kernel debugger)
echo "::arc" | mdb -k

# Alternative using DTrace: count hits and misses via the static ARC probes
dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @[probename] = count(); }'

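The L2ARC is fed from blocks near the eviction end of the ARC lists, so it mostly holds data that was about to be evicted from RAM. To inspect:

# Dump the vdev label from the cache device (recent OpenZFS releases also print the persistent L2ARC header)
zdb -vvvv -l /dev/your-l2arc-device

# Show L2ARC header (this dcmd is not in every mdb zfs module; confirm with ::dcmds)
echo "::l2arc -v" | mdb -k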

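No stock tool lists which files have blocks in the ARC, and vnstat is a network traffic monitor, so it cannot help here. On Linux, fincore (util-linux) gives a rough hint by reporting page-cache residency, which on ZFS covers only mmap()ed data rather than ARC buffers:

#!/bin/bash
# Print regular files that have at least one resident page-cache byte.
for file in /pool/dataset/*; do
    [ -f "$file" ] || continue
    res=$(fincore --noheadings --bytes --output RES "$file" 2>/dev/null | tr -d '[:space:]')
    if [ "${res:-0}" -gt 0 ]; then
        echo "$file has cached blocks"
    fi
done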

For detailed metrics:

# ARC size breakdown
kstat -p zfs:0:arcstats:size

# L2ARC statistics
kstat -p 'zfs:0:arcstats:l2*'
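
The counters printed by kstat are easier to reason about as ratios. The following is a minimal sketch, assuming the usual `kstat -p` layout of one `module:instance:name:statistic` key and one value per line; the sample text and the helper names `parse_kstat` and `hit_ratio` are made up for illustration.

```python
def parse_kstat(text):
    """Parse `kstat -p` style output into {statistic_name: value_string}."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        # Keep only the last component, e.g. "zfs:0:arcstats:hits" -> "hits"
        stats[key.rsplit(":", 1)[-1]] = value.strip()
    return stats


def hit_ratio(hits, misses):
    """Fraction of lookups served from cache; 0.0 when there was no traffic."""
    total = hits + misses
    return hits / total if total else 0.0


# Hypothetical sample, not captured from a real system.
SAMPLE = """\
zfs:0:arcstats:hits\t900
zfs:0:arcstats:misses\t100
zfs:0:arcstats:l2_hits\t30
zfs:0:arcstats:l2_misses\t70
"""

stats = parse_kstat(SAMPLE)
arc_ratio = hit_ratio(int(stats["hits"]), int(stats["misses"]))
l2_ratio = hit_ratio(int(stats["l2_hits"]), int(stats["l2_misses"]))
print(f"ARC hit ratio:   {arc_ratio:.1%}")
print(f"L2ARC hit ratio: {l2_ratio:.1%}")
```

On a live system the sample string would be replaced by the captured output of `kstat -p zfs:0:arcstats`. Note that L2ARC lookups only happen after an ARC miss, so the two ratios describe different request populations.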

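Plot a simple pie chart of the ARC data/metadata split:

#!/usr/bin/env python3
import subprocess
import matplotlib.pyplot as plt

# kstat -p prints one "module:instance:name:statistic value" pair per line
arc = subprocess.check_output(["kstat", "-p", "zfs::arcstats:*"], text=True).splitlines()
pairs = (line.split(None, 1) for line in arc)
data = {p[0]: p[1] for p in pairs if len(p) == 2}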
plt.pie([float(data["zfs:0:arcstats:data_size"]), 
         float(data["zfs:0:arcstats:metadata_size"])],
        labels=["Data", "Metadata"])
plt.title("ARC Cache Distribution")
plt.show()

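For real-time cache monitoring:

#!/usr/sbin/dtrace -s
#pragma D option quiet

/*
 * Count ARC hits and misses via the static probes in arc.c. Splitting by
 * data vs. metadata needs the header's buffer-type field, whose name
 * differs between ZFS releases, so it is left out here.
 */
sdt:zfs::arc-hit,
sdt:zfs::arc-miss
{
    @[probename] = count();
}

tick-10s
{
    printa("%-10s %@d\n", @);
    trunc(@);
}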
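
The same interval-based view can be derived from the cumulative kstat counters by diffing two snapshots, which is roughly what arcstat-style tools do. This is a sketch under that assumption; the snapshot dictionaries and the function names `interval_rates` and `interval_hit_ratio` are hypothetical.

```python
def interval_rates(prev, curr, seconds):
    """Per-second rate of each counter present in both snapshots."""
    if seconds <= 0:
        raise ValueError("interval must be positive")
    return {name: (curr[name] - prev[name]) / seconds
            for name in curr if name in prev}


def interval_hit_ratio(prev, curr):
    """Hit ratio for the interval only, ignoring activity before `prev`."""
    hits = curr["hits"] - prev["hits"]
    misses = curr["misses"] - prev["misses"]
    total = hits + misses
    return hits / total if total else 0.0


# Two made-up snapshots taken 10 seconds apart.
prev = {"hits": 1000, "misses": 200}
curr = {"hits": 1900, "misses": 300}

rates = interval_rates(prev, curr, seconds=10)
ratio = interval_hit_ratio(prev, curr)
print(f"hits/s={rates['hits']:.1f} misses/s={rates['misses']:.1f} hit%={ratio:.1%}")
```

Diffing matters because the raw counters accumulate since boot, so a ratio computed from them directly hides recent changes in workload.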

To recap, the ZFS caching architecture has two tiers:

1. ARC (Adaptive Replacement Cache) - RAM-based primary cache
2. L2ARC (Level 2 ARC) - SSD-based secondary cache

The ARC keeps recently and frequently accessed blocks in memory, while the L2ARC extends that cache onto fast secondary storage such as SSDs.

To examine cache contents, we'll use these tools:

arc_summary   # One-shot ARC/L2ARC report (OpenZFS on Linux/FreeBSD)
arcstat.py    # Real-time ARC monitoring
dtrace        # Low-level tracing
kstat         # Kernel statistics
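
Where the `kstat` command is not available, the same counters can usually be read directly. The sketch below assumes OpenZFS on Linux, which exposes them in `/proc/spl/kstat/zfs/arcstats` as two header lines followed by `name type data` rows; the sample text is invented and `parse_arcstats` is a hypothetical helper name.

```python
def parse_arcstats(text):
    """Parse Linux-style arcstats text into {name: int_value}."""
    stats = {}
    # The first two lines are a kstat header and the column titles.
    for line in text.splitlines()[2:]:
        fields = line.split()
        if len(fields) != 3:
            continue
        name, _type, data = fields
        stats[name] = int(data)
    return stats


# Hypothetical sample in the assumed format.
SAMPLE = """\
9 1 0x01 3 816 1000 2000
name                            type data
hits                            4    900
misses                          4    100
size                            4    8589934592
"""

linux_stats = parse_arcstats(SAMPLE)
print(linux_stats["size"] // 1024 ** 3, "GiB in ARC")
```

On a real host the sample would be replaced with `open("/proc/spl/kstat/zfs/arcstats").read()`.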

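zdb reads on-disk pool structures rather than the live kernel ARC, so use the kernel statistics for a summary:

# Show ARC size counters
kstat -p 'zfs:0:arcstats:*size'

# Detailed breakdown (sudo must apply to mdb, not to echo)
echo "::arc" | sudo mdb -k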

The resulting breakdown by block type can be summarized like this (illustrative figures, not literal tool output):

ARC breakdown:
  data:       45.3% (3.2GB)
  metadata:   38.1% (2.7GB)
  other:      16.6% (1.1GB)
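
A breakdown in this shape can be computed from three counters. The sketch assumes the arcstats values `size`, `data_size`, and `metadata_size` are available in bytes and treats whatever remains (headers and other bookkeeping) as "other"; the input numbers are invented and `arc_breakdown` is a hypothetical helper.

```python
GIB = 1024 ** 3


def arc_breakdown(size, data_size, metadata_size):
    """Return [(label, percent_of_arc, size_in_GiB)] for data, metadata, other."""
    if size <= 0:
        raise ValueError("ARC size must be positive")
    other = size - data_size - metadata_size
    parts = [("data", data_size), ("metadata", metadata_size), ("other", other)]
    return [(label, 100.0 * nbytes / size, nbytes / GIB) for label, nbytes in parts]


# Made-up counters: an 8 GiB ARC holding 4 GiB of data and 3 GiB of metadata.
rows = arc_breakdown(size=8 * GIB, data_size=4 * GIB, metadata_size=3 * GIB)
print("ARC breakdown:")
for label, pct, gib in rows:
    print(f"  {label + ':':<11} {pct:.1f}% ({gib:.1f}GB)")
```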

For L2ARC analysis:

# List L2ARC devices (grep alone would print only the "cache" heading; raise -A for more devices)
sudo zpool iostat -v | grep -A 2 '^cache'

# Show L2ARC header
sudo zdb -l /dev/disk/by-id/ata-INTEL_SSD_X25-M_*

DTrace script to track L2ARC hits and misses per process (the ARC works on blocks, so file paths are not available at this layer):

#!/usr/sbin/dtrace -s
#pragma D option quiet

/* Static probes fired by arc.c on each L2ARC lookup. */
sdt:zfs::l2arc-hit,
sdt:zfs::l2arc-miss
{
  @[execname, probename] = count();
}

tick-10s
{
  printa("%-16s %-12s %@d\n", @);
  trunc(@);
}

To see which filesystems are eligible for each cache tier:

sudo zfs get -r primarycache,secondarycache poolname

# ARC hit/miss counters (global; the ARC does not keep per-filesystem counters)
kstat -p zfs:0:arcstats | grep -E 'hits|misses'

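1. Find the most-read files, which are the likeliest to have blocks in the ARC. The ARC has no notion of file paths, so this traces the illumos zfs_read() entry point instead:

sudo dtrace -n 'fbt::zfs_read:entry /args[0]->v_path/ {
  @[stringof(args[0]->v_path)] = count();
}'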

2. Track cache age distribution (walker and field names differ between releases; check ::walkers first):

sudo mdb -k
> ::walk arc | ::print -a arc_buf_hdr_t b_arc_access

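Remember these cache characteristics:

  • Every block held in L2ARC needs a header in RAM, roughly 70 bytes per block on recent releases and more on older ones
  • Metadata is cached alongside data in the ARC, and how much of the ARC it may occupy is tunable
  • Heavy write workloads can push read-cached blocks out of the ARC, because dirty data also occupies ARC memory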
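
The per-block header cost is easiest to appreciate with a quick estimate. This sketch takes the ~70 bytes per block figure above at face value and assumes every cached block has the same size, which real pools do not; `l2arc_header_ram` is a hypothetical helper.

```python
GIB = 1024 ** 3
KIB = 1024


def l2arc_header_ram(l2arc_bytes, avg_block_bytes, header_bytes=70):
    """Estimate RAM consumed by L2ARC headers for a fully populated device."""
    if avg_block_bytes <= 0:
        raise ValueError("block size must be positive")
    blocks = l2arc_bytes // avg_block_bytes
    return blocks * header_bytes


# A 512 GiB cache device filled with small vs. large blocks.
small = l2arc_header_ram(512 * GIB, 8 * KIB)
large = l2arc_header_ram(512 * GIB, 128 * KIB)
print(f"8 KiB blocks:   {small / GIB:.3f} GiB of headers")
print(f"128 KiB blocks: {large / GIB:.3f} GiB of headers")
```

The estimate shows why small-block workloads make a large L2ARC expensive: the header RAM comes out of the ARC itself.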