ZFS employs two primary caching mechanisms: ARC (Adaptive Replacement Cache) in RAM and L2ARC (Level 2 ARC) on fast storage devices. While these caches significantly improve performance, many administrators wonder what exactly gets cached and how to inspect it.
Several utilities can help examine cache contents:
# Basic cache statistics
arcstat.py
zfs-stats
# Detailed L2ARC inspection
zdb -l /dev/disk/by-id/your-l2arc-device
The ARC primarily caches:
- Frequently accessed data blocks
- Metadata (directory entries, ZPL structures)
- Deduplication tables
To view ARC statistics:
# Show ARC breakdown
echo ::arc -v | mdb -k
# Alternative using DTrace
dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @[probename] = count(); }'
The L2ARC holds blocks copied from the tail of the ARC shortly before they would be evicted. To inspect:
# List L2ARC device contents
zdb -vvvv -l /dev/your-l2arc-device
# Show L2ARC header
echo "::l2arc -v" | mdb -k
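ZFS has no standard tool that maps files to ARC-resident blocks, and vnstat is a network traffic monitor, so it cannot report this. As a rough substitute on Linux, fincore (util-linux) reports pages held in the OS page cache, which for ZFS mainly covers mmap'd files rather than the ARC:
#!/bin/bash
for file in /pool/dataset/*; do
    [ -f "$file" ] || continue   # skip directories and special files
    # PAGES is the number of the file's pages resident in the page cache
    pages=$(fincore -n -r -o PAGES "$file" 2>/dev/null)
    if [ "${pages:-0}" -gt 0 ]; then
        echo "$file has cached pages"
    fi
done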
For detailed metrics:
# ARC size breakdown
kstat -p zfs:0:arcstats:size
# L2ARC statistics
kstat -p 'zfs:0:arcstats:l2*'
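The raw counters are easier to read as hit ratios. The following is a minimal sketch, assuming the usual `kstat -p` line format of `module:instance:name:statistic`, whitespace, then the value. The helper names `parse_kstat` and `hit_ratio` and the sample numbers are made up for illustration.

```python
def parse_kstat(text):
    """Parse `kstat -p` style output into {statistic: int}, skipping non-integer values."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts
        name = key.rsplit(":", 1)[-1]  # keep only the statistic name
        try:
            stats[name] = int(value)
        except ValueError:
            continue  # e.g. "class misc" or floating-point timestamps
    return stats


def hit_ratio(hits, misses):
    """Fraction of lookups that were hits; 0.0 when there were no lookups."""
    total = hits + misses
    return hits / total if total else 0.0


# Made-up sample in the assumed format, not real measurements.
sample = (
    "zfs:0:arcstats:hits\t900\n"
    "zfs:0:arcstats:misses\t100\n"
    "zfs:0:arcstats:l2_hits\t30\n"
    "zfs:0:arcstats:l2_misses\t70\n"
    "zfs:0:arcstats:class\tmisc\n"
)
stats = parse_kstat(sample)
print(f"ARC hit ratio:   {hit_ratio(stats['hits'], stats['misses']):.1%}")
print(f"L2ARC hit ratio: {hit_ratio(stats['l2_hits'], stats['l2_misses']):.1%}")
```

To use it on a live system, the output of the kstat command can be passed to `parse_kstat` in place of `sample`.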
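Create a simple pie chart of the ARC data/metadata split:
#!/usr/bin/env python3
import subprocess
import matplotlib.pyplot as plt
# text=True returns str instead of bytes, so the dict keys match the lookups below
arc = subprocess.check_output(["kstat", "-p", "zfs::arcstats:*"], text=True).splitlines()
# Each line is "module:instance:name:statistic<TAB>value"; skip lines without a value
pairs = [line.split(None, 1) for line in arc]
data = {p[0]: p[1] for p in pairs if len(p) == 2}
plt.pie([float(data["zfs:0:arcstats:data_size"]),
         float(data["zfs:0:arcstats:metadata_size"])],
        labels=["Data", "Metadata"])
plt.title("ARC Cache Distribution")
plt.show()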
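For real-time cache monitoring (kernel function names and header flags vary between ZFS releases, so confirm the probe exists first with dtrace -ln 'fbt::arc_hdr_move:entry'):
#!/usr/sbin/dtrace -s
#pragma D option quiet
/* Count header moves, split into metadata and data */
fbt::arc_hdr_move:entry
{
    @[args[0]->b_flags & B_METADATA ? "Metadata" : "Data"] = count();
}
/* Print and reset the counts every 10 seconds */
tick-10s
{
    printa("%-10s %@d\n", @);
    trunc(@);
}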
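Cumulative counters such as hits and misses only give a real-time view when they are sampled twice and differenced. The sketch below shows that step on two snapshots held as plain dictionaries. The function names and the snapshot values are hypothetical, and collecting the snapshots (for example from kstat) is left out.

```python
def interval_rates(prev, curr, seconds):
    """Per-second rate of each counter that appears in both snapshots."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return {name: (curr[name] - prev[name]) / seconds
            for name in prev if name in curr}


def interval_hit_ratio(prev, curr):
    """Hit ratio over the interval only, ignoring history before `prev`."""
    hits = curr["hits"] - prev["hits"]
    misses = curr["misses"] - prev["misses"]
    total = hits + misses
    return hits / total if total else 0.0


# Hypothetical snapshots taken 10 seconds apart (made-up numbers).
before = {"hits": 1000, "misses": 200}
after = {"hits": 1600, "misses": 300}

rates = interval_rates(before, after, 10)
ratio = interval_hit_ratio(before, after)
print(f"hits/s: {rates['hits']:.0f}  misses/s: {rates['misses']:.0f}  "
      f"hit ratio: {ratio:.1%}")
```

Differencing matters because the lifetime hit ratio of a long-running system barely moves even when the current workload is missing the cache heavily.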
ZFS employs a sophisticated caching architecture with two primary components:
1. ARC (Adaptive Replacement Cache) - RAM-based primary cache
2. L2ARC (Level 2 ARC) - SSD-based secondary cache
The ARC keeps recently and frequently accessed blocks in memory, while the L2ARC extends this cache onto fast storage such as SSDs.
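The relationship between the two tiers can be illustrated with a toy model. This is a deliberately simplified sketch: both tiers use plain LRU, and a block enters the second tier at the moment it leaves the first. The real ARC balances recency against frequency, and the real L2ARC is filled differently, so the `TwoTierCache` class below is hypothetical and only shows how a second tier turns some misses into cheaper hits.

```python
from collections import OrderedDict


class TwoTierCache:
    """Toy two-level LRU cache: l1 stands in for ARC, l2 for L2ARC."""

    def __init__(self, l1_size, l2_size):
        self.l1_size = l1_size
        self.l2_size = l2_size
        self.l1 = OrderedDict()
        self.l2 = OrderedDict()

    def read(self, block):
        """Return 'l1 hit', 'l2 hit' or 'miss', then cache the block in l1."""
        if block in self.l1:
            self.l1.move_to_end(block)  # mark as most recently used
            return "l1 hit"
        result = "l2 hit" if block in self.l2 else "miss"
        self._insert_l1(block)
        return result

    def _insert_l1(self, block):
        self.l1[block] = True
        if len(self.l1) > self.l1_size:
            victim, _ = self.l1.popitem(last=False)  # least recently used
            self.l2[victim] = True
            self.l2.move_to_end(victim)
            if len(self.l2) > self.l2_size:
                self.l2.popitem(last=False)


cache = TwoTierCache(l1_size=2, l2_size=2)
results = [cache.read(b) for b in ["a", "b", "c", "a", "c", "d"]]
print(results)
```

In this run the second read of "a" is served from the second tier because "a" was pushed out of the first tier when "c" arrived.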
To examine cache contents, we'll use these tools:
zdb # On-disk pool and vdev label inspection
arcstat.py # Real-time ARC monitoring
dtrace # Low-level tracing
kstat # Kernel statistics
Use arc_summary to summarize the ARC (zdb reads on-disk structures and does not report the live in-kernel ARC):
# Show ARC summary (arc_summary ships with OpenZFS)
arc_summary
# Detailed breakdown
echo ::arc | sudo mdb -k
Sample output showing cached block types:
ARC breakdown:
data: 45.3% (3.2GB)
metadata: 38.1% (2.7GB)
other: 16.6% (1.1GB)
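A breakdown like this can be derived from raw byte counts. The sketch below assumes the sizes have already been collected into a dictionary. The category names, the sizes, and the helper functions are illustrative, not output from a real system.

```python
def breakdown(sizes):
    """Map each category to its percentage share of the total size."""
    total = sum(sizes.values())
    return {name: (100.0 * size / total if total else 0.0)
            for name, size in sizes.items()}


def format_breakdown(sizes):
    """Render the shares in the same layout as the sample output."""
    lines = ["ARC breakdown:"]
    for name, pct in breakdown(sizes).items():
        label = name + ":"
        gib = sizes[name] / 2**30  # bytes to GiB
        lines.append(f"{label:<10} {pct:.1f}% ({gib:.1f}GB)")
    return "\n".join(lines)


# Made-up sizes in bytes.
sizes = {"data": 3 * 2**30, "metadata": 2 * 2**30, "other": 1 * 2**30}
shares = breakdown(sizes)
print(format_breakdown(sizes))
```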
For L2ARC analysis:
# List L2ARC devices
sudo zpool iostat -v | grep -A 2 cache
# Show L2ARC header
sudo zdb -l /dev/disk/by-id/ata-INTEL_SSD_X25-M_*
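DTrace script to sum time spent in ARC reads per process (arc_read_nolock is present only in older ZFS releases, so probe arc_read where it is missing; its first argument is a zio_t, not a vnode, so no file path is available here):
#!/usr/sbin/dtrace -s
fbt::arc_read_nolock:entry
{
    self->start = timestamp;
}
fbt::arc_read_nolock:return
/self->start/
{
    /* Total nanoseconds spent in arc_read_nolock, keyed by process name */
    @[execname] = sum(timestamp - self->start);
    self->start = 0;
}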
To see which filesystems are eligible for caching, check the primarycache and secondarycache properties:
zfs get primarycache,secondarycache
# System-wide hit/miss counters (ARC kstats are not kept per filesystem)
kstat -p 'zfs:0:arcstats:*' | grep -E 'hits|misses'
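1. Find the most frequently read files (the ARC does not track file names, so per-file read counts serve as a proxy; on illumos the first argument of zfs_read is the vnode):
sudo dtrace -n 'fbt::zfs_read:entry {
    @[stringof(args[0]->v_path)] = count();
}'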
2. Track cache age distribution:
sudo mdb -k
> ::walk arc | ::print -a arc_buf_hdr_t b_arc_access
Remember these cache characteristics:
- Each block stored in the L2ARC needs a header in RAM, roughly 70 bytes per block (the exact size depends on the ZFS version)
- Metadata gets priority in ARC
- Frequent cache flushes occur during heavy writes
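The per-block header cost translates directly into RAM overhead for a given cache device. The following back-of-the-envelope sketch takes the roughly 70 bytes per block quoted above as its default. The device size and block sizes are assumptions chosen for illustration, and real pools hold a mix of block sizes.

```python
def l2arc_header_ram(l2arc_bytes, block_bytes, header_bytes=70):
    """Estimate RAM used by L2ARC headers, assuming uniform block size."""
    blocks = l2arc_bytes // block_bytes
    return blocks * header_bytes


GIB = 2**30
KIB = 2**10

# Hypothetical 400 GiB cache device filled with 128 KiB vs 8 KiB blocks.
large = l2arc_header_ram(400 * GIB, 128 * KIB)
small = l2arc_header_ram(400 * GIB, 8 * KIB)

print(f"128 KiB blocks: {large / 2**20:.1f} MiB of headers")
print(f"  8 KiB blocks: {small / GIB:.2f} GiB of headers")
```

The smaller the blocks, the more headers the same device needs, which is why a large L2ARC in front of a small-block workload can take a noticeable share of the ARC.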