The Origin and Best Practices of Naming ZFS Pools as “tank”



In the early days of ZFS development at Sun Microsystems, engineers needed a simple, memorable name for example storage pools. "tank" emerged as the go-to placeholder because:

  • Short (4 letters) and easy to type during demos
  • Unlikely to conflict with real-world naming schemes
  • Metaphorically represents a "storage tank" for data

While "tank" remains popular in documentation, production systems typically use more descriptive names. Here's a common naming convention I've seen in enterprise environments:

# Standard pool creation
zpool create tank mirror sda sdb

# Production naming examples
zpool create primary-ssd mirror nvme0n1 nvme1n1
zpool create backup-hdd raidz2 sdb sdc sdd sde

The name still serves valid purposes today:

  • Testing environments - Quick setup for experiments
  • Documentation examples - Consistent reference point
  • Temporary storage - When the purpose is truly generic

For systems with multiple pools, I recommend this approach:

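# Media server example
# The globs are illustrative only: by-id patterns also match partition links
# (e.g. *-part1) and every disk on that bus, so list exact whole-disk IDs in practice.
zpool create media-4k raidz2 /dev/disk/by-id/ata-*
zpool create media-1080p raidz /dev/disk/by-id/scsi-*
zpool create scratch mirror ssd1 ssd2  # For temporary work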

The key principles are:

  • Include the pool's purpose or content type
  • Indicate the media type or redundancy level when helpful
  • Maintain consistency across your infrastructure
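
As a rough illustration of these principles, the sketch below composes a pool name from a purpose and an optional qualifier (media type or redundancy level). The function name make_pool_name and the "-" separator are assumptions made for this example, not a ZFS convention.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ZFS): build "<purpose>-<qualifier>".
make_pool_name() {
    local purpose="$1"
    local qualifier="${2:-}"   # optional, e.g. "ssd" or "raidz2"
    local name="$purpose"
    if [ -n "$qualifier" ]; then
        name="${purpose}-${qualifier}"
    fi
    # Lowercase everything so names stay consistent across hosts.
    printf '%s\n' "$name" | tr '[:upper:]' '[:lower:]'
}

make_pool_name media raidz2    # prints: media-raidz2
make_pool_name Scratch         # prints: scratch
```

Routing every name through one helper like this is one way to keep the convention uniform when pools are created from several scripts.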

The naming convention of "tank" for ZFS pools traces back to Sun Microsystems' original ZFS documentation. During Sun's Solaris development in the early 2000s, engineers needed a simple, memorable name for example configurations that wouldn't conflict with existing system terminology.

Several technical reasons made "tank" an ideal choice:

  • Short (4 letters) and easy to type in command-line operations
  • Unlikely to conflict with system directories (/tank doesn't exist by default)
  • Mnemonic value - imagine data "storage tanks"
  • Distinct from traditional Unix names like "pool" or "data"

In multi-pool systems, administrators typically follow these patterns:

# Production pools often get descriptive names
zpool create prod_data mirror sda sdb
zpool create backup raidz2 sdc sdd sde sdf

# While keeping 'tank' for testing/scratch
zpool create tank mirror sdg sdh

Enterprise environments demonstrate various approaches:

# Financial institution example
zpool create transactions mirror nvme0n1 nvme1n1
zpool create reports raidz2 sda sdb sdc sdd
zpool create tank mirror sde sdf  # Temporary testing pool

# Cloud deployment pattern
zpool create "${HOSTNAME}_fast" mirror nvme0n1 nvme1n1
zpool create "${HOSTNAME}_bulk" raidz2 /dev/sd[a-f]
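
A hostname may include a domain part or start with a digit, so a script using this pattern may want to normalize the prefix first. The following is a minimal sketch: pool_prefix_from_host is a hypothetical helper, and the "h" fallback prefix is an arbitrary choice. It assumes the documented rule that pool names begin with a letter, and it deliberately restricts itself to letters, digits, "_" and "-".

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a hostname into a safe pool-name prefix.
pool_prefix_from_host() {
    local host="${1%%.*}"      # keep only the short hostname (drop the domain)
    # Replace every character outside [A-Za-z0-9_-] with an underscore.
    host="$(printf '%s' "$host" | LC_ALL=C tr -c 'A-Za-z0-9_-' '_')"
    # Pool names must begin with a letter; prepend "h" otherwise (assumed choice).
    case "$host" in
        [A-Za-z]*) ;;
        *) host="h${host}" ;;
    esac
    printf '%s\n' "$host"
}

# Example use (not executed here):
#   zpool create "$(pool_prefix_from_host "$HOSTNAME")_fast" mirror nvme0n1 nvme1n1
```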

Modern best practices suggest more descriptive names:

# Good alternatives
zpool create fast_ssd mirror nvme0n1 nvme1n1
zpool create bulk_hdd raidz2 /dev/sd[a-f]

# With an explicit pool property (ashift=12 selects 4 KiB sectors)
zpool create -o ashift=12 db_primary mirror nvme0n1 nvme1n1
zpool create -o ashift=12 db_secondary raidz2 sda sdb sdc

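The "tank" convention also matters in scripts, where the pool name is better passed as a parameter than hardcoded:

#!/bin/bash
set -e  # Stop if pool creation fails, so the dataset step is skipped

# Instead of hardcoding 'tank'
POOL_NAME="${1:-tank}" # Default to tank but allow override

zpool create "$POOL_NAME" mirror sda sdb
zfs create "$POOL_NAME/datasets"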
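
A script that defaults to a common name such as "tank" can collide with a pool that is already imported. One possible guard, sketched below, checks for the name before creating anything. The wrapper create_pool_if_absent is a hypothetical name; it relies only on `zpool list -H -o name <pool>` returning a non-zero status when the pool does not exist.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: create a pool only if that name is not already in use.
# Usage: create_pool_if_absent <pool> <vdev spec...>
create_pool_if_absent() {
    local pool="$1"
    shift
    # zpool list exits non-zero when no imported pool has this name.
    if zpool list -H -o name "$pool" >/dev/null 2>&1; then
        echo "pool '$pool' already exists; not creating it" >&2
        return 1
    fi
    zpool create "$pool" "$@"
}

# Example use (not executed here):
#   create_pool_if_absent "${1:-tank}" mirror sda sdb
```

The check only sees imported pools, so an exported pool with the same name on attached disks would not be detected by this sketch.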