When dealing with sustained write throughput that peaks around 1.1GB/s (50 × 75GB per hour ≈ 3.75TB/hour), we need to consider both the storage medium and the filesystem architecture. The NFS requirement adds another layer of complexity to the solution.
A hybrid approach combining SSDs and HDDs is practical for this scale:
# Example ZFS pool configuration for tiered storage:
# two RAIDZ2 data vdevs for bulk capacity, a mirrored SLOG for the
# ZFS intent log, and L2ARC cache devices (ZFS does not mirror L2ARC,
# cache devices are simply listed)
zpool create datapool \
    raidz2 hdd0 hdd1 hdd2 hdd3 \
    raidz2 hdd4 hdd5 hdd6 hdd7 \
    log mirror nvme0n1 nvme1n1 \
    cache nvme2n1 nvme3n1
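The pool layout above pairs well with dataset properties tuned for large sequential files. A minimal sketch; the values are starting points rather than definitive settings, and mirror the options used in the second pool example further down:

# Match large sequential files and cut metadata churn
zfs set recordsize=1M datapool
zfs set compression=lz4 datapool
zfs set atime=off datapool

# Confirm the vdev layout came out as intended before loading data
zpool status datapool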
Key ZFS parameters to adjust:
# Recommended settings for high-write workloads
echo 137438953472 > /sys/module/zfs/parameters/zfs_arc_max  # value is in bytes; 128 GiB, roughly half the RAM in the sample build below
echo 1 > /sys/module/zfs/parameters/zfs_prefetch_disable
echo "options zfs zfs_vdev_async_write_max_active=32" >> /etc/modprobe.d/zfs.conf
For the dual 10GbE links, consider LACP bonding:
# /etc/network/interfaces example
auto bond0
iface bond0 inet manual
  bond-mode 802.3ad
  bond-miimon 100
  bond-lacp-rate 1
  bond-slaves enp1s0f0 enp1s0f1
  # raise the MTU on the bond itself (propagates to the slaves),
  # otherwise the VLAN interface below cannot be set to 9000
  post-up ip link set dev bond0 mtu 9000

auto bond0.100
iface bond0.100 inet static
  address 192.168.100.10
  netmask 255.255.255.0
  mtu 9000
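After bringing the bond up, it is worth confirming that LACP actually negotiated (the switch ports must be configured for 802.3ad as well). Interface names follow the example above:

cat /proc/net/bonding/bond0     # aggregator info and per-slave LACP state
ip -d link show bond0.100       # VLAN id and MTU of the tagged interface

Keep in mind that LACP hashes per flow, so a single NFS client on one TCP connection still tops out at one 10GbE link (~1.25GB/s); the second link mainly adds headroom for multiple concurrent writers.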
Essential NFS server optimizations:
# /etc/exports configuration
/datapool 192.168.100.0/24(rw,async,no_wdelay,no_root_squash,no_subtree_check)
# Kernel parameters
echo 4096 > /proc/sys/net/core/netdev_max_backlog
echo 32768 > /proc/sys/net/core/somaxconn
echo "sunrpc.tcp_max_slot_table_entries=128" >> /etc/modprobe.d/sunrpc.conf
Implement proactive monitoring with these metrics:
# Basic monitoring commands
zpool iostat -v datapool 1
cat /proc/spl/kstat/zfs/datapool/io
nfsstat -o net -s
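To make this proactive rather than ad hoc, the same commands can be wrapped in a trivial logging loop and fed into whatever alerting you already run. A sketch; the interval, log path, and pool name are illustrative:

#!/bin/bash
# Append pool and NFS statistics to a log every 60 seconds
LOG=/var/log/storage-metrics.log
while true; do
    {
        date -Is
        zpool iostat -v datapool 1 1   # one 1-second sample
        nfsstat -o net -s
    } >> "$LOG"
    sleep 60
done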
Sample server configuration:
- 2× Intel Xeon Silver 4310 (24 cores total)
- 256GB DDR4 ECC RAM
- 4× 1TB NVMe (ZIL/SLOG and L2ARC)
- 12× 16TB HDD (RAIDZ2 vdevs)
- Dual-port 10GbE NIC
- HBA controller (not RAID)
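A quick usable-capacity check for that drive count, assuming the 12 disks are split into two 6-wide RAIDZ2 vdevs (an assumption; the pool example above used 4-wide vdevs):

# two vdevs x (6 disks minus 2 parity) x 16TB, before ZFS overhead
echo "$(( 2 * (6 - 2) * 16 )) TB usable"   # -> 128 TB, well above the 50-75TB retention target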
When dealing with peak write throughput of 1.1GB/s (50×75GB/hour), we need to consider both the storage media and filesystem architecture. The key parameters:
- Peak throughput: 1100MB/s (~1.1GB/s)
- Connection: Dual 10GbE (20Gbps theoretical)
- Protocol: NFSv4
- Data characteristics: Large sequential files
- Retention: 50-75TB total
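A quick back-of-the-envelope check that the network is not the bottleneck (protocol overhead ignored):

echo "2 * 10 / 8" | bc -l        # -> 2.5 GB/s theoretical capacity of the bonded pair
echo "1.1 / 2.5 * 100" | bc -l   # -> roughly 44% utilisation at the 1.1GB/s peak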
ZFS can handle these speeds with proper configuration. Critical ZFS parameters for high-throughput writes:
# Example zpool creation for high-speed writes
# (pool and filesystem properties must come before the pool name and vdevs):
zpool create -o ashift=12 \
    -O recordsize=1M \
    -O compression=lz4 \
    -O atime=off \
    -O xattr=sa \
    -O logbias=throughput \
    tank \
    mirror nvme0n1 nvme1n1 \
    mirror nvme2n1 nvme3n1
Key considerations:
- Use an NVMe-based SLOG device for the ZFS Intent Log to absorb sync writes (see the sketch after this list)
- Large record sizes (1M) match big sequential file patterns
- Disabling atime and enabling lz4 compression reduces metadata and I/O load
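If the NFS clients end up issuing sync writes, a mirrored log device can be attached to the pool created above. The device names here are placeholders for two spare NVMe drives:

# Attach a mirrored SLOG to the existing pool
zpool add tank log mirror nvme4n1 nvme5n1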
A tiered approach balances cost and performance:
# Example Linux device-mapper (LVM cache) setup for tiering.
# The cache pool and the cached LV must live in the same volume group,
# so the fast NVMe and the capacity HDDs go into a single VG:
pvcreate /dev/nvme0n1 /dev/sd[abcdef]
vgcreate tiered /dev/nvme0n1 /dev/sd[abcdef]

# Capacity LV on the HDDs only
lvcreate -l 100%PVS -n data_volume tiered /dev/sd[abcdef]

# Cache data and metadata LVs on the NVMe device
lvcreate -L 500G -n cache_pool tiered /dev/nvme0n1
lvcreate -L 50G  -n meta_pool  tiered /dev/nvme0n1

# Combine them into a cache pool, then attach it to the data LV
lvconvert --type cache-pool --poolmetadata tiered/meta_pool tiered/cache_pool
lvconvert --type cache --cachepool tiered/cache_pool \
    --cachemode writethrough tiered/data_volume
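Once converted, cache occupancy and hit rates can be inspected with standard LVM and device-mapper tools; VG and LV names follow the example above:

lvs -a tiered                        # shows the cached LV and its hidden [cache_pool] volumes
dmsetup status tiered-data_volume    # raw dm-cache statistics (used/total blocks, hits/misses)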
Essential NFS server configurations:
# /etc/exports configuration (the option list must be comma-separated with no spaces):
/storage 10.0.0.0/24(rw,async,no_wdelay,no_root_squash,no_subtree_check,insecure_locks,sec=sys,fsid=0)

# Recommended sysctl tweaks:
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
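On the client side, a matching NFSv4 mount could look like the following sketch. The server name and mount point are placeholders, with fsid=0 the export root is mounted as /, and the nconnect option needs a reasonably recent kernel (5.3+):

# Client-side NFSv4 mount (server name and mountpoint are placeholders)
mount -t nfs4 -o rsize=1048576,wsize=1048576,hard,timeo=600,nconnect=4 \
    storage01:/ /mnt/storage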
Essential performance tools and their usage:
# FIO benchmark example for validation (job file):
[global]
ioengine=libaio
direct=1
runtime=300
time_based

[write-test]
rw=write
bs=1M
size=100G
numjobs=4
iodepth=32
directory=/storage/test

# Monitoring commands:
zpool iostat -v tank 1
nfsiostat 1
iftop -nN -i eth0
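Saved as a job file (the file name below is arbitrary), the benchmark runs like this; if fio reports that direct=1 is unsupported on the target filesystem (older ZFS releases reject O_DIRECT), drop that line and rerun:

mkdir -p /storage/test
fio seq-write.fio

Compare the reported aggregate write bandwidth against the 1100MB/s target before putting the system into production.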
If ZFS proves problematic at scale, consider:
- Lustre filesystem with NVMe OSTs
- Ceph with BlueStore and NVMe WAL/DB devices
- Pure SSD array with hardware RAID (consider Dell ME4 series)
For budget-conscious implementations, used enterprise NVMe drives (like Intel P4610/P4510) in ZFS mirrors provide excellent price/performance.