Performance Benchmark Analysis: Samba vs NFS vs GlusterFS for Small File Writes in Web Server Environments



During infrastructure planning for our web server farm, we conducted extensive NAS filesystem performance tests with surprising results. The benchmark focused on small file write operations, a critical metric for web applications such as WordPress with their many small PHP, CSS, and JavaScript files. Using a WordPress tar.gz extraction as our test case (approximately 1,800 small files), we observed:

# Test command used for all protocols:
rsync -a --delete /source/wordpress/ /mnt/nas_mount/wordpress/

Protocol        Configuration            Time (seconds)   CPU load
GlusterFS       Replicated (2 nodes)     32-35            High
GlusterFS       Single node              14-16            High
GlusterFS+NFS   NFS client               16-19            High
NFS             Kernel server (sync)     32-36            Low
NFS             Kernel server (async)    3-4              Low
Samba           Default config           4-7              Medium
Direct disk     ext4                     <1               Minimal

The surprising Samba performance advantage stems from several architectural factors:

  • Protocol Optimization: SMB supports request chaining (AndX in SMB1, compound requests in SMB2/3), which reduces round trips for small-file workloads
  • Caching Behavior: Samba implements more aggressive write-behind caching than NFS in sync mode
  • Metadata Handling: Samba batches metadata operations more efficiently for small files
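
To confirm which dialect clients actually negotiate, recent Samba versions report it per connection; the hostname and user below are placeholders:

# Show connected sessions; on Samba >= 4.0 this includes the
# negotiated protocol version per client
smbstatus

# Cap the dialect for an A/B comparison against older protocol versions
smbclient -L //server -m SMB3 -U user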

For improved NFS performance with small files, consider these adjustments:

# /etc/exports optimization for small files:
/mnt/export  *(rw,sync,no_wdelay,no_subtree_check,fsid=0)
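
After editing /etc/exports, re-export and confirm the effective options:

# Apply changes without restarting the NFS server
exportfs -ra

# List exports with the options actually in effect
exportfs -v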

# Kernel write-back tunables that can help NFS write throughput
# (they cannot remove the per-file commit latency of a sync export):
echo 40 > /proc/sys/vm/dirty_ratio
echo 10 > /proc/sys/vm/dirty_background_ratio
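
To make the write-back settings persistent across reboots, use the standard sysctl mechanism:

# /etc/sysctl.conf (or a drop-in under /etc/sysctl.d/)
vm.dirty_ratio = 40
vm.dirty_background_ratio = 10

# Apply immediately
sysctl -p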

Our tests showed XFS outperforming ext4 by up to 40% for this workload. Recommended mount options:

# XFS mount options for small file performance:
mount -o noatime,nodiratime,logbsize=256k,allocsize=1m,inode64 /dev/sdx /mnt/nas
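
To confirm the options took effect after mounting:

# Live mount options for the target mount
grep /mnt/nas /proc/mounts

# Filesystem geometry (block and log sizes)
xfs_info /mnt/nas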

Based on our findings, we recommend:

  1. For pure Linux environments needing best small file performance: Samba with XFS backend
  2. For mixed-OS environments: Samba with XFS + 'strict sync = no' parameter (see the sketch after this list)
  3. For large file workloads: NFS async mode with appropriate backup strategy
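
A minimal smb.conf sketch for recommendation 2; the share name and path are placeholders, and note that 'strict sync = no' tells Samba to ignore client-requested syncs, trading durability for speed:

[global]
    strict sync = no

[webdata]
    path = /webdata
    read only = no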

To validate our results, you can reproduce with:

# Create test environment
mkdir -p /test/source /test/dest
wget https://wordpress.org/latest.tar.gz
tar xzf latest.tar.gz -C /test/source

# Time the transfer (run for each protocol)
time rsync -a --delete /test/source/wordpress/ /mnt/nas_mount/wordpress/
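
Warm-cache runs can flatter the results; a sketch that repeats the transfer with a cold page cache each time (requires root):

# Three timed runs, clearing the target and page cache between them
for i in 1 2 3; do
    rm -rf /mnt/nas_mount/wordpress
    sync
    echo 3 > /proc/sys/vm/drop_caches
    time rsync -a --delete /test/source/wordpress/ /mnt/nas_mount/wordpress/
done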


All tests were performed on EC2 m1.small instances:

  • Source: Ephemeral disk
  • Target: EBS volumes
  • Filesystems tested: ext4 and xfs (xfs showed 40% better performance)
  • Samba config: Default Debian Squeeze package with only "sync always = yes" added (sketched below)
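
For reference, a sketch of that configuration; the share name and path are assumptions, everything else is the Squeeze default plus the single added line:

[global]
    # Only non-default setting used in the benchmark
    sync always = yes

[nas_mount]
    # Placeholder share; the actual share name and path were not recorded
    path = /mnt/nas
    read only = no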

The most striking finding was Samba's superior performance in small file operations compared to both NFS and GlusterFS. This contradicts conventional wisdom that NFS should outperform SMB/CIFS in Linux environments.

The dramatic difference between NFS sync (32-36s) and async (3-4s) modes suggests the sync writes are bottlenecked by disk I/O wait times. Here's how to check NFS server stats:


# Check NFS server statistics
nfsstat -s

# Monitor NFS operations in real-time
nfsiostat 2
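
To confirm that sync-mode writes are blocked on the backing store rather than the network, watch device-level latency during a run (iostat is part of the sysstat package):

# Extended device statistics every 2 seconds; sustained high await and
# %util on the EBS device during the sync run points at commit latency
iostat -x 2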

Our tests used the default configuration, but these Samba tweaks could further improve performance:


[global]
    socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=65536 SO_SNDBUF=65536
    strict locking = no
    use sendfile = yes
    min receivefile size = 16384
    write cache size = 262144
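
Validate and apply the changes; both tools ship with Samba:

# Check smb.conf syntax and print the effective settings
testparm -s

# Reload the configuration without dropping client connections
smbcontrol smbd reload-config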

The high CPU usage in GlusterFS suggests protocol overhead. Consider these volume parameters:


# Create optimized GlusterFS volume
gluster volume create webdata replica 2 transport tcp server1:/brick1 server2:/brick1
gluster volume set webdata performance.cache-size 256MB
gluster volume set webdata performance.io-thread-count 16
gluster volume set webdata performance.write-behind-window-size 4MB
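
To verify the options and observe per-operation latency while the benchmark runs:

# Confirm the volume options took effect
gluster volume info webdata

# Built-in profiling: per-FOP latency statistics
gluster volume profile webdata start
gluster volume profile webdata info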

As noted above, XFS consistently outperformed ext4 in these tests. For a persistent setup, add the mount options to /etc/fstab:


# /etc/fstab example for XFS
/dev/sdb1 /webdata xfs defaults,noatime,nodiratime,logbsize=256k 0 0

For completeness, consider testing these additional protocols:


# WebDAV benchmark example
time cadaver http://server/webdav/ <<EOF
mput wordpress/*
EOF

# SSHFS test (mount first; user@server:/export is a placeholder)
sshfs user@server:/export /mnt/sshfs
time rsync -a /test/source/wordpress/ /mnt/sshfs/wordpress/