Performance Benchmark Analysis: Samba vs NFS vs GlusterFS for Small File Writes in Web Server Environments


During infrastructure planning for our web server farm, we conducted extensive NAS filesystem performance tests with surprising results. The benchmark focused on small file write operations, a critical metric for web applications like WordPress that consist of numerous small PHP, CSS, and JavaScript files. Using a WordPress tar.gz extraction as our test case (approximately 1,800 small files), we observed:

# Test command used for all protocols:
rsync -a --delete /source/wordpress/ /mnt/nas_mount/wordpress/
Protocol        Configuration           Time (seconds)   CPU Load
GlusterFS       Replicated (2 nodes)    32-35            High
GlusterFS       Single node             14-16            High
GlusterFS+NFS   NFS client              16-19            High
NFS             Kernel server (sync)    32-36            Low
NFS             Kernel server (async)   3-4              Low
Samba           Default config          4-7              Medium
Direct disk     ext4                    <1               Minimal

The surprising Samba performance advantage stems from several architectural factors:

  • Protocol Optimization: SMB3 includes advanced features like compounded operations that reduce round trips (see the dialect check after this list)
  • Caching Behavior: Samba implements more aggressive write-behind caching than NFS in sync mode
  • Metadata Handling: Samba batches metadata operations more efficiently for small files
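
One way to sanity-check the protocol point is to compare dialects directly. With a Samba 4.x client, smbclient's -m/--max-protocol flag caps the negotiated dialect; the share name and user below are hypothetical:

# Compare an older dialect against SMB3 on the same share (hypothetical
# //nas/webdata share and webuser account):
time smbclient //nas/webdata -U webuser -m NT1  -c 'put latest.tar.gz'
time smbclient //nas/webdata -U webuser -m SMB3 -c 'put latest.tar.gz'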

For improved NFS performance with small files, consider these adjustments:

# /etc/exports optimization for small files:
/mnt/export  *(rw,sync,no_wdelay,no_subtree_check,fsid=0)

# Server-side tuning that can help NFS performance with many small files
# (on Debian, persist the thread count via RPCNFSDCOUNT in
# /etc/default/nfs-kernel-server):
rpc.nfsd 16                                   # raise the nfsd thread count
echo 5 > /proc/sys/vm/dirty_background_ratio  # start writeback earlier
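
Client-side mount options matter as much as the export line. A reasonable starting point for small-file workloads (server name and export path are placeholders):

# NFS client mount tuned for many small files (hypothetical server/export):
mount -t nfs -o rw,hard,intr,noatime,rsize=32768,wsize=32768 \
    nas01:/mnt/export /mnt/nas_mount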

Our tests showed XFS outperforming ext4 by up to 40% for this workload. Recommended mount options:

# XFS mount options for small file performance:
mount -o noatime,nodiratime,logbsize=256k,allocsize=1m,inode64 /dev/sdx /mnt/nas
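
To isolate the filesystem from the network stack, a crude micro-benchmark like the following (sized to roughly match the WordPress file count) is enough to reproduce the ext4/XFS gap; the target directory is hypothetical:

# Write 1,800 4 KB files and force them to disk; repeat per filesystem:
mkdir -p /mnt/nas/bench
time bash -c 'for i in $(seq 1 1800); do
    dd if=/dev/zero of=/mnt/nas/bench/f$i bs=4k count=1 2>/dev/null
done; sync'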

Based on our findings, we recommend:

  1. For pure Linux environments needing best small file performance: Samba with XFS backend
  2. For mixed-OS environments: Samba with XFS + 'strict sync = no' parameter (see the sketch after this list)
  3. For large file workloads: NFS async mode with appropriate backup strategy
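
For recommendation 2, a minimal share definition might look like the sketch below. The share name and path are hypothetical; 'strict sync = no' tells Samba not to flush on every client sync request, trading a little durability for speed:

# Minimal smb.conf share for mixed-OS small-file workloads (hypothetical
# share name and path):
[webdata]
    path = /mnt/nas/webdata
    read only = no
    strict sync = no   # do not fsync on every client sync request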

To validate our results, you can reproduce with:

# Create test environment
mkdir -p /test/source /test/dest
wget https://wordpress.org/latest.tar.gz
tar xzf latest.tar.gz -C /test/source

# Time the transfer (run for each protocol)
time rsync -a --delete /test/source/wordpress/ /mnt/nas_mount/wordpress/
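
For fair comparisons, flush dirty pages and drop the page cache between runs so each protocol starts cold (run as root):

# Reset caches between benchmark runs:
sync
echo 3 > /proc/sys/vm/drop_caches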


For reference, our tests were performed on EC2 m1.small instances:

  • Source: Ephemeral disk
  • Target: EBS volumes
  • Filesystems tested: ext4 and XFS (XFS showed ~40% better performance)
  • Samba config: Default Debian Squeeze package with only "sync always = yes" added
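
In other words, the tested configuration deviated from the stock smb.conf by a single line:

# Only change to the stock Debian Squeeze smb.conf during the test:
[global]
    sync always = yes   # fsync after every write, matching NFS sync semantics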

The most striking finding was Samba's superior small file performance compared to both NFS in sync mode and GlusterFS. This contradicts the conventional wisdom that NFS should outperform SMB/CIFS in Linux environments.

The dramatic difference between NFS sync (32-36s) and async (3-4s) modes suggests the sync writes are bottlenecked by disk I/O wait times. Here's how to check NFS server stats:


# Check NFS server statistics
nfsstat -s

# Monitor per-mount NFS client I/O statistics every 2 seconds
nfsiostat 2
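
To confirm the I/O-wait theory, watch the backing disk on the server while the sync-mode test runs; high await and %util with modest throughput points at the disk rather than the network (iostat ships with the sysstat package):

# Extended per-device statistics every 2 seconds:
iostat -x 2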

While our tests used default settings, these Samba tweaks could further improve performance:


[global]
    socket options = TCP_NODELAY IPTOS_LOWDELAY SO_RCVBUF=65536 SO_SNDBUF=65536
    strict locking = no           # skip byte-range lock checks on every I/O
    use sendfile = yes            # zero-copy reads for unlocked files
    min receivefile size = 16384  # receive larger writes directly to file
    write cache size = 262144     # 256 KB write-behind cache per oplocked file
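
After editing smb.conf, validate the syntax and reload the running daemons without a restart:

# Check the config parses cleanly, then reload all smbd processes:
testparm -s
smbcontrol all reload-config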

The high CPU usage in GlusterFS suggests protocol overhead. Consider these volume parameters:


# Create optimized GlusterFS volume
gluster volume create webdata replica 2 transport tcp server1:/brick1 server2:/brick1
gluster volume set webdata performance.cache-size 256MB
gluster volume set webdata performance.io-thread-count 16
gluster volume set webdata performance.write-behind-window-size 4MB
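
Remember to start the volume and confirm the tuned options took effect:

# Start the volume and verify its settings:
gluster volume start webdata
gluster volume info webdata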

XFS consistently outperformed ext4 in these tests. For web server workloads, consider these mount options:


# /etc/fstab example for XFS
/dev/sdb1 /webdata xfs defaults,noatime,nodiratime,logbsize=256k 0 0
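
After mounting, confirm the options are active; logbsize and friends show up in /proc/mounts:

# Verify the live mount options for the XFS volume:
grep /webdata /proc/mounts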

For completeness, consider testing these additional protocols:


# WebDAV benchmark example
time cadaver http://server/webdav/ <<EOF
mput wordpress/*
EOF

# SSHFS test (mount first, then time the same rsync)
sshfs user@server:/export /mnt/sshfs
time rsync -a wordpress/ /mnt/sshfs/wordpress/