Atomic File Operations in NFS: Handling Concurrent Read/Write Access Safely



Network File System (NFS) presents unique challenges when multiple clients attempt simultaneous file operations. Unlike local filesystems, NFS must handle network latency, client caching, and protocol limitations.

The behavior differs between NFS versions:

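  • NFSv3: Implements weak consistency - clients rely on close-to-open cache consistency, so readers might see partial writes while another client is still writing
  • NFSv4: Introduces stronger consistency guarantees with a stateful protocol (opens, locks, and delegations are tracked by the server)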

Consider this dangerous scenario:


// Client A writing large file
write(fd, large_buffer, large_size);

// Client B reading simultaneously
read(fd, buffer, size); // Might get partial/inconsistent data

The safest approach is atomic file replacement:


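// Writing client
1. Create a new file (file.tmp) in the same directory as the target
2. Write the complete content
3. fsync() to flush the data to the server's stable storage
4. rename("file.tmp", "file.txt")

// A client that opens file.txt sees either the old or the new complete file, never a mix.
// A client that already has the old file open may get ESTALE once it is replaced,
// and attribute caching can delay when other clients notice the new file.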

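NFSv4 builds file locking into the protocol (NFSv3 relies on the separate NLM protocol), but with caveats: the locks are advisory, so they only protect against clients that also take the lock, and on Linux flock() on an NFS mount is emulated with fcntl() byte-range locks covering the whole file:


// Advisory locking example
flock(fd, LOCK_EX);  // Exclusive lock; on NFS, fd must be open for writing
// Perform operations
flock(fd, LOCK_UN);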

Consider a CMS uploading images via NFS:


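import os
import uuid

def safe_upload(dest_path, content):
    # Temp file sits next to the destination so the rename stays on one filesystem
    temp_path = f"{dest_path}.{uuid.uuid4()}.tmp"
    with open(temp_path, 'wb') as f:
        f.write(content)
        f.flush()
        os.fsync(f.fileno())  # Push the data to the server before publishing
    os.replace(temp_path, dest_path)  # Atomic swap; overwrites an existing file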

Add verification when critical:


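#include <sys/stat.h>

/* Returns 1 if path is a non-empty regular file. This is only a sanity
   check: it cannot tell a complete file from a truncated one. */
int verify_complete(const char *path) {
    struct stat st;
    if (stat(path, &st) == -1) return 0;
    return (st.st_size > 0) && (S_ISREG(st.st_mode));
}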
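
The size check above only confirms that a non-empty regular file exists. Where it matters, a content check is stronger. The following is a minimal Python sketch, assuming the writer can hand the reader an expected SHA-256 digest (for example in a sidecar file or a database row). The names file_sha256 and verify_complete are hypothetical helpers, not part of any NFS tooling.

```python
import hashlib
import os


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_complete(path, expected_hex):
    """True only if path is a regular file whose content hashes to expected_hex."""
    if not os.path.isfile(path):
        return False
    # Tolerate surrounding whitespace and upper-case hex in the expected value
    return file_sha256(path) == expected_hex.strip().lower()
```

Hashing reads the whole file over the network, so this is best reserved for files where a truncated copy would be costly.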

Options with different consistency guarantees:

Method          Consistency  Performance
Direct write    Weak         Best
Atomic replace  Strong       Good
Full locking    Strongest    Worst

When dealing with Network File System (NFS) in multi-user environments, concurrent file access becomes a critical concern. Unlike local file systems, NFS introduces additional complexity due to its distributed nature and potential network latency issues.

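NFSv2 and NFSv3 use a stateless protocol, which means the server doesn't maintain client state information between requests (NFSv4, by contrast, is stateful). Either way, nothing in plain file I/O stops two clients from doing this at the same time: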


# Example showing basic NFS operations
with open('/nfs_mount/file.txt', 'r') as f:  # Client reading
    data = f.read()

# Simultaneously on another client
with open('/nfs_mount/file.txt', 'w') as f:  # Client writing
    f.write(new_data)

NFS versions implement different approaches to atomic operations:

  • NFSv3: Limited atomic operations; file locking is available only through the separate NLM protocol
  • NFSv4: Improved, with locking built into the protocol plus stateful opens and delegations

For scenarios where atomic operations are crucial, consider these patterns:


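# Safe write pattern using temporary files
import os
import tempfile

def safe_write(path, data):
    # Create the temp file in the target directory so the rename stays on one filesystem
    directory = os.path.dirname(path) or '.'
    with tempfile.NamedTemporaryFile('w', dir=directory, delete=False) as tmp:
        tmp.write(data)
        tmp.flush()
        os.fsync(tmp.fileno())  # Push the data to the server before publishing
    os.replace(tmp.name, path)  # Atomic swap; overwrites an existing file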

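NFS supports advisory locking through fcntl (Python's fcntl.lockf wraps fcntl() record locks):


import fcntl

# Advisory lock example
with open('/nfs_mount/file.txt', 'r+') as f:
    fcntl.lockf(f, fcntl.LOCK_EX)  # Exclusive lock; blocks until granted
    try:
        pass  # Critical section
    finally:
        fcntl.lockf(f, fcntl.LOCK_UN)  # Release lock even if the critical section fails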
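
A blocking lock request can hang for a long time if another client holds the lock or the lock service is unreachable. One way to bound the wait is to poll with a non-blocking request. This is a sketch only: the locked() context manager, its timeout, and its poll_interval are hypothetical names and values chosen for illustration.

```python
import errno
import fcntl
import time
from contextlib import contextmanager


@contextmanager
def locked(f, timeout=10.0, poll_interval=0.1):
    """Hold an exclusive advisory lock on open file f, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # LOCK_NB makes the call fail immediately instead of blocking
            fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            break
        except OSError as exc:
            # EACCES/EAGAIN mean "held by someone else"; anything else is a real error
            if exc.errno not in (errno.EACCES, errno.EAGAIN):
                raise
            if time.monotonic() >= deadline:
                raise TimeoutError("could not lock file within timeout")
            time.sleep(poll_interval)
    try:
        yield f
    finally:
        fcntl.lockf(f, fcntl.LOCK_UN)
```

The file must be opened for writing (for example with mode 'r+'), then used as `with locked(f): ...`. Because these are per-process POSIX locks, the helper coordinates separate processes or clients, not threads within one process.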

For processes that run on the NFS server itself and access the exported files directly:

  • Use proper file synchronization primitives (fsync, fdatasync)
  • Consider using memory-mapped files with appropriate synchronization
  • Implement proper error handling for stale file handles
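
The stale file handle point above can be sketched as a reader that reopens the path when a read fails with ESTALE, which can happen when the file is replaced while it is open. The function name read_with_retry and its attempts, delay, and retry_on parameters are hypothetical, and the defaults are arbitrary starting values.

```python
import errno
import time


def read_with_retry(path, attempts=3, delay=0.1, retry_on=(errno.ESTALE,)):
    """Read the whole file, reopening it if the error code is in retry_on."""
    last_error = None
    for _ in range(max(1, attempts)):
        try:
            # Reopening by name picks up whatever file the name points to now
            with open(path, "rb") as f:
                return f.read()
        except OSError as exc:
            if exc.errno not in retry_on:
                raise  # Not a transient condition; let the caller handle it
            last_error = exc
            time.sleep(delay)
    raise last_error
```

Reopening by path matters here: retrying on the same file descriptor would keep hitting the stale handle.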

Important mount options to consider:


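# /etc/fstab entry example
nfs-server:/export  /mnt/nfs  nfs  rw,sync,hard  0 0

On the client, the 'sync' option makes each write reach the server before the system call returns, while 'async' (the default) lets the client cache and batch writes, which offers better performance but weaker consistency guarantees. Export options such as no_subtree_check belong in /etc/exports on the server, not in the client's fstab.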
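
To confirm which options a mount actually ended up with, the mount table can be inspected. The sketch below parses text in the /proc/mounts format (Linux-specific). The helper name nfs_mount_options is hypothetical, and it assumes the mount point contains no spaces, since those are escaped in /proc/mounts.

```python
def nfs_mount_options(mounts_text, mount_point):
    """Return the options of the NFS mount at mount_point as a dict, or None if absent."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # Skip blank or malformed lines
        _device, target, fstype, options = fields[:4]
        if target == mount_point and fstype in ("nfs", "nfs4"):
            parsed = {}
            for opt in options.split(","):
                # "vers=4.2" becomes {"vers": "4.2"}; bare flags like "sync" map to True
                key, _, value = opt.partition("=")
                parsed[key] = value or True
            return parsed
    return None
```

A typical call would read the table with `open("/proc/mounts").read()` and then check, for example, whether "sync" is among the returned keys for /mnt/nfs.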

Key tools for diagnosing NFS issues:

  • nfsstat -c (client) and nfsstat -s (server) for RPC and NFS statistics
  • mountstats for detailed performance metrics
  • Wireshark for packet-level analysis