Optimizing PostgreSQL Commit Performance: Diagnosing Slow WAL Sync Operations in RAID Configurations


While benchmarking a PostgreSQL 9.1 installation on Ubuntu 12.10 with a software RAID 1 (mdadm) array, we observed unexpectedly high COMMIT latency: 22ms on average, compared with 0.4ms on a local development machine. The test case was a simple single-row INSERT transaction:

BEGIN;
INSERT INTO test (foo) VALUES ('bar');
COMMIT;  -- This was the bottleneck

The pg_test_fsync utility revealed significant performance differences:

Server (RAID 1):
fsync: 30.524 ops/sec (≈32.8ms per operation)
fdatasync: 11.920 ops/sec (≈83.9ms per operation)

Local (Single Disk):
fsync: 34.593 ops/sec (≈28.9ms per operation) 
fdatasync: 68.871 ops/sec (≈14.5ms per operation)
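
The per-operation latencies quoted above are simply the reciprocal of the throughput pg_test_fsync reports. A minimal sketch of that arithmetic, using only the four figures listed above (the names `RESULTS` and `latency_ms` are illustrative, not part of any PostgreSQL tool):

```python
# Convert pg_test_fsync throughput (ops/sec) into per-operation latency (ms).
# The numbers are the ones reported above; the names are illustrative only.
RESULTS = {
    ("server", "fsync"): 30.524,
    ("server", "fdatasync"): 11.920,
    ("local", "fsync"): 34.593,
    ("local", "fdatasync"): 68.871,
}


def latency_ms(ops_per_sec: float) -> float:
    """One operation takes 1000 / ops_per_sec milliseconds."""
    return 1000.0 / ops_per_sec


for (host, method), ops in RESULTS.items():
    print(f"{host:6s} {method:9s} {ops:7.3f} ops/sec = {latency_ms(ops):5.1f} ms/op")

# Ratio of local to server fdatasync throughput: how many times slower the server is.
slowdown = RESULTS[("local", "fdatasync")] / RESULTS[("server", "fdatasync")]
print(f"server fdatasync is {slowdown:.1f}x slower than the local disk")
```

Note that the server's fsync rate is close to the local disk's, so the large gap is specific to fdatasync.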

Key hardware specifications:

  • Server: Dual 3TB SATA (Seagate ST3000DM001) in mdadm RAID 1
  • Local: Single consumer-grade SATA disk
  • Filesystem: ext4 with default options

mdadm reports a clean two-disk RAID 1 array, which rules out a degraded or resyncing mirror; write barriers remain a possible cause of the slow syncs:

/dev/md2:
  Version : 1.2
  Raid Level : raid1
  Array Size : 2917156159 (2782.02 GiB)
  Raid Devices : 2
  State : clean

Partition alignment verification:

sudo parted /dev/sdb unit s print
Sector size (logical/physical): 512B/4096B
Partition Start: 26218496s (properly aligned)
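
The "properly aligned" verdict can be checked by hand: the partition's byte offset must be a multiple of the 4096-byte physical sector. A small sketch of that check, using the sector sizes and start sector from the parted output above (the helper `is_aligned` is a hypothetical name):

```python
# Check that a partition's start offset falls on a physical-sector boundary.
# Values come from the parted output above; is_aligned is an illustrative helper.
LOGICAL_SECTOR = 512       # bytes per logical sector
PHYSICAL_SECTOR = 4096     # bytes per physical sector
START_SECTOR = 26218496    # partition start, counted in logical sectors


def is_aligned(start_sector: int, boundary_bytes: int,
               sector_bytes: int = LOGICAL_SECTOR) -> bool:
    """True if the partition's byte offset is a multiple of boundary_bytes."""
    return (start_sector * sector_bytes) % boundary_bytes == 0


print("4 KiB aligned:", is_aligned(START_SECTOR, PHYSICAL_SECTOR))
print("1 MiB aligned:", is_aligned(START_SECTOR, 1024 * 1024))
```

A start sector such as 63 (the old DOS default) would fail this check, which is the misalignment case that hurts 4K-sector drives.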

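For ext4 under PostgreSQL on this array (the options below trade crash safety for speed):

# /etc/fstab options for the PostgreSQL data directory, including WAL.
# noatime already implies nodiratime. barrier=0 and data=writeback risk data loss
# or corruption on power failure unless the write cache is battery-backed.
/dev/md2 /var/lib/postgresql ext4 defaults,noatime,data=writeback,barrier=0 0 2

# If reformatting: stride and stripe-width describe striped layouts (RAID 0/5/6/10).
# RAID 1 has no stripe, so plain mkfs.ext4 is sufficient here.
mkfs.ext4 /dev/md2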

Critical postgresql.conf parameters for commit performance:

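# WAL settings
wal_level = minimal  # Disables WAL archiving and streaming replication
synchronous_commit = off  # COMMIT returns before the WAL flush; a crash can lose the most recent transactions but does not corrupt data
wal_sync_method = fdatasync  # Linux default; fsync measured faster on this server, so compare both with pg_test_fsync
wal_buffers = 16MB
commit_delay = 10000  # Microseconds (10ms); only helps when several sessions commit concurrently
commit_siblings = 5  # Minimum number of other open transactions before commit_delay applies

# Storage tuning (planner and prefetch settings; these do not change COMMIT latency)
random_page_cost = 2.0  # Lower for RAID
effective_io_concurrency = 2  # Start from the number of drives in the mirror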

For write-heavy workloads:

-- Consider batched transactions
BEGIN;
INSERT INTO test (foo) VALUES ('bar1');
INSERT INTO test (foo) VALUES ('bar2');
...
INSERT INTO test (foo) VALUES ('bar100');
COMMIT;  -- Single commit overhead

-- Or use UNLOGGED tables for temporary data
CREATE UNLOGGED TABLE temp_data (...);
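
To see why batching helps, here is a rough back-of-the-envelope sketch. It assumes a single session, that the measured 22ms COMMIT dominates each transaction, and that the INSERT statements themselves cost nothing by comparison; the function names are illustrative only:

```python
# Amortize a fixed COMMIT cost over the rows in a transaction.
# Assumptions (not measurements): one session, INSERT time negligible next to COMMIT.
COMMIT_MS = 22.0  # average COMMIT latency measured on the server


def amortized_commit_ms(commit_ms: float, rows_per_txn: int) -> float:
    """Commit cost attributed to each row when rows_per_txn rows share one COMMIT."""
    if rows_per_txn < 1:
        raise ValueError("rows_per_txn must be at least 1")
    return commit_ms / rows_per_txn


def rows_per_second(commit_ms: float, rows_per_txn: int) -> float:
    """Upper bound on insert rate for one session under the assumptions above."""
    return rows_per_txn * 1000.0 / commit_ms


for batch in (1, 10, 100):
    print(f"{batch:3d} rows/txn: "
          f"{amortized_commit_ms(COMMIT_MS, batch):6.2f} ms/row, "
          f"up to {rows_per_second(COMMIT_MS, batch):7.1f} rows/sec")
```

Under these assumptions a single session is capped at roughly 45 single-row transactions per second, while 100-row batches bring the per-row commit cost down to about 0.22ms.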

When RAID isn't enough:

  • Test with HW RAID controller with battery-backed cache
  • Consider separate WAL disk (SSD recommended)
  • Evaluate ZFS with dedicated SLOG device

To recap the benchmark: COMMIT took 22ms on the server compared with 0.4ms on a slower development machine, and the pg_test_fsync results point to sync performance as the core issue:

fdatasync: 11.920 ops/sec (server) vs 68.871 ops/sec (dev machine)
fsync: 30.524 ops/sec (server) vs 34.593 ops/sec (dev machine)
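
For a quick cross-check without pg_test_fsync, the same kind of measurement can be approximated with a short script. This is only a rough sketch, not a reimplementation of pg_test_fsync: it rewrites one 8 kB block (PostgreSQL's default WAL block size) and syncs it in a loop, and the function name, iteration count, and target directory are assumptions chosen for illustration.

```python
# Rough sync-rate probe: write one 8 kB block, force it to disk, repeat.
# Not pg_test_fsync; names and defaults here are illustrative assumptions.
import os
import tempfile
import time

BLOCK = b"\0" * 8192  # 8 kB, PostgreSQL's default WAL block size


def sync_ops_per_sec(sync_fn, directory=None, iterations=20):
    """Return write+sync operations per second for files created in `directory`."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            os.lseek(fd, 0, os.SEEK_SET)  # overwrite the same block each time
            os.write(fd, BLOCK)
            sync_fn(fd)                   # os.fsync or os.fdatasync
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return iterations / elapsed


if __name__ == "__main__":
    print(f"fsync:     {sync_ops_per_sec(os.fsync):.1f} ops/sec")
    # os.fdatasync is not available on every platform, so probe for it.
    if hasattr(os, "fdatasync"):
        print(f"fdatasync: {sync_ops_per_sec(os.fdatasync):.1f} ops/sec")
```

Pass `directory` a path on the filesystem under test (for example the PostgreSQL data volume); the default temporary directory may live on tmpfs and report unrealistically high rates.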

The problematic server runs Software RAID 1 with these specifications:

Array Size: 2917156159 (2782.02 GiB)
Disks: 2x Seagate ST3000DM001-9YN166 (SATA)
Filesystem: ext4 with default options

Key findings from storage diagnostics:

  • Partition alignment confirmed with 4096B physical sectors
  • No SMART errors detected
  • Individual disk tests showed similar performance
  • Mount options: rw,noatime

For this RAID configuration, consider these postgresql.conf adjustments:

# Reduce fsync pressure
wal_buffers = 16MB
synchronous_commit = off
commit_delay = 10000
commit_siblings = 5

# WAL sync method (fdatasync is the Linux default; fsync measured faster on this server, so test both)
wal_sync_method = fdatasync

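Add these to /etc/sysctl.conf for better I/O performance:

vm.dirty_ratio = 10
vm.dirty_background_ratio = 5
vm.swappiness = 1

Read-ahead is not a sysctl setting; set it with a separate command (it does not persist across reboots):

blockdev --setra 4096 /dev/md2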

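Remount with the options that can change at runtime. ext4 refuses to switch the data= journaling mode on a remount, so data=writeback has to go in /etc/fstab and take effect on a full unmount and mount. Disabling barriers is only safe with a battery-backed write cache:

mount -o remount,noatime,barrier=0 /var/lib/postgresql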

For critical systems where 22ms commits are unacceptable:

1. Consider battery-backed write cache controller
2. Test with XFS filesystem
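3. Evaluate ZFS with a dedicated SLOG device (disabling the ZIL instead removes sync durability, so it is unsuitable for critical systems)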
4. Upgrade to NVMe storage for WAL

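Create a monitoring view to track commit latency (requires the pg_stat_statements module loaded via shared_preload_libraries, with pg_stat_statements.track_utility = on):

-- total_time is cumulative per statement entry (seconds in 9.1, milliseconds from 9.2),
-- so the average is total_time divided by calls, not avg(total_time).
-- A per-statement maximum is not available in 9.1 (max_time was added in 9.5).
CREATE VIEW commit_stats AS
SELECT
    sum(total_time) / nullif(sum(calls), 0) AS avg_commit_time,
    sum(calls) AS transactions
FROM pg_stat_statements
WHERE upper(query) LIKE 'COMMIT%';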