When monitoring a slow PostgreSQL query with iotop, you'll often encounter a puzzling scenario where kswapd0 shows 99.99% IO usage while displaying zero disk activity. This phenomenon reveals critical memory pressure issues:
```
# Typical iotop output showing the anomaly
  TID  PRIO  USER     DISK READ   DISK WRITE   SWAPIN     IO>    COMMAND
   27  be/4  root      0.00 B/s    0.00 B/s    0.00 %   99.99 %  [kswapd0]
```
kswapd0 is Linux's memory page reclaim daemon. The 99.99% IO usage indicates:
- Intense memory scanning operations (not physical disk I/O)
- Memory pressure triggering constant page evaluation
- Potential thrashing where the system spends more time managing memory than executing processes
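A quick way to confirm that this is reclaim scanning rather than real disk traffic is to watch the kernel's reclaim counters. As a rough sketch (counter names vary slightly between kernel versions):

```bash
# Rising pgscan_kswapd with comparatively little pgsteal_kswapd means kswapd
# is scanning many pages but reclaiming few of them - classic memory pressure
watch -n 1 "grep -E 'pgscan|pgsteal' /proc/vmstat"
```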
When your heavy query runs, PostgreSQL's memory allocation is what creates this pressure. Start by checking the relevant settings and what is currently executing:
```sql
-- Check PostgreSQL's memory-related settings
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'maintenance_work_mem');

-- See which queries are currently active and how long they have been running
SELECT * FROM pg_stat_activity
WHERE state = 'active'
ORDER BY query_start;
```
Before investing in new hardware, verify these PostgreSQL configurations:
```
# Critical memory parameters to review
shared_buffers       = 25% of available RAM (but not more than 8GB)
effective_cache_size = 50-75% of total RAM
work_mem             = (RAM - shared_buffers) / (max_connections * 3)
```
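As a worked example on a hypothetical server with 32GB of RAM and max_connections = 100: shared_buffers would be 8GB (25% of RAM, right at the cap), effective_cache_size somewhere around 16-24GB, and work_mem roughly (32GB - 8GB) / (100 * 3) ≈ 80MB. Treat these as starting points, not hard rules.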
Try these immediate improvements before hardware upgrades:
```sql
-- Optimize the problematic query with EXPLAIN ANALYZE
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM large_table WHERE complex_conditions;

-- Create targeted indexes
CREATE INDEX CONCURRENTLY idx_improvement
ON large_table (critical_columns)
WHERE frequently_used_conditions;
```
For hardware improvements, focus on:
- Faster storage (NVMe) if the system genuinely must swap (see the quick check after this list)
- Higher RAM density modules if motherboard slots are full
- Processor with better cache architecture
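Before spending on NVMe for swap, it is worth confirming the machine is actually swapping at all. A minimal check, assuming standard tools (swapon, free, and vmstat ship with most distributions):

```bash
# Is swap configured, and how much is in use right now?
swapon --show
free -h

# Watch for sustained non-zero si/so columns (pages swapped in/out per second)
vmstat 1 60
```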
Let's dig deeper into what is actually happening when kswapd0 shows 99.99% IO utilization while reporting 0% disk read/write activity:
kswapd0 is the kernel's page-reclaim thread; despite the name, it spends most of its time scanning and freeing memory pages rather than writing to swap. The high IO percentage means it is busy managing memory, and the zero disk activity suggests:
- The system is under memory pressure but not necessarily swapping to disk
- kswapd is scanning memory pages aggressively (page reclamation)
- This creates CPU overhead rather than disk I/O bottlenecks
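Because the cost shows up as CPU rather than disk, a quick cross-check is to watch kswapd0's CPU consumption directly; a minimal sketch, assuming pidstat from the sysstat package is installed:

```bash
# Per-second CPU usage of the kswapd0 thread; consistently high %system time
# confirms the overhead is reclaim work rather than disk I/O
pidstat -u -p "$(pgrep -x kswapd0)" 1
```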
Before considering hardware upgrades, let's verify PostgreSQL's memory settings:
```sql
-- Check current PostgreSQL memory configuration
SHOW shared_buffers;
SHOW work_mem;
SHOW maintenance_work_mem;
SHOW effective_cache_size;
```

```bash
# Linux memory pressure metrics
grep -E '^(SwapCached|SwapTotal|SwapFree|Committed_AS)' /proc/meminfo
```
Here are concrete steps to improve performance without immediate hardware upgrades:
```
# 1. Optimize PostgreSQL configuration (postgresql.conf)
shared_buffers       = 25% of available RAM (but not more than 8GB)
effective_cache_size = 50-75% of total RAM
work_mem             = ~2MB per core for complex queries

# 2. Linux kernel parameters (sysctl.conf)
vm.swappiness = 1            # reduces aggressive swapping
vm.vfs_cache_pressure = 50   # balanced cache reclamation

# 3. Monitor with better tools
sudo perf top -p $(pgrep kswapd0)
sudo bpftrace -e 'kprobe:shrink_page_list { @[comm] = count(); }'
```
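To try the kernel parameters without a reboot, apply them with sysctl; the file path below is a common convention, adjust it for your distribution:

```bash
# Apply at runtime (lost on reboot)
sudo sysctl -w vm.swappiness=1
sudo sysctl -w vm.vfs_cache_pressure=50

# Persist across reboots
printf 'vm.swappiness = 1\nvm.vfs_cache_pressure = 50\n' | sudo tee /etc/sysctl.d/99-postgres-memory.conf
sudo sysctl --system
```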
Consider these metrics before upgrading:
- If pg_stat_activity shows more than roughly 30% of time spent on I/O waits
- When vmstat 1 shows sustained high si/so values
- If query EXPLAIN ANALYZE output shows excessive temp file usage
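For the temp-file point, one hedged way to check is pg_stat_database, which keeps cumulative counters of temporary files and bytes written per database (the database name below is a placeholder):

```bash
# Databases that spill heavily to temp files are prime candidates for
# work_mem tuning or query rewrites
psql -d mydb -c "SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_size
                 FROM pg_stat_database ORDER BY temp_bytes DESC;"
```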
For complex queries, first try adding appropriate indexes:
```sql
-- Example index for a common slow-query pattern
CREATE INDEX CONCURRENTLY idx_orders_customer_date
ON orders (customer_id, order_date DESC)
WHERE status = 'completed';
```
Often the biggest gains come from query restructuring. Here's an example of optimizing a slow aggregate query:
```sql
-- Before: full table scan with heavy aggregation
EXPLAIN ANALYZE
SELECT customer_id, SUM(amount)
FROM transactions
WHERE date BETWEEN '2023-01-01' AND '2023-12-31'
GROUP BY customer_id;

-- After: materialized view with pre-aggregated data
CREATE MATERIALIZED VIEW customer_yearly_totals AS
SELECT customer_id,
       DATE_TRUNC('year', date) AS year,
       SUM(amount) AS yearly_total
FROM transactions
GROUP BY customer_id, DATE_TRUNC('year', date);

-- Unique index required for REFRESH ... CONCURRENTLY
CREATE UNIQUE INDEX idx_cust_year
ON customer_yearly_totals (customer_id, year);

REFRESH MATERIALIZED VIEW CONCURRENTLY customer_yearly_totals;
```
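PostgreSQL has no built-in refresh schedule for materialized views, so the REFRESH above has to be driven externally. One common approach is a nightly cron job; the database name and schedule here are placeholders, and this assumes the invoking user can connect without a password prompt:

```bash
# crontab entry: rebuild the pre-aggregated totals at 03:00 every night
0 3 * * * psql -d mydb -c "REFRESH MATERIALIZED VIEW CONCURRENTLY customer_yearly_totals;"
```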