When implementing a document indexer with Python multiprocessing (4 parallel workers), we hit a severe performance wall. Each worker:

```python
def process_document(doc):
    text = extract_text(doc)
    cursor.execute("INSERT INTO documents VALUES (...)")
    connection.commit()  # THIS is where hell breaks loose
```
System monitoring shows jbd2 (the EXT4 journaling kernel thread) pegged at 99.9% I/O utilization, forcing CPU stalls during every MySQL commit.
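For context, here's a minimal sketch of how the four workers might be wired up (the Pool setup and the connect_to_mysql() helper are assumptions for illustration, not code from our indexer):

```python
from multiprocessing import Pool

def init_worker():
    # Each process needs its own MySQL connection; sharing one
    # across fork() corrupts the client/server protocol state.
    global connection, cursor
    connection = connect_to_mysql()  # hypothetical helper
    cursor = connection.cursor()

if __name__ == "__main__":
    # documents: iterable of docs to index (assumed)
    with Pool(processes=4, initializer=init_worker) as pool:
        pool.map(process_document, documents)
```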
EXT4's journal ensures filesystem consistency but creates significant overhead for:
- Small, frequent transactions (exactly what we're doing)
- Metadata-heavy operations (database commits qualify)
- Concurrent writers (our 4 processes)
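A back-of-envelope calculation shows why this caps throughput regardless of CPU count (the latency figure below is an assumption for illustration, not a measurement):

```python
# Assume a journaled fsync costs ~20 ms on a busy SATA disk (illustrative).
# With one commit per document, commits serialize on the single jbd2
# journal, so adding more workers barely helps.
per_commit_s = 0.020
ceiling = 1 / per_commit_s
print(f"~{ceiling:.0f} docs/sec ceiling")  # same ballpark as the ~42/sec we measured
```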
The nuclear option, barrier=0, disables write barriers entirely. In /etc/fstab:

```
UUID=... / ext4 defaults,barrier=0,noatime 0 1
```
DANGER: Without UPS, this risks filesystem corruption during power loss. With UPS? Probably safe.
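If you do flip the switch, confirm the options actually applied after reboot. A quick sketch that reads /proc/mounts:

```python
# Print the active mount options for the root filesystem
with open("/proc/mounts") as f:
    for line in f:
        device, mountpoint, fstype, options, *_ = line.split()
        if mountpoint == "/" and fstype == "ext4":
            print(options)  # expect to see barrier=0 (and noatime)
```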
Before going nuclear, try these:
1. Batch Commits

```python
def worker(docs):
    for i, doc in enumerate(docs, start=1):
        cursor.execute(...)
        if i % 100 == 0:  # Commit every 100 docs
            connection.commit()
    connection.commit()  # Don't forget the final partial batch
```
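One failure mode worth handling: if an INSERT dies mid-batch, the open transaction should be rolled back rather than left dangling. A sketch, assuming the same module-level connection:

```python
# Roll back the partial batch on any error so the transaction doesn't stay open
try:
    worker(docs)
except Exception:
    connection.rollback()
    raise
```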
2. Tune Journal Settings

```bash
# Lengthen the journal commit interval (default 5s) so writes coalesce
mount -o remount,commit=60 /
# Limit journal size (unmount the filesystem first to resize its journal)
tune2fs -J size=1024 /dev/sdX
```
3. Filesystem Alternatives
For read-heavy workloads with frequent small writes, XFS with a large log can behave better:

```bash
# WARNING: mkfs destroys existing data on /dev/sdX
mkfs.xfs -f -l size=1024m /dev/sdX
```
When using InnoDB:

```ini
[mysqld]
innodb_flush_log_at_trx_commit = 2  # Trade durability for speed
innodb_doublewrite = 0              # If you're REALLY brave
```
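To confirm the running server actually picked up the setting (assuming a DB-API cursor from PyMySQL or mysqlclient):

```python
# SHOW VARIABLES is standard MySQL; the cursor API is the Python DB-API
cursor.execute("SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit'")
print(cursor.fetchone())  # e.g. ('innodb_flush_log_at_trx_commit', '2')
```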
Verify improvements with:

```bash
iostat -x 1   # watch %util and await on your data disk
```
Why is each commit so expensive? Under the hood, every connection.commit() forces:
- MySQL's redo-log flush: the default innodb_flush_log_at_trx_commit=1 syncs the log on every transaction for ACID durability
- EXT4 journaling: metadata is written twice, once to the journal and again to its final location
- fsync() calls that block until the disk confirms the write
Before making filesystem changes, verify that jbd2 really is the bottleneck:

```bash
# Check which processes are actually waiting on IO
iotop -oP
# Inspect jbd2 transaction statistics (journal name varies, e.g. sda1-8)
cat /proc/fs/jbd2/*/info
```

Then measure actual commit latency from Python:

```python
import time

start = time.time()
conn.commit()
print(f"Commit took {time.time() - start:.4f}s")
```
Option 1: Filesystem Mount Tweaks
Add these to /etc/fstab (changing data= on the root filesystem requires a reboot):

```
/dev/sda1 / ext4 defaults,barrier=0,data=writeback 0 1
```

Tradeoffs:
- barrier=0: disables write barriers (OK with a UPS)
- data=writeback: journals metadata only; file data may reach disk after the metadata that references it
- Risk: potential corruption or stale-data exposure on power loss
Option 2: MySQL Configuration

```ini
[mysqld]
innodb_flush_log_at_trx_commit = 2  # Flush the redo log once per second, not per transaction
innodb_flush_method = O_DIRECT      # Bypass the OS page cache for data files
innodb_doublewrite = 0              # Careful! Disables torn-page crash protection
```
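A useful before/after check is the server's redo-log fsync counter; if the tuning works, it should climb far more slowly. Innodb_os_log_fsyncs is a standard InnoDB status counter; the helper below is our own sketch:

```python
def log_fsyncs(cursor):
    # Number of fsync() calls InnoDB has issued against its redo log
    cursor.execute("SHOW GLOBAL STATUS LIKE 'Innodb_os_log_fsyncs'")
    return int(cursor.fetchone()[1])

before = log_fsyncs(cursor)
# ... index a batch of documents ...
print(f"fsyncs during batch: {log_fsyncs(cursor) - before}")
```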
Option 3: Batch Processing Pattern
Our final implementation reduced commits by 90%:
```python
from queue import Queue

class BatchWriter:
    def __init__(self, connection, batch_size=1000):
        self.connection = connection
        self.queue = Queue()
        self.batch_size = batch_size

    def add_document(self, doc):
        self.queue.put(doc)
        if self.queue.qsize() >= self.batch_size:
            self.flush()

    def flush(self):
        with self.connection.cursor() as cur:
            while not self.queue.empty():
                cur.execute("INSERT ...", self.queue.get())
        self.connection.commit()  # one fsync per batch instead of one per document
```
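Hypothetical wiring inside one worker process (each process owns its own connection and BatchWriter; the final flush() commits the last partial batch):

```python
writer = BatchWriter(connection, batch_size=1000)
for doc in docs:
    writer.add_document(doc)
writer.flush()  # commit whatever is left in the queue
```

Measured throughput with each approach: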
| Approach | Documents/sec | jbd2 IO% |
|---|---|---|
| Original | 42 | 99% |
| barrier=0 | 210 | 75% |
| Batching | 380 | 15% |