When running Apache on an Amazon Linux AMI with EBS storage, you might encounter a situation where kworker processes sit above 90% in iotop's IO (wait) column while reporting zero bytes actually read or written. This manifests as high load averages (>8) despite plenty of free memory.
# Typical iotop output showing the anomaly:
Total DISK READ: 0.00 B/s | Total DISK WRITE: 2.37 M/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
3730 be/4 root 0.00 B 0.00 B 0.00 % 91.98 % [kworker/u8:1]
774 be/3 root 0.00 B 1636.00 K 0.00 % 15.77 % [jbd2/xvda1-8]
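A snapshot like this is easy to capture non-interactively, assuming iotop is installed:
# Batch mode, active processes only, accumulated IO, three 5-second samples
sudo iotop -o -b -a -n 3 -d 5
# Confirm the load average and that memory is not the constraint
uptime
free -m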
The critical observation is that these kernel workers become inactive when Apache is stopped. This points to filesystem/journaling operations triggered by web server activity rather than actual disk bottlenecks.
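A quick way to confirm that correlation is to stop the web server for a moment and watch whether the kworker/jbd2 wait percentages collapse (the service name httpd is assumed here):
sudo service httpd stop
sudo iotop -o -b -n 2 -d 5    # kworker and jbd2 IO wait should fall to near zero
sudo service httpd start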
The perf report shows most samples landing in the idle path, with swapper parking the vCPU via Xen sched_op hypercalls, which confirms the load is IO wait rather than CPU work:
Samples: 114K of event 'cpu-clock'
-  83.58%  swapper  [kernel.kallsyms]  [k] xen_hypercall_sched_op
   + xen_hypercall_sched_op
   + default_idle
   + arch_cpu_idle
   - cpu_startup_entry
        70.16% cpu_bringup_and_idle
      - 29.84% rest_init
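A report like this comes from a short system-wide sample with call-graph recording:
# Sample all CPUs for 30 seconds with call chains, then browse the result
sudo perf record -a -g -- sleep 30
sudo perf report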
Try these diagnostic commands to identify the root cause:
# kworker threads run entirely in the kernel and never issue syscalls, so strace
# on them shows nothing useful; snapshot the busy worker's kernel stack instead
# (PID taken from the iotop output above)
sudo cat /proc/3730/stack
# Monitor kernel workqueue events (the debugfs write needs root)
sudo sh -c 'echo 1 > /sys/kernel/debug/tracing/events/workqueue/enable'
sudo cat /sys/kernel/debug/tracing/trace_pipe
# Check journaling activity
sudo dmesg | grep -i jbd2
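When you are done with the workqueue tracing, turn the event back off so the trace buffer stops filling:
sudo sh -c 'echo 0 > /sys/kernel/debug/tracing/events/workqueue/enable'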
Based on similar cases, these adjustments often help:
# Lower the dirty-page threshold so writeback happens earlier and in smaller bursts
sudo sysctl -w vm.dirty_ratio=10
# Set writeback journaling mode as the default mount option for /dev/xvda1
sudo tune2fs -o journal_data_writeback /dev/xvda1
# More drastic: drop the journal entirely (unmounted filesystems only, and at the
# cost of crash consistency)
sudo tune2fs -O ^has_journal /dev/xvda1
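# Persist the lower dirty_ratio across reboots
echo 'vm.dirty_ratio = 10' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p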
# Cap the worker pool and recycle children to limit per-process file churn
<IfModule mpm_prefork_module>
MaxRequestWorkers 150
MaxConnectionsPerChild 1000
</IfModule>
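After editing, validate and apply the configuration (the service name httpd is assumed):
sudo apachectl configtest
sudo service httpd reload    # service name assumed; use your init system's equivalent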
For deeper analysis, use this SystemTap script to track kworker activity:
probe kernel.function("process_one_work") {
    if (execname() == "kworker/u8:1") {
        # Resolve the work item's handler address to a kernel symbol name
        printf("Work item: %s\n", symname($work->func));
    }
}
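Save the probe to a file and run it with stap; the file name kworker_trace.stp is arbitrary, and the systemtap package plus matching kernel debuginfo must be installed:
sudo stap -v kworker_trace.stp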
AWS EBS configurations that can alleviate this:
# Switch to a Provisioned IOPS volume for consistent performance
aws ec2 modify-volume --volume-id vol-12345 --volume-type io1 --iops 4000
# Or move from gp2 to gp3 for a higher, burst-free baseline
aws ec2 modify-volume --volume-id vol-12345 --volume-type gp3
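Either change can be tracked while AWS applies it:
aws ec2 describe-volumes-modifications --volume-ids vol-12345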
Recently, I encountered a puzzling performance issue on our Amazon Linux AMI instance running Apache with EBS storage. The server showed:
- Consistently high load average (>8)
- kworker processes showing >90% in iotop's IO (wait) column
- Zero disk read but significant disk write activity
- Apache processes showing substantial disk writes
# Sample iotop output
Total DISK READ: 0.00 B/s | Total DISK WRITE: 2.37 M/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
3730 be/4 root 0.00 B 0.00 B 0.00 % 91.98 % [kworker/u8:1]
774 be/3 root 0.00 B 1636.00 K 0.00 % 15.77 % [jbd2/xvda1-8]
The first clue was that both kworker and jbd2 processes disappeared when Apache was stopped. This pointed to some filesystem interaction triggered by Apache activity.
Using perf to analyze kernel activity again showed the CPUs largely idle in Xen sched_op hypercalls, with httpd's kernel-side samples landing in dentry lookups and other hypercalls:
# perf report snippet
- 83.58% swapper [kernel.kallsyms] [k] xen_hypercall_sched_op
+ 1.73% httpd [kernel.kallsyms] [k] __d_lookup_rcu
+ 1.08% httpd [kernel.kallsyms] [k] xen_hypercall_xen_version
After digging deeper, I identified several potential culprits:
- Journaling filesystem overhead: The jbd2 process reflects ext4 journal commits (a quick check follows this list)
- Metadata operations: Apache creating/deleting many small files (logs, sessions, cache)
- EBS performance characteristics: Frequent small metadata writes consume IOPS out of proportion to the bytes actually written
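For the journaling angle, jbd2 exposes per-device statistics under /proc on kernels that provide them; the directory name matches the jbd2 process name from iotop:
# Commit counts and average transaction sizes for the xvda1 journal
cat /proc/fs/jbd2/xvda1-8/info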
To gather more evidence, I used these commands:
# Check filesystem metadata operations across all Apache workers
# (pgrep -d, hands strace a comma-separated PID list; -f follows new children)
sudo strace -f -e trace=file -p $(pgrep -d, httpd) 2>&1 | grep -v ENOENT
# Monitor inode operations under the web root (requires inotify-tools)
sudo inotifywait -rme modify,attrib,move,create,delete /var/www/html/
# Check filesystem mount options
mount | grep xvda1
Based on the findings, I implemented these changes:
# /etc/fstab modifications for the EBS root volume (noatime already implies
# nodiratime; data= cannot be changed on a live remount of /, so also set it as
# the superblock default with tune2fs -o journal_data_writeback and reboot)
/dev/xvda1 / ext4 defaults,noatime,data=writeback,commit=60 0 1
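# After a reboot, confirm the root filesystem picked up the new options
# (findmnt ships with util-linux)
findmnt -no OPTIONS /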
# Apache configuration for better filesystem interaction
<IfModule mpm_prefork_module>
# Keep KeepAlive on, but with a short timeout so idle connections don't hold workers
KeepAlive On
KeepAliveTimeout 2
MaxKeepAliveRequests 100
# Move PHP sessions off the local filesystem (requires mod_php and the phpredis extension)
php_value session.save_handler redis
php_value session.save_path "tcp://redis.example.com:6379"
</IfModule>
For more severe cases, consider:
- Moving session storage to Redis/Memcached
- Using tmpfs for temporary files (see the sketch after this list)
- Implementing a CDN for static assets
- Upgrading to EBS Provisioned IOPS volumes
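A minimal sketch of the tmpfs option, assuming Apache's scratch files can live under /var/tmp/httpd (the path and size here are illustrative):
sudo mkdir -p /var/tmp/httpd    # illustrative mount point
echo 'tmpfs /var/tmp/httpd tmpfs defaults,noatime,size=256m 0 0' | sudo tee -a /etc/fstab
sudo mount /var/tmp/httpd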
After implementing these changes, the kworker IO wait dropped significantly, and server load returned to normal levels.