When Java processes consistently show 400%+ CPU usage and the kernel reports tasks being blocked for over 120 seconds, we're dealing with severe system contention. The key indicators are:
[timestamp] INFO: task java:21547 blocked for more than 120 seconds.
[timestamp] INFO: task kjournald:190 blocked for more than 120 seconds.
[timestamp] INFO: task flush-202:0:709 blocked for more than 120 seconds.
These messages come from the kernel's hung-task detector: each named task has spent more than 120 seconds in uninterruptible sleep (state D), which almost always points at I/O or lock contention rather than pure CPU load. Several factors could contribute (a quick triage sketch follows this list):
- I/O Wait Dominance: When kjournald (the ext3 journaling daemon; ext4 uses jbd2) and flush workers are blocked, the storage subsystem itself is struggling
- Memory Pressure: With no swap configured, memory pressure hits reclaim directly and can trigger the OOM killer
- CPU Saturation: 400%+ Java CPU suggests thread contention or GC issues
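As a quick first pass to see which factor applies, this sketch checks each in turn (log path assumes Ubuntu's default /var/log/kern.log; adjust for your distro):
# Any swap configured at all?
swapon -s
# Has the OOM killer fired recently?
grep -i 'out of memory' /var/log/kern.log
# Rough Java thread count (high counts hint at thread contention)
ps -eLf | grep -c java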
Run these during normal operation to establish baselines:
# Check I/O wait (field 16 is the "wa" column in 2.6-era vmstat; skip the two header lines)
vmstat 1 10 | awk 'NR>2 {print $16}'
# Monitor disk latency
iostat -xmd 2
# Check memory pressure
free -m && grep -E 'MemFree|Swap' /proc/meminfo
# Identify Java thread states
jstack -l <pid> | grep "java.lang.Thread.State" | sort | uniq -c
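Thread states from a single dump can mislead; a short loop capturing several dumps for later diffing makes genuinely stuck threads obvious (a sketch, with <pid> as above and /tmp as an example location):
# Capture 5 thread dumps, 10 seconds apart
for i in 1 2 3 4 5; do
    jstack -l <pid> > /tmp/jstack.$i.txt
    sleep 10
done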
For Ubuntu 10.04 with 2.6.x kernels:
# Disable memory overcommit (add to /etc/sysctl.conf)
vm.overcommit_memory = 2
vm.overcommit_ratio = 80
# Adjust dirty page thresholds
vm.dirty_background_ratio = 5
vm.dirty_ratio = 15
# Disable automatic NUMA balancing if present (this knob only exists on 3.8+ kernels, so it will be absent on 2.6.x)
[ -f /proc/sys/kernel/numa_balancing ] && echo 0 > /proc/sys/kernel/numa_balancing
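After editing /etc/sysctl.conf, the values can be applied without a reboot and verified in place:
# Reload /etc/sysctl.conf
sysctl -p
# Confirm the settings took effect
sysctl vm.overcommit_memory vm.dirty_ratio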
Add these JVM flags to prevent GC-induced hangs:
# Low-pause CMS collector (the realistic choice on Java 6-era JVMs)
-XX:+UseConcMarkSweepGC
# Incremental CMS is intended for 1-2 core machines; omit it on larger boxes
-XX:+CMSIncrementalMode
# Thread-local allocation buffers (already the default on modern HotSpot)
-XX:+UseTLAB
-XX:ParallelGCThreads=<CPU cores>
# Ignore System.gc() calls from application code
-XX:+DisableExplicitGC
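For context, the flags slot into the launch command like this (myapp.jar and the 4 GB heap are placeholder values; a fixed -Xms/-Xmx pair avoids heap resizing churn):
java -Xms4g -Xmx4g \
     -XX:+UseConcMarkSweepGC -XX:ParallelGCThreads=4 \
     -XX:+DisableExplicitGC \
     -jar myapp.jar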
If hangs persist after tuning:
# Setting 0 disables the hung-task check entirely (not recommended for production)
echo 0 > /proc/sys/kernel/hung_task_timeout_secs
# Better alternative - adjust timeout to 300s
echo 300 > /proc/sys/kernel/hung_task_timeout_secs
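Note that writing to /proc does not survive a reboot; to make the timeout persistent, set it in /etc/sysctl.conf:
# /etc/sysctl.conf
kernel.hung_task_timeout_secs = 300
# then apply with: sysctl -p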
For kjournald issues, verify your filesystem mount options:
# Example fstab entry with safer options
/dev/sdX1 / ext4 noatime,nodiratime,data=writeback,barrier=0 0 1
Note: Disable barriers only if you have a battery-backed RAID controller, and be aware that data=writeback weakens crash consistency (files can contain stale data after a power loss).
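To see what the root filesystem is actually mounted with, and to apply the atime options live (the data= mode cannot be changed by remounting a live root; it needs the fstab entry plus a reboot):
# Current options for /
mount | grep ' / '
# noatime/nodiratime take effect immediately
mount -o remount,noatime,nodiratime /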
When Java processes spike beyond 400% CPU utilization on Linux servers (particularly Ubuntu 10.04 with its 2.6.32 kernel), system monitoring reveals repetitive console errors like:
INFO: task java:21547 blocked for more than 120 seconds
INFO: task kjournald:190 blocked for more than 120 seconds
INFO: task flush-202:0:709 blocked for more than 120 seconds
Several critical patterns emerge:
- Occurs across both VM and bare-metal environments
- Persists after hardware migration
- Involves both Java processes and kernel threads (kjournald, flush)
- Console output becomes primary diagnostic channel (dmesg inaccessible)
Based on the evidence, the likely triggers for the task blocking are:
1. I/O subsystem congestion (kjournald involvement)
2. Memory pressure (no swap remaining)
3. CPU starvation (400%+ Java utilization)
4. Kernel-level deadlocks
5. Missing irqbalance service (see the quick check below)
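A quick way to test trigger 5 is to watch whether all interrupt counts grow only in the CPU0 column (a sketch; eth0 and sd are example device names, and the column layout varies by kernel):
# If only the CPU0 column increases, IRQs aren't being spread across cores
cat /proc/interrupts
watch -n 1 'grep -E "eth0|sd" /proc/interrupts'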
First, establish monitoring before crashes:
#!/bin/bash
# Continuous system monitoring script (iostat requires the sysstat package)
LOG=/var/log/system_monitor.log
while true; do
    echo "===== $(date) =====" >> "$LOG"
    top -b -n 1 | head -20 >> "$LOG"
    vmstat 1 5 >> "$LOG"
    iostat -dx 1 5 >> "$LOG"
    sleep 30
done
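Run it detached so it survives your SSH session ending, and remember the log grows without bound, so rotate or truncate it periodically (the install path is an example):
chmod +x /usr/local/bin/system_monitor.sh
nohup /usr/local/bin/system_monitor.sh &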
While investigating, implement these adjustments:
# Temporary kernel parameter changes
echo 300 > /proc/sys/kernel/hung_task_timeout_secs
echo 1 > /proc/sys/vm/dirty_background_ratio
echo 5 > /proc/sys/vm/dirty_ratio
# Install critical services
apt-get install irqbalance sysstat
service irqbalance start
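On Ubuntu, sysstat's background sampling is disabled out of the box; enabling it gives you historical sar data to inspect after the next hang (the config file location follows Debian/Ubuntu packaging):
sed -i 's/ENABLED="false"/ENABLED="true"/' /etc/default/sysstat
service sysstat restart
# Quick check that collection works
sar -u 1 5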
For the high-CPU Java processes:
# Add these JVM flags (G1 requires Java 7u4 or later):
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200
-XX:ParallelGCThreads=4
-XX:ConcGCThreads=2
-XX:InitiatingHeapOccupancyPercent=35
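Pairing these with GC logging lets you correlate the CPU spikes with specific collections (the log path is an example; these are the pre-Java 9 logging flags, not the unified -Xlog):
-verbose:gc
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-Xloggc:/var/log/java_gc.log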
Given kjournald involvement, examine filesystem performance:
# Check filesystem mount options
mount | grep -E '(ext3|ext4|xfs)'
# Recommended options for database/Java workloads
# (barrier=0 only with a battery-backed write cache, as noted above):
defaults,noatime,nodiratime,data=writeback,barrier=0
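Since data=writeback usually can't be applied to a mounted root filesystem with a live remount, one option is to record it as the default journal mode in the superblock (run against an unmounted or read-only filesystem; /dev/sdX1 is a placeholder):
tune2fs -o journal_data_writeback /dev/sdX1
# Verify the stored default:
tune2fs -l /dev/sdX1 | grep -i 'mount options'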
For the longer term:
- Upgrade to a newer Ubuntu LTS (14.04+) with a modern kernel
- Implement proper process cgroups for Java
- Add swap space if memory constrained (a minimal sketch follows this list)
- Consider switching to XFS for better journaling performance
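A minimal swap-file sketch for the memory point (the 4 GB size is an example; a plain file on ext3/ext4 works fine):
# Create and enable a 4 GB swap file
dd if=/dev/zero of=/swapfile bs=1M count=4096
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
# Persist across reboots
echo '/swapfile none swap sw 0 0' >> /etc/fstab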
When console is the only output channel, use serial console redirection:
# In /etc/default/grub:
GRUB_CMDLINE_LINUX="console=tty0 console=ttyS0,115200n8"
GRUB_TERMINAL=serial
GRUB_SERIAL_COMMAND="serial --unit=0 --speed=115200"
# Then run update-grub and capture the serial output (see below)
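After regenerating the GRUB config and rebooting, the hung-task traces land on the serial port; capture them from a second machine or console server (the device name depends on your serial adapter):
update-grub
# On the capturing machine:
screen /dev/ttyUSB0 115200
# or: minicom -D /dev/ttyUSB0 -b 115200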