When dealing with OutOfMemoryErrors in Java applications, especially those running large heaps (3GB+), traditional heap dump methods often fail. Our team encountered a 90% failure rate when using jmap with Java 1.6 on 64-bit systems, despite the documented improvements since Java 1.4.
The primary issues we've identified:
- Heap dumping pauses the entire JVM (stop-the-world) for the duration of the dump
- Native memory pressure while the dump file is being written
- Race conditions when an OOM triggers multiple dump mechanisms at once (e.g. HeapDumpOnOutOfMemoryError plus an external monitoring script)
After extensive testing, we recommend this multi-layered approach:
1. The Safe jmap Alternative
Instead of invoking jmap against a hard-coded PID, resolve the PID by application name and let jmap attach to the live process (jmap's -dump option goes through the JVM's attach mechanism):
#!/bin/bash
# Resolve the target PID by application name
PID=$(jps | grep YourAppName | awk '{print $1}')
# "live" forces a full GC first, so only reachable objects end up in the file
jmap -dump:live,format=b,file=/tmp/heap.hprof "$PID"
2. JVM Native Flag Configuration
Add these JVM options for better dump reliability:
-XX:+HeapDumpOnOutOfMemoryError                       # write an .hprof automatically on OOM
-XX:HeapDumpPath=/path/to/dumps                       # where the automatic dumps land
-XX:OnOutOfMemoryError="/path/to/your/script.sh %p"   # run a handler on OOM; %p expands to the PID
-XX:+UseGCOverheadLimit                               # default on; throws OOM early when GC recovers almost nothing, instead of thrashing
-XX:-UseLargePages                                    # explicitly disable large-page memory for the heap
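Combined, a launch line might look like this (heap size, paths, and jar name are placeholders, not values from our setup):
java -Xmx3g \
  -XX:+HeapDumpOnOutOfMemoryError \
  -XX:HeapDumpPath=/path/to/dumps \
  -XX:OnOutOfMemoryError="/path/to/your/script.sh %p" \
  -jar your_application.jar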
3. The Fallback Script
Create a robust monitoring script (a sketch follows this list) that:
- Detects OOM in logs
- Waits 30 seconds for JVM to stabilize
- Attempts dump with multiple methods
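A minimal sketch of such a monitor, assuming the application logs OutOfMemoryError to a known file (the log path and app name here are hypothetical):
#!/bin/bash
LOG_FILE="/var/log/yourapp/app.log"   # hypothetical log location
DUMP_DIR="/tmp/dumps"
mkdir -p "$DUMP_DIR"
# Follow the log and react to the first OOM line we see
tail -F "$LOG_FILE" | while read -r line; do
  case "$line" in
    *OutOfMemoryError*)
      PID=$(jps | awk '/YourAppName/ {print $1}')
      [ -z "$PID" ] && continue   # JVM already gone; nothing to attach to
      sleep 30                    # give the JVM a chance to stabilize
      TS=$(date +%s)
      # Method 1: normal attach-based dump; method 2: forced dump
      jmap -dump:format=b,file="$DUMP_DIR/oom_$TS.hprof" "$PID" ||
        jmap -F -dump:format=b,file="$DUMP_DIR/oom_${TS}_forced.hprof" "$PID"
      ;;
  esac
done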
For systems where 100% dump reliability is essential:
1. Live Heap Analysis
Implement periodic sampling instead of full dumps:
jcmd PID GC.class_histogram > histogram.txt   # per-class instance counts and sizes (jcmd requires Java 7+; on Java 6 use: jmap -histo PID)
jstat -gcutil PID 1000 10 > gc_stats.txt      # GC utilization: 10 samples, 1 second apart
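One way to run the sampling on a schedule (the 5-minute interval is an assumption, not a recommendation from our testing):
# Timestamped histogram + GC stats every 5 minutes
PID=$(jps | awk '/YourAppName/ {print $1}')
while true; do
  TS=$(date +%s)
  jcmd "$PID" GC.class_histogram > "histogram_$TS.txt"
  jstat -gcutil "$PID" 1000 10 > "gc_stats_$TS.txt"
  sleep 300
done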
2. Footprint Reduction and Native Memory Tracking
These flags do not change how dump files are written; the first two shrink the heap's footprint, and the third makes native-memory usage observable, which helps diagnose the native pressure that causes dumps to fail (UseCompressedClassPointers and NativeMemoryTracking require Java 8+):
-XX:+UseCompressedOops
-XX:+UseCompressedClassPointers
-XX:NativeMemoryTracking=detail
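With native memory tracking enabled at startup, native usage can then be inspected on demand:
jcmd PID VM.native_memory summary   # or "detail" for per-call-site breakdowns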
General recommendations:
- Test dump procedures under load (not just OOM conditions)
- Allocate twice the heap size in disk space for dump files
- Monitor the dump process itself for failures
- Consider upgrading to Java 8+ for improved dump reliability
When dumps fail, check the OS side:
cat /proc/sys/kernel/core_pattern   # where core files go if the JVM itself crashes
ulimit -c                           # core file size limit; raise it with: ulimit -c unlimited
df -h /tmp                          # free space at the dump target
Working with a Java 1.6 JVM handling 3GB heap sizes, our team consistently encountered failed heap dumps when attempting to diagnose OutOfMemoryError situations. While the -XX:+HeapDumpOnOutOfMemoryError flag exists, specific operational constraints forced us to use jmap triggered via bash scripts instead.
Through painful experience, we identified several key failure points:
- Insufficient disk space during dump generation (a 3GB heap ≠ a 3GB dump file; dumps are often larger)
- Signal contention when multiple monitoring tools compete
- JVM instability during OOM conditions
- Native memory exhaustion during dump creation
After extensive testing, we implemented these improvements:
# Sample improved bash script snippet
JAVA_PID=$(pgrep -f "our_application.jar")
DUMP_DIR="/heapdumps"
mkdir -p "$DUMP_DIR"
# Critical parameters for reliable dumps
ulimit -c unlimited                   # allow full core files as a last-resort artifact
sysctl -w vm.max_map_count=262144     # raise the mmap limit (the key is vm.max_map_count; requires root)
jmap -dump:format=b,file="${DUMP_DIR}/heapdump_$(date +%s).hprof" "$JAVA_PID" || {
    echo "Primary dump failed, attempting fallback" >&2
    jmap -F -dump:format=b,file="${DUMP_DIR}/heapdump_$(date +%s)_fallback.hprof" "$JAVA_PID"
}
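A pre-flight disk check before the jmap call would also guard against the disk-space failure point above (a sketch, not part of our original script; the heap size constant assumes our 3GB heap, and the 2x factor follows the sizing rule below):
# Abort early if the dump directory lacks ~2x the heap size of free space
HEAP_MB=3072
FREE_MB=$(df -Pm "$DUMP_DIR" | awk 'NR==2 {print $4}')
if [ "$FREE_MB" -lt $((HEAP_MB * 2)) ]; then
    echo "Not enough free space in $DUMP_DIR for a heap dump" >&2
    exit 1
fi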
These OS-level changes significantly improved success rates:
- Set vm.overcommit_memory=1 temporarily during dump collection (see the sketch after this list)
- Increased kernel.pid_max to prevent PID exhaustion
- Pre-allocated the dump directory with 2x the heap size in free space
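Applying and reverting those settings looks roughly like this (the pid_max value is an assumption; pick one suited to your distribution, and run as root):
# Temporarily allow optimistic memory overcommit while the dump is written
sysctl -w vm.overcommit_memory=1
sysctl -w kernel.pid_max=4194304
# ... trigger the heap dump here ...
# Restore the default heuristic overcommit afterwards
sysctl -w vm.overcommit_memory=0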
When jmap proves unreliable:
- Use jcmd instead (requires Java 7+): jcmd ${JAVA_PID} GC.heap_dump ${DUMP_DIR}/heapdump.hprof
- Implement a shutdown hook for graceful dumping:
import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

// Register once at application startup. Requires Java 8+ for the lambda
// (and Java 7+ for getPlatformMXBean). Shutdown hooks never run on SIGKILL
// or a JVM crash, so this complements, rather than replaces,
// -XX:+HeapDumpOnOutOfMemoryError.
Runtime.getRuntime().addShutdownHook(new Thread(() -> {
    try {
        HotSpotDiagnosticMXBean diagBean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        diagBean.dumpHeap("/emergency_dump.hprof", true);  // true = live objects only
    } catch (Exception e) {
        e.printStackTrace();  // best effort; we are already shutting down
    }
}));
Through this troubleshooting process, we discovered:
- Heap dumps during OOM are inherently unstable; capture dumps proactively (see the sketch after this list)
- The -F (force) flag in jmap can sometimes work when normal mode fails
- Parallel GC algorithms tend to produce more reliable dumps than CMS during failures
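For the proactive-capture point, a minimal sketch of a scheduled healthy-state dump (the path, schedule, and process name are assumptions):
#!/bin/bash
# Run from cron during a quiet window: a dump taken while the JVM is healthy
# is far more likely to succeed than one attempted mid-OOM.
PID=$(pgrep -f "our_application.jar")
[ -n "$PID" ] && jmap -dump:live,format=b,file="/heapdumps/proactive_$(date +%F).hprof" "$PID"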