After inheriting a large-scale Tomcat application serving as a Red5 server for Flex clients, I noticed response times creeping up from under 100ms to 300-400ms under sustained load. Memory leaks were suspected but never conclusively proven through heap dumps or GC analysis. The real surprise came during a staging load test, when the server essentially stopped responding.
Out of desperation, I tried:
sync && echo 3 > /proc/sys/vm/drop_caches
Miraculously, the server immediately returned to normal operation. But why did this work?
sync: forces completion of pending disk writes
echo N > /proc/sys/vm/drop_caches: clears kernel caches; the value of N selects what gets dropped (a quick demonstration follows the list):
- 1: page cache (cached file contents)
- 2: dentries and inodes (cached directory and file metadata)
- 3: both (1+2)
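To see what each value actually frees on a given box, compare /proc/meminfo before and after each drop (run as root; an illustrative check, not part of the original incident):
# Snapshot the relevant counters, drop one cache type at a time, and compare
grep -E '^(MemFree|Cached|SReclaimable):' /proc/meminfo   # baseline
sync                                                      # flush dirty pages first
echo 1 > /proc/sys/vm/drop_caches                         # page cache only
grep -E '^(MemFree|Cached):' /proc/meminfo
echo 2 > /proc/sys/vm/drop_caches                         # dentries and inodes only
grep -E '^(MemFree|SReclaimable):' /proc/meminfo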
In high-throughput Java applications:
- Linux aggressively caches file operations (JARs, class files, etc.)
- Over time, cached entries become stale or fragmented
- The JVM heap and the OS page cache compete for the same physical memory (a quick comparison follows)
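To gauge that competition on a live server, put the Tomcat JVM's resident memory next to the kernel's caches. A minimal sketch, assuming the JVM's command line contains "catalina":
# Resident memory of the Tomcat JVM vs. kernel page cache and reclaimable slab
JAVA_PID=$(pgrep -f catalina | head -1)   # assumes a single Tomcat JVM
echo "JVM RSS:          $(ps -o rss= -p "$JAVA_PID") kB"
echo "Page cache:       $(awk '/^Cached:/ {print $2, $3}' /proc/meminfo)"
echo "Reclaimable slab: $(awk '/^SReclaimable:/ {print $2, $3}' /proc/meminfo)"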
Consider this approach when OutOfMemoryError entries start appearing in the logs or free memory drops below a safe floor, for example from a monitoring script:
# Monitoring script example (run as root, e.g. every few minutes)
# Only look at recent log lines so old OOM entries don't keep re-triggering;
# the second test fires when vmstat reports less than ~10 MB of free memory
if [ "$(tail -n 500 /var/log/tomcat/catalina.out | grep -c 'java.lang.OutOfMemoryError')" -gt 0 ] ||
   [ "$(vmstat 1 2 | tail -1 | awk '{print $4}')" -lt 10000 ]; then
    logger "Triggering cache cleanup"
    sync && echo 3 > /proc/sys/vm/drop_caches
fi
For production systems, a more controlled approach is to schedule a gentler drop during the nightly low-traffic window:
# Scheduled maintenance - system crontab format (/etc/crontab or /etc/cron.d/)
# echo 1 drops only the page cache, the least disruptive option
0 3 * * * root /usr/bin/sync && /usr/bin/echo 1 > /proc/sys/vm/drop_caches
A few caveats:
1. This is NOT a substitute for proper memory management
2. Dropping caches causes an immediate performance hit until the caches are rebuilt (illustrated below)
3. Consider adjusting vm.vfs_cache_pressure instead for a more permanent fix
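Caveat 2 is easy to reproduce; a rough illustration (the webapps path is hypothetical, substitute your own deployment directory):
# The first pass after a drop re-reads everything from disk; the second hits the warm cache
sync && echo 3 > /proc/sys/vm/drop_caches
time grep -r "drop_caches_demo" /opt/tomcat/webapps/ > /dev/null   # cold cache: slow
time grep -r "drop_caches_demo" /opt/tomcat/webapps/ > /dev/null   # warm cache: fast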
To put concrete numbers on the degradation, here is what the staging metrics looked like before and after roughly four hours of sustained load:
# Sample metrics showing the performance degradation
Baseline:       Requests: 1500/min → Response p99: 98ms
After 4 hours:  Requests: 1500/min → Response p99: 387ms
In Java applications handling many small real-time interactions, the page and dentry/inode caches can grow until they dominate physical memory:
# Check current cache usage
free -h
              total        used        free      shared  buff/cache   available
Mem:            16G        5.2G        230M        1.3G         11G        9.2G
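free only shows the combined buff/cache figure. To see whether the page cache or the dentry/inode caches are the bigger consumer, /proc/meminfo and slabtop give a finer breakdown (a diagnostic sketch, not output from the original incident):
# Split buff/cache into page cache vs. reclaimable slab (dentries, inodes)
grep -E '^(Buffers|Cached|SReclaimable):' /proc/meminfo
# Top slab consumers, sorted by cache size; dentry and inode caches usually lead on busy servers
slabtop -o -s c | head -15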
The sync command flushes dirty (not-yet-written) pages to disk before the drop. drop_caches only releases clean pages, so skipping sync does not corrupt data, but it leaves the dirty portion of the cache in place and reduces how much memory is actually reclaimed. The sequence:
- sync: force write completion
- echo 3 > /proc/sys/vm/drop_caches: drop the page cache plus dentries and inodes
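You can watch this happen through the Dirty and Writeback counters around the sync (run as root):
# Dirty pages are skipped by drop_caches, so flush them first
grep -E '^(Dirty|Writeback):' /proc/meminfo   # can be substantial under write load
sync                                          # push dirty pages to disk
grep -E '^(Dirty|Writeback):' /proc/meminfo   # should now be close to zero
echo 3 > /proc/sys/vm/drop_caches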
If drops need to happen more often than a nightly cron job, wrap them in guards so they only fire while the system is otherwise quiet:
#!/bin/bash
# Safe cache dropper for Tomcat/Java apps
if [ "$(pgrep -fc java)" -lt 1 ]; then
    echo "No Java processes running"
    exit 1
fi

# Only proceed if load is below threshold
LOAD=$(awk '{print $1}' /proc/loadavg)
if [ "$(echo "$LOAD < 2.0" | bc)" -eq 1 ]; then
    logger "Initiating safe cache drop"
    sync
    echo 3 > /proc/sys/vm/drop_caches
    logger "Cache drop completed. Current free: $(free -h | awk '/Mem/{print $4}')"
fi
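Deployment then reduces to scheduling the script for the quiet window; the path below is hypothetical:
# System crontab entry (e.g. /etc/cron.d/cache-drop) invoking the guarded script nightly
15 3 * * * root /usr/local/sbin/safe-cache-drop.sh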
For long-term stability, consider:
- JVM tuning: size application buffers so files are re-read less often and less work falls to the page cache
- Redis caching: keep hot real-time data in memory rather than on the filesystem
- Kernel parameters: raise vm.vfs_cache_pressure so dentries and inodes are reclaimed more aggressively
# Example kernel tuning
sysctl -w vm.vfs_cache_pressure=150
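sysctl -w only lasts until the next reboot; to persist the value, drop it into /etc/sysctl.d/ (the file name is illustrative):
# Make the setting survive reboots
echo 'vm.vfs_cache_pressure = 150' > /etc/sysctl.d/99-vfs-cache-pressure.conf
sysctl --system   # reload all sysctl configuration files, including the new one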
The cache drop technique serves as an emergency remedy rather than a permanent solution, but understanding its mechanics proves invaluable when troubleshooting mysterious latency spikes in real-time systems.