Recently, I encountered a puzzling situation where my application was hitting the "too many open files" error despite having generous user-level limits configured. Here's what I found:
# Per-user soft limit for the current shell session (not actually a system-wide setting)
$ ulimit -n
100000
# Open files held by the myapp user's processes (note: lsof also counts mapped libraries and the like, not only numbered FDs)
$ lsof -n -u myapp | wc -l
2708
Linux actually enforces file descriptor limits at multiple levels:
- System-wide maximum: Defined in /proc/sys/fs/file-max
- User-level limits: Set via /etc/security/limits.conf or pam_limits
- Per-process limits: Inherited from the parent process and can be modified via prlimit; the commands below show how to check each level
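To see the first two levels from a shell (the per-process level is covered next):
# System-wide maximum number of open file handles
$ cat /proc/sys/fs/file-max
# Soft and hard limits for the current user session
$ ulimit -Sn
$ ulimit -Hn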
To inspect the actual limit for a running process:
# Method 1: Using /proc
$ cat /proc/$(pidof myapp)/limits | grep "Max open files"
# Method 2: Using prlimit
$ prlimit --pid $(pidof myapp) --nofile
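A small helper that puts usage and limit side by side can save some back-and-forth; this is a hypothetical script, with the soft limit taken from field 4 of the "Max open files" line:
#!/bin/bash
# Hypothetical helper: compare a process's current FD usage with its soft limit
pid=$1
soft=$(awk '/Max open files/ {print $4}' /proc/"$pid"/limits)
used=$(ls -1 /proc/"$pid"/fd 2>/dev/null | wc -l)
echo "PID $pid: $used of $soft file descriptors in use"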
Several situations can cause process limits to be lower than user limits:
- The process was started before ulimit changes were applied (demonstrated below)
- The application uses setrlimit() to self-impose stricter limits
- The process runs in a container whose runtime sets its own ulimits, independent of the host's limits.conf
- All threads in a process share one FD table, so a busy thread-per-connection design consumes the per-process limit faster than expected
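The inheritance point is easy to demonstrate: a limit change only affects processes started afterwards, which is why long-running daemons keep the limit they were launched with.
# Lower the soft limit inside a subshell; only children of that subshell see it
(
  ulimit -Sn 256
  grep "Max open files" /proc/self/limits   # reports a soft limit of 256
)
ulimit -Sn   # the parent shell still has its original limit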
Here are actionable fixes for different scenarios:
# For systemd services (add to the unit file or a drop-in override)
[Service]
LimitNOFILE=100000
# For Docker containers
docker run --ulimit nofile=100000:100000 myapp
# For temporary process adjustment
prlimit --pid $(pidof myapp) --nofile=100000:100000
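For the systemd case, the new limit only reaches the process after the drop-in is loaded and the service restarted; assuming the unit is named myapp.service, the steps look roughly like this:
systemctl edit myapp.service      # opens a drop-in override; add the [Service] lines above
systemctl daemon-reload           # usually automatic after systemctl edit, but harmless
systemctl restart myapp.service   # limits are only applied to newly started processes
grep "Max open files" /proc/$(pidof myapp)/limits   # confirm the limit reached the process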
When you suspect FD leaks, monitor changes over time:
#!/bin/bash
# Print myapp's FD count once per second (ls -1 avoids counting the "total" header line that ls -l adds)
while true; do
  ls -1 /proc/$(pidof myapp)/fd | wc -l
  sleep 1
done
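When the count keeps climbing, it also helps to know what kind of descriptor is accumulating. The sketch below groups a process's descriptors by their readlink target; the categories are just the common target formats, so treat it as a starting point:
#!/bin/bash
# Sketch: group a process's open descriptors by what they point at
pid=$1
for fd in /proc/"$pid"/fd/*; do
  case "$(readlink "$fd")" in
    socket:*)     echo socket ;;
    pipe:*)       echo pipe ;;
    anon_inode:*) echo anon_inode ;;   # epoll, eventfd, timerfd, ...
    *)            echo "file/device" ;;
  esac
done | sort | uniq -c | sort -rn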
For high-performance applications, you might need to adjust:
# Increase system-wide maximum
echo 2000000 > /proc/sys/fs/file-max
# For persistent changes, add to /etc/sysctl.conf:
fs.file-max = 2000000
When you encounter "too many open files" errors despite having high system limits configured, you're dealing with Linux's multi-layered resource control system. The key constraints operate at three levels:
# System-wide kernel limits
/proc/sys/fs/file-max   # total open file handles allowed across the system
/proc/sys/fs/nr_open    # ceiling on any single process's nofile limit
# Per-user limits (applied at login via pam_limits)
/etc/security/limits.conf
# Application-specific limits (check your software docs)
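One practical consequence of fs.nr_open: it is the ceiling for any single process's nofile value, so a ulimit or LimitNOFILE above it is rejected outright. A quick way to confirm this from a root shell (the exact error wording varies by shell):
nr_open=$(cat /proc/sys/fs/nr_open)
ulimit -n $(( nr_open + 1 ))   # refused: the value exceeds fs.nr_open, even for root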
To diagnose exactly where your bottleneck occurs:
# Check current process limits
cat /proc/<PID>/limits
# Verify system-wide FD usage
cat /proc/sys/fs/file-nr
# Compare with your application's actual usage
ls -1 /proc/<PID>/fd | wc -l
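The three fields in /proc/sys/fs/file-nr are allocated file handles, allocated-but-unused handles (typically 0 on modern kernels), and the system-wide maximum; a one-liner like this hypothetical awk makes the headroom explicit:
awk '{ printf "allocated=%s free=%s max=%s in_use=%d\n", $1, $2, $3, $1 - $2 }' /proc/sys/fs/file-nr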
Even experienced engineers frequently overlook:
- Thread-per-connection models that don't properly close sockets
- Descriptors and mappings that still reference deleted files, visible via lsof -n -p <PID> | grep DEL (a /proc-based sweep is sketched below)
- Containerized environments whose runtime applies its own ulimits, regardless of the host's limits.conf
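For the deleted-file case, a /proc-based sweep across all processes can point at the worst offenders; this is only a sketch (run as root so every fd directory is readable):
#!/bin/bash
# Sketch: list processes still holding descriptors to files that were deleted
# (classic causes: rotated logs kept open, unlinked temp files)
for proc in /proc/[0-9]*; do
  count=$(ls -l "$proc"/fd 2>/dev/null | grep -c '(deleted)')
  if [ "$count" -gt 0 ]; then
    cmd=$(tr '\0' ' ' < "$proc"/cmdline | cut -c1-60)
    echo "PID ${proc##*/}: $count deleted-but-open files  [$cmd]"
  fi
done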
For high-performance applications needing thousands of connections:
# Temporary increase (until reboot)
sysctl -w fs.file-max=2000000
sysctl -w fs.nr_open=3000000
# Persistent configuration
echo "fs.file-max = 2000000" >> /etc/sysctl.conf
echo "fs.nr_open = 3000000" >> /etc/sysctl.conf
sysctl -p
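On distributions using systemd, a drop-in under /etc/sysctl.d/ keeps these settings separate from the stock /etc/sysctl.conf; the filename below is just a convention:
cat > /etc/sysctl.d/99-fd-limits.conf <<'EOF'
fs.file-max = 2000000
fs.nr_open = 3000000
EOF
sysctl --system   # reloads every sysctl configuration file, including the new drop-in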
Here's how we fixed a Java application hitting FD limits despite high ulimit:
# 1. See what kind of descriptors are piling up
ls -l /proc/<PID>/fd | grep -c socket
# 2. Identify the remote endpoints with OS tools
lsof -p <PID> -a -iTCP -nP
# 3. Fix in code (Java example)
try (Socket s = new Socket(host, port);
     InputStream is = s.getInputStream()) {
    // both are closed automatically by try-with-resources
}
For long-running services, implement proactive monitoring:
#!/bin/bash
# Continuous FD usage monitor; takes the target PID as its first argument.
# WARNING_THRESHOLD can be overridden from the environment (default: 80% of the
# 100000 limit used in the examples above).
WARNING_THRESHOLD=${WARNING_THRESHOLD:-80000}
while true; do
  fd_count=$(ls -1 /proc/"$1"/fd | wc -l)
  echo "$(date) - FD count: $fd_count"
  if [ "$fd_count" -gt "$WARNING_THRESHOLD" ]; then
    alert_ops_team "$1 approaching FD limit"   # placeholder: swap in your own notification hook
  fi
  sleep 60
done
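Assuming the script above is saved as fd-monitor.sh and alert_ops_team is wired to your own notification channel, a typical invocation looks like this:
# Watch the myapp process (assumes a single PID), warning at 80% of its 100000 limit
WARNING_THRESHOLD=80000 ./fd-monitor.sh "$(pidof myapp)"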