How to Fix “failed to create fsnotify watcher: too many open files” Error in Kubernetes Log Tailing


When working with Kubernetes, you might encounter the error "failed to create fsnotify watcher: too many open files" while trying to tail pod logs with kubectl logs -f. It typically means the node has hit a limit on open file descriptors or inotify instances.

Kubernetes relies heavily on file watchers (inotify/fsnotify) for many operations. Each of the following consumes file descriptors and inotify watches:

  • Running pod
  • ConfigMap/Secret update being tracked
  • Log tailing session

The default limits (often 1024 open files per process and 128 inotify instances per user) are quickly exhausted in busy clusters.
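
To see which processes are actually holding inotify instances on a node, a rough count can be pulled from /proc. This is a quick sketch that assumes a standard Linux /proc layout; run it as root to see other users' processes:

# Count inotify file descriptors per PID (highest first)
find /proc/*/fd -lname 'anon_inode:inotify' 2>/dev/null \
  | cut -d/ -f3 | sort | uniq -c | sort -nr | head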

First, verify your current limits:

# Check the system-wide limit
cat /proc/sys/fs/file-max

# Check the current shell's limit
ulimit -n

# Check limits for a running process (replace PID with your process ID)
cat /proc/PID/limits | grep "Max open files"
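
Because this particular error comes from fsnotify, it is also worth checking the kernel's inotify limits and the current system-wide file handle usage (standard Linux sysctl names):

# Per-user inotify limits (the usual cause of fsnotify watcher errors)
sysctl fs.inotify.max_user_instances fs.inotify.max_user_watches

# Allocated, free, and maximum file handles system-wide
sysctl fs.file-nr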

1. Temporary Increase for Testing

For immediate testing:

ulimit -n 65536
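
This change only applies to the current shell and its children, and the soft limit cannot be raised above the hard limit without root. A quick check that the new value took effect:

# Soft and hard limits for the current shell
ulimit -Sn
ulimit -Hn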

2. Permanent System Configuration

Edit /etc/security/limits.conf:

* soft nofile 65536
* hard nofile 65536
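
Note that limits.conf is applied by pam_limits at login, so it affects interactive sessions but not systemd-managed services such as kubelet or containerd; those need the drop-in shown in the next section. To confirm the change, open a fresh login session:

# In a new login shell (for example a fresh SSH session)
ulimit -n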

3. Kubernetes-Specific Fixes

For kubelet (often the culprit):

# Create a systemd drop-in for kubelet (on kubeadm clusters the existing
# drop-in is usually /etc/systemd/system/kubelet.service.d/10-kubeadm.conf)
sudo systemctl edit kubelet

# Add the following to the drop-in that opens, then save:
[Service]
LimitNOFILE=1048576

# Restart kubelet so the new limit takes effect
sudo systemctl restart kubelet
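
To confirm the running kubelet actually picked up the new limit (assuming kubelet runs as a systemd service on the node):

# What systemd applied to the unit
systemctl show kubelet -p LimitNOFILE

# What the running process really has
cat /proc/$(pgrep -x kubelet)/limits | grep "Max open files"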

In practice, the "failed to create fsnotify watcher" message often points at the per-user inotify limits rather than the nofile limit, and the kernel defaults (commonly 128 instances and 8192 watches) are easy to exhaust. Raise them and make the change persistent:

# Raise inotify limits and persist them across reboots
echo fs.inotify.max_user_instances=1024 | sudo tee -a /etc/sysctl.conf
echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
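
On systems where you prefer not to edit /etc/sysctl.conf directly, the same settings can live in a dedicated drop-in; the file name below is arbitrary:

# Persist inotify limits in a sysctl drop-in and reload all sysctl files
sudo tee /etc/sysctl.d/99-inotify.conf <<'EOF'
fs.inotify.max_user_instances = 1024
fs.inotify.max_user_watches = 524288
EOF
sudo sysctl --system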

Use this command to get a rough count of files currently open by your user:

lsof -u $(whoami) | wc -l

When debugging a specific pod:

# Check file descriptors used by kubelet
ls /proc/$(pgrep kubelet)/fd | wc -l

# Alternative for setups where kubelet runs in a container
docker exec -it $(docker ps | grep kubelet | awk '{print $1}') sh -c "ls /proc/1/fd | wc -l"
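
To watch how the count changes while you tail logs, a simple loop works (assuming watch is installed on the node):

# Re-count kubelet's open file descriptors every 5 seconds
watch -n 5 'ls /proc/$(pgrep kubelet)/fd | wc -l'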

You can also reduce pressure on these limits by tailing logs more selectively instead of streaming everything:

# Use more selective log tailing
kubectl logs -f pod-name --tail=100 --since=1h
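
When tailing many pods at once, a label selector with a capped number of concurrent streams keeps descriptor usage bounded; the app=myapp label below is only an example:

# Tail at most 5 concurrent log streams for pods matching a label
kubectl logs -f -l app=myapp --max-log-requests=5 --tail=50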

To identify which processes are holding the most open files across the node:

# List processes with open file counts
lsof | awk '{print $1}' | sort | uniq -c | sort -nr | head
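
If lsof is not installed on the node, a similar per-process view can be approximated straight from /proc (without root you will only see your own processes):

# Count open descriptors per process directly from /proc
for pid in /proc/[0-9]*; do
  printf '%s %s\n' "$(ls "$pid/fd" 2>/dev/null | wc -l)" "$pid"
done | sort -rn | head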

If the container runtime is Docker, make sure its file descriptor limit is raised as well (a drop-in needs a [Service] header to be valid):

# Create a drop-in for the Docker service
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo tee /etc/systemd/system/docker.service.d/override.conf <<'EOF'
[Service]
LimitNOFILE=infinity
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
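
A quick check that the daemon picked up the new limit (the daemon process is assumed to be named dockerd):

# What systemd applied to the Docker service
systemctl show docker -p LimitNOFILE

# What the running daemon actually has
cat /proc/$(pgrep -x dockerd)/limits | grep "Max open files"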