Troubleshooting Supervisor Socket File Missing Error: /var/run/supervisor.sock and /tmp/supervisor.sock Issues



When dealing with Supervisor's socket file issues, it's crucial to understand how Supervisor handles its communication channels. The socket file serves as an IPC (Inter-Process Communication) mechanism between supervisorctl and the supervisord daemon.

# Typical socket file locations:
/var/run/supervisor.sock  # Used by the Debian/Ubuntu package configuration
/tmp/supervisor.sock      # Used by Supervisor's bundled sample configuration

The inconsistent behavior you're seeing (sometimes looking in /var/run, sometimes in /tmp) typically occurs when:

  • supervisorctl and supervisord read different configuration files, each naming a different socket path
  • The supervisord service wasn't started properly after a reboot
  • Files in /var/run were cleared during the reboot (it is usually a tmpfs)
  • Permissions prevent the socket file from being created
  • Multiple Supervisor instances conflict with each other

Ensure your /etc/supervisor/supervisord.conf contains these critical settings:

[unix_http_server]
file=/var/run/supervisor.sock   ; Path where supervisord creates the socket
chmod=0700                      ; Only the socket owner may connect
chown=nobody:nogroup            ; Socket owner; change to the user who runs supervisorctl

[supervisorctl]
serverurl=unix:///var/run/supervisor.sock  ; Must match above
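A mismatch between these two settings is easy to miss by eye. The following is a minimal sketch of a check, assuming a simple INI layout like the one above; the helper names ini_get and check_socket_config are hypothetical and not part of Supervisor.

```shell
# Print the value of KEY inside [SECTION] of an INI-style file.
# Usage: ini_get FILE SECTION KEY
# Inline " ; comments" and surrounding whitespace are stripped.
ini_get() {
    awk -v section="[$2]" -v key="$3" '
        /^[ \t]*\[/ {
            header = $0
            gsub(/[ \t]/, "", header)
            in_section = (header == section)
            next
        }
        in_section {
            line = $0
            sub(/[ \t];.*$/, "", line)          # drop inline comment
            eq = index(line, "=")
            if (eq == 0) next
            k = substr(line, 1, eq - 1)
            v = substr(line, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)      # trim key
            gsub(/^[ \t]+|[ \t]+$/, "", v)      # trim value
            if (k == key) { print v; exit }
        }
    ' "$1"
}

# Compare the server socket path with the path supervisorctl will use.
# Returns 0 and prints "match: PATH" when they agree, 1 otherwise.
check_socket_config() {
    conf=$1
    server_path=$(ini_get "$conf" unix_http_server file)
    client_url=$(ini_get "$conf" supervisorctl serverurl)
    client_path=${client_url#unix://}
    if [ -n "$server_path" ] && [ "$server_path" = "$client_path" ]; then
        echo "match: $server_path"
    else
        echo "mismatch: server='$server_path' client='$client_url'"
        return 1
    fi
}

# Example (path taken from the text above):
# check_socket_config /etc/supervisor/supervisord.conf
```

The sketch only reads the file it is given; settings pulled in through an [include] section would need to be checked separately.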

If you're using systemd, create or modify /etc/systemd/system/supervisor.service:

[Unit]
Description=Supervisor process control system
After=network.target

[Service]
Type=forking
ExecStartPre=/bin/mkdir -p /var/run/supervisor
ExecStart=/usr/bin/supervisord -c /etc/supervisor/supervisord.conf
ExecStop=/usr/bin/supervisorctl shutdown
ExecReload=/usr/bin/supervisorctl reload
User=root
Restart=on-failure

[Install]
WantedBy=multi-user.target

Then run:

sudo systemctl daemon-reload
sudo systemctl enable supervisor
sudo systemctl start supervisor
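Because supervisord creates the socket shortly after it starts, a script that calls supervisorctl right after starting the service can still hit the "no such file" error. A small sketch of a wait loop follows; the function name wait_for_socket and the 10-second default are assumptions for illustration.

```shell
# Wait until PATH exists as a UNIX socket, polling once per second.
# Usage: wait_for_socket PATH [TIMEOUT_SECONDS]
# Returns 0 as soon as the socket appears, 1 if it never does.
wait_for_socket() {
    path=$1
    remaining=${2:-10}
    while [ "$remaining" -gt 0 ]; do
        [ -S "$path" ] && return 0
        sleep 1
        remaining=$((remaining - 1))
    done
    [ -S "$path" ]    # final check after the last sleep
}

# Example:
# sudo systemctl start supervisor
# wait_for_socket /var/run/supervisor.sock 15 && sudo supervisorctl status
```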

Use these diagnostic commands to identify the issue:

# Check if supervisord is running
ps aux | grep supervisord

# Check socket file existence
ls -l /var/run/supervisor.sock /tmp/supervisor.sock 2>/dev/null

# Force socket file recreation
sudo supervisorctl shutdown
sudo supervisord -c /etc/supervisor/supervisord.conf
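The existence check can also be scripted so that it reports which of the two locations actually holds a socket. This is a small sketch using the shell's -S test (true only for socket files); first_socket is a hypothetical helper name.

```shell
# Print the first candidate path that exists and is a UNIX socket.
# Returns 1 if none of the candidates is a socket.
first_socket() {
    for path in "$@"; do
        if [ -S "$path" ]; then
            echo "$path"
            return 0
        fi
    done
    return 1
}

if sock=$(first_socket /var/run/supervisor.sock /tmp/supervisor.sock); then
    echo "supervisord socket found at $sock"
else
    echo "no socket found; supervisord may not be running"
fi
```

If a socket is found at a path other than the one supervisorctl complains about, the two programs are most likely reading different settings.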

Sometimes the issue stems from environment variables or permission problems:

# Ensure proper permissions on the run directory (only needed if the socket lives inside /var/run/supervisor)
sudo mkdir -p /var/run/supervisor
sudo chmod 755 /var/run/supervisor

# Check environment variables that might affect paths
env | grep SUPERVISOR

When running supervisorctl, you can explicitly specify the socket path:

supervisorctl -s unix:///var/run/supervisor.sock status

Or create an alias in your shell configuration:

alias supervisorctl='supervisorctl -s unix:///var/run/supervisor.sock'

After installing Supervisor (v3.1.2) to manage ElastAlert, I encountered inconsistent UNIX socket errors:

unix:///var/run/supervisor.sock no such file
unix:///tmp/supervisor.sock no such file

What's particularly puzzling is that while supervisorctl throws these errors, ElastAlert continues to function normally after reboots.

The default configuration in /etc/supervisor/supervisord.conf specifies:

[unix_http_server]
file=/var/run/supervisor.sock
chmod=0700

Yet we're seeing references to both /var/run and /tmp locations, suggesting a configuration conflict.
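One plausible source of such a conflict is that supervisorctl, when run without -c, searches several locations for a configuration file, including the current working directory, so the file it loads can change depending on where the command is run. The sketch below lists candidates in the order I understand the Supervisor documentation to describe; treat the list as an assumption, since the paths relative to the executable are omitted and, per the documentation, /etc/supervisor/supervisord.conf is only searched from Supervisor 3.3.0 onward unless the distribution patches it in. The function names are hypothetical.

```shell
# Print candidate config paths for a given working directory
# (defaults to the current one). The order is an assumption based on
# the Supervisor documentation, not read from the installed version.
config_candidates() {
    dir=${1:-$PWD}
    printf '%s\n' \
        "$dir/supervisord.conf" \
        "$dir/etc/supervisord.conf" \
        /etc/supervisord.conf \
        /etc/supervisor/supervisord.conf
}

# Print the first candidate that exists as a regular file.
# Prints nothing if no candidate exists.
first_existing_config() {
    config_candidates "$1" | while IFS= read -r candidate; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            break
        fi
    done
}

# Compare what would be picked up from two different working directories
first_existing_config /root
first_existing_config /tmp
```

If the two calls print different files, or one prints nothing, that would explain why the reported socket path changes between invocations. Passing -c /etc/supervisor/supervisord.conf to supervisorctl removes the ambiguity.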

First, verify if the supervisor service is actually running:

ps aux | grep supervisord

Check the actual socket file location being used:

sudo find / -name "*.sock" -type s 2>/dev/null

Edit your supervisor configuration to explicitly define the socket location:

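[unix_http_server]
file=/var/run/supervisor/supervisor.sock
chmod=0770
chown=root:supervisor   ; assumes a "supervisor" group exists

[supervisorctl]
serverurl=unix:///var/run/supervisor/supervisor.sock   ; must match file= above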

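Then make sure the directory exists with the right ownership and permissions. Because /var/run is usually a tmpfs, the directory has to be recreated after every reboot: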

sudo mkdir -p /var/run/supervisor
sudo chown root:supervisor /var/run/supervisor
sudo chmod 755 /var/run/supervisor

For systems using systemd, create or modify the service file at /etc/systemd/system/supervisor.service:

[Unit]
Description=Supervisor process control system
After=network.target

[Service]
Type=forking
ExecStartPre=/bin/mkdir -p /var/run/supervisor
ExecStart=/usr/bin/supervisord -c /etc/supervisor/supervisord.conf
ExecStop=/usr/bin/supervisorctl shutdown
ExecReload=/usr/bin/supervisorctl reload
User=root

[Install]
WantedBy=multi-user.target

Then reload systemd, enable the service, and restart it:

sudo systemctl daemon-reload
sudo systemctl enable supervisor
sudo systemctl restart supervisor

When debugging, these commands can help identify the issue:

# Check supervisor status
sudo supervisorctl status

# View supervisor logs
sudo tail -f /var/log/supervisor/supervisord.log

# Force socket recreation
sudo pkill supervisord
sudo rm -f /var/run/supervisor.sock
sudo service supervisor restart

If socket issues persist, consider switching to a TCP listener bound to localhost:

[inet_http_server]
port=127.0.0.1:9001

Then connect using:

supervisorctl -s http://127.0.0.1:9001