Debugging “memcached dead but subsys locked” Error on CentOS: Port Conflicts and Process Lock Analysis


When "service memcached status" on CentOS reports "memcached dead but subsys locked", the init script's status check could not find a live process through its PID file, yet the lock file /var/lock/subsys/memcached still exists. In practice this means:

  • The service command considers the process dead because it cannot find a live PID for it
  • A leftover lock file still marks the service as started, which can prevent a clean restart
  • The actual process might still be running (as shown in your ps output)
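
As a rough illustration of how that message is chosen, the sketch below mirrors, in simplified form, the order of checks made by the status helper in /etc/init.d/functions. The names classify_state and pid_alive and the two-argument interface are made up for this example; the real helper takes different options and handles more cases.

```shell
#!/bin/sh
# Simplified sketch of the SysV status decision (hypothetical helper names).

# True if the given PID belongs to a live process.
pid_alive() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: not a PID
    esac
    # kill -0 fails for other users' processes when not root, so fall
    # back to checking /proc as well.
    kill -0 "$1" 2>/dev/null || [ -d "/proc/$1" ]
}

# Usage: classify_state PIDFILE LOCKFILE
# Prints the state and returns the matching LSB status code.
classify_state() {
    cs_pidfile=$1
    cs_lockfile=$2
    if [ -s "$cs_pidfile" ]; then
        cs_pid=$(head -n 1 "$cs_pidfile")
        if pid_alive "$cs_pid"; then
            echo "running"
            return 0
        fi
    fi
    if [ -f "$cs_pidfile" ]; then
        echo "dead but pid file exists"
        return 1
    fi
    if [ -f "$cs_lockfile" ]; then
        echo "dead but subsys locked"
        return 2
    fi
    echo "stopped"
    return 3
}
```

Called with the PID file and lock file paths your init script uses, it prints "dead but subsys locked" only when no PID file exists and the lock file does, which is why a running memcached that never wrote its PID file can still be reported as dead.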

Your ps output shows memcached is actually running under user nobody:

nobody   21983  0.0  1.8  60272 19912 ?        Ssl  16:46   0:00 memcached -d -p 11211 -u nobody -c 1024 -m 64

The port appears properly bound as shown in netstat:

tcp        0      0 :::11211                    :::*                        LISTEN      
udp        0      0 0.0.0.0:11211               0.0.0.0:*

This typically occurs when:

  1. The PID file is missing or stale, so the init script cannot find the process that is actually running
  2. Improper shutdown left lock files in /var/lock/subsys/
  3. Multiple instances attempting to bind to same port
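
To check the third cause, it can help to reduce netstat output to the sockets bound to one port. The following is a minimal sketch; listeners_on_port is a hypothetical helper name, and it assumes the column layout shown in the netstat output above, where the fourth column is the local address.

```shell
#!/bin/sh
# Usage: netstat -tuln | listeners_on_port PORT
# Prints "<proto> <local address>" for every tcp/udp socket whose local
# address ends in :PORT. Assumes the local address is column 4.
listeners_on_port() {
    awk -v p=":$1" '
        $1 ~ /^(tcp|udp)/ {
            n = length($4)
            # Compare only the tail of the address so that port 1121
            # does not match an address ending in :11211.
            if (substr($4, n - length(p) + 1) == p)
                print $1, $4
        }'
}
```

For example, sudo netstat -tuln | listeners_on_port 11211 should list one tcp and one udp socket for a single memcached instance. Extra addresses on the same port are worth comparing against the PIDs from ps.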

1. Clean up existing locks:

sudo rm -f /var/lock/subsys/memcached
sudo rm -f /var/run/memcached.pid
sudo rm -f /var/run/memcached/memcached.pid

2. Verify no conflicting processes:

sudo lsof -i :11211
ps aux | grep memcached

3. Force-clean the service state:

sudo service memcached stop
sudo pkill -9 memcached
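
Deleting the lock and PID files while memcached is still alive only hides the problem, so the cleanup can be made conditional. This is a sketch under assumptions: remove_stale_files and pid_alive are hypothetical helper names, and the paths are passed in by the caller rather than taken from any particular init script.

```shell
#!/bin/sh
# True if the given PID belongs to a live process.
pid_alive() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: not a PID
    esac
    kill -0 "$1" 2>/dev/null || [ -d "/proc/$1" ]
}

# Usage: remove_stale_files PIDFILE LOCKFILE
# Removes both files unless PIDFILE names a process that is still alive.
# Returns 1 and leaves everything in place if the process is running.
remove_stale_files() {
    rs_pidfile=$1
    rs_lockfile=$2
    if [ -s "$rs_pidfile" ]; then
        rs_pid=$(head -n 1 "$rs_pidfile")
        if pid_alive "$rs_pid"; then
            echo "process $rs_pid is still alive; leaving files in place"
            return 1
        fi
    fi
    rm -f "$rs_pidfile" "$rs_lockfile"
}
```

Run as root, something like remove_stale_files /var/run/memcached.pid /var/lock/subsys/memcached clears the files only when it is safe to do so. It does not detect a memcached that is running without any PID file, so the ps check above is still needed.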

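Add PID file handling in /etc/sysconfig/memcached. memcached writes its PID file after switching to the unprivileged user, so point -P at a directory that user can write to, such as /var/run/memcached/, rather than at /var/run directly:

PORT="11211"
USER="nobody"
MAXCONN="1024"
CACHESIZE="64"
OPTIONS="-l 127.0.0.1 -P /var/run/memcached/memcached.pid"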

Create a proper init script at /etc/init.d/memcached:

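#!/bin/sh
#
# chkconfig: - 55 45
# description: memcached

# Provides the daemon, killproc and status helpers
. /etc/rc.d/init.d/functions

# memcached writes its PID file after dropping privileges, so the
# directory must be writable by the unprivileged user
PIDDIR=/var/run/memcached
PIDFILE=$PIDDIR/memcached.pid
LOCKFILE=/var/lock/subsys/memcached

start() {
    # A lock file alone does not prove the daemon is running, so test the
    # process itself and let a stale lock be overwritten
    status -p $PIDFILE memcached >/dev/null 2>&1 && return 0
    mkdir -p $PIDDIR && chown nobody $PIDDIR
    daemon --pidfile $PIDFILE /usr/bin/memcached -d -p 11211 -u nobody -c 1024 -m 64 -P $PIDFILE
    retval=$?
    [ $retval -eq 0 ] && touch $LOCKFILE
    return $retval
}

stop() {
    killproc -p $PIDFILE /usr/bin/memcached
    retval=$?
    # Clear the lock unless a memcached process is still alive, so a
    # missing PID file cannot leave the service "dead but subsys locked"
    pidof memcached >/dev/null 2>&1 || rm -f $LOCKFILE
    return $retval
}

case "$1" in
    start)   start ;;
    stop)    stop ;;
    restart) stop; start ;;
    status)  status -p $PIDFILE memcached ;;
    *)       echo "Usage: $0 {start|stop|restart|status}"; exit 2 ;;
esac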

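After implementing fixes, verify with the commands below. Run tail last because it blocks until interrupted, and note that /var/log/memcached.log exists only if you have redirected memcached's output there:

sudo service memcached restart
sudo service memcached status
sudo tail -f /var/log/memcached.log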

The "dead but subsys locked" message appears when the init script's status check finds no running process for a service (in this case memcached) but the service's lock file under /var/lock/subsys/ is still present. This state usually means the process exited, or was started outside the init script, without the lock file and PID file being kept in step with it.

From your system observations:

# Network status:
tcp        0      0 :::11211                    :::*                        LISTEN      
udp        0      0 0.0.0.0:11211               0.0.0.0:*                               

# Process status:
nobody   21983  0.0  1.8  60272 19912 ?        Ssl  16:46   0:00 memcached -d -p 11211 -u nobody -c 1024 -m 64

Several scenarios can lead to this state:

  • Improper service shutdown
  • Resource leaks preventing cleanup
  • PID file remaining after process termination
  • Incorrect SELinux contexts

First, check for stale PID files:

ls -l /var/run/memcached/
cat /var/run/memcached/memcached.pid

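On systemd-based releases (CentOS 7 and later), verify the unit status and recent journal entries. On CentOS 6 and earlier, where the "subsys locked" message originates, use service memcached status instead: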

systemctl status memcached
journalctl -u memcached -n 50

Method 1: Clean Restart

systemctl stop memcached
pkill -9 memcached
rm -f /var/run/memcached/memcached.pid
systemctl start memcached
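
Sending SIGKILL straight away gives memcached no chance to shut down cleanly. A gentler variant, sketched below, asks the process to terminate and escalates only if it is still alive after a timeout. stop_gracefully is a hypothetical helper name, and the 5-second default is an arbitrary choice for illustration.

```shell
#!/bin/sh
# Usage: stop_gracefully PID [TIMEOUT_SECONDS]
# Sends SIGTERM, polls once per second, and sends SIGKILL only if the
# process still exists when the timeout runs out.
stop_gracefully() {
    sg_pid=$1
    sg_left=${2:-5}
    # If the signal cannot be delivered, treat the process as already gone.
    kill -TERM "$sg_pid" 2>/dev/null || return 0
    while [ "$sg_left" -gt 0 ]; do
        kill -0 "$sg_pid" 2>/dev/null || return 0
        sleep 1
        sg_left=$((sg_left - 1))
    done
    kill -KILL "$sg_pid" 2>/dev/null || true
    return 0
}
```

For example, stop_gracefully "$(pgrep -o memcached)" 10 could replace the pkill -9 line above when run as root.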

Method 2: Configuration Check

Review your memcached configuration (/etc/sysconfig/memcached):

PORT="11211"
USER="nobody"
MAXCONN="1024"
CACHESIZE="64"
OPTIONS=""

Method 3: SELinux Context Verification

ls -Z /usr/bin/memcached
restorecon -v /usr/bin/memcached

Consider implementing these best practices:

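  1. Add logging to your memcached service. memcached writes its -vv output to stderr, so the redirection below typically only takes effect when the init script passes OPTIONS through a shell, as the SysV daemon helper does; systemd does not interpret shell redirections in a unit's command line:
OPTIONS="-vv >> /var/log/memcached.log 2>&1"
  2. Implement a monitoring script: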
#!/bin/bash
if ! pgrep -x "memcached" > /dev/null
then
    systemctl restart memcached
fi
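
A process can exist and still not answer requests, so a monitoring script may want to probe the port as well. The sketch below sends the "version" command from memcached's text protocol and expects a reply starting with VERSION. It assumes bash (for /dev/tcp and read -t), and memcached_responds is a hypothetical helper name.

```shell
#!/bin/bash
# Usage: memcached_responds HOST PORT
# Returns 0 if the server answers the text-protocol "version" command
# within 2 seconds, non-zero otherwise.
memcached_responds() {
    mr_host=$1
    mr_port=$2
    (
        # Run in a subshell so descriptor 3 is closed automatically and a
        # failed connection cannot affect the calling script.
        exec 3<>"/dev/tcp/$mr_host/$mr_port" || exit 1
        printf 'version\r\n' >&3
        IFS= read -r -t 2 mr_reply <&3 || exit 1
        case "$mr_reply" in
            VERSION*) exit 0 ;;
            *)        exit 1 ;;
        esac
    ) 2>/dev/null
}
```

In the monitoring script this could be used as memcached_responds 127.0.0.1 11211 || systemctl restart memcached, with the host adjusted to whatever address -l binds to.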

For persistent issues, try running memcached in foreground debug mode:

memcached -vv -u nobody -p 11211

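Check system resource limits. ulimit -a reports the limits of your current shell, while the limits file shows what the running daemon actually has (-o selects the oldest match in case several memcached processes are running):

ulimit -a
cat /proc/$(pgrep -o memcached)/limits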
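
One limit worth comparing against the configuration is "Max open files", since each client connection uses a file descriptor and memcached needs a few more for its listening sockets. The sketch below reads the soft limit from a limits file and checks it against a connection count. Both function names are hypothetical, and treating "strictly greater than MAXCONN" as sufficient is a simplifying assumption.

```shell
#!/bin/sh
# Usage: soft_open_files LIMITS_FILE
# Prints the soft "Max open files" value from a /proc/<pid>/limits file.
soft_open_files() {
    awk '/^Max open files/ { print $4 }' "$1"
}

# Usage: enough_descriptors LIMITS_FILE MAXCONN
# Succeeds if the soft limit is unlimited or larger than MAXCONN.
enough_descriptors() {
    ed_soft=$(soft_open_files "$1")
    if [ "$ed_soft" = "unlimited" ]; then
        return 0
    fi
    [ "$ed_soft" -gt "$2" ]
}
```

For example, enough_descriptors /proc/$(pgrep -o memcached)/limits 1024 || echo "open-file limit too low for -c 1024" flags a daemon whose descriptor limit cannot cover the configured connection count.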