How to Automatically Kill Long-Running Processes by User in Linux Using Bash Script


When managing Linux servers, we often need to terminate processes that specific users have left running for too long. Doing so prevents resource hogging and helps keep the system stable. The scripts below target CentOS/RHEL systems.

The simplest approach reads the elapsed time of each process from ps (the etimes column, in seconds), filters on it with awk, and kills the matching PIDs:


#!/bin/bash
USERNAME="target_user"
THRESHOLD_MINUTES=5

# Get PIDs of processes running longer than the threshold.
# "pid=" and "etimes=" suppress the header line, so awk sees only data rows.
pids=$(ps -u "$USERNAME" -o pid= -o etimes= | \
       awk -v threshold=$((THRESHOLD_MINUTES*60)) \
       '$2>=threshold {print $1}')

# Kill the processes if any were found. $pids is left unquoted on purpose
# so that each PID becomes a separate argument to kill.
if [ -n "$pids" ]; then
    kill $pids
    echo "$(date): Killed following PIDs for user $USERNAME: ${pids//$'\n'/ }" >> /var/log/long_process_killer.log
fi

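For production systems, we should add more safeguards, such as a whitelist of command names that must never be killed and a log entry for every kill attempt: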


#!/bin/bash
USERNAME="target_user"
THRESHOLD_MINUTES=5
PROCESS_WHITELIST=("sshd" "bash") # Processes to never kill
LOG_FILE="/var/log/process_killer.log"

# Function to check if a command name is in the whitelist
is_whitelisted() {
    local comm=$1 name
    for name in "${PROCESS_WHITELIST[@]}"; do
        [[ "$name" == "$comm" ]] && return 0
    done
    return 1
}

# Main logic. Each column gets its own "-o ...=" so that no header is printed
# and "etimes=,comm=" cannot be misread as a custom header for etimes.
procs=$(ps -u "$USERNAME" -o pid= -o etimes= -o comm= | \
       awk -v threshold=$((THRESHOLD_MINUTES*60)) \
       '$2>=threshold {print $1,$3}')

while read -r pid comm; do
    [ -z "$pid" ] && continue   # no process matched
    if ! is_whitelisted "$comm"; then
        if kill "$pid" 2>/dev/null; then
            echo "$(date): Killed PID $pid ($comm) for user $USERNAME" >> "$LOG_FILE"
        else
            echo "$(date): Failed to kill PID $pid ($comm)" >> "$LOG_FILE"
        fi
    fi
done <<< "$procs"

To run this every 5 minutes via cron:


# Edit crontab as root
sudo crontab -e

# Add this line (adjust path to script)
*/5 * * * * /path/to/process_killer.sh >/dev/null 2>&1

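For systems where ps output format varies, we can read /proc directly. This version uses the modification time of each /proc/PID directory as an approximation of the process start time: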


#!/bin/bash
USERNAME="target_user"
THRESHOLD=$((5*60)) # 5 minutes in seconds
TARGET_UID=$(id -u "$USERNAME") || exit 1   # resolve the user once, not per process
now=$(date +%s)

for pid in /proc/[0-9]*; do
    if [ -f "$pid/status" ]; then
        # The first number on the "Uid:" line is the real UID of the process
        uid=$(awk '/^Uid:/ {print $2}' "$pid/status" 2>/dev/null)

        if [ "$uid" = "$TARGET_UID" ]; then
            # mtime of /proc/PID approximates the start time; skip if the process just exited
            start_time=$(stat -c %Y "$pid" 2>/dev/null) || continue
            elapsed=$((now - start_time))

            if [ "$elapsed" -ge "$THRESHOLD" ]; then
                pid_num=$(basename "$pid")
                kill "$pid_num" 2>/dev/null && \
                echo "Killed $pid_num running for $((elapsed/60)) minutes"
            fi
        fi
    fi
done
  • Always test scripts in a non-production environment first
  • Consider adding email notifications for killed processes
  • For critical systems, implement gradual termination (SIGTERM first, then SIGKILL)
  • Monitor the log file for unexpected terminations
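
The gradual termination mentioned above could look roughly like the following sketch. The function name terminate_gracefully and the 10-second default grace period are assumptions chosen for illustration; they do not come from the scripts above.

```shell
#!/bin/bash

# Hypothetical helper: send SIGTERM, wait up to $2 seconds, then send SIGKILL.
terminate_gracefully() {
    local pid=$1
    local grace=${2:-10}   # assumed default grace period in seconds
    local waited=0

    # Ask the process to exit; fail if it is already gone or not ours to signal
    kill -TERM "$pid" 2>/dev/null || return 1

    # kill -0 sends no signal; it only checks whether the PID still exists
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$grace" ]; then
            kill -KILL "$pid" 2>/dev/null
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In the scripts above, a call such as terminate_gracefully "$pid" 15 would take the place of the plain kill. Each unresponsive process then delays the run by up to the grace period, so the value should stay well below the cron interval.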

When managing Linux servers, we often need to automatically terminate stale processes that exceed specific runtime thresholds. For CentOS/RHEL systems, this becomes particularly important for:

  • Preventing resource hogging by runaway processes
  • Automating cleanup in cron jobs
  • Maintaining system stability for specific users

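Here's a script that works even when ps has no etimes column. It reads the etime column, which has the form [[DD-]HH:]MM:SS, and converts it to seconds in bash before comparing it with the threshold: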

#!/bin/bash

TARGET_USER="yourusername"
THRESHOLD_MINUTES=5

# Convert threshold to seconds
THRESHOLD=$((THRESHOLD_MINUTES * 60))

ps -u "$TARGET_USER" -o pid= -o etime= -o comm= | while read -r PID ETIME COMM
do
    # Convert elapsed time to seconds. The result goes into ELAPSED because
    # SECONDS is a special bash variable that keeps counting after assignment.
    # The 10# prefix forces base 10 so that fields like "08" are not read as octal.
    if [[ $ETIME =~ ^([0-9]+)-([0-9][0-9]):([0-9][0-9]):([0-9][0-9])$ ]]; then
        # Days+HH:MM:SS format
        ELAPSED=$(( 10#${BASH_REMATCH[1]}*86400 + 10#${BASH_REMATCH[2]}*3600 + 10#${BASH_REMATCH[3]}*60 + 10#${BASH_REMATCH[4]} ))
    elif [[ $ETIME =~ ^([0-9][0-9]):([0-9][0-9]):([0-9][0-9])$ ]]; then
        # HH:MM:SS format
        ELAPSED=$(( 10#${BASH_REMATCH[1]}*3600 + 10#${BASH_REMATCH[2]}*60 + 10#${BASH_REMATCH[3]} ))
    elif [[ $ETIME =~ ^([0-9][0-9]):([0-9][0-9])$ ]]; then
        # MM:SS format
        ELAPSED=$(( 10#${BASH_REMATCH[1]}*60 + 10#${BASH_REMATCH[2]} ))
    else
        # Unrecognized format: skip the process rather than guess
        continue
    fi

    if [ "$ELAPSED" -ge "$THRESHOLD" ]; then
        echo "Killing process $PID ($COMM) running for $ETIME"
        kill -9 "$PID"
    fi
done

To run this every 5 minutes via cron:

*/5 * * * * /path/to/your/script.sh >> /var/log/process_cleanup.log 2>&1

For production systems, consider this enhanced version with proper logging:

#!/bin/bash

LOG_FILE="/var/log/long_running_kill.log"
MAX_LOG_SIZE=1048576 # 1MB

# Rotate log if needed
if [ -f "$LOG_FILE" ]; then
    LOG_SIZE=$(stat -c%s "$LOG_FILE")
    if [ $LOG_SIZE -gt $MAX_LOG_SIZE ]; then
        mv "$LOG_FILE" "${LOG_FILE}.old"
    fi
fi

{
    echo "=== Process cleanup started at $(date) ==="
    
    # Main processing logic here (same as above)
    # ...
    
    echo "=== Cleanup completed ==="
} >> "$LOG_FILE"
  • Always test with echo before actual kill
  • Consider whitelisting critical processes
  • Monitor /var/log/messages for potential issues
  • For system users, check dependencies with pstree -p
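
One way to follow the first tip is a dry-run switch, sketched below. The DRY_RUN variable and the maybe_kill wrapper are hypothetical names introduced here for illustration; the sketch defaults to reporting only, so nothing is killed until DRY_RUN is set to 0.

```shell
#!/bin/bash

DRY_RUN=${DRY_RUN:-1}   # hypothetical switch: 1 = only report, 0 = really kill

# Hypothetical wrapper to call wherever the scripts above call kill directly
maybe_kill() {
    local pid=$1 comm=$2
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "DRY RUN: would kill PID $pid ($comm)"
    else
        kill "$pid" && echo "Killed PID $pid ($comm)"
    fi
}
```

Running the script a few times with the default setting and reviewing the log shows which processes would be affected. Invoking it as DRY_RUN=0 /path/to/your/script.sh then enables real kills.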

For systems where parsing ps output is problematic, the process start time can be read from /proc/PID/stat instead:

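#!/bin/bash

TARGET_USER="yourusername"
CLK_TCK=$(getconf CLK_TCK)           # clock ticks per second, usually 100
read -r uptime _ < /proc/uptime      # seconds since boot, e.g. "12345.67"

for pid in $(find /proc -maxdepth 1 -type d -name '[0-9]*' -user "$TARGET_USER" 2>/dev/null); do
    # Field 22 of stat is the start time in clock ticks since boot. Strip up to the
    # last ") " first (the command name may contain spaces); it is then the 20th field.
    stat_line=$(cat "$pid/stat" 2>/dev/null) || continue
    read -r -a fields <<< "${stat_line##*) }"
    start_ticks=${fields[19]}
    run_seconds=$(( ${uptime%.*} - start_ticks / CLK_TCK ))

    if [ "$run_seconds" -gt 300 ]; then
        kill -9 "${pid##*/}"
    fi
done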