How to Automatically Kill Long-Running Processes by User in Linux Using Bash Script


When managing Linux servers, we often need to terminate processes that a specific user has left running for too long. This prevents resource hogging and helps keep the system stable. Here's a solution for CentOS/RHEL systems.

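The simplest approach asks ps for each process's elapsed time in seconds (the etimes field, which procps-ng on CentOS/RHEL 7 and later provides), filters on it with awk, and kills the matches: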


#!/bin/bash
USERNAME="target_user"
THRESHOLD_MINUTES=5

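# Get PIDs of processes running longer than the threshold.
# Separate -o options with empty headers suppress the header line,
# so every row of output is a process.
pids=$(ps -u "$USERNAME" -o pid= -o etimes= | \
       awk -v threshold=$((THRESHOLD_MINUTES*60)) \
       '$2>=threshold {print $1}')

# Kill the processes if any were found
if [ -n "$pids" ]; then
    # $pids is unquoted on purpose so that each PID becomes its own argument
    kill $pids
    echo "$(date): Killed following PIDs for user $USERNAME: ${pids//$'\n'/ }" >> /var/log/long_process_killer.log
fi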

For production systems, we should add more safeguards:


#!/bin/bash
USERNAME="target_user"
THRESHOLD_MINUTES=5
PROCESS_WHITELIST=("sshd" "bash") # Processes to never kill
LOG_FILE="/var/log/process_killer.log"

# Function to check if a process is in the whitelist
is_whitelisted() {
    local pid=$1 comm
    comm=$(ps -p "$pid" -o comm= 2>/dev/null)
    # [*] joins the array into one space-separated string for the match
    [[ " ${PROCESS_WHITELIST[*]} " =~ " ${comm} " ]]
}

# Main logic: separate -o options with empty headers suppress the header line
pids=$(ps -u "$USERNAME" -o pid= -o etimes= -o comm= | \
       awk -v threshold=$((THRESHOLD_MINUTES*60)) \
       '$2>=threshold {print $1,$3}')

while read -r pid comm; do
    # The here-string yields one empty line when nothing matched
    [ -n "$pid" ] || continue
    if ! is_whitelisted "$pid"; then
        if kill "$pid" 2>/dev/null; then
            echo "$(date): Killed PID $pid ($comm) for user $USERNAME" >> "$LOG_FILE"
        else
            echo "$(date): Failed to kill PID $pid ($comm)" >> "$LOG_FILE"
        fi
    fi
done <<< "$pids"

To run this every 5 minutes via cron:


# Edit crontab as root
sudo crontab -e

# Add this line (adjust path to script)
*/5 * * * * /path/to/process_killer.sh >/dev/null 2>&1

For systems where ps output format varies, we can read /proc directly:


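#!/bin/bash
USERNAME="target_user"
THRESHOLD=$((5*60)) # 5 minutes in seconds

target_uid=$(id -u "$USERNAME") || exit 1
ticks=$(getconf CLK_TCK)            # clock ticks per second, usually 100
read -r uptime _ < /proc/uptime     # seconds since boot, e.g. 12345.67
uptime=${uptime%.*}

for dir in /proc/[0-9]*; do
    # A process may exit while we iterate, so skip entries we cannot read
    uid=$(awk '/^Uid:/ {print $2}' "$dir/status" 2>/dev/null) || continue
    [ "$uid" = "$target_uid" ] || continue

    # The mtime of /proc/PID is not a reliable start time. Field 22 of
    # /proc/PID/stat holds the start time in clock ticks since boot.
    # Strip through the last ") " first: the command name may contain spaces.
    stat_line=$(cat "$dir/stat" 2>/dev/null) || continue
    read -r -a fields <<< "${stat_line##*) }"
    start_ticks=${fields[19]}       # field 22 overall; the array starts at field 3
    elapsed=$((uptime - start_ticks / ticks))

    if [ "$elapsed" -ge "$THRESHOLD" ]; then
        pid_num=${dir##*/}
        kill "$pid_num" 2>/dev/null && \
        echo "Killed $pid_num running for $((elapsed/60)) minutes"
    fi
done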

  • Always test scripts in a non-production environment first
  • Consider adding email notifications for killed processes
  • For critical systems, implement gradual termination (SIGTERM first, then SIGKILL)
  • Monitor the log file for unexpected terminations
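
The gradual termination mentioned above could be sketched roughly as follows. The function name terminate_gracefully and the 10-second default grace period are assumptions made for illustration, not part of the original scripts.

```bash
#!/bin/bash
# Hypothetical helper: send SIGTERM, wait up to a grace period, then SIGKILL.
terminate_gracefully() {
    local pid=$1 grace=${2:-10} waited=0

    # Fails if the process is already gone or we lack permission
    kill -TERM "$pid" 2>/dev/null || return 1

    # kill -0 sends no signal; it only checks that the PID still exists
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$grace" ]; then
            kill -KILL "$pid" 2>/dev/null
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In the scripts above, a call such as terminate_gracefully "$pid" 10 would take the place of the plain kill. Note that kill -0 also succeeds for a zombie that its parent has not yet reaped, so in that case the helper waits out the full grace period before returning.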

When managing Linux servers, we often need to automatically terminate stale processes that exceed specific runtime thresholds. For CentOS/RHEL systems, this becomes particularly important for:

  • Preventing resource hogging by runaway processes
  • Automating cleanup in cron jobs
  • Maintaining system stability for specific users

Here's a script that combines ps and kill, converting the etime field ([[DD-]HH:]MM:SS) to seconds in bash. Unlike etimes, the etime field is also available in older procps releases:

#!/bin/bash

TARGET_USER="yourusername"
THRESHOLD_MINUTES=5

# Convert the threshold to seconds once, outside the loop
THRESHOLD=$((THRESHOLD_MINUTES * 60))

ps -u "$TARGET_USER" -o pid= -o etime= -o comm= | while read -r PID ETIME COMM
do
    # Convert elapsed time to seconds.
    # 10# forces base 10, because bash reads a leading zero (e.g. 08) as octal.
    # The result goes in ELAPSED because SECONDS is a special bash variable.
    if [[ $ETIME =~ ^([0-9]+)-([0-9]+):([0-9]+):([0-9]+)$ ]]; then
        # DD-HH:MM:SS format
        ELAPSED=$(( 10#${BASH_REMATCH[1]}*86400 + 10#${BASH_REMATCH[2]}*3600 + 10#${BASH_REMATCH[3]}*60 + 10#${BASH_REMATCH[4]} ))
    elif [[ $ETIME =~ ^([0-9]+):([0-9]+):([0-9]+)$ ]]; then
        # HH:MM:SS format
        ELAPSED=$(( 10#${BASH_REMATCH[1]}*3600 + 10#${BASH_REMATCH[2]}*60 + 10#${BASH_REMATCH[3]} ))
    elif [[ $ETIME =~ ^([0-9]+):([0-9]+)$ ]]; then
        # MM:SS format
        ELAPSED=$(( 10#${BASH_REMATCH[1]}*60 + 10#${BASH_REMATCH[2]} ))
    else
        # Unrecognized format: skip rather than guess
        continue
    fi

    if [ "$ELAPSED" -ge "$THRESHOLD" ]; then
        echo "Killing process $PID ($COMM) running for $ETIME"
        kill -9 "$PID"
    fi
done
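
The elapsed-time conversion is the part most likely to go wrong, so it may help to isolate it and check a few values by hand. The sketch below is one possible way to do that. The function name etime_to_seconds is hypothetical, and it splits the string with parameter expansion and read rather than with regular expressions.

```bash
#!/bin/bash
# Hypothetical helper: convert ps etime output ([[DD-]HH:]MM:SS) to seconds.
etime_to_seconds() {
    local etime=$1 days=0 hours=0 minutes secs

    # Split off the optional "DD-" prefix
    if [[ $etime == *-* ]]; then
        days=${etime%%-*}
        etime=${etime#*-}
    fi

    # What remains is either HH:MM:SS or MM:SS
    IFS=: read -r hours minutes secs <<< "$etime"
    if [ -z "$secs" ]; then
        # Only two fields were present, so shift them down one position
        secs=$minutes minutes=$hours hours=0
    fi

    # 10# forces base 10 so that values such as 08 are not parsed as octal
    echo $(( 10#$days*86400 + 10#$hours*3600 + 10#$minutes*60 + 10#$secs ))
}

etime_to_seconds "05:08"        # prints 308
etime_to_seconds "01:00:09"     # prints 3609
etime_to_seconds "2-03:04:05"   # prints 183845
```

The second sample is the interesting one: without the 10# prefix, the 09 would abort the arithmetic with a "value too great for base" error.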

To run this every 5 minutes via cron:

*/5 * * * * /path/to/your/script.sh >> /var/log/process_cleanup.log 2>&1

For production systems, consider this enhanced version with proper logging:

#!/bin/bash

LOG_FILE="/var/log/long_running_kill.log"
MAX_LOG_SIZE=1048576 # 1MB

# Rotate log if needed
if [ -f "$LOG_FILE" ]; then
    LOG_SIZE=$(stat -c%s "$LOG_FILE")
    if [ $LOG_SIZE -gt $MAX_LOG_SIZE ]; then
        mv "$LOG_FILE" "${LOG_FILE}.old"
    fi
fi

{
    echo "=== Process cleanup started at $(date) ==="
    
    # Main processing logic here (same as above)
    # ...
    
    echo "=== Cleanup completed ==="
} >> "$LOG_FILE"

  • Always test with echo before actual kill
  • Consider whitelisting critical processes
  • Monitor /var/log/messages for potential issues
  • For system users, check dependencies with pstree -p
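
One way to follow the "test with echo first" advice is a dry-run switch, sketched below. Both the DRY_RUN variable and the safe_kill wrapper are hypothetical names introduced here for illustration. The sketch defaults to dry-run, so nothing is killed unless that is requested explicitly.

```bash
#!/bin/bash
# DRY_RUN is a hypothetical switch: 1 (the default here) only reports, 0 really kills.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper to call wherever the script would call kill directly
safe_kill() {
    local pid=$1
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "DRY RUN: would kill $pid"
    else
        kill "$pid"
    fi
}
```

With kill replaced by safe_kill in the cleanup script, a plain run only lists the candidates. Running it as DRY_RUN=0 /path/to/your/script.sh performs the real termination once the list looks right.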

For systems where ps output parsing is problematic:

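#!/bin/bash

TARGET_USER="yourusername"
THRESHOLD=300                       # 5 minutes in seconds

ticks=$(getconf CLK_TCK)            # clock ticks per second, usually 100
read -r uptime _ < /proc/uptime     # seconds since boot
uptime=${uptime%.*}

for dir in $(find /proc -maxdepth 1 -type d -name '[0-9]*' -user "$TARGET_USER"); do
    # Field 22 of stat is the start time in clock ticks since boot, not an epoch
    # timestamp. Strip through the last ") ": the command name may contain spaces.
    stat_line=$(cat "$dir/stat" 2>/dev/null) || continue
    read -r -a fields <<< "${stat_line##*) }"
    run_seconds=$(( uptime - fields[19] / ticks ))

    if [ "$run_seconds" -gt "$THRESHOLD" ]; then
        kill -9 "${dir##*/}"
    fi
done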