When managing Linux servers, we often need to terminate processes owned by a specific user that have been running for too long. This prevents resource hogging and helps maintain system stability. Here's a robust solution for CentOS/RHEL systems.
The most reliable approach combines ps with process-time filtering and targeted killing:
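#!/bin/bash
USERNAME="target_user"
THRESHOLD_MINUTES=5
# Get PIDs of processes running longer than the threshold.
# etimes (elapsed seconds) is a procps-ng field; the older procps on CentOS/RHEL 6 may lack it.
# Separate -o options with "=" suppress the header line, so no header skipping is needed.
pids=$(ps -u "$USERNAME" -o pid= -o etimes= | \
    awk -v threshold=$((THRESHOLD_MINUTES*60)) \
    '$2>=threshold {print $1}')
# Kill the processes if any were found
if [ -n "$pids" ]; then
    kill $pids   # intentionally unquoted: one argument per PID
    echo "$(date): Killed following PIDs for user $USERNAME: $pids" >> /var/log/long_process_killer.log
fi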
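Before pointing the script at real processes, the awk filter can be checked against canned input. The sketch below wraps the same comparison in a hypothetical function, filter_long_running, and feeds it made-up PID and elapsed-seconds pairs with a 300-second threshold. The function name and the sample values are assumptions for illustration only.

```shell
#!/bin/bash
# filter_long_running THRESHOLD_SECONDS
# Reads "pid elapsed_seconds" lines on stdin and prints the PIDs
# whose elapsed time is at or above the threshold.
filter_long_running() {
    awk -v threshold="$1" '$2 >= threshold {print $1}'
}

# Made-up sample data: only PIDs 102 and 103 have run for 300 seconds or more
printf '%s\n' "101 30" "102 600" "103 300" | filter_long_running 300
```

With the sample data above, only 102 and 103 are printed, which confirms that the comparison is inclusive of the threshold itself.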
For production systems, we should add more safeguards:
#!/bin/bash
USERNAME="target_user"
THRESHOLD_MINUTES=5
PROCESS_WHITELIST=("sshd" "bash") # Processes to never kill
LOG_FILE="/var/log/process_killer.log"
# Function to check if a command name is in the whitelist
is_whitelisted() {
    local comm=$1 name
    for name in "${PROCESS_WHITELIST[@]}"; do
        [[ "$name" == "$comm" ]] && return 0
    done
    return 1
}
# Main logic. Each field gets its own -o option: in "pid,etimes=,comm=" everything
# after the first "=" would be taken as header text and comm would never be printed.
pids=$(ps -u "$USERNAME" -o pid= -o etimes= -o comm= | \
    awk -v threshold=$((THRESHOLD_MINUTES*60)) \
    '$2>=threshold {print $1,$3}')
while read -r pid comm; do
    [ -n "$pid" ] || continue   # skip the empty line produced when nothing matched
    if ! is_whitelisted "$comm"; then
        if kill "$pid" 2>/dev/null; then
            echo "$(date): Killed PID $pid ($comm) for user $USERNAME" >> "$LOG_FILE"
        else
            echo "$(date): Failed to kill PID $pid ($comm)" >> "$LOG_FILE"
        fi
    fi
done <<< "$pids"
To run this every 5 minutes via cron:
# Edit crontab as root
sudo crontab -e
# Add this line (adjust path to script)
*/5 * * * * /path/to/process_killer.sh >/dev/null 2>&1
For systems where ps output format varies, we can read /proc directly:
#!/bin/bash
USERNAME="target_user"
THRESHOLD=$((5*60)) # 5 minutes in seconds
target_uid=$(id -u "$USERNAME") || exit 1   # resolve the user to a UID once
now=$(date +%s)
for pid in /proc/[0-9]*; do
    [ -f "$pid/status" ] || continue
    # The first number on the Uid: line is the real UID of the process
    uid=$(awk '/^Uid:/ {print $2}' "$pid/status" 2>/dev/null)
    [ "$uid" = "$target_uid" ] || continue
    # The mtime of /proc/<pid> only approximates the process start time
    start_time=$(stat -c %Y "$pid" 2>/dev/null) || continue
    elapsed=$((now - start_time))
    if [ "$elapsed" -ge "$THRESHOLD" ]; then
        pid_num=${pid##*/}
        kill "$pid_num" 2>/dev/null && \
            echo "Killed $pid_num running for $((elapsed/60)) minutes"
    fi
done
- Always test scripts in a non-production environment first
- Consider adding email notifications for killed processes
- For critical systems, implement gradual termination (SIGTERM first, then SIGKILL)
- Monitor the log file for unexpected terminations
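The gradual termination mentioned in the list above can be sketched as a small helper. The function name terminate_gracefully and the 10-second default grace period are assumptions made for illustration; they are not part of the original scripts.

```shell
#!/bin/bash
# terminate_gracefully PID [GRACE_SECONDS]   (hypothetical helper)
# Sends SIGTERM, polls once per second, and falls back to SIGKILL
# if the process is still present after the grace period.
terminate_gracefully() {
    local pid=$1 grace=${2:-10} waited=0
    # SIGTERM first; a failure means the PID is gone or we lack permission
    kill -TERM "$pid" 2>/dev/null || return 1
    # kill -0 sends no signal, it only checks that the PID can still be signalled
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$grace" ]; then
            kill -KILL "$pid" 2>/dev/null
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example (commented out so that sourcing this file kills nothing):
# terminate_gracefully 12345 15
```

In the loops above, a call such as terminate_gracefully "$pid" 15 could stand in for the plain kill, at the cost of the script waiting up to the grace period for each stubborn process.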
When managing Linux servers, we often need to automatically terminate stale processes that exceed specific runtime thresholds. For CentOS/RHEL systems, this becomes particularly important for:
- Preventing resource hogging by runaway processes
- Automating cleanup in cron jobs
- Maintaining system stability for specific users
Here's a robust script that combines the ps, awk, and kill commands with precise time calculations:
#!/bin/bash
TARGET_USER="yourusername"
THRESHOLD_MINUTES=5
# Convert threshold to seconds once, outside the loop
THRESHOLD=$((THRESHOLD_MINUTES * 60))
ps -u "$TARGET_USER" -o pid= -o etime= -o comm= | while read -r PID ETIME COMM
do
    # Convert elapsed time ([[DD-]HH:]MM:SS) to seconds.
    # The 10# prefix forces base 10 so that values such as 08 are not read as octal.
    # ELAPSED is used instead of SECONDS, which is a special bash variable.
    if [[ $ETIME =~ ^([0-9]+)-([0-9]{2}):([0-9]{2}):([0-9]{2})$ ]]; then
        # DD-HH:MM:SS format
        ELAPSED=$(( 10#${BASH_REMATCH[1]}*86400 + 10#${BASH_REMATCH[2]}*3600 + 10#${BASH_REMATCH[3]}*60 + 10#${BASH_REMATCH[4]} ))
    elif [[ $ETIME =~ ^([0-9]{2}):([0-9]{2}):([0-9]{2})$ ]]; then
        # HH:MM:SS format
        ELAPSED=$(( 10#${BASH_REMATCH[1]}*3600 + 10#${BASH_REMATCH[2]}*60 + 10#${BASH_REMATCH[3]} ))
    elif [[ $ETIME =~ ^([0-9]{2}):([0-9]{2})$ ]]; then
        # MM:SS format
        ELAPSED=$(( 10#${BASH_REMATCH[1]}*60 + 10#${BASH_REMATCH[2]} ))
    else
        # Unrecognized format: skip rather than guess
        continue
    fi
    if [ "$ELAPSED" -ge "$THRESHOLD" ]; then
        echo "Killing process $PID ($COMM) running for $ETIME"
        kill -9 "$PID"
    fi
done
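To check the elapsed-time conversion in isolation, the parsing can be pulled into a function and run against known values. The sketch below swaps the bash regular expressions for a single awk call, which reads fields such as 08 as decimal numbers. The function name etime_to_seconds is hypothetical, and the input is assumed to be in the [[DD-]HH:]MM:SS form that ps prints for etime.

```shell
#!/bin/bash
# etime_to_seconds ETIME   (hypothetical helper)
# Splits the value on "-" and ":" and weights the fields from the right:
# seconds, minutes, then optional hours and days.
etime_to_seconds() {
    echo "$1" | awk -F'[-:]' '{
        n = NF
        s = $n + 60 * $(n-1)
        if (n >= 3) s += 3600 * $(n-2)
        if (n >= 4) s += 86400 * $(n-3)
        print s
    }'
}

etime_to_seconds "08:09"        # 8*60 + 9 = 489
etime_to_seconds "02:03:04"     # 2*3600 + 3*60 + 4 = 7384
etime_to_seconds "1-02:03:04"   # 86400 + 7384 = 93784
```

The result can then be compared against the threshold in seconds, exactly as the loop above does.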
To run this every 5 minutes via cron:
*/5 * * * * /path/to/your/script.sh >> /var/log/process_cleanup.log 2>&1
For production systems, consider this enhanced version with proper logging:
#!/bin/bash
LOG_FILE="/var/log/long_running_kill.log"
MAX_LOG_SIZE=1048576 # 1MB
# Rotate log if needed
if [ -f "$LOG_FILE" ]; then
    LOG_SIZE=$(stat -c%s "$LOG_FILE")
    if [ "$LOG_SIZE" -gt "$MAX_LOG_SIZE" ]; then
        mv "$LOG_FILE" "${LOG_FILE}.old"
    fi
fi
{
    echo "=== Process cleanup started at $(date) ==="
    # Main processing logic here (same as above)
    # ...
    echo "=== Cleanup completed ==="
} >> "$LOG_FILE"
- Always test with echo before the actual kill
- Consider whitelisting critical processes
- Monitor /var/log/messages for potential issues
- For system users, check dependencies with pstree -p
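One way to follow the first tip is a dry-run switch, so the same script can be rehearsed before it is allowed to send signals. This is a minimal sketch: the DRY_RUN variable and the safe_kill wrapper are hypothetical names, not part of the scripts above.

```shell
#!/bin/bash
# Hypothetical switch: 1 only prints what would happen, 0 really sends the signal.
# Change it to 0 only after reviewing the dry-run output.
DRY_RUN=1

# safe_kill ARGS...   (hypothetical wrapper; takes the same arguments as kill)
safe_kill() {
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "DRY RUN: would run: kill $*"
    else
        kill "$@"
    fi
}

safe_kill -9 12345   # prints: DRY RUN: would run: kill -9 12345
```

Replacing each kill call with safe_kill keeps the rest of the logic unchanged between the rehearsal and the real run.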
For systems where ps output parsing is problematic:
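#!/bin/bash
TARGET_USER="yourusername"
CLK_TCK=$(getconf CLK_TCK)             # clock ticks per second, usually 100
UPTIME=$(cut -d. -f1 /proc/uptime)     # whole seconds since boot
for pid in $(find /proc -maxdepth 1 -type d -name '[0-9]*' -user "$TARGET_USER"); do
    stat_line=$(cat "$pid/stat" 2>/dev/null) || continue
    # Field 22 is the start time in clock ticks since boot (not an epoch timestamp).
    # Strip "pid (comm) " first because comm may contain spaces; it is then field 20.
    start_ticks=$(awk '{print $20}' <<< "${stat_line##*) }")
    run_seconds=$(( UPTIME - start_ticks / CLK_TCK ))
    if [ "$run_seconds" -gt 300 ]; then
        kill -9 "${pid##*/}"
    fi
done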
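Field 22 of /proc/<pid>/stat holds the process start time in clock ticks since boot, so the running time is the system uptime minus that value divided by the ticks-per-second rate. The sketch below isolates that arithmetic in a hypothetical function, elapsed_from_stat, so it can be checked with a made-up stat line. The sample line, the uptime of 1000 seconds, and the rate of 100 ticks per second are invented for illustration.

```shell
#!/bin/bash
# elapsed_from_stat STAT_LINE UPTIME_SECONDS CLK_TCK   (hypothetical helper)
# Prints how many whole seconds the process described by STAT_LINE has been running.
elapsed_from_stat() {
    # Drop "pid (comm) " because comm may contain spaces;
    # field 22 of the full line is then field 20 of the remainder.
    local rest=${1##*) }
    local start_ticks
    start_ticks=$(printf '%s\n' "$rest" | awk '{print $20}')
    echo $(( $2 - start_ticks / $3 ))
}

# Made-up stat line: started 50000 ticks after boot, comm contains a space
sample='1234 (my proc) S 1 1234 1234 0 -1 4194304 100 0 0 0 5 3 0 0 20 0 1 0 50000 1000000 200'
elapsed_from_stat "$sample" 1000 100   # 1000 - 50000/100 = 500
```

For a live process, the three arguments could come from /proc/<pid>/stat, the first field of /proc/uptime with the fraction removed, and getconf CLK_TCK.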