When debugging production systems, we often need to monitor error rates in real-time. The classic approach of tail -f | grep error
shows us the errors, but doesn't give us quantitative metrics about the error frequency. Here's how to solve this monitoring gap.
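For reference, the classic approach itself looks like this (adding GNU grep's --line-buffered so matches appear promptly when piped):

tail -f /var/log/my_process/*.log | grep --line-buffered error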
The simplest approach wraps a three-tool pipeline (tail, grep, wc) in watch:
watch -n 1 "tail -n 100 /var/log/my_process/*.log | grep error | wc -l"
This gives you a line-count update every second. The -n 100 restricts the count to the last 100 lines of each file, so the figure reflects recent activity rather than the entire log history.
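Since grep can count matches itself, the grep | wc -l pair collapses to grep -c, which later examples also use:

watch -n 1 "tail -n 100 /var/log/my_process/*.log | grep -c error"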
For long-running monitoring, consider this awk-based solution, which maintains a running total instead of recomputing it on every refresh:
tail -f /var/log/my_process/*.log | awk '/error/{count++; print count}'
To see error counts per time interval (e.g., per minute):
tail -f /var/log/my_process/*.log | \
awk '/error/{
    curr_min = strftime("%H:%M")        # strftime() requires GNU awk
    if (curr_min != last_min) {
        if (last_min != "")             # skip the spurious first report
            print last_min ": " count   # report the minute that just ended
        count = 0
        last_min = curr_min
    }
    count++
}'

Note that a minute with no matching lines prints nothing, since awk only runs when an error line arrives.
When dealing with multiple log files, we also need to handle log rotation:
find /var/log/my_process/ -name "*.log" -type f -print0 | \
    xargs -0 tail -F | \
    grep --line-buffered error | \
    pv --line-mode --interval 1 --rate > /dev/null
The -F flag handles rotated logs, and pv reports the line rate on stderr; stdout goes to /dev/null because we only care about the rate display, not the matching lines themselves.
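If pv isn't installed, GNU awk can approximate the same per-second readout. This is a sketch reusing the per-minute pattern above, just at one-second resolution (systime() and strftime() are gawk extensions):

tail -F /var/log/my_process/*.log | \
grep --line-buffered error | \
gawk '{
    t = systime()                       # current epoch second
    if (t != last && last)
        print strftime("%H:%M:%S", last) " - " n " errors/s"
    if (t != last) { n = 0; last = t }
    n++
}'

Like the per-minute version, it only reports when a fresh error arrives.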
For serious monitoring, consider this robust implementation:
# Persistent error counter with timestamps
count=0
last_print=$(date +%s)
tail -F /var/log/my_process/*.log | \
    stdbuf -oL grep error | \
    while read -r line; do
        ((count++))
        now=$(date +%s)
        if (( now - last_print >= 1 )); then
            echo "$(date '+%Y-%m-%d %H:%M:%S') - $count errors"
            count=0
            last_print=$now
        fi
    done
The stdbuf -oL prefix forces grep to flush each line as it is matched, avoiding stalls from pipe buffering, and the timestamps make the output easy to analyze later.
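One caveat: the loop only wakes when an error line arrives, so quiet periods produce no output at all. A minimal variant using bash's read timeout (the one-second interval is an assumption) emits a count every second even when it is zero:

count=0
last_print=$(date +%s)
tail -F /var/log/my_process/*.log | \
    stdbuf -oL grep error | \
    while true; do
        if read -r -t 1 line; then      # wait at most 1s for the next error
            ((count++))
        elif (( $? <= 128 )); then      # EOF rather than timeout: tail exited
            break
        fi
        now=$(date +%s)
        if (( now - last_print >= 1 )); then
            echo "$(date '+%Y-%m-%d %H:%M:%S') - $count errors"
            count=0
            last_print=$now
        fi
    done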
If you prefer polling to a streaming pipeline, a simple loop can stamp each sample with the time:
while true; do
    echo -n "$(date '+%H:%M:%S') - "
    tail -n 100 /var/log/my_process/*.log | grep error | wc -l
    sleep 1
done
To measure the rate of new errors rather than just totals, compare cumulative counts between iterations (subtracting the windowed tail -n 100 counts would be meaningless, since the window slides):

prev_count=$(cat /var/log/my_process/*.log | grep -c error)
while true; do
    current_count=$(cat /var/log/my_process/*.log | grep -c error)
    new_errors=$((current_count - prev_count))
    (( new_errors < 0 )) && new_errors=0    # counts drop when logs rotate
    echo "$(date '+%H:%M:%S') - New errors: $new_errors - Total: $current_count"
    prev_count=$current_count
    sleep 1
done

Rescanning whole files every second is fine for modest logs; for large ones, prefer the tail -F pipelines above.
For more complex filtering before counting:
watch -n 1 "tail -n 200 /var/log/my_process/*.log | \
awk '/error/ && !/known_warning/ {print}' | wc -l"
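The same idea extends to tallying several patterns in one pass (a sketch; the pattern names are illustrative):

watch -n 1 "tail -n 200 /var/log/my_process/*.log | \
    awk '/error/{e++} /warn/{w++} END{print \"errors:\", e+0, \"warnings:\", w+0}'"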
To both monitor and save results:

while true; do
    echo "$(date -Is) - $(tail -n 50 /var/log/my_process/*.log | \
        grep -c error)" | tee -a error_count.log
    sleep 1
done
When dealing with many log files, restrict the search to recently modified ones and guard against odd filenames with -print0/xargs -0 (-mtime -1 keeps files changed within the last 24 hours):

watch -n 1 "find /var/log/my_process/ -name '*.log' -mtime -1 -print0 | \
    xargs -0 tail -q -n 50 | grep -c error"

The -q flag suppresses tail's ==> file <== headers, which would otherwise inflate the count whenever a filename itself contains the word error.
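Putting the pieces together, the sampling, logging, and multi-file handling can live in one small script (a sketch; the name watch_errors.sh and its defaults are illustrative):

#!/usr/bin/env bash
# watch_errors.sh - timestamped error-count samples across recent logs,
# printed to the terminal and appended to error_count.log.
log_dir=${1:-/var/log/my_process}
pattern=${2:-error}
interval=${3:-1}

while true; do
    count=$(find "$log_dir" -name '*.log' -mtime -1 -print0 | \
            xargs -0 -r tail -q -n 50 | \
            grep -c "$pattern")         # xargs -r: skip tail if no files (GNU)
    echo "$(date -Is) - $count" | tee -a error_count.log
    sleep "$interval"
done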