How to Monitor Apache Logs in Real-Time and Trigger Actions on New Entries


When working with Apache web server logs, a common requirement is to process entries as they appear without re-reading the entire file. Traditional approaches like cron jobs or periodic scanning waste resources and introduce delays. What we really want is to detect and process log entries the moment they're written.

While tail -f shows new entries in real-time, it doesn't provide a programmatic way to process them. We need a solution that:

  • Maintains position in the file between runs
  • Only processes new entries
  • Can trigger specific actions on matches
  • Handles log rotation gracefully

Here's a robust solution for Linux systems that uses inotify (via the third-party pyinotify package, installable with pip install pyinotify) to monitor file changes:


#!/usr/bin/env python3
import pyinotify
import re

class LogEventHandler(pyinotify.ProcessEvent):
    def __init__(self, pattern, callback):
        super().__init__()      # required so pyinotify can dispatch events
        self.pattern = re.compile(pattern)
        self.callback = callback
        self.last_pos = 0       # byte offset reached so far (the whole existing
                                # file is read on the first event)

    def process_default(self, event):
        if event.mask & pyinotify.IN_MODIFY:
            with open(event.pathname, 'r') as f:
                f.seek(self.last_pos)
                new_lines = f.readlines()
                self.last_pos = f.tell()

                for line in new_lines:
                    if self.pattern.search(line):
                        self.callback(line)

def process_matched_line(line):
    print(f"Matched line: {line.strip()}")
    # Add your custom logic here

wm = pyinotify.WatchManager()
handler = LogEventHandler(r'404', process_matched_line)
notifier = pyinotify.Notifier(wm, handler)
wdd = wm.add_watch('/var/log/apache2/access.log', pyinotify.IN_MODIFY)
notifier.loop()

For simpler cases, a plain bash polling loop that tracks the file offset with stat and dd also works:


#!/bin/bash
LOG_FILE="/var/log/apache2/access.log"
TEMP_FILE="/tmp/last_position.tmp"

# Get last position or start at beginning
[ -f "$TEMP_FILE" ] && LAST_POS=$(cat "$TEMP_FILE") || LAST_POS=0

while true; do
    CURRENT_SIZE=$(stat -c %s "$LOG_FILE")

    if [ "$CURRENT_SIZE" -lt "$LAST_POS" ]; then
        # Log file was rotated (it is now smaller than our saved offset)
        LAST_POS=0
    fi

    if [ "$CURRENT_SIZE" -gt "$LAST_POS" ]; then
        # Read only the bytes written since the last pass
        # (bs=1 keeps the offset arithmetic simple but is slow for large bursts)
        dd if="$LOG_FILE" bs=1 skip="$LAST_POS" count=$((CURRENT_SIZE - LAST_POS)) 2>/dev/null | while IFS= read -r line; do
            if [[ "$line" =~ "404" ]]; then
                echo "Found 404 error: $line"
                # Add your action here
            fi
        done
        LAST_POS="$CURRENT_SIZE"
        echo "$LAST_POS" > "$TEMP_FILE"
    fi

    sleep 1
done

Production systems require special handling for log rotation. The bash script above detects it by checking for a size reduction; the Python example, as written, only watches for modify events, so rotation has to be handled explicitly:


# In the Python solution, add the rotation events to the watch mask
# (IN_MODIFY | IN_DELETE_SELF | IN_MOVE_SELF), then in process_default:
if event.mask & (pyinotify.IN_DELETE_SELF | pyinotify.IN_MOVE_SELF):
    self.last_pos = 0
    # Re-watch the path once logrotate has created the new log file
    wm.add_watch(event.pathname, pyinotify.IN_MODIFY)

For high-traffic servers:

  • Batch process multiple lines at once
  • Use efficient pattern matching (pre-compiled regex)
  • Consider buffering matches before taking action
  • Offload intensive processing to separate threads (see the sketch below this list)
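
As one way to offload that work, here is a minimal sketch that hands matched lines to a background worker thread through a queue. It assumes the LogEventHandler callback interface from the Python example above; the names match_queue and enqueue_matched_line are illustrative.

import queue
import threading

# Matched lines are queued here and handled off the inotify thread,
# so slow actions do not delay event processing
match_queue = queue.Queue()

def worker():
    while True:
        line = match_queue.get()
        # Expensive work (alerts, database writes, ...) goes here
        print(f"Processing: {line.strip()}")
        match_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

# Pass this as the callback instead of process_matched_line
def enqueue_matched_line(line):
    match_queue.put(line)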

When monitoring Apache logs, repeatedly scanning entire files or even recent chunks creates unnecessary overhead. The ideal solution should:

  • Process entries exactly once
  • Respond in real-time
  • Minimize resource usage
  • Handle log rotation gracefully

The tail -f command (follow mode) is specifically designed for this use case. Here's a basic implementation:

#!/bin/bash
LOG_FILE="/var/log/apache2/access.log"

tail -n0 -F "$LOG_FILE" | while IFS= read -r LINE
do
  # Your processing logic here
  if [[ "$LINE" =~ "404" ]]; then
    echo "Found 404 error: $LINE" >> /var/log/my_monitor.log
    # Trigger additional actions
  fi
done

The magic happens with these options:

  • -n0: Start reading at the end of file (no existing lines)
  • -F: Follow by name (handles log rotation)
  • while IFS= read -r LINE: Processes each new line as it arrives (-r keeps backslashes intact)
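
To trigger an action rather than just echo, parse the line inside the loop. Here is a minimal sketch that pulls the client IP (the first field of the common/combined log format) out of every line containing a 404 and appends it to the monitor log; the output path is the same illustrative one used above.

#!/bin/bash
LOG_FILE="/var/log/apache2/access.log"

tail -n0 -F "$LOG_FILE" | while IFS= read -r LINE
do
  if [[ "$LINE" == *" 404 "* ]]; then
    IP=$(awk '{print $1}' <<< "$LINE")   # first field is the client IP
    echo "$(date -Iseconds) 404 from $IP" >> /var/log/my_monitor.log
    # e.g. notify an admin, update a rate limiter, or feed a blocklist here
  fi
done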

For more complex filtering, consider this regex example:

#!/bin/bash
LOG_FILE="/var/log/apache2/error.log"

PATTERN='PHP (Fatal|Parse) error'

tail -n0 -F "$LOG_FILE" | while IFS= read -r LINE
do
  # Keep the regex in a variable and leave it unquoted after =~ so the
  # alternation and capture group work (quoting it forces a literal match)
  if [[ "$LINE" =~ $PATTERN ]]; then
    send_alert_email "Critical PHP ${BASH_REMATCH[1]} error detected"  # your own alert function
    logger -t apache_monitor "PHP error: $LINE"
  fi
done

For busy servers, consider these optimizations:

#!/bin/bash
# Buffer lines and process them in batches (requires gawk for systime())
LOG_FILE="/var/log/apache2/access.log"
BUFFER_SIZE=100
TIMEOUT=5

tail -n0 -F "$LOG_FILE" | awk -v buffer="$BUFFER_SIZE" -v timeout="$TIMEOUT" '
BEGIN { last_flush = systime() }
{
  lines[++count] = $0
  # Flush when the buffer is full or the timeout has elapsed
  # (the timeout is only checked when a new line arrives)
  if (count >= buffer || systime() - last_flush > timeout) {
    process_buffer()
  }
}

function process_buffer(   i) {
  for (i = 1; i <= count; i++) {
    if (lines[i] ~ /POST \/admin/) {
      print "Security alert: " lines[i]   # replace with your action
    }
  }
  delete lines                            # each line is processed only once
  count = 0
  last_flush = systime()
}'

For production environments, consider these robust solutions:

  • systemd journal: journalctl -f -u apache2 (see the sketch after this list)
  • SWATCH: The simple watcher utility
  • Filebeat: Elastic's log shipper with processors
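
If Apache's output is already going to the systemd journal (for example the error output of the apache2 unit), the same while-read pattern works on journalctl. This is only a sketch; the unit name and syslog tag are assumptions to adapt to your setup.

#!/bin/bash
# Follow the journal for the apache2 unit: message text only, no backlog
journalctl -f -n0 -u apache2 -o cat | while IFS= read -r LINE
do
  if [[ "$LINE" =~ "404" ]]; then
    logger -t apache_monitor "404 seen in journal: $LINE"
  fi
done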

Always test your script with:

# Simulate log rotation (run as root in /var/log/apache2)
mv access.log access.log.old
touch access.log
service apache2 restart    # or: apachectl graceful, which reopens the log files