When analyzing Apache 2.2 logs for request rate measurement, the key fields in your LogFormat
are:
- %t: Time when the request was received (for request counting)
- %D: Time taken to process the request, in microseconds (for performance analysis)
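With that format in place, a log line looks like the following (values are hypothetical); the bracketed timestamp is %t and the trailing integer is %D:
www.example.com:80 10.0.0.1 - - [02/Nov/2023:15:00:01 +0000] "GET /index.html HTTP/1.1" 200 5124 "-" "Mozilla/5.0" 2317
Note that %t is the 5th space-separated field and %D is the last one, which is what the commands below rely on.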
Here's a practical approach using common Unix tools:
# Count requests per second from access.log (%t is the 5th space-separated field)
awk '{print $5}' access.log | tr -d '[' | sort | uniq -c
# Breakdown example output:
# 42 02/Nov/2023:15:00:01
# 38 02/Nov/2023:15:00:02
# 45 02/Nov/2023:15:00:03
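To surface the busiest seconds instead of a chronological listing, sort the same pipeline by count:
# Top 10 busiest seconds
awk '{print $5}' access.log | tr -d '[' | sort | uniq -c | sort -rn | head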
For more precise measurements across specific time intervals:
#!/bin/bash
# Analyze req/sec in 10-second buckets (seconds are chars 19-20 of the timestamp)
awk '{ts=substr($5,2); bucket=substr(ts,1,18) int(substr(ts,19,2)/10) "0";
      counts[bucket]++}
END {for (b in counts) print b": "counts[b]/10" req/sec"}' access.log | sort
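The same truncation trick yields per-minute rates, which are often all you need for capacity planning:
# Requests per minute (truncate the timestamp at the minute)
awk '{print substr($5, 2, 17)}' access.log | sort | uniq -c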
To correlate request rate with performance (%D field):
# Generate req/sec with average response time
# (%t is field 5; %D is the last field, so $NF is safe even though the
#  quoted Referer and User-Agent may contain spaces)
awk '{sec=substr($5,2);        # full per-second timestamp
      count[sec]++;
      total_time[sec]+=$NF;    # %D in microseconds
}
END {
  for (s in count) {
    printf "%s: %d req/sec (avg %dμs)\n", s, count[s], total_time[s]/count[s]
  }
}' access.log | sort
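A small variation flags the seconds where load pushed latency past a threshold (500 ms here, purely illustrative):
# Seconds whose average %D exceeded 500000μs (500 ms)
awk '{sec=substr($5,2); count[sec]++; total[sec]+=$NF}
END {for (s in count) if (total[s]/count[s] > 500000)
       print s": "count[s]" req/sec, avg "total[s]/count[s]"μs"}' access.log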
For continuous monitoring across rotated logs:
# Real-time monitoring; tail -F keeps following across log rotation
tail -F /var/log/apache2/access.log | awk '{
  current_second = substr($5, 2);    # full per-second timestamp
  if (current_second != last_second) {
    if (last_second != "") print last_second": "count" req/sec";
    count = 0;
    last_second = current_second;
  }
  count++;
}'
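For logs that have already been rotated away, GNU gzip's zcat -f decompresses .gz archives and passes plain files through unchanged, so one pipeline can cover the whole history (the glob below assumes Debian-style naming):
# Per-second counts across the current and rotated (gzipped) logs
zcat -f /var/log/apache2/access.log* | awk '{print $5}' | tr -d '[' | sort | uniq -c | sort -rn | head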
For longer-term analysis, consider this R approach:
library(ggplot2)
# 12 space-separated fields: "%v:%p" is a single token, and %t splits into
# the bracketed time plus a separate timezone token
logs <- read.table("access.log", sep=" ",
                   col.names=c("vhostport","ip","ident","user","time","tz",
                               "request","status","size","referer","ua","latency"))
logs$time <- as.POSIXct(logs$time, format="[%d/%b/%Y:%H:%M:%S")
logs$second <- format(logs$time, "%H:%M:%S")
req_rate <- aggregate(request~second, data=logs, FUN=length)
ggplot(req_rate, aes(x=second, y=request, group=1)) +
  geom_line() +
  labs(title="Requests per Second", y="req/sec")
The key to measuring requests per second lies in properly configuring your Apache log format. Based on your example, you've already included the crucial elements:
LogFormat "%v:%p %h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\" %D" combined
The relevant parameters for RPS calculation are:
- %t: Time when the request was received
- %D: Time taken to serve the request in microseconds
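Before crunching numbers, it is worth confirming that %D is actually being written; with the format above it is the last field on every line, so this should print only integers:
# Quick sanity check that %D is present
tail -n 5 /var/log/apache2/access.log | awk '{print $NF}'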
Here's a simple AWK script (GNU awk, for its mktime/strftime time functions) to calculate RPS from your access logs:
# Calculate requests per second from Apache logs
gawk -F'[][]' '{
    # $2 is the %t value, e.g. "02/Nov/2023:15:00:01 +0000"
    split($2, dt, /[\/: ]/)
    # Convert the month name to a number; the timezone offset is ignored
    month = (index("JanFebMarAprMayJunJulAugSepOctNovDec", dt[2]) + 2) / 3
    timestamp = mktime(dt[3] " " month " " dt[1] " " dt[4] " " dt[5] " " dt[6])
    counts[timestamp]++
}
END {
    for (ts in counts)
        printf "%s: %d requests\n", strftime("%Y-%m-%d %H:%M:%S", ts), counts[ts]
}' /var/log/apache2/access.log
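awk visits array keys in arbitrary order, so the report above is not chronological. Assuming the command is saved as a script (rps.sh is just a name for this example), standard tools finish the job:
# Chronological listing, then the five busiest seconds
./rps.sh | sort > rps_by_time.txt
sort -k3 -rn rps_by_time.txt | head -5   # field 3 is the request count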
For more sophisticated analysis, here's a Python script that calculates RPS and reports its distribution over time:
import re
from collections import defaultdict
from datetime import datetime

# Matches the tail of the combined format: [%t] "%r" %>s %O "Referer" "User-Agent" %D
log_pattern = re.compile(
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]+)" '
    r'(?P<status>\d+) '
    r'(?P<size>\d+) '
    r'"(?P<referer>[^"]*)" '
    r'"(?P<user_agent>[^"]*)" '
    r'(?P<time_taken>\d+)'
)

def calculate_rps(logfile_path):
    time_counts = defaultdict(int)
    with open(logfile_path) as f:
        for line in f:
            match = log_pattern.search(line)
            if match:
                log_data = match.groupdict()
                timestamp = datetime.strptime(
                    log_data['timestamp'],
                    '%d/%b/%Y:%H:%M:%S %z'
                )
                # Bucket by minute, then report the per-second average
                time_key = timestamp.strftime('%Y-%m-%d %H:%M:00')
                time_counts[time_key] += 1

    for minute, count in sorted(time_counts.items()):
        print(f"{minute}: {count} requests ({count/60:.2f} req/s)")

if __name__ == "__main__":
    calculate_rps('/var/log/apache2/access.log')
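Bucketing on '%Y-%m-%d %H:%M:00' deliberately averages each minute to smooth out noise; switch the strftime format to '%Y-%m-%d %H:%M:%S' (and drop the division by 60) if you want exact per-second counts instead.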
For a ready-made solution, GoAccess provides excellent RPS metrics. Note that GoAccess uses its own format specifiers rather than Apache's LogFormat codes, so the format string has to be translated (the translation below assumes the vhost-combined layout from the question):
goaccess /var/log/apache2/access.log --log-format='%v:%^ %h %^[%d:%t %^] "%r" %s %b "%R" "%u" %D' --date-format='%d/%b/%Y' --time-format='%H:%M:%S'
This will generate an interactive report showing requests per second, minute, and hour.
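GoAccess can also render a self-updating HTML dashboard, which is handy while a stress test is running; a minimal sketch with the same format flags:
# Live dashboard; open report.html in a browser during the test
goaccess /var/log/apache2/access.log -o report.html --real-time-html \
  --log-format='%v:%^ %h %^[%d:%t %^] "%r" %s %b "%R" "%u" %D' \
  --date-format='%d/%b/%Y' --time-format='%H:%M:%S'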
When comparing your stress test results with production logs:
- Filter logs to match your test time window (see the sketch after this list)
- Compare the RPS distribution with your test plan
- Check for anomalies in response times (%D) during peak periods
- Verify that response codes (%>s) remain stable
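For the first point, a lexicographic comparison on the %t field is enough to cut out the window, as long as it stays within a single day (the times below are illustrative):
# Extract a 10-minute test window into its own file
awk '$5 >= "[02/Nov/2023:15:00:00" && $5 < "[02/Nov/2023:15:10:00"' access.log > test_window.log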