Unix/Linux Command for Real-Time Line Counting per Second from stdin (for SQL Query/Log Analysis)


When debugging database performance or monitoring web traffic, we often need to measure event frequency in real time. A common scenario:

tail -f query.log | grep "SELECT" | [WE_NEED_SOLUTION_HERE]

The missing piece should output metrics like:

12 lines/s
18 lines/s
23 lines/s

One approach uses GNU awk (needed for strftime) with system timestamps; the empty system("") call forces gawk to flush after every line so output isn't stuck in a pipe buffer:

awk '{print strftime("%s"); system("")}' | uniq -c
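Dropped into the original pipeline it looks like the sketch below; grep needs its --line-buffered flag so matches reach awk promptly instead of sitting in a pipe buffer:

tail -f query.log | grep --line-buffered "SELECT" \
  | awk '{print strftime("%s"); system("")}' | uniq -c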

But this emits each count only after the second has rolled over. A quick-and-dirty alternative polls the file instead of reading the stream:

Polling with watch + wc:

watch -n 1 'grep "SELECT" query.log | wc -l'

This is limited to 1-second granularity, re-reads the whole file on every refresh, and shows a cumulative total rather than a rate, so you must diff successive counts yourself to get lines/s, as in the sketch below.
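A minimal sketch of that diffing as a plain shell loop, assuming the log only ever grows (grep -c prints the number of matching lines):

prev=$(grep -c "SELECT" query.log)
while sleep 1; do
    cur=$(grep -c "SELECT" query.log)
    echo "$((cur - prev)) lines/s"
    prev=$cur
done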

The pv (pipe viewer) tool provides the most elegant solution:

grep "SELECT" query.log | pv --line-mode --rate > /dev/null

Sample output: in line mode pv counts lines rather than bytes, so --rate continuously updates the current line rate, along the lines of:

[ 325 /s]
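The same works on a live stream; again, line-buffering grep keeps the numbers current:

tail -f query.log | grep --line-buffered "SELECT" | pv --line-mode --rate > /dev/null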

For sub-second precision, use Time::HiRes in a Perl one-liner (the builtin time only has whole-second resolution):

perl -MTime::HiRes=time -ne 'BEGIN{$|=1; $t=time} $c++;
if(($n=time)-$t >= 1) {printf "%.1f lines/s\n", $c/($n-$t); $c=0; $t=$n}'

Or a Python version with floating-point precision (flush=True keeps output live when stdout is a pipe):

python3 -c 'import time,sys;t=time.time();c=0
for line in sys.stdin:
    c+=1
    nt=time.time()
    if nt>t+1:
        print(f"{c/(nt-t):.1f} lines/s",flush=True)
        t=nt;c=0'

Multi-second averages (--average-rate reports the mean rate since pv started, refreshed every --interval seconds):

grep "INSERT" audit.log | pv --line-mode --interval 5 --average-rate > /dev/null

Interval averaging smooths bursty traffic (GNU awk again for systime(); resetting NR makes each printed figure cover only the lines since the previous print):

journalctl -f | awk -v last=$(date +%s) \
'{now=systime(); if(now>last){print NR/(now-last)" lines/s"; NR=0; last=now}}'

When processing high-volume logs (100k+ lines/sec):

  • Prefer fast native tools (pv, gawk) over heavier interpreters (Perl, Python)
  • Lower pv's display refresh frequency with --interval to reduce terminal overhead
  • Consider line-buffering each pipeline stage with stdbuf for more consistent measurements, as sketched below
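A minimal sketch of that buffering with GNU coreutils' stdbuf (-oL sets line-buffered stdout on each stage):

stdbuf -oL tail -f query.log | stdbuf -oL grep "SELECT" \
  | pv --line-mode --rate > /dev/null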

For more control over the measurement interval than pv offers, this AWK one-liner works well (GNU awk for strftime; the prev guard suppresses a bogus first line, and fflush() keeps output live when piped):

grep "SELECT" query.log | awk '{current=strftime("%s"); if (prev==current) {count++} else {if (prev) print prev, count; fflush(); prev=current; count=1}}'

Sample output showing per-second counts:

1659322800 42
1659322801 38
1659322802 45

For averages over longer periods (e.g., 10-second windows):

grep "SELECT" query.log | awk '
{
    now=systime()
    counts[now]++
    # Clean old entries
    for (t in counts) 
        if (now - t > 10) 
            delete counts[t]
    # Print stats every second
    if (now != last_print) {
        sum=0
        for (t in counts) sum+=counts[t]
        print now, sum/length(counts)
        last_print=now
    }
}'
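To apply the same window to a live stream, the awk body can be saved to a file (window.awk is a hypothetical name) and fed from a line-buffered grep:

tail -f query.log | grep --line-buffered "SELECT" | awk -f window.awk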

For critical monitoring, consider this more robust version with parameter validation; it follows the file with tail -F (surviving log rotation) and prints the average lines per interval across the window:

#!/usr/bin/env bash

interval=1  # Measurement interval in seconds
window=60   # Rolling window length, in intervals

tail -F query.log | grep --line-buffered "SELECT" | awk -v intvl="$interval" -v win="$window" '
BEGIN {
    if (intvl <= 0 || win <= 0) {
        print "Invalid parameters" > "/dev/stderr"
        exit 1
    }
    last=0
}
{
    now=systime()
    bucket=int(now/intvl)*intvl
    counts[bucket]++
    
    # Purge old buckets
    cutoff=bucket-win*intvl
    for (t in counts)
        if (t < cutoff)
            delete counts[t]
    
    if (bucket != last) {
        sum=0
        for (t in counts)
            sum+=counts[t]
        print strftime("%Y-%m-%d %H:%M:%S",bucket), sum/length(counts)
        last=bucket
    }
}'
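Hypothetical usage, assuming the script above is saved as rate.sh:

chmod +x rate.sh
./rate.sh    # emits one averaged figure per interval in which lines arrived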
Related tools worth knowing:

  • dstat - Combined system stats with plugin support
  • netdata - Full monitoring dashboard solution
  • ts (from moreutils) - Timestamp prepending utility