Efficient CIDR Range Matching in Log Files: Regex Techniques and Tool Alternatives



When working with server logs, matching IP addresses within specific CIDR ranges often proves more complex than simple string matching. While basic cases like /8, /16, and /24 masks are straightforward, arbitrary CIDR ranges require careful regex construction.

The core challenge lies in converting the binary network mask into a numeric range for each octet. For example, a /17 mask fixes the first two octets and the top bit of the third, so the third octet must fall in one half of its range (0-127 or 128-255) while the fourth octet can take any value.

# Example: Matching 192.168.128.0/17 (third octet 128-255)
grep -E "(^|[^0-9.])192\.168\.(12[89]|1[3-9][0-9]|2[0-4][0-9]|25[0-5])\." access_log
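The octet arithmetic behind a pattern like this can be sketched in Python. The helper name `octet_ranges` is hypothetical; it relies only on the standard `ipaddress` module and returns, for each octet, the lowest and highest value that falls inside the block.

```python
import ipaddress

def octet_ranges(cidr):
    """Return a (low, high) pair per octet for an IPv4 CIDR block.

    Hypothetical helper for illustration: because CIDR blocks are aligned,
    comparing the first and last address octet by octet gives the ranges.
    """
    net = ipaddress.IPv4Network(cidr)
    first = net.network_address.packed    # 4 bytes of the lowest address
    last = net.broadcast_address.packed   # 4 bytes of the highest address
    return list(zip(first, last))

for cidr in ("10.0.0.0/8", "192.168.128.0/17", "192.168.1.128/25"):
    print(cidr, octet_ranges(cidr))
```

For 192.168.128.0/17 the third octet spans 128 to 255 and the fourth is unconstrained, which is the range the grep pattern has to encode.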

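1. Using ipcalc to Find the Range Boundaries

The ipcalc tool (packaged for most Linux distributions) does not emit regex patterns, but it reports the first and last address of a block, which tells you the octet ranges a pattern has to cover. Output labels differ between ipcalc implementations; the command below assumes the Debian/Ubuntu variant, which prints HostMin and HostMax lines:

ipcalc 192.168.128.0/17 | grep -E '^(HostMin|HostMax)'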

2. Perl One-liner Approach

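Perl's Net::Netmask module handles the CIDR arithmetic itself, so the log can be filtered without building a regex for the range. Enumerating every address of a /17 would yield a 32,768-way alternation; the module's match method tests membership instead (only the first address on each line is checked here):

perl -MNet::Netmask -ne '
BEGIN { $block = Net::Netmask->new("192.168.128.0/17") }
print if /(\d{1,3}(?:\.\d{1,3}){3})/ && $block->match($1)' access_log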

3. Python Script Solution

For more complex needs, a Python script offers flexibility:

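#!/usr/bin/env python3
import ipaddress

# Matches any single octet, 0-255
ANY_OCTET = r"(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"

def cidr_to_regex(cidr):
    """Return a regex matching every dotted-quad address in an IPv4 CIDR block."""
    net = ipaddress.ip_network(cidr)
    first = str(net.network_address).split('.')
    last = str(net.broadcast_address).split('.')
    parts = []
    for lo, hi in zip(first, last):
        lo, hi = int(lo), int(hi)
        if (lo, hi) == (0, 255):
            parts.append(ANY_OCTET)
        else:
            # Fixed or partial octet: list every allowed value explicitly
            parts.append("(" + "|".join(str(v) for v in range(lo, hi + 1)) + ")")
    # \b keeps the pattern from matching inside a longer run of digits
    return r"\b" + r"\.".join(parts) + r"\b"

print(cidr_to_regex("192.168.128.0/17"))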

IPv6 CIDR matching requires different approaches due to hexadecimal notation and compression rules:

# Example: matching 2001:db8::/32 in compressed or zero-padded form
grep -Ei "(^|[^0-9a-f:])2001:0?db8:" access_log

For precise IPv6 matching, consider specialized tools like ipv6calc or Python's ipaddress module.
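As a minimal sketch of the ipaddress route, the function below pulls colon-separated hex tokens out of a line and lets `ipaddress` decide whether each one is a valid address inside the block. The name `line_in_ipv6_block` and the loose token pattern are assumptions made for illustration; addresses with zone IDs or embedded IPv4 notation are not handled.

```python
import ipaddress
import re

# Loose candidate pattern: runs of hex digits and colons. Validation is left
# to ipaddress rather than to the regex.
IPV6_CANDIDATE = re.compile(r"[0-9A-Fa-f:]+")

def line_in_ipv6_block(line, cidr):
    """Return True if any IPv6 address on the line lies inside cidr."""
    net = ipaddress.IPv6Network(cidr)
    for token in IPV6_CANDIDATE.findall(line):
        if token.count(":") < 2:
            continue  # plain hex words and numbers cannot be IPv6 addresses
        try:
            addr = ipaddress.IPv6Address(token)
        except ValueError:
            continue  # e.g. the time portion of a log timestamp
        if addr in net:
            return True
    return False

print(line_in_ipv6_block("2001:db8::1 - - [10/Oct/2023:13:55:36 +0000]", "2001:db8::/32"))
print(line_in_ipv6_block("2001:0db8:0000:0000:0000:0000:0000:0001 GET /", "2001:db8::/32"))
```

Because the comparison happens on parsed addresses, compressed and fully expanded spellings of the same address are treated identically, which a text pattern has to work harder to guarantee.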

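Dedicated tools avoid hand-written patterns altogether:

  • grepcidr: A dedicated tool for CIDR matching in log files
  • jq: For JSON logs, jq can extract the address field so it can be passed to a CIDR-aware tool (jq itself has no built-in CIDR test)
  • logstash: For large-scale log processing, using its cidr filter plugin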

Remember that performance matters when processing large log files. Test your regex patterns with sample data before running them on production logs.
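One way to test a pattern before trusting it is to compare it against `ipaddress` on sample data. The sketch below assumes a hypothetical helper, `check_pattern`, that expects a pattern matching a whole dotted-quad address and returns the sample addresses on which the regex and the CIDR block disagree.

```python
import ipaddress
import random
import re

def check_pattern(pattern, cidr, samples=2000, seed=0):
    """Return sample addresses on which the regex and the CIDR block disagree."""
    net = ipaddress.IPv4Network(cidr)
    regex = re.compile(pattern)
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    # Mix random addresses with the block's edges, where mistakes usually hide.
    candidates = [ipaddress.IPv4Address(rng.getrandbits(32)) for _ in range(samples)]
    candidates += [net.network_address, net.broadcast_address]
    if int(net.network_address) > 0:
        candidates.append(net.network_address - 1)
    if int(net.broadcast_address) < 2**32 - 1:
        candidates.append(net.broadcast_address + 1)
    return [str(ip) for ip in candidates
            if bool(regex.fullmatch(str(ip))) != (ip in net)]

pattern = r"192\.168\.(12[89]|1[3-9][0-9]|2[0-4][0-9]|25[0-5])\.[0-9]{1,3}"
print(check_pattern(pattern, "192.168.128.0/17"))
```

An empty list means no disagreement was found on the sampled addresses; it is evidence, not proof, so the edge addresses are included deliberately.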


Working with log files often requires filtering entries based on IP address ranges. While simple CIDR blocks like /8, /16, or /24 are straightforward to match using basic regex patterns, other prefixes (/17, /25, etc.) split an octet partway through, so the pattern has to encode a numeric range rather than a fixed prefix.

Here are some example patterns for different CIDR ranges:

# Example /16 range match (simple case; the leading space assumes the address is not the first field)
grep ' 192\.168\.' access.log

# /17 range matching (more complex): third octet 128-255
grep -E ' 192\.168\.(12[89]|1[3-9][0-9]|2[0-4][0-9]|25[0-5])\.' access.log

Instead of crafting complex regex patterns manually, consider these alternatives:

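Option 1: Using ipcalc

ipcalc does not produce a regex, and its Network line (192.168.128.0/17) is not usable as a grep pattern. What it does give you is the first and last address of the block, from which the octet ranges can be read off. The labels below are those of the Debian/Ubuntu ipcalc; other implementations format their output differently:

ipcalc 192.168.128.0/17 | grep -E '^(HostMin|HostMax)'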

Option 2: Python One-Liner

# Expand the block to one address per line and feed it to grep as a fixed-string pattern list
python3 -c "import ipaddress; print('\n'.join(str(ip) for ip in ipaddress.IPv4Network('192.168.128.0/17')))" | grep -wFf - access.log
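For larger blocks, expanding every address becomes wasteful. A minimal sketch of the alternative, assuming a hypothetical `filter_lines` helper, extracts candidate addresses from each line and tests membership with `ipaddress` directly:

```python
import ipaddress
import re

# Candidate dotted quads not glued to other digits or dots; ipaddress rejects
# out-of-range octets such as 999.
IPV4_CANDIDATE = re.compile(r"(?<![\d.])\d{1,3}(?:\.\d{1,3}){3}(?![\d.])")

def filter_lines(lines, cidr):
    """Yield the lines that contain at least one address inside cidr."""
    net = ipaddress.IPv4Network(cidr)
    for line in lines:
        for token in IPV4_CANDIDATE.findall(line):
            try:
                if ipaddress.IPv4Address(token) in net:
                    yield line
                    break  # one hit is enough for this line
            except ValueError:
                continue  # not a valid address, e.g. 300.1.2.3

sample = [
    "192.168.200.7 - - GET /index.html\n",
    "192.168.10.7 - - GET /index.html\n",
]
print(list(filter_lines(sample, "192.168.128.0/17")))
```

The same function accepts any iterable of lines, so an open file object can be passed in place of the sample list.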

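For IPv6 ranges, similar approaches work but require different tools. sipcalc prints the expanded first and last address of a block rather than a pattern, so use it to read off the fixed groups and match those:

# Show the expanded range of the block with sipcalc
sipcalc 2001:db8::/32 | grep -A1 '^Network range'
# The first two groups are fixed, so match them in compressed or zero-padded form
grep -Ei '(^|[^0-9a-f:])2001:0?db8:' access.log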

Here's a simple bash function to generate regex patterns:

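cidr2regex() {
    local cidr=$1
    local base=${cidr%/*}   # address part
    local bits=${cidr#*/}   # prefix length

    case $bits in
        # Octet-aligned prefixes: keep the fixed octets and escape the dots
        8)  echo "$(echo "$base" | cut -d. -f1)\\.";;
        16) echo "$(echo "$base" | cut -d. -f1-2 | sed 's/\./\\./g')\\.";;
        24) echo "$(echo "$base" | cut -d. -f1-3 | sed 's/\./\\./g')\\.";;
        # Any other prefix: list the allowed values of each octet
        *)  python3 -c "
import ipaddress, sys
n = ipaddress.IPv4Network(sys.argv[1])
lo = str(n.network_address).split('.')
hi = str(n.broadcast_address).split('.')
print(r'\.'.join('(' + '|'.join(map(str, range(int(a), int(b) + 1))) + ')' for a, b in zip(lo, hi)))
" "$cidr";;
    esac
}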

Using our cidr2regex function:

grep -E "$(cidr2regex 192.168.128.0/17)" access.log

This approach handles both simple and complex cases while maintaining readability.