Optimizing NIC Interrupt Handling: NAPI vs. Single Interrupt Per Frame for High-Throughput Squid Proxy on Broadcom 5709


When dealing with high-traffic Squid proxy servers (like handling 100Mbps at 10,000 pps), the Broadcom 5709 NIC's default interrupt behavior becomes problematic. The symptoms you're seeing - latency spikes to 200ms and Squid response times jumping from 30ms to 500+ms - are classic signs of interrupt overload, despite having CPU headroom.

The Linux kernel offers two primary approaches for NIC packet handling:

1. Traditional Interrupt-per-Packet (IRQ)
2. New API (NAPI) with polling

For your hardware configuration (Xeon E5530, bnx2 driver), the interrupt rate of 15,000/s suggests the system is spending too much time handling IRQs rather than processing packets.

A good rule of thumb: when packet rate exceeds 5,000-10,000 pps, NAPI typically becomes beneficial. Your case (10,000 pps) is right at the threshold where NAPI should help, but requires careful tuning.
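
To see where your box sits relative to that threshold, sample the packet and interrupt counters over one second. A rough sketch in shell (the interface name eth0 and the match pattern are assumptions for this host):

#!/bin/bash
# One-second sample of RX packets/sec vs. NIC interrupts/sec
IF=eth0
rx_pkts() { tr ':' ' ' < /proc/net/dev | awk -v i="$IF" '$1 == i {print $3}'; }
irqs() { awk -v i="$IF" '$0 ~ i {for (f=2; f<=NF; f++) if ($f ~ /^[0-9]+$/) s+=$f} END {print s+0}' /proc/interrupts; }
p1=$(rx_pkts); i1=$(irqs)
sleep 1
p2=$(rx_pkts); i2=$(irqs)
echo "rx pps: $((p2 - p1))   interrupts/s: $((i2 - i1))"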

First, check current settings:

ethtool -c eth0

For the bnx2 driver, try these conservative coalescing settings:

ethtool -C eth0 rx-usecs 100 rx-frames 32
ethtool -C eth1 rx-usecs 100 rx-frames 32
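
These ethtool settings don't survive a reboot; one simple way to persist them (a sketch - use your distro's own network scripts if it has them):

# /etc/rc.local - reapply coalescing at boot
ethtool -C eth0 rx-usecs 100 rx-frames 32
ethtool -C eth1 rx-usecs 100 rx-frames 32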

Watch these metrics after changes:

watch -n 1 "cat /proc/interrupts | grep eth"
mpstat -P ALL 1
cat /proc/net/softnet_stat
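
In /proc/net/softnet_stat each row is one CPU and the columns are hex counters; the 2nd (packets dropped) and 3rd (time_squeeze - the NAPI budget ran out mid-poll) are the ones that matter here. A small decoder, assuming GNU awk for strtonum:

awk '{ printf "cpu%-2d processed=%d dropped=%d squeezed=%d\n",
       NR - 1, strtonum("0x" $1), strtonum("0x" $2), strtonum("0x" $3) }' /proc/net/softnet_stat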

If coalescing isn't enough, recent kernels (5.7+) can defer hard IRQs in favor of NAPI polling. These knobs sit directly under the device's sysfs node and don't exist on older 2.6.x kernels, where coalescing is the main lever (values below are illustrative):

echo 2 > /sys/class/net/eth0/napi_defer_hard_irqs
echo 200000 > /sys/class/net/eth0/gro_flush_timeout   # nanoseconds

Add these to /etc/sysctl.conf:

net.core.netdev_budget = 600
net.core.netdev_max_backlog = 3000
net.core.somaxconn = 2048
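
Apply them without a reboot and confirm the running values:

sysctl -p
sysctl net.core.netdev_budget net.core.netdev_max_backlog net.core.somaxconn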

If NAPI tuning doesn't resolve the issue, consider RPS (Receive Packet Steering):

echo ff > /sys/class/net/eth0/queues/rx-0/rps_cpus   # ff = mask for CPUs 0-7
echo 4096 > /proc/sys/net/core/rps_sock_flow_entries
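
One caveat: rps_sock_flow_entries only sizes the global flow table; for Receive Flow Steering to actually engage, the per-queue flow count must be non-zero too:

echo 4096 > /sys/class/net/eth0/queues/rx-0/rps_flow_cnt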

To recap how this typically presents: Squid proxies on Broadcom 5709 NICs (bnx2 driver) show latency spikes during peak traffic despite adequate CPU resources. The core issue manifests as:

  • 15,000+ interrupts/sec at just 10,000 packets/sec
  • Ping latency jumping to 200ms+ on local GbE
  • Squid response times degrading from 30ms to 500ms

Side by side, the two packet-handling modes compare as follows:

Method            Mechanism                    Best For
Traditional IRQ   Hard interrupt per packet    Low PPS (under 5K)
NAPI (polling)    Hybrid polling/interrupt     High PPS (10K+)

Verify the current interrupt configuration in more detail:

# Check IRQ counts in 1 second intervals
watch -n1 "cat /proc/interrupts | grep eth"

# Check softirq processing
watch -n1 'grep -E "NET_RX|NET_TX" /proc/softirqs'

# Current coalesce settings
ethtool -c eth0

For Broadcom 5709 (bnx2) without adaptive interrupts, try these settings:

# Set rx-usecs to 100 (μs delay before IRQ)
# rx-frames 32 (packets before IRQ)
ethtool -C eth0 rx-usecs 100 rx-frames 32

# For extreme cases, raise the NAPI softirq budget
# (netdev_budget_usecs requires kernel 4.12+)
sysctl -w net.core.netdev_budget=600
sysctl -w net.core.netdev_budget_usecs=6000

With 8 logical CPUs and MSI-X, distribute interrupts effectively:

# Manually pin IRQs round-robin across 8 logical CPUs
# (IRQ numbers 16-31 are an example - take the real ones from /proc/interrupts)
for i in {0..15}; do
  echo $((i%8)) > /proc/irq/$((16+i))/smp_affinity_list
done
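
Manual pinning only sticks while irqbalance isn't running, since it periodically rewrites IRQ affinities. On a RHEL/CentOS-era init (an assumption for this box):

# Stop irqbalance so it doesn't overwrite the affinities above
service irqbalance stop
chkconfig irqbalance off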

# Verify the per-CPU interrupt distribution
grep "CPU\|eth" /proc/interrupts

Complement NIC tuning with Squid optimizations:

# Check the current file-descriptor limit
squid -k parse | grep MaxFD

# In squid.conf
max_filedescriptors 16384
workers 4
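
max_filedescriptors is also capped by the OS limit on the Squid process, so raise that as well. A sketch, assuming Squid runs as user squid with PAM limits in effect:

# /etc/security/limits.conf
squid  soft  nofile  16384
squid  hard  nofile  16384
# Then verify from the shell that launches Squid
ulimit -n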

If tuning doesn't resolve the issue, these indicators point to a hardware or firmware problem (quick checks for each follow the list):

  • Sustained interrupts >20K/sec at <50% link utilization
  • More than 5% CPU time in softirq (ksoftirqd)
  • Packet drops in ethtool -S eth0 | grep drop
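
Quick checks for each:

# Interrupt rate (compare against link utilization)
watch -n1 "grep eth /proc/interrupts"

# softirq share of CPU time (the %soft column)
mpstat -P ALL 1 5

# NIC-level drop counters
ethtool -S eth0 | grep -i drop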

For Broadcom 5709 specifically, upgrading to later firmware (5.2.3+) often resolves MSI-X issues.
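
You can check the firmware revision currently flashed (along with the driver version) before deciding to upgrade:

# Shows driver, version, and firmware-version fields
ethtool -i eth0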