During recent network quality assessments between two servers connected via 100Mbps links, I encountered an interesting discrepancy:
- iperf3 UDP test (port 9005) reported 96Mbps throughput with 3.3-3.7% packet loss
- tcpdump packet analysis showed only 0.25% average loss when examining actual traffic
# Typical iperf UDP test command
iperf3 -c receiver_ip -u -b 100M -p 9005 -t 60 -i 10
iperf3 calculates loss from the sequence numbers it embeds in each UDP payload. Each test packet carries:
- Timestamp (8 bytes)
- Sequence number (4 bytes)
- Fill bytes padding the datagram to the requested size
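As a rough illustration, the sequence counter can be pulled straight out of a captured payload. This is a minimal Python sketch that assumes the default layout above (4-byte seconds, 4-byte microseconds, 4-byte counter, big-endian); builds run with --udp-counters-64bit lay the fields out differently:
# Minimal sketch: decode the iperf3 UDP header fields described above.
# Assumes the default 32-bit sequence counter (not valid with --udp-counters-64bit).
import struct

def parse_iperf_udp_header(payload: bytes):
    sec, usec, seq = struct.unpack("!III", payload[:12])
    return sec + usec / 1e6, seq   # (send timestamp in seconds, sequence number)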
# Capture commands on both ends
sender$ tcpdump -i eth0 udp port 9005 -w sender.pcap
receiver$ tcpdump -i eth0 udp port 9005 -w receiver.pcap
# Packet counting analysis
tshark -r sender.pcap -Y "udp.port==9005" | wc -l
tshark -r receiver.pcap -Y "udp.port==9005" | wc -l
Timing Window Mismatch
iperf reports loss over the entire test duration, while the tcpdump captures on each host may cover slightly different time segments.
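One way to rule this out is to count only the captured packets that fall inside the iperf3 test window. Here is a sketch of that idea, assuming scapy is installed and the test was run with --json (the start.timestamp.timesecs and end.sum.seconds fields are read from iperf3's JSON output):
# Count captured packets that fall inside the iperf3 test window, so both
# tools are measured over the same interval.
import json
from scapy.all import PcapReader, UDP

def count_in_window(pcap_path, iperf_json_path, port=9005):
    with open(iperf_json_path) as f:
        report = json.load(f)
    start = report["start"]["timestamp"]["timesecs"]
    end = start + report["end"]["sum"]["seconds"]
    count = 0
    for pkt in PcapReader(pcap_path):
        if UDP in pkt and pkt[UDP].dport == port and start <= float(pkt.time) <= end:
            count += 1
    return count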
Buffer Handling Variations
# Check interface statistics during test
watch -n 1 'ethtool -S eth0 | grep -E "dropped|errors"'
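To turn those counters into a per-test number, snapshot them before and after the run and diff the values. A small sketch, assuming the usual "name: value" format of ethtool -S output (counter names are driver-specific):
# Diff NIC drop/error counters across a test run.
import subprocess

def nic_counters(iface="eth0"):
    out = subprocess.run(["ethtool", "-S", iface], capture_output=True, text=True).stdout
    stats = {}
    for line in out.splitlines():
        name, _, value = line.partition(":")
        try:
            stats[name.strip()] = int(value)
        except ValueError:
            pass   # skip the "NIC statistics:" header and any non-numeric lines
    return stats

before = nic_counters()
# ... run the iperf3 test here ...
after = nic_counters()
for name in sorted(after):
    delta = after[name] - before.get(name, 0)
    if delta and ("drop" in name or "err" in name):
        print(f"{name}: +{delta}")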
Kernel vs Userspace Accounting
Packet drops can occur at multiple levels:
- NIC hardware ring buffers
- Kernel network stack
- Application socket buffers
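To see which layer is discarding packets, the kernel-level UDP counters can be snapshotted around the test. This sketch reads /proc/net/snmp on the receiver (field names such as InErrors and RcvbufErrors are present on recent kernels, though the exact set varies):
# Diff kernel UDP counters across a test run; RcvbufErrors indicates drops at
# the application socket buffer, InErrors covers other stack-level drops.
def read_udp_counters(path="/proc/net/snmp"):
    with open(path) as f:
        rows = [line.split() for line in f if line.startswith("Udp:")]
    header, values = rows[0][1:], rows[1][1:]
    return dict(zip(header, map(int, values)))

before = read_udp_counters()
# ... run the iperf3 test here ...
after = read_udp_counters()
for key in ("InDatagrams", "InErrors", "RcvbufErrors"):
    print(key, after.get(key, 0) - before.get(key, 0))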
For different network applications:
| Application Type | Tolerable Loss |
|---|---|
| VoIP | <1% |
| Video Streaming | 1-2% |
| Bulk Data Transfer | 2-5% |
| Real-time Gaming | <0.5% |
#!/bin/bash
# Compare iperf-reported loss with pcap-based analysis
RECEIVER=${1:?usage: $0 <receiver_ip>}
IPERF_DURATION=60
PCAP_FILE="test_$(date +%s).pcap"
# Start the capture first, then run the test in the foreground so the script
# knows when to stop capturing (tcpdump never exits on its own)
tcpdump -i eth0 -w "$PCAP_FILE" udp port 9005 &
TCPDUMP_PID=$!
iperf3 -c "$RECEIVER" -u -b 100M -t "$IPERF_DURATION" -p 9005 --json > iperf.json
sleep 2   # allow trailing packets to reach the capture buffer
kill "$TCPDUMP_PID"
# Parse results (the receiver is assumed to run its own capture to /tmp/receiver.pcap)
IPERF_LOSS=$(jq '.end.sum.lost_percent' iperf.json)
# Client-to-server data packets have destination port 9005 in both captures
PCAP_SENT=$(tshark -r "$PCAP_FILE" -Y "udp.dstport==9005" | wc -l)
PCAP_RECV=$(ssh "$RECEIVER" "tshark -r /tmp/receiver.pcap -Y 'udp.dstport==9005' | wc -l")
PCAP_LOSS=$(echo "scale=2; (1 - $PCAP_RECV/$PCAP_SENT)*100" | bc)
echo "iperf reported loss: ${IPERF_LOSS}%"
echo "pcap calculated loss: ${PCAP_LOSS}%"
- Use consistent time windows for both measurements
- Consider kernel buffer tuning before tests:
sysctl -w net.core.rmem_max=16777216
sysctl -w net.core.wmem_max=16777216
- Verify NIC settings with ethtool:
ethtool -g eth0 # Show ring buffer sizes
To recap the core observation: the iperf3 UDP test (iperf3 -c receiver_ip -u -p 9005 -b 100M) reported 3.3-3.7% packet loss between the two 100Mbps servers, while subsequent tcpdump analysis of the actual packet flows showed only ~0.25% loss.
Several factors could explain this variance:
- Buffer Differences: iperf's userspace socket buffers vs the kernel network stack's buffers (see the per-socket drop sketch after this list)
- Timing Mechanisms: iperf uses application-level timing while tcpdump captures at lower layers
- UDP Characteristics: Unlike TCP, UDP has no flow control or retransmission
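On the buffer question specifically, the kernel keeps a per-socket drop counter showing datagrams discarded because the receiving application (here, the iperf3 server) did not drain its socket buffer fast enough. This sketch reads it from /proc/net/udp on the receiver (IPv6 sockets live in /proc/net/udp6 instead):
# Per-socket receive-buffer drops for the UDP socket bound to port 9005.
def socket_drops(port=9005):
    target = format(port, "04X")   # 9005 -> "232D", as listed in /proc/net/udp
    with open("/proc/net/udp") as f:
        next(f)                    # skip the header row
        for line in f:
            fields = line.split()
            if fields[1].split(":")[1] == target:
                print(f"socket {fields[1]}: drops={fields[-1]}")

socket_drops()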
To validate these findings, I used this tcpdump analysis approach:
# On sender:
tcpdump -i eth0 -w sender.pcap udp port 9005
# On receiver:
tcpdump -i eth0 -w receiver.pcap udp port 9005
# Compare sequence numbers (requires custom script):
python3 compare_pcaps.py sender.pcap receiver.pcap
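The script itself isn't shown here, but one possible shape of it (a sketch, not the original compare_pcaps.py) is to extract the 4-byte iperf3 sequence counter from every captured payload on each side and diff the two sets. This assumes scapy and the default 32-bit counter layout:
# Sketch of a sequence-number comparison between sender and receiver captures.
import struct, sys
from scapy.all import rdpcap, UDP

def sequences(pcap_path, port=9005):
    seqs = set()
    for pkt in rdpcap(pcap_path):
        if UDP in pkt and pkt[UDP].dport == port:
            payload = bytes(pkt[UDP].payload)
            if len(payload) >= 12:
                seqs.add(struct.unpack("!I", payload[8:12])[0])
    return seqs

sent = sequences(sys.argv[1])        # sender.pcap
received = sequences(sys.argv[2])    # receiver.pcap
lost = sent - received
print(f"sent={len(sent)} received={len(received)} "
      f"lost={len(lost)} ({100 * len(lost) / max(len(sent), 1):.2f}%)")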
For different network scenarios:
| Network Type | Acceptable UDP Loss |
|---|---|
| LAN | <0.1% |
| WAN | 0.1-1% |
| Wireless | 1-5% |
Consider a VoIP application using G.711 codec:
// Simplified loss impact calculation (crude linear MOS approximation)
function calculate_voice_quality(packet_loss) {
  // packet_loss is expressed in percent, e.g. 3 for 3%
  const mos = 4.2 - (packet_loss * 0.1);
  return mos > 1 ? mos : 1;
}
// 3% loss → MOS 3.9 (acceptable)
// 5% loss → MOS 3.7 (noticeable degradation)
For more precise measurements, consider:
- Using kernel bypass techniques like DPDK
- Implementing histogram analysis of inter-packet gaps (see the sketch below)
- Adding hardware timestamping with PTP
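For the inter-packet gap idea, here is a minimal sketch (scapy assumed) that buckets gaps from a receiver-side capture; a long tail of large gaps usually points at buffering and bursts rather than genuine loss:
# Bucket inter-packet gaps (in microseconds) for UDP traffic to port 9005.
from collections import Counter
from scapy.all import PcapReader, UDP

def gap_histogram(pcap_path, port=9005, bucket_us=100):
    last, buckets = None, Counter()
    for pkt in PcapReader(pcap_path):
        if UDP in pkt and pkt[UDP].dport == port:
            t = float(pkt.time)
            if last is not None:
                gap_us = int((t - last) * 1e6)
                buckets[gap_us // bucket_us * bucket_us] += 1
            last = t
    for start in sorted(buckets):
        print(f"{start:>6}-{start + bucket_us} us: {buckets[start]}")

gap_histogram("receiver.pcap")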