Diagnosing and Fixing Extreme UDP Packet Loss (14%) at 300Mbit While TCP Achieves 800Mbit+ Without Retransmits


When running network performance tests between a Linux client and Windows Server 2012 R2 machines with Broadcom BCM5721 NICs, we observe a puzzling discrepancy:

# UDP Test (14% packet loss at 300Mbit)
iperf3 -uZVc 192.168.30.161 -b300m -t5 --get-server-output -l8192

# TCP Test (800Mbit+ with zero retransmissions)
iperf3 -ZVc 192.168.30.161 -t5 --get-server-output -l8192

The problem appears direction-dependent:

  • Linux → Windows: 14% UDP loss at 300Mbit
  • Windows → Windows: 22% UDP loss
  • Windows → Linux: 0% loss (though throughput is capped at 250Mbit for unfragmented packets, -l1472)
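
The two payload sizes used in these tests look very different on the wire, which may help in reading the loss figures. The Python sketch below works through the arithmetic. It assumes a standard 1500-byte MTU, a 20-byte IPv4 header without options, and that iperf3's -b value counts payload bits; none of these is stated in the test setup. The independent-drop model in per_frame_loss is a deliberate simplification.

```python
import math

IP_HEADER = 20   # bytes, IPv4 without options (assumed)
UDP_HEADER = 8   # bytes
MTU = 1500       # bytes, standard Ethernet without jumbo frames (assumed)


def fragments_per_datagram(payload: int, mtu: int = MTU) -> int:
    """Number of IPv4 packets needed to carry one UDP datagram."""
    ip_payload = payload + UDP_HEADER
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(ip_payload / per_fragment)


def frames_per_second(bitrate: float, payload: int, mtu: int = MTU) -> float:
    """Ethernet frames per second for a given payload bitrate (bits/s)."""
    datagrams = bitrate / (payload * 8)
    return datagrams * fragments_per_datagram(payload, mtu)


def per_frame_loss(datagram_loss: float, fragments: int) -> float:
    """Frame loss rate that would produce the observed datagram loss,
    assuming frames are dropped independently of each other."""
    return 1 - (1 - datagram_loss) ** (1 / fragments)


if __name__ == "__main__":
    for payload in (8192, 1472):
        n = fragments_per_datagram(payload)
        fps = frames_per_second(300e6, payload)
        print(f"-l{payload}: {n} frame(s) per datagram, {fps:,.0f} frames/s")
    print(f"frame loss behind 14% datagram loss: {per_frame_loss(0.14, 6):.1%}")
```

Under these assumptions an 8192-byte datagram is split into six frames and is lost if any one of them is dropped, so 14% datagram loss corresponds to roughly 2.5% frame loss, whereas -l1472 sends exactly one frame per datagram.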

All tests run through a single 1Gb switch with identical cabling. We've experimented with numerous NIC settings:

# Common tweaks attempted:
- Interrupt Moderation (various settings)
- Flow Control (on/off)
- Receive Buffers (multiple sizes)
- RSS (enabled/disabled)
- Offload Features (all combinations)
- Ethernet@Wirespeed
- Priority & VLAN settings

To systematically identify the bottleneck:

  1. Eliminate the switch by testing directly connected machines
  2. Check the Windows adapter statistics for discarded packets:
    Get-NetAdapterStatistics -Name "Ethernet" | Select-Object ReceivedUnicastPackets,ReceivedDiscardedPackets
    
  3. Monitor interrupt handling:
    typeperf "\Processor(*)\Interrupts/sec" -si 1 -sc 60
    

The Windows NIC driver appears to be the primary culprit, evidenced by:

  • Directional asymmetry in packet loss
  • Older drivers performing better (2% loss vs 14%)
  • TCP hiding the issue: its flow control and ACK clocking pace the sender to what the receiver can absorb, so the TCP test shows no retransmits

After extensive testing, these measures proved most effective:

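# Windows PowerShell: NIC settings that helped most
# Display names and values are driver-specific; list the valid ones first with:
#   Get-NetAdapterAdvancedProperty -Name "Ethernet" | Format-List DisplayName,ValidDisplayValues
Set-NetAdapterAdvancedProperty -Name "Ethernet" -DisplayName "Interrupt Moderation" -DisplayValue "OFF"
Set-NetAdapterAdvancedProperty -Name "Ethernet" -DisplayName "Receive Buffers" -DisplayValue "4096"
# RSC coalesces TCP segments only, so disabling it is not expected to change UDP results
Disable-NetAdapterRsc -Name "Ethernet"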

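Additionally, consider these registry tweaks for Broadcom NICs. The values belong in the adapter's own four-digit instance subkey (0000, 0001, and so on) under the network class key shown below, and the value names depend on the driver version, so check which ones your driver exposes before importing: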

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4d36e972-e325-11ce-bfc1-08002be10318}\]
"RxIntModeration"=dword:00000000
"TxIntModeration"=dword:00000000
"NumRxBuffers"=dword:00001000

If driver updates don't resolve the issue:

  • Reduce the UDP payload to 1472 bytes (-l1472) so each datagram fits within the standard 1500-byte MTU and is never fragmented
  • Implement application-level packet pacing
  • Consider alternative NICs (Intel generally performs better for UDP)
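
Application-level pacing means spacing datagrams evenly instead of handing them to the NIC in bursts. The following is a minimal Python sketch of the idea, not the method used in the tests above. The function names, the receiver port, and the busy-wait strategy are illustrative assumptions; the busy-wait trades CPU time for timing accuracy because sleep granularity is usually too coarse for gaps of a few tens of microseconds.

```python
import socket
import time


def packet_interval(bitrate: float, payload: int) -> float:
    """Seconds between datagrams needed to reach `bitrate` (payload bits/s)."""
    return payload * 8 / bitrate


def send_paced(sock, addr, bitrate: float, payload: int, count: int) -> float:
    """Send `count` datagrams of `payload` bytes, evenly spaced in time.

    Returns the elapsed time in seconds.
    """
    interval = packet_interval(bitrate, payload)
    data = bytes(payload)
    start = time.perf_counter()
    next_send = start
    for _ in range(count):
        # Busy-wait until this datagram's scheduled slot.
        while time.perf_counter() < next_send:
            pass
        sock.sendto(data, addr)
        # Schedule from the previous slot, not from "now", so that a late
        # send does not permanently lower the average rate.
        next_send += interval
    return time.perf_counter() - start


# Example (hypothetical receiver port):
#   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   send_paced(sock, ("192.168.30.161", 9000), 300e6, 1472, 10000)
```

At 300 Mbit/s with 1472-byte payloads the computed gap is about 39 microseconds per datagram.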

Verify improvements using:

# Linux side packet capture (compare the number of datagrams sent with the receiver's count; UDP has no retransmissions)
tcpdump -i eth0 -w udp_capture.pcap host 192.168.30.161

# Windows performance monitoring
perfmon /sys
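
To compare runs before and after a settings change, iperf3 can write its report as JSON with -J (for example, iperf3 -uZc 192.168.30.161 -b300m -t5 -l8192 -J > before.json). The Python sketch below summarises two such reports. It assumes the UDP totals appear under end.sum with the fields bits_per_second, packets and lost_packets; that layout may vary between iperf3 versions, so check one report by hand first. The script name and file names are hypothetical.

```python
import json
import sys


def udp_summary(report_text: str) -> dict:
    """Extract throughput and loss from one iperf3 -J UDP report."""
    report = json.loads(report_text)
    # Assumed layout: UDP totals live under end.sum.
    total = report["end"]["sum"]
    packets = total["packets"]
    lost = total["lost_packets"]
    return {
        "mbit_per_s": total["bits_per_second"] / 1e6,
        "packets": packets,
        "lost": lost,
        "lost_percent": 100.0 * lost / packets if packets else 0.0,
    }


def compare(before: dict, after: dict) -> str:
    """One-line description of how the loss rate changed between runs."""
    delta = after["lost_percent"] - before["lost_percent"]
    return (f"loss {before['lost_percent']:.2f}% -> "
            f"{after['lost_percent']:.2f}% ({delta:+.2f} points)")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python compare_runs.py before.json after.json
    summaries = []
    for path in sys.argv[1:3]:
        with open(path) as fh:
            summaries.append(udp_summary(fh.read()))
    print(compare(*summaries))
```

Recomputing the percentage from the packet counts, rather than trusting a single precomputed field, keeps the comparison consistent across runs.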

Interestingly, packet loss disappears in the Windows → Linux direction:

# Windows → Linux UDP results:
-l8192: 840Mbit (0% loss)
-l1472: 250Mbit (0% loss)

We've exhaustively tested these NIC settings on Windows servers:

  • Interrupt Moderation (various states)
  • Flow Control (enabled/disabled)
  • Receive Buffers (multiple size configurations)
  • RSS settings
  • All offload combinations (TCP/UDP checksum, LSO, etc.)

Capturing on both ends during a test shows where the packets disappear:

# On Linux sender:
tcpdump -i eth0 -w udp_test.pcap host 192.168.30.161

# On Windows receiver (the filter name is Ethernet.Type; finish with "netsh trace stop"):
netsh trace start capture=yes report=no Ethernet.Type=IPv4 IPv4.Address=192.168.30.161

Comparing the captures shows the packets arriving at the Windows NIC but never reaching userspace.
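
One place where datagrams can be discarded between the NIC and the application is the socket receive buffer, which overflows if the application does not read fast enough. This is a possible cause worth ruling out, not a confirmed diagnosis for this setup. The Python sketch below, with illustrative function names, requests a larger buffer and reports what the OS actually granted; iperf3 exposes the same knob through its -w option.

```python
import socket


def open_udp_receiver(port: int, rcvbuf_bytes: int, host: str = "0.0.0.0"):
    """Bind a UDP socket and request a larger receive buffer.

    Returns the socket and the buffer size the OS reports afterwards,
    which may differ from the request (Linux, for example, caps it at
    net.core.rmem_max and reports double the value).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf_bytes)
    sock.bind((host, port))
    granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, granted


def buffer_headroom_ms(buffer_bytes: int, bitrate: float) -> float:
    """Milliseconds of traffic at `bitrate` (bits/s) that the buffer can
    absorb if the application stops reading; ignores kernel overhead."""
    return buffer_bytes * 8 / bitrate * 1000
```

As an illustration, a 64 KiB buffer holds under 2 ms of traffic at 300 Mbit/s, so even a brief scheduling delay in the receiver can cause drops.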

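These registry tweaks significantly improved UDP performance in our tests. Note that DefaultReceiveWindow and DefaultSendWindow are normally documented under the AFD\Parameters key rather than Tcpip\Parameters, so confirm that each value actually takes effect on your system: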

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
"NumForwardPackets"=dword:00002710
"DefaultReceiveWindow"=dword:00040000
"DefaultSendWindow"=dword:00040000

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\NDIS\Parameters]
"NumRxBuffers"=dword:00000400
"NumTxBuffers"=dword:00000400

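Two further checks are useful:

# Confirm that a 1472-byte payload crosses the path unfragmented
# (Windows ping: -f sets Don't Fragment, -l sets the payload size)
ping -f -l 1472 192.168.30.161

# List the buffer-related driver settings currently in effect:
Get-NetAdapterAdvancedProperty -Name "Ethernet" | Where DisplayName -like "*buffer*"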

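The combination of settings that finally resolved our issue is listed below. Note that it sets interrupt moderation to "Extreme" rather than turning it off as in the earlier PowerShell settings, so test both on your hardware: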

  1. Disabled all offload features
  2. Set interrupt moderation to "Extreme"
  3. Increased receive buffers to maximum
  4. Enabled flow control
  5. Applied the registry tweaks above

This reduced UDP packet loss to under 0.1% at 900Mbit.