When troubleshooting network connectivity issues, you might observe this behavior:
ping -c 4 -M do -s 1431 212.58.244.69
PING 212.58.244.69 (212.58.244.69) 1431(1459) bytes of data.
From 217.155.134.6 icmp_seq=1 Frag needed and DF set (mtu = 1458)
From 217.155.134.4 icmp_seq=2 Frag needed and DF set (mtu = 1458)
The first ICMP "fragmentation needed" message comes from the router (217.155.134.6). Subsequent ones come from the local host itself (217.155.134.4), which indicates that the kernel has cached the path MTU and now rejects oversized DF packets before they leave the machine.
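The sizes in the ping output follow from simple header arithmetic: the -s value is the ICMP payload, and the packet on the wire adds a 20-byte IPv4 header and an 8-byte ICMP header (1431 + 28 = 1459, one byte over the 1458-byte limit). A minimal sketch of that calculation, assuming IPv4 without options; the function name is illustrative only:

```shell
# Largest ping payload (-s value) that fits in one unfragmented IPv4 packet.
# Assumes a 20-byte IPv4 header (no options) plus an 8-byte ICMP header.
max_ping_payload() {
    mtu=$1
    echo $(( mtu - 28 ))
}
```

For the path above, max_ping_payload 1458 gives the largest -s value that should pass with DF set.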
The traditional netstat -rCn shows the routing cache but has limitations:
netstat -rCn
Kernel IP routing cache
Source          Destination     Gateway         Flags   MSS Window  irtt Iface
217.155.134.4   212.58.244.69   217.155.134.6          1500 0          0 eth0
More reliable modern alternatives include:
iproute2 Method
ip route get to 212.58.244.69
212.58.244.69 via 217.155.134.6 dev eth1 src 217.155.134.4
cache mtu 1500 advmss 1460 hoplimit 64
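When this check is scripted, it can help to pull just the MTU value out of the ip route get output. The filter below is a sketch that assumes the value always follows the literal word mtu, as in the sample above; the function name is made up for illustration:

```shell
# Print the first value that follows the word "mtu" in text read from stdin.
# Prints nothing if no mtu field is present.
route_mtu() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "mtu") { print $(i + 1); exit } }'
}
```

Typical use would be: ip route get to 212.58.244.69 | route_mtu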
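Kernel PMTU Cache
Linux has no dedicated procfs file that lists cached PMTU values. Since kernel 3.6, a learned PMTU is stored as a per-destination route exception, and ip route get reports it once it exists (look for a cache line containing expires and mtu). For an established TCP connection, ss -ti prints the path MTU the socket is currently using in its pmtu field:
ss -ti dst 212.58.244.69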
When standard tools don't reveal the actual PMTU, try these approaches:
1. Tracepath for Path Discovery
tracepath -n 212.58.244.69
1: 217.155.134.4 0.089ms pmtu 1500
2: 217.155.134.6 1.201ms
3: 195.99.125.101 9.872ms pmtu 1458
2. TCP MTU Probing
Enable TCP-level MTU probing with sysctl (1 probes only after an ICMP black hole is detected; 2 always probes):
sysctl -w net.ipv4.tcp_mtu_probing=1
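For the tracepath approach, the value of interest is the last pmtu figure it reports, since each new figure replaces the previous one as the path narrows. A small sketch of extracting it, assuming the output format shown in the sample above; the function name is hypothetical:

```shell
# Print the last "pmtu N" value seen in tracepath output read from stdin.
# Prints nothing if tracepath reported no pmtu at all.
tracepath_pmtu() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "pmtu") last = $(i + 1) }
         END { if (last != "") print last }'
}
```

Typical use would be: tracepath -n 212.58.244.69 | tracepath_pmtu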
Windows Systems
netsh interface ipv4 show destinationcache
macOS Systems
netstat -rnv
When PMTU discovery fails, consider these workarounds:
# Temporarily lower interface MTU
ip link set dev eth0 mtu 1400
# Or disable PMTU discovery (not recommended)
sysctl -w net.ipv4.ip_no_pmtu_disc=1
For developers needing programmatic access to PMTU values:
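#include <netinet/in.h>
#include <netinet/ip.h>
#include <sys/socket.h>
/* Returns the kernel's current path MTU toward dest, or -1 on error (Linux only). */
int get_pmtu(int sockfd, const struct sockaddr_in *dest) {
    int mtu = 0;
    socklen_t len = sizeof(mtu);
    /* IP_MTU is only valid on a connected socket. */
    if (connect(sockfd, (const struct sockaddr *)dest, sizeof(*dest)) < 0)
        return -1;
    if (getsockopt(sockfd, IPPROTO_IP, IP_MTU, &mtu, &len) < 0)
        return -1;
    return mtu;
}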
When troubleshooting connectivity issues where packets with the DF (Don't Fragment) bit set are dropped, understanding Path MTU (PMTU) caching becomes crucial. The behavior you're observing, where the initial ICMP "fragmentation needed" message comes from the router but subsequent ones originate locally, indicates that your system has cached the PMTU.
The netstat -rCn output displays the kernel's routing cache, but has limitations:
# Typical output showing interface MTU instead of path MTU
217.155.134.4 212.58.244.69 217.155.134.6 1500 0 0 eth0
The column is labeled MSS, and a TCP Maximum Segment Size is normally the MTU minus 40 bytes of IPv4 and TCP headers. Either way, it is not the discovered PMTU: the 1500 shown here is the interface default, not the 1458 learned for this path.
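The MTU-to-MSS relationship is plain subtraction, which is why an MTU of 1500 pairs with an advertised MSS of 1460. A sketch, assuming a 20-byte IPv4 header and a 20-byte TCP header with no options; the function name is illustrative:

```shell
# TCP MSS implied by a given MTU.
# Assumes 20 bytes of IPv4 header and 20 bytes of TCP header, no options.
mss_for_mtu() {
    mtu=$1
    echo $(( mtu - 40 ))
}
```

Applied to the 1458-byte path MTU from the ping output, this gives the MSS a TCP connection over that path should settle on.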
On modern Linux systems, better alternatives exist:
# Query the route, including any cached PMTU, for one destination
ip route get to 212.58.244.69
# Sample output showing cached MTU:
212.58.244.69 via 217.155.134.6 dev eth1 src 217.155.134.4
cache mtu 1500 advmss 1460 hoplimit 64
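On kernels older than 3.6, the raw IPv4 routing cache can also be read through the /proc filesystem (on newer kernels, which no longer keep a routing cache, the file is empty or absent):
cat /proc/net/rt_cache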
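The kernel keeps learned PMTU values in internal per-destination route exceptions, which procfs does not expose directly. The procfs route tables list only the configured routes:
# IPv4 routing table (the MTU column is the configured route MTU, 0 if unset)
cat /proc/net/route
# For IPv6:
cat /proc/net/ipv6_route
Addresses in these files are written in hexadecimal, so grepping for a dotted-quad address will not match. To debug a specific destination, query its route instead:
ip route get to 212.58.244.69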
- First verify PMTU discovery is working:
ping -M do -s 1472 example.com # Adjust size based on expected MTU
- Check the current cached value:
ip route get to example.com | grep mtu
- If needed, flush the PMTU cache:
ip route flush cache
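The manual steps above can be automated by binary-searching for the largest packet that gets through with DF set. The sketch below is one way to do it, not a definitive tool: find_pmtu and ping_probe are made-up names, TARGET is an assumed variable holding the destination, and the search assumes the lower bound passes and that a lost packet means "too big" (so packet loss can make the result too small).

```shell
# Binary-search the largest packet size (IP header included) that passes.
# $1 = probe command (hypothetical contract: called with a payload size,
#      returns 0 if that payload got through with DF set)
# $2 = lower bound assumed to work, $3 = upper bound (e.g. the interface MTU)
find_pmtu() {
    probe=$1 lo=$2 hi=$3
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi + 1) / 2 ))
        # 28 = 20-byte IPv4 header + 8-byte ICMP header
        if "$probe" $(( mid - 28 )); then
            lo=$mid
        else
            hi=$(( mid - 1 ))
        fi
    done
    echo "$lo"
}

# Example probe using iputils ping: one packet, DF set, 1 second timeout.
# TARGET is an assumed variable naming the destination host.
ping_probe() {
    ping -c 1 -W 1 -M do -s "$1" "$TARGET" >/dev/null 2>&1
}
```

A run might look like: TARGET=212.58.244.69; find_pmtu ping_probe 576 1500. Keeping the probe as a separate function means the search logic can be exercised without network access by passing a stub.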
When standard tools don't show the expected PMTU, consider:
# Using tracepath which shows MTU per hop
tracepath -n 212.58.244.69
# Using tcpdump to observe PMTU discovery in action
tcpdump -n -i eth0 "icmp and icmp[0] == 3 and icmp[1] == 4"
These sysctl settings control PMTU behavior:
# View current settings
sysctl net.ipv4.ip_no_pmtu_disc
sysctl net.ipv4.route.mtu_expires
# Temporary modification (value in seconds; lost on reboot)
sysctl -w net.ipv4.route.mtu_expires=600