Understanding and Troubleshooting PCIe ASPM: How Disabling `pcie_aspm` Fixed My e1000e Ethernet Driver Issues



The pcie_aspm kernel parameter controls Active State Power Management (ASPM), the power-saving feature for PCI Express links. With ASPM enabled, PCIe links can enter low-power states (L0s/L1) when idle. While beneficial for power efficiency, it can cause compatibility issues with certain hardware, particularly network interface cards.

In my case, the e1000e driver for Intel Ethernet controllers was failing to initialize properly. System logs showed the driver failing to set up its interrupts and then giving up on the device:


[   12.345678] e1000e 0000:00:19.0: Failed to initialize MSI-X interrupts
[   12.345790] e1000e 0000:00:19.0: Failed to initialize the device

After extensive troubleshooting, I discovered the issue was related to PCIe power management. The e1000e driver wasn't recovering properly from ASPM low-power states. This manifested in several ways:

  • Intermittent network connectivity
  • Driver initialization failures
  • System hangs when bringing up the interface

To resolve this, I disabled ASPM in the kernel boot parameters by adding:


pcie_aspm=off

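To make this persistent across reboots, I added it to the GRUB configuration (update-grub is the Debian/Ubuntu command; other distributions regenerate the config with grub2-mkconfig):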


# Edit /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pcie_aspm=off"

# Update GRUB
sudo update-grub
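
After rebooting, it can be worth confirming that the parameter actually reached the kernel. The following is a minimal sketch rather than a definitive check; the helper name cmdline_has_param is hypothetical, and it simply looks for a whole-word match in a command line string (by default the contents of /proc/cmdline):

```shell
#!/bin/sh
# cmdline_has_param PARAM [CMDLINE]
# Hypothetical helper: succeeds if PARAM appears as a whole word in CMDLINE.
# CMDLINE defaults to the running kernel's command line.
cmdline_has_param() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline)}
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage after a reboot:
if cmdline_has_param pcie_aspm=off; then
    echo "pcie_aspm=off is active"
else
    echo "pcie_aspm=off is NOT on the kernel command line"
fi
```

If the second message appears, the usual suspects are a GRUB config that was edited but not regenerated, or a different boot entry being selected.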

For cases where complete ASPM disable isn't desirable, you can try more targeted solutions:


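# Disable ASPM for a specific device (per-link sysfs attributes, kernel 5.5+;
# power/control only accepts on/auto and governs runtime PM, not ASPM)
echo 0 | sudo tee /sys/bus/pci/devices/0000:00:19.0/link/l0s_aspm
echo 0 | sudo tee /sys/bus/pci/devices/0000:00:19.0/link/l1_aspm

# Or disable just the L1 state by clearing bit 1 of the Link Control register
# (bits:mask form, so the other bits in the byte are left untouched)
sudo setpci -s 00:19.0 CAP_EXP+0x10.b=0x00:0x02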
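
For context on what that register write touches: in the PCIe Link Control register (offset 0x10 in the PCI Express capability), bits 1:0 are the ASPM Control field, where 00 means disabled, 01 L0s, 10 L1, and 11 both. Below is a small sketch for decoding the byte that setpci reads back; the helper name decode_aspm is hypothetical and the input value is only an example:

```shell
#!/bin/sh
# decode_aspm HEXBYTE
# Hypothetical helper: decodes the ASPM Control field (bits 1:0) of the low
# byte of the PCIe Link Control register, as printed by setpci (hex, no 0x).
decode_aspm() {
    bits=$(( 0x$1 & 0x3 ))
    case $bits in
        0) echo "ASPM disabled" ;;
        1) echo "L0s enabled" ;;
        2) echo "L1 enabled" ;;
        3) echo "L0s and L1 enabled" ;;
    esac
}

# Example with a literal value; on real hardware one might instead run:
#   decode_aspm "$(sudo setpci -s 00:19.0 CAP_EXP+0x10.b)"
decode_aspm 42
```

Reading the register before and after a change is a cheap way to confirm that only the intended bits moved.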

After applying the fix, verify ASPM status:


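sudo lspci -vv -s 00:19.0 | grep ASPM

Root is needed for lspci to read the capability registers. Keep in mind that pcie_aspm=off tells the kernel to leave the ASPM configuration untouched, so a state enabled by firmware can stay enabled until it is also turned off in the BIOS/UEFI setup. Expected output should show ASPM disabled on the LnkCtl line (LnkCap only lists what the link supports, so it does not change; the rest of the line varies by device):


LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+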
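
Since lspci reports what the link supports under LnkCap and what is currently enabled under LnkCtl, it can help to pull out just the LnkCtl field when scripting the check. This is a rough sketch, assuming the usual "LnkCtl: ASPM <state>; ..." layout of lspci -vv output; the helper name lnkctl_aspm is hypothetical and the sample line is illustrative, not captured from real hardware:

```shell
#!/bin/sh
# lnkctl_aspm: hypothetical helper that reads `lspci -vv` output on stdin
# and prints the ASPM field of each LnkCtl line (the currently enabled states).
lnkctl_aspm() {
    sed -n 's/^[[:space:]]*LnkCtl:[[:space:]]*ASPM \([^;]*\);.*/\1/p'
}

# Usage on real hardware (root is needed to read the capability):
#   sudo lspci -vv -s 00:19.0 | lnkctl_aspm
# Example with an illustrative line shaped like lspci output:
printf '\t\tLnkCtl:\tASPM Disabled; RCB 64 bytes, Disabled- CommClk+\n' | lnkctl_aspm
```

If the helper prints nothing, the device most likely exposes no PCI Express link capability, or lspci was run without root.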

While disabling ASPM can resolve compatibility issues, be aware of potential impacts:

  • Increased power consumption (the amount depends on the device and platform)
  • No significant performance impact for most workloads
  • May affect battery life on laptops

The official kernel documentation covers ASPM parameters in detail:


Documentation/admin-guide/kernel-parameters.txt
Documentation/power/pci.rst

PCIe Active State Power Management (ASPM) is a power-saving feature that allows PCI Express links to enter low-power states when idle. While generally beneficial for power efficiency, it can sometimes cause compatibility issues with certain hardware, which is exactly what happened in my case with the e1000e driver.

ASPM operates at the hardware level with two main power states:

  • L0s: Modest power savings with a very short exit latency (from tens of nanoseconds up to a few microseconds)
  • L1: Higher power savings but longer exit latency (typically microseconds to tens of microseconds)

The Linux kernel exposes control through these parameters:

pcie_aspm=off|force
pcie_aspm.policy=default|performance|powersave|powersupersave

The e1000e Intel Ethernet driver is particularly sensitive to ASPM timing issues. When ASPM is active, you might experience:

  • Intermittent connection drops
  • Reduced throughput
  • Complete failure to initialize

To check your current ASPM settings:

# Check supported (LnkCap) and enabled (LnkCtl) ASPM states; root is needed to read capabilities
sudo lspci -vv | grep -i aspm

# View current ASPM policy
cat /sys/module/pcie_aspm/parameters/policy
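
The policy file lists every available policy and marks the active one with square brackets. As a small sketch for scripts that only want the active value (the helper name active_policy is hypothetical):

```shell
#!/bin/sh
# active_policy: hypothetical helper; reads the policy list on stdin and
# prints the entry shown in square brackets (the active one).
active_policy() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Usage on a live system:
#   active_policy < /sys/module/pcie_aspm/parameters/policy
# Example with a sample line in the same format:
echo "default performance [powersave] powersupersave" | active_policy
```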

For a persistent fix, you have several options:

1. Kernel Boot Parameter (recommended):

# Add to GRUB_CMDLINE_LINUX in /etc/default/grub
GRUB_CMDLINE_LINUX="pcie_aspm=off"

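2. Driver-specific Workaround (tunes interrupt moderation; it does not change ASPM itself):

# Create modprobe configuration (one value per port; applied the next time the module loads)
echo "options e1000e InterruptThrottleRate=3000,3000,3000" | sudo tee /etc/modprobe.d/e1000e.conf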

3. Runtime Disable (temporary):

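# For immediate testing (lost on reboot; the write is rejected if ASPM is
# disabled via pcie_aspm=off or left under firmware control)
echo performance | sudo tee /sys/module/pcie_aspm/parameters/policy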

If power savings are critical (like on laptops), try these compromises:

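  • Keep L0s but drop L1 on the affected device only: write 0 to its link/l1_aspm attribute in sysfs and leave the global policy alone (pcie_aspm.policy=powersave enables both L0s and L1, not L0s alone)
  • Check the latency the NIC tolerates: the acceptable L0s/L1 latencies are read-only values the device advertises in DevCap (see lspci -vv); there is no pcie_aspm kernel parameter to raise them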

After making changes, verify the improvement:

# Check NIC error counters (replace eth0 with your interface name)
ethtool -S eth0 | grep errors

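# Measure package energy during a test run (RAPL counter; it does not count
# ASPM transitions, and the event is system-wide, hence -a)
sudo perf stat -a -e 'power/energy-pkg/' ping -c 1000 example.com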
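
To compare error counters before and after a change, it can be easier to reduce the ethtool output to a single number. This is a minimal sketch, assuming the usual "name: value" layout of ethtool -S; the helper name sum_errors and the sample counters are hypothetical:

```shell
#!/bin/sh
# sum_errors: hypothetical helper; reads `ethtool -S` output on stdin and
# prints the total of all counters whose name contains "errors".
sum_errors() {
    awk -F: '$1 ~ /errors/ { total += $2 } END { print total + 0 }'
}

# Usage on a live system (interface name is an assumption):
#   before=$(ethtool -S eth0 | sum_errors)
#   ... run a workload ...
#   after=$(ethtool -S eth0 | sum_errors)
#   echo "new errors: $((after - before))"
# Example with made-up counters in the same layout:
printf '     rx_packets: 1200\n     rx_errors: 3\n     tx_errors: 1\n     rx_crc_errors: 2\n' | sum_errors
```

A total that keeps growing under load after the change suggests the problem is not (only) ASPM.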