When examining the bonding configuration on this RHEL 6.4 system with Broadcom NetXtreme II NICs, we observe proper bond initialization but failure during the actual failover event:
# Current bond status check
cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Slave Interface: eth0
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:22:64:f8:ef:60
Slave Interface: eth1
MII Status: up
Link Failure Count: 0
Permanent HW addr: 00:22:64:f8:ef:62
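Reading this output by eye gets tedious when the check is repeated during a failover test. As a minimal sketch, the shell function below condenses the status file into one line per fact. The name bond_summary is illustrative and not part of any bonding tool, and the parsing assumes the field labels shown above.

```shell
# Summarize a bonding status file: active slave plus per-slave link state.
# Usage: bond_summary [FILE]   (FILE defaults to /proc/net/bonding/bond0)
bond_summary() {
    awk -F': ' '
        /^Currently Active Slave:/ { print "active=" $2 }
        /^Slave Interface:/        { slave = $2 }
        # "MII Status" lines before the first slave block describe the bond itself
        /^MII Status:/ && slave != ""         { print slave " mii=" $2 }
        /^Link Failure Count:/ && slave != "" { print slave " failures=" $2 }
    ' "${1:-/proc/net/bonding/bond0}"
}
```

Running bond_summary before and after pulling a cable makes it easy to diff the two states.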
The following elements must be verified for proper active-backup operation:
- Update bonding options in ifcfg-bond0:
BONDING_OPTS="mode=1 miimon=100 primary=eth0 fail_over_mac=1"
- Verify NetworkManager isn't interfering:
chkconfig NetworkManager off
service NetworkManager stop
For HP ProCurve switches, these settings are recommended:
interface 1
no lacp
spanning-tree portfast
!
interface 2
no lacp
spanning-tree portfast
To monitor bond transitions in real-time:
watch -n 0.5 "cat /proc/net/bonding/bond0 | grep -e 'Active' -e 'MII' -e 'Slave'"
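watch shows only the current state, so a transition that happens between glances leaves no record. The polling loop sketched below prints a timestamped line whenever the active slave changes. It assumes the status file uses the "Currently Active Slave:" label shown earlier; the function name and its arguments are hypothetical.

```shell
# Print a timestamped line each time the active slave changes.
# Usage: log_active_slave [FILE] [INTERVAL_SECONDS] [MAX_POLLS]
# MAX_POLLS=0 (the default) polls until interrupted.
log_active_slave() {
    file=${1:-/proc/net/bonding/bond0} interval=${2:-1} polls=${3:-0}
    prev= n=0
    while [ "$polls" -eq 0 ] || [ "$n" -lt "$polls" ]; do
        cur=$(sed -n 's/^Currently Active Slave: //p' "$file")
        if [ "$cur" != "$prev" ]; then
            echo "$(date '+%H:%M:%S') active slave: ${cur:-none}"
            prev=$cur
        fi
        n=$((n + 1))
        sleep "$interval"
    done
}
```

Redirecting the output to a file during a cable-pull test gives a simple timeline to compare against /var/log/messages.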
Force a manual failover for testing:
ifdown eth0
sleep 5
ifup eth0
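The ifdown/ifup sequence tests an administrative down rather than a plain switch of the active slave. The bonding driver also exposes a writable active_slave attribute in sysfs, so a gentler test is to ask the driver to change slaves directly. The wrapper below is a sketch: the function name and the optional SYSFS_ROOT argument are assumptions added here so the function can be tried against a scratch directory first.

```shell
# Ask the bonding driver to switch the active slave without taking a link down.
# Usage: set_active_slave BOND SLAVE [SYSFS_ROOT]
# SYSFS_ROOT defaults to /sys/class/net; the override exists only for dry runs.
set_active_slave() {
    bond=$1 slave=$2 root=${3:-/sys/class/net}
    attr="$root/$bond/bonding/active_slave"
    [ -w "$attr" ] || { echo "cannot write $attr" >&2; return 1; }
    echo "$slave" > "$attr"
}
```

For example, set_active_slave bond0 eth1 run as root should make eth1 active if its link is up; this does not replace the cable-pull test, which also exercises link detection.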
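To set module-wide defaults, add these parameters to /etc/modprobe.d/bonding.conf; downdelay and updelay damp link flapping. Red Hat's RHEL 6 documentation recommends keeping per-bond options in BONDING_OPTS instead, so treat this step as optional: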
options bonding max_bonds=2 miimon=100 downdelay=200 updelay=200
Here's a verified working configuration for RHEL 6.4:
# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
IPADDR=192.168.11.222
NETMASK=255.255.255.0
GATEWAY=192.168.11.1
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
BONDING_OPTS="mode=1 miimon=100 primary=eth0 fail_over_mac=1 use_carrier=0"
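The bond0 file above only takes effect if each slave has its own ifcfg file pointing at the bond. The helper below is a minimal sketch of those files using the standard MASTER/SLAVE keys. The function name is made up for illustration, the target directory is a parameter so the files can be generated in a scratch directory and reviewed first, and HWADDR lines are deliberately left out.

```shell
# Write a minimal slave ifcfg file for DEV enslaved to BOND into DIR.
# Usage: write_slave_ifcfg DIR DEV BOND
write_slave_ifcfg() {
    cat > "$1/ifcfg-$2" <<EOF
DEVICE=$2
MASTER=$3
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
NM_CONTROLLED=no
EOF
}
```

For instance, write_slave_ifcfg /tmp/ifcfg-test eth0 bond0 followed by the same call for eth1 produces both slave files for inspection before they are copied to /etc/sysconfig/network-scripts.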
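After making changes, stop networking, reload the bonding module so it starts from a clean state, and bring networking back up (run this from the console, not over the bonded link):
service network stop
rmmod bonding
modprobe bonding
service network start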
When working with NIC bonding in RHEL 6.4 (kernel-2.6.32-358.el6), the active-backup (mode=1) configuration appears to initialize correctly but fails to perform failover when the primary interface loses connectivity. The system shows all bonding components as operational through standard diagnostic commands:
# Check bond status
cat /proc/net/bonding/bond0
# Output should show:
Ethernet Channel Bonding Driver: v3.6.0
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
The key indicators of a properly functioning active-backup bond are:
- Automatic promotion of the backup NIC when the primary fails
- Gratuitous ARP announcements that update the MAC address mapping on the switch and neighboring hosts
- Proper carrier detection through MII/ethtool
To verify the actual failover behavior, run these diagnostic commands while unplugging the primary NIC:
# Monitor bond events in real-time
tail -f /var/log/messages | grep bond
# Check active slave changes
watch -n 1 'grep "Active Slave" /proc/net/bonding/bond0'
# Verify ARP updates (run from another host)
arp -a | grep bond0-ip
From experience with Broadcom BCM5708 NICs on HP hardware, several factors could disrupt failover:
Network Manager Interference
Even with NM_CONTROLLED=no in the ifcfg files, NetworkManager may still interfere. Disable it completely:
service NetworkManager stop
chkconfig NetworkManager off
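To see which interface files still leave a device under NetworkManager's control, a quick scan of the ifcfg directory helps. The sketch below lists every ifcfg file that does not explicitly set NM_CONTROLLED=no, on the assumption that a missing line is treated as yes. The function name is hypothetical.

```shell
# List ifcfg files in DIR that do not explicitly set NM_CONTROLLED=no.
# Usage: nm_managed_ifcfgs [DIR]
nm_managed_ifcfgs() {
    dir=${1:-/etc/sysconfig/network-scripts}
    for f in "$dir"/ifcfg-*; do
        [ -f "$f" ] || continue
        # Accept an optionally quoted value, case-insensitively
        grep -Eqi '^NM_CONTROLLED="?no"?[[:space:]]*$' "$f" || echo "$f"
    done
}
```

An empty result means every interface file, including ifcfg-bond0 and the slave files, opts out explicitly.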
Switch Port Configuration
Some switches require special port settings for bonding. Verify these ProCurve 1800-8G settings:
interface 1-2
spanning-tree disable
no lacp
exit
Driver-Specific Issues
The bnx2 driver may need specific parameters. Create /etc/modprobe.d/bnx2.conf, after confirming with modinfo -p bnx2 that your driver build accepts each parameter:
options bnx2 disable_msi=0 debug=0x1
Modify your bond0 configuration with these enhanced parameters:
# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
IPADDR=192.168.11.222
NETMASK=255.255.255.0
GATEWAY=192.168.11.1
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
BONDING_OPTS="mode=1 miimon=100 primary=eth0 fail_over_mac=1 updelay=2000 downdelay=2000 use_carrier=1"
Key parameters explained:
- fail_over_mac=1: Ensure MAC address changes during failover
- updelay/downdelay=2000: Give switches time to update MAC tables
- use_carrier=1: Better link detection with Broadcom NICs
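One detail worth checking when tuning these values: according to the kernel bonding documentation, updelay and downdelay only apply to the miimon link monitor and are rounded down to a multiple of miimon. The following sketch checks a BONDING_OPTS string for that; the function name and messages are illustrative only.

```shell
# Warn when updelay/downdelay in a BONDING_OPTS string are not multiples of miimon.
# Usage: check_bond_delays "mode=1 miimon=100 updelay=2000 downdelay=2000"
check_bond_delays() {
    miimon=0 rc=0
    for kv in $1; do
        case $kv in miimon=*) miimon=${kv#miimon=} ;; esac
    done
    [ "$miimon" -gt 0 ] || { echo "miimon is unset or 0; delays have no effect"; return 1; }
    for kv in $1; do
        case $kv in
            updelay=*|downdelay=*)
                val=${kv#*=}
                if [ $((val % miimon)) -ne 0 ]; then
                    echo "${kv%%=*}=$val is not a multiple of miimon=$miimon"
                    rc=1
                fi ;;
        esac
    done
    return $rc
}
```

With the options shown above (miimon=100, updelay=2000, downdelay=2000) the function prints nothing and returns 0.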
After implementing these changes, test failover with this procedure:
# Start continuous ping test
ping -I bond0 192.168.11.1
# In another terminal, monitor bond status
watch -n 0.5 'cat /proc/net/bonding/bond0 | grep -E "Active|MII"'
# Physically disconnect eth0 cable
# Should observe:
# 1. Brief ping interruption (1-2 packets)
# 2. Active slave changes to eth1 in watch output
# 3. Ping resumes automatically
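To put a number on the "brief ping interruption", the ping output can be saved and scanned for gaps in the sequence numbers. This is a rough sketch: it assumes ping prints an icmp_seq=N field on each reply line, counts only gaps between received replies (not losses after the last reply), and ignores sequence wrap-around. The function name is made up for this example.

```shell
# Count missing icmp_seq values in saved ping output.
# Usage: ping -I bond0 192.168.11.1 | tee ping.log   (stop with Ctrl-C after the test)
#        ping_gaps ping.log
ping_gaps() {
    awk '
        match($0, /icmp_seq=[0-9]+/) {
            # "icmp_seq=" is 9 characters; the digits follow it
            seq = substr($0, RSTART + 9, RLENGTH - 9) + 0
            if (seen && seq > last + 1) lost += seq - last - 1
            last = seq; seen = 1
        }
        END { print "lost=" lost + 0 }
    ' "$1"
}
```

A result of lost=1 or lost=2 at a one-second ping interval would match the expectation listed above; much larger values suggest the delays or the switch are holding traffic back.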
For production environments, verify these additional components:
# Check kernel bonding support
grep BONDING /boot/config-$(uname -r)
# Verify module loading order
lsmod | grep -E 'bnx2|bonding'
# Ensure proper initramfs inclusion
dracut -f -v