Bonding NIC - no Internet access with bond


fishface
01-22-2008, 10:32 AM
Using SLES10 and sysconfig

I've configured bonding of 2 identical NICs in balance-alb mode and it works well for the most part. If I pull the plug on one NIC the other continues, and vice versa, and each one starts up again when its cable is re-inserted.

However, I have no Internet access, even though all local connections to other file servers work perfectly and I can SSH etc. (Funnily enough, the machine in question will not need Internet access, but I'm curious!)

I can ping the default gateway, and I tried adding a "GATEWAY" option, but it still doesn't work. What have I missed?

Here is my ifcfg-bond0 config file

BOOTPROTO="static"
IPADDR="192.168.199.80"
NETMASK="255.255.255.0"
NETWORK="192.168.199.0"
GATEWAY="192.168.199.1"
REMOTE_IPADDR=""
STARTMODE="onboot"
BONDING_MASTER="yes"
BONDING_MODULE_OPTS="mode=balance-alb miimon=100"
BONDING_SLAVE0="bus-pci-0000:00:0d.0"
BONDING_SLAVE1="bus-pci-0000:00:08.0"

And here is the bond0 status from /proc/net/bonding

Ethernet Channel Bonding Driver: v3.0.1 (January 9, 2006)

Bonding Mode: adaptive load balancing
Primary Slave: None <----------------- Should this be eth0 or something?
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 2
Permanent HW addr: 00:03:47:6a:c6:80

Slave Interface: eth0
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:d0:b7:4c:bb:2c

The failure count is where I've been testing it by unplugging the network cable - I guess I'm correct in assuming this?
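For anyone wanting to watch those counters without reading the file by eye, here is a minimal sketch that parses the status text shown above. The sample is copied from this post; the function name `parse_bond_status` is made up for illustration, and on a live machine the text would normally come from /proc/net/bonding/bond0. As far as I know the "Link Failure Count" goes up each time the driver sees that slave's link drop, which fits the cable-pulling tests described.

```python
# Sample copied from the status output in this post (annotation removed).
SAMPLE = """\
Ethernet Channel Bonding Driver: v3.0.1 (January 9, 2006)

Bonding Mode: adaptive load balancing
Primary Slave: None
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 2
Permanent HW addr: 00:03:47:6a:c6:80

Slave Interface: eth0
MII Status: up
Link Failure Count: 1
Permanent HW addr: 00:d0:b7:4c:bb:2c
"""


def parse_bond_status(text):
    """Split bonding status text into bond-level and per-slave settings.

    Hypothetical helper: returns (bond, slaves), where bond is a dict of the
    lines before the first "Slave Interface" and slaves maps each slave name
    to a dict of its own lines.
    """
    bond, slaves, current = {}, {}, None
    for line in text.splitlines():
        if ":" not in line:
            continue  # blank lines
        # Split on the first colon only, so MAC addresses stay intact.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = slaves.setdefault(value, {})
        elif current is not None:
            current[key] = value
        else:
            bond[key] = value
    return bond, slaves


bond, slaves = parse_bond_status(SAMPLE)
for name, info in slaves.items():
    print(name, info["MII Status"], "failures:", info["Link Failure Count"])
```

To use it on a real system, read the file with `open("/proc/net/bonding/bond0").read()` and pass the text in.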

happybunny
01-22-2008, 01:49 PM
Have you loaded the bonding module? I'm not sure how it's done in SuSE, but that driver needs to be loaded.

fishface
01-22-2008, 06:06 PM
I ran the following and, from what I've read, it gave the correct output
$ rpm -qf /sbin/ifup
$ grep ifenslave /sbin/ifup



....and it does work, as I said, pull the plug on one of the NICs and it all still works, bond0 MII reports that connection is down etc etc

bwkaz
01-22-2008, 10:03 PM
Please define "I have no Internet access". You can ping your default gateway: great! Can you ping Google? Can you ping one of Google's IP addresses? (Say, 64.233.167.104 for instance.) Can you telnet to port 80 at www.google.com -- and if not, what about one of Google's IP addresses?

;)
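The questions above boil down to two checks: can the machine reach an outside IP address at all, and can it turn a name into an address. A rough sketch of that decision follows. The function names are invented for illustration, the probes use plain TCP connects rather than ping, and the IP address is just the one quoted in this post, which may no longer answer.

```python
import socket


def can_connect(host, port=80, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def can_resolve(name):
    """Return True if the resolver can map name to an address."""
    try:
        socket.gethostbyname(name)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def diagnose(ip_reachable, name_resolves):
    """Turn the two probe results into a hint about where to look."""
    if not ip_reachable:
        return "cannot reach outside IPs: check routing / default gateway"
    if not name_resolves:
        return "IPs work but names do not: check DNS (e.g. /etc/resolv.conf)"
    return "routing and DNS both look fine"


# Example use on a live machine (commented out so nothing touches the
# network when the file is merely loaded):
# print(diagnose(can_connect("64.233.167.104"), can_resolve("www.google.com")))
```

Keeping `diagnose` separate from the probes means the logic can be checked without a network connection.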

fishface
01-23-2008, 05:45 AM
Ok - good point.

When using Firefox, if I type an address in, it states it is unable to connect, and the same goes for searches entered via the Google Search Toolbar. I also cannot ping 64.233.167.104; for that and the other IP addresses I have tried, it reports "Network is unreachable".

There is no Firewall running, another computer on the same router can do all of the above.

I can ping computers on the local network and do host lookups, they work as expected.

I've missed something but just can't find it......
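A side note on "Network is unreachable": that message usually means the kernel found no route for the destination, so one thing worth ruling out is a missing default route. The sketch below reads text in the format of /proc/net/route and reports the default gateway if there is one. The sample table is hypothetical (built from the addresses in this thread, not captured from the machine), and the helper names are made up. As far as I know, SUSE normally takes the default route from /etc/sysconfig/network/routes rather than from a GATEWAY line in the ifcfg file, so that may be worth a look as well.

```python
import socket
import struct

# Hypothetical /proc/net/route contents: only the local 192.168.199.0/24
# network is listed, with no default route.
SAMPLE_ROUTE = (
    "Iface\tDestination\tGateway\tFlags\tRefCnt\tUse\tMetric\tMask\tMTU\tWindow\tIRTT\n"
    "bond0\t00C7A8C0\t00000000\t0001\t0\t0\t0\t00FFFFFF\t0\t0\t0\n"
)

RTF_GATEWAY = 0x2  # flag bit set on routes that go via a gateway


def hex_to_ip(value):
    """Convert the little-endian hex form used in /proc/net/route."""
    return socket.inet_ntoa(struct.pack("<L", int(value, 16)))


def default_gateway(route_text):
    """Return (interface, gateway_ip) for the default route, or None."""
    for line in route_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 8:
            continue
        iface, dest, gateway, mask = fields[0], fields[1], fields[2], fields[7]
        flags = int(fields[3], 16)
        if dest == "00000000" and mask == "00000000" and flags & RTF_GATEWAY:
            return iface, hex_to_ip(gateway)
    return None


print(default_gateway(SAMPLE_ROUTE))
```

On a live system the text would come from `open("/proc/net/route").read()`.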

fishface
01-23-2008, 10:28 AM
I've noticed (when using "ifconfig") that the HWaddr values are different for eth0 and eth1, whereas in all the examples I've seen eth0 and eth1 show an identical HWaddr.

My ifconfig:

bond0 Link encap:Ethernet HWaddr 00:03:47:6a:c6:80
inet addr:192.168.199.80 Bcast:192.168.199.255 Mask:255.255.255.0
inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:256418531 errors:36 dropped:1123 overruns:0 frame:19
TX packets:3790992862 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:3425938866 (3267.2 Mb) TX bytes:2295899125 (2189.5 Mb)

eth0 Link encap:Ethernet HWaddr 00:d0:b7:4c:bb:2c <------ different

UP BROADCAST NOARP SLAVE MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
Interrupt:129

eth1 Link encap:Ethernet HWaddr 00:03:47:6a:c6:80 <---- different
inet6 addr: fe80::204:23ff:feba:5089/64 Scope:Link
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:256418531 errors:36 dropped:1123 overruns:0 frame:19
TX packets:3790992862 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3425938866 (3267.2 Mb) TX bytes:2295899125 (2189.5 Mb)
Base address:0xec80 Memory:febe0000-fec00000


Example:

bond0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:7224794 errors:0 dropped:0 overruns:0 frame:0
TX packets:3286647 errors:1 dropped:0 overruns:1 carrier:0
collisions:0 txqueuelen:0

eth0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4 <---- this is the same as
inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:3573025 errors:0 dropped:0 overruns:0 frame:0
TX packets:1643167 errors:1 dropped:0 overruns:1 carrier:0
collisions:0 txqueuelen:100
Interrupt:10 Base address:0x1080

eth1 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4 <---as this
inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:3651769 errors:0 dropped:0 overruns:0 frame:0
TX packets:1643480 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
Interrupt:9 Base address:0x1400

I've screwed up somewhere and just can't figure out where, yet it all seems to work perfectly.

Sixstrings
01-23-2008, 10:59 AM
You are doing your bonding in a far different way than I do it.

bond0 file

DEVICE=bond0
BOOTPROTO=none
TYPE=Ethernet
ONBOOT=yes
USERCTL=no
IPADDR=X.X.X.X
NETMASK=255.255.255.128
GATEWAY=X.X.X.X

Then the files for eth0, eth1, eth2, etc.:

DEVICE=eth0
BOOTPROTO=none
TYPE=Ethernet
ONBOOT=yes
USERCTL=no
MASTER=bond0
SLAVE=yes


/etc/modprobe.conf

alias bond0 bonding
options bond0 mode=1 miimon=100
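Note that the two configs in this thread do not select the same mode: `mode=1` above is active-backup, while the first post uses balance-alb (mode 6). A small sketch for translating between the numeric and named forms of the bonding driver's `mode` option; the helper names are invented for illustration.

```python
# Numeric mode values and their names as used by the Linux bonding driver.
BONDING_MODES = {
    0: "balance-rr",
    1: "active-backup",
    2: "balance-xor",
    3: "broadcast",
    4: "802.3ad",
    5: "balance-tlb",
    6: "balance-alb",
}


def parse_module_opts(opts):
    """Split an options string such as 'mode=1 miimon=100' into a dict."""
    return dict(item.split("=", 1) for item in opts.split())


def mode_name(value):
    """Accept either a number or a name and return the mode name."""
    value = str(value).strip()
    if value.isdigit():
        return BONDING_MODES[int(value)]
    if value in BONDING_MODES.values():
        return value
    raise ValueError("unknown bonding mode: %r" % value)


print(mode_name(parse_module_opts("mode=1 miimon=100")["mode"]))
print(mode_name(parse_module_opts("mode=balance-alb miimon=100")["mode"]))
```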

fishface
01-23-2008, 11:17 AM
Solved the Internet issue - it was DNS.

What distro are you using?

There are many differing ways depending on the distro.

Sixstrings
01-23-2008, 11:47 AM
I use this method on several distros but the main ones are SuSe EL 9, 10 and RHEL 3,4,5.

fishface
01-23-2008, 11:57 AM
I think essentially the same as mine, I've just removed certain options as per recommendations.

Sample of ifcfg-eth

BOOTPROTO='none'
NAME='Intel EtherExpress PRO/100+ Management Adapter'
STARTMODE='off'
UNIQUE='JNkJ.HVgIlgOrmpC'
_nm_name=+'bus-pci-0000:00:0d.0'

The note explains why you need UNIQUE='JNkJ.HVgIlgOrmpC' and _nm_name=+'bus-pci-0000:00:0d.0':

Quote: "* Supply one BONDING_SLAVEn='slave_device' for each slave. The 'slave_device' is either an interface name, e.g., 'eth0', or a device specifier for the network device. The interface name is easier to find, but the ethn names are subject to change at boot time if a device early in the sequence has failed. The device specifiers 'bus-pci-0000:06:08.1' specify the physical network device, and will not change unless the device's bus location changes (for example, it is physically moved from one PCI slot to another)."

See here for further info: https://secure-support.novell.com/KanisaPlatform/Publishing/133/3929220_f.SAL_Public.html
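To make the BONDING_SLAVEn rule from the quote concrete, here is a minimal sketch that reads an ifcfg-style file and lists each slave entry along with which of the two forms it uses. The sample is the ifcfg-bond0 from the first post; `parse_ifcfg` and `bonding_slaves` are hypothetical helpers, and the parsing is deliberately naive (simple KEY=value lines with optional quotes, no shell expansion).

```python
import re

# ifcfg-bond0 as posted at the top of this thread.
SAMPLE_IFCFG = '''\
BOOTPROTO="static"
IPADDR="192.168.199.80"
NETMASK="255.255.255.0"
NETWORK="192.168.199.0"
GATEWAY="192.168.199.1"
REMOTE_IPADDR=""
STARTMODE="onboot"
BONDING_MASTER="yes"
BONDING_MODULE_OPTS="mode=balance-alb miimon=100"
BONDING_SLAVE0="bus-pci-0000:00:0d.0"
BONDING_SLAVE1="bus-pci-0000:00:08.0"
'''


def parse_ifcfg(text):
    """Parse simple KEY=value lines, dropping surrounding quotes."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


def bonding_slaves(settings):
    """Return sorted (index, value, kind) tuples for BONDING_SLAVEn keys."""
    slaves = []
    for key, value in settings.items():
        match = re.fullmatch(r"BONDING_SLAVE(\d+)", key)
        if match:
            if value.startswith("bus-pci-"):
                kind = "device specifier"
            else:
                kind = "interface name"
            slaves.append((int(match.group(1)), value, kind))
    return sorted(slaves)


for index, value, kind in bonding_slaves(parse_ifcfg(SAMPLE_IFCFG)):
    print(index, value, kind)
```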

fishface
01-23-2008, 12:00 PM
All I can say is that it all seems to work. I can pull network cables out (unless I pull both out, obviously!) and the status changes while traffic still flows; when I plug them back in they become active again, and all works fine after multiple reboots.

Sorry about the additional post

The only thing I've noticed, and this might be normal and I'm expecting too much, is that when I plug the cable back in the network connections take around 3 - 4 secs to re-negotiate. I was pinging from another machine to test.

Sixstrings - does your config work on SuSE? The notes I have state that you do not need to add anything to /etc/modprobe.conf (actually modprobe.conf.local, strictly speaking), so I don't understand how your config above will work with SuSE. It seems it should be OK with RHEL.

Update - I've just read the notes and having separate HWaddr (MAC) for each device, when issuing the "ifconfig" command, is correct for balance-alb mode, so everything works just as it should and I don't have any problems.
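A quick way to make the same comparison without squinting at ifconfig output is sketched below. The sample lines are trimmed from the ifconfig output posted earlier in the thread; the helper names are invented for illustration, and the regular expression assumes the old net-tools "Link encap:Ethernet HWaddr ..." layout.

```python
import re

# Trimmed from the ifconfig output posted earlier in this thread.
SAMPLE_IFCONFIG = """\
bond0 Link encap:Ethernet HWaddr 00:03:47:6a:c6:80
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
eth0 Link encap:Ethernet HWaddr 00:d0:b7:4c:bb:2c
UP BROADCAST NOARP SLAVE MULTICAST MTU:1500 Metric:1
eth1 Link encap:Ethernet HWaddr 00:03:47:6a:c6:80
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
"""

HWADDR_RE = re.compile(
    r"^(\S+)\s+Link encap:Ethernet\s+HWaddr\s+([0-9A-Fa-f:]{17})",
    re.MULTILINE,
)


def hwaddrs(text):
    """Map interface name to its (lower-cased) HWaddr."""
    return {name: mac.lower() for name, mac in HWADDR_RE.findall(text)}


def slaves_share_mac(addrs, master="bond0"):
    """True if there are several slaves and they all show one HWaddr."""
    slave_macs = [mac for name, mac in addrs.items() if name != master]
    return len(slave_macs) > 1 and len(set(slave_macs)) == 1


addrs = hwaddrs(SAMPLE_IFCONFIG)
print(addrs)
print("slaves share one MAC:", slaves_share_mac(addrs))
```

For the balance-alb setup in this thread the slaves showing different addresses is the expected result, as the update above says.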