Using IPMP with link based failure detection
Having one IP address (whether public or private, non-routable) per data link, plus the separate address(es) for the application(s), turns out to be a lot of addresses to allocate and administer. And since the default of five probes spaced two seconds apart means a failure takes at least ten seconds to be detected, something more was needed.
So in the Solaris 9 timeframe the ability to also do link based failure detection was delivered. It requires specific NICs whose drivers can notify the system that a link has failed. The Introduction to IPMP in the Solaris 10 System Administration Guide: IP Services lists the NICs that support link state notification. Solaris 10 supports configuring IPMP with only link based failure detection.
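As an aside on the ten-second figure for probes: as far as I know it corresponds to the FAILURE_DETECTION_TIME setting that in.mpathd reads from /etc/default/mpathd, stored in milliseconds (10000 by default). Below is a minimal sketch for checking what a system is set to. The function name fdt_seconds is made up for illustration, and it only reads the file. It is written for a POSIX awk; on Solaris 10 that may mean nawk or /usr/xpg4/bin/awk.

```sh
#!/bin/sh
# Hypothetical helper: print in.mpathd's failure detection time in seconds.
# $1 (optional): path to the defaults file; assumed to be /etc/default/mpathd.
fdt_seconds() {
    # The value is stored in milliseconds, e.g. FAILURE_DETECTION_TIME=10000
    awk -F= '$1 == "FAILURE_DETECTION_TIME" { print $2 / 1000 }' \
        "${1:-/etc/default/mpathd}"
}
```

With the default setting, calling fdt_seconds with no argument should print 10. Link based detection does not depend on this timer, which is why the failover in the logs further down happens within the same second as the link down notice.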
global# more /etc/hostname.bge[12]
::::::::::::::
/etc/hostname.bge1
::::::::::::::
10.1.14.140/26 group ipmp1 up
::::::::::::::
/etc/hostname.bge2
::::::::::::::
group ipmp1 standby up
On system boot, there will be an indication on the console that since no test addresses are defined, probe-based failure detection is disabled.
Apr 10 10:57:20 in.mpathd[168]: No test address configured on interface bge2; disabling probe-based failure detection on it
Apr 10 10:57:20 in.mpathd[168]: No test address configured on interface bge1; disabling probe-based failure detection on it
Looking at the interfaces configured,
global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 129.154.53.125 netmask ffffff00 broadcast 129.154.53.255
ether 0:3:ba:e3:42:8b
bge1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 10.1.14.140 netmask ffffffc0 broadcast 10.1.14.191
groupname ipmp1
ether 0:3:ba:e3:42:8c
bge1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
bge2: flags=69000842<BROADCAST,RUNNING,MULTICAST,IPv4,NOFAILOVER,STANDBY,INACTIVE> mtu 0 index 4
inet 0.0.0.0 netmask 0
groupname ipmp1
ether 0:3:ba:e3:42:8d
you will notice that two of the three interfaces in the group (bge1:1 and bge2) have no address (0.0.0.0). The data address is on the physical interface bge1, while the standby bge2 carries only 0.0.0.0. On the failure of bge1,
Apr 10 14:34:53 global bge: NOTICE: bge1: link down
Apr 10 14:34:53 global in.mpathd[168]: The link has gone down on bge1
Apr 10 14:34:53 global in.mpathd[168]: NIC failure detected on bge1 of group ipmp1
Apr 10 14:34:53 global in.mpathd[168]: Successfully failed over from NIC bge1 to NIC bge2
global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 129.154.53.125 netmask ffffff00 broadcast 129.154.53.255
ether 0:3:ba:e3:42:8b
bge1: flags=19000802<BROADCAST,MULTICAST,IPv4,NOFAILOVER,FAILED> mtu 0 index 3
inet 0.0.0.0 netmask 0
groupname ipmp1
ether 0:3:ba:e3:42:8c
bge2: flags=21000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu 1500 index 4
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname ipmp1
ether 0:3:ba:e3:42:8d
bge2:1: flags=21000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu 1500 index 4
inet 10.1.14.140 netmask ffffffc0 broadcast 10.1.14.191
the data address is migrated onto bge2:1. I find this a little confusing, but I don't know of any way around it on Solaris 10. The IPMP Re-architecture makes this a lot easier!
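Since the address moves between a physical and a logical interface, it can help to ask directly which interface holds it at the moment. Here is a minimal sketch that does so by parsing ifconfig output. The function name addr_owner is hypothetical, and the script assumes a POSIX awk (nawk or /usr/xpg4/bin/awk on Solaris 10).

```sh
#!/bin/sh
# Hypothetical helper: report which interface currently carries an address.
# Reads `ifconfig -a4` style output on stdin; $1 is the IPv4 address to find.
addr_owner() {
    awk -v addr="$1" '
        # Interface header, e.g. "bge2:1: flags=21000843<...> mtu 1500 index 4".
        # Strip the trailing colon and remember the name.
        $2 ~ /^flags=/ { ifname = substr($1, 1, length($1) - 1); next }
        # Address line belonging to the most recent header.
        $1 == "inet" && $2 == addr { print ifname }
    '
}
```

Run as ifconfig -a4 | addr_owner 10.1.14.140, this prints bge1 for the first listing above and bge2:1 for the listing taken after the failure.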
Using Link-based IPMP with non-global zones
Configuring a shared IP Instance non-global zone to use IPMP managed in the global zone is very easy. The IPMP configuration itself is simple: interface bge1 is active, and bge2 is in standby mode.
global# more /etc/hostname.bge[12]
::::::::::::::
/etc/hostname.bge1
::::::::::::::
group ipmp1 up
::::::::::::::
/etc/hostname.bge2
::::::::::::::
group ipmp1 standby up
My zone configuration is:
global# zonecfg -z zone1 info
zonename: zone1
zonepath: /zones/zone1
brand: native
autoboot: false
bootargs:
pool:
limitpriv:
scheduling-class:
ip-type: shared
inherit-pkg-dir:
        dir: /lib
inherit-pkg-dir:
        dir: /platform
inherit-pkg-dir:
        dir: /sbin
inherit-pkg-dir:
        dir: /usr
net:
        address: 10.1.14.141/26
        physical: bge1
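For reference, the net resource in that listing can be created with zonecfg's add net subcommand. The sketch below only generates the subcommands as text so they can be reviewed before being applied. The function name zone_net_cmds and the file name used afterwards are arbitrary choices for illustration.

```sh
#!/bin/sh
# Hypothetical helper: emit zonecfg subcommands that add one net resource.
# $1: address with prefix length; $2: physical interface in the IPMP group.
zone_net_cmds() {
    cat <<EOF
add net
set address=$1
set physical=$2
end
EOF
}
```

One way to use it would be zone_net_cmds 10.1.14.141/26 bge1 > /tmp/zone1-net.cmd followed by zonecfg -z zone1 -f /tmp/zone1-net.cmd. Note that physical names the active interface bge1, not the standby.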
Prior to booting, the network configuration is:
global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
zone zone1
inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 129.154.53.125 netmask ffffff00 broadcast 129.154.53.255
ether 0:3:ba:e3:42:8b
bge1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname ipmp1
ether 0:3:ba:e3:42:8c
bge2: flags=21000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu 1500 index 4
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname ipmp1
ether 0:3:ba:e3:42:8d
After booting, the network looks like this:
global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
zone zone1
inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 129.154.53.125 netmask ffffff00 broadcast 129.154.53.255
ether 0:3:ba:e3:42:8b
bge1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname ipmp1
ether 0:3:ba:e3:42:8c
bge1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
zone zone1
inet 10.1.14.141 netmask ffffffc0 broadcast 10.1.14.191
bge2: flags=21000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,STANDBY> mtu 1500 index 4
inet 0.0.0.0 netmask ff000000 broadcast 0.255.255.255
groupname ipmp1
ether 0:3:ba:e3:42:8d
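With several zones sharing the group, the listing gets long. Below is a small sketch that pulls out just the zone-owned addresses from the global zone's ifconfig output, relying on the "zone" line that follows each logical interface header. The function name zone_addrs is hypothetical, and a POSIX awk is assumed (nawk or /usr/xpg4/bin/awk on Solaris 10).

```sh
#!/bin/sh
# Hypothetical helper: list "zone interface address" for zone-owned addresses.
# Reads `ifconfig -a4` style output from the global zone on stdin.
zone_addrs() {
    awk '
        # New interface header: remember its name and forget the previous zone.
        $2 ~ /^flags=/ { ifname = substr($1, 1, length($1) - 1); zone = ""; next }
        # A "zone <name>" line marks a logical interface assigned to a zone.
        $1 == "zone" { zone = $2; next }
        # Print only addresses that belong to a zone-tagged interface.
        $1 == "inet" && zone != "" { print zone, ifname, $2 }
    '
}
```

For the listing above, ifconfig -a4 | zone_addrs prints two lines: zone1 lo0:1 127.0.0.1 and zone1 bge1:1 10.1.14.141.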
So this is a simple case for the use of IPMP, without the need for test addresses! Other IPMP configurations, such as more than two data links, or active-active, are also supported with link based failure detection. The more links involved, the more test addresses are saved with link based failure detection. Since writing this entry I was involved in a customer configuration where this is saving several hundred IP addresses and their management (such as avoiding duplicate addresses). That customer is willing to forgo the benefit of probes testing past the local switch port.
Steffen
Does link based failure detection IPMP support Active/Active?
Posted by Herman on April 26, 2009 at 06:46 PM EDT #
Yes, active-active works whether using probe or link based failure detection.
Posted by Steffen Weiberle on April 27, 2009 at 07:37 AM EDT #
Does this link based IPMP work inside the openstorage (amber road) aktty -> shell? It's a underlying Solaris NV / OSOL mixture, this should be possible..
Posted by Tom Stocker on May 27, 2009 at 08:24 AM EDT #
Hi Tom, this is a core feature of IPMP in Solaris 10 and in Nevada (SX-CE) and OpenSolaris. As I understand it, the GUI for the Sun Storage 7000 Unified Storage System only supports configuring with probe test addresses. If you do anything via the shell, I don't know the effect it has on the GUI. Note that with probe-based failure detection, the interface will also fail over if the link fails. So you get quick failover and failback, while also allowing probes to test beyond the local switch port.
Steffen
Posted by Steffen Weiberle on May 27, 2009 at 03:57 PM EDT #
I am also trying to get IPMP working on an amber road (7110). It is directly attached to a v440 and a v240. No switches are involved, so probe based will not work for me. I have done this with Solaris boxes with link based with no problem; however, like Steffen says, it is not a feature of the GUI nor the CLI as far as I can tell.
Is there any known SUN supported way to accomplish this?
Thanks
-Pete
Posted by Pete Sorensen on July 01, 2009 at 11:59 AM EDT #