[Update to IPMP testing 2009.01.20]

[Minor update 2009.01.14]

When running Solaris Zones in a shared-IP configuration, a zone's network configuration is determined by how the zone is configured using zonecfg(1M) and by what the global zone's IP instance decides (routes, for example). This has caused trouble where zones are on different subnets, especially if the global zone is not on the subnet(s) the non-global zones are on. Exclusive IP Instances were delivered to help address these cases, but they require a data link per zone, and a system running a large number of zones may not have enough data links available.

With Solaris 10 10/08 (Update 6), an additional, optional network configuration parameter is available for shared-IP zones: the default router (defrouter).

Using the defrouter parameter, it is possible to set which router to use for traffic leaving the zone. In the global zone, default router entries are created the first time the zone is booted. Note that the entries are not deleted when the zone is halted.

For a zone that has it configured, the defrouter property looks like this.

global# zonecfg -z shared1 info net
net:
        address: 10.1.14.141/26
        physical: bge1
        defrouter: 10.1.14.129
And it looks like this if it is not set.
global# zonecfg -z shared1 info net
net:
        address: 10.1.14.141/26
        physical: bge1
        defrouter not specified
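
In every example in this post the default router is an address on the zone's own subnet. That relationship can be worked out from the address and prefix length before setting defrouter. What follows is a minimal Python sketch using only the standard ipaddress module; the helper name defrouter_on_link is hypothetical and not part of any Solaris tool, and the assumption that the router must be on the zone's subnet comes from the examples here rather than from documentation.

```python
import ipaddress


def defrouter_on_link(address, defrouter):
    """Return True if defrouter is inside the subnet implied by address.

    address uses the zonecfg form "a.b.c.d/prefix"; defrouter is a plain
    IPv4 address. Hypothetical helper, not part of any Solaris tool.
    """
    network = ipaddress.ip_interface(address).network
    return ipaddress.ip_address(defrouter) in network


if __name__ == "__main__":
    iface = ipaddress.ip_interface("10.1.14.141/26")
    print(iface.network)                    # 10.1.14.128/26
    print(iface.network.broadcast_address)  # 10.1.14.191
    print(defrouter_on_link("10.1.14.141/26", "10.1.14.129"))  # True
```

The /26 prefix corresponds to the netmask ffffffc0 and broadcast address 10.1.14.191 that ifconfig reports for this zone in the examples.
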
I have run a variety of configurations, and some things I observed are as follows. (Most of the configurations used a different interface for the global zone (bge0) than for the non-global zones (bge1 and bge2). IPMP is not used in these configurations; see the comment on that at the end.) The [#] markers refer to the examples in the outputs that follow.
  • A default route entry is created for the NIC [1] on which the zone is configured when the zone is booted. [2]
  • Entries are not deleted when a zone is halted. They persist until manually removed [3] or until the global zone is rebooted.
  • It is possible to have the same default router configured for multiple zones. [4]
  • It is possible to have the same default router listed on multiple interfaces. * [5]
  • It is possible to have multiple default routers on the same interface, even on different IP subnets. [6]
  • The interface used for outbound traffic is the one the zone is assigned to. [7]
  • It is sufficient to plumb the interface for the non-global zones in the global zone (thus it has 0.0.0.0 as its IP address in the global zone). [8]
  • The physical interface can be down in the global zone. [9]
  • If only one interface is used, and different subnets for the global and non-global zones are configured, routing works when setting defrouter [10] and does not work if it is not set.
The most interesting thing I noticed was that although two non-global zones may be on the same IP subnet, if they are configured on different interfaces, each zone's traffic leaves the system on the interface that zone is configured on. This is typically not the case when using shared IP while the global zone also has an IP address on that subnet.

* Note: Having two interfaces on the same IP subnet without configuring IP Multipathing (IPMP) may not be a supported configuration. I am looking for documentation that states this one way or another. [2009.01.14]

Examples

1. Single Zone, Single Interface--The Basics

Create a single non-global zone.
global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              139.164.63.215       UG        1          2 bge0
139.164.63.0         139.164.63.125       U         1          1 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0

global# zonecfg -z shared1 info net
net:
        address: 10.1.14.141/26
        physical: bge1
        defrouter: 10.1.14.129

global# zoneadm -z shared1 boot [2]

global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              139.164.63.215       UG        1          2 bge0
default              10.1.14.129          UG        1          0 bge1 [1]
139.164.63.0         139.164.63.125       U         1          1 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0

global# zoneadm -z shared1 halt

global# zoneadm list -v
  ID NAME             STATUS     PATH                           BRAND    IP
   0 global           running    /                              native   shared

global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              10.1.14.129          UG        1          0 bge1
default              139.164.63.215       UG        1          1 bge0
139.164.63.0         139.164.63.125       U         1          1 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0

global# route delete default 10.1.14.129 [3]
delete net default: gateway 10.1.14.129

global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              139.164.63.215       UG        1          1 bge0
139.164.63.0         139.164.63.125       U         1          1 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0
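
Because the entries outlive the zone, it can help to list the default routes the global zone currently holds and flag the ones that no expected gateway accounts for. This is a minimal Python sketch that parses netstat -rn text in the column layout shown above; the function names and the idea of passing in a set of expected gateways are assumptions for illustration, not an existing tool.

```python
def default_routes(netstat_output):
    """Return (gateway, interface) pairs for each default route."""
    routes = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Columns: Destination Gateway Flags Ref Use Interface
        if len(fields) >= 6 and fields[0] == "default":
            routes.append((fields[1], fields[5]))
    return routes


def leftover_routes(routes, expected_gateways):
    """Return the routes whose gateway is not in expected_gateways."""
    return [route for route in routes if route[0] not in expected_gateways]


# Routing table from the example above, after shared1 was halted.
SAMPLE = """\
default              10.1.14.129          UG        1          0 bge1
default              139.164.63.215       UG        1          1 bge0
139.164.63.0         139.164.63.125       U         1          1 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0
"""

if __name__ == "__main__":
    routes = default_routes(SAMPLE)
    # With no zone running, only the global zone's own router is expected.
    print(leftover_routes(routes, {"139.164.63.215"}))
    # [('10.1.14.129', 'bge1')]
```

Each pair reported this way corresponds to an entry that can be removed with route delete default and the gateway address, as in [3].
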

2. Multiple Interfaces, Same Default Router

Three zones, where two use bge1 and the third uses bge2. All use the same default router.
global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              139.164.63.215       UG        1          1 bge0
139.164.63.0         139.164.63.125       U         1          1 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0

global# zonecfg -z shared1 info net
net:
        address: 10.1.14.141/26
        physical: bge1
        defrouter: 10.1.14.129 [4]

global# zonecfg -z shared2 info net
net:
        address: 10.1.14.142/26
        physical: bge1
        defrouter: 10.1.14.129 [4]

global# zonecfg -z shared3 info net
net:
        address: 10.1.14.143/26
        physical: bge2
        defrouter: 10.1.14.129 [5]

global# zoneadm -z shared1 boot

global# zoneadm -z shared2 boot

global# zoneadm -z shared3 boot

global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              10.1.14.129          UG        1          0 bge1 [4]
default              139.164.63.215       UG        1          1 bge0
default              10.1.14.129          UG        1          2 bge2 [5]
139.164.63.0         139.164.63.125       U         1          1 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0

global# zoneadm list -v
  ID NAME             STATUS     PATH                           BRAND    IP
   0 global           running    /                              native   shared
   3 shared1          running    /zones/shared1                 native   shared
   4 shared2          running    /zones/shared2                 native   shared
   5 shared3          running    /zones/shared3                 native   shared

global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared1
        inet 127.0.0.1 netmask ff000000
lo0:2: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared2
        inet 127.0.0.1 netmask ff000000
lo0:3: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared3
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 139.164.63.125 netmask ffffff00 broadcast 139.164.63.255
        ether 0:3:ba:e3:42:8b
bge1: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask 0
        ether 0:3:ba:e3:42:8c
bge1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        zone shared1
        inet 10.1.14.141 netmask ffffffc0 broadcast 10.1.14.191
bge1:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        zone shared2
        inet 10.1.14.142 netmask ffffffc0 broadcast 10.1.14.191
bge2: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 0.0.0.0 netmask 0
        ether 0:3:ba:e3:42:8d
bge2:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        zone shared3
        inet 10.1.14.143 netmask ffffffc0 broadcast 10.1.14.191

3. Multiple Subnets

Add another zone, using bge2 and on a different subnet.
global# zonecfg -z shared4 info net
net:
        address: 192.168.16.144/24
        physical: bge2
        defrouter: 192.168.16.129

global# zoneadm -z shared4 boot

global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              10.1.14.129          UG        1          0 bge1
default              10.1.14.129          UG        1          4 bge2
default              139.164.63.215       UG        1          3 bge0
default              192.168.16.129       UG        1          0 bge2 [6]
139.164.63.0         139.164.63.125       U         1          4 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1

4. Interface Usage

From the global zone, I use zlogin to ping a remote system (on the same network as the global zone, 139.164.63.0) from each non-global zone, and then check which network interfaces are being used. [7]
global# zlogin shared1 ping 139.164.63.38
139.164.63.38 is alive

global# zlogin shared2 ping 139.164.63.38
139.164.63.38 is alive

global# zlogin shared3 ping 139.164.63.38
139.164.63.38 is alive

global# zlogin shared4 ping 139.164.63.38
139.164.63.38 is alive
This shows the pings originating from shared1 and shared2 going out on bge1.
global1# snoop -d bge1 icmp
Using device /dev/bge1 (promiscuous mode)
 10.1.14.141 -> 139.164.63.38 ICMP Echo request (ID: 4677 Sequence number: 0)
139.164.63.38 -> 10.1.14.141  ICMP Echo reply (ID: 4677 Sequence number: 0)
 10.1.14.142 -> 139.164.63.38 ICMP Echo request (ID: 4681 Sequence number: 0)
139.164.63.38 -> 10.1.14.142  ICMP Echo reply (ID: 4681 Sequence number: 0)
And this shows the pings originating from shared3 and shared4 going out on bge2.
global2# snoop -d bge2 icmp
Using device /dev/bge2 (promiscuous mode)
 10.1.14.143 -> 139.164.63.38 ICMP Echo request (ID: 4685 Sequence number: 0)
139.164.63.38 -> 10.1.14.143  ICMP Echo reply (ID: 4685 Sequence number: 0)
192.168.16.144 -> 139.164.63.38 ICMP Echo request (ID: 4689 Sequence number: 0)
139.164.63.38 -> 192.168.16.144 ICMP Echo reply (ID: 4689 Sequence number: 0)
Just to confirm where each zone is configured, here is the ifconfig output.
global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared1
        inet 127.0.0.1 netmask ff000000
lo0:2: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared2
        inet 127.0.0.1 netmask ff000000
lo0:3: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared3
        inet 127.0.0.1 netmask ff000000
lo0:4: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared4
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 139.164.63.125 netmask ffffff00 broadcast 139.164.63.255
        ether 0:3:ba:e3:42:8b
bge1: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3 [9]
        inet 0.0.0.0 netmask 0 [8]
        ether 0:3:ba:e3:42:8c
bge1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        zone shared1
        inet 10.1.14.141 netmask ffffffc0 broadcast 10.1.14.191
bge1:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        zone shared2
        inet 10.1.14.142 netmask ffffffc0 broadcast 10.1.14.191
bge2: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 0.0.0.0 netmask 0 [8]
        ether 0:3:ba:e3:42:8d
bge2:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        zone shared3
        inet 10.1.14.143 netmask ffffffc0 broadcast 10.1.14.191
bge2:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        zone shared4
        inet 192.168.16.144 netmask ffffff00 broadcast 192.168.16.255
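
The same zone-to-interface mapping can be pulled out of the ifconfig output programmatically, which makes it easier to compare against the interface seen in snoop. This is a rough Python sketch that assumes the ifconfig -a4 layout shown above (an unindented header per logical interface, followed by indented zone and inet lines); the function name zone_addresses and its skip parameter are hypothetical.

```python
def zone_addresses(ifconfig_output, skip=("lo0",)):
    """Map each zone name to a list of (physical interface, address) pairs."""
    zones = {}
    interface = None
    zone = None
    for line in ifconfig_output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # A header such as "bge1:1: flags=..." starts a new logical
            # interface; keep only the physical name before the first colon.
            interface = line.split()[0].rstrip(":").split(":")[0]
            zone = None
            continue
        fields = line.split()
        if fields[0] == "zone" and len(fields) >= 2:
            zone = fields[1]
        elif (fields[0] == "inet" and len(fields) >= 2
              and zone is not None and interface not in skip):
            zones.setdefault(zone, []).append((interface, fields[1]))
    return zones


# A few entries taken from the ifconfig output above.
SAMPLE = """\
bge1: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask 0
bge1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        zone shared1
        inet 10.1.14.141 netmask ffffffc0 broadcast 10.1.14.191
bge2:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        zone shared4
        inet 192.168.16.144 netmask ffffff00 broadcast 192.168.16.255
"""

if __name__ == "__main__":
    print(zone_addresses(SAMPLE))
    # {'shared1': [('bge1', '10.1.14.141')], 'shared4': [('bge2', '192.168.16.144')]}
```

Loopback entries are skipped by default, and addresses without a zone line (the global zone's own) are ignored.
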

5. Using a Single Interface

This example uses only bge0, with different subnets for the global and non-global zones. [10]

Before booting the zone.

global# netstat -nr

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              139.164.63.215       UG        1          2 bge0
139.164.63.0         139.164.63.125       U         1          2 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0

global# zonecfg -z shared17 info net
net:
        address: 192.168.17.147/24
        physical: bge0
        defrouter: 192.168.17.16

global# zoneadm -z shared17 boot
Once the zone is booted, netstat shows both default routes, and a ping from the zone works.
global# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              139.164.63.215       UG        1          2 bge0
default              192.168.17.16        UG        1          0 bge0
139.164.63.0         139.164.63.125       U         1          2 bge0
224.0.0.0            139.164.63.125       U         1          0 bge0
127.0.0.1            127.0.0.1            UH        1         42 lo0

global# ifconfig -a4
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        zone shared17
        inet 127.0.0.1 netmask ff000000
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 139.164.63.125 netmask ffffff00 broadcast 139.164.63.255
        ether 0:3:ba:e3:42:8b
bge0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        zone shared17
        inet 192.168.17.147 netmask ffffff00 broadcast 192.168.17.255

global# zlogin shared17 ping 139.164.63.38
139.164.63.38 is alive

IP Multipathing (IPMP)

I did some testing with IPMP using examples similar to those above. At this time the combination of IPMP and the defrouter configuration does not work. I have filed bug 6792116 to have this looked at.

[Updated 2009.01.20] After some additional testing, especially with test addresses and probe-based failure detection, I have seen IPMP work well only when zones are configured such that at least one zone is on each NIC in an IPMP group, including a standby NIC. For example, if you have two NICs, bge1 and bge2, at least one zone must be configured on bge1 and at least one on bge2. This is the case even when one of the NICs is in failed mode when the system or zone(s) boot. It turns out that the default route is added when the zone boots, and there is no later check for default route requirements as a zone is moved from one NIC to another by IPMP failover or failback. Thus, I would recommend not using defrouter and IPMP together until the combination is confirmed to work.
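
The "at least one zone on each NIC in the group" condition is easy to check against a planned layout. This is a small Python sketch, assuming the group membership and the zone-to-NIC assignment are supplied by hand; the function name is hypothetical, and the check only encodes the observation above, not a documented requirement.

```python
def nics_without_zone(ipmp_group_nics, zone_to_nic):
    """Return the NICs in the IPMP group that have no zone configured on them."""
    used = set(zone_to_nic.values())
    return [nic for nic in ipmp_group_nics if nic not in used]


if __name__ == "__main__":
    group = ["bge1", "bge2"]
    # Both zones on bge1: bge2 has no zone on it.
    print(nics_without_zone(group, {"shared1": "bge1", "shared2": "bge1"}))  # ['bge2']
    # One zone on each NIC: nothing is reported.
    print(nics_without_zone(group, {"shared1": "bge1", "shared3": "bge2"}))  # []
```
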

If this is important for your deployments, please add a service record to change request 6792116 and work with your service provider to have this addressed. Please also note that this works well with the IPMP Re-architecture coming soon to OpenSolaris.

Comments:

Fantastic article. I ask permission to translate it into Italian and post it on my blog (see the URL on this comment). I guarantee that the original source (pingback with permanent link) and its author (Your Name & Blog) will be credited. What do you think?

Best Regards
Michele Vecchiato

P.S.: Excuse me for my bad English, but i'm Italian ;-)
--
http://michelevecchiato.wordpress.com

Posted by Michele Vecchiato on January 14, 2009 at 09:42 AM EST #

Hi Michele, thank you for the feedback. I am happy that you like it and thankful and excited that you are willing to take the effort to translate it for others to better understand. It is OK with me that you translate it.

Steffen

Posted by Steffen Weiberle on January 14, 2009 at 08:16 PM EST #

Thanks Steffen, the kind of articles that we need.

Just one question/observation about this statement: "If only one interface is used, and different subnets for the global and non-global zones are configured, routing works when setting defrouter [10] and does not work if it is not set."

I thought that sharing a NIC and configuring multiple zones on different subnets was not advisable or supported. I mean: the same NIC would have different IPs on different subnets so that NIC would appear on two different broadcast domains. The first thing I wonder would be: to what subnet switch would you plug that cable into having multiple subnets on the same NIC? I thought it was the kind of problem that VLAN should solve.

Thanks for your comments.

Posted by Enrico Maria Crisostomo on January 18, 2009 at 10:14 AM EST #

Hi Enrico,

Thank you for your comments.

Configuring multiple subnets on a single interface works, and I think/hope it is supported (I'm interested in a pointer to where it is stated not to be supported if it is not).

Because the different IP subnets are sharing (what I would refer to) a single broadcast domain (all broadcast traffic to either subnet is received by the NIC), this is a challenging configuration, and one I only run in test cases. The switches I use (low end) don't have a problem with this. I would expect router setups to be more challenging, and I use this configuration sometimes knowing that the second IP subnet is private (as it will not be routed by the local router).

I would recommend using multiple NICs, if possible, or VLANs (as you suggest) if you have hardware capable of that.

I included the case as it is informational and may help those that need a private subnet and only have one non-VLAN NIC available. At home that is often the case for me. My laptop only has one NIC, and my switches do not support VLANs.

Posted by Steffen Weiberle on January 18, 2009 at 10:31 AM EST #

We've worked with this quite a bit at Joyent. For reference, the important change to facilitate default routes with zones was actually Bug 4963362 (http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=4963362) (master CR is 6391685 (http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6391685)), which was committed to snv_49.

Here are 3 rules of thumb:

1) For best success, any given zone (including global) should have only one path available to it. This isn't required, but helps avoid the router round-robin issue.
2) VLAN's work well in these situations. You DO NOT need to supply an IP address on the global zone interface, just insert "0.0.0.0 netmask 255.255.255.0 up" into the /etc/hostname.<interface><vlan>00<instance> file and restart network/physical:default.
3) VNIC's, of course, bypass the issue entirely thanks to IP Instances.

Posted by benr on January 18, 2009 at 03:06 PM EST #

Steffen, this was pretty funny: I just realized who was contributing posts in the thread I posted and it was you

Posted by Enrico Maria Crisostomo on January 18, 2009 at 03:17 PM EST #

Hi Steffen,

Do you know of any problem with this when using ce interfaces?

I've installed a sparse zone on a 480, attached it to ce5, and set the defrouter, which shows up in the global's netstat. I can SSH into the zone from the global, and the zone can ping "gateway". So far so good. However, the zone's traffic won't leave the box, nor does it show up on a snoop.

As far as I can tell, the only difference is the interface type.

Thanks

J

Posted by JohnA on March 12, 2009 at 04:15 AM EDT #

Hi J, if I understand your question, you are asking about traffic between zones that does not leave the system and cannot be seen with snoop? If so, shared IP will automatically loop IP datagrams back up to the destination if it is also on the same system in the same global IP Instance. Setting 'defrouter' only affects the default gateway for off-net destinations that are not also co-located.

To not clutter the comments, you can reach me at steffen -dot- weiberle -at- sun -dot- com, or post to the network-discuss alias here.

Steffen

Posted by Steffen Weiberle on March 12, 2009 at 07:54 AM EDT #

Hi all,
I have a question.
I've configured 2 zones and set up the defrouter for both of them.
In the global zone I see two default routers, and that should be right.
But if I go inside the non-global zone and execute "netstat -nrv", I see the following:

root@global # zlogin zone
[Connected to zone 'zone' pts/4]
Last login: Wed Apr 8 17:32:50 on pts/4
Sun Microsystems Inc. SunOS 5.10 Generic January 2005
Sourcing //.profile-EIS.....
root@as638ops # bash
root@as638ops # netstat -nrv

IRE Table: IPv4
Destination Mask Gateway Device Mxfrg Rtt Ref Flg Out In/Fwd
-------------------- --------------- -------------------- ------ ----- ----- --- --- ----- ------
default 0.0.0.0 10.31.146.1 e1000g1 1500* 0 1 UG 41 0
default 0.0.0.0 10.29.76.1 bnx0 1500* 0 1 UG 5 0
10.29.76.0 255.255.254.0 10.29.76.27 bnx0:1 1500* 0 1 U 1 0
10.30.228.0 255.255.254.0 10.29.76.1 1500* 0 1 UG 4 0
10.31.30.0 255.255.255.0 10.31.30.33 e1000g0:1 1500* 0 1 U 0 0
10.31.146.0 255.255.254.0 10.31.146.23 e1000g1:1 1500* 0 1 U 3 0
10.31.146.0 255.255.254.0 10.31.146.25 e1000g1:2 1500* 0 1 U 0 0
224.0.0.0 240.0.0.0 10.29.76.27 bnx0:1 1500* 0 1 U 0 0
127.0.0.1 255.255.255.255 127.0.0.1 lo0:1 8232* 0 5 UH 51 0

How come i see 2 defrouter inside the zone?
Should i see both of them?
Any idea?

Thank to everybody

Mirko

Posted by Mirko Resca on April 08, 2009 at 01:19 PM EDT #


This blog copyright 2009 by stw