I recently received a call from someone who has helped me out a lot on some performance issues (thanks, Jim Fiori), and I was glad to be able to return even a small part of those favors!

He had been contacted to help a customer who was ready to deploy a web application, and they were experiencing intermittent lack of connection to the web site. Interestingly, they were also using zones, a bunch of them (OK, a handful)--and so right up my alley.

The customer was running a multi-tiered web application on an x4600 (so Solaris on x86 as well!), with the web server, web router, and application tiers in different zones. They were using shared IP Instances, so all the network configuration was being done in the global zone.

Initially, we had to modify some configuration parameters, especially regarding default routes. Since the system was installed with Solaris 10 5/08 and had more recent patches, we could use the defrouter feature introduced in 10/08 to make setting up routes for the non-global zones a little easier. This was needed because the global zone was using only one NIC, and it was not going to be on the networks that the non-global zones were on.

What made the configuration a little unique was that the web server needs a default router to the Internet, while the application server needs a route to other systems behind a different router. Individually, everything is fine. However, the web1 zone also needs to be on the network that the application and web router are on, so it ends up having two interfaces.

Lets look at web1 when only it is running.

web1# ifconfig -a4
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge1:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 172.16.1.41 netmask ffffff00 broadcast 172.16.1.255
bge2:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 192.168.51.41 netmask ffffff00 broadcast 192.168.51.255
web1# netstat -rn
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              172.16.1.1           UG        1          0 bge1
172.16.1.0           172.16.1.41          U         1          0 bge1:1
192.168.51.0         192.168.51.41        U         1          0 bge2:1
224.0.0.0            172.16.1.41          U         1          0 bge1:1
127.0.0.1            127.0.0.1            UH        5         34 lo0:1

The zone is on two interface, bge1 and bge2, and has a default route that uses bge1. However, when zone app1 is running, there is a second default route, on bge2. The same is true if app2 or odr are running. Note that these three zones are only on bge2.

app1# ifconfig -a4
lo0:1: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
        inet 127.0.0.1 netmask ff000000
bge2:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 192.168.51.43 netmask ffffff00 broadcast 192.168.51.255
app1# netstat -rn
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              192.168.51.1         UG        1          0 bge2
192.168.51.0         192.168.51.43        U         1          0 bge2:1
224.0.0.0            192.168.51.43        U         1          0 bge2:1
127.0.0.1            127.0.0.1            UH        3         51 lo0:1

In the meantime, this is what happens in web1.

web1# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- --------- 
default              192.168.51.1         UG        1          0 bge2
default              172.16.1.1           UG        1          0 bge1 
172.16.1.0           172.16.1.41          U         1          0 bge1:1
192.168.51.0         192.168.51.41        U         1          0 bge2:4
224.0.0.0            172.16.1.41          U         1          0 bge1:1
127.0.0.1            127.0.0.1            UH        6        132 lo0:4

With any of the other zones running, web1 now has two default routes. And it only happens in web1, as it is the only zone with its public facing data link bge1 and a shared data link (bge2).

Traffic to any system on either the 192.168.51.0 or 172.16.1.1 network will have no issues. Every time IP needs to determine a new path for a system not on either of those two networks, it will pick a route, and it will round-robin between the two default routes. Thus approximately half the time, connections will fail to establish, or possibly existing connections will not work if they have been idle for a while.

This is how IP is supposed to work, so there is technically nothing wrong. It is a features of zones and a shared IP Instance. [2009.06.23: For background on why IP works this way, see James' blog].

The only problem is that this is not what the customer wants!

One option would be to force all traffic between the web and application tier out the bge1 interface, putting it on the wire. This may not be desirable for security reasons, and introduces latencies since traffic now goes on the wire. Another option would be to use exclusive IP Instances for the web servers. For each web zone, and this example only has one, it would required two additional data links (NICs). That would add up. Also, this configuration is targeted to be used with Solaris Cluster's scalable services, and those must be in shared IP Instance zones. Hummm....as I like to say.

We didn't know about the shared IP Instance restriction of Solaris Cluster, and as the customer was considering how they were going to add additional NICs to all the systems, something slowly developed in my mind. How about creating a shared, dummy network between the web and application tier? They had one spare NIC, and with shared IP it does not even need to be connected to a switch port, since IP will loop all traffic back anyway!

The more I thought about it, the more I liked it, and I could not see anything wrong with it. At least not technically as I understood Solaris. Operationally, for the customer, it might be a little awkward.

Here is what I was thinking of...

With this configuration the web1 zone has a default router only to the Internet and it can reach odr, and if necessary, app1 and app2, directly via the new network. And app1 and app2 only have a single default route to get to the Intranet. The nice thing is that bge3 does not even need to be up. That is visible with ifconfig output, where bge3 is not showing a RUNNING flag, which indicates the port is not connected (or in my case has been disabled on the switch).

global# ifconfig -a4
...
bge0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 129.154.53.125 netmask ffffff00 broadcast 129.154.53.255
        ether 0:3:ba:e3:42:8b
bge1: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
        inet 0.0.0.0 netmask 0
        ether 0:3:ba:e3:42:8c
bge2: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 4
        inet 0.0.0.0 netmask 0
        ether 0:3:ba:e3:42:8d 
bge3: flags=1000802<BROADCAST,MULTICAST,IPv4> mtu 1500 index 5 
        inet 0.0.0.0 netmask 0
        ether 0:3:ba:e3:42:8e
...
And within web1 there is now only one default route.
web1# netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- --------- 
default              172.16.1.1           UG        1         17 bge1 
172.16.1.0           172.16.1.41          U         1          2 bge1:1
192.168.52.0         192.168.52.41        U         1          2 bge3:1
224.0.0.0            172.16.1.41          U         1          0 bge1:1
127.0.0.1            127.0.0.1            UH        4        120 lo0:1
In the customer's case, multiple systems were being used, so the private networks were connected together so that a web zone on one system could access an odr zone on another. I am showing the simple, single system case since it is so convenient.

If I were using Solaris Express Community Edition (SX-CE) or OpenSolaris 2009.06 Developer Builds, with the Crossbow bits and virtual NICs (VNICs) available, I wouldn't even have needed to use that physical interface. Both are available here.

I hope this trick might help others out in the future.

Steffen

Comments:

Does this trick with bge3 also work with IPv6 enabled? We run all Solaris Boxes / Zones dual stacked with IPv4 and IPv6.

Posted by Thorleif Wiik on April 16, 2009 at 04:52 PM EDT #

I don't know of a reason why it should not. A brief test says it works.

Posted by Steffen Weiberle on April 16, 2009 at 05:35 PM EDT #

The steps I took to test were:

1. global# ifconfig bge3 inet6 plumb

2. global# zonecfg -z web1
add net
set physical=bge3
set address=fe80::c0a8:3329/120
end

3. global# ifconfig bge3 inet6 addif fe80::c0a8:332b/120 zone app1 up

4. web1# ping fe80::c0a8:332b
fe80::c0a8:332b is alive

Posted by Steffen Weiberle on April 16, 2009 at 05:51 PM EDT #

Hi Steffen,

what an article, this is what i have been looking for a long time to do my home lab experiment. so simple but very clear i think. thank you :)

anyway, i have these few questions which i hope you dont mind to answer them for me since i'm a bit clueless? :

can i replace that dummy bge3 interface using some kind of tunnel interface inside the system to interconnect those zones (inter-zone tunnels)? will it going to be VNIC to etherstub? or at least, can i route zone1 traffic to zone2 using any virtual interface inside the system - like loopback? so that the traffic stays local inside the system?

and, can i just put a single physical NIC to be owned by a single zone (not controlled by the global-zone)? so that zone can manage that NIC freely - like doing VLAN etc?

i have this small drawing which is almost the same as yours, so perhaps i can send you then?

thanking you in advanced.

regards,

abdi.

Posted by abdi on June 18, 2009 at 10:51 AM EDT #

Hi adbi, thank you for the feedback.

Regarding inter-zone tunnels, I don't think so. Are you looking for a 'device' to use, or expecting some other benefit from the tunnel?

You mention VNICs, so are you running SX-CE 105 or later, or OpenSolaris 2009.06? If that is the case, you can create an etherstub to do the same.

If no, one other option is to use a VLAN tagged interface, even if you don't use VLANs. I would use bge2, as it is internal, and plumb bge3999002 (VLAN tag 3999 on bge2) for example.

I don't follow the last question, about a single NIC. Maybe your drawing would help. Or exclusive IP Instances and VLANs. http://blogs.sun.com/stw/entry/using_ip_instances_with_vlans

While I'd like to keep the discussion public, maybe on network-discuss@opensolaris.org, you can reach me at s t e f f e n 'dot' w e i b e r l e 'at' s u n 'dot' c o m .

Posted by Steffen Weiberle on June 18, 2009 at 11:35 AM EDT #

Hi Steffen,

thank you for your prompt reply :)
btw, drawings sent.

regards,

Abdi.

Posted by abdi on June 18, 2009 at 04:48 PM EDT #

Post a Comment:
  • HTML Syntax: NOT allowed

This blog copyright 2009 by stw