Thursday Jul 12, 2007
Solaris Containers (aka Zones) is a virtualization tool that has other powerful, but less well known uses. These rely on a unique combination of features:
By default, Solaris Containers are more secure than general-purpose operating systems in many ways. For example, even the root user of a Container with a default configuration cannot modify the Container's operating system programs. That limitation prevents trojan horse attacks which replace those programs. Also, a process running in a Container cannot directly modify any kernel data, nor can it modify kernel modules like device drivers. Glenn Brunette created an excellent slide deck that describes the multiple layers of security in Solaris 10, of which Containers can be one layer.
Even considering that level of security, the ability to selectively remove Solaris privileges can be used to further tighten a zone's security boundary. In addition, the ability to disable network services prevents almost all network-based attacks. This is very difficult to accomplish in most operating systems without making the system unusable or unmanageable.
The combination of those abilities and the resource controls that are part of Containers' functionality enables you to configure an application environment that can do little more than fulfill the role you choose for it.
This blog entry describes a method that can be used to slightly expand a Container's abilities, and then tighten the security boundary snugly around the Container's intended application.
Imagine that you want to run an application on a Solaris system, but the workload(s) running on this system should not be directly attached to the Internet. Further, imagine that the application needs an accurate sense of time. Today this can be done by properly configuring a firewall to allow the use of an NTP client. But now there's another way... (If this concept sounds familiar, it is because this idea has been mentioned before here and here.)
To achieve the same goal without a firewall, you could use two Solaris "virtual environments" (zones): one that has "normal" behavior, for the application, and one that has the ability to change the system's clock, but has been made extremely secure by meeting the following requirements:
Any zone can be configured to have access to one or more network ports (NICs). Further, OpenSolaris build 57 and newer builds, and the next update to Solaris 10, enable a zone to have exclusive access to a NIC, further isolating the network activity of different zones. This feature is called IP Instances and will be mentioned again a bit later.
A zone has its own instance of SMF (the Service Management Facility). Most of the services managed by SMF can be disabled if you are limiting the abilities of a zone. The zone that will manage the time clock can be configured so that it does not respond to any network connection requests by disabling all non-essential services.
Also, Solaris Configurable Privileges enables us to remove unnecessary privileges from the zone, and add the one non-default privilege it needs: sys_time. That privilege is needed in order to use the stime(2) system call.
Here is the configuration for the zone when I initially created it:
global# zonecfg -z timelord
zonecfg:timelord> create
zonecfg:timelord> set zonepath=/zones/roots/timelord
zonecfg:timelord> exit
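The same configuration can also be applied without an interactive session, because zonecfg accepts a file of subcommands with its -f option. The following is only a sketch under that assumption: the function name zone_create_cmds and the file /tmp/timelord.cfg are hypothetical, and the verify and commit subcommands are my additions to the interactive session shown above.

```shell
# Hypothetical helper: print the zonecfg subcommands that create a zone
# with the given zonepath. Nothing is changed on the system; the output
# is meant to be saved to a file and passed to "zonecfg -f".
zone_create_cmds() {
    zonepath=$1
    printf 'create\nset zonepath=%s\nverify\ncommit\n' "$zonepath"
}

# Possible use in the global zone (not executed here):
#   zone_create_cmds /zones/roots/timelord > /tmp/timelord.cfg
#   zonecfg -z timelord -f /tmp/timelord.cfg
```

Keeping the subcommands in a file makes it easier to recreate the zone later or to review the configuration before applying it.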
After the zone has been booted and halted once, disabling services in Solaris is easy - the svcadm(1M) command does that. Through experimentation I found that this script disabled all of the network services - and some non-network services, too - but left enough services running that the zone would boot and NTP client software would run. Note that this is less important starting with Solaris 10 11/06: new installations of Solaris 10 will offer the choice to install "Secure By Default" with almost all network services turned off.
To use that script, I booted the zone and logged into it from the global zone - something you can do with zlogin(1) even if the zone does not have access to a NIC. Then I copied the script from the global zone into the non-global zone. A secure method to do this is: as the root user of the global zone, create a directory in <zonepath>/root/tmp, change its permissions to prevent access by any user other than root, and then copy the script into that directory. All of that allowed the script to be run by the root user of the non-global zone. Those steps can be accomplished with these commands:
global# mkdir /zones/roots/timelord/root/tmp/ntpscript
global# chmod 700 /zones/roots/timelord/root/tmp/ntpscript
global# cp ntp-disable-services /zones/roots/timelord/root/tmp/ntpscript
global# zlogin timelord
timelord# chmod 700 /tmp/ntpscript/ntp-disable-services
timelord# /tmp/ntpscript/ntp-disable-services
Now we have a zone that only starts the services needed to boot the zone and run NTP. Incidentally, many other commands will still work, but they don't need any additional privileges.
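For readers who want to build their own list of services to disable, the general shape of such a script might look like the sketch below. This is not the script I used; it is a hypothetical minimal version that reads service FMRIs from standard input and calls svcadm disable on each one. The function name disable_services and the SVCADM override (handy for a dry run) are assumptions made for illustration, and it needs a POSIX-style shell such as ksh or bash.

```shell
# Hypothetical sketch, not the original script.
# Reads one service FMRI per line from standard input and disables it.
# Set SVCADM="echo svcadm" to print the commands instead of running them.
disable_services() {
    while read -r fmri; do
        # Skip blank lines and comment lines.
        case "$fmri" in
            ''|'#'*) continue ;;
        esac
        ${SVCADM:-svcadm} disable "$fmri" || echo "failed: $fmri" >&2
    done
}

# Possible use inside the zone (not executed here):
#   disable_services <<EOF
#   svc:/network/ssh:default
#   svc:/network/rpc/bind:default
#   EOF
```

Which FMRIs belong in the list depends on the release and on what the zone must still do, so the list is best found by experiment, as described above.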
The next step is to gather the minimum list of Solaris privileges needed by the reduced set of services. Fortunately, a tool has been developed that helps you determine the minimum necessary set of privileges: privdebug.
Here is a sample use of privdebug, which was started just before booting the zone, and stopped after the zone finished booting:
global# ./privdebug.pl -z timelord
STAT PRIV
USED sys_mount
USED sys_mount
USED sys_mount
USED sys_mount
USED sys_mount
USED proc_exec
USED proc_fork
USED proc_exec
USED proc_exec
USED proc_fork
USED contract_event
USED contract_event
<many lines deleted>
^C
global#
Running that output through sort(1) and uniq(1) summarizes the list of privileges needed to boot the zone and run our minimal set of Solaris services. Limiting a zone to a small set of privileges requires using the zonecfg command:
global# zonecfg -z timelord
zonecfg:timelord> set limitpriv=file_chown,file_dac_read,file_dac_write,file_owner,proc_exec,proc_fork,proc_info,proc_session,proc_setid,proc_taskid,sys_admin,sys_mount,sys_resource
zonecfg:timelord> exit
At this point the zone is configured without unnecessary privileges and without network services. Next we must discover the privileges needed to run our application. Our first attempt to run the application may succeed. If that happens, there is no need to change the list of privileges that the zone has. If the attempt fails, we can determine the missing privilege(s) with privdebug.
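The sort(1) and uniq(1) step mentioned earlier can be wrapped in a small filter that turns saved privdebug output into the comma-separated form that limitpriv expects. This is a sketch only: it assumes the two-column "STAT PRIV" layout shown above, it considers only USED lines, and the function name summarize_privs and the file privdebug.out are hypothetical.

```shell
# Hypothetical helper: read privdebug output on standard input and print
# the privileges marked USED as one sorted, de-duplicated,
# comma-separated line. The header and any other lines are ignored.
summarize_privs() {
    awk '$1 == "USED" { print $2 }' |
        sort -u |
        paste -s -d, -
}

# Possible use (not executed here):
#   summarize_privs < privdebug.out
```

The resulting line can be pasted after "set limitpriv=" in zonecfg, after checking that nothing essential is missing from it.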
For this example I will use ntpdate(1M) to synchronize the system's time clock with time servers on the Internet. In order for ntpdate to run, it needs network access, which must be enabled with zonecfg. When adding a network port, I increased zone isolation with a new feature in OpenSolaris called IP Instances. Use of this feature is not required, but it does improve network isolation and network configuration flexibility. You can choose to ignore this feature if you are using a version of Solaris 10 which does not offer it, or if you do not want to dedicate a NIC to this purpose.
To use IP Instances, I added the following parameters via zonecfg:
global# zonecfg -z timelord
zonecfg:timelord> set ip-type=exclusive
zonecfg:timelord> add net
zonecfg:timelord:net> set physical=bge1
zonecfg:timelord:net> end
zonecfg:timelord> exit
global#
Setting ip-type=exclusive quietly adds the net_rawaccess privilege and the new sys_ip_config privilege to the zone's limit set. This happens whenever the zone boots. These privileges are required in exclusive-IP zones.
We can assign a static address to the zone with the usual methods of configuring IP addresses on Solaris systems. For example, you could boot the zone, login to it, and enter the following command:
timelord# echo "192.168.1.11/24" > /etc/hostname.bge1
However, because the root user of the global zone can access any of the zone's files, you can do the same thing without booting the zone by using this command instead:
global# echo "192.168.1.11/24" > /zones/roots/timelord/root/etc/hostname.bge1
With network access in place, we can discover the list of privileges necessary to run the NTP client. First boot the zone:
global# zoneadm -z timelord boot
After the zone boots, in one window run the privdebug script, and then in another window run the NTP client in the NTP zone:
First window:
global# ./privdebug.pl -z timelord
STAT PRIV
USED proc_fork
USED proc_exec
USED proc_fork
USED proc_exec
NEED sys_time
^C
global#
Second window:
global# zlogin timelord
timelord# ntpdate -u <list of NTP server IP addresses>
16 May 13:12:27 ntpdate[24560]: Can't adjust the time of day: Not owner
timelord#
That output shows us that the privilege 'sys_time' is the only additional one needed to enable the zone to set the system time clock using ntpdate(1M).
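When the privdebug output is long, the NEED lines are easy to miss. A small filter can pull them out. This is a sketch that assumes the same "STAT PRIV" layout as the output above; the function name missing_privs and the file privdebug.out are hypothetical.

```shell
# Hypothetical helper: read privdebug output on standard input and print
# each privilege that was needed but not held (status NEED), one per line.
missing_privs() {
    awk '$1 == "NEED" { print $2 }' | sort -u
}

# Possible use (not executed here):
#   missing_privs < privdebug.out
```

For the run shown above, the only privilege it would report is sys_time.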
Again we use zonecfg to modify the zone's privileges:
global# zonecfg -z timelord
zonecfg:timelord> set limitpriv=file_chown,file_dac_read,file_dac_write,file_owner,proc_exec,proc_fork,proc_info,proc_session,proc_setid,proc_taskid,sys_admin,sys_mount,sys_resource,sys_time
zonecfg:timelord> exit
While isolating the zone, why not also limit the amount of resources that it can consume? If the zone is operating normally the use of resource management features is unnecessary, but they are easy to configure and their use in this situation could be valuable. These limits could reduce or eliminate the effects of a hypothetical bug in ntpdate which might cause a memory leak or other unnecessary use of resources.
Capping the amount of resources which can be consumed by the zone is also another layer of security in this environment. Resource constraints can reduce or eliminate risks associated with a denial of service attack. Note that the use of these features is not necessary. Their use is shown for completeness, to demonstrate what is possible.
A few quick tests with rcapstat(1) showed that the zone needed less than 50MB of memory to do its job. A cap on locked memory further minimized the zone's abilities without causing a problem for NTP. As with IP Instances, these features are available in OpenSolaris and will be in the next update to Solaris 10.
global# zonecfg -z timelord
zonecfg:timelord> add capped-memory
zonecfg:timelord:capped-memory> set physical=50m
zonecfg:timelord:capped-memory> set swap=50m
zonecfg:timelord:capped-memory> set locked=20m
zonecfg:timelord:capped-memory> end
zonecfg:timelord> set scheduling-class=FSS
zonecfg:timelord> set cpu-shares=1
zonecfg:timelord> set max-lwps=200
zonecfg:timelord> exit
global#
Assigning one share to the zone prevents the zone from using too much CPU power and impacting other workloads. It also guarantees that other workloads will not prevent this zone from getting access to the CPU. Capping the number of threads (lwps) limits the ability to use up a fixed resource: process table slots. That limit is probably not necessary given the strict memory caps, but it can't hurt.
Now that we have 'shrink-wrapped' the security boundary even more tightly than the default, we're ready to use this zone.
global# zoneadm -z timelord boot
global# zlogin timelord
timelord# ntpdate
The output of ntpdate shows that it was able to contact an NTP server and adjust this system's time clock by almost 0.4 seconds.
16 May 14:40:35 ntpdate[25070]: adjust time server offset -0.394755 sec
Experience with Solaris privileges can allow you to further tighten the security boundary. For example, if you want to prevent the zone from changing its own host name, you could remove the sys_admin privilege from the zone's limit set. Doing so, and then rebooting the zone, would allow you to demonstrate this:
timelord# hostname drwho
hostname: error in setting name: Not owner
timelord#
What privilege is needed to use the hostname(1M) command?
timelord# ppriv -e -D hostname drwho
hostname[4231]: missing privilege "sys_admin" (euid = 0, syscall = 139) needed at systeminfo+0x139
hostname: error in setting name: Not owner
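If you collect many of these debugging messages, a short filter can reduce them to just the privilege names. This sketch assumes the messages have the 'missing privilege "name"' form shown above and have already been saved as text by whatever means is convenient; the function name missing_from_ppriv and the file ppriv-messages.txt are hypothetical.

```shell
# Hypothetical helper: read saved privilege-debugging messages on
# standard input and print each missing privilege name once.
missing_from_ppriv() {
    sed -n 's/.*missing privilege "\([^"]*\)".*/\1/p' | sort -u
}

# Possible use (not executed here):
#   missing_from_ppriv < ppriv-messages.txt
```

Lines that do not contain a "missing privilege" message, such as the final error from hostname, are ignored.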
Before disabling services, I ran "netstat -a" on another zone which had just been created. It showed a list of 13 ports to which services were listening, including ssh and sunrpc services. After hardening the zone 'timelord' by disabling unneeded services, "netstat -a" doesn't show any open ports.
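To compare zones before and after hardening, it can help to reduce the netstat output to a single number. The sketch below counts lines whose last field is LISTEN, which I am assuming is how listening TCP sockets appear in the output; it does not count UDP endpoints, so it will not necessarily match the figure of 13 quoted above. The function name count_listeners is hypothetical.

```shell
# Hypothetical helper: read "netstat -a" style output on standard input
# and print the number of lines whose last field is LISTEN.
count_listeners() {
    awk '$NF == "LISTEN" { n++ } END { print n + 0 }'
}

# Possible use inside a zone (not executed here):
#   netstat -an | count_listeners
```

A result of 0 after hardening is consistent with the empty list described above.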
To further evaluate the security of the configuration described above, Nessus was used to probe the zone for possible attack vectors. It did not find any security weaknesses.
What else can be secured using this method? Typical Unix services like sendmail and applications like databases are ideal candidates. What application do you want to secure?
Thanks to Glenn Brunette for assistance with security techniques and to Bob Bownes for providing a test platform and assistance with Nessus.