
Tuesday January 09, 2007
Privilege (Set Me Free)
One of the perhaps lesser-known features of
Solaris Containers
or
Zones
is that applications running inside these virtualized environments
execute with fewer privileges than applications executing outside the
container. This is enforced through the
Solaris Privileges
framework which was also introduced in the
Solaris 10
release.
When comparing virtualization solutions, typically OS level
virtualization mechanisms like Zones or
FreeBSD Jails
are thought to provide less security than mechanisms where a machine
architecture is virtualized, such as with the family of products from
VMware
or with
paravirtualization
mechanisms such as
Xen,
in which the guest OS is ported to the virtualized machine
architecture. One reason is that separation between virtualized OS
environments is usually weaker, since data structures and code paths are
shared at several levels in the kernel.
However, in some cases, OS level virtualization provides an advantage
for certain aspects of security. For example, with Solaris Containers
the privilege mechanism in the kernel enforces limitations on the types
of operations an application can perform. Consider the case of the
ability to create or "plumb" a software networking interface using
ifconfig(1M)
or set an IP address on that interface. In some situations, one wants
to allow such operations inside a virtualized environment because a
particular application requires the ability to change an existing IP
address or to toggle an interface up or down. The ramification of
this, however, is that a malicious or naive user inside the environment
might change its IP address to an unexpected value, with results
ranging from disruption of the network topology to the potential for
spoofing
another machine on the network. In addition, most applications do not
actually require the ability to change their environment's IP address
or create new network interfaces or even know the name of the interface
in their environment. Rather, they typically want one or more IPv4 or
IPv6 addresses which they can
bind(3SOCKET)
to.
In the case of Solaris Containers, the
privilege
to
set the IP address of an interface
is not given to any application running inside a container, and there
is no way for an application to escalate or grow its set of privileges
beyond those it started with. The end result, in this example, is
that the root password or super-user privileges can be given to a user
inside a container, but that user will be unable to manipulate or affect
the topology of the network or impersonate another machine and
potentially gain access to its network traffic. [1]
Until recently, the set of privileges available to a container's
applications was fixed. However, starting with both
Solaris Express 5/06
and
Solaris 10 11/06,
the global zone administrator can change this set of privileges. What
this means from a practical point of view is that containers can become
more capable by adding some of the privileges that are not usually
present. An example here might be the ability to run
DTrace
from within the container [2].
Dan
provided an excellent writeup on the details for doing so
here.
As another example, by adding some additional privileges to the
container's default privilege set, a Network Time Protocol (NTP) server
can be deployed in the container, which is preferable from a security
point of view, especially for a server that might be facing a hostile
Internet. To configure the container appropriately, the list of
privileges that the server requires needs to be known. Solaris 10 currently
ships with the
3-5.93e
version of
xntpd(1M),
which is the daemon that implements the NTP server capability. This
particular daemon can actually take advantage of three
privileges that are not normally present within a container. The
first, perhaps obviously, is the privilege to change the system clock -
sys_time.
With the addition of this privilege, xntpd will be able to successfully
set the system clock when it needs to.
However, it also turns out that the daemon tries both to lock down its
memory pages and to run in the real-time scheduling class. It does
this so that it can maintain accurate time, particularly in the
face of other system activity. These two operations are also covered
by unique privileges -
proc_lock_memory
and
proc_priocntl.
Tying these privileges [3]
together, we can take an existing container and configure it to be an
NTP server. In this example, Sun's internal network routes IP
multicast, so I will leverage that to connect to the network's NTP
servers listening on the standard NTP multicast address of 224.0.1.1.
To do so, first update the configuration of the zone myzone from the
global zone:
global# zonecfg -z myzone
zonecfg:myzone> set limitpriv=default,proc_lock_memory,proc_priocntl,sys_time
zonecfg:myzone> commit
zonecfg:myzone> exit
global# zoneadm -z myzone boot
Then from within the newly booted container, I will set up the
configuration of the server itself and start the service:
myzone# cp -p /etc/inet/ntp.client /etc/inet/ntp.conf
myzone# svcadm enable network/ntp
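One way to confirm that the container really received the extra privileges is to compare the zone's privilege list against what the NTP daemon wants. The following is only a minimal sketch and not part of the setup above: missing_privs is a hypothetical helper name, the script assumes a POSIX shell such as ksh or bash rather than the legacy /bin/sh, and it assumes that ppriv -l zone, run inside the container, prints the privileges available to the zone one per line.

```shell
# missing_privs (hypothetical helper): read a privilege list on stdin,
# one name per line, and print every required privilege (given as
# arguments) that does not appear in that list.
missing_privs() {
    available=" "
    # Collect the names into one space-delimited, space-padded string.
    # The "|| [ -n ... ]" part keeps a final line that lacks a newline.
    while read -r name || [ -n "$name" ]; do
        available="$available$name "
    done
    for priv in "$@"; do
        case $available in
            *" $priv "*) ;;                 # present in the list
            *) printf '%s\n' "$priv" ;;     # absent: report it
        esac
    done
}

# Inside the container on a Solaris system (skipped where ppriv is absent).
# Assumption: "ppriv -l zone" lists the zone's privileges one per line.
if command -v ppriv >/dev/null 2>&1; then
    ppriv -l zone | missing_privs sys_time proc_lock_memory proc_priocntl
fi
```

If the zone was configured as shown, the check should print nothing; any name it does print is a privilege the daemon will not be able to use.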
The property that was set in the container's configuration,
limitpriv, consists of a list of privileges similar to the form
expected by
user_attr(4)
and
priv_str_to_set(3C).
In this particular example, the container's privilege set is limited to
the standard default set of privileges plus the three additional
privileges required by the NTP server.
It is worthwhile to note that privileges can also be taken away
by preceding them with an exclamation mark (!) or a minus sign (-).
This can allow a container to be booted in which applications have even
fewer privileges than usual. For example, to take away the ability to
generate ICMP datagrams from the zone named twilight, the global
zone administrator would configure the container as follows:
global# zonecfg -z twilight set limitpriv=default,!net_icmpaccess
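Because csh and interactive bash treat the exclamation mark as a history character, it can be convenient to build the limitpriv value in a small script instead of typing it. The sketch below is an illustration only: limitpriv_value is a hypothetical helper name, it assumes a POSIX shell, and it adopts its own convention that an argument with a leading minus sign means "remove this privilege".

```shell
# limitpriv_value (hypothetical helper): start from "default", append each
# argument as an added privilege, and turn "-name" into the removal "!name".
limitpriv_value() {
    bang='!'        # held in a variable so no literal "!" sits in double quotes
    value=default
    for arg in "$@"; do
        case $arg in
            -*) value="$value,$bang${arg#-}" ;;   # removal: -name becomes !name
            *)  value="$value,$arg" ;;            # addition
        esac
    done
    printf '%s\n' "$value"
}

limitpriv_value -net_icmpaccess
# prints: default,!net_icmpaccess
limitpriv_value proc_lock_memory proc_priocntl sys_time
# prints: default,proc_lock_memory,proc_priocntl,sys_time
```

The result could then be handed to zonecfg from the global zone, for example as zonecfg -z twilight "set limitpriv=$(limitpriv_value -net_icmpaccess)", assuming zonecfg accepts the whole subcommand as a single quoted argument.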
There are a few restrictions on which privileges can be added to a
container, as well as on which ones can be removed. For
more details, please see the
original proposal
and the ensuing discussion on the
zones-discuss mailing list.
This proposal and many others concerning containers and other parts of
OpenSolaris have benefited greatly from the participation of the
OpenSolaris Zones Community.
Information about each of these proposals can be found
here.
1
The actual privilege check in the kernel for this particular case
occurs
here.
2
The ability to use DTrace inside a non-global zone is at the present
time restricted to Solaris Express as some additional changes to DTrace
were required. However, these changes should be appearing in an
upcoming Solaris 10 release.
3
Starting with Solaris Express 11/06, the privilege to lock memory has
actually been added to the container's default set. This is because
additional resource controls
have been added that can limit the amount of memory applications within
a container can lock so it is no longer necessary to make this
privilege an optional one.
( Jan 09 2007, 05:08:42 PM PST )