darren_moffat@blog$ cat /dev/mem | grep /dev/urandom


Thursday December 15, 2005

Fast Teamware Bringover with ZFS & RBAC

I run a pair (one SPARC, one x86) of reasonably sized build servers for the Solaris security group: a V880 with 16 GB of RAM and 1.6 TB of ZFS disk space, and a v40z which NFS mounts that same disk space.

I have a single ZFS pool with all the disks in it. There are two top level filesystems, /cube/projects and /cube/builds. /cube/builds is scratch space for people to keep whatever workspaces, test results, etc. they need for work in progress. /cube/builds doesn't get backed up (people use wx(1) backup to back up their work). Every user of the machine has their own ZFS filesystem in /cube/builds, e.g. /cube/builds/darrenm.

During the development of ZFS internally at Sun I heard there was a zbringover command to do really fast bringovers of the ZFS gate. I never used it but I guessed that it was probably using clones.

Yesterday I decided to implement a similar thing for the build machine I run. I didn't bother looking at the old zbringover script since I wanted to use this to really understand the relationships between clones, snapshots and filesystems in ZFS as they are now.

So I created a new filesystem /cube/builds/onnv-clone and did a teamware bringover of the ON consolidation into it. I then created a snapshot of this using the date in ISO format as the name of the snapshot, i.e. onnv-clone@2005-12-14. I then created a clone of that to hold the changes for a bug fix I was working on. Great: the clone took about 1 second, whereas the previous record for an initial bringover of the full ONNV gate is about 15 minutes, and over the network it can normally take much longer depending on how close your machine is to the gate machine.
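The manual steps in that paragraph can be sketched roughly as below. This is a dry-run sketch rather than the exact commands that were typed: the run helper, the first_clone_steps function and the workspace name cube/builds/darrenm/bugfix are made up for illustration, and the parent workspace path /ws/onnv-clone is borrowed from the nightly script later in this post. With DRY_RUN=1 (the default) the commands are only printed, so it is safe to try on a machine without ZFS.

```shell
#!/bin/sh
# Dry-run sketch of: create filesystem, bringover, snapshot, clone.
# DRY_RUN=1 (default) prints each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}

# run: hypothetical helper, prints or executes its arguments.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

BASE_FS=cube/builds/onnv-clone        # dataset named in the post
SNAP=2005-12-14                       # ISO date used as the snapshot name
WORK_FS=cube/builds/darrenm/bugfix    # hypothetical clone for one bug fix

first_clone_steps() {
    # One-off setup: the slow part is the bringover (about 15 minutes).
    run zfs create "$BASE_FS"
    run bringover -p /ws/onnv-clone -w "/$BASE_FS"
    # Freeze the populated workspace under a date-based name.
    run zfs snapshot "${BASE_FS}@${SNAP}"
    # The fast part: a writable clone of the snapshot (about 1 second).
    run zfs clone "${BASE_FS}@${SNAP}" "$WORK_FS"
}

first_clone_steps
```

Only the last step needs repeating for each new workspace, which is where the saving comes from.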

So at this point it was all working wonderfully for me, but I wanted everyone who uses the machine to be able to take advantage of /cube/builds/onnv-clone as well, and I wanted it to be really easy to use. So I put this into a script that does the zfs clone followed by a chown -R of the clone workspace. That chown increases the average time to make the initial workspace out of the zfs clone from 1 second to about 12-15 seconds, which is still a fantastic improvement and IMO not worth trying to get any faster.

#!/bin/ksh -p
# Copyright 2005 Sun Microsystems Inc.  All rights reserved.
# Use is subject to license terms.

# Re-run ourselves under pfexec (once) so the RBAC profile's privileges apply.
if [ "$_" != "/usr/bin/pfexec" -a -x /usr/bin/pfexec ]; then
        /usr/bin/pfexec "$0" "$@"
        exit $?
fi

NEW_WS=$1
BUILDS_BASE=cube/builds

if [ -z "$NEW_WS" ]; then
        echo "zbringover <workspacename>"
        echo "\t<workspacename> is relative to /$BUILDS_BASE/$LOGNAME"
        exit 1
fi

if [ -d "/$BUILDS_BASE/$LOGNAME/$NEW_WS" -o -f "/$BUILDS_BASE/$LOGNAME/$NEW_WS" ]; then
        echo $BUILDS_BASE/$LOGNAME/$NEW_WS already exists
        exit 1
fi

CLONE_FS=$BUILDS_BASE/onnv-clone
CLONE_SNAP=`date +%F`

echo "Creating new $NEW_WS by cloning ${CLONE_FS}@${CLONE_SNAP}...\c"
# Exit with zfs's own status on failure; testing $? first would reset it to 0.
pfexec /sbin/zfs clone ${CLONE_FS}@${CLONE_SNAP} $BUILDS_BASE/$LOGNAME/$NEW_WS || exit $?
echo "changing ownership...\c"
chown -R $LOGNAME /$BUILDS_BASE/$LOGNAME/$NEW_WS 2>&1 | \
        egrep -v "chown: .zfs:|chown: snapshot:"
echo "done.\n"

That funny little pfexec dance at the top ensures that, for every user who has the RBAC profile, zbringover re-runs itself under pfexec so that it gets the file_chown privilege. This means users can just type zbringover rather than pfexec zbringover.

I then added the following line into /etc/security/exec_attr

ZFS Builds:solaris:cmd:::/cube/builds/zbringover:privs=file_chown,basic

and this one into /etc/security/prof_attr

ZFS Builds:::ZFS Builds on Borg:profiles=ZFS File System Management

Note that the second one uses the hierarchy capabilities of Solaris RBAC profiles to include the standard "ZFS File System Management" profile in my new "ZFS Builds" profile.
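If the colon-separated fields are hard to read at a glance, the following sketch names them. The field layouts are the ones documented in exec_attr(4) (name:policy:type:res1:res2:id:attr) and prof_attr(4) (profname:res1:res2:desc:attr); the two show_* functions are hypothetical helpers written for this illustration and are not part of Solaris or of the setup described here.

```shell
#!/bin/sh
# Illustration only: label the fields of the two RBAC entries above.

# exec_attr(4) layout: name:policy:type:res1:res2:id:attr
show_exec_attr() {
    IFS=: read -r name policy type res1 res2 id attr <<EOF
$1
EOF
    printf 'profile=%s policy=%s command=%s attr=%s\n' \
        "$name" "$policy" "$id" "$attr"
}

# prof_attr(4) layout: profname:res1:res2:desc:attr
show_prof_attr() {
    IFS=: read -r name res1 res2 desc attr <<EOF
$1
EOF
    printf 'profile=%s desc=%s attr=%s\n' "$name" "$desc" "$attr"
}

show_exec_attr 'ZFS Builds:solaris:cmd:::/cube/builds/zbringover:privs=file_chown,basic'
show_prof_attr 'ZFS Builds:::ZFS Builds on Borg:profiles=ZFS File System Management'
```

Read this way, the exec_attr entry says that the "ZFS Builds" profile runs /cube/builds/zbringover with the file_chown and basic privileges, and the prof_attr entry says that the same profile pulls in "ZFS File System Management" through its profiles= attribute.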

I trust all the users on this machine. Prior to us having the ability to build as a normal user, all of them had root access anyway; they don't now because they don't need it. So I'm okay with giving out the ability to manage ZFS file systems, and I want all of the users to get this profile by default. The easiest way to do that is to update /etc/security/policy.conf thus:

PROFS_GRANTED=ZFS Builds,Basic Solaris User

So now when the users log in they have that profile by default and can run zbringover with the necessary privilege to change the ownership of the clone file system they have just created to themselves.

So I tested this a bit with a dummy account and was so proud of myself for such a quick job (start to finish all of about 30 minutes' work including the testing) that I sent an email off to all the users of the machine announcing the new functionality.

Spoke too soon! I had another small script to do the nightly bringovers from the ONNV gate machine into the /cube/builds/onnv-clone file system, and I had that set up to run as me from cron. Well, I got in this morning and found that the teamware bringover had failed. It was getting confused by the things it saw in the .zfs/snapshot directory. Easy fix for that:

    $ pfexec zfs set snapdir=hidden cube/builds/onnv-clone

Note the pfexec and the $: this was run as me, since I have the appropriate RBAC profile from what we did above. Now teamware's bringover command is happy. The script that does the teamware bringover into the file system and creates the snapshot is this one:

#!/bin/pfksh
# Copyright 2005 Sun Microsystems Inc.  All rights reserved.
# Use is subject to license terms.

PATH=/usr/bin:/usr/sbin:/usr/X11/bin:/usr/dt/bin:/usr/openwin/bin:/opt/onbld/bin:/opt/onbld/bin/sparc:/opt/teamware/bin:/usr/ccs/bin:/usr/sfw/bin
export PATH

CLONE_FS=cube/builds/onnv-clone
CLONE_SNAP=`date +%F`

bringover -p /ws/onnv-clone -w /$CLONE_FS
zfs snapshot ${CLONE_FS}@${CLONE_SNAP}

Remember I said above that this runs from cron as me, not as root. The script uses pfksh rather than ksh, so the zfs snapshot automatically runs with the appropriate privilege since I have the necessary RBAC profile.
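One consequence of naming snapshots by date is worth spelling out: zbringover clones the snapshot named after today's date, so it depends on the nightly script having already run that day. A possible variation, which is not what the scripts in this post do, is to clone the newest snapshot that actually exists. Because YYYY-MM-DD names sort chronologically under a plain lexical sort, this takes very little code. The latest_snapshot function below is a hypothetical helper sketched under that assumption.

```shell
#!/bin/sh
# latest_snapshot FS
# Reads snapshot names (one per line) on stdin and prints the newest one
# that belongs to FS, assuming snapshots are named FS@YYYY-MM-DD.
latest_snapshot() {
    while IFS= read -r snap; do
        case $snap in
            "$1"@*) echo "$snap" ;;    # keep only snapshots of FS
        esac
    done | LC_ALL=C sort | sed -n '$p'  # lexical sort, print last line
}

# Demo with made-up snapshot names; on a real system the list could come
# from something like: zfs list -H -o name -t snapshot
printf '%s\n' \
    cube/builds/onnv-clone@2005-12-13 \
    cube/builds/onnv-clone@2005-12-15 \
    cube/builds/darrenm@2005-12-20 \
    cube/builds/onnv-clone@2005-12-14 |
    latest_snapshot cube/builds/onnv-clone
# prints: cube/builds/onnv-clone@2005-12-15
```

The result could then be handed to zfs clone in place of the ${CLONE_FS}@${CLONE_SNAP} built from date.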

All of this was really simple and pretty obvious to implement once you know RBAC and ZFS. The bit that caught me out, though, was the .zfs/snapshot directory, but I was glad that the ZFS team had made that a tunable. I now have a real use case for it.

Thanks, ZFS team, for a great file system. It makes my life as the admin of these two build machines not just easier but actually fun!


The scripts and RBAC configuration above are released under the CDDL. If you copy them out of this blog you MUST place the CDDL header, which can be found here, into them. It is only missing from here to make the article easier to read.



Posted Dec 15 2005, 02:27:51 PM GMT

Comments:

With respect to setting the path, it turns out not to be that necessary, because the script is only running with file_chown, which really only chown(1) or a clone of it can usefully use anyway. The script isn't setuid and doesn't get euid=0 either; it runs as the user. The zfs(1M) command runs as euid=0 as a result of the pfexec, but in my RBAC config I'm explicitly giving the users the "ZFS File System Management" profile by including it in the "ZFS Builds" profile. Using pfexec gives euid=0 to exactly /sbin/zfs and nothing else. So no, I don't believe there is a security bug or a huge need to set the path.

Posted by Darren J Moffat on December 16, 2005 at 04:07 PM GMT #

Should the script not set the PATH for it to be secure?

Posted by Chris Gerhard on December 16, 2005 at 05:00 PM GMT #

This is a personal weblog, I do not speak for my employer.