Lessons learnt with OpenSolaris 2008.11
A couple of weeks back, for some weird reason, my desktop (running OpenSolaris 2008.11) failed to boot successfully. After some digging through the boot log files, it turned out that the hald daemon would not start, and this failed service was causing other dependent services to fail as well. I was charging my Palm Treo cell phone over USB from this workstation before it happened. I am not sure if that had anything to do with it. Once I recover my system, I will need to try it again to see if that is what kept the system from booting up successfully.
Since this is my primary workstation, I decided to quickly restore it using some of the tools provided by OpenSolaris 2008.11. I hope someone wanting to recover their installation finds this useful.
One of the advantages of OpenSolaris 2008.11 is that it uses the power of ZFS intrinsically to give us the ability to take a snapshot and roll back to the previous state, if necessary. For example, the OpenSolaris Boot Environment Manager (beadm(1M)) and the Image Packaging System tool (pkg(1M)) instantly take a ZFS snapshot of your boot environment and installation files after every successful operation. This allows us to roll back to the snapshot if we screw something up and want to restore the previous state.
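The same idea can be used by hand. Here is a minimal sketch, assuming the dataset is the home directory dataset that shows up later in this post and using a made-up snapshot label; the helper function names are mine and are not part of any OpenSolaris tool:

```shell
#!/bin/sh
# Sketch only: snapshot a dataset before a risky change, roll back if needed.
# DATASET is the home dataset used later in this post; LABEL is a made-up example.
DATASET="rpool/export/home/sriramn"
LABEL="before-change"

# Build the full snapshot name (dataset@label).
snapshot_name() {
    printf '%s@%s\n' "$1" "$2"
}

# Take the snapshot of dataset $1 with label $2.
take_snapshot() {
    pfexec zfs snapshot "$(snapshot_name "$1" "$2")"
}

# Discard everything written to dataset $1 since snapshot $2 was taken.
# Note: zfs rollback only goes back to the most recent snapshot unless -r
# is given, and -r destroys the snapshots taken after the target.
roll_back() {
    pfexec zfs rollback "$(snapshot_name "$1" "$2")"
}

# Usage (uncomment to run on a real system):
# take_snapshot "$DATASET" "$LABEL"
# ... make the risky change ...
# roll_back "$DATASET" "$LABEL"
```

Nothing runs until the usage lines are uncommented, so the names can be adjusted first.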
Now, in my case, I booted my system in single user mode and checked out the list of snapshots available for my boot environment by running the following command:
sriramn@sriramn:~$ pfexec beadm list -s
(-s tells the beadm command to provide the list of snapshots. I was doing all this in single user mode, so I haven't written down the output here.)
Now, in my case, I was not able to use those snapshots because I had upgraded my zpool to the version used by a newer kernel. If I hadn't upgraded my zpool, I could have simply done something like
sriramn@sriramn:~$ pfexec beadm create -e opensolaris@install opensolaris-1
sriramn@sriramn:~$ pfexec beadm activate opensolaris-1
sriramn@sriramn:~$ pfexec init 6
Well, in any case, this is what I had to do to recover my system:
- Booted my system using the OpenSolaris 2008.11 LiveCD. Unfortunately, the LiveCD-based installer does not allow one to do a reinstallation. So, I had to back up my home directory (using zfs send along with zfs receive) before proceeding to install.
- Opened a command line terminal window to back up my home directory to an external hard disk by doing something like
sriramn@sriramn:~$ pfexec zpool import -f rpool
(this is where my corrupted OpenSolaris installation, along with my home directory, exists)
sriramn@sriramn:~$ pfexec zfs list
(the above command should list all the datasets available in the pool. You could use this information to determine which one to back up.)
NAME USED AVAIL REFER MOUNTPOINT
sriramn@sriramn:~$ pfexec zfs snapshot rpool/export/home/sriramn@jan09
sriramn@sriramn:~$ pfexec zfs send rpool/export/home/sriramn@jan09 | gzip > /mnt/backupfile.gz
- Finally, I had to start a new installation by clicking on the 'Install Now' icon. Once the system came up successfully, I simply recovered my old data by running the following command
<my external hard disk is mounted as /mnt, where I have saved a ZFS stream of my home directory>
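The restore command itself is not captured in this post. As a rough sketch of what such a restore could look like, assuming the stream was saved as /mnt/backupfile.gz (as in the backup step) and that it is received into a new dataset whose name, rpool/export/home/sriramn-restored, is purely hypothetical:

```shell
#!/bin/sh
# Sketch only: restore a gzip-compressed ZFS stream into a new dataset.
# BACKUP matches the file written in the backup step; TARGET is a hypothetical
# dataset name that must not exist yet on the freshly installed pool.
BACKUP="/mnt/backupfile.gz"
TARGET="rpool/export/home/sriramn-restored"

restore_home() {
    backup="$1"
    target="$2"
    # Refuse to continue if the backup file is missing or not valid gzip data.
    gzip -t "$backup" || return 1
    # Decompress the stream and feed it to zfs receive, which recreates the
    # dataset together with the snapshot that was sent.
    gunzip -c "$backup" | pfexec zfs receive "$target"
}

# Usage (uncomment on the newly installed system):
# restore_home "$BACKUP" "$TARGET" && pfexec zfs list -r rpool/export/home
```

Receiving into a fresh dataset avoids overwriting the home directory that the new installation created. The files can then be copied across, or the dataset renamed, once the contents have been checked.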
I thought I would capture some of the lessons that I learnt during this recovery experience:
- Think twice before running 'zpool upgrade' on your OpenSolaris 200[8-9].xx based desktop. Why?
- Within OpenSolaris, invoking the zpool upgrade command instantly converts your disk pool to the newer version of ZFS, thereby enabling the newer ZFS features. However, an older version of the OpenSolaris kernel will not be able to read your data any more.
- If you would like to know what is in each version of ZFS, you could simply run the following command from the command line
- pfexec /usr/sbin/zpool upgrade -v
- For example, with OpenSolaris 2008.11 (build 101b), the released ZFS version is 13, and with the latest released Nevada build 106, the ZFS version is 14. Now, let us say you update your system to run Nevada build 106 and then explicitly invoke zpool upgrade. You will no longer be able to use your OpenSolaris 2008.11 LiveCD to access your home directory!
- Take regular snapshots of your file systems as well as your boot environment because, unlike on other platforms, snapshots on ZFS don't cost much in space and the benefits are huge. The OpenSolaris installer automatically creates an '@install' snapshot of the boot environment as well as of the files created as part of an installation. This is very cool. It allows us to revert to this snapshot any time something gets messed up.
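To make the first lesson concrete, here is a small sketch of a check one could run before upgrading a pool. It assumes the usual NAME/PROPERTY/VALUE/SOURCE output layout of zpool get, and it uses the version numbers quoted above (13 for the 2008.11 LiveCD, 14 for build 106). The function names are my own, not part of any ZFS tool:

```shell
#!/bin/sh
# Sketch only: check whether upgrading a pool would lock out the rescue media.
# RESCUE_VERSION is the ZFS version the LiveCD understands; 13 is the value
# this post gives for OpenSolaris 2008.11 (build 101b).
POOL="rpool"
RESCUE_VERSION=13

# Print the current on-disk version of pool $1.
# Assumes the usual NAME/PROPERTY/VALUE/SOURCE layout of `zpool get`.
pool_version() {
    zpool get version "$1" | awk 'NR == 2 { print $3 }'
}

# Succeed only if a pool at version $1 stays readable by media that
# supports versions up to $2.
readable_by_rescue_media() {
    [ "$1" -le "$2" ]
}

# Usage (uncomment on a real system). TARGET_VERSION is the version the
# running kernel would upgrade to, as reported by `zpool upgrade -v`.
# TARGET_VERSION=14
# echo "current pool version: $(pool_version "$POOL")"
# if readable_by_rescue_media "$TARGET_VERSION" "$RESCUE_VERSION"; then
#     pfexec zpool upgrade "$POOL"
# else
#     echo "skipping upgrade: LiveCD only understands version $RESCUE_VERSION"
# fi
```

With the numbers from this post, the check refuses the upgrade, which is exactly the situation I wish I had noticed beforehand.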


I ran into the zpool upgrade issue and I have a question. If you have more than one boot image in the same zpool, and you upgrade the pool and then run update_grub, are both images still bootable, or just the image you ran update_grub in?
Posted by martyduffy on February 10, 2009 at 01:59 PM PST #
Let's say you have something like this:
rpool/ROOT/opensolaris (say running on build 101)
and
rpool/ROOT/opensolaris-1 (say with build 106)
where you have 1 pool but more than 1 OS image.
Now, if you run the zpool upgrade command, your zpool (in this case "rpool") will get upgraded to the ZFS version of the currently running kernel.
For example, after a pkg image-update to build 106 and a reboot, if you run zpool upgrade, you won't be able to go back to build 101!
Now, if you just did pkg image-update but did not reboot (so your kernel is still running build 101) and you run zpool upgrade, then you actually didn't change anything, because the pool is already at the version that kernel supports.
Posted by Sriram Natarajan on February 10, 2009 at 02:22 PM PST #