Thursday March 29, 2007
For the last while, I've been helping with the testing of the recently announced ZFS Boot bits that Lin putback yesterday.
We've got the regression testing on these bits completed - these changes don't break existing ZFS functionality, and we've validated that the basic functionality of ZFS bootable datasets works as designed.
I'm now looking at some additional tests for these bits, trying to boot mirrors with missing/detached disks, that sort of thing. (This week, I brought up a Thumper with root on a 47-way mirrored pool! :-)
As with the ZFS Mountroot bits before, I thought that writing a script to automate the install of these bits would be pretty useful while we don't yet have full ZFS support in the installer. Here it is: zfs-actual-root-install.sh.
This is how you use it:
root@usuki[88] ./zfs-actual-root-install.sh --help
Usage : zfs-actual-root-install.sh [options to pass to zpool]
eg. ./zfs-actual-root-install.sh mirror c0t0d0s0 c0t0d1s0
You need to be running a fresh install of at least snv_50 (with a BFU of Lin's zfsboot bits) for this to work.
Note also, you must supply a disk using slice notation: we need SMI labels to boot, whereas "zpool create c0t0d0" would use EFI labels.
Only single disks, or mirrors are supported. No stripes or raidz please.
If you set the environment variable $ROOT_FS, we use that as the root filesystem.
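Since the slice-notation requirement is easy to trip over, here is a minimal sketch of a pre-flight check you could run on the device names before handing them to the script. The function name uses_slice_notation is hypothetical and is not part of zfs-actual-root-install.sh; it only looks at the shape of the name (a trailing s0 to s15 is assumed to mean a slice) and does not inspect the label actually on the disk.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of zfs-actual-root-install.sh.
# Returns 0 if the device name ends in a slice suffix (assumed s0..s15),
# e.g. c0t0d0s0, and 1 for a whole-disk name such as c0t0d0.
uses_slice_notation() {
    case "$1" in
        *d[0-9]*s[0-9]|*d[0-9]*s1[0-5]) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: check the arguments you intend to pass on to zpool.
for dev in mirror c0t0d0s0 c0t0d1; do
    # Skip the "mirror" keyword, it is not a device name.
    [ "$dev" = "mirror" ] && continue
    if uses_slice_notation "$dev"; then
        echo "$dev: slice notation"
    else
        echo "$dev: no slice suffix, zpool would put an EFI label on it"
    fi
done
```

This is only a name check; a disk that was previously given to ZFS whole still needs relabelling as described next.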
As mentioned above, ZFS root boot only works with SMI labeled disks - if you've ever given ZFS the entire disk before, it'll have put an EFI label on the disk, so you need to remove that using fdisk, then rewrite the label using format, or fmthard. Not too scary - here's me having just changed the disk type:
             Total disk size is 8924 cylinders
             Cylinder size is 16065 (512 byte) blocks

                                               Cylinders
      Partition   Status    Type          Start   End   Length    %
      =========   ======    ============  =====   ===   ======   ===
          1       Active    Solaris2          1  8923    8923    100
SELECT ONE OF THE FOLLOWING:
1. Create a partition
2. Specify the active partition
3. Delete a partition
4. Change between Solaris and Solaris2 Partition IDs
5. Exit (update disk configuration and exit)
6. Cancel (exit without updating disk configuration)
Enter Selection: 5
format> l
[0] SMI Label
[1] EFI Label
Specify Label type[1]: 0
Warning: This disk has an EFI label. Changing to SMI label will erase all
current partitions.
Continue? y
Auto configuration via format.dat[no]?
Auto configuration via generic SCSI-2[no]?
Here's the script in action:
root@usuki[92] ./zfs-actual-root-install.sh mirror c2t0d0s0 c2t1d0s0
Updating vfstab on UFS root
Starting to copy data from UFS root to /zfsroot - this may take some time.
. . .
10576640 blocks
. .
There's a copy of the old UFS root in /zfsroot/etc/vfstab.old-ufs-root
diffs are new vs. old :
6a7
> /dev/dsk/c0d0s0 /dev/rdsk/c0d0s0 / ufs 1 no -
12,13c13
< rootpool/rootfs - / zfs - no -
< /dev/dsk/c0d0s0 /dev/rdsk/c0d0s0 /ufsroot ufs - yes -
---
> rootpool/rootfs - /zfsroot zfs - yes -
Creating ram disk for /zfsroot
updating /zfsroot/platform/i86pc/amd64/boot_archive...this may take a minute
updating /zfsroot/platform/i86pc/boot_archive...this may take a minute
Installing grub on /dev/rdsk/c2t0d0s0
stage1 written to partition 0 sector 0 (abs 16065)
stage2 written to partition 0, 260 sectors starting at 50 (abs 16115)
Installing grub on /dev/rdsk/c2t1d0s0
stage1 written to partition 0 sector 0 (abs 16065)
stage2 written to partition 0, 260 sectors starting at 50 (abs 16115)
Okay, assuming we haven't broken anything, when you next reboot, you should be able to select a grub menu entry for ZFS on root! Remember to report anything suspicious via bugster or zfs-discuss@opensolaris.org.
If your boot device has changed because of this, remember to change your bios settings. (you should now boot from /dev/dsk/c2t0d0s0 /dev/dsk/c2t1d0s0)
And finally, here's me booting with the new root:
# df -h /
Filesystem             size   used  avail capacity  Mounted on
rootpool/rootfs         67G   4.6G    62G     7%    /
# zfs list
NAME              USED  AVAIL  REFER  MOUNTPOINT
rootpool         4.57G  62.4G    24K  /rootpool
rootpool/rootfs  4.57G  62.4G  4.57G  legacy
# zpool status -v
  pool: rootpool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rootpool      ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c2t0d0s0  ONLINE       0     0     0
            c2t1d0s0  ONLINE       0     0     0

errors: No known data errors
#
You can install a root pool to a slice that isn't slice 0, but in that case, the script won't work this out, and will run installgrub to that slice - if that's the case, you should manually run the installgrub command to put the new ZFS-capable grub on whatever your boot device is.
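For the manual case, a rough sketch of what that installgrub invocation looks like is below. It only prints the command so you can check it before running it yourself. The function name installgrub_cmd and the slice c2t0d0s3 are made up for illustration, and the assumption that the ZFS-capable stage1 and stage2 files live under /boot/grub may not match your system (pass a different directory as the second argument if not).

```shell
#!/bin/sh
# Sketch: print (not run) the installgrub command for a given boot slice.
# Assumptions, not taken from the script itself:
#   - stage1/stage2 are under /boot/grub unless another directory is given
#   - the slice name below is just an example
installgrub_cmd() {
    slice="$1"                    # e.g. c2t0d0s3
    grubdir="${2:-/boot/grub}"    # directory holding stage1 and stage2
    echo "installgrub $grubdir/stage1 $grubdir/stage2 /dev/rdsk/$slice"
}

# Show the command for a root pool living on slice 3 of c2t0d0.
installgrub_cmd c2t0d0s3
```

For a mirror you'd repeat this once per disk, as the script output above does for its two slices.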
One other thing to watch out for is that if you're BFUing development archives, you might run into 6528202 - so take a copy of /boot/platform/i86pc/kernel/unix before you BFU! If you're happy to wait for a full install of nv_62 or later, then you don't need to worry about this step.
I've said it before, but having ZFS on your root filesystem is just completely awesome - being able to incrementally back up, snapshot and roll back your root filesystem really is amazingly useful. I wrote mountrootadm to help out even more. Of course, eventually I suspect LiveUpgrade will handle all this for you, but in the meantime this does the trick.
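To make that a bit more concrete, here's a small sketch that prints the sort of zfs commands involved, using the rootpool/rootfs dataset from the output above. It deliberately echoes the commands rather than running them. The function name, the snapshot labels and the /backup path are invented for the example, and this is not what mountrootadm does.

```shell
#!/bin/sh
# Sketch only: prints zfs commands for a root dataset, does not run them.
# Snapshot labels and the /backup path are made-up examples.
root_snapshot_cmds() {
    ds="$1"      # dataset, e.g. rootpool/rootfs
    prev="$2"    # label of an existing, older snapshot
    new="$3"     # label for the snapshot to take now

    # Take a new snapshot of the root filesystem.
    echo "zfs snapshot $ds@$new"
    # Incremental backup: only the changes between the two snapshots.
    echo "zfs send -i $ds@$prev $ds@$new > /backup/rootfs-$new.zfs"
    # Later, discard everything changed since the new snapshot.
    echo "zfs rollback $ds@$new"
}

root_snapshot_cmds rootpool/rootfs before-bfu after-bfu
```

Printing first and pasting afterwards is a cautious choice here, since a rollback throws away whatever changed after the snapshot.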
Let me know if you've any thoughts or comments about the script. Happy rumbling!
Update: Bart quite rightly pointed out to us that there's no need for all that mucking around with failsafe-boot in order to reconstruct the /dev and /devices filesystems. Much easier and faster is:
mkdir -p /zfs-root-tmp.$$
mount -F lofs -o nosub / /zfs-root-tmp.$$
(cd /zfs-root-tmp.$$; tar cvf - devices dev ) | (cd /zfsroot; tar xvf -)
umount /zfs-root-tmp.$$
rm -rf /zfs-root-tmp.$$
So I've updated the post above to change that, fixed the script and tested it - works just fine. Thanks Bart!
Update: Lin pointed out a typo in the create_dirs script where /tmp was being given the wrong permissions, so I've fixed that in this version of the script too.
Update: we were wrong about it being a typo. Normal service resuming...
(2007-03-29 06:32:47.0) Permalink Comments [9]
Posted by Dick Davies on March 30, 2007 at 01:31 AM IST #
Nice script, but I believe you missed a Note in #(4) Populate the UFS root content to the ZFS root filesystem:
Copy all of the files in the UFS root filesystem to the newly created ZFS root filesystem. The following command does this without crossing mountpoints. This command will take on the order of 30 minutes, give or take. Note, this will not cross mountpoints, if /usr, /var, or other filesystems are on other mountpoints, they will need to be copied over following this command.
# cd /
# find . -xdev -depth -print | cpio -pvdm /zfsroot
I have modified your script to copy /usr, /var. I also create zfs filesystems for them. If you like, I can email my modified script.
Ron Halstead
Posted by Ron Halstead on April 24, 2007 at 06:05 PM IST #
Thanks for pointing that out Ron - you're right about the script not crossing mountpoints, so users beware!
I guess I could check in /etc/vfstab to see if /var and /usr are separate filesystems, but where do I end - should I check /opt, /usr/local and all other mounted filesystems? There's an interesting thread on zfs-discuss about this idea, started from Lori's blog post - now that we can easily carve up the filesystem namespace, where should we start? All good things to keep in mind!
Posted by Tim Foster on April 26, 2007 at 09:32 PM IST #
Posted by Nicolas Linkert on June 02, 2007 at 10:25 PM IST #
Hi Nicolas, Glad you're finding zfs-boot useful! I see you asked zfs-discuss about this too - I'll just point to my reply there - hope this helps ?
Posted by Tim Foster on June 03, 2007 at 04:17 PM IST #
Posted by ylon on June 07, 2007 at 08:17 PM IST #
Posted by Tim Foster on June 11, 2007 at 12:14 AM IST #
gosh! i felt so crazy doing diz one
Posted by marjorie on March 08, 2008 at 07:45 AM GMT #
Tim,
You might want to add some checking for the dlmgmt bug that causes issues with the /etc/.dlmgmt_door object. To get your script to work, I had to do 'svcadm disable datalink-management' before feeding the script my slices. Upon rebooting, the system drops to maintenance mode since the datalink service is running, but the service can be re-enabled from the console. Without doing this, the datalink service tries to create the door file before /etc has become writeable. There is a bug for this that is expected to be fixed in nv86 I think?
Posted by Blake on March 19, 2008 at 01:33 AM GMT #