ElHam is a filesystem testing tool designed to detect corruption, be multiprotocol, and stress a filesystem. It isn't designed to be a benchmark. I'm in the middle of debugging a really nasty NFSv4 bug (read that to mean we haven't a clue as to what is going on or how to reproduce it), and I need to generate sufficient load on a test system.
So I went and got ElHam from SourceForge.net. I wrote it when I was at NetApp as a tool we could use internally to get multiprotocol lock testing, generate metadir traffic, and to hand out to customers for corruption testing. As such, we stuck a BSD license on it and hung it off of SourceForge.net.
It still needs work - for example, I figured out that it wasn't detecting big-endianness. I also have to make a pass through it and make sure that I capture all returns from function calls and check that they are valid. One of the things you need for corruption testing is early detection of problems.
Sometimes in trying to detect corruption, you can get a false positive because of client side caching. If your focus is strictly on the server, i.e., you are testing a filer, that is bad. So you might be tempted to turn off client side caching. It also appears to go faster, but again, ElHam is not designed to be a benchmark.
The other evil with turning off client side caching is that it effectively negates both locking in general and NFSv4 delegations. ElHam is designed to have multiple readers and writers, both local and remote, changing files in a directory tree. Client side caching issues are something it should have to live with.
Anyway, multiple instances (from different architectures and OSes) are possible because ElHam records what is supposed to be in every data block. So when another instance comes along, it is able to compute what should be in the data block and then it can see if the on-disk image is corrupt. I need to write a small application to inject corruption - this will help me get signatures to show people what ElHam has detected.
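To make the idea concrete, here is a minimal Python sketch of deriving a block's contents from its identity, so that any instance on any architecture can recompute and compare. This is not ElHam's actual on-disk format: the block size, the seed fields (file_id, block_no, generation), and the SHA-256 expansion are all assumptions made for illustration.

```python
import hashlib
import struct

BLOCK_SIZE = 4096  # hypothetical block size, not taken from ElHam


def expected_block(file_id: int, block_no: int, generation: int) -> bytes:
    """Derive a block's contents purely from its identity."""
    # Pack with an explicit byte order ("<" = little-endian) so that
    # big- and little-endian hosts derive the same seed.
    seed = struct.pack("<QQQ", file_id, block_no, generation)
    out = bytearray()
    counter = 0
    while len(out) < BLOCK_SIZE:
        out += hashlib.sha256(seed + struct.pack("<I", counter)).digest()
        counter += 1
    return bytes(out[:BLOCK_SIZE])


def first_corrupt_offset(data: bytes, file_id: int, block_no: int, generation: int):
    """Return the first offset where data departs from the expected block, or None."""
    want = expected_block(file_id, block_no, generation)
    for i, (a, b) in enumerate(zip(data, want)):
        if a != b:
            return i
    if len(data) != len(want):
        # Short or long block: the mismatch starts where the shorter one ends.
        return min(len(data), len(want))
    return None
```

Fixing the byte order in the seed is the point of the sketch: a checker on a big-endian host has to compute the same bytes as a writer on a little-endian one.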
The current big issue is that ElHam is designed to push a filesystem to capacity and back off. I.e., reads and writes in the face of a full filesystem are interesting. To aid in that testing, it is best that the 'data', 'meta', and 'history' (see ElHam docs) directories each be on a different filesystem. Well yesterday I had all three on the same filesystem and it got full. So I'm trying to reproduce that and see what is happening.
A really neat way to do this is to use ZFS to create different filesystems and then set quotas to control how much space each filesystem is allowed:
# zfs create zoo/elham
# zfs set sharenfs=on zoo/elham
# zfs create zoo/elham/data
# zfs create zoo/elham/meta
# zfs create zoo/elham/history
# zfs list zoo/elham/*
NAME               USED   AVAIL  REFER  MOUNTPOINT
zoo/elham/data     36.7K  654G   36.7K  /zoo/elham/data
zoo/elham/history  36.7K  654G   36.7K  /zoo/elham/history
zoo/elham/meta     36.7K  654G   36.7K  /zoo/elham/meta
# zfs set quota=2G zoo/elham/data
# zfs set quota=20G zoo/elham/meta
# zfs set quota=20G zoo/elham/history
# zfs list zoo/elham/*
NAME               USED   AVAIL  REFER  MOUNTPOINT
zoo/elham/data     36.7K  2.00G  36.7K  /zoo/elham/data
zoo/elham/history  36.7K  20.0G  36.7K  /zoo/elham/history
zoo/elham/meta     36.7K  20.0G  36.7K  /zoo/elham/meta
Note that I give the 'history' and 'meta' filesystems much more of a quota. I don't want to run out of space on them.
I'm going to kick off several instances of ElHam and see if I can fill this puppy up.
Okay, I just took the machine which has been running Fedora Core 4 for the longest time and installed Solaris Nevada b56 on it. And I had one of the most painful experiences ever with Solaris. The install went fine, but when it came up, GRUB dropped to a command line prompt and gave out:
error 17, cannot mount selected partition
When pushed with a 'cat /', it would also mention that it did not like partition type 0xbf.
I tried everything: rebooting from the DVD, dropping into single user mode, reinstalling GRUB, etc. No luck.
I thought it was my BIOS, so I kept on changing the boot device. But that didn't make sense - it was at least booting into GRUB. In retrospect, it does make sense. The BIOS would get the hard drive to boot, but GRUB had no idea about the very same hard drive that it was on.
Okay, I noticed that when I was booting in single user mode, and when the BIOS was reporting the hard drives, the single hard drive was on the 2nd IDE channel. I.e., it was /dev/dsk/c1d0s0. I checked /etc/vfstab, and it was slated to read from there.
I finally got mad enough and swapped the IDE cables - this took 10-15 minutes because the cables in my Shuttle SS51G are tight and I had to pull out the drive cage. Anyway, when I rebooted, I did get farther. It would go through the GRUB menu and reboot.
I got in single user mode and fixed up /etc/vfstab to use /dev/dsk/c0d0s0. Still no luck. A quick search turned up this goldmine: Swapping drives between Solaris machines. Okay, it wasn't as quick as I wanted, I had to go through several pages first. Anyway, I had suspected I had to touch 'devfsadm' and 'bootadm'. I was right.
I followed the instructions:
The system came up.
Can I get a big "Doh!" from the crowd? I'm trying to upgrade my domain server from Fedora Core 4 to Fedora Core 6. I want to isolate what I will need to change to go to Solaris. Everything is kinda going okay, network addresses did not change after a reboot.
But sendmail is queuing my outgoing mail and not logging anything. It was actually telling me what was going wrong, but the verbiage was just too weird.
I make my changes to sendmail.mc and run make:
[root@adept mail]# make
WARNING: 'sendmail.mc' is modified. Please install package sendmail-cf to update your configuration.
This actually means do the following:
[tdh@adept doc]> sudo yum install sendmail-cf
I just couldn't parse it correctly. Here is how I found my "Doh!" moment:
The mail queues have entries:
[root@adept mail]# mailq
/var/spool/mqueue (4 requests)
-----Q-ID----- --Size-- -----Q-Time----- ------------Sender/Recipient-----------
l1J0feON002810* 9 Sun Feb 18 18:41 <root@adept.internal.excfb.com>
<tdh@sun.com>
...
Some testing:
[root@adept mail]# sendmail -v loghyr@loghyr.com
kdjfjklfs
.
loghyr@loghyr.com... Connecting to [127.0.0.1] via relay...
220 adept.internal.excfb.com ESMTP Sendmail 8.13.8/8.13.8; Sun, 18 Feb 2007 18:52:31 -0600
>>> EHLO adept.internal.excfb.com
250-adept.internal.excfb.com Hello [127.0.0.1], pleased to meet you
250-ENHANCEDSTATUSCODES
250-PIPELINING
250-8BITMIME
250-SIZE
250-DSN
250-ETRN
250-AUTH DIGEST-MD5 CRAM-MD5
250-DELIVERBY
250 HELP
>>> MAIL From:<root@adept.internal.excfb.com> SIZE=10 AUTH=root@adept.internal.excfb.com
250 2.1.0 <root@adept.internal.excfb.com>... Sender ok
>>> RCPT To:<loghyr@loghyr.com>
>>> DATA
250 2.1.5 <loghyr@loghyr.com>... Recipient ok
354 Enter mail, end with "." on a line by itself
>>> .
250 2.0.0 l1J0qV2d002864 Message accepted for delivery
loghyr@loghyr.com... Sent (l1J0qV2d002864 Message accepted for delivery)
Closing connection to [127.0.0.1]
>>> QUIT
221 2.0.0 adept.internal.excfb.com closing connection
Note that it is talking to 127.0.0.1 and that is not right. What do the sendmail config files look like:
[root@adept mail]# ls -la send*
-rw-r--r-- 1 root root 58203 Feb 11 10:58 sendmail.cf
-rw-r--r-- 1 root root 7257 Feb 18 17:29 sendmail.mc
-rw-r--r-- 1 root root 7209 Feb 18 17:19 sendmail.mc.stock
Okay, that hasn't changed today.
[root@adept mail]# make
WARNING: 'sendmail.mc' is modified. Please install package sendmail-cf to update your configuration.
I then get the "Doh!" and install sendmail-cf as shown above!
[root@adept mail]# make
[root@adept mail]# ls -la send*
-rw-r--r-- 1 root root 59161 Feb 18 18:54 sendmail.cf
-rw-r--r-- 1 root root 58203 Feb 11 10:58 sendmail.cf.bak
-rw-r--r-- 1 root root 7257 Feb 18 17:29 sendmail.mc
-rw-r--r-- 1 root root 7209 Feb 18 17:19 sendmail.mc.stock
[root@adept mail]# service sendmail restart
Shutting down sm-client: [ OK ]
Shutting down sendmail: [ OK ]
Starting sendmail: [ OK ]
Starting sm-client: [ OK ]
Still not delivering, and I am suspicious about why it is trying to talk to domains directly:
l1J0o9AP002853 33 Sun Feb 18 18:50 <tdh@adept.internal.excfb.com>
(Deferred: Connection timed out with www.loghyr.com.)
<loghyr@loghyr.com>
I have to send outgoing mail through cox.net. Look what I have in my sendmail.mc:
[root@adept mail]# grep cox.net sendmail.mc
dnl define(`SMART_HOST', `smtp.central.cox.net')dnl
Bzzt, fix it!
And that flushes a bunch of requests after a make and restart!
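The catch in that grep output is the leading dnl: m4 discards everything from dnl to the end of the line, so the SMART_HOST define was effectively commented out. A minimal Python sketch of a check for that follows. The function name and the regular expression are my own for illustration, and the parsing is deliberately naive (it would be confused by a value that itself contains the letters dnl).

```python
import re

# Matches define(`NAME', `value') as written in sendmail.mc.
DEFINE_RE = re.compile(r"define\(`(?P<name>[A-Z_]+)',\s*`(?P<value>[^']*)'\)")


def active_defines(mc_text: str) -> dict:
    """Return the define() settings that m4 would actually see."""
    found = {}
    for line in mc_text.splitlines():
        # Keep only the text before the first dnl; the rest is discarded
        # by m4. A line that starts with dnl therefore contributes nothing.
        live = line.split("dnl", 1)[0]
        match = DEFINE_RE.search(live)
        if match:
            found[match.group("name")] = match.group("value")
    return found
```

Running it over sendmail.mc before the make would have shown no SMART_HOST entry at all, which matches sendmail trying to reach www.loghyr.com directly.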
I had originally implemented the In Kernel Sharetab with GFS and followed the requirement that everything had to be in a directory. As such, I made /system/dfs/sharetab and symlinked /etc/dfs/sharetab to it. Well, while I believe that is really the proper place for it to be at in the name space, I decided to hack GFS to allow me to have a single file as a filesystem - think /etc/mnttab. Between a lab shutdown yesterday and redoing the entire set of changes on kanigix and the latest OpenSolaris drop, I've got the thing working:
[tdh@kanigix dfs]> ls -la
total 17
drwxr-xr-x 2 root sys 512 Feb 13 07:16 .
drwxr-xr-x 88 root sys 4608 Feb 17 14:19 ..
-rw-r--r-- 1 root sys 354 Feb 17 11:46 dfstab
-rw-r--r-- 1 root root 68 Feb 17 12:07 fstypes
-r--r--r-- 1 root root 246 Feb 17 14:22 sharetab
[tdh@kanigix dfs]> cat sharetab
/export/zfs/tdh - nfs rw
/export/zfs/monster - nfs rw
/export/zfs/nfsv4 - nfs rw
/export/zfs/nfsv2 - nfs rw
/ - nfs rw
/zoo/isos - nfs rw
/export/zfs/nfsv3 - nfs rw
/export/home - nfs sec=sys,rw=engineering home dirs
/export/zfs - nfs rw
What I don't have working just right is the attribute changes:
[tdh@kanigix dfs]> sudo unshare -F nfs /
[tdh@kanigix dfs]> ls -la
total 17
drwxr-xr-x 2 root sys 512 Feb 13 07:16 .
drwxr-xr-x 88 root sys 4608 Feb 17 14:19 ..
-rw-r--r-- 1 root sys 354 Feb 17 11:46 dfstab
-rw-r--r-- 1 root root 68 Feb 17 12:07 fstypes
-r--r--r-- 1 root root 246 Feb 17 14:34 sharetab
The size and time will not change until I read the file:
[tdh@kanigix dfs]> cat sharetab
/export/zfs/tdh - nfs rw
/export/zfs/monster - nfs rw
/export/zfs/nfsv4 - nfs rw
/export/zfs/nfsv2 - nfs rw
/zoo/isos - nfs rw
/export/zfs/nfsv3 - nfs rw
/export/home - nfs sec=sys,rw=engineering home dirs
/export/zfs - nfs rw
[tdh@kanigix dfs]> ls -la sharetab
-r--r--r-- 1 root root 234 Feb 17 14:34 sharetab
I can easily fix that. Instead of recompiling the BFUs, I'm just going to rebuild 'sharefs'. Note if I could unload this module, I could do all of this without rebooting:
[tdh@kanigix sharefs]> pwd
/home/tdh/ws/kanigix/usr/src/uts/common/fs/sharefs
[tdh@kanigix sharefs]> cd ../../../intel/sharefs
[tdh@kanigix sharefs]> dmake
dmake: defaulting to parallel mode.
See the man page dmake(1) for more information on setting up the .dmakerc file.
kanigix --> 1 job
...
To see what I need to get into /kernel, a 'dmake install' will tell me a lot:
[tdh@kanigix sharefs]> dmake install
dmake: defaulting to parallel mode.
See the man page dmake(1) for more information on setting up the .dmakerc file.
/usr/bin/rm -f /home/tdh/ws/kanigix/proto/root_i386/kernel/fs/amd64/sharefs; install -s -m 755 -f /home/tdh/ws/kanigix/proto/root_i386/kernel/fs/amd64 debug64/sharefs
/usr/bin/rm -f /home/tdh/ws/kanigix/proto/root_i386/kernel/fs/sharefs; install -s -m 755 -f /home/tdh/ws/kanigix/proto/root_i386/kernel/fs debug32/sharefs
[tdh@kanigix sharefs]> sudo cp /home/tdh/ws/kanigix/proto/root_i386/kernel/fs/amd64/sharefs /kernel/fs/amd64/sharefs
[tdh@kanigix sharefs]> sudo cp /home/tdh/ws/kanigix/proto/root_i386/kernel/fs/sharefs /kernel/fs/sharefs
Now I reboot and test!
The attributes are now following the changes:
[tdh@kanigix dfs]> ls -la sharetab
-r--r--r-- 1 root root 182 Feb 17 14:43 sharetab
[tdh@kanigix dfs]> sudo share -F nfs /
[tdh@kanigix dfs]> ls -la sharetab
-r--r--r-- 1 root root 194 Feb 17 14:46 sharetab
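For anyone wondering what the symptom looked like in the abstract, here is a toy Python model of it. It is not the sharefs code and it does not claim to be the fix I made; the class and method names are invented. The assumption it illustrates is that the file contents are a snapshot rebuilt on demand, and that the stale behavior comes from answering attribute queries without refreshing that snapshot first.

```python
class ShareTable:
    """Toy model of an in-kernel share table exposed as a read-only file."""

    def __init__(self):
        self.entries = []
        self.generation = 0         # bumped on every share/unshare
        self._snap_generation = -1  # generation the snapshot was built from
        self._snapshot = b""

    def share(self, path, opts="rw"):
        self.entries.append((path, opts))
        self.generation += 1

    def unshare(self, path):
        self.entries = [e for e in self.entries if e[0] != path]
        self.generation += 1

    def _refresh(self):
        # Rebuild the snapshot only if the table changed since the last build.
        if self._snap_generation != self.generation:
            lines = [f"{p}\t-\tnfs\t{o}\n" for p, o in self.entries]
            self._snapshot = "".join(lines).encode()
            self._snap_generation = self.generation

    def read(self):
        self._refresh()
        return self._snapshot

    def size_stale(self):
        # Buggy variant: reports the last snapshot without refreshing.
        return len(self._snapshot)

    def size(self):
        # Fixed variant: refresh before answering the attribute query.
        self._refresh()
        return len(self._snapshot)
```

With size_stale(), an unshare leaves the reported size unchanged until the next read(), which is the same pattern as the ls output before the rebuild.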
I'm pretty much done with the project. I have to pull out some code changes, tidy things up, do some more unit testing, and ship it off for some quality assurance. I'll check to see if I can get code put up to OpenSolaris.org in case people want to play with it.
I've been pretty busy, so I popped off a download on mrx as a test for "Starting a performance analysis of my Frankenstein vs Sun w2100z". It is currently running Fedora Core 6 (for another project), and I saw it get:
[tdh@mrx ~/]> wget http://mirror.mcs.anl.gov/pub/ubuntu-iso/DVDs/ubuntu/edgy/release/ubuntu-6.10-dvd-i386.iso
0% [] 2,753,773 791K/s eta 1h 47m
I think it is safe to say that the cache at Sun is not helping my w2100z to beat my Frankenstein. Also, it could be that the VPN is imposing a penalty on the w2100z.
My guess is going to go either with the NIC/driver or the harddisk or ZFS. There, I've narrowed it down.
First, let's look at ZFS:
[tdh@kanigix ~]> zfs get all zoo
NAME PROPERTY VALUE SOURCE
zoo type filesystem -
zoo creation Sun Jan 14 14:08 2007 -
zoo used 15.2G -
zoo available 668G -
zoo referenced 39.6K -
zoo compressratio 1.04x -
zoo mounted yes -
zoo quota none default
zoo reservation none default
zoo recordsize 128K default
zoo mountpoint /zoo default
zoo sharenfs off default
zoo shareiscsi off default
zoo checksum on default
zoo compression off default
zoo atime on default
zoo devices on default
zoo exec on default
zoo setuid on default
zoo readonly off default
zoo zoned off default
zoo snapdir hidden default
zoo aclmode groupmask default
zoo aclinherit secure default
zoo canmount on default
zoo xattr on default
So, no compression enabled. Bzzt, we have to dig deeper:
[tdh@kanigix ~]> zfs get all zoo/home
NAME PROPERTY VALUE SOURCE
zoo/home type filesystem -
zoo/home creation Sun Jan 14 14:10 2007 -
zoo/home used 8.48G -
zoo/home available 668G -
zoo/home referenced 44.1K -
zoo/home compressratio 1.08x -
zoo/home mounted yes -
zoo/home quota none default
zoo/home reservation none default
zoo/home recordsize 128K default
zoo/home mountpoint /export/zfs local
zoo/home sharenfs on local
zoo/home shareiscsi off default
zoo/home checksum on default
zoo/home compression on local
zoo/home atime on default
zoo/home devices on default
zoo/home exec on default
zoo/home setuid on default
zoo/home readonly off default
zoo/home zoned off default
zoo/home snapdir hidden default
zoo/home aclmode groupmask default
zoo/home aclinherit secure default
zoo/home canmount on default
zoo/home xattr on default
Okay, before we fiddle with ZFS, let's check to see if we can eliminate it as a suspect:
[tdh@kanigix /kanigix]> df -h .
Filesystem size used avail capacity Mounted on
/dev/dsk/c1d1s4 21G 21M 20G 1% /kanigix
[tdh@kanigix /kanigix]> wget http://mirror.mcs.anl.gov/pub/ubuntu-iso/DVDs/ubuntu/edgy/release/ubuntu-6.10-dvd-i386.iso
0% [ ] 4,213,237 291.40K/s ETA 3:22:18^C
So it does look like a factor! Not really - while I am getting better speeds than the other day, they are on par with the zfs filesystem today:
[tdh@kanigix ~]> df -h .
Filesystem size used avail capacity Mounted on
zoo/home/tdh 683G 8.5G 668G 2% /export/zfs/tdh
[tdh@kanigix ~]> wget http://mirror.mcs.anl.gov/pub/ubuntu-iso/DVDs/ubuntu/edgy/release/ubuntu-6.10-dvd-i386.iso
0% [ ] 2,832,077 323.02K/s ETA 3:22:50^C
I think I've eliminated both the disk and ZFS from being the problem. I think the issue is probably the network card or the driver. I'll have to see if there is a fix for my nge0 problem and then I can try it instead of the rge0.
[tdh@kanigix ~]> ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
rge0: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
inet 192.168.2.115 netmask ffffff00 broadcast 192.168.2.255
I started downloading the Ubuntu DVD image and it said it would take 10 hours:
[tdh@kanigix isos]> wget http://mirror.mcs.anl.gov/pub/ubuntu-iso/DVDs/ubuntu/edgy/release/ubuntu-6.10-dvd-i386.iso
--11:44:04-- http://mirror.mcs.anl.gov/pub/ubuntu-iso/DVDs/ubuntu/edgy/release/ubuntu-6.10-dvd-i386.iso
=> `ubuntu-6.10-dvd-i386.iso'
Resolving mirror.mcs.anl.gov... 146.137.96.7, 146.137.96.15
Connecting to mirror.mcs.anl.gov|146.137.96.7|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3,725,318,144 (3.5G) [application/x-iso9660-image]
0% [ ] 2,078,715 100.13K/s ETA 10:09:22
This is on the desktop I built over the Christmas break. I routinely download the Solaris Nevada DVDs in a much shorter time, so I decided it must be that the server was gating my performance. I wanted to download the DVD to a lab system at work and then download it to my home system. The SWAN DNS looks hosed, so I couldn't do that. Instead, I started a download on my w2100z and was told it would take about 2 hours:
[tdh@warlock ~]> wget http://mirror.mcs.anl.gov/pub/ubuntu-iso/DVDs/ubuntu/edgy/release/ubuntu-6.10-dvd-i386.iso
--11:44:33-- http://mirror.mcs.anl.gov/pub/ubuntu-iso/DVDs/ubuntu/edgy/release/ubuntu-6.10-dvd-i386.iso
=> `ubuntu-6.10-dvd-i386.iso.1'
Resolving webcache.central.sun.com... 129.147.62.26, 129.147.62.30, 129.147.62.25
Connecting to webcache.central.sun.com|129.147.62.26|:8080... connected.
Proxy request sent, awaiting response... 200 OK
Length: 3,725,318,144 (3.5G) [application/x-iso9660-image]
1% [> ] 67,507,355 541.70K/s ETA 1:57:51
Since the two systems share the same link into my house and they are both running close to the same OS, I think they should have the same transfer speeds. I'm wrong. Why?
For kanigix, it could be:
For warlock, it could be:
A rule of thumb I use is that VPN software imposes a 33% penalty on transfers. I could be wrong in this scenario.
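As a sanity check on the numbers, here is a small Python sketch of the arithmetic. It assumes wget's K means 1024 bytes and that the rate stays constant, and it reads the 33% rule of thumb as "a third of the raw throughput is lost", which is only my interpretation. wget computes its ETA from its own running average, so these figures will be close to what it printed rather than identical.

```python
def eta_seconds(total_bytes: int, done_bytes: int, rate_kib_s: float) -> float:
    """Seconds left at a constant rate, taking K as 1024 bytes."""
    return (total_bytes - done_bytes) / (rate_kib_s * 1024)


def rate_without_penalty(observed: float, penalty: float = 0.33) -> float:
    """Rate implied if `penalty` of the raw throughput is lost to the VPN."""
    return observed / (1.0 - penalty)


def hms(seconds: float) -> str:
    """Format a duration in seconds as h:mm:ss."""
    s = int(round(seconds))
    return f"{s // 3600}:{s % 3600 // 60:02d}:{s % 60:02d}"


TOTAL = 3_725_318_144  # Length reported by wget for the DVD image

print("kanigix:", hms(eta_seconds(TOTAL, 2_078_715, 100.13)))
print("warlock:", hms(eta_seconds(TOTAL, 67_507_355, 541.70)))
print("warlock with the VPN penalty removed: %.0f K/s" % rate_without_penalty(541.70))
```

If the rule of thumb holds, warlock's raw rate would be higher still, which makes the gap to kanigix bigger, not smaller.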
Without looking at solid data, I think my next step would be to disconnect the VPN session, which will remove the first two points on warlock. If I'm still getting much better transfer speeds, I'll know to look at the drivers.
I can also look at transfer speeds to other systems in the house.
I also need to start learning some performance tools (probably dtrace) to see what is going on.
Normally I don't summarize what I'm about to write about; however, I think this entry is all over the place. But there is useful information in here. So, I'm trying to get first kerberos and then NFSv4 working on a NSLU2 running OpenSlug. In order to validate my results, I also try to get a Linux NFSv4 server up and running on one of my Shuttle SS51G boxes. I finally get that to work, but I have no luck on getting the NSLU2 working correctly as either a server or client.
I decided to try another Linux client to see if I could get the process streamlined:
[tdh@sandman ~]> kadmin -p tdh/admin
Couldn't open log file /var/krb5/kdc.log: Permission denied
Authenticating as principal tdh/admin with password.
Password for tdh/admin@INTERNAL.EXCFB.COM:
kadmin: addprinc -randkey nfs/mrbill.internal.excfb.com
WARNING: no policy specified for nfs/mrbill.internal.excfb.com@INTERNAL.EXCFB.COM; defaulting to no policy
Principal "nfs/mrbill.internal.excfb.com@INTERNAL.EXCFB.COM" created.
kadmin: addprinc -randkey host/mrbill.internal.excfb.com
WARNING: no policy specified for host/mrbill.internal.excfb.com@INTERNAL.EXCFB.COM; defaulting to no policy
Principal "host/mrbill.internal.excfb.com@INTERNAL.EXCFB.COM" created.
kadmin: ktadd -k /export/keytabs/mrbill.keytab -e des-cbc-crc:normal nfs/mrbill.internal.excfb.com
kadmin: No such file or directory while adding key to keytab
Okay, not only do I need to fix the above, I also need to fix not being able to add to /var/krb5/kdc.log. We can get the keytab generated with:
[tdh@sandman /export]> sudo chown tdh:staff keytabs/
And we see:
kadmin: ktadd -k /export/keytabs/mrbill.keytab -e des-cbc-crc:normal nfs/mrbill.internal.excfb.com
Entry for principal nfs/mrbill.internal.excfb.com with kvno 4, encryption type DES cbc mode with CRC-32 added to keytab WRFILE:/export/keytabs/mrbill.keytab.
kadmin: ktadd -k /export/keytabs/mrbill.keytab -e des-cbc-crc:normal host/mrbill.internal.excfb.com
Entry for principal host/mrbill.internal.excfb.com with kvno 3, encryption type DES cbc mode with CRC-32 added to keytab WRFILE:/export/keytabs/mrbill.keytab.
Okay, the first thing to note is that mrbill is running OpenSlug:
root@mrbill:~# uname -a
Linux mrbill 2.6.16 #1 PREEMPT Fri Jun 9 07:34:31 PDT 2006 armv5teb unknown unknown GNU/Linux
We try to get the keytab:
root@mrbill:~# mount sandman:/export/keytabs /mnt/sandman/keytabs
mount: can't get address for sandman
root@mrbill:~# host sandman
-sh: host: not found
Why? Well it turns out that:
root@mrbill:~# cat /etc/resolv.conf
search mshome
nameserver 192.168.2.108
nameserver 182.168.2.1
I thought that the domain entered in the turnup init was for the CIFS domain. Easy enough to fix...
root@mrbill:~# cat /etc/resolv.conf
search internal.excfb.com
nameserver 192.168.2.108
nameserver 182.168.2.1
root@mrbill:~# mount sandman:/export/keytabs /mnt/sandman/keytabs
root@mrbill:~# cd /etc
root@mrbill:/etc# cp /mnt/sandman/keytabs/mrbill.keytab krb5.keytab
cp: cannot open `/mnt/sandman/keytabs/mrbill.keytab' for reading: Permission denied
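Why the search line mattered for the mount: a short name like sandman gets the search domains appended before it is looked up. A minimal Python sketch of that expansion follows. It is a simplification of what a stub resolver does (it ignores the ndots option, for one), and both function names are made up for illustration.

```python
def parse_resolv_conf(text: str) -> dict:
    """Pull the search list and nameservers out of resolv.conf text."""
    conf = {"search": [], "nameservers": []}
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith(("#", ";")):
            continue  # blank line or comment
        if fields[0] == "search":
            conf["search"] = fields[1:]  # a later search line replaces an earlier one
        elif fields[0] == "nameserver":
            conf["nameservers"].extend(fields[1:2])
    return conf


def candidates(name: str, conf: dict) -> list:
    """Names a stub resolver would typically try for a short host name."""
    if name.endswith("."):
        return [name]  # already fully qualified
    return [f"{name}.{domain}" for domain in conf["search"]] + [name]
```

With the old file, the first candidate is sandman.mshome, which my DNS server knows nothing about. With the fixed file it is sandman.internal.excfb.com.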
What now? (Permissions)
root@mrbill:/etc# ls -la /mnt/sandman/keytabs
total 9
drwxr-xr-x 2 tdh uucp 512 Feb 12 2007 .
drwxr-xr-x 5 root root 4096 Feb 12 08:22 ..
-rw-r--r-- 1 root root 1968 Feb 12 06:50 krb5.conf
-rw------- 1 tdh uucp 161 Feb 12 2007 mrbill.keytab
-rw-r--r-- 1 root root 155 Feb 12 06:48 mrx.keytab
Fix them up on the server and:
root@mrbill:/etc# cp /mnt/sandman/keytabs/mrbill.keytab krb5.keytab
We need to get a good copy of krb5.conf, idmapd.conf, and sysconfig/nfs. For now, we will leave idmapd.conf alone, to illustrate the NFSv4 mapid issue.
root@mrbill:/etc# scp mrx:/etc/krb5.conf .
root@mrbill:/etc# scp mrx:/etc/sysconfig/nfs sysconfig
Now this time I know kerberos is not installed:
root@mrbill:/# ls -la ./usr/kerberos/bin/kinit
ls: ./usr/kerberos/bin/kinit: No such file or directory
And we can easily add it:
root@mrbill:/# ipkg list | grep krb5
kernel-module-rpcsec-gss-krb5 - 2.6.16-r6.6 - rpcsec-gss-krb5 kernel module
root@mrbill:/# ipkg install kernel-module-rpcsec-gss-krb5
Installing kernel-module-rpcsec-gss-krb5 (2.6.16-r6.6) to root...
Downloading http://ipkg.nslu2-linux.org/feeds/slugos-bag/cross/3.10-beta/kernel-module-rpcsec-gss-krb5_2.6.16-r6.6_ixp4xxbe.ipk
Installing kernel-module-auth-rpcgss (2.6.16-r6.6) to root...
Downloading http://ipkg.nslu2-linux.org/feeds/slugos-bag/cross/3.10-beta/kernel-module-auth-rpcgss_2.6.16-r6.6_ixp4xxbe.ipk
Configuring kernel-module-auth-rpcgss
Configuring kernel-module-rpcsec-gss-krb5
Still not there for me:
root@mrbill:/# ls -la ./usr/kerberos/bin/kinit
ls: ./usr/kerberos/bin/kinit: No such file or directory
root@mrbill:/# find . -name kinit
My guess is that you can export with kerberos, you just can't mount it.
We should confirm that!
root@mrbill:~# mkdir /home/nfs4
root@mrbill:~# chmod 777 /home/nfs4
root@mrbill:~# cd /home/nfs4
root@mrbill:/home/nfs4# touch see_me
root@mrbill:/home/nfs4# chown tdh:10 see_me
root@mrbill:/home/nfs4# ls -la
total 8
drwxrwxrwx 2 root root 4096 Feb 12 09:00 .
drwxrwxr-x 8 root root 4096 Feb 12 09:00 ..
-rw-r--r-- 1 tdh uucp 0 Feb 12 09:00 see_me
And I try to add the export:
root@mrbill:/home/nfs4# more /etc/exports
/home/NFS4 172.16.0.0/16(rw,fsid=0,insecure,no_subtree_check,sync,anonuid=65534,anongid=65534)
root@mrbill:/home/nfs4# cd ..
root@mrbill:/home# ls -la
total 32
drwxrwxr-x 8 root root 4096 Feb 12 09:00 .
drwxr-xr-x 18 root root 4096 Feb 5 22:44 ..
drwxrwxrwx 2 tdh uucp 4096 Feb 5 23:03 NFS4
drwxrwxrwx 2 root root 4096 Feb 12 09:00 nfs4
drwxr-xr-x 2 root root 4096 Feb 5 22:53 nfsv2
drwxr-xr-x 2 root root 4096 Feb 5 22:53 nfsv3
drwxr-xr-x 2 root root 4096 Feb 5 22:53 nfsv4
lrwxrwxrwx 1 root root 7 Feb 5 22:26 root -> ../root
drwxr-xr-x 2 tdh staff 4096 Feb 7 21:21 tdh
root@mrbill:/home#
Looks like /home/NFS4 was created for me, or I'm suffering from severe memory loss...
I could have done this last week, note the time stamp.
root@mrbill:/home# ls -la NFS4
total 8
drwxrwxrwx 2 tdh uucp 4096 Feb 5 23:03 .
drwxrwxr-x 8 root root 4096 Feb 12 09:00 ..
-rw-r--r-- 1 200096 uucp 0 Feb 5 23:03 ut
Must be memory loss!
root@mrbill:/home# cd NFS4/
root@mrbill:/home/NFS4# touch see_me
root@mrbill:/home/NFS4# chown tdh:10 see_me
root@mrbill:/home/NFS4# ls -la
total 8
drwxrwxrwx 2 tdh uucp 4096 Feb 12 09:03 .
drwxrwxr-x 8 root root 4096 Feb 12 09:00 ..
-rw-r--r-- 1 tdh uucp 0 Feb 12 09:03 see_me
-rw-r--r-- 1 200096 uucp 0 Feb 5 23:03 ut
And yes:
[tdh@mrx ipk]> showmount -e mrbill
Export list for mrbill:
/home/NFS4 172.16.0.0/16
I was in 172.16.0.0/16 space last week. Touch up the export and:
[tdh@mrx ipk]> showmount -e mrbill
Export list for mrbill:
/home/NFS4 192.168.2.0/24
Okay, I do the mount and I'll claim it gets done as nfsv3:
[tdh@mrx ipk]> sudo mount mrbill:/home/NFS4 /mnt/mrbill/NFS4
[tdh@mrx ipk]> ls -la /mnt/mrbill/NFS4
total 8
drwxrwxrwx 2 tdh wheel 4096 Feb 12 03:03 .
drwxr-xr-x 3 root root 4096 Feb 12 11:08 ..
-rw-r--r-- 1 tdh wheel 0 Feb 12 03:03 see_me
-rw-r--r-- 1 200096 wheel 0 Feb 5 17:03 ut
Why do I claim it is nfsv3? Because I suspect that the idmapping should be hosed. Can we verify this? Yes:
[tdh@mrx ipk]> sudo umount /mnt/mrbill/NFS4
[tdh@mrx ipk]> sudo mount -o vers=3 mrbill:/home/NFS4 /mnt/mrbill/NFS4
[tdh@mrx ipk]> ls -la /mnt/mrbill/NFS4
total 8
drwxrwxrwx 2 tdh wheel 4096 Feb 12 03:03 .
drwxr-xr-x 3 root root 4096 Feb 12 11:08 ..
-rw-r--r-- 1 tdh wheel 0 Feb 12 03:03 see_me
-rw-r--r-- 1 200096 wheel 0 Feb 5 17:03 ut
[tdh@mrx ipk]> sudo umount /mnt/mrbill/NFS4
[tdh@mrx ipk]> sudo mount -o vers=4 mrbill:/home/NFS4 /mnt/mrbill/NFS4
'vers=4' is not supported. Use '-t nfs4' instead.
[tdh@mrx ipk]> sudo mount -t nfs4 mrbill:/home/NFS4 /mnt/mrbill/NFS4
mount.nfs4: mount point /mnt/mrbill/NFS4 does not exist
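The reasoning behind the idmapping claim: NFSv3 puts numeric uids and gids on the wire, while NFSv4 sends owners as user@domain strings that each side has to map back to a local account. When the mapping fails, the usual result is that the file shows up as owned by nobody. Here is a minimal Python sketch of that decision. It is a simplification and not the idmapd algorithm, and the function name and the "localdomain" value in the usage are assumptions for illustration.

```python
NOBODY = "nobody"


def map_owner(owner: str, local_domain: str, known_users: set) -> str:
    """Map an NFSv4 'user@domain' owner string to a local account name."""
    user, sep, domain = owner.rpartition("@")
    if not sep:
        return NOBODY  # not in user@domain form
    if domain.lower() != local_domain.lower():
        return NOBODY  # the two sides disagree about the idmap domain
    return user if user in known_users else NOBODY


users = {"tdh", "root"}
print(map_owner("tdh@internal.excfb.com", "internal.excfb.com", users))
print(map_owner("tdh@localdomain", "internal.excfb.com", users))
```

So a listing that shows tdh and wheel correctly with an unconfigured idmapd.conf is a hint that the numeric ids came across as-is, i.e., that the mount was NFSv3.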
Okay, mrbill knows nothing about NFSv4 as far as I can tell:
root@mrbill:/home/NFS4# mount -t nfs4 sandman:/export/home /mnt/sandman/home
mount: unknown filesystem type 'nfs4'
I'm sensing protocol discrimination here:
root@mrbill:/home/NFS4# ipkg list | grep -i nfs
kernel-module-lockd - 2.6.16-r6.6 - lockd kernel module; NFS file locking service version 0.5.
kernel-module-nfs - 2.6.16-r6.6 - nfs kernel module
kernel-module-nfs - 2.6.16-r6.4 -
kernel-module-nfsd - 2.6.16-r6.6 - nfsd kernel module
nfs-utils - 1.0.6-r7 - userspace utilities for kernel nfs
nfs-utils-doc - 1.0.6-r7 - userspace utilities for kernel nfs
Time to check the log file:
Feb 12 09:08:29 (none) user.warn kernel: nfsd: nfsv4 idmapping failing: has idmapd not been started?
Okay, configure idmapping and reboot:
Feb 12 09:16:37 (none) user.info kernel: Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
Feb 12 09:16:37 (none) user.warn kernel: NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
Feb 12 09:16:37 (none) user.warn kernel: NFSD: unable to find recovery directory /var/lib/nfs/v4recovery
Feb 12 09:16:37 (none) user.warn kernel: NFSD: starting 90-second grace period
Try the mount again:
[tdh@mrx ipk]> sudo mount -t nfs4 mrbill:/home/NFS4 /mnt/mrbill/NFS4
mount.nfs4: Permission denied
And try it from a Solaris client:
[tdh@sandman keytabs]> sudo mount mrbill:/home/NFS4 /mnt/mrbill/NFS4
[tdh@sandman keytabs]> sudo mount mrbill:/home/NFS4 /mnt/mrbill/NFS4
NFS compound failed for server mrbill: error 7 (RPC: Authentication error)
NFS compound failed for server mrbill: error 7 (RPC: Authentication error)
NFS compound failed for server mrbill: error 7 (RPC: Authentication error)
NFS compound failed for server mrbill: error 7 (RPC: Authentication error)
NFS compound failed for server mrbill: error 7 (RPC: Authentication error)
NFS compound failed for server mrbill: error 7 (RPC: Authentication error)
nfs mount: mount: /mnt/mrbill/NFS4: Permission denied
Okay, can we get Kerberos working at all on the NSLU2?
root@mrbill:~# more /etc/exports
/home/NFS4 192.168.2.0/24(rw,fsid=0,sec=krb5,insecure,no_subtree_check,sync,anonuid=65534,anongid=65534)
root@mrbill:~# exportfs -rv
exportfs: /etc/exports:1: unknown keyword "sec=krb5"
unexporting sandman.internal.excfb.com:/home/NFS4 from kernel
The keyword is not correct? Time to try on a known good Linux config:
[tdh@mrx ipk]> cat /etc/exports
/home/tdh 192.168.2.0/24(rw,fsid=0,sec=krb5,insecure,no_subtree_check,sync,anonuid=65534,anongid=65534)
[tdh@mrx ipk]> sudo exportfs -rv
exportfs: /etc/exports:1: unknown keyword "sec=krb5"
Okay, here is what we are supposed to do:
[tdh@mrx ipk]> cat /etc/exports
/home/tdh gss/krb5(rw,fsid=0,insecure,no_subtree_check,sync,anonuid=65534,anongid=65534)
[tdh@mrx ipk]> sudo exportfs -rv
exporting gss/krb5:/home/tdh
exporting gss/krb5:/home/tdh to kernel
gss/krb5:/home/tdh: Cannot allocate memory
By sheer effort of will, I determined that the firewall was on.
root@mrbill:~# showmount -e mrx
Export list for mrx:
/home/tdh gss/krb5
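To spell out the difference between the two export lines: the nfs-utils on these boxes rejected sec=krb5 as an option, and instead wanted the security flavor written in the client field, as the pseudo-client gss/krb5. A minimal Python sketch of pulling an exports line apart makes the distinction visible. The parser is my own and deliberately simple: it handles only one path followed by client(options) fields, with no quoting or line continuations.

```python
def parse_export(line: str):
    """Split one /etc/exports line into (path, [(client, options), ...])."""
    path, *specs = line.split()
    clients = []
    for spec in specs:
        client, _, rest = spec.partition("(")
        options = rest.rstrip(")").split(",") if rest else []
        clients.append((client, options))
    return path, clients


def wants_krb5(clients) -> bool:
    """True if any client field is the gss/krb5 pseudo-client."""
    return any(client == "gss/krb5" for client, _ in clients)


path, clients = parse_export(
    "/home/tdh gss/krb5(rw,fsid=0,insecure,no_subtree_check,sync,"
    "anonuid=65534,anongid=65534)"
)
print(path, clients[0][0], wants_krb5(clients))
```

Note that with this syntax the network restriction (192.168.2.0/24) is gone from the line, since the client field is now taken up by the security flavor.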
First let's see what happens without kerberos:
[tdh@sandman ~]> sudo mount -o vers=3 mrx:/home/tdh /mnt/mrx/tdh
[tdh@sandman ~]> ls -la /mnt/mrx/tdh
total 230394
drwxr-xr-x 7 tdh staff 4096 Feb 12 02:01 .
drwxr-xr-x 3 root root 512 Feb 12 11:49 ..
And NFSv4:
[tdh@sandman ~]> sudo mount mrx:/home/tdh /mnt/mrx/tdh
nfs mount: mrx:/home/tdh: No such file or directory
Okay, I knew about this, but forgot it. I think I heard Bruce complaining about still having it:
[tdh@sandman ~]> sudo mount mrx:/ /mnt/mrx/tdh
[tdh@sandman ~]> ls -al /mnt/mrx/tdh
total 230394
drwxr-xr-x 7 tdh nobody 4096 Feb 12 02:01 .
drwxr-xr-x 3 root root 512 Feb 12 11:49 ..
-rw------- 1 tdh nobody 68 Feb 12 01:51 .Xauthority
-rw------- 1 tdh nobody 96 Feb 12 11:31 .lesshst
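What I had forgotten, as I understand the Linux server behavior: the export marked fsid=0 becomes the root of the NFSv4 namespace, so an NFSv4 client names paths relative to it rather than by the server's real path. A minimal Python sketch of that translation follows. The function name is invented, and it assumes a single fsid=0 export.

```python
import posixpath


def nfs4_mount_path(server_path: str, pseudo_root: str) -> str:
    """Translate a server-side path into the path an NFSv4 client mounts."""
    server_path = posixpath.normpath(server_path)
    pseudo_root = posixpath.normpath(pseudo_root)
    # Anything outside the fsid=0 export is not reachable at all.
    if posixpath.commonpath([server_path, pseudo_root]) != pseudo_root:
        raise ValueError(f"{server_path} is outside the fsid=0 export {pseudo_root}")
    rel = posixpath.relpath(server_path, pseudo_root)
    return "/" if rel == "." else "/" + rel


# /home/tdh is exported with fsid=0 on mrx, so the client asks for "/".
print(nfs4_mount_path("/home/tdh", "/home/tdh"))
```

That is why mrx:/home/tdh gives "No such file or directory" over NFSv4 while mrx:/ lands in my home directory.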
And now we turn on kerberos:
[tdh@sandman ~]> sudo mount mrx:/ /mnt/mrx/tdh
NFS compound failed for server mrx: error 7 (RPC: Authentication error)
NFS compound failed for server mrx: error 7 (RPC: Authentication error)
NFS compound failed for server mrx: error 7 (RPC: Authentication error)
nfs mount: mount: /mnt/mrx/tdh: Permission denied
We can be very specific about what security flavor we want to use:
[tdh@sandman ~]> sudo mount -o sec=krb5 mrx:/ /mnt/mrx/tdh
nfs mount: mount: /mnt/mrx/tdh: Permission denied
Note that the three "compound failed" messages in the earlier attempt must have been about AUTH_NONE, AUTH_SYS, and AUTH_DH.
I think I've found the answer in Mike Eisler's blog Real Authentication in NFS, scroll down into the comments:
> Also, does NetApp require a root principle like Solaris did prior to 10?
Actually even prior to Solaris 10, the Solaris NFS server would allow an NFSv3 mount if root didn't have Kerberos credentials. ONTAP is the same way. However, if using NFSv4, because NFSv4 has no separate mount protocol, an NFSv4 server cannot distinguish a mount from a LOOKUP. If a volume is exported with sec=krb5, then the NFSv4 requests need to be using Kerberos. Since UNIX clients usually require one to be superuser to do an NFS mount, superuser (root) needs to have credentials. Root credentials aren't required, but whatever uid the credentials map to has to have search permissions for the path name.
And we can try that here:
kadmin: addprinc root
WARNING: no policy specified for root@INTERNAL.EXCFB.COM; defaulting to no policy
Enter password for principal "root@INTERNAL.EXCFB.COM":
Re-enter password for principal "root@INTERNAL.EXCFB.COM":
Principal "root@INTERNAL.EXCFB.COM" created.
And then we grab a ticket:
[tdh@sandman ~]> sudo kinit root
Password for root@INTERNAL.EXCFB.COM:
[tdh@sandman ~]> sudo mount -o sec=krb5 mrx:/ /mnt/mrx/tdh
Aargh!
[tdh@sandman ~]> ls -la /mnt/mrx/tdh
total 230394
drwxr-xr-x 7 tdh nobody 4096 Feb 12 02:01 .
drwxr-xr-x 3 root root 512 Feb 12 11:49 ..
-rw------- 1 tdh nobody 68 Feb 12 01:51 .Xauthority
-rw------- 1 tdh nobody 96 Feb 12 11:31 .lesshst
Since we can't even get the export shared without kerberos on mrbill, that does not explain the issue on that machine.
This works:
[tdh@sandman ~]> sudo mount -o vers=3 mrbill:/home/NFS4 /mnt/mrbill/NFS4
And this does not:
[tdh@sandman ~]> sudo mount -o vers=4 mrbill:/ /mnt/mrbill/NFS4
nfs mount: mount: /mnt/mrbill/NFS4: Resource temporarily unavailable
I'll come back to this later...
We always seem to have problems at Connectathon setting up Kerberos. So I decided to take the cookbook we use there and get kerberos working on my home systems. Please note that I could easily clean up the notes to not show some errors I make. But then, where is the love?
Also, as with any first foray into a new tool, I have no clue what I am doing. I kinda understand tickets and the ideas behind Kerberos, but I'm really in the dark as to what I'm supposed to do.
First edit /etc/krb5/krb5.conf:
# diff krb5.conf stock/krb5.conf
35c35
< default_realm = INTERNAL.EXCFB.COM
---
> default_realm = ___default_realm___
38,41c38,43
< INTERNAL.EXCFB.COM = {
< kdc = sandman.internal.excfb.com
< kdc = ultralord.internal.excfb.com
< admin_server = sandman.internal.excfb.com
---
> ___default_realm___ = {
> kdc = ___master_kdc___
> kdc = ___slave_kdc1___
> kdc = ___slave_kdc2___
> kdc = ___slave_kdcN___
> admin_server = ___master_kdc___
Then edit /etc/krb5/kdc.conf:
# diff kdc.conf stock/kdc.conf
32c32
< INTERNAL.EXCFB.COM = {
---
> ___default_realm___ = {
41,42d40
< sunw_dbprob_enable = true
< sunw_dbprop_master_ulogsize = 1000
Make sure you can resolve the KDCs via DNS (or whatever name service is configured in /etc/nsswitch.conf):
# host sandman
sandman.internal.excfb.com has address 192.168.2.109
# host sandman.internal.excfb.com
sandman.internal.excfb.com has address 192.168.2.109
Create the kerberos database
# /usr/sbin/kdb5_util create -r INTERNAL.EXCFB.COM -s
Initializing database '/var/krb5/principal' for realm 'INTERNAL.EXCFB.COM',
master key name 'K/M@INTERNAL.EXCFB.COM'
You will be prompted for the database Master Password.
It is important that you NOT FORGET this password.
Enter KDC database master key:
Re-enter KDC database master key to verify:
Start getting some principals:
# /usr/sbin/kadmin.local
Authenticating as principal root/admin@INTERNAL.EXCFB.COM with password.
kadmin.local: addprinc tdh/admin
WARNING: no policy specified for tdh/admin@INTERNAL.EXCFB.COM; defaulting to no policy
Enter password for principal "tdh/admin@INTERNAL.EXCFB.COM":
Re-enter password for principal "tdh/admin@INTERNAL.EXCFB.COM":
Principal "tdh/admin@INTERNAL.EXCFB.COM" created.
Get some kiprop installed:
kadmin.local: addprinc -randkey kiprop/sandman.internal.excfb.com
WARNING: no policy specified for kiprop/sandman.internal.excfb.com@INTERNAL.EXCFB.COM; defaulting to no policy
add_principal: Principal or policy already exists while creating "kiprop/sandman.internal.excfb.com@INTERNAL.EXCFB.COM".
kadmin.local: addprinc -randkey kiprop/ultralord.internal.excfb.com
WARNING: no policy specified for kiprop/ultralord.internal.excfb.com@INTERNAL.EXCFB.COM; defaulting to no policy
Principal "kiprop/ultralord.internal.excfb.com@INTERNAL.EXCFB.COM" created.
Enable kadmin and changepw:
kadmin.local: ktadd -k /etc/krb5/kadm.keytab kadmin/sandman.internal.excfb.com
Entry for principal kadmin/sandman.internal.excfb.com with kvno 3, encryption type AES-128 CTS mode with 96-bit SHA-1 HMAC added to keytab WRFILE:/etc/krb5/kadm.keytab.
Entry for principal kadmin/sandman.internal.excfb.com with kvno 3, encryption type Triple DES cbc mode with HMAC/sha1 added to keytab WRFILE:/etc/krb5/kadm.keytab.
Entry for principal kadmin/sandman.internal.excfb.com with kvno 3, encryption type ArcFour with HMAC/md5 added to keytab WRFILE:/etc/krb5/kadm.keytab.
Entry for principal kadmin/sandman.internal.excfb.com with kvno 3, encryption type DES cbc mode with RSA-MD5 added to keytab WRFILE:/etc/krb5/kadm.keytab.
kadmin.local: ktadd -k /etc/krb5/kadm.keytab changepw/sandman.internal.excfb.com
Entry for principal changepw/sandman.internal.excfb.com with kvno 3, encryption type AES-128 CTS mode with 96-bit SHA-1 HMAC added to keytab WRFILE:/etc/krb5/kadm.keytab.
Entry for principal changepw/sandman.internal.excfb.com with kvno 3, encryption type Triple DES cbc mode with HMAC/sha1 added to keytab WRFILE:/etc/krb5/kadm.keytab.
Entry for principal changepw/sandman.internal.excfb.com with kvno 3, encryption type ArcFour with HMAC/md5 added to keytab WRFILE:/etc/krb5/kadm.keytab.
Entry for principal changepw/sandman.internal.excfb.com with kvno 3, encryption type DES cbc mode with RSA-MD5 added to keytab WRFILE:/etc/krb5/kadm.keytab.
Enable kiprop:
kadmin.local: ktadd -k /etc/krb5/kadm.keytab kiprop/sandman.internal.excfb.com
Entry for principal kiprop/sandman.internal.excfb.com with kvno 3, encryption type AES-128 CTS mode with 96-bit SHA-1 HMAC added to keytab WRFILE:/etc/krb5/kadm.keytab.
Entry for principal kiprop/sandman.internal.excfb.com with kvno 3, encryption type Triple DES cbc mode with HMAC/sha1 added to keytab WRFILE:/etc/krb5/kadm.keytab.
Entry for principal kiprop/sandman.internal.excfb.com with kvno 3, encryption type ArcFour with HMAC/md5 added to keytab WRFILE:/etc/krb5/kadm.keytab.
Entry for principal kiprop/sandman.internal.excfb.com with kvno 3, encryption type DES cbc mode with RSA-MD5 added to keytab WRFILE:/etc/krb5/kadm.keytab.
Quit:
kadmin.local: quit
Enable the services:
# svcadm enable -r network/security/krb5kdc
# svcadm enable -r network/security/kadmin
Authenticate the admin account:
# /usr/sbin/kadmin -p tdh/admin
Authenticating as principal tdh/admin with password.
Password for tdh/admin@INTERNAL.EXCFB.COM:
kadmin: Communication failure with server while initializing kadmin interface
Hmm, I got the right password. I can see what happens when it is wrong:
# /usr/sbin/kadmin -p tdh/admin
Authenticating as principal tdh/admin with password.
Password for tdh/admin@INTERNAL.EXCFB.COM:
kadmin: Incorrect password while initializing kadmin interface
Ahh, let's see if kerberos is up and running:
# grep kadmin /var/adm/messages
Feb 11 23:31:19 sandman svc.startd[7]: [ID 748625 daemon.error] network/security/kadmin:default failed repeatedly: transitioned to maintenance (see 'svcs -xv' for details)
Feb 11 23:31:57 sandman kadmin[4143]: [ID 737709 user.error] unable to open connection to ADMIN server (t_error 9)
Feb 11 23:33:56 sandman kadmin[4146]: [ID 737709 user.error] unable to open connection to ADMIN server (t_error 9)
No, it is not.
# svcs -xv
svc:/network/security/kadmin:default (Kerberos administration daemon)
State: maintenance since Sun Feb 11 23:31:19 2007
Reason: Restarting too quickly.
See: http://sun.com/msg/SMF-8000-L5
See: man -M /usr/share/man -s 1M kadmind
See: /var/svc/log/network-security-kadmin:default.log
Impact: This service is not running.
Clear the maintenance state:
# svcadm clear /network/security/kadmin:default
Restart:
# svcadm enable -r network/security/kadmin
Check:
# svcs -xv
#
And try again:
# /usr/sbin/kadmin -p tdh/admin
Authenticating as principal tdh/admin with password.
Password for tdh/admin@INTERNAL.EXCFB.COM:
kadmin: Communication failure with server while initializing kadmin interface
If we look at kadm5.acl:
*/admin@___default_realm___ *
Hmm, touch that up:
*/admin@INTERNAL.EXCFB.COM *
And for sanity:
# grep default *
kdc.conf:[kdcdefaults]
kdc.conf: default_principal_flags = +preauth
krb5.conf:[libdefaults]
krb5.conf: default_realm = INTERNAL.EXCFB.COM
krb5.conf: ___domainname___ = ___default_realm___
krb5.conf: default = FILE:/var/krb5/kdc.log
krb5.conf:[appdefaults]
Okay, time to fix up krb5.conf as well:
[domain_realm]
___domainname___ = INTERNAL.EXCFB.COM
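Leftover template tokens like these are easy to miss when editing the stock files by hand. Here is a minimal sketch of a checker, assuming the tokens always look like ___name___ (three underscores on each side); the function name find_placeholders and the sample text are made up for illustration and are not part of any Kerberos tooling.

```python
import re

# Assumption: stock template tokens look like ___default_realm___ or ___slave_kdc1___.
PLACEHOLDER = re.compile(r"___[A-Za-z0-9]+(?:_[A-Za-z0-9]+)*___")


def find_placeholders(text):
    """Return (line_number, token) pairs for every template token left in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


# Hypothetical excerpt of a half-edited krb5.conf.
sample = """[libdefaults]
default_realm = INTERNAL.EXCFB.COM
[domain_realm]
___domainname___ = ___default_realm___
"""

for lineno, token in find_placeholders(sample):
    print(f"line {lineno}: {token} still needs a real value")
```

Running it over the contents of krb5.conf, kdc.conf, and kadm5.acl before starting the services would flag any token that was skipped.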
And restart:
# svcadm restart network/security/krb5kdc
# svcadm restart network/security/kadmin
And try again:
# /usr/sbin/kadmin -p tdh/admin
Authenticating as principal tdh/admin with password.
Password for tdh/admin@INTERNAL.EXCFB.COM:
kadmin: Communication failure with server while initializing kadmin interface
Okay, we know it is talking to something, i.e., it understands a bad password.
Let's try something else:
# kadmin.local
Authenticating as principal root/admin@INTERNAL.EXCFB.COM with password.
kadmin.local: addprinc admin/admin@INTERNAL.EXCFB.COM
WARNING: no policy specified for admin/admin@INTERNAL.EXCFB.COM; defaulting to no policy
Enter password for principal "admin/admin@INTERNAL.EXCFB.COM":
Re-enter password for principal "admin/admin@INTERNAL.EXCFB.COM":
Principal "admin/admin@INTERNAL.EXCFB.COM" created.
kadmin.local: quit
Okay, time to search. If we look at System Administration Guide: Security Services:
Communication failure with server while initializing kadmin interface
Cause: The host that was entered for the admin server, also called the master KDC,
did not have the kadmind daemon running.
Solution: Make sure that you specified the correct host name for the master KDC.
If you specified the correct host name, make sure that kadmind is running on
the master KDC that you specified.
But wait:
# svcs | grep krb
online 23:43:04 svc:/network/security/krb5kdc:default
# svcs | grep kad
maintenance 23:42:54 svc:/network/security/kadmin:default
# svcs -vx
svc:/network/security/kadmin:default (Kerberos administration daemon)
State: maintenance since Sun Feb 11 23:42:54 2007
Reason: Restarting too quickly.
See: http://sun.com/msg/SMF-8000-L5
See: man -M /usr/share/man -s 1M kadmind
See: /var/svc/log/network-security-kadmin:default.log
Impact: This service is not running.
Let's look at the log file:
Feb 11 23:42:53 sandman kadmind[4275](Error): Keytab file "/etc/krb5/kadm5.keytab" does not exist
Feb 11 23:42:53 sandman kadmind[4275](Error): Keytab file "/etc/krb5/kadm5.keytab" does not exist
Feb 11 23:42:53 sandman kadmind[4275](info): No dictionary file specified, continuing without one.
Feb 11 23:42:53 sandman kadmind[4275](Error): Unable to set RPCSEC_GSS service names ('kadmin@sandman.internal.excfb.com,changepw@sandman.internal.excfb.com')
krb5kdc: Interrupted system call - while selecting for network input(1)
Feb 11 23:43:03 sandman krb5kdc[4105](info): shutting down
Hmm, we need to create a keytab:
# ls -la /etc/krb5/kadm5.keytab
/etc/krb5/kadm5.keytab: No such file or directory
Ack, why do I have a kadm.keytab and not a kadm5.keytab?
# mv kadm.keytab kadm5.keytab
Because that is what I frigging entered in my session!
# /usr/sbin/kadmin -p tdh/admin
Authenticating as principal tdh/admin with password.
Password for tdh/admin@INTERNAL.EXCFB.COM:
kadmin:
The correct incantations should have been:
kadmin.local: ktadd -k /etc/krb5/kadm5.keytab kadmin/sandman.internal.excfb.com
kadmin.local: ktadd -k /etc/krb5/kadm5.keytab changepw/sandman.internal.excfb.com
kadmin.local: ktadd -k /etc/krb5/kadm5.keytab kiprop/sandman.internal.excfb.com
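The whole detour came down to a one-character difference between the keytab that ktadd wrote and the one kadmind wanted. A rough sketch of how that comparison could be automated follows, assuming the log wording and the ktadd -k syntax shown in this post; the helper names missing_keytabs and written_keytabs are hypothetical.

```python
import re

# Both patterns are assumptions based on the log line and kadmin session shown in this post.
MISSING = re.compile(r'Keytab file "([^"]+)" does not exist')
KTADD = re.compile(r"\bktadd\s+-k\s+(\S+)")


def unique(items):
    """Keep the first occurrence of each item, preserving order."""
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen


def missing_keytabs(log_text):
    """Keytab paths that the daemon log reports as missing."""
    return unique(MISSING.findall(log_text))


def written_keytabs(session_text):
    """Keytab paths that were passed to 'ktadd -k' in a kadmin session."""
    return unique(KTADD.findall(session_text))


log = ('Feb 11 23:42:53 sandman kadmind[4275](Error): '
       'Keytab file "/etc/krb5/kadm5.keytab" does not exist')
session = "kadmin.local: ktadd -k /etc/krb5/kadm.keytab kadmin/sandman.internal.excfb.com"

wanted = missing_keytabs(log)
written = written_keytabs(session)
print("daemon wants:", wanted)
print("ktadd wrote: ", written)
print("never written:", [path for path in wanted if path not in written])
```

It only compares strings, so it says nothing about whether the keytab contents are right, just whether the file names line up.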
Okay, back to our regularly scheduled programming:
What principals exist?
kadmin: listprincs
K/M@INTERNAL.EXCFB.COM
admin/admin@INTERNAL.EXCFB.COM
changepw/sandman.internal.excfb.com@INTERNAL.EXCFB.COM
kadmin/changepw@INTERNAL.EXCFB.COM
kadmin/history@INTERNAL.EXCFB.COM
kadmin/sandman.internal.excfb.com@INTERNAL.EXCFB.COM
kiprop/sandman.internal.excfb.com@INTERNAL.EXCFB.COM
kiprop/ultralord.internal.excfb.com@INTERNAL.EXCFB.COM
krbtgt/INTERNAL.EXCFB.COM@INTERNAL.EXCFB.COM
tdh/admin@INTERNAL.EXCFB.COM
To kerberize NFS, we need to touch up /etc/nfssec.conf:
# diff nfssec.conf nfssec.conf.stock
48,50c48,50
< krb5 390003 kerberos_v5 default - # RPCSEC_GSS
< krb5i 390004 kerberos_v5 default integrity # RPCSEC_GSS
< krb5p 390005 kerberos_v5 default privacy # RPCSEC_GSS
---
> #krb5 390003 kerberos_v5 default - # RPCSEC_GSS
> #krb5i 390004 kerberos_v5 default integrity # RPCSEC_GSS
> #krb5p 390005 kerberos_v5 default privacy # RPCSEC_GSS
We need to add a nfs principal:
kadmin: addprinc -randkey nfs/sandman.internal.excfb.com
WARNING: no policy specified for nfs/sandman.internal.excfb.com@INTERNAL.EXCFB.COM; defaulting to no policy
Principal "nfs/sandman.internal.excfb.com@INTERNAL.EXCFB.COM" created.
kadmin: ktadd nfs/sandman.internal.excfb.com
Entry for principal nfs/sandman.internal.excfb.com with kvno 3, encryption type AES-128 CTS mode with 96-bit SHA-1 HMAC added to keytab WRFILE:/etc/krb5/krb5.keytab.
Entry for principal nfs/sandman.internal.excfb.com with kvno 3, encryption type Triple DES cbc mode with HMAC/sha1 added to keytab WRFILE:/etc/krb5/krb5.keytab.
Entry for principal nfs/sandman.internal.excfb.com with kvno 3, encryption type ArcFour with HMAC/md5 added to keytab WRFILE:/etc/krb5/krb5.keytab.
Entry for principal nfs/sandman.internal.excfb.com with kvno 3, encryption type DES cbc mode with RSA-MD5 added to keytab WRFILE:/etc/krb5/krb5.keytab.
Verify that it does indeed exist:
# klist -k
Keytab name: FILE:/etc/krb5/krb5.keytab
KVNO Principal
---- --------------------------------------------------------------------------
3 nfs/sandman.internal.excfb.com@INTERNAL.EXCFB.COM
3 nfs/sandman.internal.excfb.com@INTERNAL.EXCFB.COM
3 nfs/sandman.internal.excfb.com@INTERNAL.EXCFB.COM
3 nfs/sandman.internal.excfb.com@INTERNAL.EXCFB.COM
And now we are going to have to make a share that is kerberized and set up a client to access it:
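The four identical rows are one keytab entry per encryption type that ktadd reported above. As a small sketch, assuming klist -k always prints entry rows as a key version number followed by the principal, the listing can be collapsed to one line per principal; keytab_principals is a made-up helper name.

```python
def keytab_principals(klist_output):
    """Map each principal to the set of key version numbers listed for it."""
    principals = {}
    for line in klist_output.splitlines():
        fields = line.split()
        # Entry rows look like "<kvno> <principal@REALM>"; headers and rules are skipped.
        if len(fields) == 2 and fields[0].isdigit() and "@" in fields[1]:
            principals.setdefault(fields[1], set()).add(int(fields[0]))
    return principals


# Output copied from the klist -k run in this post.
listing = """Keytab name: FILE:/etc/krb5/krb5.keytab
KVNO Principal
---- --------------------------------------------------------------------------
3 nfs/sandman.internal.excfb.com@INTERNAL.EXCFB.COM
3 nfs/sandman.internal.excfb.com@INTERNAL.EXCFB.COM
3 nfs/sandman.internal.excfb.com@INTERNAL.EXCFB.COM
3 nfs/sandman.internal.excfb.com@INTERNAL.EXCFB.COM
"""

for principal, kvnos in keytab_principals(listing).items():
    print(principal, "kvno", sorted(kvnos))
```

A principal that shows up with more than one kvno would be worth a second look, since it usually means the keys were regenerated at some point.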
# /usr/sbin/kclient
Starting client setup
---------------------------------------------------
Do you want to use DNS for kerberos lookups ? [y/n]: n
No action performed.
Enter the Kerberos realm: INTERNAL.EXCFB.COM
Specify the KDC hostname for the above realm: sandman.internal.excfb.com
sandman.internal.excfb.com
Note, this system and the KDC's time must be within 5 minutes of each other for Kerberos to function. Both systems should run some form of time
synchronization system like Network Time Protocol (NTP).
Setting up /etc/krb5/krb5.conf.
Enter the krb5 administrative principal to be used: tdh/admin
Obtaining TGT for tdh/admin ...
Password for tdh/admin@INTERNAL.EXCFB.COM:
Do you have multiple DNS domains spanning the Kerberos realm INTERNAL.EXCFB.COM ? [y/n]: n
No action performed.
Do you plan on doing Kerberized nfs ? [y/n]: y
nfs/ultralord.internal.excfb.com entry ADDED to KDC database.
nfs/ultralord.internal.excfb.com entry ADDED to keytab.
host/ultralord.internal.excfb.com entry ADDED to KDC database.
host/ultralord.internal.excfb.com entry ADDED to keytab.
Do you want to copy over the master krb5.conf file ? [y/n]: y
Enter the pathname of the file to be copied: /etc/krb5/krb5.conf
cp: /etc/krb5/krb5.conf and /etc/krb5/krb5.conf are identical
Copy of /etc/krb5/krb5.conf failed, exiting.
---------------------------------------------------
Setup FAILED.
Hmm, how are we supposed to enter that? I bet we need to use /net. Which I don't have configured right now. Okay, the hard way:
# scp sandman:/etc/krb5/krb5.conf /etc/krb5/krb5.conf
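One line in the kclient output is worth remembering: the client and the KDC have to agree on the time to within 5 minutes. A minimal sketch of that check, assuming both clocks are available as Unix epoch seconds (how you fetch the KDC's time is left out); the names and the sample timestamps are invented for illustration.

```python
MAX_SKEW_SECONDS = 5 * 60  # the 5-minute limit quoted by kclient


def skew_seconds(client_epoch, kdc_epoch):
    """Absolute difference between the two clocks, in seconds."""
    return abs(client_epoch - kdc_epoch)


def within_kerberos_skew(client_epoch, kdc_epoch, limit=MAX_SKEW_SECONDS):
    """True when the clocks are close enough for Kerberos to function."""
    return skew_seconds(client_epoch, kdc_epoch) <= limit


# Hypothetical readings: the client is 4 minutes behind, then 7 minutes ahead.
kdc_time = 1_171_238_400
print(within_kerberos_skew(kdc_time - 240, kdc_time))
print(within_kerberos_skew(kdc_time + 420, kdc_time))
```

In practice running NTP on every machine, as kclient suggests, makes the check moot, but it is a cheap thing to rule out when tickets are rejected for no obvious reason.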
Now, let's set up a test share:
# cd /export
# mkdir kerberos
# cd kerberos
# touch see_me
# chown tdh:staff see_me
# ls -la
total 4
drwxr-xr-x 2 root root 512 Feb 12 00:23 .
drwxr-xr-x 4 root sys 512 Feb 12 00:23 ..
-rw-r--r-- 1 tdh staff 0 Feb 12 00:23 see_me
# share -F nfs -o sec=krb5:krb5i:krb5p -d "Kerberos" /export/kerberos
# share -F nfs -d "Home dirs" /export/home
# share
- /export/kerberos sec=krb5,sec=krb5i,sec=krb5p "Kerberos"
- /export/home rw "Home dirs"
Now try to get some access:
[tdh@ultralord ~]> kinit
kinit(v5): Client not found in Kerberos database while getting initial credentials
[tdh@ultralord ~]> sudo klist -k
Keytab name: FILE:/etc/krb5/krb5.keytab
KVNO Principal
---- --------------------------------------------------------------------------
4 nfs/ultralord.internal.excfb.com@INTERNAL.EXCFB.COM
4 nfs/ultralord.internal.excfb.com@INTERNAL.EXCFB.COM
4 nfs/ultralord.internal.excfb.com@INTERNAL.EXCFB.COM
4 nfs/ultralord.internal.excfb.com@INTERNAL.EXCFB.COM
4 host/ultralord.internal.excfb.com@INTERNAL.EXCFB.COM
4 host/ultralord.internal.excfb.com@INTERNAL.EXCFB.COM
4 host/ultralord.internal.excfb.com@INTERNAL.EXCFB.COM
4 host/ultralord.internal.excfb.com@INTERNAL.EXCFB.COM
Okay, I think I need to add user principals for tdh:
kadmin: addprinc tdh
WARNING: no policy specified for tdh@INTERNAL.EXCFB.COM; defaulting to no policy
Enter password for principal "tdh@INTERNAL.EXCFB.COM":
Re-enter password for principal "tdh@INTERNAL.EXCFB.COM":
Principal "tdh@INTERNAL.EXCFB.COM" created.
[tdh@ultralord ~]> kinit
Password for tdh@INTERNAL.EXCFB.COM:
And now I want to get a mount:
[tdh@ultralord ~]> sudo mkdir -p /mnt/sandman/home
[tdh@ultralord ~]> sudo mkdir -p /mnt/sandman/kerberos
[tdh@ultralord ~]> sudo showmount -e sandman
export list for sandman:
/export/kerberos (everyone)
/export/home (everyone)
[tdh@ultralord ~]> sudo mount sandman:/export/kerberos /mnt/sandman/kerberos
[tdh@ultralord ~]> sudo mount sandman:/export/home /mnt/sandman/home
[tdh@ultralord ~]> ls -al /mnt/sandman/kerberos
total 4
drwxr-xr-x 2 root root 512 Feb 12 00:23 .
drwxr-xr-x 4 root root 512 Feb 12 00:36 ..
-rw-r--r-- 1 tdh staff 0 Feb 12 00:23 see_me
[tdh@ultralord ~]> ls -la /mnt/sandman/home
total 22
drwxr-xr-x 4 root root 512 Dec 30 15:01 .
drwxr-xr-x 4 root root 512 Feb 12 00:36 ..
drwx------ 2 root root 8192 Dec 20 11:28 lost+found
drwxr-xr-x 4 tdh staff 512 Jan 21 20:48 tdh
Success!
But wait, we need to show that a client without kerberos enabled will be denied access to sandman:/export/kerberos:
[tdh@kanigix ~]> sudo mkdir -p /mnt/sandman/home
[tdh@kanigix ~]> sudo mkdir -p /mnt/sandman/kerberos
[tdh@kanigix ~]> sudo mount sandman:/export/kerberos /mnt/sandman/kerberos
nfs mount: mount: /mnt/sandman/kerberos: Permission denied
Some other things to do would be to set up /etc/pam.conf to allow single sign-on - i.e., use ssh without a password. We also need to set up ultralord as a slave.
But before I tune this out, we need to get a Linux client up and running. Why? Because we need to show we can interoperate.
Some systems only support single DES, so we need to create special keytabs for them:
kadmin: addprinc -randkey nfs/mrx.internal.excfb.com
WARNING: no policy specified for nfs/mrx.internal.excfb.com@INTERNAL.EXCFB.COM; defaulting to no policy
Principal "nfs/mrx.internal.excfb.com@INTERNAL.EXCFB.COM" created.
kadmin: addprinc -randkey host/mrx.internal.excfb.com
WARNING: no policy specified for host/mrx.internal.excfb.com@INTERNAL.EXCFB.COM; defaulting to no policy
Principal "host/mrx.internal.excfb.com@INTERNAL.EXCFB.COM" created.
Now, I've created /export/keytabs to store the keytab files we will need:
# cd /export
# mkdir keytabs
# share -F nfs -o ro /export/keytabs
And we can create the keytab:
kadmin: ktadd -k /export/keytabs/mrx.keytab -e des-cbc-crc:normal nfs/mrx.internal.excfb.com
Entry for principal nfs/mrx.internal.excfb.com with kvno 3, encryption type DES cbc mode with CRC-32 added to keytab WRFILE:/export/keytabs/mrx.keytab.
kadmin: ktadd -k /export/keytabs/mrx.keytab -e des-cbc-crc:normal host/mrx.internal.excfb.com
Entry for principal host/mrx.internal.excfb.com with kvno 3, encryption type DES cbc mode with CRC-32 added to keytab WRFILE:/export/keytabs/mrx.keytab.
We see we are in business:
# cp /etc/krb5/krb5.conf /export/keytabs/
# ls -la
total 10
drwxr-xr-x 2 root root 512 Feb 12 00:50 .
drwxr-xr-x 5 root sys 512 Feb 12 00:46 ..
-rw-r--r-- 1 root root 1968 Feb 12 00:50 krb5.conf
-rw------- 1 root root 155 Feb 12 00:48 mrx.keytab
# chmod +r mrx.keytab
And now we set up the Linux machine:
[root@mrx ~]# mkdir -p /mnt/sandman/keytabs
[root@mrx ~]# showmount -e sandman
Export list for sandman:
/export/kerberos (everyone)
/export/home (everyone)
/export/keytabs (everyone)
[root@mrx ~]# mount sandman:/export/keytabs /mnt/sandman/keytabs
We should make sure we do not have access to sandman:/export/kerberos:
[root@mrx ~]# mkdir -p /mnt/sandman/kerberos
[root@mrx ~]# mkdir -p /mnt/sandman/home
[root@mrx ~]# mount sandman:/export/kerberos /mnt/sandman/kerberos
mount: sandman:/export/kerberos failed, security flavor not supported
What do we need to change?
[root@mrx ~]# cd /etc
[root@mrx etc]# ls -la k*
-rw-r--r-- 1 root root 657 Jan 9 14:03 krb5.conf
-rw-r--r-- 1 root root 2241 Jul 13 2006 krb.conf
-rw-r--r-- 1 root root 1296 Jul 13 2006 krb.realms
[root@mrx etc]# mkdir stock
[root@mrx etc]# cp k* stock
[root@mrx etc]# cp /mnt/sandman/keytabs/krb5.conf .
cp: overwrite `./krb5.conf'? y
[root@mrx etc]# cp /mnt/sandman/keytabs/mrx.keytab krb5.keytab
And we try to authenticate:
[tdh@mrx ~]> kinit
kinit: Command not found.
Okay, we need to install the kerberos packages:
[tdh@mrx /]> sudo yum install krb5-workstation
Loading "installonlyn" plugin
Setting up Install Process
Setting up repositories
Reading repository metadata in from local files
Parsing package install arguments
Nothing to do
No, we don't. Where is that rascally rabbit?
[tdh@mrx /]> sudo find . -name kinit
./usr/kerberos/bin/kinit
[tdh@mrx /]> ./usr/kerberos/bin/kinit
Password for tdh@INTERNAL.EXCFB.COM:
And we try the mount:
[tdh@mrx /]> sudo mount sandman:/export/kerberos /mnt/sandman/kerberos
mount: sandman:/export/kerberos failed, security flavor not supported
[tdh@mrx /]> ./usr/kerberos/bin/klist
Ticket cache: FILE:/tmp/krb5cc_1066
Default principal: tdh@INTERNAL.EXCFB.COM
Valid starting Expires Service principal
02/12/07 01:01:42 02/12/07 09:01:42 krbtgt/INTERNAL.EXCFB.COM@INTERNAL.EXCFB.COM
renew until 02/13/07 00:59:17
Kerberos 4 ticket cache: /tmp/tkt1066
klist: You have no tickets cached
What is up here?
# snoop -x 0,2000 -o /tmp/m2s.snoop sandman mrx
Using device /dev/hme (promiscuous mode)
33 ^C
Note: I used -x 0,2000 to get payload data. I knew I would want to look at most of the packet.
And
[tdh@mrx ~]> sudo mount -t nfs4 sandman:/export/kerberos /mnt/sandman/kerberos
mount.nfs4: Operation not permitted
26 0.00034 mrx.internal.excfb.com -> sandman NFS C 4 () PUTFH FH=324D LOOKUP export GETFH GETATTR 10011a 30a23a
27 0.00030 sandman -> mrx.internal.excfb.com NFS R 4 () NFS4_OK PUTFH NFS4_OK LOOKUP NFS4_OK GETFH NFS4_OK FH=30E6 GETATTR NFS4_OK
28 0.00033 mrx.internal.excfb.com -> sandman NFS C 4 () PUTFH FH=30E6 LOOKUP kerberos GETFH GETATTR 10011a 30a23a
29 0.00021 sandman -> mrx.internal.excfb.com NFS R 4 () NFS4ERR_WRONGSEC PUTFH NFS4_OK LOOKUP NFS4ERR_WRONGSEC
I popped into wireshark and I found out that mrx is only sending AUTH_SYS and AUTH_NULL.
Note: I used wireshark because it will parse the payload data for me. I didn't want to be doing byte conversions and consulting some specs!
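With only four frames the NFS4ERR_WRONGSEC reply is easy to spot, but a longer capture is another story. Here is a rough sketch for pulling failed replies out of snoop's one-line summaries, assuming the layout shown in this post (frame number, delta time, source -> destination, NFS R for replies); nfs4_errors is a hypothetical helper, not part of snoop.

```python
import re

# Assumed summary layout: "<frame> <delta> <src> -> <dst> NFS R 4 () <status> ..."
REPLY = re.compile(r"^\s*(\d+)\s+\S+\s+(\S+)\s+->\s+(\S+)\s+NFS\s+R\s")
ERROR = re.compile(r"\b(NFS4ERR_[A-Z0-9_]+)\b")


def nfs4_errors(lines):
    """Return (frame, server, client, error) for each NFS reply carrying an NFS4ERR code."""
    found = []
    for line in lines:
        reply = REPLY.match(line)
        if not reply:
            continue  # calls and non-NFS lines are ignored
        error = ERROR.search(line)
        if error:
            found.append((int(reply.group(1)), reply.group(2), reply.group(3), error.group(1)))
    return found


# Frames copied from the capture in this post.
capture = [
    "26 0.00034 mrx.internal.excfb.com -> sandman NFS C 4 () PUTFH FH=324D LOOKUP export GETFH GETATTR 10011a 30a23a",
    "27 0.00030 sandman -> mrx.internal.excfb.com NFS R 4 () NFS4_OK PUTFH NFS4_OK LOOKUP NFS4_OK GETFH NFS4_OK FH=30E6 GETATTR NFS4_OK",
    "28 0.00033 mrx.internal.excfb.com -> sandman NFS C 4 () PUTFH FH=30E6 LOOKUP kerberos GETFH GETATTR 10011a 30a23a",
    "29 0.00021 sandman -> mrx.internal.excfb.com NFS R 4 () NFS4ERR_WRONGSEC PUTFH NFS4_OK LOOKUP NFS4ERR_WRONGSEC",
]

for frame, server, client, error in nfs4_errors(capture):
    print(f"frame {frame}: {server} answered {client} with {error}")
```

It only reports the first error code on each reply line, which is the overall compound status in the frames shown here.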
In NetApp Filer, NFSv4, and Linux, we find a suggestion to use -o sec=krb5. We can try that:
[tdh@mrx ~]> sudo mount -t nfs4 -o sec=krb5 sandman:/export/kerberos /mnt/sandman/kerberos
Warning: rpc.gssd appears not to be running.
mount.nfs4: Invalid argument
Which is strange, since it is running:
[tdh@mrx ~]> sudo chkconfig --list | grep rpcgssd
rpcgssd 0:off 1:off 2:off 3:on 4:on 5:on 6:off
[tdh@mrx ~]> sudo chkconfig --list | grep rpcidmapd
rpcidmapd 0:off 1:off 2:off 3:on 4:on 5:on 6:off
What does the log state:
RPC: Couldn't create auth handle (flavor 390003)
I've copied the stock krb5.conf back and now the diffs are:
[tdh@mrx /etc]> diff krb5.conf stock/krb5.conf
7c7
< default_realm = INTERNAL.EXCFB.COM
---
> default_realm = EXAMPLE.COM
14,17c14,17
< INTERNAL.EXCFB.COM = {
< kdc = sandman.internal.excfb.com:88
< admin_server = sandman.internal.excfb.com:749
< default_domain = internal.excfb.com
---
> EXAMPLE.COM = {
> kdc = kerberos.example.com:88
> admin_server = kerberos.example.com:749
> default_domain = example.com
21,22c21,22
< .internal.excfb.com = INTERNAL.EXCFB.COM
< internal.excfb.com = INTERNAL.EXCFB.COM
---
> .example.com = EXAMPLE.COM
> example.com = EXAMPLE.COM
You know what, rpc.gssd is not running!
[tdh@mrx /etc]> ps -ef | grep rpc
rpc 1877 1 0 01:49 ? 00:00:00 portmap
root 1898 1 0 01:49 ? 00:00:00 rpc.statd
root 1931 1 0 01:49 ? 00:00:00 rpc.idmapd
tdh 2697 2519 0 02:04 pts/0 00:00:00 grep rpc
[tdh@mrx /etc]> sudo sh -c "ulimit -c unlimited;/usr/sbin/rpc.gssd -f -vvv"
Using keytab file '/etc/krb5.keytab'
Processing keytab entry for principal 'nfs/mrx.internal.excfb.com@INTERNAL.EXCFB.COM'
We will use this entry (nfs/mrx.internal.excfb.com@INTERNAL.EXCFB.COM)
Processing keytab entry for principal 'host/mrx.internal.excfb.com@INTERNAL.EXCFB.COM'
We will NOT use this entry (host/mrx.internal.excfb.com@INTERNAL.EXCFB.COM)
Using (machine) credentials cache: 'MEMORY:/tmp/krb5cc_machine_INTERNAL.EXCFB.COM'
And I put it in the background. Hmm, why doesn't it like the host entry?
Alright, I went back to why isn't rpc.gssd starting up at boot:
[ -f /etc/sysconfig/nfs ] && . /etc/sysconfig/nfs
[ "${SECURE_NFS}" != "yes" ] && exit 0
# ls -la /etc/sysconfig/nfs
#
Time to create it (see Learning NFSv4 with Fedora Core 2 (Linux 2.6.5 kernel)):
# This entry should be "yes" if you are using RPCSEC_GSS_KRB5 (auth=krb5,krb5i, or krb5p)
SECURE_NFS="yes"
# This entry sets the number of NFS server processes. 8 is the default
RPCNFSDCOUNT=8
[tdh@mrx sysconfig]> sudo /etc/init.d/rpcgssd start
Starting RPC gssd: [ OK ]
God I'm totally hacked about this:
[tdh@mrx sysconfig]> sudo mount -o sec=krb5 sandman:/export/kerberos /mnt/sandman/kerberos
[tdh@mrx sysconfig]> ls -la /mnt/sandman/kerberos
total 5
drwxr-xr-x 2 root root 512 Feb 12 00:23 .
drwxr-xr-x 5 root root 4096 Feb 12 00:49 ..
-rw-r--r-- 1 tdh wheel 0 Feb 12 00:23 see_me
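The root cause was the init script's guard: it sources /etc/sysconfig/nfs if the file exists and exits quietly unless SECURE_NFS is exactly "yes". A minimal sketch that mirrors that test on the file's contents, assuming simple KEY=value lines like the ones above; parse_sysconfig and gssd_would_start are invented names.

```python
import shlex


def parse_sysconfig(text):
    """Parse simple KEY=value shell assignments, ignoring comments and blank lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # shlex.split removes the quotes around values such as "yes".
        parts = shlex.split(value, comments=True)
        settings[key.strip()] = parts[0] if parts else ""
    return settings


def gssd_would_start(text):
    """Mirror the init-script test: carry on only when SECURE_NFS is exactly 'yes'."""
    return parse_sysconfig(text).get("SECURE_NFS", "") == "yes"


# A missing file behaves like an empty one: the variable is simply unset.
print(gssd_would_start(""))
print(gssd_would_start('SECURE_NFS="yes"\nRPCNFSDCOUNT=8\n'))
```

This is not a full shell parser, so anything fancier than plain assignments in the file would need the real thing.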
I've started posting the slides for Connectathon 2007: Talks 2007. As I get the remaining slides, I'll add them there.
Went to the local Starbucks here at Connectathon 2007. The guy looked up and said "Awake, right?" The guy I was with was floored. I told him, what is so hard - I'm 6'5", currently sporting a handlebar, and always wearing a Green Lantern hoodie.
The event is going along fine. The main problem is that NFSv3 is too solid and the NFSv4 implementations are also getting that way. The NFSv4.1 stuff is really still in the design phase. But developers are getting small victories when they either get code to compile or even run against other vendors. I think that Connectathon 2008 will be more frantic and the victories will be larger.
Ben's talk was interesting - the take home point I got was that sysadmins are not dumb and they can create unique architectures off of the building blocks you provide them. The more tools you can give them (dtrace and source code), the more they can do. That might not have been what others took home. I can't help that.
One of the things I had a problem with when I was a sysadmin was in talking to developers who would discount my ideas. I had one discount my suggestions about a new command syntax. The product has been deployed for over 5 years and everyone probably uses that syntax without thinking. My way wasn't necessarily better, just a different way of approaching the syntax. What was frustrating then though (and still to this day with other products) was the fact that the engineer who didn't have to administer the box didn't want to listen to the guy who did.
I'm back to wearing my developer hat, but I still try to listen to the sysadmins. I made a recent decision with the In Kernel Sharetab to use a symlink to solve a problem I could have coded over. I decided to scrap that idea, not because of a design review, but because I finally listened to that sysadmin in my head who told me the symlink would be a pain to work with.
So for me, I liked listening to Ben tell developers how he deploys their products and makes money doing so. He came in, said he was nervous and explained how his wife had told him that was silly. He said he told her it was like going to 3M to give a presentation on sticky notes - the audience laughed. I told him after the talk the reason why he got invited to 3M was because he was doing things with the sticky notes that 3M couldn't envision.
I.e., the innovation of sticky notes was in the past. In order for 3M to make more money, they needed to go outside their safe idea of what they thought people could do with sticky notes.
The thing which really seemed to spark the most debate (and which I started) was when Ben claimed in a room full of protocol developers that NFSv4 was too risky compared to NFSv3. Yet this was right after he said he was using ZFS and not UFS. What he wasn't articulating very well was that they went to ZFS for feature sets that they could exploit to sell to customers. The provisioning and manageability of ZFS far outweighed the stability of UFS.
In comparing NFSv4 vs NFSv3, his company did not find that overwhelming a need with respect to their business model. Or in other words, NFSv3 is sufficient for their customers' needs. Another business might find that NFSv3 is not sufficient for their customers' needs.
The other point he made was that they needed the replication (and to some extent migration) that NFSv4.1 was going to provide. This message was well received. I think it gives developers here ammo to take back to their management trees.
I found out late Sunday that my presentation was Monday instead of Tuesday. Not a problem! I was working off of a set of slides that Doug McCallum had put together for an internal presentation on just the sharemgr work. I tied it together with my work by looking at a case study on unshareall. You can read all about it here: The Management of Shares.
What was really interesting for me was the contrast between what I presented and what was presented before me. The discussion before mine was a heated debate about the state of pNFS and NFSv4.1. This stuff is in the early design phase. By that I mean there are prototypes which interoperate to a degree, but the spec is changing.
Anyway, my presentation was not on that technical level. And I sensed a little indifference to what I was talking about. I presented a very simple problem, one that when the design was drafted, made perfect sense. And I talked about how we are keeping the spirit of the design intact and fixing what will be a performance issue.
I felt better after the presentation when two different people approached me and told me how they had similar issues facing them. They were interested in the approaches Doug and I took to solve our problems. One of them even took an OpenSolaris starter kit in order to look at how Doug solved his management problems. In short, this was a big win for OpenSolaris. By the way, I had plenty of people asking me for starter kits once they knew I had them.
The other thing which came out of my presentation was that it was related to the one I gave last year: Scaling NFS Services. In that one, I looked at what outside of the server can cause issues (think processor farms) and in this one, I looked at how we can fix some of the problems caused by scalability.
We've decided that everything at Connectathon has to be secured by Kerberos - we want the additional testing that we can get. It wasn't clear to me how to invoke a complex share command on a zfs filesystem. In particular, I couldn't find an example which set a security style or had multiple options. So here is what I did.
First I prepare some areas. Note that I'm pretty explicit about what is available:
# zfs create zoo/home/krb5
# zfs create zoo/home/all
# zfs create zoo/home/krb5i
# zfs create zoo/home/krb5p
# zfs create zoo/home/sys
# zfs create zoo/home/krb
And now we let zfs know what we want:
# zfs set sharenfs="sec=krb5:krb5i:krb5p:sys,rw" zoo/home/all
# zfs set sharenfs="sec=krb5:krb5i:krb5p,rw" zoo/home/krb
# zfs set sharenfs="sec=krb5i,rw" zoo/home/krb5i
# zfs set sharenfs="sec=krb5p,rw" zoo/home/krb5p
# zfs set sharenfs="sec=krb5,rw" zoo/home/krb5
And to check the properties:
# zfs list -o name,sharenfs
NAME SHARENFS
zoo off
zoo/home on
zoo/home/all sec=krb5:krb5i:krb5p:sys,rw
zoo/home/krb sec=krb5:krb5i:krb5p,rw
zoo/home/krb5 sec=krb5,rw
zoo/home/krb5i sec=krb5i,rw
zoo/home/krb5p sec=krb5p,rw
zoo/home/nfsv2 on
zoo/home/nfsv3 on
zoo/home/nfsv4 on
zoo/home/sys on
zoo/home/tdh on
zoo/ws off
And since I am testing my bits for the In Kernel Sharetab:
# cat /system/dfs/sharetab
/export/zfs/tdh - nfs rw
/export/zfs/krb5p - nfs sec=krb5p,rw
/export/zfs/nfsv4 - nfs rw
/export/zfs/nfsv2 - nfs rw
/export/zfs/krb5i - nfs sec=krb5i,rw
/export/zfs/krb5 - nfs sec=krb5,rw
/export/zfs/sys - nfs rw
/export/zfs/nfsv3 - nfs rw
/export/zfs/all - nfs sec=krb5,rw,sec=krb5i,rw,sec=krb5p,rw,sec=sys,rw
/export/zfs - nfs rw
/export/zfs/krb - nfs sec=krb5,rw,sec=krb5i,rw,sec=krb5p,rw
Hmm, I think those entries should be compacted.
By the way, if there is no sec=, then the default is sys.
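To make the compaction idea concrete, here is a sketch of one way it could work, and it is only that: it is not what the In Kernel Sharetab code does. It assumes that the options following a sec= apply to those flavors until the next sec= appears, and that options with no sec= in front of them belong to the default flavor sys; the function names are made up.

```python
def parse_share_options(optstring):
    """Split a share option string into (flavors, options) groups.

    Options that appear before any sec= belong to the default flavor, sys.
    """
    groups = []
    for token in optstring.split(","):
        token = token.strip()
        if not token:
            continue
        if token.startswith("sec="):
            groups.append((token[len("sec="):].split(":"), []))
        else:
            if not groups:
                groups.append((["sys"], []))
            groups[-1][1].append(token)
    return groups


def compact_share_options(optstring):
    """Merge neighbouring sec= groups that carry identical options."""
    merged = []
    for flavors, options in parse_share_options(optstring):
        if merged and merged[-1][1] == options:
            merged[-1][0].extend(flavors)
        else:
            merged.append((list(flavors), list(options)))
    return ",".join(
        ",".join(["sec=" + ":".join(flavors)] + options)
        for flavors, options in merged
    )


print(compact_share_options("sec=krb5,rw,sec=krb5i,rw,sec=krb5p,rw,sec=sys,rw"))
print(compact_share_options("rw"))
```

Note that the second call spells out the implicit default, so a bare rw comes back as sec=sys,rw. Whether that is desirable in the sharetab itself is a separate question.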
When I installed a machine, instead of figuring out how to leave a lot of space in a slice, I went ahead and made a slice to be mounted as /zfs. I knew that I wanted to be able to reuse that space later for zfs. When I went to create a pool, this is what I did:
First I found the slice number:
# df -h
Filesystem size used avail capacity Mounted on
/dev/dsk/c1d0s0 20G 6.8G 13G 36% /
/devices 0K 0K 0K 0% /devices
/dev 0K 0K 0K 0% /dev
ctfs 0K 0K 0K 0% /system/contract
proc 0K 0K 0K 0% /proc
mnttab 0K 0K 0K 0% /etc/mnttab
swap 10G 788K 10G 1% /etc/svc/volatile
objfs 0K 0K 0K 0% /system/object
/usr/lib/libc/libc_hwcap2.so.1
20G 6.8G 13G 36% /lib/libc.so.1
fd 0K 0K 0K 0% /dev/fd
swap 10G 52K 10G 1% /tmp
swap 10G 32K 10G 1% /var/run
/dev/dsk/c1d0s3 20G 487M 19G 3% /altroot
/dev/dsk/c1d0s5 163G 64M 161G 1% /zfs
/dev/dsk/c1d0s7 20G 20M 19G 1% /export/home
/dev/dsk/c0t0d0s2 3.6G 3.6G 0K 100% /media/CDROM
/dev/lofi/1 467M 467M 0K 100% /isos/mnt/companion
Next I took the UFS filesystem off the system and out of /etc/vfstab:
# umount /zfs
# vi /etc/vfstab
...
Then I tried to create the new pool:
# zpool create zoo /dev/dsk/c1d0s5
invalid vdev specification
use '-f' to override the following errors:
/dev/dsk/c1d0s5 contains a ufs filesystem.
One of the features I really like about zfs is not only does it tell me exactly what is wrong, it also tells me how to fix it. I don't have to go look something up. So to fix it up:
# zpool create -f zoo /dev/dsk/c1d0s5
#
And here it is later:
[tdh@sunnfsv4-109 ~]> zpool list
NAME SIZE USED AVAIL CAP HEALTH ALTROOT
zoo 165G 3.10G 162G 1% ONLINE -