We had a recent integration that exposed some nasty interactions between an OpenSolaris client and a Linux server. There are bugs on both sides, but what I want to do here is document the behavior you'll see and what you can do to fix it.
The first problem was that the fix for 6790413 (AUTH_NONE implementation in kernel RPC) caused a nasty interaction with a Linux server: the client tried the first security flavor in the array the server returned for the MOUNTD request. The issue can be seen here:
[thud@adept nfs]> more /etc/exports
/ *(sync)
/home 192.168.1.0/255.255.255.0(rw,async,no_subtree_check,insecure,no_root_squash)
And a mount request from an OpenSolaris client:
[thud@witch ~]> sudo mount -o vers=3 wont:/home /mnt
[thud@witch ~]> cd /mnt
[thud@witch /mnt]> ls -la
total 35
drwxr-xr-x   3 root root  4096 Feb 25  2008 .
drwxr-xr-x  27 root root    30 Jul 17 00:34 ..
drwx------  25 thud staff 4096 Mar 19 00:22 thud
[thud@witch /mnt]> cd thud
thud: Permission denied.
Why? Look at what the server sends back:
MOUNT:----- NFS MOUNT -----
MOUNT:
MOUNT:Proc = 1 (Add mount entry)
MOUNT:Status = 0 (OK)
MOUNT:File handle = [DADF]
MOUNT: 01000700010005000000000053CF6DE4FF1C4572BB2950392EB6993C
MOUNT:Authentication flavor = none,unix,390003,390004,390005
MOUNT:
The OpenSolaris client selected AUTH_NONE, as it was first in the list. If we try this again with an explicit flavor:
[thud@witch ~]> sudo umount /mnt
[thud@witch ~]> sudo mount -o vers=3,sec=sys wont:/home /mnt
[thud@witch ~]> cd /mnt/thud
We are happy.
Note that this case works for a Linux client because, if there is no command line option, the client defaults to AUTH_SYS. It ignores the list from the server.
We discussed three options: use the default security flavor as defined in nfssec.conf(4), re-order the array by strongest flavor, or do both (i.e., re-order only if the default was not present).
It turns out that you should honor the array's order as much as possible (see Section 2.7 of RFC 2623). We've decided on the following precedence: any option provided on the command line, then the default, and then the first entry in the array. If there is a command line option, it has to be present in the server's list or the mount fails. If there is no command line option, we use the default when it is present in the list. If there is no default, or the default is not in the list, we take the first entry in the list.
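The precedence above can be sketched in a few lines of Python. This is only an illustration of the policy as described here, not the mount_nfs code: the function name `choose_flavor`, the `SecurityMismatch` exception, and the handling of an empty server list (treated as a failure) are assumptions of mine. The flavor numbers are the standard ones: 0 and 1 for AUTH_NONE and AUTH_SYS, and the pseudo-flavors from RFC 2623 for Kerberos.

```python
# Standard RPC flavor numbers; 390003-390005 are the RFC 2623 pseudo-flavors.
AUTH_NONE = 0
AUTH_SYS = 1
RPCSEC_GSS_KRB5 = 390003
RPCSEC_GSS_KRB5I = 390004
RPCSEC_GSS_KRB5P = 390005


class SecurityMismatch(Exception):
    """Hypothetical error: no acceptable flavor could be chosen."""


def choose_flavor(server_flavors, requested=None, default=None):
    """Pick a flavor from the server's list (hypothetical helper).

    server_flavors: flavors in the order the server returned them.
    requested:      flavor from a sec= mount option, or None.
    default:        client default (e.g. from nfssec.conf), or None.
    """
    # 1. A command line option must appear in the server's list.
    if requested is not None:
        if requested not in server_flavors:
            raise SecurityMismatch("requested flavor not offered by server")
        return requested
    # 2. Otherwise prefer the client default, if the server offers it.
    if default is not None and default in server_flavors:
        return default
    # 3. Otherwise honor the server's ordering and take the first entry.
    if server_flavors:
        return server_flavors[0]
    # Assumption: an empty list leaves nothing to choose from.
    raise SecurityMismatch("server returned no flavors")
```

With the list from the reply shown earlier, `choose_flavor([AUTH_NONE, AUTH_SYS, 390003, 390004, 390005])` picks AUTH_NONE, while passing `default=AUTH_SYS` picks AUTH_SYS, which is the behavior the fix is after.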
You can track this fix in 6860784 mount_nfs needs to choose default auth first for NFSv3 mounts. If you need relief, for now specify 'sec=sys' on your mount command or add it to your automount maps.
In the meantime, I started a discussion with the Linux NFS developers about the issue (Security negotiation), and it turns out that they decided that returning AUTH_NONE as the first flavor was a bug. This was fixed in nfs-utils (commit 3c1bb23c0379864722e79d19f74c180edcf2c36e in version 1.1.3).
And sure enough, my stock Fedora Core 8 server has version 1.1.0. So I updated my server to Fedora Core 11 to see what would happen. I was actually surprised that, with version 1.1.5, the mount failed:
[root@witch ~]> mount -o vers=3 adept:/home /mnt
nfs mount: security mode does not match the server exporting adept:/home
It turns out that the Linux server is not returning any security flavors with the exact same exports as before!
MOUNT:----- NFS MOUNT -----
MOUNT:
MOUNT:Proc = 1 (Add mount entry)
MOUNT:Status = 0 (OK)
MOUNT:File handle = [DADF]
MOUNT: 01000700010005000000000053CF6DE4FF1C4572BB2950392EB6993C
MOUNT:Authentication flavor =
MOUNT:
Again, this works with a Linux client, because Linux clients basically ignore the array of security flavors and try AUTH_SYS by default.
The bug, which I later verified has been seen by others (Red Hat Bugzilla – Bug 467613 rpc.mountd does not announce any flavors), is that if no 'sec=' is mentioned in the export definition, then no security flavor is set. If we change the export to instead be:
/home 192.168.1.0/255.255.255.0(sec=sys,rw,async,no_subtree_check,insecure,no_root_squash)
Then we restore interoperability.
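If you have more than a couple of exports, a quick scan for entries that lack an explicit 'sec=' can help find the ones a server with this bug would advertise with an empty flavor list. The following is a rough sketch only: `exports_missing_sec` is a made-up helper, and it assumes a simple /etc/exports with one export per line in the `path client(options)` form shown above. It does not handle line continuations, quoted paths, or default option lists.

```python
import re


def exports_missing_sec(text):
    """Return (path, client) pairs whose option list has no sec= entry.

    Hypothetical helper; assumes simple one-line exports entries.
    """
    missing = []
    for line in text.splitlines():
        # Drop comments and blank lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        path, clients = fields[0], fields[1:]
        for client in clients:
            # Split "host(opt1,opt2)" into the host and its options.
            match = re.fullmatch(r"([^()]*)\((.*)\)", client)
            if match is None:
                host, options = client, []
            else:
                host, options = match.group(1), match.group(2).split(",")
            if not any(opt.startswith("sec=") for opt in options):
                missing.append((path, host))
    return missing
```

Run against the original exports file from this post, it reports both the `/` and `/home` entries. After adding `sec=sys` to the `/home` line, only `/` remains.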
There is a lesson buried in here: don't just test against your own client and server. Both sides failed that lesson at different points. Also, we do cutting edge pNFS and NFSv4.1 interoperability testing all the time, but we don't do the same with NFSv3. While as developers we may think that development work on NFSv3 is over, we still make bug fixes to support customers, and we need to be careful to reduce customer pain.