Cthon '08 went off without a hitch. It started out uneventfully as Kerberos worked right out of the box. Evidently Sun's Kerberos team has been working on making initial configuration painless. And they succeeded.
The public talks were well received and we've started posting the slides as they are sent in. You can check them out on Talks 08.
I'll post more as they arrive.
Also, we videoed most of the talks this year. As that content becomes available, we will post it up as well.
Be sure to visit www.connectathon.org and see when the talks are scheduled. These are open to the public.
Sun Microsystems, Inc. is involved with 6 presentations and then NetApp has 5 of them. I'll be giving two of them, but I'm actually more excited about the one on nfsreplay by Shehjar Tikoo and the Linux development git one by Bruce Fields and Benny Halevy.
Normally we can't share images of the event, but here is one from before the other vendors set up their gear:
Each of the Sun workstations is probably a node in a pNFS community.
I've given up trying to explain to people what it is I do for a living. But I think I'm going to keep on trying to explain what a BakeAThon is to them.
How can I do one without the other? Well, I can abstract the process.
So a BakeAThon (and a ConnectAThon) is an interoperability testing event. And the best way to describe it is that you have 10 different people with a set of rules for a game (if you say AD&D edition 3 rules, you have another set of problems to deal with). Each of them has read the rules and believes that they know all of the intricacies. But none of them have played the game with anyone else.
So they all get together and start to play each other. And they start to argue about each and every move. Sometimes it is pretty obvious whose interpretation is wrong. And sometimes they call someone else over to help decide.
As soon as player A is done with player B, they start with player C. Except sometimes they are also playing with player D at the same time. Or player B comes back to see if they have gotten rule 5.3 correct now.
Sometimes they all vote on how to interpret a rule and even change the rule book. And sometimes player F was at the bathroom when that happened and causes the debate to start back up again.
Then they all go away again for 3 months, promising to play games against each other remotely. They meet back up again at the next BakeAThon - sometimes there is a new player or someone didn't show up. But they are willing to chime in over email.
But they have to start all over from scratch because no one played remotely and they've been busy playing with themselves.
A further complication is that some people only play defense and some only play offense. Sometimes you get a team where they split those duties. So when you talk to one person about how they run their offense, they shrug and say that they only do defense. And the problem that arises here is when the team's offense only plays against their defense - they get pretty good at it and understand some simple shortcuts that make it easy. But when they play another team, those same shortcuts cause problems.
So a BakeAThon is pretty much like that. The major difference is that the competitiveness isn't in winning a game but in getting the game adopted by other people. I.e., Foo Inc. and Bar Inc. may have differences and fight over customers, but while at a BakeAThon, they work together to make NFS a better protocol.
I've slammed the Linux NFSv4 implementation before for not having the same namespace as NFSv3. I.e., it used the 'fsid=0' hack to export the root of the v4 namespace and thus that path may not be the same as '/'.
Well, over on the nfsv4 <at> linux-nfs.org mailing list, Steve just announced a prototype which fixes that problem! And the crowd goes wild!
The following patch series gives rpc.mountd the ability to allocate a dynamic pseudo root, so the 'fsid=0' export option is no longer required. This allows v2, v3 and v4 clients mounts without any changes to the server's exports list. One anomaly of the Linux NFS server is that it requires a pseudo root to be defined. Currently the only way a pseudo root can be defined is by setting the fsid to zero (i.e. fsid=0). So if we wanted to make v4 the default mounting version and have things just work like v2/v3 all of the existing exports configurations would have to change (i.e. a 'fsid=0' would have to be added) to support a v4 mounts, which, imho, is unacceptable. So this patch series address this problem.
I think this might also mark the first major piece of work on the Linux NFSv4 code to come from some place other than CITI. I might be wrong, but I think this is a sign of the maturity of code.
What a group effort. I love how teammates come out of the woodwork to help you get the final touches on a project. The code was putback today and you should start seeing it in build 77. I'll leave you with a teaser of things to come:
The putback for
PSARC 2007/563 Add _AT_TRIGGER to fstatat(2)
PSARC 2007/416 NFSv4 Mirror-mounts
5035401 allow clients to cross server filesystem boundaries if the fs is visible
6613892 nftw(3C) has potential security issues
enhances the NFSv4 clients to automatically mount filesystems when they are encountered at the NFSv4 server; this enhancement does not require the use of the automounter and therefore does not rely on the content or propagation of automounter maps. An example of the utility of this feature is in the presence of ZFS at the NFS server. With the ease of creation and management of numerous ZFS filesystems, the enhanced NFSv4 client will immediately provide access to the newly created and shared ZFS filesystems.
And here is a roll call:
Core team:
Calum Mackay: Cambridge, UK (team lead)
Tom Haynes: Tulsa, OK, senior engineer
Bill Baker: Austin, TX, srstaff engr/architect/advisor
Rob Thurlow: Ft Collins, CO, senior engr/advisor
Spencer Shepler: Austin, TX, srstaff engr/RTI advocate/advisor
Helen Chao: Menlo Park, CA, QE lead
Lily Li: Beijing, QE engineer
Evan Layton: Broomfield, CO, engineer
Alok Aggarwal: Atlanta, GA, engineer
Rich Brown: Chicago, IL, sr engr/PSARC Intern
And there were more who stepped up and made significant contributions to get the work out of the door.
I've been very busy trying to get Mirror Mounts out the door. Our last task was to fix a find bug:
[tdh@mrx ~]> sudo mount kanigix:/ /mnt
[tdh@mrx ~]> cd /mnt
[tdh@mrx /mnt]> cd zoo
[tdh@mrx zoo]> cd mms
[tdh@mrx mms]> ls -la
total 20
drwxr-xr-x 5 root sys 5 Jul 19 21:41 .
drwxr-xr-x 11 root sys 11 Oct 8 13:08 ..
drwxr-xr-x 5 root sys 5 Jul 19 21:41 node1
drwxr-xr-x 5 root sys 5 Jul 19 21:41 node2
drwxr-xr-x 5 root sys 5 Jul 19 21:42 node3
[tdh@mrx mms]> df -F nfs -h
Filesystem size used avail capacity Mounted on
kanigix:/ 20G 8.9G 11G 46% /mnt
kanigix:/zoo 637G 42K 637G 1% /mnt/zoo
kanigix:/zoo/mms 637G 31K 637G 1% /mnt/zoo/mms
Now if we try to find, it should mount a subdirectory for us and traverse down it.
[tdh@mrx mms]> find .
.
./node3
find: cannot open .: Resource temporarily unavailable
It fails because of a security check to make sure the stat(2) information from before an opendir(2) matches an fstat(2) from after the opendir(2). The pre-stat gets the vnode which will be mounted on and the post-fstat gets the root of the new filesystem.
But df now shows that the mount took place, so a second find passes the test which just failed:
[tdh@mrx mms]> df -F nfs -h
Filesystem size used avail capacity Mounted on
kanigix:/ 20G 8.9G 11G 46% /mnt
kanigix:/zoo 637G 42K 637G 1% /mnt/zoo
kanigix:/zoo/mms 637G 31K 637G 1% /mnt/zoo/mms
kanigix:/zoo/mms/node3
637G 31K 637G 1% /mnt/zoo/mms/node3
[tdh@mrx mms]> find .
.
./node3
./node3/sub2
find: cannot open .: Resource temporarily unavailable
[tdh@mrx mms]> df -F nfs -h
Filesystem size used avail capacity Mounted on
kanigix:/ 20G 8.9G 11G 46% /mnt
kanigix:/zoo 637G 42K 637G 1% /mnt/zoo
kanigix:/zoo/mms 637G 31K 637G 1% /mnt/zoo/mms
kanigix:/zoo/mms/node3
637G 31K 637G 1% /mnt/zoo/mms/node3
kanigix:/zoo/mms/node3/sub2
637G 31K 637G 1% /mnt/zoo/mms/node3/sub2
But we puke on the next mirror mount.
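The pre/post comparison that trips find can be sketched in Python. This is only an illustration of the pattern, not the nftw(3C) source: the function name `open_dir_checked` is made up for this example, and it simply compares the device and inode numbers seen by name with those seen through the open descriptor.

```python
import os
import tempfile


def open_dir_checked(path):
    """Open a directory only if it is the same object we stat'ed by name.

    Returns an open file descriptor. Raises OSError if the
    (st_dev, st_ino) pair differs between the stat() taken before the
    open and the fstat() taken after it.
    """
    before = os.stat(path)            # the "pre-stat", by name
    fd = os.open(path, os.O_RDONLY)   # on a mirror mount, this is where the mount would trigger
    after = os.fstat(fd)              # the "post-fstat", on the opened object
    if (before.st_dev, before.st_ino) != (after.st_dev, after.st_ino):
        os.close(fd)
        raise OSError("directory changed between stat and open: " + path)
    return fd


if __name__ == "__main__":
    # On an ordinary local directory the two views agree and the check passes.
    with tempfile.TemporaryDirectory() as d:
        fd = open_dir_checked(d)
        print("check passed for", d)
        os.close(fd)
```

On the first visit to a mirror-mounted directory, the pre-stat would describe the covered directory and the post-fstat the root of the freshly mounted filesystem, so a check of this shape rejects the directory even though nothing malicious happened.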
The problem is actually deep down in nftw(3C), and you can look at the PSARC case at Add S_IFTRIGGER to st_mode [PSARC/2007/563 FastTrack timeout 10/04/2007]. This is pretty interesting reading. Most people told me that my proposal would generate considerable controversy.
The PSARC community decided that there was still a hole in the security check inside nftw(3C). While I could argue against this (it is pretty hard to rename or move an export on a live filesystem), in the end I decided that a counter-proposal made more sense than mine.
You've got to realize, I've been working on this project for the past 6 months. We were going to putback on Friday - time to turn over onto the pNFS project.
We decided as a group that the new proposal was the right thing to do for both the Mirror Mount project and Solaris (and indirectly OpenSolaris).
I believe that the majority of OpenSolaris development occurs within Sun Microsystems Engineering. As much as we would like for it to snowball in the wild, that has not happened. I'm saying this from my biased view, I know some projects have been proposed externally from Sun, e.g., the i18n port of the closed library. I also acknowledge the work that Dennis Clark is leading for the PPC port. There are more and I am not trying to take away from them. I am relating my experience with trying to get projects off the ground on OpenSolaris - see for example OpenSolaris Project: NFS Server in non-Global Zones.
So what does happen is that a new project gets started and there is no external indication of forward progress. People might start asking for code drops, and the reality is that because of the huge internal pressure towards quality in Sun Engineering, that is not going to happen until the code has baked a bit. It gets to the point that a prime question on new project proposals is whether code will be released. Again, there isn't some hidden agenda within Sun to withhold the code - we are just new to this model and we want things to be perfect, not just good enough.
Look back at the discussion that went on for Project Proposal -- Honeycomb Information and dev tools and the lack of a code drop. The OpenSolaris Project: HoneyComb Fixed Content Storage already shows a binary drop and plans for a code drop in the Fall of 2007. Some valid reasons for a group to not drop code right away are that they do not understand the process (they need someone to help them) and they need to clear a legal hurdle to make sure that they are not violating the rights of either an individual or a company. I've seen both occur internally. The good news is that we have internal people ready and willing to help development groups.
What I find really exciting are projects that have a significant external presence. And sometimes that external pressure doesn't contribute directly to the code work. In NFSv4 and NFSv4.1, the external collaboration takes place through the IETF and Connectathon. Both companies and open source developers come together to design and implement future NFS protocol extensions. Interoperability across multiple OS platforms is ensured via the yearly meetings at Connectathon. And with the UMICH CITI developers working on Projects: NFS Version 4 Open Source Reference Implementation, which is mainly distributed to Linux, but forms a reference for both BSD directly and OSX indirectly, and Sun working on OpenSolaris, it is possible for vendors to do compatibility testing all year long.
Take for example NetApp, which provides only an NFS server. They are able to test new NFSv4.1 features against Linux and OpenSolaris clients. Admittedly this isn't new: NetApp was able to use the Solaris 10 beta code to test NFSv4. And the companies in question all sign NDAs and exchange hardware and engineering drops of binaries for testing.
So there is almost no work being driven from OpenSolaris into this open design project. There is an OpenSolaris Project: NFS version 4.1 pNFS, but it is mainly a portal to the Sun NFS team's work. A question that they asked themselves was whether they were going to do binary drops, code drops, or any drop at all. It wasn't a legal issue: the design is done in the open and all of the coding is new development. It wasn't a fear of the unknown: they had already shared binaries in the past. No, rather it was a concern about the impact of providing a drop on the development schedule. Would the overhead of publishing code and/or binaries kill the final deliverable?
Another OpenSolaris reality is that Sun expects to make money. I know that is an evil concept to some open source developers, but we bet the company on being able to deliver quality and sell service along with the source. So making the deadline for the pNFS deliverable is a major concern for the group.
I'm happy that the group decided that they could both deliver on time and make code and binary drops. Lisa just announced for the group the latest drop in FYI: pNFS Code and BFU Archives posted. You can check out the b66 implementation by downloading it. The code is rough in the sense that you wouldn't want to put it in production, but it gives other developers a chance to see what is going on and allows them to test their own implementations. Remember, this code has not been putback into Nevada - it lives in a group workspace. Before OpenSolaris, it would have only been shared under NDA and the expectation that the person installing the code assumed responsibility for any problems.
Project development in OpenSolaris is different than that occurring in other open source communities. There are different hurdles to jump, but there are different expectations as well. Internal developers are proud of the quality that they demand of the code and want to keep that bar high. That in turn makes early code drops hard for them to deliver. It is something they are learning to do. And the pNFS team is leading the way.
We occasionally get a bug assigned to us about how a Solaris client cannot mount a Linux export over NFSv4, but can over NFSv3. In some cases we get it as an automounter bug. We will close the bug because it is an intrinsic problem with the NFSv4 implementation on Linux - well, it is actually false advertising.
Consider the following export on a Linux box:
[tdh@adept ~]> more /etc/exports
/home *(rw,fsid=0,insecure,no_subtree_check,sync,anonuid=65534,anongid=65534)
[tdh@adept ~]> exportfs
/home <world>
[tdh@sandman ~]> showmount -e adept
export list for adept:
/home/mrx *
/home/tdh *
/home/spud *
/home/coach *
/home/loghyr *
Hmm, that looks wrong - I know I had them exported like that earlier, but clearly the server is only exporting one export.
Anyway, it would seem that the one export is '/home' and that we should be able to go to '/home' and then access the remote filesystem. And we can't:
[tdh@sandman ~]> cd /net/adept/home
[tdh@sandman home]> ls -la
total 7
dr-xr-xr-x 6 root root 6 May 2 16:31 .
dr-xr-xr-x 2 root root 2 May 2 16:31 ..
dr-xr-xr-x 1 root root 1 May 2 16:31 coach
dr-xr-xr-x 1 root root 1 May 2 16:31 loghyr
dr-xr-xr-x 1 root root 1 May 2 16:31 mrx
dr-xr-xr-x 1 root root 1 May 2 16:31 spud
dr-xr-xr-x 1 root root 1 May 2 16:31 tdh
[tdh@sandman home]> cd tdh
tdh: Permission denied.
What happened?
A snoop trace shows right off the bat that things are not going well:
sandman -> adept.internal.excfb.com NFS C NULL4
adept.internal.excfb.com -> sandman NFS R NULL4
sandman -> adept.internal.excfb.com NFS C 4 (secinfo ) PUTROOTFH LOOKUP home SECINFO tdh
adept.internal.excfb.com -> sandman NFS R 4 (secinfo ) NFS4ERR_NOENT PUTROOTFH NFS4_OK LOOKUP NFS4ERR_NOENT
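The shape of that reply follows from how an NFSv4 COMPOUND is processed: the operations are evaluated in order and evaluation stops at the first one that fails, so SECINFO never runs. A toy model of that rule is sketched below. The `run_compound` helper and the nested-dict namespace are invented for this illustration, and the namespace assumes a server root that directly contains the home directories rather than a `home` entry.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_NOENT = "NFS4ERR_NOENT"


def run_compound(ops, namespace):
    """Evaluate (op, arg) pairs in order, stopping at the first failure.

    `namespace` is a toy tree of nested dicts standing in for the
    server's NFSv4 namespace. Returns (overall_status, per_op_results).
    """
    current = None
    results = []
    for op, arg in ops:
        if op == "PUTROOTFH":
            current = namespace
            status = NFS4_OK
        elif op == "LOOKUP":
            if isinstance(current, dict) and arg in current:
                current = current[arg]
                status = NFS4_OK
            else:
                status = NFS4ERR_NOENT
        elif op == "SECINFO":
            found = isinstance(current, dict) and arg in current
            status = NFS4_OK if found else NFS4ERR_NOENT
        else:
            raise ValueError("unsupported op in this sketch: " + op)
        results.append((op, status))
        if status != NFS4_OK:
            break  # later operations in the compound are not evaluated
    overall = results[-1][1] if results else NFS4_OK
    return overall, results


if __name__ == "__main__":
    # Hypothetical server root: the home directories sit directly under '/'.
    toy_root = {"tdh": {}, "mrx": {}, "spud": {}, "coach": {}, "loghyr": {}}
    request = [("PUTROOTFH", None), ("LOOKUP", "home"), ("SECINFO", "tdh")]
    print(run_compound(request, toy_root))
```

With that toy namespace the request from the trace stops at LOOKUP with NFS4ERR_NOENT, while dropping the `LOOKUP home` step lets SECINFO succeed.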
We should remove the automounter from our testing:
[tdh@sandman home]> sudo mount -o vers=4 adept:/home /mnt
nfs mount: adept:/home: No such file or directory
Again, we see:
sandman -> adept.internal.excfb.com NFS C 4 (secinfo ) PUTROOTFH SECINFO home
adept.internal.excfb.com -> sandman NFS R 4 (secinfo ) NFS4ERR_OP_ILLEGAL PUTROOTFH NFS4_OK ILLEGAL NFS4ERR_OP_ILLEGAL
Okay, but it works with NFSv3:
[tdh@sandman home]> ls -la /mnt
total 162
drwxr-xr-x 20 root root 4096 Feb 18 17:49 .
drwxr-xr-x 27 root root 1024 Apr 23 09:31 ..
drwxr-xr-x 2 1094 100 4096 Feb 18 13:12 nfsv2
drwxr-xr-x 2 1813 100 4096 Feb 18 13:12 nfsv3
drwxr-xr-x 4 3530 100 4096 Feb 18 17:46 nfsv4
Note that we now have uids and earlier we had root ownerships.
What is going on here? Well another simple example shows what the Linux server is doing to us:
[tdh@sandman home]> sudo umount /mnt
[tdh@sandman home]> sudo mount -o vers=4 adept:/ /mnt
[tdh@sandman home]> ls -la /mnt
total 162
drwxr-xr-x 2 nobody nobody 4096 Feb 18 13:12 nfsv2
drwxr-xr-x 2 nobody nobody 4096 Feb 18 13:12 nfsv3
drwxr-xr-x 4 nobody nobody 4096 Feb 18 17:46 nfsv4
And note that I have chopped some of the output for space.
Okay, it is clear to me that adept:/home is actually adept:/ as far as NFSv4 is concerned. The problem is that the Linux NFSv4 implementation of the pseudo-fs namespace is done such that one of the exports is made the root fs. If we look at the export again, we see:
[tdh@adept ~]> more /etc/exports
/home *(rw,fsid=0,insecure,no_subtree_check,sync,anonuid=65534,anongid=65534)
The 'fsid=0' option is assigning '/home' to be '/' as far as NFSv4 is concerned. The problem is that this breaks automounter implementations which use the MOUNTPROC_EXPORTALL RPC call to get a list of mountable exports. It also breaks humans reading such output to get a sense of what to mount.
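To make the mismatch concrete, here is a small sketch of the translation a client (or a human) would have to apply to turn an advertised export path into the path that actually works over NFSv4. The `v4_path` helper is hypothetical and the model is deliberately simplified: it assumes a single directory exported with 'fsid=0' and treats anything outside it as unreachable.

```python
import posixpath


def v4_path(v3_path, pseudo_root):
    """Translate an NFSv3-style export path into the NFSv4 path to mount.

    `pseudo_root` is the directory exported with fsid=0. Returns None if
    the path lies outside the pseudo root, which this simplified model
    treats as not reachable over NFSv4.
    """
    v3 = posixpath.normpath(v3_path)
    root = posixpath.normpath(pseudo_root)
    if v3 == root:
        return "/"
    prefix = root.rstrip("/") + "/"
    if v3.startswith(prefix):
        return "/" + v3[len(prefix):]
    return None


if __name__ == "__main__":
    for advertised in ["/home", "/home/tdh", "/export"]:
        print(advertised, "->", v4_path(advertised, "/home"))
```

With '/home' as the pseudo root, '/home' becomes '/' and '/home/tdh' becomes '/tdh', which is the knowledge an automounter working from the export list does not have.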
You can find a discussion of how the NFSv4 namespace is built for Linux at Exporting Directories.
I've started posting the slides for Connectathon 2007: Talks 2007. As I get the remaining slides, I'll add them there.
Went to the local Starbucks here at Connectathon 2007. The guy looked up and said "Awake, right?" The guy I was with was floored. I told him, what is so hard - I'm 6'5", currently sporting a handlebar, and always wearing a Green Lantern hoodie.
The event is going along fine. The main problem is that NFSv3 is too solid and the NFSv4 implementations are also getting that way. The NFSv4.1 stuff is really still in the design phase. But developers are getting small victories when they either get code to compile or even run against other vendors. I think that Connectathon 2008 will be more frantic and the victories will be larger.
I just posted the presentation schedule for Connectathon 2007 as Talks 2007. The talks are open to the public (go ahead, jam the room and ask probing questions):
Parkside Hall
180 Park Ave.
San Jose, CA 95113
This is the building which is connected to The Tech Museum of Innovation in downtown San Jose. In most other cities, I'd tell you to look for the introverted and pale computer geeks. That wouldn't work for that area.
Wow, I was in the Sun blogging tool, it is Apache Roller, and I deleted an image file. The name had a typo. Anyway, I caught it displaying a .nfs file:
.nfs797 is a file which was deleted while some other application still had a file handle open on it. Since the original file name might be reused, a temporary file name is assigned to the deleted file. You might see this a lot on some of your systems. Now, if the server reboots before the client releases the file handle, it might keep the .nfs file on disk for a very long time. And if these files are large, they will eat up your space.
Under Solaris, you can use /usr/lib/fs/nfs/nfsfind to clean up ones which are over a week old:
if [ ! -s /etc/dfs/sharetab ]; then exit ; fi
# Get all NFS filesystems exported with read-write permission.
DIRS=`/usr/bin/nawk '($3 != "nfs") { next }
($4 ~ /^rw$|^rw,|^rw=|,rw,|,rw=|,rw$/) { print $1; next }
($4 !~ /^ro$|^ro,|^ro=|,ro,|,ro=|,ro$/) { print $1 }' /etc/dfs/sharetab`
for dir in $DIRS
do
find $dir -type f -name .nfs\* -mtime +7 -mount -exec rm -f {} \;
done
For fun, what do you think happens to any of those .nfs files which are still referenced by a client? Well, they get a new .nfs name. The name isn't special, it is just a convention. Think about it for a while.
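The nawk filter in that script is dense, so here is the same selection logic restated in Python as a reading aid. It is a sketch, not a replacement for nfsfind: the function name `writable_nfs_shares` and the sample sharetab lines are invented, and it assumes the usual whitespace-separated sharetab layout of path, resource, filesystem type, and options.

```python
import re

# Same alternations as the nawk patterns: "rw" or "ro" as a whole option,
# optionally followed by "=" and a host list.
RW = re.compile(r"^rw$|^rw,|^rw=|,rw,|,rw=|,rw$")
RO = re.compile(r"^ro$|^ro,|^ro=|,ro,|,ro=|,ro$")


def writable_nfs_shares(sharetab_text):
    """Return the shared paths that NFS clients may write to."""
    dirs = []
    for line in sharetab_text.splitlines():
        fields = line.split()
        fields += [""] * (4 - len(fields))   # missing fields read as empty, as in awk
        if fields[2] != "nfs":
            continue                         # not an NFS share
        options = fields[3]
        # Explicit rw wins; otherwise a share is writable unless marked ro.
        if RW.search(options) or not RO.search(options):
            dirs.append(fields[0])
    return dirs


if __name__ == "__main__":
    sample = "\n".join([
        "/export/zfs\t-\tnfs\trw",
        "/export/ro\t-\tnfs\tro",
        "/export/mixed\t-\tnfs\tsec=sys,rw=hosta",
        "/export/smb\t-\tsmb\trw",
        "/export/default\t-\tnfs\tanon=0",
    ])
    print(writable_nfs_shares(sample))
```

The point of the second pattern is that a share with neither option is read-write by default, so only shares that are purely read-only are skipped.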
I've got my airline tickets, I've got my hotel room, and I've let everyone know I'll be in town. Connectathon 2007 is about to happen. Watch that space for a list of public presentations - if you are in the downtown San Jose area, you are welcome to drop in to listen.
In Some fun with NFSv4 and automount across a ssh tunnel, I revealed the work going on in Solaris for Mirror Mounts. The example was a desire to automount across a ssh tunnel. Well, I dusted off wont, the box from hell (being used by my son for video games) and created some zfs filesystems on it:
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
zoo 398K 118G 24.5K /zoo
zoo/home 256K 118G 35.5K /export/zfs
zoo/home/braves 24.5K 118G 24.5K /export/zfs/braves
zoo/home/kanigix 24.5K 118G 24.5K /export/zfs/kanigix
zoo/home/loghyr 24.5K 118G 24.5K /export/zfs/loghyr
zoo/home/mrx 24.5K 118G 24.5K /export/zfs/mrx
zoo/home/nfsv2 24.5K 118G 24.5K /export/zfs/nfsv2
zoo/home/nfsv3 24.5K 118G 24.5K /export/zfs/nfsv3
zoo/home/nfsv4 24.5K 118G 24.5K /export/zfs/nfsv4
zoo/home/spud 24.5K 118G 24.5K /export/zfs/spud
zoo/home/tdh 24.5K 118G 24.5K /export/zfs/tdh
# uname -a
SunOS wont 5.11 snv_55 i86pc i386 i86pc
I then opened a ssh tunnel to it on my Fedora Core 4 box and did a little bit of exploring:
[tdh@adept tdh]> uname -a
Linux adept 2.6.15-1.1833_FC4 #1 Wed Mar 1 23:41:37 EST 2006 i686 i686 i386 GNU/Linux
[tdh@adept ~/usenix]> ssh -fN -L "5049:wont:2049" wont
Password:
[tdh@adept ~/usenix]> sudo mount -o port=5049 -t nfs4 localhost:/ /nfs4/wont
[tdh@adept ~/usenix]> cd /nfs4/wont
[tdh@adept wont]> ls -la
total 6
drwxr-xr-x 38 root root 1024 Dec 31 17:49 .
drwxr-xr-x 4 root root 4096 Dec 31 18:17 ..
drwxr-xr-x 4 root sys 512 Dec 31 17:50 export
[tdh@adept wont]> cd export
[tdh@adept export]> ls -la
total 4
drwxr-xr-x 4 root sys 512 Dec 31 17:50 .
drwxr-xr-x 38 root root 1024 Dec 31 17:49 ..
drwxr-xr-x 11 root sys 11 Dec 31 17:50 zfs
[tdh@adept export]> cd zfs
[tdh@adept zfs]> ls -la
total 16
drwxr-xr-x 11 root sys 11 Dec 31 17:50 .
drwxr-xr-x 4 root sys 512 Dec 31 17:50 ..
drwxr-xr-x 2 root sys 2 Dec 31 17:50 braves
drwxr-xr-x 2 root sys 2 Dec 31 17:50 kanigix
drwxr-xr-x 2 root sys 2 Dec 31 17:50 loghyr
drwxr-xr-x 2 root sys 2 Dec 31 17:50 mrx
drwxr-xr-x 2 root sys 2 Dec 31 17:50 nfsv2
drwxr-xr-x 2 root sys 2 Dec 31 17:50 nfsv3
drwxr-xr-x 2 root sys 2 Dec 31 17:50 nfsv4
drwxr-xr-x 2 root sys 2 Dec 31 17:50 spud
drwxr-xr-x 2 root sys 2 Dec 31 17:50 tdh
[tdh@adept zfs]> cd tdh
[tdh@adept tdh]> ls -la
total 3
drwxr-xr-x 2 root sys 2 Dec 31 17:50 .
drwxr-xr-x 11 root sys 11 Dec 31 17:50 ..
Notice that I only did one mount command. As I crossed down into the exported filesystems, the Linux 2.6.15 implementation of NFSv4 did the mounts automatically for me in the background. Also, note that since '/' is not exported from wont, this must be a pseudo-fs:
[tdh@adept tdh]> showmount -e wont
Export list for wont:
/export/zfs (everyone)
/export/zfs/nfsv2 (everyone)
/export/zfs/nfsv3 (everyone)
/export/zfs/nfsv4 (everyone)
/export/zfs/tdh (everyone)
/export/zfs/loghyr (everyone)
/export/zfs/kanigix (everyone)
/export/zfs/mrx (everyone)
/export/zfs/spud (everyone)
/export/zfs/braves (everyone)
Let's export '/' and see what happens:
# share -F nfs -o rw -d "root" /
And on the Linux box:
[tdh@adept tdh]> cd /nfs4/wont
[tdh@adept wont]> ls -la
total 6
drwxr-xr-x 38 root root 1024 Dec 31 17:49 .
drwxr-xr-x 4 root root 4096 Dec 31 18:17 ..
drwxr-xr-x 4 root sys 512 Dec 31 17:50 export
What happened? Why didn't we see the root directory on wont? Well, when we did the mount command earlier, we basically got a reference to a file handle in the pseudo-fs. We need to flush this by umounting and remounting:
[tdh@adept wont]> cd
[tdh@adept ~]> sudo umount /nfs4/wont/
[tdh@adept ~]> sudo mount -o port=5049 -t nfs4 localhost:/ /nfs4/wont
[tdh@adept ~]> cd /nfs4/wont
[tdh@adept wont]> ls -la
total 67
drwxr-xr-x 38 root root 1024 Dec 31 17:49 .
drwxr-xr-x 4 root root 4096 Dec 31 18:17 ..
lrwxrwxrwx 1 root root 9 Dec 31 13:17 bin -> ./usr/bin
drwxr-xr-x 5 root sys 512 Dec 31 14:12 boot
drwxr-xr-x 2 root root 512 Dec 31 14:51 Desktop
drwxr-xr-x 24 root sys 4096 Dec 31 14:42 dev
drwxr-xr-x 10 root sys 512 Dec 31 14:42 devices
drwxr-xr-x 2 root root 512 Dec 31 14:51 Documents
drwxr-xr-x 9 root root 512 Dec 31 17:31 .dt
-rwxr-xr-x 1 root root 5111 Dec 31 14:51 .dtprofile
-rw------- 1 root root 16 Dec 31 17:31 .esd_auth
drwxr-xr-x 87 root sys 4608 Dec 31 17:52 etc
drwxr-xr-x 4 root sys 512 Dec 31 17:50 export
...
Let's walk down the paths again and see what happens:
[tdh@adept wont]> cd export
[tdh@adept export]> ls -la
total 5
drwxr-xr-x 4 root sys 512 Dec 31 17:50 .
drwxr-xr-x 38 root root 1024 Dec 31 17:49 ..
drwxr-xr-x 2 root root 512 Dec 31 13:17 home
drwxr-xr-x 11 root sys 11 Dec 31 17:50 zfs
[tdh@adept export]> cd zfs
[tdh@adept zfs]> ls -la
total 16
drwxr-xr-x 11 root sys 11 Dec 31 17:50 .
drwxr-xr-x 4 root sys 512 Dec 31 17:50 ..
drwxr-xr-x 2 root sys 2 Dec 31 17:50 braves
drwxr-xr-x 2 root sys 2 Dec 31 17:50 kanigix
drwxr-xr-x 2 root sys 2 Dec 31 17:50 loghyr
drwxr-xr-x 2 root sys 2 Dec 31 17:50 mrx
drwxr-xr-x 2 root sys 2 Dec 31 17:50 nfsv2
drwxr-xr-x 2 root sys 2 Dec 31 17:50 nfsv3
drwxr-xr-x 2 root sys 2 Dec 31 17:50 nfsv4
drwxr-xr-x 2 root sys 2 Dec 31 17:50 spud
drwxr-xr-x 2 root sys 2 Dec 31 17:50 tdh
[tdh@adept zfs]> cd tdh
[tdh@adept tdh]> ls -la
total 3
drwxr-xr-x 2 root sys 2 Dec 31 17:50 .
drwxr-xr-x 11 root sys 11 Dec 31 17:50 ..
Let's make sure we are in the right place:
# scp sandman:/export/home/tdh/.tcshrc .
Password:
.tcshrc 100% |*******************************************************************| 5417 00:00
# chown tdh:staff .tcshrc
# ls -la
total 18
drwxr-xr-x 2 root sys 3 Dec 31 18:10 .
drwxr-xr-x 11 root sys 11 Dec 31 17:50 ..
-rw------- 1 tdh staff 5417 Dec 31 18:10 .tcshrc
And on the client:
[tdh@adept tdh]> ls -la
total 9
drwxr-xr-x 2 root sys 3 Dec 31 18:10 .
drwxr-xr-x 11 root sys 11 Dec 31 17:50 ..
-rw------- 1 tdh nobody 5417 Dec 31 18:10 .tcshrc
[tdh@adept tdh]> grep 10 /etc/group
wheel:x:10:root
The group shows up as nobody because NFSv4 sends the group across the wire as a string, and the Linux client has no group named "staff" (its gid 10 is "wheel"), so there is no mapping. In NFSv3, the numeric 10 would have gone across the wire and the ls command would have spit out "wheel".
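The difference between the two protocols here can be modelled in a few lines. This is a sketch of the idea only, not how any real ID mapping daemon is written: the helper names, the `example.com` domain, and the group tables are assumptions for the example, and domain matching is ignored entirely.

```python
def display_group_v3(gid, local_groups):
    """NFSv3 sends the numeric gid; the client shows its own name for it."""
    for name, number in local_groups.items():
        if number == gid:
            return name
    return str(gid)   # no local name, so ls falls back to the number


def display_group_v4(owner_group, local_groups, fallback="nobody"):
    """NFSv4 sends a name@domain string; the client maps the name locally."""
    name = owner_group.split("@", 1)[0]
    return name if name in local_groups else fallback


if __name__ == "__main__":
    # Hypothetical group table on the Linux client: gid 10 is wheel there.
    linux_groups = {"root": 0, "wheel": 10}
    print(display_group_v3(10, linux_groups))
    print(display_group_v4("staff@example.com", linux_groups))
```

The same file therefore lists as group "wheel" over NFSv3 and as "nobody" over NFSv4, even though nothing changed on the server.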
Okay, let's check to see what the Solaris client would have done:
[tdh@sandman ~]> ssh -fN -L "5049:wont:2049" wont
Password:
[tdh@sandman ~]> su -
Password:
Sun Microsystems Inc. SunOS 5.11 snv_54 October 2007
# mkdir -p /nfs4/wont
# mount -o port=5049 localhost:/ /nfs4/wont
# exit
[tdh@sandman ~]> cd /nfs4/wont
[tdh@sandman wont]> ls -la
total 134
drwxr-xr-x 38 root root 1024 Dec 31 17:49 .
drwxr-xr-x 3 root root 512 Dec 31 18:17 ..
...
drwxr-xr-x 2 root root 512 Dec 31 14:51 Desktop
drwxr-xr-x 2 root root 512 Dec 31 14:51 Documents
lrwxrwxrwx 1 root root 9 Dec 31 13:17 bin -> ./usr/bin
drwxr-xr-x 5 root sys 512 Dec 31 14:12 boot
drwxr-xr-x 24 root sys 4096 Dec 31 14:42 dev
drwxr-xr-x 10 root sys 512 Dec 31 14:42 devices
drwxr-xr-x 87 root sys 4608 Dec 31 17:52 etc
drwxr-xr-x 4 root sys 512 Dec 31 17:50 export
...
[tdh@sandman wont]> cd export
[tdh@sandman export]> ls -la
total 9
drwxr-xr-x 4 root sys 512 Dec 31 17:50 .
drwxr-xr-x 38 root root 1024 Dec 31 17:49 ..
drwxr-xr-x 2 root root 512 Dec 31 13:17 home
drwxr-xr-x 11 root sys 11 Dec 31 17:50 zfs
[tdh@sandman export]> cd zfs
[tdh@sandman zfs]> ls -la
total 5
drwxr-xr-x 11 root sys 11 Dec 31 17:50 .
drwxr-xr-x 4 root sys 512 Dec 31 17:50 ..
Okay, we have hit the crux of the problem for Mirror Mounts. We have a filesystem crossing on the server which needs to be mirrored on the client. We have to do this manually (or with an automounter if the ports are open):
[tdh@sandman zfs]> cd
[tdh@sandman ~]> su -
Password:
Sun Microsystems Inc. SunOS 5.11 snv_54 October 2007
# mount -o port=5049 localhost:/export/zfs /nfs4/wont/export/zfs
# ls -la /nfs4/wont/export/zfs
total 32
drwxr-xr-x 11 root sys 11 Dec 31 17:50 .
drwxr-xr-x 4 root sys 512 Dec 31 17:50 ..
drwxr-xr-x 2 root sys 2 Dec 31 17:50 braves
drwxr-xr-x 2 root sys 2 Dec 31 17:50 kanigix
drwxr-xr-x 2 root sys 2 Dec 31 17:50 loghyr
drwxr-xr-x 2 root sys 2 Dec 31 17:50 mrx
drwxr-xr-x 2 root sys 2 Dec 31 17:50 nfsv2
drwxr-xr-x 2 root sys 2 Dec 31 17:50 nfsv3
drwxr-xr-x 2 root sys 2 Dec 31 17:50 nfsv4
drwxr-xr-x 2 root sys 2 Dec 31 17:50 spud
drwxr-xr-x 2 root sys 3 Dec 31 18:10 tdh
# tcsh
# ls -la /nfs4/wont/export/zfs/tdh
total 6
drwxr-xr-x 2 root sys 3 Dec 31 18:10 .
drwxr-xr-x 11 root sys 11 Dec 31 17:50 ..
# mount -o port=5049 localhost:/export/zfs/tdh /nfs4/wont/export/zfs/tdh
# ls -la /nfs4/wont/export/zfs/tdh
total 18
drwxr-xr-x 2 root sys 3 Dec 31 18:10 .
drwxr-xr-x 11 root sys 11 Dec 31 17:50 ..
-rw------- 1 tdh staff 5417 Dec 31 18:10 .tcshrc
Notice how the '/export/zfs' gave information about the child filesystems whereas '/' did not. Also, note how we get the correct group name because the '/etc/group' is the same on the two Solaris hosts. Finally, even with zfs presenting up the child filesystems, we did have to manually mount the child in order to peer into it.
So the Mirror Mounts project in the NFSv4 development team is going to fix all of this. Under the hood, the client is going to understand it is about to traverse to a different filesystem and do the equivalent of an NFSv3 mount.
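As a rough illustration of what "understand it is about to traverse to a different filesystem" means, the sketch below walks a path one component at a time and reports where the filesystem identifier changes. It is not the Solaris client code: `find_crossings` is a made-up helper, the default lookup uses the local st_dev as a stand-in for the fsid attribute an NFSv4 server reports, and the fsid numbers in the example table are invented.

```python
import os
import posixpath


def find_crossings(path, get_fsid=None):
    """Return the path prefixes at which the filesystem id changes.

    `get_fsid` maps a path to a filesystem identifier. By default it
    uses os.stat(path).st_dev on the local machine.
    """
    if get_fsid is None:
        get_fsid = lambda p: os.stat(p).st_dev
    crossings = []
    current = "/"
    previous = get_fsid(current)
    for part in [p for p in posixpath.normpath(path).split("/") if p]:
        current = posixpath.join(current, part)
        fsid = get_fsid(current)
        if fsid != previous:
            crossings.append(current)   # a mirror mount would be triggered here
        previous = fsid
    return crossings


if __name__ == "__main__":
    # Toy fsid table shaped like the wont server: '/' and '/export' share a
    # filesystem, while the zfs filesystems each have their own.
    server = {"/": 1, "/export": 1, "/export/zfs": 2, "/export/zfs/tdh": 3}
    print(find_crossings("/export/zfs/tdh", server.__getitem__))
```

For the toy table the crossings are '/export/zfs' and '/export/zfs/tdh', which are exactly the two places where a manual mount was needed in the walkthrough above.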