20080329 Saturday March 29, 2008
Using Crossbow to get a host interface network for VirtualBox

Okay, my snv 73 box is now an snv 85 box. Everything is working except for my punchin, but only because I need to bypass my Sun Ray 1G (and the Sun Ray Server 4.0 was dead easy to install). But the vnic_setup.sh script is still not working:

# ./vnic_setup.sh 0:1:4a:f2:31:34
Invalid link name: LINK
# LD_LIBRARY_PATH=/opt/VirtualBox:/opt/VirtualBox/qtgcc/lib:. ; export LD_LIBRARY_PATH
# ./vnic_setup.sh 0:1:4a:f2:31:34
Invalid link name: LINK
# ./vnic_setup.sh 0:1:4a:f2:31:34 vnic1
vnic1
# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
	inet 127.0.0.1 netmask ff000000 
bge0: flags=201000843 mtu 1500 index 2
	inet 192.168.2.130 netmask ffffff00 broadcast 192.168.2.255
	ether 0:a:e4:34:2f:da 
lo0: flags=2002000849 mtu 8252 index 1
	inet6 ::1/128 

Well, the VirtualBox manual tells me how to do it manually:

# /usr/lib/vna bge0 0:1:4a:f2:31:34
vnic0
# ifconfig vnic0 plumb
# ifconfig -a 
lo0: flags=2001000849 mtu 8232 index 1
	inet 127.0.0.1 netmask ff000000 
bge0: flags=201000843 mtu 1500 index 2
	inet 192.168.2.130 netmask ffffff00 broadcast 192.168.2.255
	ether 0:a:e4:34:2f:da 
vnic0: flags=201000842 mtu 1500 index 3
	inet 0.0.0.0 netmask 0 
	ether 0:1:4a:f2:31:34 
lo0: flags=2002000849 mtu 8252 index 1
	inet6 ::1/128 
#  /usr/lib/vna bge0 0:1:4a:f2:31:36
vnic1
# /usr/lib/vna bge0 0:1:4a:f2:31:38
vnic2
# ifconfig vnic1 plumb
# ifconfig vnic2 plumb
# ifconfig vnic0 192.168.2.150 destination 192.168.2.160 netmask 255.255.255.0 up
# ifconfig vnic1 192.168.2.151 destination 192.168.2.161  netmask 255.255.255.0 up
# ifconfig vnic2 192.168.2.152 destination 192.168.2.162 netmask 255.255.255.0 up
# ifconfig -a
lo0: flags=2001000849 mtu 8232 index 1
	inet 127.0.0.1 netmask ff000000 
bge0: flags=201000843 mtu 1500 index 2
	inet 192.168.2.130 netmask ffffff00 broadcast 192.168.2.255
	ether 0:a:e4:34:2f:da 
vnic0: flags=201000851 mtu 1500 index 3
	inet 192.168.2.150 --> 192.168.2.160 netmask ffffff00 
	ether 0:1:4a:f2:31:34 
vnic1: flags=201000851 mtu 1500 index 4
	inet 192.168.2.151 --> 192.168.2.161 netmask ffffff00 
	ether 0:1:4a:f2:31:36 
vnic2: flags=201000851 mtu 1500 index 5
	inet 192.168.2.152 --> 192.168.2.162 netmask ffffff00 
	ether 0:1:4a:f2:31:38 
lo0: flags=2002000849 mtu 8252 index 1
	inet6 ::1/128 

When the system comes up, it has an IP of 192.168.2.29, and I can't ping any of the three.
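
The manual steps above repeat the same three commands per VNIC, so they can be wrapped in a small function. This is a minimal sketch, not the vnic_setup.sh shipped with VirtualBox: make_vnic and the VNA/IFCONFIG override variables are names made up here, and it does nothing about the ping failure described above.

```shell
#!/bin/sh
# Sketch only: wraps the three manual steps shown above for one VNIC.
# VNA and IFCONFIG are overridable so the function can be dry-run with stubs;
# make_vnic is an illustrative name, not part of VirtualBox or Solaris.
VNA=${VNA:-/usr/lib/vna}
IFCONFIG=${IFCONFIG:-ifconfig}

# make_vnic LINK MAC LOCAL_IP DEST_IP
# Prints the VNIC name that vna reports (for example vnic0).
make_vnic() {
    vnic=$($VNA "$1" "$2") || return 1
    $IFCONFIG "$vnic" plumb || return 1
    $IFCONFIG "$vnic" "$3" destination "$4" netmask 255.255.255.0 up || return 1
    echo "$vnic"
}

# Example (as root on the host):
#   make_vnic bge0 0:1:4a:f2:31:34 192.168.2.150 192.168.2.160
```

Because vna picks the VNIC name itself, the function captures that name rather than taking it as an argument.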

It looks like I need to learn the CLI for VirtualBox. Here is a related article: Internal network does not work for OpenSolaris guests


Originally posted on Kool Aid Served Daily
Copyright (C) 2008, Kool Aid Served Daily

20080328 Friday March 28, 2008
Installing a host network for VirtualBox

As mentioned, I have a minimal install for my OpenSolaris test machines. Evidently I am missing something quite important for getting a host network up and running:

# ./vnic_setup.sh 0:1:4a:f2:31:34
./vnic_setup.sh: line 42: /usr/lib/vna: No such file or directory

Hmm, I can either load the iso and get the real files or learn how to work with the NAT mode.

So either I do a complete reinstall or I figure out how to get just a couple of files over. Sounds like I should learn about NAT anyway.

With two machines and NAT, they are both getting the same address. I could use the Internal Network option, but I'm still going to have to reinstall. Hmm, when I build a machine, I select Core Configure and add the following:

BIND DNS Name server and tools
BIND Name server Manifest
Filebench
Freeware Compression Tools
Perl-Compatible Regular Expressions
Freeware shells
Freeware Other Utilities
GLIB
XCU4 Utilities
GNU Autotools
Secure Shell
GNU which
gcmn - Common GNU package
gdb
ggrep
gtar
Lint Libraries (root)
Lint Libraries (usr)
GNU binutils
GNU diffutils
Perl 5.6.1 (core)
Perl 5.6.1 (non-core)
GNU texinfo
Libevent
Get all of System and Network Admin
Live Upgrade Software
MDB (root)
Programming Tools
resource pool (root)
Resource Pools in core software for resource pools
Solaris Zones
Vi IMproved
autoconf
bcc
coreutils
rsync
RPCSEC_GSS
Kerberos V5 KDC (root)
Kerberos V5 Master KDC (root)
Kerberos Version 5 support (kernel)
NIS Server for Solaris (root)
NIS Server for Solaris (usr)
Interprocess Communication

Time to find what I need to add. First we need to look in the ISO image:

[tdh@warlock ~]> sudo lofiadm -a /zoo/isos/x86/snv85/solarisdvd.iso 
/dev/lofi/1
[tdh@warlock ~]> sudo mount -F hsfs /dev/lofi/1 /mnt
[tdh@warlock Product]> cd /mnt/Solaris_11/Product/
[tdh@warlock Product]> grep lib/vna */pkgmap
SUNWcsu/pkgmap:1 f none usr/lib/vna 0555 root bin 12592 13767 1204942578

Hmm, wait, I need to find that on my host system and not the guest machine. D'Oh!

[tdh@warlock lib]> uname -a
SunOS warlock 5.11 snv_73 i86pc i386 i86pc
[tdh@warlock lib]> sudo lofiadm -d /zoo/isos/x86/snv7
snv79/ 

I have the DVD, but 73 is ancient! But I'll check:

[tdh@warlock Product]> grep lib/vna SUNWcsu/pkgmap
[tdh@warlock Product]> 

Ugh, the biggest hassle is that I use this machine as a Sun Ray Server. Okay, time for a reinstall!
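
Grepping for lib/vna matches any pathname that merely contains that string. A stricter lookup compares the pathname field exactly. This is a minimal sketch, assuming the usual pkgmap layout in which the fourth field of a file entry is its pathname (as in the SUNWcsu line above); pkg_for_path is a hypothetical helper name.

```shell
#!/bin/sh
# pkg_for_path PATH PKGMAP...
# Print the directory of every pkgmap file that lists PATH exactly.
# Assumes the fourth field of a file entry is its pathname.
pkg_for_path() {
    path=$1
    shift
    awk -v p="$path" '$4 == p {
        dir = FILENAME
        sub(/\/pkgmap$/, "", dir)   # keep only the package directory name
        print dir
    }' "$@"
}

# Example, run from /mnt/Solaris_11/Product:
#   pkg_for_path usr/lib/vna */pkgmap
```

An empty result means no package on that media delivers the file, which is what the snv 73 check showed.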


Do not need to get Qt libraries with beta of VirtualBox

RTFM, or at least the one which comes with the distribution. I suspected the online one does not have OpenSolaris support, and I was right: it does not.

Anyway, you do not need to retrieve the Qt libraries and build them, as I just did; they are supplied for you.

cd /opt/VirtualBox
LD_LIBRARY_PATH=/opt/VirtualBox:/opt/VirtualBox/qtgcc/lib:. ./VirtualBox

Initial interactions with VirtualBox

Downloading and installing VirtualBox on my OpenSolaris box was a snap. Running it and installing a guest on it was a bit harder. First off, you need Qt installed. Easy instructions are at VirtualBox on OpenSolaris. Then you need to tell it that something other than Right-Ctrl is your host key; you need to do this because Sun keyboards do not have that key. I chose my Right-Meta key. Imagine VirtualBox has your mouse and keyboard, your screensaver kicks in, and you can't get the system to understand you want to bring it to life.

Telling the tool how to load my OpenSolaris iso image was a bit counterintuitive. But the biggest problem I ended up having was giving only 8M to my graphics card. The Developer's install seemed okay with that, but the normal graphical install would puke on it. I was able to do a text install, up to the point where the install decided I was doing an NFS install and did not like the path I was giving. It also would not see the attached iso as a cdrom.

"Kicking both tyres together - VirtualBox and Indiana" was very helpful for this issue: Alan casually mentions that you need 32M of video RAM. I gave the system that and could then do the graphical install. And it correctly identified the iso as a cdrom. I'll bring that value down once I finish the install. I'm not going to install a GUI on this machine.


Follow-up on pNFS testing under VMWare

I got a pNFS community and client to run under VMWare on an XP box. Okay, so I made sure to make independent clones of the same machine as before. And this time I went from 512M to 1G of available RAM. The other thing I changed is that when building the cthon tests, I changed the config a bit.

I went from 512M to 1G because the archive updates were taking forever on the clones with 512M but went fast on the one with about 2G:

updating /platform/i86pc/boot_archive
updating /platform/i86pc/amd64/boot_archive

I'm talking 45 minutes or more. Once I pushed the memory up, these updated much faster.

I used the following configuration for using gcc on a 64-bit OpenSolaris:

[tdh@m-client cthon04]> diff tests.init ~/cthon04/tests.init
57c57
< PATH=/opt/SUNWspro/bin:/usr/ccs/bin:/sbin:/bin:/usr/bin:/usr/ucb:/etc:.
---
> #PATH=/opt/SUNWspro/bin:/usr/ccs/bin:/sbin:/bin:/usr/bin:/usr/ucb:/etc:.
61c61
< #PATH=/opt/gnu/bin:/usr/ccs/bin:/sbin:/bin:/usr/bin:/usr/ucb:/etc:.
---
> PATH=/usr/sfw/bin:/usr/ccs/bin:/sbin:/bin:/usr/bin:/usr/ucb:/etc:.
133c133
< CC=/opt/SUNWspro/bin/cc
---
> #CC=/opt/SUNWspro/bin/cc
135c135
< #CC=/opt/gnu/bin/gcc
---
> CC=/usr/sfw/bin/gcc
138c138
< CFLAGS=`echo -DSVR4 -DMMAP -DSOLARIS2X -DSTDARG`
---
> #CFLAGS=`echo -DSVR4 -DMMAP -DSOLARIS2X -DSTDARG`
145c145
< #CFLAGS=`echo -DSVR4 -DMMAP -DSOLARIS2X -DSTDARG -m64`
---
> CFLAGS=`echo -DSVR4 -DMMAP -DSOLARIS2X -DSTDARG -m64 -D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE`
150c150
< LOCKTESTS=`echo tlocklfs tlock64`
---
> #LOCKTESTS=`echo tlocklfs tlock64`
152c152
< #LOCKTESTS=`echo tlocklfs`
---
> LOCKTESTS=`echo tlocklfs`

In the previous run, I hadn't set -D_LARGEFILE64_SOURCE and I didn't fix LOCKTESTS correctly. While the -D_LARGEFILE64_SOURCE might have been what was killing me, I don't think so.

The DS hung during the write/read of the 30 MB file. It was unresponsive on the console. I heard the disk chugging, so I killed off Thunderbird and Firefox, and it did come back. My guess is that the 512M on the earlier systems was insufficient. I've had problems in the past with virtual machines that were trying large IO. (A real machine wrote a 50G file to an NFS simulator and they complained about the speed. They shut up when I had them go against the real box.)

So the experiment works. I'm working on VirtualBox on the side still.


Trying to do pNFS testing under VMWare

I'm trying to get an NFSv4.1 (aka pNFS) DS, MDS, and client all running as VMWare machines on my XP box. I took a base nevada 85 system (with a Core Custom load - which only eats up 1.4G of disk space) and loaded the on-pnfs-draft19-onnv85-bfu-20080324.i386.tar.bz2 bfu bits (check out our pNFS Download page in OpenSolaris: http://www.opensolaris.org/os/project/nfsv41/downloads/).

I then cloned the resulting system into m-client, m-ds, and m-mds. I was able to configure everything okay, but the system is locking up during the NFS Cthon tests:


write/read 30 MB file

After some investigation, I don't think this is a pNFS issue. The m-ds machine is hanging, consistently. It will hang even if I don't run the test. It isn't dropping into kmdb and is totally unresponsive on the console.

Either I didn't clone the original machine correctly or I'm running out of resources. I've had at least 3 machines running concurrently in the past, so I doubt it is resources. Also, the machines each are limited to 512M of memory.

I may play with this a bit more or try VirtualBox, which can be hosted under Solaris and OpenSolaris. I can run it on my w2100z. It now has a whopping 16G of RAM and should be able to handle plenty of virtual machines.



20080320 Thursday March 20, 2008
auth records will not load if no share enabled

If you fail to share a zfs pool on the mds, then nfs will not be enabled and therefore the mds auth records will not load.

#  mdsadm -o add -t auth -a ip=10.1.233.117
adding: IP Addr - 10.1.233.117
Mar 20 20:37:41 pnfs-3-15 nfs: NFS Server not loaded
# zpool create -f nippy /dev/dsk/c1t0d0s7
#  echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k

mdb: failed to perform walk: unknown walk name

Above we see mdsadm clearly telling us we have an issue. And the fact that we can't use our simple mdb command to see the auth records is a worry.

Just as I rebooted, I realized what the problem was. And we can check it out:

#  mdsadm -o add -t auth -a ip=10.1.233.117
adding: IP Addr - 10.1.233.117
Mar 20 20:42:54 pnfs-3-15 nfs: NFS Server not loaded
# echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k
mdb: failed to perform walk: unknown walk name
# echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k
mdb: failed to perform walk: unknown walk name
# zfs set sharenfs=rw,anon=0 nippy
#  echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k

We can see that once the share is enabled, we can walk the structures. We can redo the mdsadm command:

# mdsadm -o add -t auth -a ip=10.1.233.117
adding: IP Addr - 10.1.233.117
# echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k

What is wrong now? We need to enable the ds:

# dservadm enable

And:

# echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k
{
    dbe = 0xffffff02f4156f08
    dev_addr = {
        na_r_netid = 0xffffff02f61bfc80 "tcp"
        na_r_addr = 0xffffff02efde8a88 "10.1.233.117.147.49"
    }
    dev_flags = 0x3
    dev_infop = 0xffffff02f4157f78
    dev_list_next = {
        list_next = 0xffffff02f4157fb8
        list_prev = 0xffffff02f4157fb8
    }
}
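
The na_r_addr string is a universal address: for IPv4 over tcp it is the four address octets followed by two port octets, with port = high * 256 + low. A minimal sketch of decoding it follows; uaddr_split is an illustrative helper name, not part of mdb or mdsadm.

```shell
#!/bin/sh
# Split an IPv4 universal address "a.b.c.d.p1.p2" into "a.b.c.d port".
uaddr_split() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the argument on dots into $1..$6
    IFS=$old_ifs
    [ $# -eq 6 ] || return 1
    echo "$1.$2.$3.$4 $(( $5 * 256 + $6 ))"
}

uaddr_split 10.1.233.117.147.49
```

For the record above this gives host 10.1.233.117 and port 37681. A trailing ".8.1" decodes the same way to 2049, the standard NFS port.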


20080319 Wednesday March 19, 2008
Some behind the scenes tips for getting a community working

In no particular order...

  • Do a banner of mds, client, and ds (or ds1, ds2, etc) and place it in /etc/motd for the respective hosts. It will help you more than you think. Especially after a core dump and logging in to the machine.
  • The first clue as to whether your community is set up correctly is whether or not you have an auth record on the mds.

    Here is a bad result:

    mds # echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k
    mds #
    

    And here is a good result:

    # echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k
    {
        dbe = 0xffffff02fbb21f08
        dev_addr = {
            na_r_netid = 0xffffff02f6d17b20 "tcp"
            na_r_addr = 0xffffff02ec14c000 "10.1.233.119.217.62"
        }
        dev_flags = 0x3
        dev_infop = 0xffffff02fbb22f78
        dev_list_next = {
            list_next = 0xffffff02fbb22fb8
            list_prev = 0xffffff02fbb22fb8
        }
    }
    

    Unfortunately, many problems have this symptom.

  • The only way to tell that your client is talking to the ds is to use snoop. I.e., you may have the client and mds telling you that pNFS is enabled, but until you see traffic to the ds, it is not working!
    client # mount pnfs-3-14:/ /mnt
    Mar 19 11:29:21 pnfs-3-12 nfs: NOTICE: enabling pNFS on pnfs-3-14
    client # cd /mnt
    client # cd
    client # snoop pnfs-3-14 pnfs-3-13
    Using device bge0 (promiscuous mode)
    ^Cclient # cd  
    

    I typically open another window and do simple copies from /etc:

    [pnfs-3-12 zippy]> cp /etc/motd .
    [pnfs-3-12 zippy]> cp /etc/motd kkk
    

    It is only when I see valid traffic that I know I am set.

    # snoop pnfs-3-12 pnfs-3-13
    Using device bge0 (promiscuous mode)
    Mar 19 11:39:07 pnfs-3-12 nfs: NOTICE: enabling pNFS on pnfs-3-14
       pnfs-3-12 -> pnfs-3-13    TCP D=55614 S=1014 Ack=93767047 Seq=74796989 Len=0 Win=49640
       pnfs-3-12 -> pnfs-3-13    NFS C 4 (exchange_id ) EXCHANGE_ID Verf=11E050B247E141B2 COID=F6C0 PNFS_DS  NONE 
    
  • Make sure that you share your pool on the DS.
    ds # zpool create -f dpool /dev/dsk/c1t0d0s7
    ds # dservadm addmds 10.1.233.120.8.1
    ds # dservadm addpool dpool
    ds # dservadm enable
    

    If you don't, then you won't see traffic going from the client to the DS.

    client # snoop pnfs-3-12 pnfs-3-13
    Using device bge0 (promiscuous mode)
    Mar 19 11:19:26 pnfs-3-12 last message repeated 2 times
    

    So go back and do:

    ds # zfs set sharenfs=on dpool
    

    Note you will also not see a record for the auth in the mds:

    mds # echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k
    
  • You need to add the mds auth record every time the mds reboots.
    • Make sure that you disable the ds before you enable the mds:
      ds # dservadm disable
      

      And

      # mdsadm -o add -t auth -a ip=10.1.233.119
      adding: IP Addr - 10.1.233.119
      
      ds # dservadm enable
      
    • You might see two auth records if the ds is not disabled:
      mds # mdsadm -o add -t auth -a ip=10.1.233.119
      adding: IP Addr - 10.1.233.119
      mds # echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k
      

      Whoops, forgot to disable the ds!

      ds # dservadm disable
      

      Still nothing!

      mds # echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k
      

      Okay, issues again:

      mds # mdsadm -o add -t auth -a ip=10.1.233.119
      adding: IP Addr - 10.1.233.119
      nfssys:: File exists
      mds # echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k
      

      And finally:

      ds # dservadm enable
      

      Which yields:

      # echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k
      {
          dbe = 0xffffff02f4493f08
          dev_addr = {
              na_r_netid = 0xffffff02f76f28e0 "tcp"
              na_r_addr = 0xffffff02f51ec820 "10.1.233.119.254.189"
          }
          dev_flags = 0x3
          dev_infop = 0xffffff02f4494f78
          dev_list_next = {
              list_next = 0xffffff02f4494fb8
              list_prev = 0xffffff02f4494fb8
          }
      }
      

      Hmm, I wasn't able to reproduce the double record thing... Oh yes I was!

      ds # dservadm disable
      ds # dservadm enable
      

      And

      mds # echo '::walk Device_entry_cache | ::print struct rfs4_dbe data | ::print mds_device_t' | mdb -k
      {
          dbe = 0xffffff02f4493e18
          dev_addr = {
              na_r_netid = 0xffffff02f76f2fc0 "tcp"
              na_r_addr = 0xffffff02f51ecb20 "10.1.233.119.212.217"
          }
          dev_flags = 0x3
          dev_infop = 0xffffff02f4494f78
          dev_list_next = {
              list_next = 0xffffff02f4494fb8
              list_prev = 0xffffff02f4494fb8
          }
      }
      {
          dbe = 0xffffff02f4493f08
          dev_addr = {
              na_r_netid = 0xffffff02f76f28e0 "tcp"
              na_r_addr = 0xffffff02f51ec820 "10.1.233.119.254.189"
          }
          dev_flags = 0x3
          dev_infop = 0xffffff02f4494f78
          dev_list_next = {
              list_next = 0
              list_prev = 0
          }
      }
      

      The point is that with multiple auth records you may not get the results you want. I'd reboot them both at this point. :->

  • When testing connectathon as root, make sure your share on the mds has either root= or anon=0 set:
    client # ./server -p /nippy -m /mnt/pnfs-3-14 pnfs-3-14
    Start tests on path /mnt/pnfs-3-14/pnfs-3-12.test [y/n]? y
    
    sh ./runtests  -a -t /mnt/pnfs-3-14/pnfs-3-12.test
    
    Starting BASIC tests: test directory /mnt/pnfs-3-14/pnfs-3-12.test (arg: -t)
    mkdir: Failed to make directory "/mnt/pnfs-3-14/pnfs-3-12.test"; Permission denied
    Can't make directory /mnt/pnfs-3-14/pnfs-3-12.test
    basic tests failed
    Tests failed, leaving /mnt/pnfs-3-14 mounted
    

    So:

    mds #share
    -@nippy         /nippy   rw   ""  
    mds # zfs set sharenfs=rw,anon=0 nippy
    


20080313 Thursday March 13, 2008
Using mdb to enable error injection

I've got a bug for which I do not have a reproducible test case. But I'm pretty confident I found what is going wrong. I can run regression tests to show I haven't broken anything - but those same regression tests never tripped the bug in the first place.

The fix is:

------- usr/src/uts/common/fs/nfs/nfs4_stub_vnops.c -------

Index: usr/src/uts/common/fs/nfs/nfs4_stub_vnops.c
23c23
<  * Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
---
>  * Copyright 2008 Sun Microsystems, Inc.  All rights reserved.
27c27
< #pragma ident      "@(#)nfs4_stub_vnops.c  1.3     07/10/25 SMI"
---
> #pragma ident      "%Z%%M% %I%     %E% SMI"
1751a1757,1769
>             * Someone is already working on it. We
>             * need to back off and let them proceed.
>             *
>             * We return EBUSY so that the caller knows
>             * something is going on. Note that by that
>             * time, the umount in the other thread
>             * may have already occured.
>             */
>            if (was_locked) {
>                    return (EBUSY);
>            }
> 
>            /*
1762,1763c1780
<            if (was_locked == FALSE &&
<                !mutex_tryenter(&net->net_tree_lock)) {
---
>            if (!mutex_tryenter(&net->net_tree_lock)) {
1814c1831
<            } else if (was_locked == FALSE) {
---
>            } else {

In English, the old lock detection only handled the case where the lock was not already being held.

What I want to do is force was_locked to be true at this point in both the original code (to verify I can trigger the panic at will) and also in my fix (to verify I have fixed the correct bug).

I can do that by adding the following code:

# wx diffs

------- usr/src/uts/common/fs/nfs/nfs4_stub_vnops.c -------

122a123,124
> int   nfsv4_mm_was_locked = FALSE;
> 
1750a1753,1755
>               if (nfsv4_mm_was_locked)
>                       was_locked = TRUE;

This is a global in the nfs module which by default does not force was_locked to be set. I can use mdb to change it on the fly:

# mdb -kw
Loading modules: [ unix genunix specfs dtrace cpu.generic cpu_ms.AuthenticAMD.15 uppc pcplusmp scsi_vhci ufs mpt ip hook neti sctp arp usba fctl nca lofs cpc random zfs nfs fcip logindmux ptm sppp ]
> nfsv4_mm_was_locked::print
0
> nfsv4_mm_was_locked/W 1
nfsv4_mm_was_locked:            0               =       0x1
> nfsv4_mm_was_locked::print
0x1
> $q

Note that I do not want to add special code to check the environment, add something in /etc/default/nfs, or anything else which requires changing anything on a system. I leave it entirely in the kernel and I use mdb to control it.
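
The interactive session above can also be driven from a script, the same way other posts here pipe commands into mdb -k. A minimal sketch, assuming the patched nfs module with the nfsv4_mm_was_locked global is loaded; set_was_locked and the MDB override variable are names invented for this example.

```shell
#!/bin/sh
# Sketch: flip the nfsv4_mm_was_locked global without an interactive session.
# MDB is overridable so the function can be exercised without a live kernel.
MDB=${MDB:-mdb}

# set_was_locked 0|1
set_was_locked() {
    case $1 in
        0|1) ;;
        *) echo "usage: set_was_locked 0|1" >&2; return 2 ;;
    esac
    # /W writes a 4-byte value, which matches the int declared in the diff.
    echo "nfsv4_mm_was_locked/W $1" | $MDB -kw
}

# Example (as root): set_was_locked 1 before the test run, set_was_locked 0 after.
```

Writing 0 the same way turns the injection back off, so a regression script can bracket just the operation under test.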



20080306 Thursday March 06, 2008
sped not being called

So I have a new daemon and I have it loaded:

# ./sped
10, 16, 64000, 1647890964l uid == 501
20, 32, 2000, 1647890964l uid == 1066
30, 64, 1000, 1647890964l uid == 0
40, 2, 2000, 1647890964l subnet ==  192.168.2.0/24

And I have a door upcall which starts in do_rfs4_op_mknod(). A door is an IPC mechanism used to communicate from the kernel to user land. Also, since the sped hands out policies for open creation, this should be a good point to call it from. Notice the foreshadowing invoked by the word "should" and remember that foreshadowing is the sign of good journalism (I think this is a quote from Bloom County?).

Okay, it didn't work - the sped is not being called. I can whip out kmdb and struggle through trying to see if the door upcall is invoked. But first, we should do a sanity check to see if do_rfs4_op_mknod() is even being called:

# dtrace -m nfssrv > /tmp/dtrace.log
dtrace: description 'nfssrv' matched 2262 probes
^C# grep do /tmp/dtrace.log
 0  51547              mds_do_lookup:entry
 0  51548             mds_do_lookup:return
 0  52741         do_rfs4_op_getattr:entry
 0  52742        do_rfs4_op_getattr:return
 0  52379          mds_findopenowner:entry
 0  52380         mds_findopenowner:return
 0  51593            mds_do_opennull:entry
 0  52127          do_rfs4_set_attrs:entry
 0  52128         do_rfs4_set_attrs:return
 0  51777             vop_fid_pseudo:entry
 0  51778            vop_fid_pseudo:return
 0  52641           do_41_deleg_hack:entry
 0  52642          do_41_deleg_hack:return
 0  51591                mds_do_open:entry
 0  51592               mds_do_open:return
 0  51594           mds_do_opennull:return
 0  52741         do_rfs4_op_getattr:entry
 0  52742        do_rfs4_op_getattr:return
 0  52741         do_rfs4_op_getattr:entry
 0  52742        do_rfs4_op_getattr:return
 1  52127          do_rfs4_set_attrs:entry
 1  52128         do_rfs4_set_attrs:return
 1  51547              mds_do_lookup:entry
 1  51548             mds_do_lookup:return
 1  52741         do_rfs4_op_getattr:entry
 1  52742        do_rfs4_op_getattr:return 

Remember, the rfs component of the function name means that it is in the nfssrv module and not the nfs module. Also, note that you won't see "mds_" names in code drops from Nevada. You will need to download the pNFS tree over at: OpenSolaris Project: NFS version 4.1 pNFS

Okay, it was never called. So either dtrace is hosed or something is going wrong. What is being called? mds_createfile().

 0  51593            mds_do_opennull:entry
 0  51987             mds_createfile:entry
 0  52647            nfsauth4_access:entry
 0  52377             nfsauth_access:entry
 0  52378            nfsauth_access:return
 0  52648           nfsauth4_access:return
 0  52399       nfs4_ntov_table_init:entry
 0  52400      nfs4_ntov_table_init:return 

I'm off to build a new nfssrv to see if making my upcall from mds_createfile() works.
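
Grepping the log for "do" also pulls in names like mds_do_lookup, so the answer has to be read out of the noise. A minimal sketch of an exact check against the same kind of log, assuming the default dtrace output where the last column is function:probe; was_called is a hypothetical helper name, and the sample lines are copied from the output above.

```shell
#!/bin/sh
# was_called FUNC LOGFILE
# Succeed only if LOGFILE has an entry probe for exactly FUNC.
was_called() {
    awk -v f="$1:entry" '$NF == f { found = 1 } END { exit !found }' "$2"
}

# Demo on two lines taken from the trace above.
log=$(mktemp)
cat > "$log" <<'EOF'
 0  51593            mds_do_opennull:entry
 0  51987             mds_createfile:entry
EOF
for f in do_rfs4_op_mknod mds_createfile; do
    if was_called "$f" "$log"; then
        echo "$f: called"
    else
        echo "$f: not called"
    fi
done
rm -f "$log"
```

Another option, if the function has an fbt probe, is to trace only that probe with dtrace -n 'fbt:nfssrv:do_rfs4_op_mknod:entry' instead of the whole module.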

