Saturday July 25, 2009
And we have the pee in pNFS!

That last bug was actually pretty easy to solve. It turned out to be yet more code that had never been tested. If we look at usr/src/uts/common/fs/nfs/dserv_server.c (which will have been modified by the time you read this), we see:

 711         mutex_enter(&inst->dmi_content_lock);
 712         error = find_open_root_objset(inst, sid, &root_objset);
 713         if (error) {
 714                 error = (error == ENOENT) ? EIO : error;
 715                 goto out;
 716         }
 717 
 718         error = find_open_mdsfs_objset(inst, dataset_id, root_objset,
 719             &(dnd->dnd_objset));
 720         if (error == 0 || error != ENOENT) {
 721                 if (error == 0)
 722                         dnd->dnd_flags |= DSERV_NNODE_FLAG_OBJSET;
 723                 goto out;
 724         }

The EAGAIN described in "Hard day of debugging the hard stuff, but now have multiple datasets on the same DS" was returned on line 712 by find_open_root_objset(). The new panic was occurring down in the call to find_open_mdsfs_objset() on line 718. Before my changes, find_open_root_objset() always returned the first pNFS dataset root, so if you had two, you only ever saw the first one.

I decided to skip right to the code, and the bug ended up being easy to spot in find_open_root_objset(). I started there because I hadn't changed find_open_mdsfs_objset() and it was known to work.

 541 static int
 542 find_open_root_objset(dserv_mds_instance_t *inst, mds_sid mds_sid,
 543     open_root_objset_t **root_objset)
 544 {
...
 603         /*
 604          * Find the root pNFS object set.
 605          */
 606         for (tmp_root = list_head(&inst->dmi_datasets); tmp_root != NULL;
 607             tmp_root = list_next(&inst->dmi_datasets, tmp_root)) {
 608                 if (ds_guid.dg_zpool_guid ==
 609                     tmp_root->oro_ds_guid.dg_zpool_guid &&
 610                     ds_guid.dg_objset_guid ==
 611                     tmp_root->oro_ds_guid.dg_objset_guid) {
 612                         /*
 613                          * This is our root pNFS object set!
 614                          */
 615                         found_root_objset = 1;
 616                         break;
 617                 }
 618         }

*root_objset is never set when the match is found, so when we use it on line 718, it is garbage. The fix is a simple assignment. Hmm, I could instead replace tmp_root with *root_objset in the loop, which is what the original author seemed to think was going on here.
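The pattern and the one-line fix can be sketched in a self-contained form. This is not the dserv code: node_t, find_root(), and the hand-rolled linked list are hypothetical stand-ins for open_root_objset_t, find_open_root_objset(), and the list_head()/list_next() walk, and it assumes the fix is an assignment through the out-parameter at the point of the match.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for open_root_objset_t. */
typedef struct node {
	uint64_t	zpool_guid;
	uint64_t	objset_guid;
	struct node	*next;
} node_t;

/*
 * Walk the list looking for a matching pair of guids. On a match,
 * hand the node back through the out-parameter; that assignment is
 * the step the loop in the listing above never performs.
 */
static int
find_root(node_t *head, uint64_t zpool_guid, uint64_t objset_guid,
    node_t **rootp)
{
	node_t *tmp;

	*rootp = NULL;	/* never leave the caller holding garbage */
	for (tmp = head; tmp != NULL; tmp = tmp->next) {
		if (zpool_guid == tmp->zpool_guid &&
		    objset_guid == tmp->objset_guid) {
			*rootp = tmp;	/* the fix */
			return (0);
		}
	}
	return (ENOENT);
}
```

Clearing the out-parameter up front is an extra precaution added for this sketch: a caller that ignores the return value then dereferences NULL rather than whatever happened to be on the stack.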

But in any event, here is the 'pee' in pNFS:

[root@pnfs-17-21 ~]> mount -o vers=4 pnfs-17-24:/pnfs2/pnfs /pnfs/pnfs-17-24
[root@pnfs-17-21 ~]> cp /etc/passwd /pnfs/pnfs-17-24/qwhoei
[root@pnfs-17-21 ~]> nfsstat -l /pnfs/pnfs-17-24/qwhoei
Number of layouts: 1
Proxy I/O count: 0
DS I/O count: 1
Layout [0]:
        Layout obtained at: Sat Jul 25 02:20:00:343367 2009
        status: UNKNOWN, iomode: LAYOUTIOMODE_RW
        offset: 0, length: EOF
        num stripes: 4, stripe unit: 32768
        Stripe [0]:
                tcp:pnfs-17-22.Central.Sun.COM:10.1.233.192:47009 OK
        Stripe [1]:
                tcp:pnfs-17-22.Central.Sun.COM:10.1.233.192:47009 OK
        Stripe [2]:
                tcp:pnfs-17-23.Central.Sun.COM:10.1.233.193:47009 OK
        Stripe [3]:
                tcp:pnfs-17-23.Central.Sun.COM:10.1.233.193:47009 OK
[root@pnfs-17-21 ~]> ls -la /pnfs/pnfs-17-24/qwhoei
-rw-r--r--   1 root     root         881 Jul 25 02:20 /pnfs/pnfs-17-24/qwhoei

Well, actually, there really isn't any parallel activity going on here. The file is 881 bytes and the default stripe unit is 32k, so it all goes to stripe 0. But there is a lot going on in the background which was touched by my changes.
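To make the arithmetic concrete, here is a minimal sketch of how a file offset maps to a stripe, assuming a simple round-robin layout in which stripe unit n lands on stripe n mod num_stripes. stripe_for_offset() is a hypothetical helper, not something from the pNFS code; the 32768 and 4 come from the nfsstat output above.

```c
#include <stdint.h>

/*
 * Hypothetical helper: which stripe index does a byte offset fall on,
 * assuming stripe units are handed out round-robin?
 */
static uint32_t
stripe_for_offset(uint64_t offset, uint32_t stripe_unit, uint32_t num_stripes)
{
	/* Which stripe unit the offset is in, wrapped over the stripes. */
	return ((uint32_t)((offset / stripe_unit) % num_stripes));
}
```

With a stripe unit of 32768 and 4 stripes, every offset in an 881 byte file (0 through 880) gives index 0. Offset 32768 would be the first to land on stripe 1.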

I could either start testing the kspe to get a smaller stripe size, or I could use mkfile to create a file of 2*4*32k = 256k so that there are two writes to each stripe.

[root@pnfs-17-21 ~]> mkfile 256k chunky
[root@pnfs-17-21 ~]> ls -la chunky 
-rw------T   1 root     root      262144 Jul 25 02:34 chunky
[root@pnfs-17-21 ~]> cp chunky /pnfs/pnfs-17-24/pChunky
[root@pnfs-17-21 ~]> ls -la /pnfs/pnfs-17-24/pChunky
-rw-------   1 root     root      262144 Jul 25 02:34 /pnfs/pnfs-17-24/pChunky
[root@pnfs-17-21 ~]>  nfsstat -l /pnfs/pnfs-17-24/pChunky
Number of layouts: 1
Proxy I/O count: 0
DS I/O count: 8
Layout [0]:
        Layout obtained at: Sat Jul 25 02:34:22:325669 2009
        status: UNKNOWN, iomode: LAYOUTIOMODE_RW
        offset: 0, length: EOF
        num stripes: 4, stripe unit: 32768
        Stripe [0]:
                tcp:pnfs-17-22.Central.Sun.COM:10.1.233.192:47009 OK
        Stripe [1]:
                tcp:pnfs-17-22.Central.Sun.COM:10.1.233.192:47009 OK
        Stripe [2]:
                tcp:pnfs-17-23.Central.Sun.COM:10.1.233.193:47009 OK
        Stripe [3]:
                tcp:pnfs-17-23.Central.Sun.COM:10.1.233.193:47009 OK
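The DS I/O counts are consistent with one write per stripe unit. Here is a rough sketch of that arithmetic, assuming each DS I/O covers at most one 32768 byte stripe unit and that units are dealt out round-robin; both are assumptions, not something verified here. ds_io_count() and units_on_stripe() are hypothetical helpers.

```c
#include <stdint.h>

/* Number of stripe-unit-sized I/Os needed to cover file_size bytes. */
static uint64_t
ds_io_count(uint64_t file_size, uint32_t stripe_unit)
{
	/* Ceiling division: a partial trailing unit still costs one I/O. */
	return ((file_size + stripe_unit - 1) / stripe_unit);
}

/* How many of those I/Os land on stripe idx under round-robin. */
static uint64_t
units_on_stripe(uint64_t file_size, uint32_t stripe_unit,
    uint32_t num_stripes, uint32_t idx)
{
	uint64_t total = ds_io_count(file_size, stripe_unit);

	/* Every stripe gets total / n; the first total % n get one more. */
	return (total / num_stripes + (idx < total % num_stripes ? 1 : 0));
}
```

For the 881 byte passwd copy this gives 1 I/O, and for the 262144 byte pChunky it gives 8, or two per stripe, which matches the counts nfsstat reported.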

But I have no idea if the file is identical. I'll test that later...

I can test that using a large text file, say 400k:

[root@pnfs-17-21 ~]>  cp nfs4_vnops.c /pnfs/pnfs-17-24/Pnfs4_vnops.c
[root@pnfs-17-21 ~]> nfsstat -l /pnfs/pnfs-17-24/Pnfs4_vnops.c
Number of layouts: 1
Proxy I/O count: 0
DS I/O count: 13
Layout [0]:
        Layout obtained at: Sat Jul 25 02:55:17:818063 2009
        status: UNKNOWN, iomode: LAYOUTIOMODE_RW
        offset: 0, length: EOF
        num stripes: 4, stripe unit: 32768
        Stripe [0]:
                tcp:pnfs-17-22.Central.Sun.COM:10.1.233.192:47009 OK
        Stripe [1]:
                tcp:pnfs-17-22.Central.Sun.COM:10.1.233.192:47009 OK
        Stripe [2]:
                tcp:pnfs-17-23.Central.Sun.COM:10.1.233.193:47009 OK
        Stripe [3]:
                tcp:pnfs-17-23.Central.Sun.COM:10.1.233.193:47009 OK
[root@pnfs-17-21 ~]> ls -la  /pnfs/pnfs-17-24/Pnfs4_vnops.c
-rw-r--r--   1 root     root      409540 Jul 25 02:55 /pnfs/pnfs-17-24/Pnfs4_vnops.c
[root@pnfs-17-21 ~]> diff nfs4_vnops.c /pnfs/pnfs-17-24/Pnfs4_vnops.c

diff reports no differences, so we have pNFS!


Originally posted on Kool Aid Served Daily
Copyright (C) 2009, Kool Aid Served Daily

Trackback URL: http://blogs.sun.com/tdh/entry/and_we_have_the_pee