« August 2009 »
SunMonTueWedThuFriSat
      
3
4
5
6
7
8
9
11
13
14
15
17
18
19
20
21
22
23
24
25
26
28
29
31
     
Today
XML

Neat blogs

Navigation

Editing

Powered by Roller Weblogger.

statcounter.com

clustrmaps.com

Locations of visitors to this page

technorati.com

20090812 Wednesday August 12, 2009
Regression testing saves my bacon

So, my unit testing is all done and it is time to integrate. Bzzt! Wrong answer - we have to do regression testing. In our case, it is called pNFS/miniPIT and was mainly crafted by Helen Chao. And it was bombing on my new code.

First of all, the core was confusing:

[root@pnfs-minipit2-4 ~]> 
panic[cpu0]/thread=30008a61560: 000000047ad
BAD TRAP: type=31 rp=2a100f596d0 addr=14 mmu_fsr=0 occurred in module "nfssrv" due to a NULL pointer dereference

nfsd: trap type = 0x31
addr=0x14
pid=100528, pc=0x7afb3d64, sp=0x2a100f58f71, tstate=0x4414001604, context=0x29b
g1-g7: 7be42af4, 2, 2000, 0, 0, 29b, 30008a61560

The NULL pointer wasn't anything to do with the function being called. Secondly, I focused on trying to reproduce it inside the test harness - it wasn't until I made a simple test case that I progressed.

Third, I focused on the server being 32 bit instead of the real difference - it was not setup as a MDS - therefore it had no DSes reporting to it. I'll show why that is important:

4463         error = nfs41_spe_allocate(&spe_va, claddr,
4464             vp->v_path, &lc, TRUE);
4465         if (error) {
4466                 /*
4467                  * XXX: Until we get the SMF code
4468                  * in place, we handle all errors by
4469                  * using the default layout of the
4470                  * old prototype code
4471                  *
4472                  * At that point, we should return the
4473                  * given error.
4474                  *
4475                  * XXX: Any way *plo is NULL here?
4476                  */
4477                 *plo = mds_gen_default_layout(cs->instp);
4478 
4479                 /*
4480                  * Record the layout, don't get bent out of shape
4481                  * if it fails, we'll try again at checkstate time.
4482                  */
4483                 (void) mds_put_layout(*plo, vp);
4484 
4485                 return (NFS4_OK);
4486         }
4487 
4488         /*
4489          * XXX: Any way *plo is NULL here?
4490          */
4491         *plo = mds_add_layout(&lc);

The fourth, and major problem, is that I had valid comments and I ignored them. Yes, it was certainly valid for the *plo to be NULL after 4477 but probably not 4491. But they are both easy enough to code for!

The issue is that a NFSv4.1 server, i.e., no pNFS, also goes through this code path. We need to handle not having a layout:

4474                  * At that point, we should return the
4475                  * given error.
4476                  */
4477                 *plo = mds_gen_default_layout(cs->instp);
4478                 if (*plo == NULL) {
4479                         status = NFS4ERR_LAYOUTUNAVAILABLE;
4480                 } else {
4481                         /*
4482                          * Record the layout, don't get
4483                          * bent out of shape if it fails,
4484                          * we'll try again at checkstate time.
4485                          */
4486                         (void) mds_put_layout(*plo, vp);
4487                 }
4488 
4489                 return (status);

Which appears okay, except with just this change, files will not be created. We want to return NFS4ERR_LAYOUTUNAVAILABLE, mainly for DTrace probing, but in the caller, we want to do:

4892         } else {
4893                 status = mds_createfile_get_layout(req, vp, cs, &ct, plo);
4894 
4895                 /*
4896                  * Allow mds_createfile_get_layout() to be verbose
4897                  * in what it presents as a status, but be aware
4898                  * that it is permissible to not generate a
4899                  * layout.
4900                  */
4901                 if (status == NFS4ERR_LAYOUTUNAVAILABLE) {
4902                         status = NFS4_OK;
4903                 }
4904         }

Originally posted on Kool Aid Served Daily
Copyright (C) 2009, Kool Aid Served Daily